Diffstat (limited to 'doc/gawk.info')
-rw-r--r-- | doc/gawk.info | 5090 |
1 files changed, 3272 insertions, 1818 deletions
diff --git a/doc/gawk.info b/doc/gawk.info index 8b70bd43..d10c3927 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -14,7 +14,7 @@ END-INFO-DIR-ENTRY Foundation, Inc. - This is Edition 3 of `GAWK: Effective AWK Programming: A User's + This is Edition 4 of `GAWK: Effective AWK Programming: A User's Guide for GNU Awk', for the 3.1.8 (or later) version of the GNU implementation of AWK. @@ -46,7 +46,7 @@ particular records in a file and perform operations upon them. Foundation, Inc. - This is Edition 3 of `GAWK: Effective AWK Programming: A User's + This is Edition 4 of `GAWK: Effective AWK Programming: A User's Guide for GNU Awk', for the 3.1.8 (or later) version of the GNU implementation of AWK. @@ -94,6 +94,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) * Library Functions:: A Library of `awk' Functions. * Sample Programs:: Many `awk' programs with complete explanations. +* Debugger:: The `dgawk' debugger. * Language History:: The evolution of the `awk' language. * Installation:: Installing `gawk' under various @@ -132,7 +133,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) * Comments:: Adding documentation to `gawk' programs. * Quoting:: More discussion of shell quoting issues. -* DOS Quoting:: Quoting in MS-DOS Batch Files. +* DOS Quoting:: Quoting in Windows Batch Files. * Sample Data Files:: Sample data files for use in the `awk' programs illustrated in this Info file. @@ -159,6 +160,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) * Nonconstant Fields:: Nonconstant Field Numbers. * Changing Fields:: Changing the Contents of a Field. * Field Separators:: The field separator and how to change it. +* Default Field Splitting:: How fields are normally separated. * Regexp Field Splitting:: Using regexps as the field separator. * Single Character Fields:: Making each character a separate field. * Command Line Field Separator:: Setting `FS' from the command-line. 
@@ -298,16 +300,18 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) `awk'. * Multi-scanning:: Scanning multidimensional arrays. * Array Sorting:: Sorting array values and indices. +* Arrays of Arrays:: True multidimensional arrays. * Built-in:: Summarizes the built-in functions. * Calling Built-in:: How to call built-in functions. * Numeric Functions:: Functions that work with numbers, including - `int', `sin' and `rand'. + `int()', `sin()' and + `rand()'. * String Functions:: Functions for string manipulation, such as - `split', `match' and - `sprintf'. + `split()', `match()' and + `sprintf()'. * Gory Details:: More than you want to know about `\' - and `&' with `sub', `gsub', - and `gensub'. + and `&' with `sub()', + `gsub()', and `gensub()'. * I/O Functions:: Functions for files and shell commands. * Time Functions:: Functions for dealing with timestamps. * Bitwise Functions:: Functions for bitwise operations. @@ -335,7 +339,6 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) process. * TCP/IP Networking:: Using `gawk' for network programming. -* Portal Files:: Using `gawk' with BSD portals. * Profiling:: Profiling your `awk' programs. * Command Line:: How to run `awk'. * Options:: Command-line options and their meanings. @@ -343,6 +346,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) * AWKPATH Variable:: Searching directories for `awk' programs. * Exit Status:: `gawk''s exit status. +* Include Files:: Including other files into your program. * Obsolete:: Obsolete Options and/or features. * Undocumented:: Undocumented Options and Features. * Known Bugs:: Known Bugs in `gawk'. @@ -352,10 +356,10 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) * Nextfile Function:: Two implementations of a `nextfile' function. * Strtonum Function:: A replacement for the built-in - `strtonum' function. + `strtonum()' function. 
* Assert Function:: A function for assertions in `awk' programs. -* Round Function:: A function for rounding if `sprintf' +* Round Function:: A function for rounding if `sprintf()' does not do it correctly. * Cliff Random Function:: The Cliff Random Number Generator. * Ordinal Functions:: Functions for using characters as numbers @@ -399,6 +403,23 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) files. * Signature Program:: People do amazing things with too much time on their hands. +* Debugging:: Introduction to `dgawk'. +* Debugging Concepts:: Debugging In General. +* Debugging Terms:: Additional Debugging Concepts. +* Awk Debugging:: Awk Debugging. +* Sample dgawk session:: Sample `dgawk' session. +* dgawk invocation:: `dgawk' Invocation. +* Finding The Bug:: Finding The Bug. +* List of Debugger Commands:: Main `dgawk' Commands. +* Breakpoint Control:: Control of breakpoints. +* Dgawk Execution Control:: Control of execution. +* Viewing And Changing Data:: Viewing and changing data. +* Dgawk Stack:: Dealing with the stack. +* Dgawk Info:: Obtaining information about the program and + the debugger state. +* Miscellaneous Dgawk Commands:: Miscellaneous Commands. +* Readline Support:: Readline Support. +* Dgawk Limitations:: Limitations and future plans. * V7/SVR3.1:: The major changes between V7 and System V Release 3.1. * SVR4:: Minor changes between System V Releases 3.1 @@ -491,8 +512,8 @@ File: gawk.info, Node: Foreword, Next: Preface, Prev: Top, Up: Top Foreword ******** -Arnold Robbins and I are good friends. We were introduced 11 years ago -by circumstances--and our favorite programming language, AWK. The +Arnold Robbins and I are good friends. We were introduced in 1990 by +circumstances--and our favorite programming language, AWK. The circumstances started a couple of years earlier. I was working at a new job and noticed an unplugged Unix computer sitting in the corner. No one knew how to use it, and neither did I. 
However, a couple of days @@ -581,17 +602,17 @@ Several kinds of tasks occur repeatedly when working with text files. You might want to extract certain lines and discard the rest. Or you may need to make changes wherever certain patterns appear, but leave the rest of the file alone. Writing single-use programs for these -tasks in languages such as C, C++, or Pascal is time-consuming and +tasks in languages such as C, C++, or Java is time-consuming and inconvenient. Such jobs are often easier with `awk'. The `awk' utility interprets a special-purpose programming language that makes it easy to handle simple data-reformatting jobs. The GNU implementation of `awk' is called `gawk'; it is fully -compatible with the System V Release 4 version of `awk'. `gawk' is -also compatible with the POSIX specification of the `awk' language. -This means that all properly written `awk' programs should work with -`gawk'. Thus, we usually don't distinguish between `gawk' and other -`awk' implementations. +compatible with the POSIX(1) specification of the `awk' language and +with the Unix version of `awk' maintained by Brian Kernighan. This +means that all properly written `awk' programs should work with `gawk'. +Thus, we usually don't distinguish between `gawk' and other `awk' +implementations. Using `awk' allows you to: @@ -616,17 +637,19 @@ This means that all properly written `awk' programs should work with This Info file teaches you about the `awk' language and how you can use it effectively. You should already be familiar with basic system -commands, such as `cat' and `ls',(1) as well as basic shell facilities, +commands, such as `cat' and `ls',(2) as well as basic shell facilities, such as input/output (I/O) redirection and pipes. Implementations of the `awk' language are available for many different computing environments. 
This Info file, while describing the `awk' language in general, also describes the particular implementation of `awk' called `gawk' (which stands for "GNU awk"). `gawk' runs on a -broad range of Unix systems, ranging from 80386 PC-based computers up -through large-scale systems, such as Crays. `gawk' has also been ported -to Mac OS X, MS-DOS, Microsoft Windows (all versions) and OS/2 PCs, -Atari microcomputers, BeOS, Tandem D20, and VMS. +broad range of Unix systems, ranging from Intel(R)-architecture +PC-based computers up through large-scale systems, such as Crays. +`gawk' has also been ported to Mac OS X, Microsoft Windows (all +versions) and OS/2 PCs, and VMS. (Other systems to which `gawk' was +once ported are no longer supported and the code for those systems has +been removed.) * Menu: @@ -643,7 +666,10 @@ Atari microcomputers, BeOS, Tandem D20, and VMS. ---------- Footnotes ---------- - (1) These commands are available on POSIX-compliant systems, as well + (1) The 2008 POSIX standard can be found online at +`http://www.opengroup.org/onlinepubs/9699919799/'. + + (2) These commands are available on POSIX-compliant systems, as well as on traditional Unix-based systems. If you are using some other operating system, you still need to be familiar with the ideas of I/O redirection and pipes. @@ -671,18 +697,18 @@ of `awk' was written in 1977 at AT&T Bell Laboratories. In 1985, a new version made the programming language more powerful, introducing user-defined functions, multiple input streams, and computed regular expressions. This new version became widely available with Unix System -V Release 3.1 (SVR3.1). The version in SVR4 added some new features -and cleaned up the behavior in some of the "dark corners" of the -language. The specification for `awk' in the POSIX Command Language -and Utilities standard further clarified the language. Both the `gawk' -designers and the original Bell Laboratories `awk' designers provided -feedback for the POSIX specification. 
+V Release 3.1 (1987) The version in System V Release 4 (1989) added +some new features and cleaned up the behavior in some of the "dark +corners" of the language. The specification for `awk' in the POSIX +Command Language and Utilities standard further clarified the language. +Both the `gawk' designers and the original Bell Laboratories `awk' +designers provided feedback for the POSIX specification. Paul Rubin wrote the GNU implementation, `gawk', in 1986. Jay Fenlason completed it, with advice from Richard Stallman. John Woods contributed parts of the code as well. In 1988 and 1989, David Trueman, with help from me, thoroughly reworked `gawk' for compatibility -with the newer `awk'. Circa 1995, I became the primary maintainer. +with the newer `awk'. Circa 1994, I became the primary maintainer. Current development focuses on bug fixes, performance improvements, standards compliance, and occasionally, new features. @@ -706,20 +732,20 @@ The `awk' language has evolved over the years. Full details are provided in *note Language History::. The language described in this Info file is often referred to as "new `awk'" (`nawk'). - Because of this, many systems have multiple versions of `awk'. Some -systems have an `awk' utility that implements the original version of -the `awk' language and a `nawk' utility for the new version. Others + Because of this, there are systems with multiple versions of `awk'. +Some systems have an `awk' utility that implements the original version +of the `awk' language and a `nawk' utility for the new version. Others have an `oawk' version for the "old `awk'" language and plain `awk' for the new one. Still others only have one version, which is usually the new one.(1) All in all, this makes it difficult for you to know which version of -`awk' you should run when writing your programs. The best advice I can -give here is to check your local documentation. Look for `awk', `oawk', -and `nawk', as well as for `gawk'. 
It is likely that you already have -some version of new `awk' on your system, which is what you should use -when running your programs. (Of course, if you're reading this Info -file, chances are good that you have `gawk'!) +`awk' you should run when writing your programs. The best advice we +can give here is to check your local documentation. Look for `awk', +`oawk', and `nawk', as well as for `gawk'. It is likely that you +already have some version of new `awk' on your system, which is what +you should use when running your programs. (Of course, if you're +reading this Info file, chances are good that you have `gawk'!) Throughout this Info file, whenever we refer to a language feature that should be available in any complete implementation of POSIX `awk', @@ -739,11 +765,12 @@ Using This Book The term `awk' refers to a particular program as well as to the language you use to tell this program what to do. When we need to be careful, we call the language "the `awk' language," and the program -"the `awk' utility." This Info file explains both the `awk' language -and how to run the `awk' utility. The term "`awk' program" refers to a -program written by you in the `awk' programming language. +"the `awk' utility." This Info file explains both how to write program +in the `awk' language and how to run the `awk' utility. The term +"`awk' program" refers to a program written by you in the `awk' +programming language. - Primarily, this Info file explains the features of `awk', as defined + Primarily, this Info file explains the features of `awk' as defined in the POSIX standard. It does so in the context of the `gawk' implementation. While doing so, it also attempts to describe important differences between `gawk' and other `awk' implementations.(1) Finally, @@ -774,7 +801,8 @@ particular the flavors supported by POSIX `awk' and `gawk'. *note Reading Files::, describes how `awk' reads your data. 
It introduces the concepts of records and fields, as well as the `getline' -command. I/O redirection is first described here. +command. I/O redirection is first described here. Network I/O is also +briefly introduced here. *note Printing::, describes how `awk' programs can produce output with `print' and `printf'. @@ -808,14 +836,16 @@ its command-line options, and how it finds `awk' program source files. sample `awk' programs. Reading them allows you to see `awk' solving real problems. + *note Debugger::, describes the `awk' debugger, `dgawk'. + *note Language History::, describes how the `awk' language has -evolved since first release to present. It also describes how `gawk' -has acquired features over time. +evolved since its first release to present. It also describes how +`gawk' has acquired features over time. *note Installation::, describes how to get `gawk', how to compile it -under Unix, and how to compile and use it on different non-Unix -systems. It also describes how to report bugs in `gawk' and where to -get three other freely available implementations of `awk'. +on POSIX-compatible systems, and how to compile and use it on different +non-POSIX systems. It also describes how to report bugs in `gawk' and +where to get other freely available `awk' implementations. *note Notes::, describes how to disable `gawk''s extensions, as well as how to contribute new code to `gawk', how to write extension @@ -851,11 +881,11 @@ both the printed and online versions of the documentation. This minor node briefly documents the typographical conventions used in Texinfo. Examples you would type at the command-line are preceded by the -common shell primary and secondary prompts, `$' and `>'. Output from -the command is preceded by the glyph "-|". This typically represents -the command's standard output. Error messages, and other output on the -command's standard error, are preceded by the glyph "error-->". 
For -example: +common shell primary and secondary prompts, `$' and `>'. Input that +you type is shown `like this'. Output from the command is preceded by +the glyph "-|". This typically represents the command's standard +output. Error messages, and other output on the command's standard +error, are preceded by the glyph "error-->". For example: $ echo hi on stdout -| hi on stdout @@ -876,14 +906,17 @@ Dark Corners illuminate, there's always a smaller but darker one. Brian Kernighan - Until the POSIX standard (and `The Gawk Manual'), many features of -`awk' were either poorly documented or not documented at all. -Descriptions of such features (often called "dark corners") are noted -in this Info file with "(d.c.)". They also appear in the index under -the heading "dark corner." + Until the POSIX standard (and `GAWK: Effective AWK Programming'), +many features of `awk' were either poorly documented or not documented +at all. Descriptions of such features (often called "dark corners") +are noted in this Info file with "(d.c.)". They also appear in the +index under the heading "dark corner." As noted by the opening quote, though, any coverage of dark corners -is, by definition, something that is incomplete. +is, by definition, incomplete. + + Extensions to the standard `awk' language are marked "(c.e.)," and +listed in the index under "common extensions" and "extensions, common." File: gawk.info, Node: Manual History, Next: How To Contribute, Prev: Conventions, Up: Preface @@ -911,19 +944,18 @@ This Info file may also be read from their web site Objective-C compilers, a symbolic debugger and dozens of large and small utilities (such as `gawk'), have all been completed and are freely available. The GNU operating system kernel (the HURD), has been -released but is still in an early stage of development. +released but remains in an early stage of development. 
Until the GNU operating system is more fully developed, you should consider using GNU/Linux, a freely distributable, Unix-like operating -system for Intel 80386, DEC Alpha, Sun SPARC, IBM S/390, and other -systems.(2) There are many books on GNU/Linux. One that is freely -available is `Linux Installation and Getting Started', by Matt Welsh. -Many GNU/Linux distributions are often available in computer stores or -bundled on CD-ROMs with books about Linux. (There are three other -freely available, Unix-like operating systems for 80386 and other -systems: NetBSD, FreeBSD, and OpenBSD. All are based on the 4.4-Lite -Berkeley Software Distribution, and they use recent versions of `gawk' -for their versions of `awk'.) +system for Intel(R), Power Architecture, Sun SPARC, IBM S/390, and other +systems.(2) Many GNU/Linux distributions are available for download +from the Internet. + + (There are numerous other freely available, Unix-like operating +systems based on the Berkeley Software Distribution, and they use +recent versions of `gawk' for their versions of `awk'. NetBSD, FreeBSD +and OpenBSD are three of the most popular ones, but there are others.) The Info file itself has gone through a number of previous editions. Paul Rubin wrote the very first draft of `The GAWK Manual'; it was @@ -937,17 +969,16 @@ it progressed, the FSF published several preliminary versions (numbered published the first two editions under the title `The GNU Awk User's Guide'. - This edition maintains the basic structure of Edition 1.0, but with -significant additional material, reflecting the host of new features in -`gawk' version 3.1. Of particular note is *note Array Sorting::, as -well as *note Bitwise Functions::, *note Internationalization::, and -also *note Advanced Features::, and *note Dynamic Extensions::. + This edition maintains the basic structure of Edition 1.0. For +Edition 4.0, the content has been thoroughly reviewed and updated. 
All +references to versions prior to 4.0 have been removed. Of significant +note for this edition is *note Debugger::. `GAWK: Effective AWK Programming' will undoubtedly continue to evolve. An electronic version comes with the `gawk' distribution from the FSF. If you find an error in this Info file, please report it! *Note Bugs::, for information on submitting problem reports -electronically, or write to me in care of the publisher. +electronically. ---------- Footnotes ---------- @@ -1004,56 +1035,63 @@ acknowledgments: better world and for his courage in founding the FSF and starting the GNU Project. - The following people (in alphabetical order) provided helpful -comments on various versions of this book, up to and including this -edition. Rick Adams, Nelson H.F. Beebe, Karl Berry, Dr. Michael -Brennan, Rich Burridge, Claire Cloutier, Diane Close, Scott Deifik, -Christopher ("Topher") Eliot, Jeffrey Friedl, Dr. Darrel Hankerson, -Michal Jaegermann, Dr. Richard J. LeBlanc, Michael Lijewski, Pat Rankin, -Miriam Robbins, Mary Sheehan, and Chuck Toporek. - - Robert J. Chassell provided much valuable advice on the use of -Texinfo. He also deserves special thanks for convincing me _not_ to -title this Info file `How To Gawk Politely'. Karl Berry helped -significantly with the TeX part of Texinfo. - - I would like to thank Marshall and Elaine Hartholz of Seattle and -Dr. Bert and Rita Schreiber of Detroit for large amounts of quiet -vacation time in their homes, which allowed me to make significant -progress on this Info file and on `gawk' itself. - - Phil Hughes of SSC contributed in a very important way by loaning me -his laptop GNU/Linux system, not once, but twice, which allowed me to -do a lot of work while away from home. - - David Trueman deserves special credit; he has done a yeoman job of -evolving `gawk' so that it performs well and without bugs. Although he -is no longer involved with `gawk', working with him on this project was -a significant pleasure. 
- - The intrepid members of the GNITS mailing list, and most notably -Ulrich Drepper, provided invaluable help and feedback for the design of -the internationalization features. - - Nelson Beebe, Andreas Buening, Antonio Colombo, Scott Deifik, John -H. DuBois III, Darrel Hankerson, Michal Jaegermann, Ju"rgen Kahrs, Dave -Pitts, Stepan Kasal, Pat Rankin, Andrew Schorr, Corinna Vinschen, -Anders Wallin, and Eli Zaretskii (in alphabetical order) make up the -current `gawk' "crack portability team." Without their hard work and -help, `gawk' would not be nearly the fine program it is today. It has -been and continues to be a pleasure working with this team of fine -people. - - David and I would like to thank Brian Kernighan of Bell Laboratories -for invaluable assistance during the testing and debugging of `gawk', -and for help in clarifying numerous points about the language. We + Earlier editins of this Info file had the following acknowledgements: + + The following people (in alphabetical order) provided helpful + comments on various versions of this book, Rick Adams, Nelson H.F. + Beebe, Karl Berry, Dr. Michael Brennan, Rich Burridge, Claire + Cloutier, Diane Close, Scott Deifik, Christopher ("Topher") Eliot, + Jeffrey Friedl, Dr. Darrel Hankerson, Michal Jaegermann, Dr. + Richard J. LeBlanc, Michael Lijewski, Pat Rankin, Miriam Robbins, + Mary Sheehan, and Chuck Toporek. + + Robert J. Chassell provided much valuable advice on the use of + Texinfo. He also deserves special thanks for convincing me _not_ + to title this Info file `How To Gawk Politely'. Karl Berry helped + significantly with the TeX part of Texinfo. + + I would like to thank Marshall and Elaine Hartholz of Seattle and + Dr. Bert and Rita Schreiber of Detroit for large amounts of quiet + vacation time in their homes, which allowed me to make significant + progress on this Info file and on `gawk' itself. 
+ + Phil Hughes of SSC contributed in a very important way by loaning + me his laptop GNU/Linux system, not once, but twice, which allowed + me to do a lot of work while away from home. + + David Trueman deserves special credit; he has done a yeoman job of + evolving `gawk' so that it performs well and without bugs. + Although he is no longer involved with `gawk', working with him on + this project was a significant pleasure. + + The intrepid members of the GNITS mailing list, and most notably + Ulrich Drepper, provided invaluable help and feedback for the + design of the internationalization features. + + Chuck Toporek, Mary Sheehan, and Claire Coutier of O'Reilly & + Associates contributed significant editorial help for this Info + file for the 3.1 release of `gawk'. + + Nelson Beebe, Andreas Buening, Antonio Colombo, Stephen Davies, +Scott Deifik, John H. DuBois III, Darrel Hankerson, Michal Jaegermann, +Ju"rgen Kahrs, Dave Pitts, Stepan Kasal, Pat Rankin, Andrew Schorr, +Corinna Vinschen, Anders Wallin, and Eli Zaretskii (in alphabetical +order) make up the current `gawk' "crack portability team." Without +their hard work and help, `gawk' would not be nearly the fine program +it is today. It has been and continues to be a pleasure working with +this team of fine people. + + John Haque contributed the modifications to convert `gawk' into a +byte-code interpreter, including the debugger. Stephen Davies +contributed to the effort to bring the byte-code changes into the +mainstream code base. + + I would like to thank Brian Kernighan of Bell Laboratories for +invaluable assistance during the testing and debugging of `gawk', and +for ongoing help in clarifying numerous points about the language. We could not have done nearly as good a job on either `gawk' or its documentation without his help. - Chuck Toporek, Mary Sheehan, and Claire Coutier of O'Reilly & -Associates contributed significant editorial help for this Info file -for the 3.1 release of `gawk'. 
- I must thank my wonderful wife, Miriam, for her patience through the many versions of this project, for her proofreading, and for sharing me with the computer. I would like to thank my parents for their love, @@ -1066,7 +1104,7 @@ to take advantage of those opportunities. Arnold Robbins Nof Ayalon ISRAEL -February, 2010 +December, 2010 File: gawk.info, Node: Getting Started, Next: Regexp, Prev: Preface, Up: Top @@ -1196,8 +1234,8 @@ following command line: `awk' applies the PROGRAM to the "standard input", which usually means whatever you type on the terminal. This continues until you indicate end-of-file by typing `Ctrl-d'. (On other operating systems, the -end-of-file character may be different. For example, on OS/2 and -MS-DOS, it is `Ctrl-z'.) +end-of-file character may be different. For example, on OS/2, it is +`Ctrl-z'.) As an example, the following program prints a friendly piece of advice (from Douglas Adams's `The Hitchhiker's Guide to the Galaxy'), @@ -1229,7 +1267,7 @@ works is explained shortly). ---------- Footnotes ---------- - (1) If you use `bash' as your shell, you should execute the command + (1) If you use Bash as your shell, you should execute the command `set +H' before running this program interactively, to disable the `csh'-style command history, which treats `!' as a special character. We recommend putting this command into your personal startup file. @@ -1283,9 +1321,9 @@ File: gawk.info, Node: Executable Scripts, Next: Comments, Prev: Long, Up: R ------------------------------- Once you have learned `awk', you may want to write self-contained `awk' -scripts, using the `#!' script mechanism. You can do this on many Unix -systems(1) as well as on the GNU system. For example, you could update -the file `advice' to look like this: +scripts, using the `#!' script mechanism. You can do this on many +systems.(1) For example, you could update the file `advice' to look +like this: #! 
/bin/awk -f @@ -1327,9 +1365,8 @@ the name of your script (`advice'). Don't rely on the value of ---------- Footnotes ---------- - (1) The `#!' mechanism works on Linux systems, systems derived from -the 4.4-Lite Berkeley Software Distribution, and most commercial Unix -systems. + (1) The `#!' mechanism works on GNU/Linux systems, BSD-based systems +and commercial Unix systems. (2) The line beginning with `#!' lists the full file name of an interpreter to run and an optional initial command-line argument to @@ -1401,7 +1438,7 @@ File: gawk.info, Node: Quoting, Prev: Comments, Up: Running gawk * Menu: -* DOS Quoting:: Quoting in MS-DOS Batch Files. +* DOS Quoting:: Quoting in Windows Batch Files. For short to medium length `awk' programs, it is most convenient to enter the program on the `awk' command line. This is best done by @@ -1413,8 +1450,8 @@ writing it as part of a larger shell script: Once you are working with the shell, it is helpful to have a basic knowledge of shell quoting rules. The following rules apply only to -POSIX-compliant, Bourne-style shells (such as `bash', the GNU -Bourne-Again Shell). If you use `csh', you're on your own. +POSIX-compliant, Bourne-style shells (such as Bash, the GNU Bourne-Again +Shell). If you use `csh', you're on your own. * Quoted items can be concatenated with nonquoted items as well as with other quoted items. The shell turns everything into one @@ -1487,10 +1524,11 @@ Judge for yourself which of these two is the more readable. -| Here is a single quote <'> This option is also painful, because double quotes, backslashes, and -dollar signs are very common in `awk' programs. +dollar signs are very common in more advanced `awk' programs. 
- A third option is to use the octal escape sequence equivalents for -the single- and double-quote characters, like so: + A third option is to use the octal escape sequence equivalents +(*note Escape Sequences::) for the single- and double-quote characters, +like so: $ awk 'BEGIN { print "Here is a single quote <\47>" }' -| Here is a single quote <'> @@ -1513,14 +1551,14 @@ shell won't be part of the picture, and you can say what you mean. File: gawk.info, Node: DOS Quoting, Up: Quoting -1.1.6.1 Quoting in MS-DOS Batch Files -..................................... +1.1.6.1 Quoting in Windows Batch Files +...................................... Although this Info file generally only worries about POSIX systems and the POSIX shell, the following issue arises often enough for many users that it is worth addressing. - Systems providing an MS-DOS compatible "shell" use the double-quote + The "shell" on Microsoft Windows systems use the double-quote character for quoting, and make it difficult or impossible to include an escaped double-quote character in a command-line script. The following example, courtesy of Jeroen Brink, shows how to print all lines in a @@ -1697,7 +1735,7 @@ different ways to do the same things shown here: * Print the total number of kilobytes used by FILES: ls -l FILES | awk '{ x += $5 } - END { print "total K-bytes: " (x + 1023)/1024 }' + END { print "total K-bytes:", x /1024 }' * Print a sorted list of the login names of all users: @@ -1778,14 +1816,14 @@ summarize, select, and rearrange the output of another utility. It uses features that haven't been covered yet, so don't worry if you don't understand all the details: - ls -l | awk '$6 == "Nov" { sum += $5 } - END { print sum }' + LC_ALL=C ls -l | awk '$6 == "Nov" { sum += $5 } + END { print sum }' This command prints the total number of bytes in all the files in the current directory that were last modified in November (of any year). 
-(1) The `ls -l' part of this example is a system command that gives you -a listing of the files in a directory, including each file's size and -the date the file was last modified. Its output looks like this: +The `ls -l' part of this example is a system command that gives you a +listing of the files in a directory, including each file's size and the +date the file was last modified. Its output looks like this: -rw-r--r-- 1 arnold user 1933 Nov 7 13:05 Makefile -rw-r--r-- 1 arnold user 10809 Nov 7 13:03 awk.h @@ -1802,7 +1840,7 @@ identifies the owner of the file. The fourth field identifies the group of the file. The fifth field contains the size of the file in bytes. The sixth, seventh, and eighth fields contain the month, day, and time, respectively, that the file was last modified. Finally, the ninth field -contains the name of the file.(2) +contains the file name.(1) The `$6 == "Nov"' in our `awk' program is an expression that tests whether the sixth field of the output from `ls -l' matches the string @@ -1826,16 +1864,8 @@ reports. ---------- Footnotes ---------- - (1) In the C shell (`csh'), you need to type a semicolon and then a -backslash at the end of the first line; see *note Statements/Lines::, -for an explanation. In a POSIX-compliant shell, such as the Bourne -shell or `bash', you can type the example as shown. If the command -`echo $path' produces an empty output line, you are most likely using a -POSIX-compliant shell. Otherwise, you are probably using the C shell -or a shell derived from it. - - (2) On some very old systems, you may need to use `ls -lg' to get -this output. + (1) The `LC_ALL=C' is needed to produce traditional-style output +from `ls'. 
File: gawk.info, Node: Statements/Lines, Next: Other Features, Prev: More Complex, Up: Getting Started @@ -1867,24 +1897,24 @@ example: awk '/This regular expression is too long, so continue it\ on the next line/ { print $1 }' -We have generally not used backslash continuation in the sample programs -in this Info file. In `gawk', there is no limit on the length of a -line, so backslash continuation is never strictly necessary; it just -makes programs more readable. For this same reason, as well as for -clarity, we have kept most statements short in the sample programs -presented throughout the Info file. Backslash continuation is most -useful when your `awk' program is in a separate source file instead of -entered from the command line. You should also note that many `awk' -implementations are more particular about where you may use backslash -continuation. For example, they may not allow you to split a string -constant using backslash continuation. Thus, for maximum portability -of your `awk' programs, it is best not to split your lines in the -middle of a regular expression or a string. +We have generally not used backslash continuation in our sample +programs. `gawk' places no limit on the length of a line, so backslash +continuation is never strictly necessary; it just makes programs more +readable. For this same reason, as well as for clarity, we have kept +most statements short in the sample programs presented throughout the +Info file. Backslash continuation is most useful when your `awk' +program is in a separate source file instead of entered from the +command line. You should also note that many `awk' implementations are +more particular about where you may use backslash continuation. For +example, they may not allow you to split a string constant using +backslash continuation. Thus, for maximum portability of your `awk' +programs, it is best not to split your lines in the middle of a regular +expression or a string. 
*Caution:* _Backslash continuation does not work as described with the C shell._ It works for `awk' programs in files and for one-shot programs, _provided_ you are using a POSIX-compliant shell, such as the -Unix Bourne shell or `bash'. But the C shell behaves differently! +Unix Bourne shell or Bash. But the C shell behaves differently! There, you must use two backslashes in a row, followed by a newline. Note also that when using the C shell, _every_ newline in your awk program must be escaped with a backslash. To illustrate: @@ -1961,10 +1991,10 @@ There are other variables your program can set as well to control how In addition, `awk' provides a number of built-in functions for doing common computational and string-related operations. `gawk' provides built-in functions for working with timestamps, performing bit -manipulation, and for runtime string translation. +manipulation, for runtime string translation, and array sorting. As we develop our presentation of the `awk' language, we introduce -most of the variables and many of the functions. They are defined +most of the variables and many of the functions. They are described systematically in *note Built-in Variables::, and *note Built-in::. @@ -1983,7 +2013,7 @@ programs like `ls'. (*Note More Complex::.) Programs written with `awk' are usually much smaller than they would be in other languages. This makes `awk' programs easy to compose and -use. Often, `awk' programs can be quickly composed at your terminal, +use. Often, `awk' programs can be quickly composed at your keyboard, used once, and thrown away. Because `awk' programs are interpreted, you can avoid the (usually lengthy) compilation part of the typical edit-compile-test-debug cycle of software development. @@ -1991,12 +2021,10 @@ edit-compile-test-debug cycle of software development. 
Complex programs have been written in `awk', including a complete retargetable assembler for eight-bit microprocessors (*note Glossary::, for more information), and a microcode assembler for a special-purpose -Prolog computer. More recently, `gawk' was used for writing a a Wiki -clone (http://www.awk-scripting.de/cgi-bin/wiki.cgi/yawk/). While the -original `awk''s capabilities were strained by tasks of such -complexity, modern versions are more capable. Even the Bell Labs -version of `awk' has fewer predefined limits, and those that it has are -much larger than they used to be. +Prolog computer. While the original `awk''s capabilities were strained +by tasks of such complexity, modern versions are more capable. Even +the Bell Labs version of `awk' has fewer predefined limits, and those +that it has are much larger than they used to be. If you find yourself writing `awk' scripts of more than, say, a few hundred lines, you might consider using a different programming @@ -2059,13 +2087,12 @@ it: -| 555-6480 -| 555-2127 - `~' (tilde), `~' operator Regular expressions can also be used in -matching expressions. These expressions allow you to specify the -string to match against; it need not be the entire current input -record. The two operators `~' and `!~' perform regular expression -comparisons. Expressions using these operators can be used as -patterns, or in `if', `while', `for', and `do' statements. (*Note -Statements::.) For example: + Regular expressions can also be used in matching expressions. These +expressions allow you to specify the string to match against; it need +not be the entire current input record. The two operators `~' and `!~' +perform regular expression comparisons. Expressions using these +operators can be used as patterns, or in `if', `while', `for', and `do' +statements. (*Note Statements::.) 
For example: EXP ~ /REGEXP/ @@ -2167,20 +2194,20 @@ apply to both string constants and regexp constants: The hexadecimal value HH, where HH stands for a sequence of hexadecimal digits (`0'-`9', and either `A'-`F' or `a'-`f'). Like the same construct in ISO C, the escape sequence continues until - the first nonhexadecimal digit is seen. However, using more than - two hexadecimal digits produces undefined results. (The `\x' - escape sequence is not allowed in POSIX `awk'.) + the first nonhexadecimal digit is seen. (c.e.) However, using + more than two hexadecimal digits produces undefined results. (The + `\x' escape sequence is not allowed in POSIX `awk'.) `\/' A literal slash (necessary for regexp constants only). This - expression is used when you want to write a regexp constant that + sequence is used when you want to write a regexp constant that contains a slash. Because the regexp is delimited by slashes, you need to escape the slash that is part of the pattern, in order to tell `awk' to keep processing the rest of the regexp. `\"' A literal double quote (necessary for string constants only). - This expression is used when you want to write a string constant + This sequence is used when you want to write a string constant that contains a double quote. Because the string is delimited by double quotes, you need to escape the quote that is part of the string, in order to tell `awk' to keep processing the rest of the @@ -2224,8 +2251,8 @@ Strip the backslash out is the same as `"aqc"'. (Because this is such an easy bug both to introduce and to miss, `gawk' warns you about it.) Consider `FS = "[ \t]+\|[ \t]+"' to use vertical bars surrounded by whitespace as - the field separator. There should be two backslashes in the string - `FS = "[ \t]+\\|[ \t]+"'.) + the field separator. There should be two backslashes in the + string: `FS = "[ \t]+\\|[ \t]+"'.) Leave the backslash alone Some other `awk' implementations do this. 
In such @@ -2288,7 +2315,7 @@ sequences and that are not listed in the table stand for themselves: if ("line1\nLINE 2" ~ /1$/) ... -`.' +`. (period)' This matches any single character, _including_ the newline character. For example, `.P' matches any single character followed by a `P' in a string. Using concatenation, we can make a @@ -2386,10 +2413,10 @@ sequences and that are not listed in the table stand for themselves: Initially, because old programs may use `{' and `}' in regexp constants, `gawk' did _not_ match interval expressions in regexps. - However, beginning with version 3.2 *(FIXME: version)* `gawk' does - match interval expressions by default. This is because - compatibility with POSIX has become more important to most `gawk' - users than compatibility with old programs. + However, beginning with version 4.0, `gawk' does match interval + expressions by default. This is because compatibility with POSIX + has become more important to most `gawk' users than compatibility + with old programs. For programs that use `{' and `}' in regexp constants, it is good practice to always escape them with a backslash. Then the regexp @@ -2406,9 +2433,8 @@ themselves when there is nothing in the regexp that precedes them. For example, `/+/' matches a literal plus sign. However, many other versions of `awk' treat such a usage as a syntax error. - If `gawk' is in compatibility mode (*note Options::), POSIX -character classes and interval expressions are not available in regular -expressions. + If `gawk' is in compatibility mode (*note Options::), interval +expressions are not available in regular expressions. ---------- Footnotes ---------- @@ -2428,13 +2454,12 @@ File: gawk.info, Node: Character Lists, Next: GNU Regexp Operators, Prev: Reg Within a character list, a "range expression" consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, using the locale's collating sequence -and character set. 
For example, in the default C locale, `[a-dx-z]' is -equivalent to `[abcdxyz]'. Many locales sort characters in dictionary -order, and in these locales, `[a-dx-z]' is typically not equivalent to -`[abcdxyz]'; instead it might be equivalent to `[aBbCcDdxXyYz]', for -example. To obtain the traditional interpretation of bracket -expressions, you can use the C locale by setting the `LC_ALL' -environment variable to the value `C'. +and character set. For example, `[0-9]' is equivalent to +`[0123456789]'. + + Unfortunately, providing simple character ranges such as `[a-z]' +usually does not work like you might expect, due to locale-related +issues. This is discussed more fully, in *note Locales::. To include one of the characters `\', `]', `-', or `^' in a character list, put a `\' in front of it. For example: @@ -2449,9 +2474,9 @@ expressions in `awk' are a superset of the POSIX specification for Extended Regular Expressions (EREs). POSIX EREs are based on the regular expressions accepted by the traditional `egrep' utility. - "Character classes" are a new feature introduced in the POSIX -standard. A character class is a special notation for describing lists -of characters that have a specific attribute, but the actual characters + "Character classes" are a feature introduced in the POSIX standard. +A character class is a special notation for describing lists of +characters that have a specific attribute, but the actual characters can vary from country to country and/or from character set to character set. For example, the notion of what is an alphabetic character differs between the United States and France. @@ -2600,11 +2625,11 @@ No options `--traditional' Traditional Unix `awk' regexps are matched. The GNU operators are - not special, interval expressions are not available, nor are the - POSIX character classes (`[[:alnum:]]', etc.). Characters - described by octal and hexadecimal escape sequences are treated - literally, even if they represent regexp metacharacters. 
Also, - `gawk' silently skips directories named on the command line. + not special, and interval expressions are not available. The + POSIX character classes (`[[:alnum:]]', etc.) are supported, as + modern Unix `awk' does support them. Characters described by + octal and hexadecimal escape sequences are treated literally, even + if they represent regexp metacharacters. `--re-interval' Allow interval expressions in regexps, if `--traditional' has been @@ -2629,7 +2654,7 @@ read. There are two alternatives that you might prefer. One way to perform a case-insensitive match at a particular point in the program is to convert the data to a single case, using the -`tolower' or `toupper' built-in string functions (which we haven't +`tolower()' or `toupper()' built-in string functions (which we haven't discussed yet; *note String Functions::). For example: tolower($1) ~ /foo/ { ... } @@ -2655,7 +2680,7 @@ zero: case-insensitive and other rules case-sensitive, because there is no straightforward way to set `IGNORECASE' just for the pattern of a particular rule.(1) To do this, use either character lists or -`tolower'. However, one thing you can do with `IGNORECASE' only is +`tolower()'. However, one thing you can do with `IGNORECASE' only is dynamically turn case-sensitivity on or off for all the rules at once. `IGNORECASE' can be set on the command line or in a `BEGIN' rule @@ -2663,20 +2688,15 @@ dynamically turn case-sensitivity on or off for all the rules at once. `IGNORECASE' from the command line is a way to make a program case-insensitive without having to edit it. - Prior to `gawk' 3.0, the value of `IGNORECASE' affected regexp -operations only. It did not affect string comparison with `==', `!=', -and so on. Beginning with version 3.0, both regexp and string -comparison operations are also affected by `IGNORECASE'. + Both regexp and string comparison operations are affected by +`IGNORECASE'. 
- Beginning with `gawk' 3.0, the equivalences between upper- and -lowercase characters are based on the ISO-8859-1 (ISO Latin-1) -character set. This character set is a superset of the traditional 128 -ASCII characters, which also provides a number of characters suitable -for use with European languages. - - As of `gawk' 3.1.4, the case equivalences are fully locale-aware. -They are based on the C `<ctype.h>' facilities, such as `isalpha()' and -`toupper()'. + In multibyte locales, the equivalences between upper- and lowercase +characters are tested based on the wide-character values of the +locale's character set. Otherwise, the characters are tested based on +the ISO-8859-1 (ISO Latin-1) character set. This character set is a +superset of the traditional 128 ASCII characters, which also provides a +number of characters suitable for use with European languages.(2) The value of `IGNORECASE' has no effect if `gawk' is in compatibility mode (*note Options::). Case is always significant in @@ -2689,6 +2709,9 @@ using something like `IGNORECASE = 1 && /foObAr/ { ... }' and `IGNORECASE = 0 || /foobar/ { ... }'. However, this is somewhat obscure and we don't recommend it. + (2) If you don't understand this, don't worry about it; it just +means that `gawk' does the right thing. + File: gawk.info, Node: Leftmost Longest, Next: Computed Regexps, Prev: Case-sensitivity, Up: Regexp @@ -2699,9 +2722,9 @@ Consider the following: echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }' - This example uses the `sub' function (which we haven't discussed yet; -*note String Functions::) to make a change to the input record. Here, -the regexp `/a+/' indicates "one or more `a' characters," and the + This example uses the `sub()' function (which we haven't discussed +yet; *note String Functions::) to make a change to the input record. +Here, the regexp `/a+/' indicates "one or more `a' characters," and the replacement text is `<A>'. The input contains four `a' characters. 
`awk' (and POSIX) regular @@ -2713,11 +2736,11 @@ with `<A>' in this example: -| <A>bcd For simple match/no-match tests, this is not so important. But when -doing text matching and substitutions with the `match', `sub', `gsub', -and `gensub' functions, it is very important. *Note String Functions::, -for more information on these functions. Understanding this principle -is also important for regexp-based record and field splitting (*note -Records::, and also *note Field Separators::). +doing text matching and substitutions with the `match()', `sub()', +`gsub()', and `gensub()' functions, it is very important. *Note String +Functions::, for more information on these functions. Understanding +this principle is also important for regexp-based record and field +splitting (*note Records::, and also *note Field Separators::). File: gawk.info, Node: Computed Regexps, Next: Locales, Prev: Leftmost Longest, Up: Regexp @@ -2737,15 +2760,15 @@ that is computed in this way is called a "dynamic regexp": This sets `digits_regexp' to a regexp that describes one or more digits, and tests whether the input record matches this regexp. - *Caution:* When using the `~' and `!~' operators, there is a -difference between a regexp constant enclosed in slashes and a string -constant enclosed in double quotes. If you are going to use a string -constant, you have to understand that the string is, in essence, -scanned _twice_: the first time when `awk' reads your program, and the -second time when it goes to match the string on the lefthand side of -the operator with the pattern on the right. This is true of any -string-valued expression (such as `digits_regexp', shown previously), -not just string constants. + NOTE: When using the `~' and `!~' operators, there is a difference + between a regexp constant enclosed in slashes and a string + constant enclosed in double quotes. 
If you are going to use a + string constant, you have to understand that the string is, in + essence, scanned _twice_: the first time when `awk' reads your + program, and the second time when it goes to match the string on + the lefthand side of the operator with the pattern on the right. + This is true of any string-valued expression (such as + `digits_regexp', shown previously), not just string constants. What difference does it make if the string is scanned twice? The answer has to do with escape sequences, and particularly with @@ -2807,12 +2830,27 @@ File: gawk.info, Node: Locales, Prev: Computed Regexps, Up: Regexp Modern systems support the notion of "locales": a way to tell the system about the local character set and language. The current locale setting can affect the way regexp matching works, often in surprising -ways. In particular, many locales do case-insensitive matching, even -when you may have specified characters of only one particular case. +ways. + + For example, in the default C locale, `[a-dx-z]' is equivalent to +`[abcdxyz]'. Many locales sort characters in dictionary order, and in +these locales, `[a-dx-z]' is typically not equivalent to `[abcdxyz]'; +instead it might be equivalent to `[aBbCcdXxYyz]', for example. + + This point needs to be emphasized: Much literature teaches that one +should use `[a-z]' to match a lower case character. But on systems with +non-ASCII locales, this also matches all of the upper case characters +except `Z'! This is a continuous cause of confusion, even well into +the twenty-first century. - The following example uses the `sub' function, which does text -replacement (*note String Functions::). Here, the intent is to remove -trailing uppercase characters: + To obtain the traditional interpretation of bracket expressions, you +can use the C locale by setting the `LC_ALL' environment variable to the +value `C'. 
However, it is best to just use POSIX character classes, +such as `[[:lower:]]' to match specific classes of characters. + + To demonstrate these issues, the following example uses the `sub()' +function, which does text replacement (*note String Functions::). Here, +the intent is to remove trailing uppercase characters: $ echo something1234abc | gawk '{ sub("[A-Z]*$", ""); print }' -| something1234 @@ -2839,11 +2877,12 @@ such as `en_US.UTF-8'. (In general, such ranges should be avoided; either list the characters individually, or use a POSIX character class such as `[[:punct:]]'.) - For the normal case of `RS = "\n"', the locale is largely irrelevant. -For other single-character record separators, using `LC_ALL=C' will -give you much better performance when reading records. Otherwise, -`gawk' has to make several function calls, _per input character_ to -find the record terminator. + An additional factor relates to splitting records. For the normal +case of `RS = "\n"', the locale is largely irrelevant. For other +single-character record separators, using `LC_ALL=C' will give you much +better performance when reading records. Otherwise, `gawk' has to make +several function calls, _per input character_ to find the record +terminator. Finally, the locale affects the value of the decimal point character used when `gawk' parses input data. This is discussed in detail in @@ -2855,7 +2894,7 @@ File: gawk.info, Node: Reading Files, Next: Printing, Prev: Regexp, Up: Top 3 Reading Input Files ********************* -In the typical `awk' program, all input is read either from the +In the typical `awk' program, `awk' reads all input either from the standard input (by default, this is the keyboard, but often it is a pipe from another command) or from files whose names you specify on the `awk' command line. If you specify input files, `awk' reads them in @@ -2900,9 +2939,9 @@ The `awk' utility divides the input for your `awk' program into records and fields. 
`awk' keeps track of the number of records that have been read so far from the current input file. This value is stored in a built-in variable called `FNR'. It is reset to zero when a new file is -started. Another built-in variable, `NR', is the total number of input -records read so far from all data files. It starts at zero, but is -never automatically reset to zero. +started. Another built-in variable, `NR', records the total number of +input records read so far from all data files. It starts at zero, but +is never automatically reset to zero. Records are separated by a character called the "record separator". By default, the record separator is the newline character. This is why @@ -3003,16 +3042,18 @@ currently being processed, as well as records already processed, are not affected. After the end of the record has been determined, `gawk' sets the -variable `RT' to the text in the input that matched `RS'. When using -`gawk', the value of `RS' is not limited to a one-character string. It -can be any regular expression (*note Regexp::). In general, each record -ends at the next string that matches the regular expression; the next -record starts at the end of the matching string. This general rule is -actually at work in the usual case, where `RS' contains just a newline: -a record ends at the beginning of the next matching string (the next -newline in the input), and the following record starts just after the -end of this string (at the first character of the following line). The -newline, because it matches `RS', is not part of either record. +variable `RT' to the text in the input that matched `RS'. + + When using `gawk', the value of `RS' is not limited to a +one-character string. It can be any regular expression (*note +Regexp::). In general, each record ends at the next string that +matches the regular expression; the next record starts at the end of +the matching string. 
This general rule is actually at work in the +usual case, where `RS' contains just a newline: a record ends at the +beginning of the next matching string (the next newline in the input), +and the following record starts just after the end of this string (at +the first character of the following line). The newline, because it +matches `RS', is not part of either record. When `RS' is a single character, `RT' contains the same single character. However, when `RS' is a regular expression, `RT' contains @@ -3096,9 +3137,9 @@ File: gawk.info, Node: Fields, Next: Nonconstant Fields, Prev: Records, Up: ==================== When `awk' reads an input record, the record is automatically "parsed" -or separated by the interpreter into chunks called "fields". By +or separated by the `awk' utility into chunks called "fields". By default, fields are separated by "whitespace", like words in a line. -Whitespace in `awk' means any string of one or more spaces, tabs, or +Whitespace in `awk' means any string of one or more spaces, TABs, or newlines;(1) other characters, such as formfeed, vertical tab, etc. that are considered whitespace by other languages, are _not_ considered whitespace by `awk'. @@ -3313,9 +3354,8 @@ The intervening field, `$5', is created with an empty value (indicated by the second pair of adjacent colons), and `NF' is updated with the value six. - *FIXME:* Verify that this is in POSIX. Decrementing `NF' throws -away the values of the fields after the new value of `NF' and -recomputes `$0'. (d.c.) Here is an example: + Decrementing `NF' throws away the values of the fields after the new +value of `NF' and recomputes `$0'. (d.c.) Here is an example: $ echo a b c d e f | awk '{ print "NF =", NF; > NF = 3; print $0 }' @@ -3338,8 +3378,8 @@ as we've shown here. There is a flip side to the relationship between `$0' and the fields. Any assignment to `$0' causes the record to be reparsed into fields using the _current_ value of `FS'. 
This also applies to any -built-in function that updates `$0', such as `sub' and `gsub' (*note -String Functions::). +built-in function that updates `$0', such as `sub()' and `gsub()' +(*note String Functions::). File: gawk.info, Node: Field Separators, Next: Constant Size, Prev: Changing Fields, Up: Reading Files @@ -3349,6 +3389,7 @@ File: gawk.info, Node: Field Separators, Next: Constant Size, Prev: Changing * Menu: +* Default Field Splitting:: How fields are normally separated. * Regexp Field Splitting:: Using regexps as the field separator. * Single Character Fields:: Making each character a separate field. * Command Line Field Separator:: Setting `FS' from the command-line. @@ -3372,7 +3413,7 @@ leading spaces in the values of the second and third fields. The field separator is represented by the built-in variable `FS'. Shell programmers take note: `awk' does _not_ use the name `IFS' that is used by the POSIX-compliant shells (such as the Unix Bourne shell, -`sh', or `bash'). +`sh', or Bash). The value of `FS' can be changed in the `awk' program with the assignment operator, `=' (*note Assignment Ops::). Often the right @@ -3404,7 +3445,13 @@ characters carefully to prevent such problems. (If the data is not in a form that is easy to process, perhaps you can massage it first with a separate `awk' program.) - Fields are normally separated by whitespace sequences (spaces, TABs, + +File: gawk.info, Node: Default Field Splitting, Next: Regexp Field Splitting, Up: Field Separators + +3.5.1 Whitespace Normally Separates Fields +------------------------------------------ + +Fields are normally separated by whitespace sequences (spaces, TABs, and newlines), not by single spaces. Two spaces in a row do not delimit an empty field. The default value of the field separator `FS' is a string containing a single space, `" "'. If `awk' interpreted @@ -3422,9 +3469,9 @@ space character is the only single character that does not follow these rules. 
-File: gawk.info, Node: Regexp Field Splitting, Next: Single Character Fields, Up: Field Separators +File: gawk.info, Node: Regexp Field Splitting, Next: Single Character Fields, Prev: Default Field Splitting, Up: Field Separators -3.5.1 Using Regular Expressions to Separate Fields +3.5.2 Using Regular Expressions to Separate Fields -------------------------------------------------- The previous node discussed the use of single characters or simple @@ -3489,30 +3536,22 @@ versions answer this question differently, and you should not rely on any specific behavior in your programs. (d.c.) As a point of information, the Bell Labs `awk' allows `^' to match -only at the beginning of the record. Versions of `gawk' after 3.1.6 -also work this way. For example: - - $ echo 'xxAA xxBxx C' | - > nawk -F '(^x+)|( +)' '{ for (i = 1; i <= NF; i++) printf "-->%s<--\n", $i }' - -| --><-- - -| -->AA<-- - -| -->xxBxx<-- - -| -->C<-- +only at the beginning of the record. `gawk' also works this way. For +example: - $ echo 'xxAA xxBxx C' | - > gawk-3.1.6 -F '(^x+)|( +)' '{ for (i = 1; i <= NF; i++) printf "-->%s<--\n", $i }' + $ echo 'xxAA xxBxx C' | + > gawk -F '(^x+)|( +)' '{ for (i = 1; i <= NF; i++) + > printf "-->%s<--\n", $i }' -| --><-- -| -->AA<-- -| --><-- -| -->Bxx<-- -| -->C<-- -As mentioned, `gawk' now behaves like the Bell Labs `awk'. - File: gawk.info, Node: Single Character Fields, Next: Command Line Field Separator, Prev: Regexp Field Splitting, Up: Field Separators -3.5.2 Making Each Character a Separate Field +3.5.3 Making Each Character a Separate Field -------------------------------------------- There are times when you may want to examine each character of a record @@ -3538,7 +3577,7 @@ way. 
File: gawk.info, Node: Command Line Field Separator, Next: Field Splitting Summary, Prev: Single Character Fields, Up: Field Separators -3.5.3 Setting `FS' from the Command Line +3.5.4 Setting `FS' from the Command Line ---------------------------------------- `FS' can be set on the command line. Use the `-F' option to do so. @@ -3625,7 +3664,7 @@ the entries for users who have no password: File: gawk.info, Node: Field Splitting Summary, Prev: Command Line Field Separator, Up: Field Separators -3.5.4 Field-Splitting Summary +3.5.5 Field-Splitting Summary ----------------------------- It is important to remember that when you assign a string constant as @@ -3719,18 +3758,18 @@ File: gawk.info, Node: Constant Size, Next: Splitting By Content, Prev: Field (This minor node discusses an advanced feature of `awk'. If you are a novice `awk' user, you might want to skip it on the first reading.) -`gawk' version 2.13 introduced a facility for dealing with fixed-width -fields with no distinctive field separator. For example, data of this -nature arises in the input for old Fortran programs where numbers are -run together, or in the output of programs that did not anticipate the -use of their output as input for other programs. +`gawk' provides a facility for dealing with fixed-width fields with no +distinctive field separator. For example, data of this nature arises +in the input for old Fortran programs where numbers are run together, +or in the output of programs that did not anticipate the use of their +output as input for other programs. An example of the latter is a table where all the columns are lined up by the use of a variable number of spaces and _empty fields are just spaces_. Clearly, `awk''s normal field splitting based on `FS' does not work well in this case. Although a portable `awk' program can use -a series of `substr' calls on `$0' (*note String Functions::), this is -awkward and inefficient for a large number of fields. 
+a series of `substr()' calls on `$0' (*note String Functions::), this +is awkward and inefficient for a large number of fields. The splitting of an input record into fixed-width fields is specified by assigning a string containing space-separated numbers to @@ -3825,11 +3864,11 @@ novice `awk' user, you might want to skip it on the first reading.) Normally, when using `FS', `gawk' defines the fields as the parts of the record that occur in between each field separator. In other words, -`FS' defines what a field _is not_, and not what a field _is_. +`FS' defines what a field _is not_, instead of what a field _is_. However, there are times when you really want to define the fields by what they are, and not by what they are not. - The most notorious such case is so-called Comma-Separated-Value + The most notorious such case is so-called "comma separated value" (CSV) data. Many spreadsheet programs, for example, can export their data into text files, where each record is terminated with a newline, and fields are separated by commas. If only commas separated the data, @@ -3898,6 +3937,15 @@ affects field splitting with `FPAT'. provides an elegant solution for the majority of cases, and the `gawk' maintainer is satisfied with that. + As written, the regexp used for `FPAT' requires that each field +have at least one character. A straightforward modification (changing +the first `+' to `*') allows fields to be empty: + + FPAT = "([^,]*)|(\"[^\"]+\")" + + Finally, the `patsplit()' function makes the same functionality +available for splitting regular strings (*note String Functions::). + ---------- Footnotes ---------- (1) At least, we don't know of one. @@ -3955,7 +4003,7 @@ field separations result from `FS'.(1) provide useful behavior in the default case (i.e., `FS' is equal to `" "'). This feature can be a problem if you really don't want the newline character to separate fields, because there is no way to -prevent it. 
However, you can work around this by using the `split'
+prevent it. However, you can work around this by using the `split()'
 function to break up the record manually (*note String Functions::). If you have a single character field separator, you can work around the special feature in a different way, by making `FS' into a regexp for
@@ -4068,8 +4116,8 @@ describing the error that occurred. In the following examples, COMMAND
 stands for a string value that represents a shell command.
- NOTE: When `--sandbox' is specified, reading lines from files,
- pipes and coprocesses is disabled.
+ NOTE: When `--sandbox' is specified (*note Options::), reading
+ lines from files, pipes and coprocesses is disabled.
 * Menu:
@@ -4115,7 +4163,7 @@ processing on the next record _right now_. For example:
 u = index($0, "*/") offset = 0 }
- # substr expression will be "" if */
+ # substr() expression will be "" if */
 # occurred at end of line $0 = tmp substr($0, offset + u + 2) }
@@ -4211,7 +4259,7 @@ EXPRESSION contains unparenthesized operators other than `$'; for
 example, `getline < dir "/" file' is ambiguous because the concatenation operator is not parenthesized. You should write it as `getline < (dir "/" file)' if you want your program to be portable to
-other `awk' implementations.
+all `awk' implementations.
 File: gawk.info, Node: Getline/Variable/File, Next: Getline/Pipe, Prev: Getline/File, Up: Getline
@@ -4243,7 +4291,7 @@ file FILENAME:
 program; it is taken directly from the data, specifically from the second field on the `@include' line.
- The `close' function is called to ensure that if two identical
+ The `close()' function is called to ensure that if two identical
 `@include' lines appear in the input, the entire specified file is included twice. *Note Close Files And Pipes::.
@@ -4273,7 +4321,7 @@ produced by running the rest of the line as a shell command:
 { if ($1 == "@execute") {
- tmp = substr($0, 10)
+ tmp = substr($0, 10) # Remove "@execute"
 while ((tmp | getline) > 0) print close(tmp)
@@ -4281,7 +4329,7 @@ produced by running the rest of the line as a shell command:
 print }
-The `close' function is called to ensure that if two identical
+The `close()' function is called to ensure that if two identical
 `@execute' lines appear in the input, the command is run for each one. *Note Close Files And Pipes::. Given the input:
@@ -4314,16 +4362,15 @@ EXPRESSION contains unparenthesized operators other than `$'--for
 example, `"echo " "date" | getline' is ambiguous because the concatenation operator is not parenthesized. You should write it as `("echo " "date") | getline' if you want your program to be portable to
-other `awk' implementations.
+all `awk' implementations.
 NOTE: Unfortunately, `gawk' has not been consistent in its
- treatment of a construct like `"echo " "date" | getline'. Up to
- and including version 3.1.1 of `gawk', it was treated as `("echo "
- "date") | getline'. (This how Unix `awk' behaves.) From 3.1.2
- through 3.1.5, it was treated as `"echo " ("date" | getline)'.
- (This is how `mawk' behaves.) Starting with version 3.1.6, the
- earlier behavior was reinstated. In short, _always_ use explicit
- parentheses, and then you won't have to worry.
+ treatment of a construct like `"echo " "date" | getline'. Most
+ versions, including the current version, treat it at as `("echo "
+ "date") | getline'. (This how Unix `awk' behaves.) Some versions
+ changed and treated it as `"echo " ("date" | getline)'. (This is
+ how `mawk' behaves.) In short, _always_ use explicit parentheses,
+ and then you won't have to worry.
 File: gawk.info, Node: Getline/Variable/Pipe, Next: Getline/Coprocess, Prev: Getline/Pipe, Up: Getline
@@ -4434,6 +4481,12 @@ in mind:
 probably by accident, and you should reconsider what it is you're trying to accomplish.
+ * *note Getline Summary::, presents a table summarizing the
+ `getline' variants and which variables they can affect. It is
+ worth noting that those variants which do not use redirection can
+ cause `FILENAME' to be updated if they cause `awk' to start
+ reading a new input file.
+
 File: gawk.info, Node: Getline Summary, Prev: Getline Notes, Up: Getline
@@ -4443,16 +4496,19 @@ File: gawk.info, Node: Getline Summary, Prev: Getline Notes, Up: Getline
 *note table-getline-variants:: summarizes the eight variants of `getline', listing which built-in variables are set by each one.
-Variant Effect
+Variant Effect Standad /
+ Extenstion
 --------------------------------------------------------------------------
-`getline' Sets `$0', `NF', `FNR', and `NR'
-`getline' VAR Sets VAR, `FNR', and `NR'
-`getline <' FILE Sets `$0' and `NF'
-`getline VAR < FILE' Sets VAR
-COMMAND `| getline' Sets `$0' and `NF'
-COMMAND `| getline' VAR Sets VAR
-COMMAND `|& getline' Sets `$0' and `NF'. This is a `gawk' extension
-COMMAND `|& getline' VAR Sets VAR. This is a `gawk' extension
+`getline' Sets `$0', `NF', `FNR', and Standard
+ `NR'
+`getline' VAR Sets VAR, `FNR', and `NR' Standard
+`getline <' FILE Sets `$0' and `NF' Standard
+`getline VAR < FILE' Sets VAR Standard
+COMMAND `| getline' Sets `$0' and `NF' Standard
+COMMAND `| getline' VAR Sets VAR Standard
+COMMAND `|& getline' Sets `$0' and `NF' Extension
+COMMAND `|& getline' Sets VAR Extension
+VAR
 Table 3.1: getline Variants and What They Set
@@ -4462,10 +4518,7 @@ File: gawk.info, Node: BEGINFILE/ENDFILE, Next: Command line directories, Pre
 3.10 The `BEGINFILE' and `ENDFILE' Special Patterns
 ===================================================
-*FIXME:* Get the version right.
-
- NOTE: This minor node describes a `gawk'-specific feature added in
- `gawk' 3.X.
+ NOTE: This minor node describes a `gawk'-specific feature.
 Two special kinds of rule, `BEGINFILE' and `ENDFILE', give you "hooks" into `gawk''s command-line file processing loop. As with the
@@ -4489,7 +4542,7 @@ would otherwise be difficult or impossible to perform:
 string; if so, then `gawk' was not able to open the file. In this case, your program can execute the `nextfile' statement (*note Nextfile Statement::). This casuses `gawk' to skip the file
- entirely. Otherwise, `gawk' will exit with the usual fatal error.
+ entirely. Otherwise, `gawk' exits with the usual fatal error.
 2. If you have written extensions that modify the record handling (by inserting an "open hook"), you can invoke them at this point,
@@ -4498,17 +4551,19 @@ would otherwise be difficult or impossible to perform:
 (http://xgawk.sourceforge.net).) The `ENDFILE' rule is called when `gawk' has finished processing the
-last record in an input file. It will be called before any `END' rules.
+last record in an input file. For the last input file, it will be
+called before any `END' rules.
- Normally, when an error occurs when reading input in the normal
-input processing loop, the error is fatal. However, if an `ENDFILE'
-rule is present, the error becomes non-fatal, and instead `ERRNO' is
-set. This makes it possible to catch and process I/O errors at the
-level of the `awk' program.
+ Normally, when an error occurs when reading input in the normal input
+processing loop, the error is fatal. However, if an `ENDFILE' rule is
+present, the error becomes non-fatal, and instead `ERRNO' is set. This
+makes it possible to catch and process I/O errors at the level of the
+`awk' program.
- The `next' statement is not allowed inside either a `BEGINFILE' or
-and `ENDFILE' rule. The `nextfile' statement is allowed only inside a
-`BEGINFILE' rule, but not inside an `ENDFILE' rule.
+ The `next' statement (*note Next Statement::) is not allowed inside
+either a `BEGINFILE' or and `ENDFILE' rule. The `nextfile' statement
+(*note Nextfile Statement::) is allowed only inside a `BEGINFILE' rule,
+but not inside an `ENDFILE' rule.
 The `getline' statement (*note Getline::) is restricted inside both `BEGINFILE' and `ENDFILE'. Only the `getline VARIABLE < FILE' form is
@@ -4524,15 +4579,15 @@ File: gawk.info, Node: Command line directories, Prev: BEGINFILE/ENDFILE, Up:
 3.11 Directories On The Command Line
 ====================================
-According to POSIX, files named on the `awk' command line must be text
-files. The behavior is "undefined" if they are not. Most versions of
-`awk' treat a directory on the command line as a fatal error.
+According to the POSIX standard, files named on the `awk' command line
+must be text files. It is a fatal error if they are not. Most
+versions of `awk' treat a directory on the command line as a fatal
+error.
- *FIXME:* Get the version right. Starting with version 3.x of
-`gawk', a directory on the command line produces a warning, but is
-otherwise skipped. If either of the `--posix' or `--traditional'
-options is given, then `gawk' reverts to treating directories on the
-command line as a fatal error.
+ By default, `gawk' produces a warning for a directory on the command
+line, but otherwise ignores it. If either of the `--posix' or
+`--traditional' options is given, then `gawk' reverts to treating a
+directory on the command line as a fatal error.
 File: gawk.info, Node: Printing, Next: Expressions, Prev: Reading Files, Up: Top
@@ -4552,7 +4607,7 @@ statement (*note Printf::).
 Besides basic and formatted printing, this major node also covers I/O redirections to files and pipes, introduces the special file names
-that `gawk' processes internally, and discusses the `close' built-in
+that `gawk' processes internally, and discusses the `close()' built-in
 function.
 * Menu:
@@ -4575,7 +4630,7 @@ File: gawk.info, Node: Print, Next: Print Examples, Up: Printing
 4.1 The `print' Statement
 =========================
-The `print' statement is used to produce output with simple,
+The `print' statement is used for producing output with simple,
 standardized formatting. Specify only the strings or numbers to print, in a list separated by commas. They are output, separated by single spaces, followed by a newline. The statement looks like this:
@@ -4584,8 +4639,8 @@ spaces, followed by a newline. The statement looks like this:
 The entire list of items may be optionally enclosed in parentheses. The parentheses are necessary if any of the item expressions uses the `>'
-relational operator; otherwise it could be confused with a redirection
-(*note Redirection::).
+relational operator; otherwise it could be confused with an output
+redirection (*note Redirection::).
 The items to print can be constant strings or numbers, fields of the current record (such as `$1'), variables, or any `awk' expression.
@@ -4602,12 +4657,12 @@ that a space is printed between any two items.
 File: gawk.info, Node: Print Examples, Next: Output Separators, Prev: Print, Up: Printing
-4.2 Examples of `print' Statements
-==================================
+4.2 `print' Statement Examples
+==============================
 Each `print' statement makes at least one line of output. However, it
-isn't limited to only one line. If an item value is a string that
-contains a newline, the newline is output along with the rest of the
+isn't limited to only one line. If an item value is a string
+containing a newline, the newline is output along with the rest of the
 string. A single `print' statement can make any number of lines this way.
@@ -4718,7 +4773,7 @@ semicolon, with a blank line added after each newline:
 ... If the value of `ORS' does not contain a newline, the program's
-output is run together on a single line.
+output runs together on a single line.
 File: gawk.info, Node: OFMT, Next: Printf, Prev: Output Separators, Up: Printing
@@ -4726,21 +4781,21 @@ File: gawk.info, Node: OFMT, Next: Printf, Prev: Output Separators, Up: Prin
 4.4 Controlling Numeric Output with `print'
 ===========================================
-When the `print' statement is used to print numeric values, `awk'
+When printing numeric values with the `print' statement, `awk'
 internally converts the number to a string of characters and prints
-that string. `awk' uses the `sprintf' function to do this conversion
+that string. `awk' uses the `sprintf()' function to do this conversion
 (*note String Functions::). For now, it suffices to say that the
-`sprintf' function accepts a "format specification" that tells it how
+`sprintf()' function accepts a "format specification" that tells it how
 to format numbers (or strings), and that there are a number of different ways in which numbers can be formatted. The different format specifications are discussed more fully in *note Control Letters::. The built-in variable `OFMT' contains the default format
-specification that `print' uses with `sprintf' when it wants to convert
-a number to a string for printing. The default value of `OFMT' is
-`"%.6g"'. The way `print' prints numbers can be changed by supplying
-different format specifications as the value of `OFMT', as shown in the
-following example:
+specification that `print' uses with `sprintf()' when it wants to
+convert a number to a string for printing. The default value of `OFMT'
+is `"%.6g"'.
The way `print' prints numbers can be changed by
+supplying different format specifications as the value of `OFMT', as
+shown in the following example:
 $ awk 'BEGIN { > OFMT = "%.0f" # print numbers as integers (rounds)
@@ -4757,13 +4812,13 @@ File: gawk.info, Node: Printf, Next: Redirection, Prev: OFMT, Up: Printing
 4.5 Using `printf' Statements for Fancier Printing
 ==================================================
-For more precise control over the output format than what is normally
-provided by `print', use `printf'. `printf' can be used to specify the
-width to use for each item, as well as various formatting choices for
-numbers (such as what output base to use, whether to print an exponent,
-whether to print a sign, and how many digits to print after the decimal
-point). This is done by supplying a string, called the "format
-string", that controls how and where to print the other arguments.
+For more precise control over the output format than what is provided
+by `print', use `printf'. With `printf' you can specify the width to
+use for each item, as well as various formatting choices for numbers
+(such as what output base to use, whether to print an exponent, whether
+to print a sign, and how many digits to print after the decimal point).
+You do this by supplying a string, called the "format string", that
+controls how and where to print the other arguments.
 * Menu:
@@ -4784,8 +4839,8 @@ A simple `printf' statement looks like this:
 The entire list of arguments may optionally be enclosed in parentheses. The parentheses are necessary if any of the item expressions use the `>'
-relational operator; otherwise, it can be confused with a redirection
-(*note Redirection::).
+relational operator; otherwise, it can be confused with an output
+redirection (*note Redirection::).
 The difference between `printf' and `print' is the FORMAT argument. This is an expression whose value is taken as a string; it specifies
@@ -4811,7 +4866,7 @@ statements.
For example:
 > }' -| Dont Panic!
-Here, neither the `+' nor the `OUCH' appear when the message is printed.
+Here, neither the `+' nor the `OUCH' appear in the output message.
 File: gawk.info, Node: Control Letters, Next: Format Modifiers, Prev: Basic Printf, Up: Printf
@@ -4827,28 +4882,28 @@ print. The rest of the format specifier is made up of optional
 width. Here is a list of the format-control letters: `%c'
- This prints a number as an ASCII character; thus, `printf "%c",
- 65' outputs the letter `A'. (The output for a string value is the
- first character of the string.)
-
- NOTE: The `%c' format does _not_ handle values outside the
- range 0-255. On most systems, values from 0-127 are within
- the range of ASCII and will yield an ASCII character. Values
- in the range 128-255 may format as characters in some
- extended character set, or they may not. System 390 (IBM
- architecture mainframe) systems use 8-bit characters, and
- thus values from 0-255 yield the corresponding EBCDIC
- character. Any value above 255 is treated as modulo 255;
- i.e., the lowest eight bits of the value are used. The
- locale and character set are always ignored.
+ Print a number as an ASCII character; thus, `printf "%c", 65'
+ outputs the letter `A'. The output for a string value is the first
+ character of the string.
+
+ NOTE: The POSIX standard says the first character of a string
+ is printed. In locales with multibyte characters, `gawk'
+ attempts to convert the leading bytes of the string into a
+ valid wide character and then to print the multibyte encoding
+ of that character. Similarly, when printing a numeric value,
+ `gawk' allows the value to be within the numeric range of
+ values that can be held in a wide character.
+
+ Other `awk' versions generally restrict themselves to printing
+ the first byte of a string or to numeric values within the
+ range of a single byte (0-255).
 `%d, %i'
- These are equivalent; they both print a decimal integer.
(The
- `%i' specification is for compatibility with ISO C.)
+ Print a decimal integer. The two control letters are equivalent.
+ (The `%i' specification is for compatibility with ISO C.)
 `%e, %E'
- These print a number in scientific (exponential) notation; for
- example:
+ Print a number in scientific (exponential) notation; for example:
 printf "%4.3e\n", 1950
@@ -4858,7 +4913,7 @@ width. Here is a list of the format-control letters:
 of `e' in the output. `%f'
- This prints a number in floating-point notation. For example:
+ Print a number in floating-point notation. For example:
 printf "%4.3f", 1950
@@ -4879,30 +4934,28 @@ width. Here is a list of the format-control letters:
 support it. On those that don't, `gawk' uses `%f' instead. `%g, %G'
- These print a number in either scientific notation or in
- floating-point notation, whichever uses fewer characters; if the
- result is printed in scientific notation, `%G' uses `E' instead of
- `e'.
+ Print a number in either scientific notation or in floating-point
+ notation, whichever uses fewer characters; if the result is
+ printed in scientific notation, `%G' uses `E' instead of `e'.
 `%o'
- This prints an unsigned octal integer.
+ Print an unsigned octal integer.
 `%s'
- This prints a string.
+ Print a string.
 `%u'
- This prints an unsigned decimal integer. (This format is of
- marginal use, because all numbers in `awk' are floating-point; it
- is provided primarily for compatibility with C.)
+ Print an unsigned decimal integer. (This format is of marginal
+ use, because all numbers in `awk' are floating-point; it is
+ provided primarily for compatibility with C.)
 `%x, %X'
- These print an unsigned hexadecimal integer; `%X' uses the letters
- `A' through `F' instead of `a' through `f'.
+ Print an unsigned hexadecimal integer; `%X' uses the letters `A'
+ through `F' instead of `a' through `f'.
 `%%'
- This isn't a format-control letter, but it does have meaning--the
- sequence `%%' outputs one `%'; it does not consume an argument and
- it ignores any modifiers.
+ Print a single `%'. This does not consume an argument and it
+ ignores any modifiers.
 NOTE: When using the integer format-control letters for values that are outside the range of the widest C integer type, `gawk'
@@ -4966,7 +5019,7 @@ which they may appear:
 `#' Use an "alternate form" for certain control letters. For `%o', supply a leading zero. For `%x' and `%X', supply a leading `0x'
- or `0X' for a nonzero result. For `%e', `%E', and `%f', the
+ or `0X' for a nonzero result. For `%e', `%E', `%f', and `%F', the
 result always contains a decimal point. For `%g' and `%G', trailing zeros are not removed from the result.
@@ -5024,15 +5077,15 @@ which they may appear:
 to use when printing. The meaning of the precision varies by control letter:
- `%e', `%E', `%f'
+ `%d', `%i', `%o', `%u', `%x', `%X'
+ Minimum number of digits to print.
+
+ `%e', `%E', `%f', `%F'
 Number of digits to the right of the decimal point. `%g', `%G' Maximum number of significant digits.
- `%d', `%i', `%o', `%u', `%x', `%X'
- Minimum number of digits to print.
-
 `%s' Maximum number of characters from the string that should print.
@@ -5072,10 +5125,9 @@ This is not particularly easy to read but it does work.
 C programmers may be used to supplying additional `l', `L', and `h' modifiers in `printf' format strings. These are not valid in `awk'.
-Most `awk' implementations silently ignore these modifiers. If
-`--lint' is provided on the command line (*note Options::), `gawk'
-warns about their use. If `--posix' is supplied, their use is a fatal
-error.
+Most `awk' implementations silently ignore them. If `--lint' is
+provided on the command line (*note Options::), `gawk' warns about
+their use. If `--posix' is supplied, their use is a fatal error.
 File: gawk.info, Node: Printf Examples, Prev: Format Modifiers, Up: Printf
@@ -5083,7 +5135,7 @@ File: gawk.info, Node: Printf Examples, Prev: Format Modifiers, Up: Printf
 4.5.4 Examples Using `printf'
 -----------------------------
-The following is a simple example of how to use `printf' to make an
+The following simple example shows how to use `printf' to make an
 aligned table: awk '{ printf "%-10s %s\n", $1, $2 }' BBS-list
@@ -5124,7 +5176,7 @@ beginning of the `awk' program:
 print "---- ------" } { printf "%-10s %s\n", $1, $2 }' BBS-list
- The above example mixed `print' and `printf' statements in the same
+ The above example mixes `print' and `printf' statements in the same
 program. Using just `printf' statements can produce the same results: awk 'BEGIN { printf "%-10s %s\n", "Name", "Number"
@@ -5155,11 +5207,11 @@ File: gawk.info, Node: Redirection, Next: Special Files, Prev: Printf, Up: P
 ==============================================
 So far, the output from `print' and `printf' has gone to the standard
-output, usually the terminal. Both `print' and `printf' can also send
+output, usually the screen. Both `print' and `printf' can also send
 their output to other places. This is called "redirection".
- NOTE: When `--sandbox' is specified, redirecting output to files
- and pipes is disabled.
+ NOTE: When `--sandbox' is specified (*note Options::), redirecting
+ output to files and pipes is disabled.
 A redirection appears after the `print' or `printf' statement. Redirections in `awk' are written just like redirections in shell
@@ -5171,10 +5223,10 @@ to a coprocess. They are all shown for the `print' statement, but they
 work identically for `printf': `print ITEMS > OUTPUT-FILE'
- This type of redirection prints the items into the output file
- named OUTPUT-FILE. The file name OUTPUT-FILE can be any
- expression. Its value is changed to a string and then used as a
- file name (*note Expressions::).
+ This redirection prints the items into the output file named
+ OUTPUT-FILE. The file name OUTPUT-FILE can be any expression.
+ Its value is changed to a string and then used as a file name
+ (*note Expressions::).
 When this type of redirection is used, the OUTPUT-FILE is erased before the first output is written to it. Subsequent writes to
@@ -5199,17 +5251,17 @@ work identically for `printf':
 Each output file contains one name or number per line. `print ITEMS >> OUTPUT-FILE'
- This type of redirection prints the items into the pre-existing
- output file named OUTPUT-FILE. The difference between this and the
- single-`>' redirection is that the old contents (if any) of
- OUTPUT-FILE are not erased. Instead, the `awk' output is appended
- to the file. If OUTPUT-FILE does not exist, then it is created.
+ This redirection prints the items into the pre-existing output file
+ named OUTPUT-FILE. The difference between this and the single-`>'
+ redirection is that the old contents (if any) of OUTPUT-FILE are
+ not erased. Instead, the `awk' output is appended to the file.
+ If OUTPUT-FILE does not exist, then it is created.
 `print ITEMS | COMMAND'
- It is also possible to send output to another program through a
- pipe instead of into a file. This type of redirection opens a
- pipe to COMMAND, and writes the values of ITEMS through this pipe
- to another process created to execute COMMAND.
+ It is possible to send output to another program through a pipe
+ instead of into a file. This redirection opens a pipe to
+ COMMAND, and writes the values of ITEMS through this pipe to
+ another process created to execute COMMAND.
 The redirection argument COMMAND is actually an `awk' expression. Its value is converted to a string whose contents give the shell
@@ -5240,7 +5292,7 @@ work identically for `printf':
 program. (The parentheses group the items to concatenate--see *note Concatenation::.)
- The `close' function is called here because it's a good idea to
+ The `close()' function is called here because it's a good idea to
 close the pipe as soon as all the intended output has been sent to it. *Note Close Files And Pipes::, for more information.
@@ -5251,11 +5303,11 @@ work identically for `printf':
 that the string value be spelled identically every time. `print ITEMS |& COMMAND'
- This type of redirection prints the items to the input of COMMAND.
- The difference between this and the single-`|' redirection is that
- the output from COMMAND can be read with `getline'. Thus COMMAND
- is a "coprocess", which works together with, but subsidiary to,
- the `awk' program.
+ This redirection prints the items to the input of COMMAND. The
+ difference between this and the single-`|' redirection is that the
+ output from COMMAND can be read with `getline'. Thus COMMAND is a
+ "coprocess", which works together with, but subsidiary to, the
+ `awk' program.
 This feature is a `gawk' extension, and is not available in POSIX `awk'. *Note Getline/Coprocess::, for a brief discussion. *Note
@@ -5301,7 +5353,7 @@ lowercase. The following program is both simple and efficient:
 END { close("sh") }
- The `tolower' function returns its argument string with all
+ The `tolower()' function returns its argument string with all
 uppercase characters converted to lowercase (*note String Functions::). The program builds up a list of command lines, using the `mv' utility to rename the files. It then sends the list to the shell for execution.
@@ -5314,7 +5366,7 @@ File: gawk.info, Node: Special Files, Next: Close Files And Pipes, Prev: Redi
 `gawk' provides a number of special file names that it interprets internally. These file names provide access to standard file
-descriptors, process-related information, and TCP/IP networking.
+descriptors and TCP/IP networking.
 * Menu:
@@ -5331,7 +5383,7 @@ File: gawk.info, Node: Special FD, Next: Special Network, Up: Special Files
 Running programs conventionally have three input and output streams already available to them for reading and writing. These are known as the "standard input", "standard output", and "standard error output".
-These streams are, by default, connected to your terminal, but they are
+These streams are, by default, connected to your screen, but they are
 often redirected with the shell, via the `<', `<<', `>', `>>', `>&', and `|' operators. Standard error is typically used for writing error messages; the reason there are two separate streams, standard output
@@ -5346,13 +5398,16 @@ This works by opening a pipeline to a shell command that can access the
 standard error stream that it inherits from the `awk' process. This is far from elegant, and it is also inefficient, because it requires a separate process. So people writing `awk' programs often don't do
-this. Instead, they send the error messages to the terminal, like this:
+this. Instead, they send the error messages to the screen, like this:
 print "Serious error detected!" > "/dev/tty"
+(`/dev/tty' is a special file supplied by the operating system that is
+connected to your keyboard and screen. It represents the "terminal,"(1)
+which on modern systems is a keyboard and screen, not a serial console.)
 This usually has the same effect but not always: although the standard
-error stream is usually the terminal, it can be redirected; when that
-happens, writing to the terminal is not correct. In fact, if `awk' is
+error stream is usually the screen, it can be redirected; when that
+happens, writing to the screen is not correct. In fact, if `awk' is
 run from a background job, it may not have a terminal at all. Then opening `/dev/tty' fails.
@@ -5389,10 +5444,17 @@ error message in a `gawk' program is to use `/dev/stderr', like this:
 redirection, the value must be a string.
 It is a common error to omit the quotes, which leads to confusing results.
- Finally, usng the `close' function on a file name of the form
+ Finally, usng the `close()' function on a file name of the form
 `"/dev/fd/N"', for file descriptor numbers above two, will actually close the given file descriptor.
+ The `/dev/stdin', `/dev/stdout', and `/dev/stderr' special files are
+also recognized internally by several other versions of `awk'.
+
+ ---------- Footnotes ----------
+
+ (1) The "tty" in `/dev/tty' stands for "Teletype," a serial terminal.
+
 File: gawk.info, Node: Special Network, Next: Special Caveats, Prev: Special FD, Up: Special Files
@@ -5424,28 +5486,13 @@ names that `gawk' provides:
 * Recognition of these special file names is disabled if `gawk' is in compatibility mode (*note Options::).
- * The special files that provide process-related information are now
- considered obsolete and will disappear entirely in the next
- release of `gawk'. `gawk' prints a warning message every time you
- use one of these files. To obtain process-related information,
- use the `PROCINFO' array. *Note Built-in Variables::.
-
- * Starting with version 3.1, `gawk' _always_ interprets these
- special file names.(1) For example, using `/dev/fd/4' for output
- actually writes on file descriptor 4, and not on a new file
- descriptor that is `dup''ed from file descriptor 4. Most of the
- time this does not matter; however, it is important to _not_ close
- any of the files related to file descriptors 0, 1, and 2. Doing
- so results in unpredictable behavior.
-
- ---------- Footnotes ----------
-
- (1) Older versions of `gawk' would interpret these names internally
-only if the system did not actually have a `/dev/fd' directory or any
-of the other special files listed earlier. Usually this didn't make a
-difference, but sometimes it did; thus, it was decided to make `gawk''s
-behavior consistent on all systems and to have it always interpret the
-special file names itself.
+ * `gawk' _always_ interprets these special file names. For example,
+ using `/dev/fd/4' for output actually writes on file descriptor 4,
+ and not on a new file descriptor that is `dup''ed from file
+ descriptor 4. Most of the time this does not matter; however, it
+ is important to _not_ close any of the files related to file
+ descriptors 0, 1, and 2. Doing so results in unpredictable
+ behavior.
 File: gawk.info, Node: Close Files And Pipes, Prev: Special Files, Up: Printing
@@ -5460,14 +5507,14 @@ time only. At that time, the first record of input is read from that
 file or command. The next time the same file or command is used with `getline', another record is read from it, and so on.
- Similarly, when a file or pipe is opened for output, the file name or
-command associated with it is remembered by `awk', and subsequent
-writes to the same file or command are appended to the previous writes.
-The file or pipe stays open until `awk' exits.
+ Similarly, when a file or pipe is opened for output, `awk' remembers
+the file name or command associated with it, and subsequent writes to
+the same file or command are appended to the previous writes. The file
+or pipe stays open until `awk' exits.
 This implies that special steps are necessary in order to read the same file again from the beginning, or to rerun a shell command (rather
-than reading more output from the same command). The `close' function
+than reading more output from the same command). The `close()' function
 makes these things possible: close(FILENAME)
@@ -5532,7 +5579,7 @@ programs. Here are some of the reasons for closing an output file:
 `gawk' attempts to multiplex the available open files among your data files. `gawk''s ability to do this depends upon the facilities of your operating system, so it may not always work.
It is therefore both good -practice and good portability advice to always use `close' on your +practice and good portability advice to always use `close()' on your files when you are done with them. In fact, if you are using a lot of pipes, it is essential that you close commands when done. For example, consider something like this: @@ -5547,17 +5594,18 @@ consider something like this: } This example creates a new pipeline based on data in _each_ record. -Without the call to `close' indicated in the comment, `awk' creates +Without the call to `close()' indicated in the comment, `awk' creates child processes to run the commands, until it eventually runs out of file descriptors for more pipelines. Even though each command has finished (as indicated by the end-of-file return status from `getline'), the child process is not terminated;(1) more importantly, the file descriptor for the pipe is -not closed and released until `close' is called or `awk' exits. +not closed and released until `close()' is called or `awk' exits. - `close' will silently do nothing if given an argument that does not -represent a file, pipe or coprocess that was opened with a redirection. + `close()' will silently do nothing if given an argument that does +not represent a file, pipe or coprocess that was opened with a +redirection. Note also that `close(FILENAME)' has no "magic" effects on the implicit loop that reads through the files named on the command line. @@ -5567,38 +5615,38 @@ silently does nothing. When using the `|&' operator to communicate with a coprocess, it is occasionally useful to be able to close one end of the two-way pipe without closing the other. This is done by supplying a second argument -to `close'. As in any other call to `close', the first argument is the -name of the command or special file used to start the coprocess. The -second argument should be a string, with either of the values `"to"' or -`"from"'. Case does not matter. 
As this is an advanced feature, a -more complete discussion is delayed until *note Two-way I/O::, which -discusses it in more detail and gives an example. - -Advanced Notes: Using `close''s Return Value --------------------------------------------- +to `close()'. As in any other call to `close()', the first argument is +the name of the command or special file used to start the coprocess. +The second argument should be a string, with either of the values +`"to"' or `"from"'. Case does not matter. As this is an advanced +feature, a more complete discussion is delayed until *note Two-way +I/O::, which discusses it in more detail and gives an example. + +Advanced Notes: Using `close()''s Return Value +---------------------------------------------- -In many versions of Unix `awk', the `close' function is actually a +In many versions of Unix `awk', the `close()' function is actually a statement. It is a syntax error to try and use the return value from -`close': (d.c.) +`close()': (d.c.) command = "..." command | getline info - retval = close(command) # syntax error in most Unix awks + retval = close(command) # syntax error in many Unix awks - `gawk' treats `close' as a function. The return value is -1 if the -argument names something that was never opened with a redirection, or -if there is a system problem closing the file or process. In these + `gawk' treats `close()' as a function. The return value is -1 if +the argument names something that was never opened with a redirection, +or if there is a system problem closing the file or process. In these cases, `gawk' sets the built-in variable `ERRNO' to a string describing the problem. In `gawk', when closing a pipe or coprocess (input or output), the return value is the exit status of the command.(2) Otherwise, it is the -return value from the system's `close' or `fclose' C functions when +return value from the system's `close()' or `fclose()' C functions when closing input or output files, respectively. 
This value is zero if the close succeeds, or -1 if it fails. - The POSIX standard is very vague; it says that `close' returns zero -on success and non-zero otherwise. In general, different + The POSIX standard is very vague; it says that `close()' returns +zero on success and non-zero otherwise. In general, different implementations vary in what they report when closing pipes; thus the return value cannot be used portably. (d.c.) @@ -5608,7 +5656,7 @@ return value cannot be used portably. (d.c.) is called a "zombie," and cleaning up after it is referred to as "reaping." - (2) This is a full 16-bit value as returned by the `wait' system + (2) This is a full 16-bit value as returned by the `wait()' system call. See the system manual pages for information on how to decode this value. @@ -5755,9 +5803,9 @@ program text. However, such numbers in the input data are not treated differently; doing so by default would break old programs. (If you really need to do this, use the `--non-decimal-data' command-line option; *note Nondecimal Data::.) If you have octal or hexadecimal -data, you can use the `strtonum' function (*note String Functions::) to -convert the data into a number. Most of the time, you will want to use -octal or hexadecimal constants when working with the built-in bit +data, you can use the `strtonum()' function (*note String Functions::) +to convert the data into a number. Most of the time, you will want to +use octal or hexadecimal constants when working with the built-in bit manipulation functions; see *note Bitwise Functions::, for more information. @@ -5840,11 +5888,11 @@ the contents of the current input record. This feature of the language has never been well documented until the POSIX specification. Constant regular expressions are also used as the first argument for -the `gensub', `sub', and `gsub' functions, and as the second argument -of the `match' function (*note String Functions::). 
Modern +the `gensub()', `sub()', and `gsub()' functions, and as the second +argument of the `match()' function (*note String Functions::). Modern implementations of `awk', including `gawk', allow the third argument of -`split' to be a regexp constant, but some older implementations do not. -(d.c.) This can lead to confusion when attempting to use regexp +`split()' to be a regexp constant, but some older implementations do +not. (d.c.) This can lead to confusion when attempting to use regexp constants as arguments to user-defined functions (*note User-defined::). For example: @@ -5866,7 +5914,7 @@ For example: In this example, the programmer wants to pass a regexp constant to the user-defined function `mysub', which in turn passes it on to either -`sub' or `gsub'. However, what really happens is that the `pat' +`sub()' or `gsub()'. However, what really happens is that the `pat' parameter is either one or zero, depending upon whether or not `$0' matches `/hi/'. `gawk' issues a warning when it sees a regexp constant used as a parameter to a user-defined function, since passing a truth @@ -6002,8 +6050,8 @@ interpreted as valid numbers convert to zero. The exact manner in which numbers are converted into strings is controlled by the `awk' built-in variable `CONVFMT' (*note Built-in -Variables::). Numbers are converted using the `sprintf' function with -`CONVFMT' as the format specifier (*note String Functions::). +Variables::). Numbers are converted using the `sprintf()' function +with `CONVFMT' as the format specifier (*note String Functions::). `CONVFMT''s default value is `"%.6g"', which prints a value with at most six significant digits. For some applications, you might want to @@ -6012,11 +6060,11 @@ digits is enough to capture a floating-point number's value exactly, most of the time.(1) Strange results can occur if you set `CONVFMT' to a string that -doesn't tell `sprintf' how to format floating-point numbers in a useful -way. 
For example, if you forget the `%' in the format, `awk' converts -all numbers to the same constant string. As a special case, if a -number is an integer, then the result of converting it to a string is -_always_ an integer, no matter what the value of `CONVFMT' may be. +doesn't tell `sprintf()' how to format floating-point numbers in a +useful way. For example, if you forget the `%' in the format, `awk' +converts all numbers to the same constant string. As a special case, +if a number is an integer, then the result of converting it to a string +is _always_ an integer, no matter what the value of `CONVFMT' may be. Given the following code fragment: CONVFMT = "%2.2f" @@ -6088,7 +6136,7 @@ Feature Default `--posix' or `--use-lc-numeric' `%'g' Use locale Use locale `%g' Use period Use locale Input Use period Use locale -`strtonum' Use period Use locale +`strtonum()'Use period Use locale Table 5.1: Locale Decimal Point versus A Period @@ -6403,9 +6451,9 @@ righthand expression. For example: } The indices of `bar' are practically guaranteed to be different, because -`rand' returns different values each time it is called. (Arrays and -the `rand' function haven't been covered yet. *Note Arrays::, and see -*note Numeric Functions::, for more information). This example +`rand()' returns different values each time it is called. (Arrays and +the `rand()' function haven't been covered yet. *Note Arrays::, and +see *note Numeric Functions::, for more information). This example illustrates an important fact about assignment operators: the lefthand expression is only evaluated _once_. It is up to the implementation as to which expression is evaluated first, the lefthand or the righthand. @@ -6633,8 +6681,8 @@ these rules: STRING attribute. * Fields, `getline' input, `FILENAME', `ARGV' elements, `ENVIRON' - elements, and the elements of an array created by `split' and - `match' that are numeric strings have the STRNUM attribute. 
+ elements, and the elements of an array created by `split()' and + `match()' that are numeric strings have the STRNUM attribute. Otherwise, they have the STRING attribute. Uninitialized variables also have the STRNUM attribute. @@ -6965,11 +7013,11 @@ File: gawk.info, Node: Function Calls, Next: Precedence, Prev: Truth Values a A "function" is a name for a particular calculation. This enables you to ask for it by name at any point in the program. For example, the -function `sqrt' computes the square root of a number. +function `sqrt()' computes the square root of a number. A fixed set of functions are "built-in", which means they are -available in every `awk' program. The `sqrt' function is one of these. -*Note Built-in::, for a list of built-in functions and their +available in every `awk' program. The `sqrt()' function is one of +these. *Note Built-in::, for a list of built-in functions and their descriptions. In addition, you can define functions for use in your program. *Note User-defined::, for instructions on how to do this. @@ -6993,8 +7041,8 @@ concatenation of a variable with an expression inside parentheses. With built-in functions, space before the parenthesis is harmless, but it is best not to get into the habit of using space to avoid mistakes with user-defined functions. Each function expects a -particular number of arguments. For example, the `sqrt' function must -be called with a single argument, the number of which to take the +particular number of arguments. For example, the `sqrt()' function +must be called with a single argument, the number of which to take the square root: sqrt(ARGUMENT) @@ -7441,7 +7489,7 @@ the order in which they are executed doesn't matter. *Note Options::, for more information on using library functions. *Note Library Functions::, for a number of useful library functions. 
- If an `awk' program has only a `BEGIN' rule and no other rules, then + If an `awk' program has only `BEGIN' rules and no other rules, then the program exits after the `BEGIN' rule is run.(1) However, if an `END' rule exists, then the input is read, even if there are no other rules in the program. This is necessary in case the `END' rule checks @@ -7968,11 +8016,8 @@ Statement::.) loop. However, although it was never documented, historical implementations of `awk' treated the `break' statement outside of a loop as if it were a `next' statement (*note Next Statement::). Recent -versions of Unix `awk' no longer allow this usage. `gawk' supports -this use of `break' only if `--traditional' has been specified on the -command line (*note Options::). Otherwise, it is treated as an error, -since the POSIX standard specifies that `break' should only be used -inside the body of a loop. (d.c.) +versions of Unix `awk' no longer allow this usage, nor does `gawk'. +(d.c.) File: gawk.info, Node: Continue Statement, Next: Next Statement, Prev: Break Statement, Up: Statements @@ -8021,10 +8066,7 @@ This program loops forever once `x' reaches 5. a loop. Historical versions of `awk' treated a `continue' statement outside a loop the same way they treated a `break' statement outside a loop: as if it were a `next' statement (*note Next Statement::). -Recent versions of Unix `awk' no longer work this way, and `gawk' -allows it only if `--traditional' is specified on the command line -(*note Options::). Just like the `break' statement, the POSIX standard -specifies that `continue' should only be used inside the body of a loop. +Recent versions of Unix `awk' no longer work this way, nor does `gawk'. (d.c.) @@ -8106,7 +8148,7 @@ continue scanning the unwanted records. The `nextfile' statement accomplishes this much more efficiently. While one might think that `close(FILENAME)' would accomplish the -same as `nextfile', this isn't true. 
`close' is reserved for closing +same as `nextfile', this isn't true. `close()' is reserved for closing files, pipes, and coprocesses that are opened with redirections. It is not related to the main processing that `awk' does with the files listed in `ARGV'. @@ -8243,7 +8285,7 @@ specific to `gawk' are marked with a pound sign (`#'). `CONVFMT' This string controls conversion of numbers to strings (*note Conversion::). It works by being passed, in effect, as the first - argument to the `sprintf' function (*note String Functions::). + argument to the `sprintf()' function (*note String Functions::). Its default value is `"%.6g"'. `CONVFMT' was introduced by the POSIX standard. @@ -8297,13 +8339,13 @@ specific to `gawk' are marked with a pound sign (`#'). `IGNORECASE #' If `IGNORECASE' is nonzero or non-null, then all string comparisons and all regular expression matching are case independent. Thus, - regexp matching with `~' and `!~', as well as the `gensub', - `gsub', `index', `match', `split', and `sub' functions, record - termination with `RS', and field splitting with `FS', all ignore - case when doing their particular regexp operations. However, the - value of `IGNORECASE' does _not_ affect array subscripting and it - does not affect field splitting when using a single-character - field separator. *Note Case-sensitivity::. + regexp matching with `~' and `!~', as well as the `gensub()', + `gsub()', `index()', `match()', `split()', and `sub()' functions, + record termination with `RS', and field splitting with `FS', all + ignore case when doing their particular regexp operations. + However, the value of `IGNORECASE' does _not_ affect array + subscripting and it does not affect field splitting when using a + single-character field separator. *Note Case-sensitivity::. If `gawk' is in compatibility mode (*note Options::), then `IGNORECASE' has no special meaning. Thus, string and regexp @@ -8329,7 +8371,7 @@ specific to `gawk' are marked with a pound sign (`#'). 
`OFMT' This string controls conversion of numbers to strings (*note Conversion::) for printing with the `print' statement. It works - by being passed as the first argument to the `sprintf' function + by being passed as the first argument to the `sprintf()' function (*note String Functions::). Its default value is `"%.6g"'. Earlier versions of `awk' also used `OFMT' to specify the format for converting numbers to strings in general expressions; this is @@ -8368,9 +8410,9 @@ specific to `gawk' are marked with a pound sign (`#'). This variable is used for internationalization of programs at the `awk' level. It sets the default text domain for specially marked string constants in the source text, as well as for the - `dcgettext', `dcngettext' and `bindtextdomain' functions (*note - Internationalization::). The default value of `TEXTDOMAIN' is - `"messages"'. + `dcgettext()', `dcngettext()' and `bindtextdomain()' functions + (*note Internationalization::). The default value of `TEXTDOMAIN' + is `"messages"'. This variable is a `gawk' extension. In other `awk' implementations, or if `gawk' is in compatibility mode (*note @@ -8446,7 +8488,7 @@ with a pound sign (`#'). are the values of the particular environment variables. For example, `ENVIRON["HOME"]' might be `/home/arnold'. Changing this array does not affect the environment passed on to any programs - that `awk' may spawn via redirection or the `system' function. + that `awk' may spawn via redirection or the `system()' function. Some operating systems may not have environment variables. On such systems, the `ENVIRON' array is empty (except for @@ -8454,13 +8496,12 @@ with a pound sign (`#'). `ERRNO #' If a system error occurs during a redirection for `getline', - during a read for `getline', or during a `close' operation, then + during a read for `getline', or during a `close()' operation, then `ERRNO' contains a string describing the error. - *FIXME:* Get the version right. 
Starting with version 3.X, `gawk' - clears `ERRNO' before opening each command line input file. This - enables checking if the file is readable inside a `BEGINFILE' - pattern (*note BEGINFILE/ENDFILE::). + Starting with version 4.0, `gawk' clears `ERRNO' before opening + each command line input file. This enables checking if the file is + readable inside a `BEGINFILE' pattern (*note BEGINFILE/ENDFILE::). Otherwise, `ERRNO' works similarly to the C variable `errno'. Except for the case just mentioned, `gawk' _never_ clears it (sets @@ -8550,17 +8591,17 @@ with a pound sign (`#'). special. `RLENGTH' - The length of the substring matched by the `match' function (*note - String Functions::). `RLENGTH' is set by invoking the `match' - function. Its value is the length of the matched string, or -1 if - no match is found. + The length of the substring matched by the `match()' function + (*note String Functions::). `RLENGTH' is set by invoking the + `match()' function. Its value is the length of the matched + string, or -1 if no match is found. `RSTART' The start-index in characters of the substring that is matched by - the `match' function (*note String Functions::). `RSTART' is set - by invoking the `match' function. Its value is the position of - the string where the matched substring starts, or zero if no match - was found. + the `match()' function (*note String Functions::). `RSTART' is + set by invoking the `match()' function. Its value is the position + of the string where the matched substring starts, or zero if no + match was found. `RT #' This is set each time a record is read. It contains the input text @@ -8729,6 +8770,7 @@ cannot have a variable and an array with the same name in the same * Multi-dimensional:: Emulating multidimensional arrays in `awk'. * Array Sorting:: Sorting array values and indices. +* Arrays of Arrays:: True multidimensional arrays. 
File: gawk.info, Node: Array Basics, Next: Delete, Up: Arrays @@ -8755,7 +8797,7 @@ File: gawk.info, Node: Array Intro, Next: Reference to Elements, Up: Array Ba 7.1.1 Introduction to Arrays ---------------------------- - Doing linear scans over an associateive array is like tryinng to + Doing linear scans over an associative array is like trying to club someone to death with a loaded Uzi. Larry Wall @@ -8844,7 +8886,7 @@ automatically converts it to a string. The value of `IGNORECASE' has no effect upon array subscripting. The identical string value used to store an array element must be used -to retrieve it. When `awk' creates an array (e.g., with the `split' +to retrieve it. When `awk' creates an array (e.g., with the `split()' built-in function), that array's indices are consecutive integers starting at one. (*Note String Functions::.) @@ -8995,7 +9037,7 @@ the word as index. The second rule scans the elements of `used' to find all the distinct words that appear in the input. It prints each word that is more than 10 characters long and also prints the number of such words. *Note String Functions::, for more information on the -built-in function `length'. +built-in function `length()'. # Record a 1 for each word that is used at least once { @@ -9080,7 +9122,7 @@ clear out an array:(1) split("", array) - The `split' function (*note String Functions::) clears out the + The `split()' function (*note String Functions::) clears out the target array first. This call asks it to split apart the null string. Because there is no data to split out, the function simply clears the array and then returns. @@ -9293,8 +9335,8 @@ _way of accessing_ an array. However, if your program has an array that is always accessed as multidimensional, you can get the effect of scanning it by combining the scanning `for' statement (*note Scanning an Array::) with the -built-in `split' function (*note String Functions::).
It works in the -following manner: +built-in `split()' function (*note String Functions::). It works in +the following manner: for (combined in array) { split(combined, separate, SUBSEP) @@ -9310,8 +9352,8 @@ become the elements of the array `separate'. element with index `"1\034foo"' exists in `array'. (Recall that the default value of `SUBSEP' is the character with code 034.) Sooner or later, the `for' statement finds that index and does an iteration with -the variable `combined' set to `"1\034foo"'. Then the `split' function -is called as follows: +the variable `combined' set to `"1\034foo"'. Then the `split()' +function is called as follows: split("1\034foo", separate, "\034") @@ -9320,7 +9362,7 @@ The result is to set `separate[1]' to `"1"' and `separate[2]' to recovered. -File: gawk.info, Node: Array Sorting, Prev: Multi-dimensional, Up: Arrays +File: gawk.info, Node: Array Sorting, Next: Arrays of Arrays, Prev: Multi-dimensional, Up: Arrays 7.6 Sorting Array Values and Indices with `gawk' ================================================ @@ -9329,24 +9371,24 @@ The order in which an array is scanned with a `for (i in array)' loop is essentially arbitrary. In most `awk' implementations, sorting an array requires writing a `sort' function. While this can be educational for exploring different sorting algorithms, usually that's -not the point of the program. `gawk' provides the built-in `asort' and -`asorti' functions (*note String Functions::) for sorting arrays. For -example: +not the point of the program. `gawk' provides the built-in `asort()' +and `asorti()' functions (*note String Functions::) for sorting arrays. +For example: POPULATE THE ARRAY data n = asort(data) for (i = 1; i <= n; i++) DO SOMETHING WITH data[i] - After the call to `asort', the array `data' is indexed from 1 to + After the call to `asort()', the array `data' is indexed from 1 to some number N, the total number of elements in `data'. (This count is -`asort''s return value.) 
`data[1]' <= `data[2]' <= `data[3]', and so +`asort()''s return value.) `data[1]' <= `data[2]' <= `data[3]', and so on. The comparison of array elements is done using `gawk''s usual comparison rules (*note Typing and Comparison::). - An important side effect of calling `asort' is that _the array's + An important side effect of calling `asort()' is that _the array's original indices are irrevocably lost_. As this isn't always -desirable, `asort' accepts a second argument: +desirable, `asort()' accepts a second argument: POPULATE THE ARRAY source n = asort(source, dest) @@ -9359,9 +9401,9 @@ array is not affected. Often, what's needed is to sort on the values of the _indices_ instead of the values of the elements. To do that, starting with -`gawk' 3.1.2, use the `asorti' function. The interface is identical to -that of `asort', except that the index values are used for sorting, and -become the values of the result array: +`gawk' 3.1.2, use the `asorti()' function. The interface is identical +to that of `asort()', except that the index values are used for +sorting, and become the values of the result array: { source[$0] = some_func($0) } @@ -9375,8 +9417,8 @@ become the values of the result array: } If your version of `gawk' is 3.1.0 or 3.1.1, you don't have -`asorti'. Instead, use a helper array to hold the sorted index values, -and then access the original array's elements. It works in the +`asorti()'. Instead, use a helper array to hold the sorted index +values, and then access the original array's elements. It works in the following way: POPULATE THE ARRAY data @@ -9400,19 +9442,125 @@ indices. Copying array indices and elements isn't expensive in terms of memory. Internally, `gawk' maintains "reference counts" to data. For -example, when `asort' copies the first array to the second one, there +example, when `asort()' copies the first array to the second one, there is only one copy of the original array elements' data, even though both arrays use the values. 
Similarly, when copying the indices from `data' to `ind', there is only one copy of the actual index strings. We said previously that comparisons are done using `gawk''s "usual comparison rules." Because `IGNORECASE' affects string comparisons, -the value of `IGNORECASE' also affects sorting for both `asort' and -`asorti'. Note also that the locale's sorting order does _not_ come +the value of `IGNORECASE' also affects sorting for both `asort()' and +`asorti()'. Note also that the locale's sorting order does _not_ come into play; comparisons are based on character values only. Caveat Emptor. +File: gawk.info, Node: Arrays of Arrays, Prev: Array Sorting, Up: Arrays + +7.7 Arrays of Arrays +==================== + +`gawk' supports arrays of arrays. Elements of a subarray are referred +to by their own indices enclosed in square brackets, just like the +elements of the main array. For example, the following creates a +two-element subarray at index `1' of the main array `a': + + a[1][1] = 1 + a[1][2] = 2 + + This simulates a true two-dimensional array. Each subarray element +can contain another subarray as a value, which in turn can hold other +arrays as well. In this way, you can create arrays of three or more +dimensions. The indices can be any `awk' expression, including scalars +separated by commas (that is, a regular `awk' simulated +multidimensional subscript). So the following is valid in `gawk': + + a[1][3][1, "name"] = "barney" + + Each subarray and the main array can be of different length. In +fact, the elements of an array or its subarray do not all have to have +the same type. This means that the main array and any of its subarrays +can be non-rectangular, or jagged in structure.
One can assign a scalar +value to the index `4' of the main array `a': + + a[4] = "An element in a jagged array" + + The terms "dimension", "row" and "column" are meaningless when +applied to such an array, but we will use "dimension" henceforth to +imply the maximum number of indices needed to refer to an existing +element. The type of any element that has already been assigned cannot +be changed by assigning a value of a different type. You have to first +delete the current element, which effectively makes `gawk' forget about +the element at that index: + + delete a[4] + a[4][5][6][7] = "An element in a four-dimensional array" + +This removes the scalar value from index `4' and then inserts a +subarray of subarray of subarray containing a scalar. You can also +delete an entire subarray or subarray of subarrays: + + delete a[4][5] + a[4][5] = "An element in subarray a[4]" + + But recall that you cannot delete the main array `a' and then use it +as a scalar. + + The built-in functions which take array arguments can also be used +with subarrays. For example, the following code fragment uses `length()' +to determine the number of elements in the main array `a' and its +subarrays: + + print length(a), length(a[1]), length(a[1][3]) + +This results in the following output for our main array `a': + + 2 3 1 + +The `SUBSCRIPT in ARRAY' expression (*note Reference to Elements::) +works similarly for both regular `awk'-style arrays and arrays of +arrays. For example, the tests `1 in a', `3 in a[1]', and `(1, "name") +in a[1][3]' all evaluate to one (true) for our array `a'. + + The `for (item in array)' statement (*note Scanning an Array::) can +be nested to scan all the elements of an array of arrays if it is +rectangular in structure.
In order to print the contents (scalar +values) of a two-dimensional array of arrays with each subarray having +the same length, you could use the following code: + + for (i in array) + for (j in array[i]) + print array[i][j] + + If the structure of a jagged array of arrays is known in advance, +you can often devise workarounds using control statements. For example, +the following code prints the elements of our main array `a': + + for (i in a) { + for (j in a[i]) { + if (j == 3) { + for (k in a[i][j]) + print a[i][j][k] + } else + print a[i][j] + } + } + + Recall that a reference to an uninitialized array element yields a +value of `""', the null string. This has one important implication when +you intend to use a subarray as an argument to a function, as +illustrated by the following example: + + $ gawk 'BEGIN { split("a b c d", b[1]); print b[1][1] }' + error--> gawk: cmd. line:1: fatal: split: second argument is not an array + + The way to work around this is to first force `b[1]' to be an array +by creating an arbitrary index: + + $ gawk 'BEGIN { b[1][1] = ""; split("a b c d", b[1]); print b[1][1] }' + -| a + + File: gawk.info, Node: Functions, Next: Internationalization, Prev: Arrays, Up: Top 8 Functions @@ -9448,9 +9596,10 @@ for your convenience. * Calling Built-in:: How to call built-in functions. * Numeric Functions:: Functions that work with numbers, including - `int', `sin' and `rand'. + `int()', `sin()' and `rand()'. * String Functions:: Functions for string manipulation, such as - `split', `match' and `sprintf'. + `split()', `match()' and + `sprintf()'. * I/O Functions:: Functions for files and shell commands. * Time Functions:: Functions for dealing with timestamps. * Bitwise Functions:: Functions for bitwise operations. @@ -9464,7 +9613,7 @@ File: gawk.info, Node: Calling Built-in, Next: Numeric Functions, Up: Built-i To call one of `awk''s built-in functions, write the name of the function followed by arguments in parentheses.
For example, `atan2(y + -z, 1)' is a call to the function `atan2' and has two arguments. +z, 1)' is a call to the function `atan2()' and has two arguments. Whitespace is ignored between the built-in function name and the open parenthesis, and it is good practice to avoid using whitespace @@ -9486,7 +9635,7 @@ For example, in the following code fragment: i = 4 j = sqrt(i++) -the variable `i' is incremented to the value five before `sqrt' is +the variable `i' is incremented to the value five before `sqrt()' is called with a value of four for its actual parameter. The order of evaluation of the expressions used for the function's parameters is undefined. Thus, avoid writing programs that assume that parameters @@ -9496,9 +9645,9 @@ are evaluated from left to right or from right to left. For example: j = atan2(i++, i *= 2) If the order of evaluation is left to right, then `i' first becomes -6, and then 12, and `atan2' is called with the two arguments 6 and 12. -But if the order of evaluation is right to left, `i' first becomes 10, -then 11, and `atan2' is called with the two arguments 11 and 10. +6, and then 12, and `atan2()' is called with the two arguments 6 and +12. But if the order of evaluation is right to left, `i' first becomes +10, then 11, and `atan2()' is called with the two arguments 11 and 10. File: gawk.info, Node: Numeric Functions, Next: String Functions, Prev: Calling Built-in, Up: Built-in @@ -9540,9 +9689,9 @@ brackets ([ ]): This returns the arctangent of `Y / X' in radians. `rand()' - This returns a random number. The values of `rand' are uniformly - distributed between zero and one. The value could be zero but is - never one.(1) + This returns a random number. The values of `rand()' are + uniformly distributed between zero and one. The value could be + zero but is never one.(1) Often random integers are needed instead. 
Following is a user-defined function that can be used to obtain a random @@ -9553,7 +9702,7 @@ brackets ([ ]): } The multiplication produces a random number greater than zero and - less than `n'. Using `int', this result is made into an integer + less than `n'. Using `int()', this result is made into an integer between zero and `n' - 1, inclusive. The following example uses a similar function to produce random @@ -9570,17 +9719,17 @@ brackets ([ ]): roll(6)+roll(6)+roll(6)) } - *Caution:* In most `awk' implementations, including `gawk', `rand' - starts generating numbers from the same starting number, or - "seed", each time you run `awk'. Thus, a program generates the + *Caution:* In most `awk' implementations, including `gawk', + `rand()' starts generating numbers from the same starting number, + or "seed", each time you run `awk'. Thus, a program generates the same results each time you run it. The numbers are random within one `awk' run but predictable from run to run. This is convenient for debugging, but if you want a program to do different things each time it is used, you must change the seed to a value that is - different in each run. To do this, use `srand'. + different in each run. To do this, use `srand()'. `srand([X])' - The function `srand' sets the starting point, or seed, for + The function `srand()' sets the starting point, or seed, for generating random numbers to the value X. Each seed value leads to a particular sequence of random @@ -9596,17 +9745,17 @@ brackets ([ ]): date and time of day are used for a seed. This is the way to get random numbers that are truly unpredictable. - The return value of `srand' is the previous seed. This makes it + The return value of `srand()' is the previous seed. This makes it easy to keep track of the seeds in case you need to consistently reproduce sequences of random numbers. 
---------- Footnotes ---------- - (1) The C version of `rand' is known to produce fairly poor + (1) The C version of `rand()' is known to produce fairly poor sequences of random numbers. However, nothing requires that an `awk' -implementation use the C `rand' to implement the `awk' version of -`rand'. In fact, `gawk' uses the BSD `random' function, which is -considerably better than `rand', to produce random numbers. +implementation use the C `rand()' to implement the `awk' version of +`rand()'. In fact, `gawk' uses the BSD `random' function, which is +considerably better than `rand()', to produce random numbers. (2) Computer-generated random numbers really are not truly random. They are technically known as "pseudorandom." This means that while @@ -9627,11 +9776,11 @@ with a pound sign (`#'): * Menu: * Gory Details:: More than you want to know about `\' and - `&' with `sub', `gsub', and - `gensub'. + `&' with `sub()', `gsub()', and + `gensub()'. `asort(SOURCE [, DEST]) #' - `asort' is a `gawk'-specific extension, returning the number of + `asort()' is a `gawk'-specific extension, returning the number of elements in the array SOURCE. The contents of SOURCE are sorted using `gawk''s normal rules for comparing values (in particular, `IGNORECASE' affects the sorting) and the indices of the sorted @@ -9645,7 +9794,7 @@ with a pound sign (`#'): a["first"] = "sac" a["middle"] = "cul" - A call to `asort': + A call to `asort()': asort(a) @@ -9655,20 +9804,20 @@ with a pound sign (`#'): a[2] = "de" a[3] = "sac" - The `asort' function is described in more detail in *note Array - Sorting::. `asort' is a `gawk' extension; it is not available in - compatibility mode (*note Options::). + The `asort()' function is described in more detail in *note Array + Sorting::. `asort()' is a `gawk' extension; it is not available + in compatibility mode (*note Options::). 
`asorti(SOURCE [, DEST]) #' - `asorti' is a `gawk'-specific extension, returning the number of - elements in the array SOURCE. It works similarly to `asort', + `asorti()' is a `gawk'-specific extension, returning the number of + elements in the array SOURCE. It works similarly to `asort()', however, the _indices_ are sorted, instead of the values. As array indices are always strings, the comparison performed is always a string comparison. (Here too, `IGNORECASE' affects the sorting.) - The `asorti' function is described in more detail in *note Array - Sorting::. It was added in `gawk' 3.1.2. `asorti' is a `gawk' + The `asorti()' function is described in more detail in *note Array + Sorting::. It was added in `gawk' 3.1.2. `asorti()' is a `gawk' extension; it is not available in compatibility mode (*note Options::). @@ -9680,8 +9829,8 @@ with a pound sign (`#'): $ awk 'BEGIN { print index("peanut", "an") }' -| 3 - If FIND is not found, `index' returns zero. (Remember that string - indices in `awk' start at one.) + If FIND is not found, `index()' returns zero. (Remember that + string indices in `awk' start at one.) `length([STRING])' This returns the number of characters in STRING. If STRING is a @@ -9691,17 +9840,16 @@ with a pound sign (`#'): and 525 is then converted to the string `"525"', which has three characters. - If no argument is supplied, `length' returns the length of `$0'. + If no argument is supplied, `length()' returns the length of `$0'. - NOTE: In older versions of `awk', the `length' function could - be called without any parentheses. Doing so is marked as - "deprecated" in the POSIX standard. This means that while a - program can do this, it is a feature that can eventually be - removed from a future version of the standard. Therefore, - for programs to be maximally portable, always supply the + NOTE: In older versions of `awk', the `length()' function + could be called without any parentheses. 
Doing so is + considered poor practice, although the 2008 POSIX standard + explicitly allows it, to support historical practice. For + programs to be maximally portable, always supply the parentheses. - If `length' is called with a variable that has not been used, + If `length()' is called with a variable that has not been used, `gawk' forces the variable to be a scalar. Other implementations of `awk' leave the variable without a type. (d.c.) Consider: @@ -9716,16 +9864,16 @@ with a pound sign (`#'): warning about this. Beginning with `gawk' version 3.1.5, when supplied an array - argument, the `length' function returns the number of elements in - the array. This is less useful than it might seem at first, as the - array is not guaranteed to be indexed from one to the number of - elements in it. If `--lint' is provided on the command line + argument, the `length()' function returns the number of elements + in the array. This is less useful than it might seem at first, as + the array is not guaranteed to be indexed from one to the number + of elements in it. If `--lint' is provided on the command line (*note Options::), `gawk' warns that passing an array argument is not portable. If `--posix' is supplied, using an array argument is a fatal error (*note Arrays::). `match(STRING, REGEXP [, ARRAY])' - The `match' function searches STRING for the longest, leftmost + The `match()' function searches STRING for the longest, leftmost substring matched by the regular expression, REGEXP. It returns the character position, or "index", at which that substring begins (one, if it starts at the beginning of STRING). If no match is @@ -9738,11 +9886,12 @@ with a pound sign (`#'): implications for writing your program correctly. The order of the first two arguments is backwards from most other - string functions that work with regular expressions, such as `sub' - and `gsub'. 
It might help to remember that for `match', the order - is the same as for the `~' operator: `STRING ~ REGEXP'. + string functions that work with regular expressions, such as + `sub()' and `gsub()'. It might help to remember that for + `match()', the order is the same as for the `~' operator: `STRING + ~ REGEXP'. - The `match' function sets the built-in variable `RSTART' to the + The `match()' function sets the built-in variable `RSTART' to the index. It also sets the built-in variable `RLENGTH' to the length in characters of the matched substring. If no match is found, `RSTART' is set to zero, and `RLENGTH' to -1. @@ -9808,7 +9957,7 @@ with a pound sign (`#'): text; thus they should be tested for with the `in' operator (*note Reference to Elements::). - The ARRAY argument to `match' is a `gawk' extension. In + The ARRAY argument to `match()' is a `gawk' extension. In compatibility mode (*note Options::), using a third argument is a fatal error. @@ -9820,11 +9969,11 @@ with a pound sign (`#'): argument, FIELDPAT, is a regexp describing the fields in STRING (just as `FPAT' is a regexp describing the fields in input records). If FIELDPAT is omitted, the value of `FPAT' is used. - `patsplit' returns the number of elements created. `SEPS[I]' is + `patsplit()' returns the number of elements created. `SEPS[I]' is the separator string between `ARRAY[I]' and `ARRAY[I+1]'. Any leading separator will be in `SEPS[0]'. - The `patsplit' function splits strings into pieces in a manner + The `patsplit()' function splits strings into pieces in a manner similar to the way input lines are split into fields using `FPAT'. `split(STRING, ARRAY [, FIELDSEP [, SEPS ] ])' @@ -9835,14 +9984,14 @@ with a pound sign (`#'): argument, FIELDSEP, is a regexp describing where to split STRING (much as `FS' can be a regexp describing where to split input records). If FIELDSEP is omitted, the value of `FS' is used. - `split' returns the number of elements created. 
SEPS is a `gawk' - extension with `SEPS[I]' being the separator string between + `split()' returns the number of elements created. SEPS is a + `gawk' extension with `SEPS[I]' being the separator string between `ARRAY[I]' and `ARRAY[I+1]'. If FIELDSEP is a single space then any leading whitespace goes into `SEPS[0]' and any trailing whitespace goes into `SEPS[N]' where N is the return value of `split()' (that is, the number of elements in ARRAY). - The `split' function splits strings into pieces in a manner + The `split()' function splits strings into pieces in a manner similar to the way input lines are split into fields. For example: split("cul-de-sac", a, "-", seps) @@ -9859,7 +10008,7 @@ with a pound sign (`#'): seps[1] = "-" seps[2] = "-" - The value returned by this call to `split' is three. + The value returned by this call to `split()' is three. As with input field-splitting, when the value of FIELDSEP is `" "', leading and trailing whitespace is ignored in ARRAY but not @@ -9868,9 +10017,9 @@ with a pound sign (`#'): string, each individual character in the string is split into its own array element. (This is a `gawk'-specific extension.) - Note, however, that `RS' has no effect on the way `split' works. + Note, however, that `RS' has no effect on the way `split()' works. Even though `RS = ""' causes newline to also be an input field - separator, this does not affect how `split' splits strings. + separator, this does not affect how `split()' splits strings. Modern implementations of `awk', including `gawk', allow the third argument to be a regexp constant (`/abc/') as well as a string. @@ -9879,7 +10028,7 @@ with a pound sign (`#'): string constant or a regexp constant, and the implications for writing your program correctly. - Before splitting the string, `split' deletes any previously + Before splitting the string, `split()' deletes any previously existing elements in the arrays ARRAY and SEPS. If STRING is null, the array has no elements. 
(So this is a @@ -9901,26 +10050,26 @@ with a pound sign (`#'): `strtonum(STR) #' Examines STR and returns its numeric value. If STR begins with a - leading `0', `strtonum' assumes that STR is an octal number. If - STR begins with a leading `0x' or `0X', `strtonum' assumes that + leading `0', `strtonum()' assumes that STR is an octal number. If + STR begins with a leading `0x' or `0X', `strtonum()' assumes that STR is a hexadecimal number. For example: $ echo 0x11 | > gawk '{ printf "%d\n", strtonum($1) }' -| 17 - Using the `strtonum' function is _not_ the same as adding zero to - a string value; the automatic coercion of strings to numbers works - only for decimal data, not for octal or hexadecimal.(1) + Using the `strtonum()' function is _not_ the same as adding zero + to a string value; the automatic coercion of strings to numbers + works only for decimal data, not for octal or hexadecimal.(1) - Note also that `strtonum' uses the current locale's decimal point + Note also that `strtonum()' uses the current locale's decimal point for recognizing numbers. - `strtonum' is a `gawk' extension; it is not available in + `strtonum()' is a `gawk' extension; it is not available in compatibility mode (*note Options::). `sub(REGEXP, REPLACEMENT [, TARGET])' - The `sub' function alters the value of TARGET. It searches this + The `sub()' function alters the value of TARGET. It searches this value, which is treated as a string, for the leftmost, longest substring matched by the regular expression REGEXP. Then the entire string is changed by replacing the matched text with @@ -9934,7 +10083,7 @@ with a pound sign (`#'): This function is peculiar because TARGET is not simply used to compute a value, and not just any expression will do--it must be a - variable, field, or array element so that `sub' can store a + variable, field, or array element so that `sub()' can store a modified value there. 
If this argument is omitted, then the default is to use and alter `$0'.(2) For example: @@ -9944,8 +10093,8 @@ with a pound sign (`#'): sets `str' to `"wither, water, everywhere"', by replacing the leftmost longest occurrence of `at' with `ith'. - The `sub' function returns the number of substitutions made (either - one or zero). + The `sub()' function returns the number of substitutions made + (either one or zero). If the special character `&' appears in REPLACEMENT, it stands for the precise substring that was matched by REGEXP. (If the regexp @@ -9977,10 +10126,10 @@ with a pound sign (`#'): { sub(/\|/, "\\&"); print } - As mentioned, the third argument to `sub' must be a variable, + As mentioned, the third argument to `sub()' must be a variable, field or array reference. Some versions of `awk' allow the third argument to be an expression that is not an lvalue. In such a - case, `sub' still searches for the pattern and returns zero or + case, `sub()' still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away because there is no place to put it. Such versions of `awk' accept expressions such as the following: @@ -9997,9 +10146,9 @@ with a pound sign (`#'): regexp to match. `gsub(REGEXP, REPLACEMENT [, TARGET])' - This is similar to the `sub' function, except `gsub' replaces + This is similar to the `sub()' function, except `gsub()' replaces _all_ of the longest, leftmost, _nonoverlapping_ matching - substrings it can find. The `g' in `gsub' stands for "global," + substrings it can find. The `g' in `gsub()' stands for "global," which means replace everywhere. For example: { gsub(/Britain/, "United Kingdom"); print } @@ -10007,25 +10156,25 @@ with a pound sign (`#'): replaces all occurrences of the string `Britain' with `United Kingdom' for all input records. - The `gsub' function returns the number of substitutions made. If + The `gsub()' function returns the number of substitutions made. 
If the variable to search and alter (TARGET) is omitted, then the - entire input record (`$0') is used. As in `sub', the characters + entire input record (`$0') is used. As in `sub()', the characters `&' and `\' are special, and the third argument must be assignable. `gensub(REGEXP, REPLACEMENT, HOW [, TARGET]) #' - `gensub' is a general substitution function. Like `sub' and - `gsub', it searches the target string TARGET for matches of the - regular expression REGEXP. Unlike `sub' and `gsub', the modified - string is returned as the result of the function and the original - target string is _not_ changed. If HOW is a string beginning with - `g' or `G', then it replaces all matches of REGEXP with - REPLACEMENT. Otherwise, HOW is treated as a number that indicates - which match of REGEXP to replace. If no TARGET is supplied, `$0' - is used. - - `gensub' provides an additional feature that is not available in - `sub' or `gsub': the ability to specify components of a regexp in - the replacement text. This is done by using parentheses in the + `gensub()' is a general substitution function. Like `sub()' and + `gsub()', it searches the target string TARGET for matches of the + regular expression REGEXP. Unlike `sub()' and `gsub()', the + modified string is returned as the result of the function and the + original target string is _not_ changed. If HOW is a string + beginning with `g' or `G', then it replaces all matches of REGEXP + with REPLACEMENT. Otherwise, HOW is treated as a number that + indicates which match of REGEXP to replace. If no TARGET is + supplied, `$0' is used. + + `gensub()' provides an additional feature that is not available in + `sub()' or `gsub()': the ability to specify components of a regexp + in the replacement text. This is done by using parentheses in the regexp to mark the components and then specifying `\N' in the replacement text, where N is a digit from 1 to 9. 
For example: @@ -10037,7 +10186,7 @@ with a pound sign (`#'): > }' -| def abc - As with `sub', you must type two backslashes in order to get one + As with `sub()', you must type two backslashes in order to get one into the string. In the replacement text, the sequence `\0' represents the entire matched text, as does the character `&'. @@ -10048,19 +10197,19 @@ with a pound sign (`#'): > gawk '{ print gensub(/a/, "AA", 2) }' -| a b c AA b c - In this case, `$0' is used as the default target string. `gensub' - returns the new string as its result, which is passed directly to - `print' for printing. + In this case, `$0' is used as the default target string. + `gensub()' returns the new string as its result, which is passed + directly to `print' for printing. If the HOW argument is a string that does not begin with `g' or `G', or if it is a number that is less than or equal to zero, only one substitution is performed. If HOW is zero, `gawk' issues a warning message. - If REGEXP does not match TARGET, `gensub''s return value is the + If REGEXP does not match TARGET, `gensub()''s return value is the original unchanged value of TARGET. - `gensub' is a `gawk' extension; it is not available in + `gensub()' is a `gawk' extension; it is not available in compatibility mode (*note Options::). `substr(STRING, START [, LENGTH])' @@ -10075,31 +10224,31 @@ with a pound sign (`#'): also returned if LENGTH is greater than the number of characters remaining in the string, counting from character START. - If START is less than one, `substr' treats it as if it was one. + If START is less than one, `substr()' treats it as if it was one. (POSIX doesn't specify what to do in this case: Unix `awk' acts this way, and therefore `gawk' does too.) If START is greater - than the number of characters in the string, `substr' returns the - null string. Similarly, if LENGTH is present but less than or + than the number of characters in the string, `substr()' returns + the null string. 
Similarly, if LENGTH is present but less than or equal to zero, the null string is returned. - The string returned by `substr' _cannot_ be assigned. Thus, it is - a mistake to attempt to change a portion of a string, as shown in - the following example: + The string returned by `substr()' _cannot_ be assigned. Thus, it + is a mistake to attempt to change a portion of a string, as shown + in the following example: string = "abcdef" # try to get "abCDEf", won't work substr(string, 3, 3) = "CDE" - It is also a mistake to use `substr' as the third argument of - `sub' or `gsub': + It is also a mistake to use `substr()' as the third argument of + `sub()' or `gsub()': gsub(/xyz/, "pdq", substr($0, 5, 20)) # WRONG - (Some commercial versions of `awk' do in fact let you use `substr' - this way, but doing so is not portable.) + (Some commercial versions of `awk' do in fact let you use + `substr()' this way, but doing so is not portable.) If you need to replace bits and pieces of a string, combine - `substr' with string concatenation, in the following manner: + `substr()' with string concatenation, in the following manner: string = "abcdef" ... @@ -10133,10 +10282,10 @@ is number zero. File: gawk.info, Node: Gory Details, Up: String Functions -8.1.3.1 More About `\' and `&' with `sub', `gsub', and `gensub' -............................................................... +8.1.3.1 More About `\' and `&' with `sub()', `gsub()', and `gensub()' +..................................................................... -When using `sub', `gsub', or `gensub', and trying to get literal +When using `sub()', `gsub()', or `gensub()', and trying to get literal backslashes and ampersands into the replacement text, you need to remember that there are several levels of "escape processing" going on. @@ -10156,13 +10305,13 @@ example, `"a\qb"' is treated as `"aqb"'. At the runtime level, the various functions handle sequences of `\' and `&' differently. The situation is (sadly) somewhat complex. 
-Historically, the `sub' and `gsub' functions treated the two character -sequence `\&' specially; this sequence was replaced in the generated -text with a single `&'. Any other `\' within the REPLACEMENT string -that did not precede an `&' was passed through unchanged. This is -illustrated in *note table-sub-escapes::. +Historically, the `sub()' and `gsub()' functions treated the two +character sequence `\&' specially; this sequence was replaced in the +generated text with a single `&'. Any other `\' within the REPLACEMENT +string that did not precede an `&' was passed through unchanged. This +is illustrated in *note table-sub-escapes::. - You type `sub' sees `sub' generates + You type `sub()' sees `sub()' generates ------- --------- -------------- `\&' `&' the matched text `\\&' `\&' a literal `&' @@ -10176,20 +10325,20 @@ Table 8.1: Historical Escape Sequence Processing for sub and gsub This table shows both the lexical-level processing, where an odd number of backslashes becomes an even number at the runtime level, as well as -the runtime processing done by `sub'. (For the sake of simplicity, the -rest of the following tables only show the case of even numbers of +the runtime processing done by `sub()'. (For the sake of simplicity, +the rest of the following tables only show the case of even numbers of backslashes entered at the lexical level.) The problem with the historical approach is that there is no way to get a literal `\' followed by the matched text. The 1992 POSIX standard attempted to fix this problem. That standard -says that `sub' and `gsub' look for either a `\' or an `&' after the -`\'. If either one follows a `\', that character is output literally. -The interpretation of `\' and `&' then becomes as shown in *note -table-sub-posix-92::. +says that `sub()' and `gsub()' look for either a `\' or an `&' after +the `\'. If either one follows a `\', that character is output +literally. 
The interpretation of `\' and `&' then becomes as shown in +*note table-sub-posix-92::. - You type `sub' sees `sub' generates + You type `sub()' sees `sub()' generates ------- --------- -------------- `&' `&' the matched text `\\&' `\&' a literal `&' @@ -10217,7 +10366,7 @@ proposed rules have special cases that make it possible to produce a `\' preceding the matched text. This is shown in *note table-sub-proposed::. - You type `sub' sees `sub' generates + You type `sub()' sees `sub()' generates ------- --------- -------------- `\\\\\\&' `\\\&' a literal `\&' `\\\\&' `\\&' a literal `\', followed by the matched text @@ -10233,8 +10382,8 @@ there was only one. However, as in the historical case, any `\' that is not part of one of these three sequences is not special and appears in the output literally. - `gawk' 3.0 and 3.1 follow these proposed POSIX rules for `sub' and -`gsub'. The POSIX standard took much longer to be revised than was + `gawk' 3.0 and 3.1 follow these proposed POSIX rules for `sub()' and +`gsub()'. The POSIX standard took much longer to be revised than was expected in 1996. The 2001 standard does not follow the above rules. Instead, the rules there are somewhat simpler. The results are similar except for one case. @@ -10244,7 +10393,7 @@ produces a literal `&', `\\' produces a literal `\', and `\' followed by anything else is not special; the `\' is placed straight into the output. These rules are presented in *note table-posix-2001-sub::. - You type `sub' sees `sub' generates + You type `sub()' sees `sub()' generates ------- --------- -------------- `\\\\\\&' `\\\&' a literal `\&' `\\\\&' `\\&' a literal `\', followed by the matched text @@ -10262,16 +10411,16 @@ Table 8.4: POSIX 2001 rules for sub follow the 1996 proposed rules, since that had been its behavior for many years. - As of version 3.2, `gawk' uses the POSIX 2001 rules. + As of version 4.0, `gawk' uses the POSIX 2001 rules. - The rules for `gensub()' are considerably simpler.
At the runtime + The rules for `gensub()' are considerably simpler. At the runtime level, whenever `gawk' sees a `\', if the following character is a digit, then the text that matched the corresponding parenthesized subexpression is placed in the generated output. Otherwise, no matter what character follows the `\', it appears in the generated text and the `\' does not, as shown in *note table-gensub-escapes::. - You type `gensub' sees `gensub' generates + You type `gensub()' sees `gensub()' generates ------- ------------ ----------------- `&' `&' the matched text `\\&' `\&' a literal `&' @@ -10283,15 +10432,15 @@ the `\' does not, as shown in *note table-gensub-escapes::. Table 8.5: Escape Sequence Processing for gensub Because of the complexity of the lexical and runtime level processing -and the special cases for `sub' and `gsub', we recommend the use of -`gawk' and `gensub' when you have to do substitutions. +and the special cases for `sub()' and `gsub()', we recommend the use of +`gawk' and `gensub()' when you have to do substitutions. Advanced Notes: Matching the Null String ---------------------------------------- In `awk', the `*' operator can match the null string. This is -particularly important for the `sub', `gsub', and `gensub' functions. -For example: +particularly important for the `sub()', `gsub()', and `gensub()' +functions. For example: $ echo abc | awk '{ gsub(/m*/, "X"); print }' -| XaXbXcX @@ -10320,7 +10469,7 @@ parameters are enclosed in square brackets ([ ]): When closing a coprocess, it is occasionally useful to first close one end of the two-way pipe and then to close the other. This is - done by providing a second argument to `close'. This second + done by providing a second argument to `close()'. This second argument should be one of the two string values `"to"' or `"from"', indicating which end of the pipe to close. Case in the string does not matter. 
*Note Two-way I/O::, which discusses this feature in @@ -10338,23 +10487,23 @@ parameters are enclosed in square brackets ([ ]): little bit of information as soon as it is ready. However, sometimes it is necessary to force a program to "flush" its buffers; that is, write the information to its destination, even - if a buffer is not full. This is the purpose of the `fflush' - function--`gawk' also buffers its output and the `fflush' function - forces `gawk' to flush its buffers. + if a buffer is not full. This is the purpose of the `fflush()' + function--`gawk' also buffers its output and the `fflush()' + function forces `gawk' to flush its buffers. - `fflush' was added to the Bell Laboratories research version of + `fflush()' was added to the Bell Laboratories research version of `awk' in 1994; it is not part of the POSIX standard and is not available if `--posix' has been specified on the command line (*note Options::). - `gawk' extends the `fflush' function in two ways. The first is to - allow no argument at all. In this case, the buffer for the + `gawk' extends the `fflush()' function in two ways. The first is + to allow no argument at all. In this case, the buffer for the standard output is flushed. The second is to allow the null string (`""') as the argument. In this case, the buffers for _all_ open output files and pipes are flushed. Current versions of the Bell Labs `awk' also support these extensions. - `fflush' returns zero if the buffer is successfully flushed; + `fflush()' returns zero if the buffer is successfully flushed; otherwise, it returns -1. In the case where all buffers are flushed, the return value is zero only if all buffers were flushed successfully. 
Otherwise, it is -1, and `gawk' warns about the @@ -10363,12 +10512,12 @@ parameters are enclosed in square brackets ([ ]): `gawk' also issues a warning message if you attempt to flush a file or pipe that was opened for reading (such as with `getline'), or if FILENAME is not an open file, pipe, or coprocess. In such a - case, `fflush' returns -1, as well. + case, `fflush()' returns -1, as well. `system(COMMAND)' Executes operating-system commands and then returns to the `awk' - program. The `system' function executes the command given by the - string COMMAND. It returns the status returned by the command + program. The `system()' function executes the command given by + the string COMMAND. It returns the status returned by the command that was executed as its value. For example, if the following fragment of code is put in your `awk' @@ -10390,13 +10539,14 @@ parameters are enclosed in square brackets ([ ]): print COMMAND | "/bin/sh" close("/bin/sh") - However, if your `awk' program is interactive, `system' is useful - for cranking up large self-contained programs, such as a shell or - an editor. Some operating systems cannot implement the `system' - function. `system' causes a fatal error if it is not supported. + However, if your `awk' program is interactive, `system()' is + useful for cranking up large self-contained programs, such as a + shell or an editor. Some operating systems cannot implement the + `system()' function. `system()' causes a fatal error if it is not + supported. - NOTE: When `--sandbox' is specified, the `system' function is - disabled. + NOTE: When `--sandbox' is specified, the `system()' function + is disabled. Advanced Notes: Interactive Versus Noninteractive Buffering @@ -10431,17 +10581,17 @@ this example: Here, no output is printed until after the `Ctrl-d' is typed, because it is all buffered and sent down the pipe to `cat' in one shot. 
-Advanced Notes: Controlling Output Buffering with `system' ----------------------------------------------------------- +Advanced Notes: Controlling Output Buffering with `system()' +------------------------------------------------------------ -The `fflush' function provides explicit control over output buffering +The `fflush()' function provides explicit control over output buffering for individual files and pipes. However, its use is not portable to many other `awk' implementations. An alternative method to flush output -buffers is to call `system' with a null string as its argument: +buffers is to call `system()' with a null string as its argument: system("") # flush output -`gawk' treats this use of the `system' function as a special case and +`gawk' treats this use of the `system()' function as a special case and is smart enough not to run a shell (or other command interpreter) with the empty command. Therefore, with `gawk', this idiom is not only useful, it is also efficient. While this method should work with other @@ -10451,7 +10601,7 @@ associated with the standard output and not necessarily all buffered output.) If you think about what a programmer expects, it makes sense that -`system' should flush any pending output. The following program: +`system()' should flush any pending output. The following program: BEGIN { print "first print" @@ -10471,7 +10621,7 @@ and not: first print second print - If `awk' did not flush its buffers before calling `system', you + If `awk' did not flush its buffers before calling `system()', you would see the latter (undesirable) output. ---------- Footnotes ---------- @@ -10510,7 +10660,7 @@ Optional parameters are enclosed in square brackets ([ ]): `mktime(DATESPEC)' This function turns DATESPEC into a timestamp in the same form as - is returned by `systime'. It is similar to the function of the + is returned by `systime()'. It is similar to the function of the same name in ISO C. 
The argument, DATESPEC, is a string of the form `"YYYY MM DD HH MM SS [DST]"'. The string consists of six or seven numbers representing, respectively, the full year including @@ -10525,11 +10675,11 @@ Optional parameters are enclosed in square brackets ([ ]): assumed to be in the local timezone. If the daylight-savings flag is positive, the time is assumed to be daylight savings time; if zero, the time is assumed to be standard time; and if negative - (the default), `mktime' attempts to determine whether daylight + (the default), `mktime()' attempts to determine whether daylight savings time is in effect for the specified time. If DATESPEC does not contain enough elements or if the resulting - time is out of range, `mktime' returns -1. + time is out of range, `mktime()' returns -1. `strftime([FORMAT [, TIMESTAMP [, UTC-FLAG]]])' This function returns a string. It is similar to the function of @@ -10539,32 +10689,32 @@ Optional parameters are enclosed in square brackets ([ ]): is formatted as UTC (Coordinated Universal Time, formerly GMT or Greenwich Mean Time). Otherwise, the value is formatted for the local time zone. The TIMESTAMP is in the same format as the value - returned by the `systime' function. If no TIMESTAMP argument is + returned by the `systime()' function. If no TIMESTAMP argument is supplied, `gawk' uses the current time of day as the timestamp. - If no FORMAT argument is supplied, `strftime' uses + If no FORMAT argument is supplied, `strftime()' uses `"%a %b %d %H:%M:%S %Z %Y"'. This format string produces output that is (almost) equivalent to that of the `date' utility. (Versions of `gawk' prior to 3.0 require the FORMAT argument.) - The `systime' function allows you to compare a timestamp from a log -file with the current time of day. In particular, it is easy to + The `systime()' function allows you to compare a timestamp from a +log file with the current time of day. 
In particular, it is easy to determine how long ago a particular record was logged. It also allows you to produce log records using the "seconds since the epoch" format. - The `mktime' function allows you to convert a textual representation -of a date and time into a timestamp. This makes it easy to do -before/after comparisons of dates and times, particularly when dealing -with date and time data coming from an external source, such as a log -file. + The `mktime()' function allows you to convert a textual +representation of a date and time into a timestamp. This makes it +easy to do before/after comparisons of dates and times, particularly +when dealing with date and time data coming from an external source, +such as a log file. - The `strftime' function allows you to easily turn a timestamp into -human-readable information. It is similar in nature to the `sprintf' + The `strftime()' function allows you to easily turn a timestamp into +human-readable information. It is similar in nature to the `sprintf()' function (*note String Functions::), in that it copies nonformat specification characters verbatim to the returned string, while substituting date and time values for format specifications in the FORMAT string. - `strftime' is guaranteed by the 1999 ISO C standard(4) to support + `strftime()' is guaranteed by the 1999 ISO C standard(4) to support the following date format specifications: `%a' @@ -10718,9 +10868,9 @@ defines a default `"C"' locale, which is an environment that is typical of what most C programmers are used to. For systems that are not yet fully standards-compliant, `gawk' -supplies a copy of `strftime' from the GNU C Library. It supports all -of the just listed format specifications. If that version is used to -compile `gawk' (*note Installation::), then the following additional +supplies a copy of `strftime()' from the GNU C Library. It supports +all of the just listed format specifications. 
If that version is used +to compile `gawk' (*note Installation::), then the following additional format specifications are available: `%k' @@ -10788,7 +10938,7 @@ shell scripts. (3) Occasionally there are minutes in a year with a leap second, which is why the seconds can go up to 60. - (4) As this is a recent standard, not every system's `strftime' + (4) As this is a recent standard, not every system's `strftime()' necessarily supports all of the conversions listed here. (5) If you don't understand any of this, don't worry about it; these @@ -10797,9 +10947,9 @@ Other internationalization features are described in *note Internationalization::. (6) This is because ISO C leaves the behavior of the C version of -`strftime' undefined and `gawk' uses the system's version of `strftime' -if it's there. Typically, the conversion specifier either does not -appear in the returned string or appears literally. +`strftime()' undefined and `gawk' uses the system's version of +`strftime()' if it's there. Typically, the conversion specifier either +does not appear in the returned string or appears literally. File: gawk.info, Node: Bitwise Functions, Next: I18N Functions, Prev: Time Functions, Up: Built-in @@ -10915,7 +11065,7 @@ at the end, it pads the value with zeros to represent multiples of The main code in the `BEGIN' rule shows the difference between the decimal and octal values for the same numbers (*note Nondecimal-numbers::), and then demonstrates the results of the -`compl', `lshift', and `rshift' functions. +`compl()', `lshift()', and `rshift()' functions. ---------- Footnotes ---------- @@ -10956,8 +11106,8 @@ brackets ([ ]): testing). It returns the directory in which DOMAIN is "bound." The default DOMAIN is the value of `TEXTDOMAIN'. If DIRECTORY is - the null string (`""'), then `bindtextdomain' returns the current - binding for the given DOMAIN. + the null string (`""'), then `bindtextdomain()' returns the + current binding for the given DOMAIN. 
File: gawk.info, Node: User-defined, Next: Indirect Calls, Prev: Built-in, Up: Functions @@ -11145,7 +11295,7 @@ way: The C `ctime' function takes a timestamp and returns it in a string, formatted in a well-known fashion. The following example uses the -built-in `strftime' function (*note Time Functions::) to create an +built-in `strftime()' function (*note Time Functions::) to create an `awk' version of `ctime': # ctime.awk @@ -11711,7 +11861,7 @@ File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N The facilities in GNU `gettext' focus on messages; strings printed by a program, either directly or via formatting with `printf' or -`sprintf'.(1) +`sprintf()'.(1) When using GNU `gettext', each application has its own "text domain". This is a unique name, such as `kpilot' or `gawk', that @@ -11752,7 +11902,7 @@ in this order: 7. For testing and development, it is possible to tell `gettext' to use `.mo' files in a different directory than the standard one by - using the `bindtextdomain' function. + using the `bindtextdomain()' function. 8. At runtime, `guide' looks up each string via a call to `gettext'. The returned string is the translated string if available, or the @@ -11864,9 +12014,9 @@ internationalization: if you want to use the current domain. *Caution:* The order of arguments to the `awk' version of the - `dcgettext' function is purposely different from the order for the - C version. The `awk' version's order was chosen to be simple and - to allow for reasonable `awk'-style default arguments. + `dcgettext()' function is purposely different from the order for + the C version. The `awk' version's order was chosen to be simple + and to allow for reasonable `awk'-style default arguments. `dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])' This built-in function returns the plural form used for NUMBER of @@ -11876,7 +12026,7 @@ internationalization: message. The default value for DOMAIN is the current value of `TEXTDOMAIN'. 
The default value for CATEGORY is `"LC_MESSAGES"'. - The same remarks as for the `dcgettext' function apply. + The same remarks as for the `dcgettext()' function apply. `bindtextdomain(DIRECTORY [, DOMAIN])' This built-in function allows you to specify the directory in which @@ -11885,8 +12035,8 @@ internationalization: returns the directory in which DOMAIN is "bound." The default DOMAIN is the value of `TEXTDOMAIN'. If DIRECTORY is - the null string (`""'), then `bindtextdomain' returns the current - binding for the given DOMAIN. + the null string (`""'), then `bindtextdomain()' returns the + current binding for the given DOMAIN. To use these facilities in your `awk' program, follow the steps outlined in *note Explaining gettext::, like so: @@ -11909,19 +12059,19 @@ outlined in *note Explaining gettext::, like so: printf(_"Number of users is %d\n", nusers) 3. If you are creating strings dynamically, you can still translate - them, using the `dcgettext' built-in function: + them, using the `dcgettext()' built-in function: message = nusers " users logged in" message = dcgettext(message, "adminprog") print message - Here, the call to `dcgettext' supplies a different text domain + Here, the call to `dcgettext()' supplies a different text domain (`"adminprog"') in which to find the message, but it uses the default `"LC_MESSAGES"' category. 4. During development, you might want to put the `.mo' file in a private directory for testing. This is done with the - `bindtextdomain' built-in function: + `bindtextdomain()' built-in function: BEGIN { TEXTDOMAIN = "guide" # our text domain @@ -11976,9 +12126,9 @@ option to create the initial `.po' file: Instead, it parses it as usual and prints all marked strings to standard output in the format of a GNU `gettext' Portable Object file. 
Also included in the output are any constant strings that appear as the -first argument to `dcgettext' or as the first and second argument to -`dcngettext'.(1) *Note I18N Example::, for the full list of steps to go -through to create and test translations for `guide'. +first argument to `dcgettext()' or as the first and second argument to +`dcngettext()'.(1) *Note I18N Example::, for the full list of steps to +go through to create and test translations for `guide'. ---------- Footnotes ---------- @@ -11991,7 +12141,7 @@ File: gawk.info, Node: Printf Ordering, Next: I18N Portability, Prev: String 9.4.2 Rearranging `printf' Arguments ------------------------------------ -Format strings for `printf' and `sprintf' (*note Printf::) present a +Format strings for `printf' and `sprintf()' (*note Printf::) present a special problem for translation. Consider the following:(1) printf(_"String `%s' has %d characters\n", @@ -12090,9 +12240,10 @@ actually almost portable, requiring very little change: it.(1) Typically, the variable `_' has the null string (`""') as its value, leaving the original string constant as the result. - * By defining "dummy" functions to replace `dcgettext', `dcngettext' - and `bindtextdomain', the `awk' program can be made to run, but - all the messages are output in the original language. For example: + * By defining "dummy" functions to replace `dcgettext()', + `dcngettext()' and `bindtextdomain()', the `awk' program can be + made to run, but all the messages are output in the original + language. For example: function bindtextdomain(dir, domain) { @@ -12109,16 +12260,16 @@ actually almost portable, requiring very little change: return (number == 1 ? string1 : string2) } - * The use of positional specifications in `printf' or `sprintf' is + * The use of positional specifications in `printf' or `sprintf()' is _not_ portable. To support `gettext' at the C level, many - systems' C versions of `sprintf' do support positional specifiers. 
- But it works only if enough arguments are supplied in the function - call. Many versions of `awk' pass `printf' formats and arguments - unchanged to the underlying C library version of `sprintf', but - only one format and argument at a time. What happens if a - positional specification is used is anybody's guess. However, - since the positional specifications are primarily for use in - _translated_ format strings, and since non-GNU `awk's never + systems' C versions of `sprintf()' do support positional + specifiers. But it works only if enough arguments are supplied in + the function call. Many versions of `awk' pass `printf' formats + and arguments unchanged to the underlying C library version of + `sprintf()', but only one format and argument at a time. What + happens if a positional specification is used is anybody's guess. + However, since the positional specifications are primarily for use + in _translated_ format strings, and since non-GNU `awk's never retrieve the translated string, this should not be a problem in practice. @@ -12203,8 +12354,8 @@ proper directory so that `gawk' can find it: -| Like, the scoop is 42 -| Pardon me, Zaphod who? - If the three replacement functions for `dcgettext', `dcngettext' and -`bindtextdomain' (*note I18N Portability::) are in a file named + If the three replacement functions for `dcgettext()', `dcngettext()' +and `bindtextdomain()' (*note I18N Portability::) are in a file named `libintl.awk', then we can run `guide.awk' unchanged as follows: $ gawk --posix -f guide.awk -f libintl.awk @@ -12246,9 +12397,9 @@ of a "grab bag" of items that are otherwise unrelated to each other. First, a command-line option allows `gawk' to recognize nondecimal numbers in input data, not just in `awk' programs. Next, two-way I/O, discussed briefly in earlier parts of this Info file, is described in -full detail, along with the basics of TCP/IP networking and BSD portal -files. 
Finally, `gawk' can "profile" an `awk' program, making it -possible to tune it for performance. +full detail, along with the basics of TCP/IP networking. Finally, +`gawk' can "profile" an `awk' program, making it possible to tune it +for performance. *note Dynamic Extensions::, discusses the ability to dynamically add new built-in functions to `gawk'. As this feature is still immature @@ -12259,7 +12410,6 @@ and likely to change, its description is relegated to an appendix. * Nondecimal Data:: Allowing nondecimal input data. * Two-way I/O:: Two-way communications with another process. * TCP/IP Networking:: Using `gawk' for network programming. -* Portal Files:: Using `gawk' with BSD portals. * Profiling:: Profiling your `awk' programs. @@ -12299,7 +12449,7 @@ leave this facility disabled. If you want it, you must explicitly request it. *Caution:* _Use of this option is not recommended._ It can break old -programs very badly. Instead, use the `strtonum' function to convert +programs very badly. Instead, use the `strtonum()' function to convert your data (*note Nondecimal-numbers::). This makes your programs easier to write and easier to read, and leads to less surprising results. @@ -12383,7 +12533,7 @@ or pipeline of programs, that can be started by the shell. waiting for the other one to do something. It is possible to close just one end of the two-way pipe to a -coprocess, by supplying a second argument to the `close' function of +coprocess, by supplying a second argument to the `close()' function of either `"to"' or `"from"' (*note Close Files And Pipes::). These strings tell `gawk' to close the end of the pipe that sends data to the process or the end that reads from it, respectively. @@ -12442,7 +12592,7 @@ regular pipes. `csh'. 
-File: gawk.info, Node: TCP/IP Networking, Next: Portal Files, Prev: Two-way I/O, Up: Advanced Features +File: gawk.info, Node: TCP/IP Networking, Next: Profiling, Prev: Two-way I/O, Up: Advanced Features 10.3 Using `gawk' for Network Programming ========================================= @@ -12517,24 +12667,9 @@ programming is documented separately. *Note Top::, for a much more complete introduction and discussion, as well as extensive examples. -File: gawk.info, Node: Portal Files, Next: Profiling, Prev: TCP/IP Networking, Up: Advanced Features - -10.4 Using `gawk' with BSD Portals -================================== - -Similar to the `/inet' special files, if `gawk' is configured with the -`--enable-portals' option (*note Quick Installation::), then `gawk' -treats files whose pathnames begin with `/p' as 4.4 BSD-style portals. - - When used with the `|&' operator, `gawk' opens the file for two-way -communications. The operating system's portal mechanism then manages -creating the process associated with the portal and the corresponding -communications with the portal's process. - - -File: gawk.info, Node: Profiling, Prev: Portal Files, Up: Advanced Features +File: gawk.info, Node: Profiling, Prev: TCP/IP Networking, Up: Advanced Features -10.5 Profiling Your `awk' Programs +10.4 Profiling Your `awk' Programs ================================== Beginning with version 3.1 of `gawk', you may produce execution traces @@ -12774,6 +12909,7 @@ full details. * AWKPATH Variable:: Searching directories for `awk' programs. * Exit Status:: `gawk''s exit status. +* Include Files:: Including other files into your program. * Obsolete:: Obsolete Options and/or features. * Undocumented:: Undocumented Options and Features. * Known Bugs:: Known Bugs in `gawk'. @@ -12876,7 +13012,6 @@ The following list describes `gawk'-specific options: an easy way to tell `gawk': "hands off my data!". 
`-c' -`--compat' `--traditional' Specifies "compatibility mode", in which the GNU extensions to the `awk' language are disabled, so that `gawk' behaves just like the @@ -12886,7 +13021,6 @@ The following list describes `gawk'-specific options: `-C' `--copyright' -`--copyleft' Print the short version of the General Public License and then exit. @@ -12940,11 +13074,10 @@ The following list describes `gawk'-specific options: `-h' `--help' -`--usage' Prints a "usage" message summarizing the short and long style options that `gawk' accepts and then exit. -`-l [value]' +`-L [value]' `--lint[=value]' Warns about constructs that are dubious or nonportable to other `awk' implementations. Some warnings are issued when `gawk' first @@ -12963,11 +13096,6 @@ The following list describes `gawk'-specific options: inappropriate construct. As `awk' programs are usually short, doing so is not burdensome. -`-L' -`--lint-old' - Warns about constructs that are not available in the original - version of `awk' from Version 7 Unix (*note V7/SVR3.1::). - `-n' `--non-decimal-data' Enable automatic interpretation of octal and hexadecimal values in @@ -13029,7 +13157,7 @@ The following list describes `gawk'-specific options: * The locale's decimal point character is used for parsing input data (*note Locales::). - * The `fflush' built-in function is not supported (*note I/O + * The `fflush()' built-in function is not supported (*note I/O Functions::). If you supply both `--traditional' and `--posix' on the command @@ -13045,13 +13173,18 @@ The following list describes `gawk'-specific options: `-S' `--sandbox' - In sandbox mode, the `system' function, input redirections with + In sandbox mode, the `system()' function, input redirections with `getline', output redirections with `print' and `printf' and dynamic extensions are disabled. 
This is particularly useful when you want to run `awk' scripts from questionable sources and need to make sure the scripts can't access your system (other than the specified input data file). +`-t' +`--lint-old' + Warns about constructs that are not available in the original + version of `awk' from Version 7 Unix (*note V7/SVR3.1::). + `-V' `--version' Prints version information for this particular copy of `gawk'. @@ -13099,7 +13232,7 @@ environment variable to turn on strict POSIX mode. If `--lint' is supplied on the command line and `gawk' turns on POSIX mode because of `POSIXLY_CORRECT', then it issues a warning message indicating that POSIX mode is in effect. You would typically set this variable in your -shell's startup file. For a Bourne-compatible shell (such as `bash'), +shell's startup file. For a Bourne-compatible shell (such as Bash), you would add these lines to the `.profile' file in your home directory: POSIXLY_CORRECT=true @@ -13235,7 +13368,7 @@ the value of `$(datadir)' generated when `gawk' was configured. You probably don't need to worry about this, though. -File: gawk.info, Node: Exit Status, Next: Obsolete, Prev: AWKPATH Variable, Up: Invoking Gawk +File: gawk.info, Node: Exit Status, Next: Include Files, Prev: AWKPATH Variable, Up: Invoking Gawk 11.5 `gawk''s Exit Status ========================= @@ -13253,9 +13386,105 @@ with the value of the C constant `EXIT_SUCCESS'. This is usually zero. non-POSIX systems, this value may be mapped to `EXIT_FAILURE'. -File: gawk.info, Node: Obsolete, Next: Undocumented, Prev: Exit Status, Up: Invoking Gawk +File: gawk.info, Node: Include Files, Next: Obsolete, Prev: Exit Status, Up: Invoking Gawk -11.6 Obsolete Options and/or Features +11.6 Including Other Files Into Your Program +============================================ + +*FIXME:* This section still needs some editing. + + Beginning with version *FIXME:* 3.1.8-bc of `gawk', the `@include' +keyword can be used to read external source `awk' files.
That gives +the ability to split huge `awk' source files into smaller and +manageable files and also to reuse common `awk' code from various `awk' +scripts. In other words, you can group together `awk' functions, used +to carry out some sort of task, in external files. These files can be +used just like function libraries, using the `@include' keyword in +conjunction with the `AWKPATH' environment variable. + + Let's see an example to demonstrate file inclusion in `gawk'. To do +so, we'll use two (trivial) `awk' scripts, namely the `test1' and +`test2' `gawk' scripts. Here follows the `test1' `gawk' script file: + + BEGIN { + print "This is script test1." + } + +and the `test2' file: + + @include "test1" + BEGIN { + print "This is script test2." + } + + Running `gawk' with the `test2' script you'll get the following +result: + + $ gawk -f test2 + -| This is script test1. + -| This is script test2. + + `gawk' runs the `test2' script where `test1' has been included in +the source of `test2' by means of the `@include' keyword. So, to +include external `awk' source files you just use `@include' followed by +the name of the file to be included in double quotes. + + NOTE: Keep in mind that this is a language construct and the file + name cannot be a string variable, but rather just a literal string + in double quotes. + + The files to be included may be nested; e.g. given a third script, +namely `test3': + + @include "test2" + BEGIN { + print "This is script test3." + } + +and running `gawk' with the `test3' script you'll get the following +result: + + $ gawk -f test3 + -| This is script test1. + -| This is script test2. + -| This is script test3. + + The file name can, of course, be a pathname, e.g. + + @include "../io_funcs" + +or + + @include "/usr/awklib/network" + +are valid. The `AWKPATH' environment variable can be of great value in +`@include' constructs.
The same rules that govern the use of the `AWKPATH' +variable in command-line file searches apply to `@include' +constructs too. This can prove very helpful in constructing `gawk' +function libraries. You can collect useful `gawk' library code in +files and put those files in a special directory. You can +then include those "libraries" either by using the full pathnames of the +files or by setting the `AWKPATH' environment variable accordingly and +then using `@include' with just the name part of the full file pathname. +Of course you can have more than one directory to keep library files; +the more complex the working environment is, the more directories you +need to organize the files to be included. + + All of this file inclusion can, of course, be carried out on +the command line, using as many `-f' options as required with the files +to be included as arguments, but the `@include' keyword can help you in +constructing self-contained `gawk' programs, thus reducing the need to +write complex and tedious command lines. + + `AWKPATH' is also used by the `@include' mechanism; that is, the +files to be included are looked for in the directories specified. Keep +in mind, however, that the current directory is always searched first, +whether or not it is listed in the `AWKPATH' string. + + +File: gawk.info, Node: Obsolete, Next: Undocumented, Prev: Include Files, Up: Invoking Gawk + +11.7 Obsolete Options and/or Features ===================================== This minor node describes features and/or command-line options from @@ -13274,7 +13503,7 @@ worked. As of version 3.2, they are no longer interpreted specially by File: gawk.info, Node: Undocumented, Next: Known Bugs, Prev: Obsolete, Up: Invoking Gawk -11.7 Undocumented Options and Features +11.8 Undocumented Options and Features ====================================== Use the Source, Luke!
@@ -13285,7 +13514,7 @@ File: gawk.info, Node: Undocumented, Next: Known Bugs, Prev: Obsolete, Up: I File: gawk.info, Node: Known Bugs, Prev: Undocumented, Up: Invoking Gawk -11.8 Known Bugs in `gawk' +11.9 Known Bugs in `gawk' ========================= * The `-F' option for changing the value of `FS' (*note Options::) @@ -13462,12 +13691,12 @@ programming use. * Nextfile Function:: Two implementations of a `nextfile' function. -* Strtonum Function:: A replacement for the built-in `strtonum' - function. +* Strtonum Function:: A replacement for the built-in + `strtonum()' function. * Assert Function:: A function for assertions in `awk' programs. -* Round Function:: A function for rounding if `sprintf' does - not do it correctly. +* Round Function:: A function for rounding if `sprintf()' + does not do it correctly. * Cliff Random Function:: The Cliff Random Number Generator. * Ordinal Functions:: Functions for using characters as numbers and vice versa. @@ -13568,7 +13797,7 @@ File: gawk.info, Node: Strtonum Function, Next: Assert Function, Prev: Nextfi 12.2.2 Converting Strings To Numbers ------------------------------------ -The `strtonum' function (*note String Functions::) is a `gawk' +The `strtonum()' function (*note String Functions::) is a `gawk' extension. The following function provides an implementation for other versions of `awk': @@ -13633,7 +13862,7 @@ computing the return value. Similar logic applies to the code that checks for and converts a hexadecimal value, which starts with `0x' or `0X'. The use of -`tolower' simplifies the computation for finding the correct numeric +`tolower()' simplifies the computation for finding the correct numeric value for each hexadecimal digit. 
Finally, if the string matches the (rather complicated) regex for a @@ -13642,7 +13871,7 @@ regular decimal integer or floating-point number, the computation `ret A commented-out test program is included, so that the function can be tested with `gawk' and the results compared to the built-in -`strtonum' function. +`strtonum()' function. File: gawk.info, Node: Assert Function, Next: Round Function, Prev: Strtonum Function, Up: General Functions @@ -13735,15 +13964,15 @@ File: gawk.info, Node: Round Function, Next: Cliff Random Function, Prev: Ass 12.2.4 Rounding Numbers ----------------------- -The way `printf' and `sprintf' (*note Printf::) perform rounding often -depends upon the system's C `sprintf' subroutine. On many machines, -`sprintf' rounding is "unbiased," which means it doesn't always round a -trailing `.5' up, contrary to naive expectations. In unbiased -rounding, `.5' rounds to even, rather than always up, so 1.5 rounds to -2 but 4.5 rounds to 4. This means that if you are using a format that -does rounding (e.g., `"%.0f"'), you should check what your system does. -The following function does traditional rounding; it might be useful if -your awk's `printf' does unbiased rounding: +The way `printf' and `sprintf()' (*note Printf::) perform rounding +often depends upon the system's C `sprintf()' subroutine. On many +machines, `sprintf()' rounding is "unbiased," which means it doesn't +always round a trailing `.5' up, contrary to naive expectations. In +unbiased rounding, `.5' rounds to even, rather than always up, so 1.5 +rounds to 2 but 4.5 rounds to 4. This means that if you are using a +format that does rounding (e.g., `"%.0f"'), you should check what your +system does. 
The following function does traditional rounding; it +might be useful if your awk's `printf' does unbiased rounding: # round.awk --- do normal rounding function round(x, ival, aval, fraction) @@ -13799,7 +14028,7 @@ less than 10 lines of `awk' code: This algorithm requires an initial "seed" of 0.1. Each new value uses the current seed as input for the calculation. If the built-in -`rand' function (*note Numeric Functions::) isn't random enough, you +`rand()' function (*note Numeric Functions::) isn't random enough, you might try using this function instead. @@ -13909,7 +14138,7 @@ but it should also have a reasonable default behavior. It is called with an array as well as the beginning and ending indices of the elements in the array to be merged. This assumes that the array indices are numeric--a reasonable assumption since the array was likely -created with `split' (*note String Functions::): +created with `split()' (*note String Functions::): # join.awk --- join an array into a string function join(array, start, end, sep, result, i) @@ -13945,9 +14174,9 @@ File: gawk.info, Node: Gettimeofday Function, Prev: Join Function, Up: Genera 12.2.8 Managing the Time of Day ------------------------------- -The `systime' and `strftime' functions described in *note Time +The `systime()' and `strftime()' functions described in *note Time Functions::, provide the minimum functionality necessary for dealing -with the time of day in human readable form. While `strftime' is +with the time of day in human readable form. While `strftime()' is extensive, the control formats are not necessarily easy to remember or intuitively obvious when reading a program. @@ -14016,9 +14245,9 @@ current time formatted in the same way as the `date' utility: } The string indices are easier to use and read than the various -formats required by `strftime'. The `alarm' program presented in *note -Alarm Program::, uses this function. 
A more general design for the -`gettimeofday' function would have allowed the user to supply an +formats required by `strftime()'. The `alarm' program presented in +*note Alarm Program::, uses this function. A more general design for +the `gettimeofday' function would have allowed the user to supply an optional timestamp value to use instead of the current time. @@ -14388,7 +14617,7 @@ to process both normal and GNU-style long options (*note Options::). handy in `awk' programs as well. Following is an `awk' version of `getopt'. This function highlights one of the greatest weaknesses in `awk', which is that it is very poor at manipulating single characters. -Repeated calls to `substr' are necessary for accessing individual +Repeated calls to `substr()' are necessary for accessing individual characters (*note String Functions::).(1) The discussion that follows walks through the code a bit at a time: @@ -14468,7 +14697,7 @@ one at a time. If `_opti' is equal to zero, it is set to two, which is the index in the string of the next character to look at (we skip the `-', which is at position one). The variable `thisopt' holds the character, obtained -with `substr'. It is saved in `Optopt' for the main program to use. +with `substr()'. It is saved in `Optopt' for the main program to use. If `thisopt' is not in the `options' string, then it is an invalid option. If `Opterr' is nonzero, `getopt' prints an error message on @@ -14575,7 +14804,7 @@ use `getopt' to process their arguments. (1) This function was written before `gawk' acquired the ability to split strings into single characters using `""' as the separator. We -have left it alone, since using `substr' is more portable. +have left it alone, since using `substr()' is more portable. 
File: gawk.info, Node: Passwd Functions, Next: Group Functions, Prev: Getopt Function, Up: Library Functions @@ -14649,7 +14878,7 @@ Full name The user's full name, and perhaps other Home directory The user's login (or "home") directory (familiar to shell programmers as `$HOME'). Login shell The program that is run when the user logs in. - This is usually a shell, such as `bash'. + This is usually a shell, such as Bash. A few lines representative of `pwcat''s output are as follows: @@ -15045,7 +15274,7 @@ very simple, relying on `awk''s associative arrays to do work. The `id' program in *note Id Program::, uses these functions. -File: gawk.info, Node: Sample Programs, Next: Language History, Prev: Library Functions, Up: Top +File: gawk.info, Node: Sample Programs, Next: Debugger, Prev: Library Functions, Up: Top 13 Practical `awk' Programs *************************** @@ -15366,8 +15595,8 @@ out between the fields: This version of `cut' relies on `gawk''s `FIELDWIDTHS' variable to do the character-based cutting. While it is possible in other `awk' -implementations to use `substr' (*note String Functions::), it is also -extremely painful. The `FIELDWIDTHS' variable supplies an elegant +implementations to use `substr()' (*note String Functions::), it is +also extremely painful. The `FIELDWIDTHS' variable supplies an elegant solution to the problem of picking the input line apart by characters. @@ -15713,8 +15942,8 @@ File: gawk.info, Node: Split Program, Next: Tee Program, Prev: Id Program, U 13.2.4 Splitting a Large File into Pieces ----------------------------------------- -The `split' program splits large text files into smaller pieces. Usage -is as follows: +The `split()' program splits large text files into smaller pieces. +Usage is as follows: split [-COUNT] file [ PREFIX ] @@ -15726,7 +15955,7 @@ lines in them instead of 1000. 
To change the name of the output files to something like `myfileaa', `myfileab', and so on, supply an additional argument that specifies the file name prefix. - Here is a version of `split' in `awk'. It uses the `ord' and `chr' + Here is a version of `split()' in `awk'. It uses the `ord' and `chr' functions presented in *note Ordinal Functions::. The program first sets its defaults, and then tests to make sure @@ -15911,7 +16140,7 @@ usage is as follows: The options for `uniq' are: `-d' - Pnly print only repeated lines. + Print only repeated lines. `-u' Print only nonrepeated lines. @@ -16024,13 +16253,13 @@ characters. If no field count and no character count are specified, `are_equal' simply returns one or zero depending upon the result of a simple string comparison of `last' and `$0'. Otherwise, things get more complicated. If fields have to be skipped, each line is broken into an -array using `split' (*note String Functions::); the desired fields are -then joined back into a line using `join'. The joined lines are stored -in `clast' and `cline'. If no fields are skipped, `clast' and `cline' -are set to `last' and `$0', respectively. Finally, if characters are -skipped, `substr' is used to strip off the leading `charcount' -characters in `clast' and `cline'. The two strings are then compared -and `are_equal' returns the result: +array using `split()' (*note String Functions::); the desired fields +are then joined back into a line using `join'. The joined lines are +stored in `clast' and `cline'. If no fields are skipped, `clast' and +`cline' are set to `last' and `$0', respectively. Finally, if +characters are skipped, `substr()' is used to strip off the leading +`charcount' characters in `clast' and `cline'. 
The two strings are +then compared and `are_equal' returns the result: function are_equal( n, m, clast, cline, alast, aline) { @@ -16429,7 +16658,7 @@ alarm: exit 1 } - Finally, the program uses the `system' function (*note I/O + Finally, the program uses the `system()' function (*note I/O Functions::) to call the `sleep' utility. The `sleep' utility simply pauses for the given number of seconds. If the exit status is not zero, the program assumes that `sleep' was interrupted and exits. If `sleep' @@ -16481,9 +16710,9 @@ most of the job. The `translate' program demonstrates one of the few weaknesses of standard `awk': dealing with individual characters is very painful, -requiring repeated use of the `substr', `index', and `gsub' built-in -functions (*note String Functions::).(2) There are two functions. The -first, `stranslate', takes three arguments: +requiring repeated use of the `substr()', `index()', and `gsub()' +built-in functions (*note String Functions::).(2) There are two +functions. The first, `stranslate', takes three arguments: `from' A list of characters from which to translate. @@ -16559,10 +16788,10 @@ record: function, it is not necessarily efficient, and we (the `gawk' authors) started to consider adding a built-in function. However, shortly after writing this program, we learned that the System V Release 4 `awk' had -added the `toupper' and `tolower' functions (*note String Functions::). -These functions handle the vast majority of the cases where character -transliteration is necessary, and so we chose to simply add those -functions to `gawk' as well and then leave well enough alone. +added the `toupper()' and `tolower()' functions (*note String +Functions::). These functions handle the vast majority of the cases +where character transliteration is necessary, and so we chose to simply +add those functions to `gawk' as well and then leave well enough alone. 
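A minimal sketch of the common case that `toupper()' and `tolower()' cover, written as shell wrappers around one-line `awk' programs. The wrapper names `upcase' and `downcase' are hypothetical and do not appear in the `translate' program:

```shell
# Convert each input line to uppercase using the built-in toupper().
upcase() {
    awk '{ print toupper($0) }'
}

# Convert each input line to lowercase using the built-in tolower().
downcase() {
    awk '{ print tolower($0) }'
}

echo "Hello, World" | upcase
echo "Hello, World" | downcase
```

Anything beyond case conversion, such as mapping arbitrary character sets, still needs a user-level function like `stranslate'.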
An obvious improvement to this program would be to set up the `t_ar' array only once, in a `BEGIN' rule. However, this assumes that the @@ -16741,8 +16970,8 @@ itself on real text files: having an alphabetized table of how frequently each word occurs. The way to solve these problems is to use some of `awk''s more -advanced features. First, we use `tolower' to remove case -distinctions. Next, we use `gsub' to remove punctuation characters. +advanced features. First, we use `tolower()' to remove case +distinctions. Next, we use `gsub()' to remove punctuation characters. Finally, we use the system `sort' utility to process the output of the `awk' script. Here is the new version of the program: @@ -16874,9 +17103,9 @@ input files: The following program, `extract.awk', reads through a Texinfo source file and does two things, based on the special comments. Upon seeing `@c system ...', it runs a command, by extracting the command text from -the control line and passing it on to the `system' function (*note I/O -Functions::). Upon seeing `@c file FILENAME', each subsequent line is -sent to the file FILENAME, until `@c endfile' is encountered. The +the control line and passing it on to the `system()' function (*note +I/O Functions::). Upon seeing `@c file FILENAME', each subsequent line +is sent to the file FILENAME, until `@c endfile' is encountered. The rules in `extract.awk' match either `@c' or `@comment' by letting the `omment' part be optional. Lines containing `@group' and `@end group' are simply removed. `extract.awk' uses the `join' library function @@ -16911,7 +17140,7 @@ looks something like this: `extract.awk' begins by setting `IGNORECASE' to one, so that mixed upper- and lowercase letters in the directives won't matter. 
- The first rule handles calling `system', checking that a command is + The first rule handles calling `system()', checking that a command is given (`NF' is at least three) and also checking that the command exits with a zero exit status, signifying OK: @@ -16956,7 +17185,7 @@ comments within examples are also ignored. Most of the work is in the following few lines. If the line has no `@' symbols, the program can print it directly. Otherwise, each leading `@' must be stripped off. To remove the `@' symbols, the line -is split into separate elements of the array `a', using the `split' +is split into separate elements of the array `a', using the `split()' function (*note String Functions::). The `@' symbol is used as the separator character. Each element of `a' that is empty indicates two successive `@' symbols in the original line. For each two empty @@ -17047,7 +17276,7 @@ middle of a pipeline: Here, `s/old/new/g' tells `sed' to look for the regexp `old' on each input line and globally replace it with the text `new', i.e., all the -occurrences on a line. This is similar to `awk''s `gsub' function +occurrences on a line. This is similar to `awk''s `gsub()' function (*note String Functions::). The following program, `awksed.awk', accepts at least two @@ -17525,7 +17754,1042 @@ following copyright terms: We leave it to you to determine what the program does. -File: gawk.info, Node: Language History, Next: Installation, Prev: Sample Programs, Up: Top +File: gawk.info, Node: Debugger, Next: Language History, Prev: Sample Programs, Up: Top + +14 `dgawk': The `awk' Debugger +****************************** + +It would be nice if computer programs worked perfectly the first time +they were run, but in real life, this rarely happens for programs of +any complexity. Thus, most programming languages have facilities +available for "debugging" programs, and now `awk' is no exception. 
+ + The `dgawk' debugger is purposely modeled after the GNU Debugger +(GDB) command-line debugger. If you are familiar with GDB, learning +`dgawk' is easy. + +* Menu: + +* Debugging:: Introduction to `dgawk'. +* Sample dgawk session:: Sample `dgawk' session. +* List of Debugger Commands:: Main `dgawk' Commands. +* Readline Support:: Readline Support. +* Dgawk Limitations:: Limitations and future plans. + + +File: gawk.info, Node: Debugging, Next: Sample dgawk session, Up: Debugger + +14.1 Introduction to `dgawk' +============================ + +* Menu: + +* Debugging Concepts:: Debugging In General. +* Debugging Terms:: Additional Debugging Concepts. +* Awk Debugging:: Awk Debugging. + + +File: gawk.info, Node: Debugging Concepts, Next: Debugging Terms, Up: Debugging + +14.1.1 Debugging In General +--------------------------- + +(If you have used debuggers in other languages, you may want to skip +ahead to the next section on the specific features of the `awk' +debugger.) + + Of course, a debugging program cannot remove bugs for you, since it +has no way of knowing what you or your users consider a "bug" and what +is a "feature." (Sometimes, we humans have a hard time with this +ourselves.) In that case, what can you expect from such a tool? The +answer to that depends on the language being debugged, but in general, +it includes at least the following: + + * The ability to watch a program execute its instructions one by one, + giving you, the programmer, the opportunity to think about what is + happening on a time scale of seconds, minutes, or hours, rather + than the nanosecond time scale at which the code usually runs. + + * The opportunity to not only passively observe the operation of your + program, but to control it and try different paths of execution, + without having to change your source files. 
+ + * The chance to see the values of data in the program at any point in + execution, and also to change that data on the fly, to see how that + affects what happens afterwards. (This often includes the ability + to look at internal data structures besides the variables you + actually defined in your code.) + + * The ability to obtain additional information about your program's + state or even its internal structure. + + All of these tools provide a great amount of help in using your own +skills and understanding of the goals of your program to find where it +is going wrong (or, for that matter, to better comprehend a perfectly +functional program that you or someone else wrote). + + +File: gawk.info, Node: Debugging Terms, Next: Awk Debugging, Prev: Debugging Concepts, Up: Debugging + +14.1.2 Additional Debugging Concepts +------------------------------------ + +Before diving in to the details, we need to introduce a few more +important concepts that apply to just about all debuggers, including +`dgawk'. + +"Stack Frame" + Programs generally call functions during the course of their + execution. One function can call another, or a function can call + itself (recursion). You can view the chain of called functions + (main program calls A, which calls B, which calls C), as a stack + of executing functions: the currently running function is the + topmost one on the stack, and when it finishes (returns), the next + one down then becomes the active function. Such a stack is termed + a "call stack". + + For each function on the call stack, the system maintains a data + area that contains the function's parameters, local variables, and + return value, as well as any other "bookkeeping" information + needed to manage the call stack. This data area is termed a + "stack frame". + + `gawk' also follows this model, and `dgawk' gives you access to + the call stack and to each stack frame.
You can see the call + stack, as well as from where each function on the stack was + invoked. Commands that print the call stack print information about + each stack frame (as detailed later on). + +"Breakpoint" + During debugging, you often wish to let the program run until it + reaches a certain point, and then continue execution from there one + statement (or instruction) at a time. The way to do this is to set + a "breakpoint" within the program. A breakpoint is where the + execution of the program should break off (stop), so that you can + take over control of the program's execution. You can add and + remove as many breakpoints as you like. + +"Watchpoint" + A watchpoint is similar to a breakpoint. The difference is that + breakpoints are oriented around the code: stop when a certain + point in the code is reached. A watchpoint, however, specifies + that program execution should stop when a _data value_ is changed. + This is useful, since sometimes it happens that a variable + receives an erroneous value, and it's hard to track down where + this happens just by looking at the code. By using a watchpoint, + you can stop whenever a variable is assigned to, and usually find + the errant code quite quickly. + + +File: gawk.info, Node: Awk Debugging, Prev: Debugging Terms, Up: Debugging + +14.1.3 Awk Debugging +-------------------- + +Debugging an `awk' program has some specific aspects that are not +shared with other programming languages. + + First of all, the fact that `awk' programs usually take input +line-by-line from a file or files and operate on those lines using +specific rules makes it especially useful to organize viewing the +execution of the program in terms of these rules. As we will see, each +`awk' rule is treated almost like a function call, with its own +specific block of instructions. + + In addition, since `awk' is by design a very concise language, it is +easy to lose sight of everything that is going on "inside" each line of +`awk' code. 
The debugger provides the opportunity to look at the +individual primitive instructions carried out by the higher-level `awk' +commands. + + +File: gawk.info, Node: Sample dgawk session, Next: List of Debugger Commands, Prev: Debugging, Up: Debugger + +14.2 Sample `dgawk' session +=========================== + +In order to illustrate the use of `dgawk', let's look at a sample +debugging session. We will use the `awk' implementation of the POSIX +`uniq' command described earlier (*note Uniq Program::) as our example. + +* Menu: + +* dgawk invocation:: `dgawk' Invocation. +* Finding The Bug:: Finding The Bug. + + +File: gawk.info, Node: dgawk invocation, Next: Finding The Bug, Up: Sample dgawk session + +14.2.1 `dgawk' Invocation +------------------------- + +Starting `dgawk' is exactly like running `awk'. The file(s) containing +the program and any supporting code are given on the command line as +arguments to one or more `-f' options. (`dgawk' is not designed to +debug command-line programs, only programs contained in files.) In our +case, we call `dgawk' like this: + + $ dgawk -f getopt.awk -f join.awk -f uniq.awk inputfile + +where both `getopt.awk' and `uniq.awk' are in `$AWKPATH'. (Experienced +users of `gdb' or similar debuggers should note that this syntax is +slightly different from what they are used to. With `dgawk', the +arguments for running the program are given in the command line to the +debugger rather than as part of the `run' command at the debugger +prompt.) + + Instead of immediately running the program on `inputfile', as `gawk' +would ordinarily do, `dgawk' merely loads all the program source files, +compiles them internally, and then gives us a prompt: + + dgawk> + +from which we can issue commands to the debugger. At this point, no +code has been executed. 
+ + +File: gawk.info, Node: Finding The Bug, Prev: dgawk invocation, Up: Sample dgawk session + +14.2.2 Finding The Bug +---------------------- + +Let's say that we are having a problem using (a faulty version of) +`uniq.awk' in the "field-skipping" mode, and it doesn't seem to be +catching lines which should be identical when skipping the first field, +such as: + + awk is a wonderful program! + gawk is a wonderful program! + + This could happen if we were thinking (C-like) of the fields in a +record as being numbered in a zero-based fashion, so instead of the +lines: + + clast = join(alast, fcount+1, n) + cline = join(aline, fcount+1, m) + +we wrote: + + clast = join(alast, fcount, n) + cline = join(aline, fcount, m) + + The first thing we usually want to do when trying to investigate a +problem like this is to put a breakpoint in the program so that we can +watch it at work and catch what it is doing wrong. A reasonable spot +for a breakpoint in `uniq.awk' is at the beginning of the function +`are_equal', which compares the current line with the previous one. To +set the breakpoint, use the `b' (breakpoint) command: + + dgawk> b are_equal + -| Breakpoint 1 set at file `awklib/eg/prog/uniq.awk', line 64 + + The debugger tells us the file and line number where the breakpoint +is. Now type `r' or `run' and the program runs until it hits the +breakpoint the first time: + + dgawk> r + -| Starting program: + -| Stopping in Rule ... + -| Breakpoint 1, are_equal(n, m, clast, cline, alast, aline) at `awklib/eg/prog/uniq.awk':64 + -| 64 if (fcount == 0 && charcount == 0) + dgawk> + + Now we can look at what's going on inside our program. First of all, +let's see how we got to where we are. 
At the prompt, we type `bt' +(short for "backtrace"), and `dgawk' responds with a listing of the +current stack frames: + + dgawk> bt + -| #0 are_equal(n, m, clast, cline, alast, aline) at `awklib/eg/prog/uniq.awk':69 + -| #1 in main() at `awklib/eg/prog/uniq.awk':89 + + This tells us that `are_equal' was called by the main program at +line 89 of `uniq.awk'. (This is not a big surprise, since this is the +only call to `are_equal' in the program, but in more complex programs, +knowing who called a function and with what parameters can be the key +to finding the source of the problem.) + + Now that we're in `are_equal', we can start looking at the values of +some variables. Let's say we type `p n' (`p' is short for "print"). +We would expect to see the value of `n', a parameter to `are_equal'. +Actually, `dgawk' gives us: + + dgawk> p n + -| n = untyped variable + +In this case, `n' is an uninitialized local variable, since the +function was called without arguments (*note Function Calls::). + + A more useful variable to display might be the current record: + + dgawk> p $0 + -| $0 = string ("gawk is a wonderful program!") + +This might be a bit puzzling at first since this is the second line of +our test input above. Let's look at `NR': + + dgawk> p NR + -| NR = number (2) + +So we can see that `are_equal' was only called for the second record of +the file. Of course, this is because our program contained a rule for +`NR == 1': + + NR == 1 { + last = $0 + next + } + + OK, let's just check that that rule worked correctly: + + dgawk> p last + -| last = string ("awk is a wonderful program!") + + Everything we have done so far has verified that the program has +worked as planned, up to and including the call to `are_equal', so the +problem must be inside this function. To investigate further, we have +to begin "stepping through" the lines of `are_equal'. 
We start by +typing `n' (for "next"): + + dgawk> n + -| 67 if (fcount > 0) { + + This tells us that `gawk' is now ready to execute line 67, which +decides whether to give the lines the special "field skipping" treatment +indicated by the `-f' command-line option. (Notice that we skipped +from where we were before at line 64 to here, since the condition in +line 64 + + if (fcount == 0 && charcount == 0) + +was false.) + + Continuing to step, we now get to the splitting of the current and +last records: + + dgawk> n + -| 68 n = split(last, alast) + dgawk> n + -| 69 m = split($0, aline) + + At this point, we should be curious to see what our records were +split into, so we try to look: + + dgawk> p n m alast aline + -| n = number (5) + -| m = number (5) + -| alast = array, 5 elements + -| aline = array, 5 elements + +(The `p' command can take more than one argument, similar to `awk''s +`print' statement.) + + This is kind of disappointing, though. All we found out is that +there are five elements in each of our arrays. Useful enough (we now +know that none of the words were accidentally left out), but what if we +want to see inside the array? + + The first choice would be to use subscripts: + + dgawk> p alast[0] + -| "0" not in array `alast' + +Oops! + + dgawk> p alast[1] + -| alast["1"] = string ("awk") + + This would be kind of slow for a 100-member array, though, so +`dgawk' provides a shortcut (reminiscent of another language not to be +mentioned): + + dgawk> p @alast + -| alast["4"] = string ("wonderful") + -| alast["5"] = string ("program!") + -| alast["1"] = string ("awk") + -| alast["2"] = string ("is") + -| alast["3"] = string ("a") + + Ignoring the ordering of the elements for now (a `dgawk' internals +issue), it looks like we got this far OK. Let's take another step or +two: + + dgawk> n + -| 70 clast = join(alast, fcount, n) + dgawk> n + -| 71 cline = join(aline, fcount, m) + + Well, here we are at our error (sorry to spoil the suspense).
What we +had in mind was to join the fields starting from the second one to make +the virtual record to compare, and if the first field was numbered zero, +this would work. Let's look at what we've got: + + dgawk> p cline clast + -| cline = string ("gawk is a wonderful program!") + -| clast = string ("awk is a wonderful program!") + + Hey, those look pretty familiar! They're just our original, +unaltered, input records. A little thinking (the human brain is still +the best debugging tool), and we realize that we were off by one! + + We get out of `dgawk': + + dgawk> q + -| The program is running. Exit anyway (y/n)? y + +Then we get into an editor: + + clast = join(alast, fcount+1, n) + cline = join(aline, fcount+1, m) + +and problem solved! + + +File: gawk.info, Node: List of Debugger Commands, Next: Readline Support, Prev: Sample dgawk session, Up: Debugger + +14.3 Main `dgawk' Commands +========================== + +The `dgawk' command set can be divided into the following categories: + + * Breakpoint control + + * Execution control + + * Viewing and changing data + + * Working with the stack + + * Getting information + + * Miscellaneous + + Each of these are discussed in the following subsections. In the +following descriptions, commands which may be abbreviated show the +abbreviation on a second description line. A `dgawk' command name may +also be truncated if that partial name is unambiguous. `dgawk' does +have a built-in capability to automatically repeat the previous command +when just hitting <Enter>. This works for the commands `list', `next', +`nexti', `step', `stepi' and `continue' executed without any argument. + +* Menu: + +* Breakpoint Control:: Control of breakpoints. +* Dgawk Execution Control:: Control of execution. +* Viewing And Changing Data:: Viewing and changing data. +* Dgawk Stack:: Dealing with the stack. +* Dgawk Info:: Obtaining information about the program and + the debugger state. 
+* Miscellaneous Dgawk Commands:: Miscellaneous Commands. + + +File: gawk.info, Node: Breakpoint Control, Next: Dgawk Execution Control, Up: List of Debugger Commands + +14.3.1 Control Of Breakpoints +----------------------------- + +As we saw above, the first thing you probably want to do in a debugging +session is to get your breakpoints set up, since otherwise your program +will just run as if it was not under the debugger. The commands for +controlling breakpoints are: + +`break' [[FILENAME`:']N | FUNCTION] [`"EXPRESSION"'] +`b' [[FILENAME`:']N | FUNCTION] [`"EXPRESSION"'] + Without any argument, set a breakpoint at the next instruction to + be executed in the selected stack frame. Arguments can be one of + the following: + + N + Set a breakpoint at line number N in the current source file. + + FILENAME`:'N + Set a breakpoint at line number N in source file FILENAME. + + FUNCTION + Set a breakpoint at entry to (the first instruction of) + function FUNCTION. + + With a breakpoint, you may also supply a condition. This is an + `awk' expression that `dgawk' evaluates whenever the breakpoint is + reached. If the condition is true, then `dgawk' stops execution + and prompts for a command. Otherwise, `dgawk' continues executing + the program. + +`clear' [[FILENAME`:']N | FUNCTION] + Without any argument, delete any breakpoint at the next instruction + to be executed in the selected stack frame. If the program stops at + a breakpoint, this deletes that breakpoint so that the program + does not stop at that location again. + + N + Delete breakpoint(s) set at line number N in the current + source file. + + FILENAME`:'N + Delete breakpoint(s) set at line number N in source file + FILENAME. + + FUNCTION + Delete breakpoint(s) set at entry to function FUNCTION. + +`condition' N `"EXPRESSION"' + Add a condition to existing breakpoint or watchpoint N. The + condition is an `awk' expression that `dgawk' evaluates whenever + the breakpoint is reached. 
If the condition is true, then `dgawk' + stops execution and prompts for a command. Otherwise, `dgawk' + continues executing the program. + +`delete' [N1 N2 ...] [N-M] +`d' [N1 N2 ...] [N-M] + Delete specified breakpoints or a range of breakpoints. Deletes + all defined breakpoints if no argument is supplied. + +`disable' [N1 N2 ... | N-M] + Disable specified breakpoints or a range of breakpoints. Without + any argument, disables all breakpoints. + +`enable' [`once' | `del'] [N1 N2 ...] [N-M] +`e' [`once' | `del'] [N1 N2 ...] [N-M] + Enable specified breakpoints or a range of breakpoints. Without + any argument, enables all breakpoints. Optionally, you can + specify how to enable the breakpoint: + + `del' + Enable breakpoint(s) temporarily, then delete it when the + program stops at the breakpoint. + + `once' + Enable breakpoint(s) temporarily, then disable it when the + program stops at the breakpoint. + +`ignore' N COUNT + Ignore breakpoint number N the next COUNT times it is hit. + +`tbreak' [[FILENAME`:']N | FUNCTION] +`t' [[FILENAME`:']N | FUNCTION] + Set a temporary breakpoint (enabled for only one stop). + + +File: gawk.info, Node: Dgawk Execution Control, Next: Viewing And Changing Data, Prev: Breakpoint Control, Up: List of Debugger Commands + +14.3.2 Control of Execution +--------------------------- + +Now that your breakpoints are ready, you can start running the program +and observing its behavior. There are more commands for controlling +execution of the program than we saw in our earlier example: + +`commands' [N] +`silent' +... +`end' + Set a list of commands to be executed upon stopping at a + breakpoint or watchpoint. N is the breakpoint or watchpoint + number. Without a number, the last one set is used. The actual + commands follow starting on the next line and are terminated by + the `end' command. If the command `silent' is in the list, the + usual messages about stopping at a breakpoint and the source line + are not printed.
Any command in the list that resumes execution + (e.g. `continue') terminates the list (an implicit `end'), and + subsequent commands are ignored. For example: + + dgawk> commands + > silent + > printf "A silent breakpoint; i = %d\n", i + > info locals + > set i = 10 + > continue + > end + dgawk> + +`continue' [COUNT] +`c' [COUNT] + Resume program execution. If continued from a breakpoint and COUNT + is specified, ignores the breakpoint at that location the next + COUNT times before stopping. + +`finish' + Execute until the selected stack frame returns. Prints the + returned value. + +`next' [COUNT] +`n' [COUNT] + Continue execution to the next source line, stepping over function + calls. The argument COUNT controls how many times to repeat the + action, as in `step'. + +`nexti' [COUNT] +`ni' [COUNT] + Execute one (or COUNT) instruction(s), stepping over function + calls. + +`return' [VALUE] + Cancel execution of a function call. If VALUE (either a string or a + number) is specified, it is used as the function's return value. + If used in a frame other than the innermost one (the currently + executing function, i.e., frame number 0), discard all inner + frames in addition to the selected one, and the caller of that + frame becomes the innermost frame. + +`run' +`r' + Start/restart execution of the program. When restarting, `dgawk' + retains the current breakpoints, watchpoints, command history, + automatic display variables, and debugger options. + +`step' [COUNT] +`s' [COUNT] + Continue execution until control reaches a different source line + in the current stack frame. `step' steps inside any function + called within the line. If the argument COUNT is supplied, steps + that many times before stopping, unless it encounters a breakpoint + or watchpoint. + +`stepi' [COUNT] +`si' [COUNT] + Execute one (or COUNT) instruction(s), stepping inside function + calls. 
(For illustration of what is meant by an "instruction" in + `gawk', see the output shown under `dump' in *note Miscellaneous + Dgawk Commands::). + +`until' [[FILENAME`:']N | FUNCTION] +`u' [[FILENAME`:']N | FUNCTION] + Without any argument, continues execution until a line past the + current line in current stack frame is reached. With argument, + continues execution until the specified location is reached, or + the current stack frame returns. + + +File: gawk.info, Node: Viewing And Changing Data, Next: Dgawk Stack, Prev: Dgawk Execution Control, Up: List of Debugger Commands + +14.3.3 Viewing and Changing Data +-------------------------------- + +The commands for viewing and changing variables inside of `gawk' are: + +`display' [VAR | `$'N] + Add variable VAR (or field `$N') to the display list. The value + of the variable or field is displayed each time the program stops. + Each variable added to the list is identified by a unique number: + + dgawk> display x + -| 10: x = 1 + + displays the assigned item number, the variable name and its + current value. If the display variable refers to a function + parameter, it is silently deleted from the list as soon as the + execution reaches a context where no such variable of the given + name exists. Without argument, `display' displays the current + values of items on the list. + +`eval "AWK STATEMENTS"' + Evaluate AWK STATEMENTS in the context of the running program. + You can do anything that an `awk' program would do: assign values + to variables, call functions, and so on. + +`eval' PARAM, ... +AWK STATEMENTS +`end' + This form of `eval' is similar, but it allows you to define "local + variables" that exist in the context of the AWK STATEMENTS, + instead of using variables or function parameters defined by the + program. + +`print' VAR1[`,' VAR2 ...] +`p' VAR1[`,' VAR2 ...] + Print the value of a `gawk' variable or field. 
Fields must be + referenced by constants: + + dgawk> print $3 + + prints the third field in the input record (if the specified field + does not exist, it prints `Null field'). A variable can be an + array element, with the subscripts being constant values. To print + the contents of an array, prefix the name of the array with the + `@' symbol: + + dgawk> print @a + + prints the index and the corresponding value for all elements in + the array `a'. + +`printf' FORMAT [`,' ARG ...] + Print formatted text. The FORMAT may include escape sequences, + such as `\n' (*note Escape Sequences::). No newline is printed + unless one is specified. + +`set' VAR`='VALUE + Assign a constant (number or string) value to an `awk' variable or + field. String values must be enclosed between double quotes + (`"..."'). + + You can also set special `awk' variables, such as `FS', `NF', + `NR', etc. + +`watch' VAR | `$'N [`"EXPRESSION"'] +`w' VAR | `$'N [`"EXPRESSION"'] + Add variable VAR (or field `$N') to the watch list. `dgawk' then + stops whenever the value of the variable or field changes. Each + watched item is assigned a number which can be used to delete it + from the watch list using the `unwatch' command. + + With a watchpoint, you may also supply a condition. This is an + `awk' expression that `dgawk' evaluates whenever the watchpoint is + reached. If the condition is true, then `dgawk' stops execution + and prompts for a command. Otherwise, `dgawk' continues executing + the program. + +`undisplay' [N] + Remove item number N (or all items, if no argument) from the + automatic display list. + +`unwatch' [N] + Remove item number N (or all items, if no argument) from the watch + list.
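The data commands above can also be collected in a command file rather than typed at the prompt. The following shell sketch is only an illustration under stated assumptions: the command file name is hypothetical, the program files are the ones from the sample session, and the `-R FILE' option is the one described with the `source' command. What `dgawk' does once the command file runs out is not covered here, so treat the invocation as untested.

```shell
# Hypothetical location for the command file; any name works.
cmdfile="${TMPDIR:-/tmp}/uniq-debug.cmds"

# Write the debugger commands. Lines starting with "#" are comments,
# which the text says are allowed in a command file.
cat > "$cmdfile" <<'EOF'
# Stop at entry to are_equal, then look at some data.
break are_equal
run
display NR
print $0
print last
# Stop again as soon as clast changes.
watch clast
continue
print clast
EOF

# Run the session only if dgawk is installed. The program file names
# are taken from the sample session and are assumed to be in $AWKPATH.
if command -v dgawk >/dev/null 2>&1; then
    dgawk -R "$cmdfile" -f getopt.awk -f join.awk -f uniq.awk inputfile
fi
```

Keeping the commands in a file makes it easy to repeat the same inspection after each change to the program.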
+ + + +File: gawk.info, Node: Dgawk Stack, Next: Dgawk Info, Prev: Viewing And Changing Data, Up: List of Debugger Commands + +14.3.4 Dealing With The Stack +----------------------------- + +Whenever you run a program which contains any function calls, `gawk' +maintains a stack which has all of the functions leading up to where +the program is right now. You can see how you got to where you are, +and also move around in the stack to see what the state of things was +in the functions which called the one you are in. The commands for +doing this are: + +`backtrace' [COUNT] +`bt' [COUNT] + Print a backtrace of all function calls (stack frames), or + innermost COUNT frames if COUNT > 0. Print the outermost COUNT + frames if COUNT < 0. The backtrace displays the name and + arguments to each function, the source file name, and the line + number. + +`down' [COUNT] + Move COUNT (default 1) frames down the stack toward the innermost + frame. Then select and print the frame. + +`frame' [N] +`f' [N] + Select and print (frame number, function and argument names, + source file, and the source line) stack frame N. Frame 0 is the + currently executing, or "innermost", frame (function call), frame + 1 is the frame that called the innermost one. The highest numbered + frame is the one for the main program. + +`up' [COUNT] + Move COUNT (default 1) frames up the stack toward the outermost + frame. Then select and print the frame. + + +File: gawk.info, Node: Dgawk Info, Next: Miscellaneous Dgawk Commands, Prev: Dgawk Stack, Up: List of Debugger Commands + +14.3.5 Obtaining Information About The Program and The Debugger State +--------------------------------------------------------------------- + +Besides looking at the values of variables, there is often a need to get +other sorts of information about the state of your program and of the +debugging environment itself. `dgawk' has one command which provides +this information, appropriately called `info'.
`info' is used with one +of a number of arguments which tell it exactly what you want to know: + +`info' WHAT +`i' WHAT + The value for WHAT should be one of the following: + + `args' + Arguments of the selected frame. + + `break' + List all currently set breakpoints. + + `display' + List of all items in the automatic display list. + + `frame' + Description of the selected stack frame. + + `functions' + List all function definitions including source file names and + line numbers. + + `locals' + Local variables of the selected frame. + + `source' + The name of the current source file. Each time the program + stops, the current source file is the file containing the + current instruction. When `dgawk' first starts, the current + source file is the first file included via the `-f' option. + The `list FILENAME:LINENO' command can be used at any time to + change the current source. + + `sources' + List all program sources. + + `variables' + List all global variables. + + `watch' + List of all items in the watch list. + + Additional commands give you control over the debugger, the ability +to save the debugger's state, and the ability to run debugger commands +from a file. The commands are: + +`option' [NAME[`='VALUE]] +`o' [NAME[`='VALUE]] + Without an argument, display the available debugger options and + their current values. `option NAME' shows the current value of the + named option. `option NAME=VALUE' assigns a new value to the named + option. The available options are: + + `history_size' + The maximum number of lines to keep in the history file + `./.dgawk_history'. The default is 100. + + `listsize' + The number of lines that `list' prints. The default is 15. + + `outfile' + Sends `gawk' output to a file; debugger output still goes to + standard output. An empty string (`""') resets output to + standard output. + + `prompt' + The debugger prompt. The default is `dgawk>'. + + `save_history [on | off]' + Save command history to file `./.dgawk_history'. 
The default + is `on'. + + `save_options [on | off]' + Save current options to file `./.dgawkrc' upon exit. The + default is `on'. Options are read back in at the start of the + next session. + + `trace [on | off]' + Turn instruction tracing on or off. The default is `off'. + +`save' FILENAME + Save the commands from the current session to the given file name, + so that they can be replayed using the `source' command. + +`source' FILENAME + Run command(s) from a file; an error in any command does not + terminate execution of subsequent commands. Comments (lines + starting with `#') are allowed in a command file. Empty lines are + ignored; they do _not_ repeat the last command. You can't restart + the program by having more than one `run' command in the file. + Also, the list of commands may include additional `source' + commands; however, `dgawk' will not source the same file more than + once in order to avoid infinite recursion. + + In addition to, or instead of, the `source' command, you can use + the `-R FILE' or `--command=FILE' command-line options to execute + commands from a file non-interactively. + + +File: gawk.info, Node: Miscellaneous Dgawk Commands, Prev: Dgawk Info, Up: List of Debugger Commands + +14.3.6 Miscellaneous Commands +----------------------------- + +There are a few more commands which do not fit into the previous +categories, as follows: + +`dump' [FILENAME] + Dump bytecode of the program to standard output or to the file + named in FILENAME. This prints a representation of the internal + instructions which `gawk' executes to implement the `awk' commands + in a program.
This can be very enlightening, as the following + partial dump of Davide Brini's obfuscated code (*note Signature + Program::) demonstrates: + + dgawk> dump + -| # BEGIN + -| + -| [ 2:0x1d4355f0] Op_rule : [in_rule = BEGIN] [source_file = brini.awk] + -| [ 3:0x1d435710] Op_push_i : "~" [MALLOC|PERM|STRING|STRCUR] + -| [ 3:0x1d4357c0] Op_push_i : "~" [MALLOC|PERM|STRING|STRCUR] + -| [ 3:0x1d435790] Op_match : + -| [ 3:0x1d435680] Op_push_lhs : O [do_reference = FALSE] + -| [ 3:0x1d4356b0] Op_assign : + -| [ :0x1d4356e0] Op_pop : + -| [ 4:0x1d4358c0] Op_push_i : "==" [MALLOC|PERM|STRING|STRCUR] + -| [ 4:0x1d435970] Op_push_i : "==" [MALLOC|PERM|STRING|STRCUR] + -| [ 4:0x1d435940] Op_equal : + -| [ 4:0x1d435810] Op_push_lhs : o [do_reference = FALSE] + -| [ 4:0x1d435860] Op_assign : + -| [ :0x1d435890] Op_pop : + -| [ 5:0x1d435a70] Op_push : o + -| [ 5:0x1d435a40] Op_plus_i : 0 [MALLOC|NUMCUR|NUMBER] + -| [ 5:0x1d4359c0] Op_push_lhs : o [do_reference = TRUE] + -| [ 5:0x1d435910] Op_assign_plus : + -| [ :0x1d435a10] Op_pop : + -| [ 6:0x1d435b50] Op_push : O + -| [ 6:0x1d435b80] Op_push_i : "" [MALLOC|PERM|STRING|STRCUR] + -| [ :0x1d435c60] Op_no_op : + -| [ 6:0x1d435c30] Op_push : O + -| [ :0x1d435c90] Op_concat : [expr_count = 3] + -| [ 6:0x1d435ad0] Op_push_lhs : x [do_reference = FALSE] + -| [ 6:0x1d435aa0] Op_assign : + -| [ :0x1d435b00] Op_pop : + -| [ 7:0x1d435c00] Op_push_loop : [target_continue = 0x1d435bd0] [target_break = 0x1d435fc0] + -| [ 7:0x1d435bd0] Op_push_lhs : X [do_reference = TRUE] + -| [ 7:0x1d435cc0] Op_postincrement : + -| [ 7:0x1d435d70] Op_push : x + -| [ 7:0x1d435e00] Op_push : o + -| [ 7:0x1d435da0] Op_plus : + -| [ 7:0x1d435e60] Op_push : o + -| [ 7:0x1d435e30] Op_plus : + -| [ 7:0x1d435d20] Op_leq : + -| [ :0x1d435cf0] Op_jmp_false : [target_jmp = 0x1d435fc0] + -| [ 8:0x1d435f40] Op_push_i : "%c" [MALLOC|PERM|STRING|STRCUR] + -| [ :0x1d435ff0] Op_no_op : + -| [ 8:0x1d435dd0] Op_push_lhs : c [do_reference = FALSE] + -| [ 8:0x1d435e90] 
Op_assign_concat : + -| [ :0x1d435ec0] Op_pop : + -| [ :0x1d435f90] Op_jmp : [target_jmp = 0x1d435bd0] + -| [ :0x1d435fc0] Op_pop_loop : + -| + -| ... + -| + -| [ 9:0x1d435f10] Op_K_printf : [expr_count = 17] [redir_type = Op_illegal] + -| [ :0x1d435180] Op_no_op : + -| [ :0x1d435240] Op_exit : [exit_value = 0] + dgawk> + +`help' +`h' + Print a list of all of the `dgawk' commands with a short summary + of their usage. `help COMMAND' prints the information about the + command COMMAND. + +`list' [`-' | `+' | N | FILENAME`:'N | N--M | FUNCTION] +`l' [`-' | `+' | N | FILENAME`:'N | N--M | FUNCTION] + Print the specified lines (default 15) from the current source file + or the file named FILENAME. The possible arguments to `list' are + as follows: + + `-' + Print lines before the lines last printed. + + `+' + Print lines after the lines last printed. `list' without any + argument does the same thing. + + N + Print lines centered around line number N. + + N--M + Print lines from N to M. + + FILENAME`:'N + Print lines centered around line number N in source file + FILENAME. This command may change the current source file. + + FUNCTION + Print lines centered around beginning of the function + FUNCTION. This command may change the current source file. + +`quit' +`q' + Exit the debugger. Debugging is great fun, but sometimes we all + have to tend to other obligations in life (and sometimes we find + the bug, and are free to go on to the next one!). As we saw + above, if you are running a program, `dgawk' warns you if you + accidentally type `q' or `quit', to make sure you really want to + quit. + +`trace' `on' | `off' + Turn on or off a continuous printing of instructions which are + about to be executed, along with printing the `awk' line which they + implement. The default is `off'. + + It is to be hoped that most of the "opcodes" in these instructions + are fairly self-explanatory, and using `stepi' and `nexti' while + `trace' is on will make them into familiar friends. 
+ + + +File: gawk.info, Node: Readline Support, Next: Dgawk Limitations, Prev: List of Debugger Commands, Up: Debugger + +14.4 Readline Support +===================== + +If `dgawk' is compiled with the `readline' library, you can take advantage of its +command completion and history expansion features. The following types +of completion are available: + +Command completion + Command names. + +Source file name completion + Source file names. Relevant commands are `break', `clear', `list', + `tbreak', and `until'. + +Argument completion + Non-numeric arguments to a command. Relevant commands are `info' + and `enable'. + +Variable name completion + Global variable names, and function arguments in the current + context if the program is running. Relevant commands are `display', + `print', `set', and `watch'. + + + +File: gawk.info, Node: Dgawk Limitations, Prev: Readline Support, Up: Debugger + +14.5 Limitations and Future Plans +================================= + +We hope you find `dgawk' useful and enjoyable to work with, but as with +any program, especially in its early releases, it still has some +limitations. A few which are worth being aware of are: + + * At this point, `dgawk' does not give a detailed explanation of + what you did wrong when you type in something it doesn't like. + Rather, it just responds `syntax error'. When you do figure out + what your mistake was, though, you'll feel like a real guru. + + * If you perused the dump of opcodes in *Note Miscellaneous Dgawk + Commands::, (or you are already familiar with `gawk' internals), + you will realize that much of the internal manipulation of data in + `gawk', as in many interpreters, is done on a stack. `Op_push', + `Op_pop', etc., are the "bread and butter" of most `gawk' code. + Unfortunately, as of now, `dgawk' does not provide the capability + of examining the stack's contents. + + That is, the intermediate results of expression evaluation are on + the stack, but cannot be printed.
Rather, only variables which + are defined in the program can actually be printed. Of course, a + workaround for this is to use more explicit variables at the + debugging stage and then change back to obscure, perhaps more + optimal code later. + + * There is no way right now to look "inside" the process of compiling + regular expressions to see if you got it right. As an `awk' + programmer, you are expected to know what `/[^[:alnum:][:blank:]]/' + means. + + * `dgawk' is designed to be used by running a program (with all its + parameters) on the command line, as described in *note dgawk + invocation::. There is no way (as of now) to attach or "break in" + to a running program. This seems reasonable for a language which + is used mainly for quickly executing, short programs. + + Look forward to a future release when these and other missing +features may be added, and of course feel free to try to add them +yourself if you want. + + +File: gawk.info, Node: Language History, Next: Installation, Prev: Debugger, Up: Top Appendix A The Evolution of the `awk' Language ********************************************** @@ -17574,13 +18838,13 @@ the changes, with cross-references to further details: * The `do'-`while' statement (*note Do Statement::). - * The built-in functions `atan2', `cos', `sin', `rand', and `srand' - (*note Numeric Functions::). + * The built-in functions `atan2()', `cos()', `sin()', `rand()', and + `srand()' (*note Numeric Functions::). - * The built-in functions `gsub', `sub', and `match' (*note String - Functions::). + * The built-in functions `gsub()', `sub()', and `match()' (*note + String Functions::). - * The built-in functions `close' and `system' (*note I/O + * The built-in functions `close()' and `system()' (*note I/O Functions::). * The `ARGC', `ARGV', `FNR', `RLENGTH', `RSTART', and `SUBSEP' @@ -17601,8 +18865,8 @@ the changes, with cross-references to further details: programs (*note Precedence::). 
* Regexps as the value of `FS' (*note Field Separators::) and as the - third argument to the `split' function (*note String Functions::), - rather than using only the first character of `FS'. + third argument to the `split()' function (*note String + Functions::), rather than using only the first character of `FS'. * Dynamic regexps as operands of the `~' and `!~' operators (*note Regexp Usage::). @@ -17639,10 +18903,10 @@ The System V Release 4 (1989) version of Unix `awk' added these features * The `\a', `\v', and `\x' escape sequences (*note Escape Sequences::). - * A defined return value for the `srand' built-in function (*note + * A defined return value for the `srand()' built-in function (*note Numeric Functions::). - * The `toupper' and `tolower' built-in string functions for case + * The `toupper()' and `tolower()' built-in string functions for case translation (*note String Functions::). * A cleaner specification for the `%c' format-control letter in the @@ -17704,9 +18968,12 @@ standard: * The locale's decimal point character is used for parsing input data (*note Locales::). - * The `fflush' built-in function is not supported (*note I/O + * The `fflush()' built-in function is not supported (*note I/O Functions::). + The 2008 POSIX standard can be found online at +`http://www.opengroup.org/onlinepubs/9699919799/'. + File: gawk.info, Node: BTL, Next: POSIX/GNU, Prev: POSIX, Up: Language History @@ -17724,8 +18991,8 @@ POSIX `awk': options; it continues to accept them to avoid breaking old programs. - * The `fflush' built-in function for flushing buffered output (*note - I/O Functions::). + * The `fflush()' built-in function for flushing buffered output + (*note I/O Functions::). * The `**' and `**=' operators (*note Arithmetic Ops:: and *note Assignment Ops::). @@ -17742,16 +19009,17 @@ extensions, originally developed for `gawk': * The `/dev/stdin', `/dev/stdout', and `/dev/stderr' special files (*note Special Files::). 
- * The ability for `FS' and for the third argument to `split' to be + * The ability for `FS' and for the third argument to `split()' to be null strings (*note Single Character Fields::). * The `nextfile' statement (*note Nextfile Statement::). * The ability to delete all of an array at once with `delete ARRAY' - (*note String Functions::). + (*note Delete::). - * The ability for the `length' function to accept an array argument - and return the number of elements in the array. (*note Delete::). + * The ability for the `length()' function to accept an array + argument and return the number of elements in the array. (*note + String Functions::). File: gawk.info, Node: POSIX/GNU, Next: Contributors, Prev: BTL, Up: Language History @@ -17779,8 +19047,8 @@ all be disabled with either the `--traditional' or `--posix' options * The `FIELDWIDTHS' variable and its effects (*note Constant Size::). - * The `systime' and `strftime' built-in functions for obtaining and - printing timestamps (*note Time Functions::). + * The `systime()' and `strftime()' built-in functions for obtaining + and printing timestamps (*note Time Functions::). * The `-W lint' option to provide error and portability checking for both the source code and at runtime (*note Options::). @@ -17801,7 +19069,8 @@ all be disabled with either the `--traditional' or `--posix' options through `ARGV' (*note Built-in Variables::). * The `ERRNO' variable, which contains the system error message when - `getline' returns -1 or `close' fails (*note Built-in Variables::). + `getline' returns -1 or `close()' fails (*note Built-in + Variables::). * The `/dev/pid', `/dev/ppid', `/dev/pgrpid', and `/dev/user' file name interpretation. (As of version 3.2, these names are no @@ -17826,13 +19095,13 @@ all be disabled with either the `--traditional' or `--posix' options * Full support for both POSIX and GNU regexps (*note Regexp::). 
- * The `gensub' function for more powerful text manipulation (*note + * The `gensub()' function for more powerful text manipulation (*note String Functions::). - * The `strftime' function acquired a default time format, allowing + * The `strftime()' function acquired a default time format, allowing it to be called with no arguments (*note Time Functions::). - * The ability for `FS' and for the third argument to `split' to be + * The ability for `FS' and for the third argument to `split()' to be null strings (*note Single Character Fields::). * The ability for `RS' to be a regexp (*note Records::). @@ -17844,7 +19113,7 @@ all be disabled with either the `--traditional' or `--posix' options available in the original Version 7 Unix version of `awk' (*note V7/SVR3.1::). - * The `-m' option and the `fflush' function from the Bell + * The `-m' option and the `fflush()' function from the Bell Laboratories research version of `awk' (*note Options::; also *note I/O Functions::). @@ -17882,30 +19151,30 @@ all be disabled with either the `--traditional' or `--posix' options * The `/inet' special files for TCP/IP networking using `|&' (*note TCP/IP Networking::). - * The optional second argument to `close' that allows closing one end - of a two-way pipe to a coprocess (*note Two-way I/O::). + * The optional second argument to `close()' that allows closing one + end of a two-way pipe to a coprocess (*note Two-way I/O::). - * The optional third argument to the `match' function for capturing - text-matching subexpressions within a regexp (*note String - Functions::). + * The optional third argument to the `match()' function for + capturing text-matching subexpressions within a regexp (*note + String Functions::). * Positional specifiers in `printf' formats for making translations easier (*note Printf Ordering::). - * The `asort' and `asorti' functions for sorting arrays (*note Array - Sorting::). 
+ * The `asort()' and `asorti()' functions for sorting arrays (*note + Array Sorting::). - * The `bindtextdomain', `dcgettext' and `dcngettext' functions for - internationalization (*note Programmer i18n::). + * The `bindtextdomain()', `dcgettext()' and `dcngettext()' functions + for internationalization (*note Programmer i18n::). - * The `extension' built-in function and the ability to add new + * The `extension()' built-in function and the ability to add new built-in functions dynamically (*note Dynamic Extensions::). - * The `mktime' built-in function for creating timestamps (*note Time - Functions::). + * The `mktime()' built-in function for creating timestamps (*note + Time Functions::). - * The `and', `or', `xor', `compl', `lshift', `rshift', and - `strtonum' built-in functions (*note Bitwise Functions::). + * The `and()', `or()', `xor()', `compl()', `lshift()', `rshift()', + and `strtonum()' built-in functions (*note Bitwise Functions::). * The support for `next file' as two words was removed completely (*note Nextfile Statement::). @@ -17928,8 +19197,9 @@ all be disabled with either the `--traditional' or `--posix' options decimal point for parsing input data (*note Conversion::). * The `--enable-portals' configuration option to enable special - treatment of pathnames that begin with `/p' as BSD portals (*note - Portal Files::). + treatment of pathnames that begin with `/p' as BSD portals. (This + option is no longer available; the related code was removed since + it was never used.) * The use of GNU Automake to help in standardizing the configuration process (*note Quick Installation::). @@ -17950,18 +19220,18 @@ all be disabled with either the `--traditional' or `--posix' options at compile time (*note Additional Configuration Options::). 
* The `--with-whiny-user-strftime' configuration option to force the - use of the included version of the `strftime' function for + use of the included version of the `strftime()' function for deficient systems (*note Additional Configuration Options::). - * POSIX compliance for `sub' and `gsub' (*note Gory Details::). + * POSIX compliance for `sub()' and `gsub()' (*note Gory Details::). * The `--exec' option, for use in CGI scripts (*note Options::). - * The `length' function was extended to accept an array argument and - return the number of elements in the array (*note String + * The `length()' function was extended to accept an array argument + and return the number of elements in the array (*note String Functions::). - * The `strftime' function acquired a third argument to enable + * The `strftime()' function acquired a third argument to enable printing times as UTC (*note Time Functions::). Version 3.2 of `gawk' introduced the following features: @@ -17999,12 +19269,20 @@ all be disabled with either the `--traditional' or `--posix' options * The `FPAT' variable and its effects (*note Splitting By Content::). - * The `patsplit' function (*note String Functions::). + * The `patsplit()' function (*note String Functions::). * The `/inet4' and `/inet6' special files for TCP/IP networking using `|&' to specify which version of the IP protocol to use. (*note TCP/IP Networking::). + * The `--compat', `--copyleft' and `--usage' options were removed. + + * The `break' and `continue' statements may no longer be used + outside a loop, even with `--traditional' (*note Break + Statement::, and *note Continue Statement::). + + * The `--enable-portals' configuration option was removed. + File: gawk.info, Node: Contributors, Prev: POSIX/GNU, Up: Language History @@ -18068,7 +19346,7 @@ Info file, in approximate chronological order: various PC operating systems. He is also instrumental in keeping the documentation up to date for the various PC platforms. 
- * Christos Zoulas provided the `extension' built-in function for + * Christos Zoulas provided the `extension()' built-in function for dynamically adding new modules. * Ju"rgen Kahrs contributed the initial version of the TCP/IP @@ -18086,9 +19364,9 @@ Info file, in approximate chronological order: * Arno Peters did the initial work to convert `gawk' to use GNU Automake and `gettext'. - * Alan J. Broder provided the initial version of the `asort' function - as well as the code for the new optional third argument to the - `match' function. + * Alan J. Broder provided the initial version of the `asort()' + function as well as the code for the new optional third argument + to the `match()' function. * Andreas Buening updated the `gawk' port for OS/2. @@ -18338,8 +19616,8 @@ Various `.c', `.y', and `.h' files Files needed for building `gawk' on POSIX-compliant systems. `pc/*' - Files needed for building `gawk' under MS-DOS, MS Windows and OS/2 - (*note PC Installation::, for details). + Files needed for building `gawk' under MS Windows and OS/2 (*note + PC Installation::, for details). `vms/*' Files needed for building `gawk' under VMS (*note VMS @@ -18420,12 +19698,8 @@ B.2.2 Additional Configuration Options There are several additional options you may use on the `configure' command line when compiling `gawk' from scratch, including: -`--enable-portals' - Treat pathnames that begin with `/p' as BSD portal files when - doing two-way I/O with the `|&' operator (*note Portal Files::). - `--with-whiny-user-strftime' - Force use of the included version of the `strftime' function for + Force use of the included version of the `strftime()' function for deficient systems `--disable-lint' @@ -18704,7 +19978,7 @@ for `DYN_MAKEXP': pick the one that matches your target. To build some of the example extension libraries, `cd' to the extension directory and copy `Makefile.pc' to `Makefile'. You can then build using the same two targets. 
To run the example `awk' scripts, -you'll need to either change the call to the `extension' function to +you'll need to either change the call to the `extension()' function to match the name of the library (for instance, change `"./ordchr.so"' to `"ordchr.dll"' or simply `"ordchr"'), or rename the library to match the call (for instance, rename `ordchr.dll' to `ordchr.so'). @@ -18768,8 +20042,8 @@ is set to `e:' the complete default search path is An `sh'-like shell (as opposed to `command.com' under MS-DOS or `cmd.exe' under OS/2) may be useful for `awk' programming. Ian Stewartson has written an excellent shell for MS-DOS and OS/2, Daisuke -Aoyama has ported GNU `bash' to MS-DOS using the DJGPP tools, and -several shells are available for OS/2, including `ksh'. The file +Aoyama has ported GNU Bash to MS-DOS using the DJGPP tools, and several +shells are available for OS/2, including `ksh'. The file `README_d/README.pc' in the `gawk' distribution contains information on these shells. Users of Stewartson's shell on DOS should examine its documentation for handling command lines; in particular, the setting @@ -18843,8 +20117,8 @@ B.3.1.5 Using `gawk' In The Cygwin Environment `gawk' can be used "out of the box" under Windows if you are using the Cygwin environment (http://www.cygwin.com). This environment provides -an excellent simulation of Unix, using the GNU tools, such as `bash', -the GNU Compiler Collection (GCC), GNU Make, and other GNU tools. +an excellent simulation of Unix, using the GNU tools, such as Bash, the +GNU Compiler Collection (GCC), GNU Make, and other GNU tools. Compilation and installation for Cygwin is the same as for a Unix system: @@ -18859,7 +20133,7 @@ the `make' proceeds as usual. NOTE: The `|&' operator and TCP/IP networking (*note TCP/IP Networking::) are fully supported in the Cygwin environment. This - is not true for any other environment for MS-DOS or MS-Windows. + is not true for any other environment for MS-Windows. 
File: gawk.info, Node: MSYS, Prev: Cygwin, Up: PC Installation @@ -19108,22 +20382,22 @@ B.4.1.1 Compiling `gawk' on the Atari ST A proper compilation of `gawk' sources when `sizeof(int)' differs from `sizeof(void *)' requires an ISO C compiler. An initial port was done -with `gcc'. You may actually prefer executables where `int's are four -bytes wide but the other variant works as well. +with `gcc'. You may actually prefer executables where `int()'s are +four bytes wide but the other variant works as well. You may need quite a bit of memory when trying to recompile the `gawk' sources, as some source files (`regex.c' in particular) are quite big. If you run out of memory compiling such a file, try reducing the optimization level for this particular file, which may help. - With a reasonable shell (`bash' will do), you have a pretty good -chance that the `configure' utility will succeed, and in particular if -you run GNU/Linux, MiNT or a similar operating system. Otherwise -sample versions of `config.h' and `Makefile.st' are given in the -`atari' subdirectory and can be edited and copied to the corresponding -files in the main source directory. Even if `configure' produces -something, it might be advisable to compare its results with the sample -versions and possibly make adjustments. + With a reasonable shell (Bash will do), you have a pretty good chance +that the `configure' utility will succeed, and in particular if you run +GNU/Linux, MiNT or a similar operating system. Otherwise sample +versions of `config.h' and `Makefile.st' are given in the `atari' +subdirectory and can be edited and copied to the corresponding files in +the main source directory. Even if `configure' produces something, it +might be advisable to compare its results with the sample versions and +possibly make adjustments. Some `gawk' source code fragments depend on a preprocessor define `atarist'. This basically assumes the TOS environment with `gcc'. 
@@ -19131,12 +20405,12 @@ Modify these sections as appropriate if they are not right for your environment. Also see the remarks about `AWKPATH' and `envsep' in *note Atari Using::. - As shipped, the sample `config.h' claims that the `system' function -is missing from the libraries, which is not true, and an alternative -implementation of this function is provided in + As shipped, the sample `config.h' claims that the `system()' +function is missing from the libraries, which is not true, and an +alternative implementation of this function is provided in `unsupported/atari/system.c'. Depending upon your particular combination of shell and operating system, you might want to change the -file to indicate that `system' is available. +file to indicate that `system()' is available. File: gawk.info, Node: Atari Using, Prev: Atari Compiling, Up: Atari Installation @@ -19184,12 +20458,12 @@ output, and a calling shell has redirected standard output to a file. libraries, it accepts both `/' and `\' as path separators. While this is convenient, it should be remembered that this removes one technically valid character (`/') from your file name. It may also -create problems for external programs called via the `system' function, -which may not support this convention. Whenever it is possible that a -file created by `gawk' will be used by some other program, use only -backslashes. Also remember that in `awk', backslashes in strings have -to be doubled in order to get literal backslashes (*note Escape -Sequences::). +create problems for external programs called via the `system()' +function, which may not support this convention. Whenever it is +possible that a file created by `gawk' will be used by some other +program, use only backslashes. Also remember that in `awk', +backslashes in strings have to be doubled in order to get literal +backslashes (*note Escape Sequences::). 
File: gawk.info, Node: BeOS Installation, Next: Tandem Installation, Prev: Atari Installation, Up: Unsupported @@ -19219,7 +20493,7 @@ then `make install': ... $ make install - BeOS uses `bash' as its shell; thus, you use `gawk' the same way you + BeOS uses Bash as its shell; thus, you use `gawk' the same way you would under Unix. If these steps do not work, please send in a bug report (*note Bugs::). @@ -19391,7 +20665,7 @@ Unix `awk' `mawk' has the following extensions that are not in POSIX `awk': - * The `fflush' built-in function for flushing buffered output + * The `fflush()' built-in function for flushing buffered output (*note I/O Functions::). * The `**' and `**=' operators (*note Arithmetic Ops:: and also @@ -19406,8 +20680,8 @@ Unix `awk' Special Files::). Use `"-"' instead of `"/dev/stdin"' with `mawk'. - * The ability for `FS' and for the third argument to `split' to - be null strings (*note Single Character Fields::). + * The ability for `FS' and for the third argument to `split()' + to be null strings (*note Single Character Fields::). * The ability to delete all of an array at once with `delete ARRAY' (*note Delete::). @@ -19501,7 +20775,7 @@ option. If `gawk' is compiled for debugging with `-DDEBUG', then there is one more option available on the command line: -`-W parsedebug' +`-Y' `--parsedebug' Prints out the parse stack information as the program is being parsed. @@ -19574,8 +20848,8 @@ possible for me to include your changes: * Put the name of the function at the beginning of its own line. - * Put the return type of the function, even if it is `int', on - the line above the line with the name and arguments of the + * Put the return type of the function, even if it is `int()', + on the line above the line with the name and arguments of the function. * Put spaces around parentheses used in control structures @@ -19820,12 +21094,12 @@ when writing extensions. 
The next minor node shows how they are used: This function returns the actual number of parameters passed to the current function. Inside the code of an extension this can be used to determine the maximum index which is safe to use with - `stack_ptr'. If this value is greater than `tree->param_cnt', the + `get_actual_argument'. If this value is greater than `nargs', the function was called incorrectly from the `awk' program. *Caution:* This function is new as of `gawk' 3.1.4. -`n->param_cnt' +`nargs' Inside an extension function, this is the maximum number of expected parameters, as set by the `make_builtin' function. @@ -19852,8 +21126,8 @@ when writing extensions. The next minor node shows how they are used: `NODE **assoc_lookup(NODE *symbol, NODE *subs, int reference)' Finds, and installs if necessary, array elements. `symbol' is the array, `subs' is the subscript. This is usually a value created - with `tmp_string' (see below). `reference' should be `TRUE' if it - is an error to use the value before it is created. Typically, + with `make_string' (see below). `reference' should be `TRUE' if + it is an error to use the value before it is created. Typically, `FALSE' is the correct value to use from extension functions. `NODE *make_string(char *s, size_t len)' @@ -19866,25 +21140,15 @@ when writing extensions. The next minor node shows how they are used: be stored appropriately. This is permanent storage; understanding of `gawk' memory management is helpful. -`NODE *tmp_string(char *s, size_t len);' - Take a C string and turn it into a pointer to a `NODE' that can be - stored appropriately. This is temporary storage; understanding of - `gawk' memory management is helpful. - -`NODE *tmp_number(AWKNUM val)' - Take an `AWKNUM' and turn it into a pointer to a `NODE' that can - be stored appropriately. This is temporary storage; understanding - of `gawk' memory management is helpful. - `NODE *dupnode(NODE *n)' Duplicate a node. 
In most cases, this increments an internal reference count instead of actually duplicating the entire `NODE'; understanding of `gawk' memory management is helpful. -`void free_temp(NODE *n)' +`void unref(NODE *n)' This macro releases the memory associated with a `NODE' allocated - with `tmp_string' or `tmp_number'. Understanding of `gawk' memory - management is helpful. + with `make_string' or `make_number'. Understanding of `gawk' + memory management is helpful. `void make_builtin(char *name, NODE *(*func)(NODE *), int count)' Register a C function pointed to by `func' as new built-in @@ -19895,17 +21159,17 @@ when writing extensions. The next minor node shows how they are used: /* do_xxx --- do xxx function for gawk */ NODE * - do_xxx(NODE *tree) + do_xxx(int nargs) { ... } -`NODE *get_argument(NODE *tree, int i)' +`NODE *get_argument(int i)' This function is called from within a C extension function to get the `i'-th argument from the function call. The first argument is argument zero. -`NODE *get_actual_argument(NODE *tree, unsigned int i,' +`NODE *get_actual_argument(int i,' ` int optional, int wantarray);' This function retrieves a particular argument `i'. `wantarray' is `TRUE' if the argument should be an array, `FALSE' otherwise. If @@ -19915,22 +21179,16 @@ when writing extensions. The next minor node shows how they are used: *Caution:* This function is new as of `gawk' 3.1.4. -`get_scalar_argument(t, i, opt)' +`get_scalar_argument(i, opt)' This is a convenience macro that calls `get_actual_argument'. *Caution:* This macro is new as of `gawk' 3.1.4. -`get_array_argument(t, i, opt)' +`get_array_argument(i, opt)' This is a convenience macro that calls `get_actual_argument'. *Caution:* This macro is new as of `gawk' 3.1.4. -`void set_value(NODE *tree)' - This function is called from within a C extension function to set - the return value from the extension function. This value is what - the `awk' program sees as the return value from the new `awk' - function. 
- `void update_ERRNO(void)' This function is called from within a C extension function to set the value of `gawk''s `ERRNO' variable, based on the current value @@ -20031,13 +21289,19 @@ boilerplate code now suffices: /* if necessary, clear it: */ assoc_clear(the_arg); - As of version 3.1.4, the internals improved again, and became even + In version 3.1.4, the internals improved again, and became even simpler: NODE *the_arg; the_arg = get_array_argument(tree, 2, FALSE); /* assume need 3rd arg, 0-based */ + As of version 4.0, the internals changed again: + + NODE *the_arg; + + the_arg = get_array_argument(2, FALSE); /* assume need 3rd arg, 0-based */ + Again, you should spend time studying the `gawk' internals; don't just blindly copy this code. @@ -20130,7 +21394,7 @@ fails. It fills in the following elements: `"ctime"' The file's last access, modification, and inode update times, respectively. These are numeric timestamps, suitable for - formatting with `strftime' (*note Built-in::). + formatting with `strftime()' (*note Built-in::). `"pmode"' The file's "printable mode." This is a string representation of @@ -20199,8 +21463,7 @@ other POSIX-compliant systems:(1) chdir() builtin for gawk */ static NODE * - do_chdir(tree) - NODE *tree; + do_chdir(int nargs) { NODE *newdir; int ret = -1; @@ -20208,39 +21471,32 @@ other POSIX-compliant systems:(1) if (do_lint && get_curfunc_arg_count() != 1) lintwarn("chdir: called with incorrect number of arguments"); - newdir = get_scalar_argument(tree, 0); + newdir = get_scalar_argument(0, FALSE); The file includes the `"awk.h"' header file for definitions for the `gawk' internals. It includes `<sys/sysmacros.h>' for access to the `major' and `minor' macros. By convention, for an `awk' function `foo', the function that -implements it is called `do_foo'. The function should take a `NODE *' -argument, usually called `tree', that represents the argument list to -the function. 
The `newdir' variable represents the new directory to -change to, retrieved with `get_argument'. Note that the first argument -is numbered zero. +implements it is called `do_foo'. The function should take a `int' +argument, usually called `nargs', that represents the number of defined +arguments for the function. The `newdir' variable represents the new +directory to change to, retrieved with `get_scalar_argument'. Note +that the first argument is numbered zero. This code actually accomplishes the `chdir'. It first forces the argument to be a string and passes the string value to the `chdir' -system call. If the `chdir' fails, `ERRNO' is updated. The result of -`force_string' has to be freed with `free_temp': +system call. If the `chdir' fails, `ERRNO' is updated. (void) force_string(newdir); ret = chdir(newdir->stptr); if (ret < 0) update_ERRNO(); - free_temp(newdir); - Finally, the function returns the return value to the `awk' level, -using `set_value'. Then it must return a value from the call to the new -built-in (this value ignored by the interpreter): + Finally, the function returns the return value to the `awk' level: /* Set the return value */ - set_value(tmp_number((AWKNUM) ret)); - - /* Just to make the interpreter happy */ - return tmp_number((AWKNUM) 0); + return make_number((AWKNUM) ret); } The `stat' built-in is more involved. First comes a function that @@ -20263,10 +21519,9 @@ variable declarations and argument checking: /* do_stat --- provide a stat() function for gawk */ static NODE * - do_stat(tree) - NODE *tree; + do_stat(int nargs) { - NODE *file, *array; + NODE *file, *array, *tmp; struct stat sbuf; int ret; NODE **aptr; @@ -20283,8 +21538,8 @@ in case the file is a symbolic link. 
If there's an error, we set `ERRNO' and return: /* directory is first arg, array to hold results is second */ - file = get_scalar_argument(tree, 0, FALSE); - array = get_array_argument(tree, 1, FALSE); + file = get_scalar_argument(0, FALSE); + array = get_array_argument(1, FALSE); /* empty out the array */ assoc_clear(array); @@ -20294,37 +21549,31 @@ in case the file is a symbolic link. If there's an error, we set ret = lstat(file->stptr, & sbuf); if (ret < 0) { update_ERRNO(); - - set_value(tmp_number((AWKNUM) ret)); - - free_temp(file); - return tmp_number((AWKNUM) 0); + return make_number((AWKNUM) ret); } Now comes the tedious part: filling in the array. Only a few of the calls are shown here, since they all follow the same pattern: /* fill in the array */ - aptr = assoc_lookup(array, tmp_string("name", 4), FALSE); + aptr = assoc_lookup(array, tmp = make_string("name", 4), FALSE); *aptr = dupnode(file); + unref(tmp); - aptr = assoc_lookup(array, tmp_string("mode", 4), FALSE); + aptr = assoc_lookup(array, tmp = make_string("mode", 4), FALSE); *aptr = make_number((AWKNUM) sbuf.st_mode); + unref(tmp); - aptr = assoc_lookup(array, tmp_string("pmode", 5), FALSE); + aptr = assoc_lookup(array, tmp = make_string("pmode", 5), FALSE); pmode = format_mode(sbuf.st_mode); *aptr = make_string(pmode, strlen(pmode)); + unref(tmp); - When done, we free the temporary value containing the file name, set -the return value, and return: + When done, return the `lstat' return value: - free_temp(file); /* Set the return value */ - set_value(tmp_number((AWKNUM) ret)); - - /* Just to make the interpreter happy */ - return tmp_number((AWKNUM) 0); + return make_number((AWKNUM) ret); } Finally, it's necessary to provide the "glue" that loads the new @@ -20340,7 +21589,7 @@ named `dlload' that does the job: { make_builtin("chdir", do_chdir, 1); make_builtin("stat", do_stat, 2); - return tmp_number((AWKNUM) 0); + return make_number((AWKNUM) 0); } And that's it! 
As an exercise, consider adding functions to @@ -20367,7 +21616,7 @@ a GNU/Linux shared library: $ gcc -shared -DHAVE_CONFIG_H -c -O -g -IIDIR filefuncs.c $ ld -o filefuncs.so -shared filefuncs.o - Once the library exists, it is loaded by calling the `extension' + Once the library exists, it is loaded by calling the `extension()' built-in function. This function takes two arguments: the name of the library to load and the name of a function to call when the library is first loaded. This function adds the new functions to `gawk'. It @@ -20854,9 +22103,9 @@ definition of the language and the original POSIX standards specified that `awk' only understands decimal numbers (base 10), and not octal (base 8) or hexadecimal numbers (base 16). - As of this writing (February, 2007), changes in the language of the -current POSIX standard can be interpreted to imply that `awk' should -support additional features. These features are: + Changes in the language of the 2001 and 2004 POSIX standard can be +interpreted to imply that `awk' should support additional features. +These features are: * Interpretation of floating point data values specified in hexadecimal notation (`0xDEADBEEF'). (Note: data values, _not_ @@ -20885,9 +22134,15 @@ interpretation of the standard, which requires a certain amount of by the standard developers, either. In other words, "we see how you got where you are, but we don't think that that's where you want to be." - Nevertheless, on systems that support IEEE floating point, it seems -reasonable to provide _some_ way to support NaN and Infinity values. -The solution implemented in `gawk', as of version 3.1.6, is as follows: + The 2008 POSIX standard added explicit wording to allow, but not +require, that `awk' support hexadecimal floating point values and +special values for "Not A Number" and infinity. 
+ + Although the `gawk' maintainer continues to feel that providing +those features is inadvisable, nevertheless, on systems that support +IEEE floating point, it seems reasonable to provide _some_ way to +support NaN and Infinity values. The solution implemented in `gawk' is +as follows: 1. With the `--posix' command-line option, `gawk' becomes "hands off." String values are passed directly to the system library's @@ -21015,14 +22270,14 @@ Boolean Expression Bourne Shell The standard shell (`/bin/sh') on Unix and Unix-like systems, - originally written by Steven R. Bourne. Many shells (`bash', - `ksh', `pdksh', `zsh') are generally upwardly compatible with the - Bourne shell. + originally written by Steven R. Bourne. Many shells (Bash, `ksh', + `pdksh', `zsh') are generally upwardly compatible with the Bourne + shell. Built-in Function The `awk' language provides built-in functions that perform various numerical, I/O-related, and string computations. Examples are - `sqrt' (for the square root of a number) and `substr' (for a + `sqrt()' (for the square root of a number) and `substr()' (for a substring of a string). `gawk' provides functions for timestamp management, bit manipulation, and runtime string translation. (*Note Built-in::.) @@ -21189,10 +22444,10 @@ Floating-Point Number Format Format strings are used to control the appearance of output in the - `strftime' and `sprintf' functions, and are used in the `printf' - statement as well. Also, data conversions from numbers to strings - are controlled by the format string contained in the built-in - variable `CONVFMT'. (*Note Control Letters::.) + `strftime()' and `sprintf()' functions, and are used in the + `printf' statement as well. Also, data conversions from numbers + to strings are controlled by the format string contained in the + built-in variable `CONVFMT'. (*Note Control Letters::.) 
Free Documentation License This document describes the terms under which this Info file is @@ -21500,8 +22755,8 @@ Text Domain Timestamp A value in the "seconds since the epoch" format used by Unix and - POSIX systems. Used for the `gawk' functions `mktime', - `strftime', and `systime'. See also "Epoch" and "UTC." + POSIX systems. Used for the `gawk' functions `mktime()', + `strftime()', and `systime()'. See also "Epoch" and "UTC." Unix A computer operating system originally developed in the early @@ -22939,7 +24194,8 @@ Index * % (percent sign), %= operator: Assignment Ops. (line 129) * & (ampersand), && operator <1>: Precedence. (line 86) * & (ampersand), && operator: Boolean Ops. (line 57) -* & (ampersand), gsub/gensub/sub functions and: Gory Details. (line 6) +* & (ampersand), gsub()/gensub()/sub() functions and: Gory Details. + (line 6) * ' (single quote) <1>: Quoting. (line 31) * ' (single quote) <2>: Long. (line 33) * ' (single quote): One-shot. (line 15) @@ -22953,10 +24209,10 @@ Index (line 86) * * (asterisk), * operator, null strings, matching: Gory Details. (line 159) -* * (asterisk), ** operator <1>: Options. (line 216) +* * (asterisk), ** operator <1>: Options. (line 208) * * (asterisk), ** operator <2>: Precedence. (line 49) * * (asterisk), ** operator: Arithmetic Ops. (line 81) -* * (asterisk), **= operator <1>: Options. (line 216) +* * (asterisk), **= operator <1>: Options. (line 208) * * (asterisk), **= operator <2>: Precedence. (line 95) * * (asterisk), **= operator: Assignment Ops. (line 129) * * (asterisk), *= operator <1>: Precedence. (line 95) @@ -22976,79 +24232,73 @@ Index * - (hyphen), -= operator <1>: Precedence. (line 95) * - (hyphen), -= operator: Assignment Ops. (line 129) * - (hyphen), filenames beginning with: Options. (line 56) -* - (hyphen), in character lists: Character Lists. (line 17) +* - (hyphen), in character lists: Character Lists. (line 16) * --assign option: Options. (line 30) -* --c option: Options. 
(line 76) +* --c option: Options. (line 75) * --characters-as-bytes option: Options. (line 65) -* --compat option: Options. (line 76) -* --copyleft option: Options. (line 85) -* --copyright option: Options. (line 85) +* --copyright option: Options. (line 83) * --disable-lint configuration option: Additional Configuration Options. - (line 17) + (line 13) * --disable-nls configuration option: Additional Configuration Options. - (line 32) + (line 28) * --dump-variables option <1>: Library Names. (line 45) -* --dump-variables option: Options. (line 90) -* --enable-portals configuration option <1>: Additional Configuration Options. - (line 9) -* --enable-portals configuration option: Portal Files. (line 6) -* --exec option: Options. (line 112) +* --dump-variables option: Options. (line 88) +* --exec option: Options. (line 110) * --field-separator option: Options. (line 21) * --file option: Options. (line 25) -* --gen-pot option <1>: Options. (line 131) +* --gen-pot option <1>: Options. (line 129) * --gen-pot option: String Extraction. (line 6) -* --help option: Options. (line 139) -* --L option: Options. (line 163) -* --lint option <1>: Options. (line 144) +* --help option: Options. (line 136) +* --L option: Options. (line 244) +* --lint option <1>: Options. (line 141) * --lint option: Command Line. (line 20) -* --lint-old option: Options. (line 163) -* --non-decimal-data option <1>: Options. (line 168) +* --lint-old option: Options. (line 244) +* --non-decimal-data option <1>: Options. (line 160) * --non-decimal-data option: Nondecimal Data. (line 6) -* --non-decimal-data option, strtonum function and: Nondecimal Data. +* --non-decimal-data option, strtonum() function and: Nondecimal Data. (line 36) -* --optimize option: Options. (line 181) -* --posix option: Options. (line 200) -* --posix option, --traditional option and: Options. (line 230) -* --profile option <1>: Options. (line 188) +* --optimize option: Options. (line 173) +* --posix option: Options. 
(line 192) +* --posix option, --traditional option and: Options. (line 222) +* --profile option <1>: Options. (line 180) * --profile option: Profiling. (line 15) -* --re-interval option: Options. (line 236) -* --sandbox option: Options. (line 243) +* --re-interval option: Options. (line 228) +* --sandbox option: Options. (line 235) * --sandbox option, disabling system function: I/O Functions. (line 88) * --sandbox option, input redirection with getline: Getline. (line 19) * --sandbox option, output redirection with print, printf: Redirection. (line 6) -* --source option: Options. (line 104) -* --traditional option: Options. (line 76) -* --traditional option, --posix option and: Options. (line 230) -* --usage option: Options. (line 139) -* --use-lc-numeric option: Options. (line 176) -* --version option: Options. (line 252) +* --source option: Options. (line 102) +* --traditional option: Options. (line 75) +* --traditional option, --posix option and: Options. (line 222) +* --use-lc-numeric option: Options. (line 168) +* --version option: Options. (line 249) * --with-whiny-user-strftime configuration option: Additional Configuration Options. - (line 13) + (line 9) * -b option: Options. (line 65) -* -C option: Options. (line 85) -* -d option: Options. (line 90) -* -E option: Options. (line 112) -* -e option: Options. (line 104) +* -C option: Options. (line 83) +* -d option: Options. (line 88) +* -E option: Options. (line 110) +* -e option: Options. (line 102) * -f option: Options. (line 25) * -F option <1>: Options. (line 21) * -F option: Command Line Field Separator. (line 6) * -f option: Long. (line 12) -* -F option, -Ft sets FS to TAB: Options. (line 260) -* -f option, on command line: Options. (line 265) +* -F option, -Ft sets FS to TAB: Options. (line 257) +* -f option, on command line: Options. (line 262) * -F option, troubleshooting: Known Bugs. (line 6) -* -g option: Options. (line 131) -* -h option: Options. (line 139) -* -l option: Options. 
(line 144) -* -N option: Options. (line 176) -* -n option: Options. (line 168) -* -O option: Options. (line 181) -* -P option: Options. (line 200) -* -p option: Options. (line 188) -* -r option: Options. (line 236) -* -S option: Options. (line 243) -* -V option: Options. (line 252) +* -g option: Options. (line 129) +* -h option: Options. (line 136) +* -l option: Options. (line 141) +* -N option: Options. (line 168) +* -n option: Options. (line 160) +* -O option: Options. (line 173) +* -P option: Options. (line 192) +* -p option: Options. (line 180) +* -r option: Options. (line 228) +* -S option: Options. (line 235) +* -V option: Options. (line 249) * -v option: Options. (line 30) * -v option, variables, assigning: Assignment Options. (line 12) * -W option: Options. (line 44) @@ -23068,11 +24318,10 @@ Index (line 148) * / (forward slash), patterns and: Expression Patterns. (line 24) * /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148) -* /dev/... special files (gawk): Special FD. (line 41) +* /dev/... special files (gawk): Special FD. (line 44) * /inet/ files (gawk): TCP/IP Networking. (line 6) * /inet4/ files (gawk): TCP/IP Networking. (line 6) * /inet6/ files (gawk): TCP/IP Networking. (line 6) -* /p files (gawk): Portal Files. (line 6) * ; (semicolon): Statements/Lines. (line 90) * ; (semicolon), AWKPATH variable and: PC Using. (line 11) * ; (semicolon), separating statements in actions <1>: Statements. @@ -23145,11 +24394,11 @@ Index * \ (backslash), continuing lines and: Statements/Lines. (line 19) * \ (backslash), continuing lines and, comments and: Statements/Lines. (line 75) -* \ (backslash), continuing lines and, in csh <1>: Statements/Lines. +* \ (backslash), continuing lines and, in csh: Statements/Lines. (line 44) -* \ (backslash), continuing lines and, in csh: More Complex. (line 15) -* \ (backslash), gsub/gensub/sub functions and: Gory Details. (line 6) -* \ (backslash), in character lists: Character Lists. 
(line 17) +* \ (backslash), gsub()/gensub()/sub() functions and: Gory Details. + (line 6) +* \ (backslash), in character lists: Character Lists. (line 16) * \ (backslash), in escape sequences: Escape Sequences. (line 6) * \ (backslash), in escape sequences, POSIX and: Escape Sequences. (line 113) @@ -23157,12 +24406,12 @@ Index * ^ (caret) <1>: GNU Regexp Operators. (line 59) * ^ (caret): Regexp Operators. (line 22) -* ^ (caret), ^ operator <1>: Options. (line 216) +* ^ (caret), ^ operator <1>: Options. (line 208) * ^ (caret), ^ operator: Precedence. (line 49) -* ^ (caret), ^= operator <1>: Options. (line 216) +* ^ (caret), ^= operator <1>: Options. (line 208) * ^ (caret), ^= operator <2>: Precedence. (line 95) * ^ (caret), ^= operator: Assignment Ops. (line 129) -* ^ (caret), in character lists: Character Lists. (line 17) +* ^ (caret), in character lists: Character Lists. (line 16) * ^, in FS: Regexp Field Splitting. (line 59) * _ (underscore), _ C macro: Explaining gettext. (line 68) @@ -23181,16 +24430,15 @@ Index * adding, features to gawk: Adding Code. (line 6) * adding, fields: Changing Fields. (line 53) * adding, functions to gawk: Dynamic Extensions. (line 10) -* advanced features, buffering: I/O Functions. (line 100) -* advanced features, close function: Close Files And Pipes. - (line 130) +* advanced features, buffering: I/O Functions. (line 101) +* advanced features, close() function: Close Files And Pipes. + (line 131) * advanced features, constants, values of: Nondecimal-numbers. (line 67) -* advanced features, data files as single record: Records. (line 170) +* advanced features, data files as single record: Records. (line 172) * advanced features, fixed-width data: Constant Size. (line 9) -* advanced features, FNR/NR variables: Auto-set. (line 193) +* advanced features, FNR/NR variables: Auto-set. (line 192) * advanced features, gawk: Advanced Features. (line 6) -* advanced features, gawk, BSD portals: Portal Files. 
(line 6) * advanced features, gawk, network programming: TCP/IP Networking. (line 6) * advanced features, gawk, nondecimal input data: Nondecimal Data. @@ -23217,10 +24465,11 @@ Index (line 148) * ampersand (&), && operator: Boolean Ops. (line 57) * ampersand (&), &&operator: Precedence. (line 86) -* ampersand (&), gsub/gensub/sub functions and: Gory Details. (line 6) +* ampersand (&), gsub()/gensub()/sub() functions and: Gory Details. + (line 6) * AND bitwise operation: Bitwise Functions. (line 6) * and Boolean-logic operator: Boolean Ops. (line 6) -* and function (gawk): Bitwise Functions. (line 39) +* and() function (gawk): Bitwise Functions. (line 39) * ANSI: Glossary. (line 30) * archeologists: Bugs. (line 6) * ARGC/ARGV variables <1>: ARGC and ARGV. (line 6) @@ -23236,7 +24485,7 @@ Index * arguments, command-line, invoking awk: Command Line. (line 6) * arguments, in function calls: Function Calls. (line 16) * arguments, processing: Getopt Function. (line 6) -* arguments, retrieving: Internals. (line 121) +* arguments, retrieving: Internals. (line 111) * arithmetic operators: Arithmetic Ops. (line 6) * arrays: Arrays. (line 6) * arrays, as parameters to functions: Function Caveats. (line 55) @@ -23269,10 +24518,10 @@ Index * artificial intelligence, gawk and: Distribution contents. (line 47) * ASCII: Ordinal Functions. (line 44) -* asort function (gawk) <1>: String Functions. (line 18) -* asort function (gawk): Array Sorting. (line 6) -* asort function (gawk), arrays, sorting: Array Sorting. (line 6) -* asorti function (gawk): String Functions. (line 47) +* asort() function (gawk) <1>: String Functions. (line 18) +* asort() function (gawk): Array Sorting. (line 6) +* asort() function (gawk), arrays, sorting: Array Sorting. (line 6) +* asorti() function (gawk): String Functions. (line 47) * assert function (C library): Assert Function. (line 6) * assert user-defined function: Assert Function. (line 28) * assertions: Assert Function. 
(line 6) @@ -23289,22 +24538,22 @@ Index (line 86) * asterisk (*), * operator, null strings, matching: Gory Details. (line 159) -* asterisk (*), ** operator <1>: Options. (line 216) +* asterisk (*), ** operator <1>: Options. (line 208) * asterisk (*), ** operator <2>: Precedence. (line 49) * asterisk (*), ** operator: Arithmetic Ops. (line 81) -* asterisk (*), **= operator <1>: Options. (line 216) +* asterisk (*), **= operator <1>: Options. (line 208) * asterisk (*), **= operator <2>: Precedence. (line 95) * asterisk (*), **= operator: Assignment Ops. (line 129) * asterisk (*), *= operator <1>: Precedence. (line 95) * asterisk (*), *= operator: Assignment Ops. (line 129) -* atan2 function: Numeric Functions. (line 37) +* atan2() function: Numeric Functions. (line 37) * atari: Atari Installation. (line 9) * awf (amazingly workable formatter) program: Glossary. (line 20) * awk language, POSIX version: Assignment Ops. (line 136) * awk programs <1>: Two Rules. (line 6) * awk programs <2>: Executable Scripts. (line 6) * awk programs: Getting Started. (line 12) -* awk programs, complex: When. (line 31) +* awk programs, complex: When. (line 29) * awk programs, documenting <1>: Library Names. (line 6) * awk programs, documenting: Comments. (line 6) * awk programs, examples of: Sample Programs. (line 6) @@ -23316,7 +24565,7 @@ Index * awk programs, location of: Options. (line 25) * awk programs, one-line examples: Very Simple. (line 45) * awk programs, profiling: Profiling. (line 6) -* awk programs, profiling, enabling: Options. (line 188) +* awk programs, profiling, enabling: Options. (line 180) * awk programs, running <1>: Long. (line 6) * awk programs, running: Running gawk. (line 6) * awk programs, running, from shell scripts: One-shot. (line 22) @@ -23324,7 +24573,7 @@ Index * awk programs, shell variables in: Using Shell Variables. (line 6) * awk, function of: Getting Started. (line 6) -* awk, gawk and <1>: This Manual. (line 13) +* awk, gawk and <1>: This Manual. 
(line 14) * awk, gawk and: Preface. (line 22) * awk, history of: History. (line 17) * awk, implementation issues, pipes: Redirection. (line 135) @@ -23356,7 +24605,8 @@ Index * AWKPATH environment variable: AWKPATH Variable. (line 6) * awkprof.out file: Profiling. (line 10) * awksed.awk program: Simple Sed. (line 25) -* awkvars.out file: Options. (line 90) +* awkvars.out file: Options. (line 88) +* b debugger command (alias for break): Breakpoint Control. (line 11) * backslash (\) <1>: Regexp Operators. (line 18) * backslash (\) <2>: Quoting. (line 31) * backslash (\) <3>: Comments. (line 50) @@ -23398,20 +24648,21 @@ Index * backslash (\), continuing lines and: Statements/Lines. (line 19) * backslash (\), continuing lines and, comments and: Statements/Lines. (line 75) -* backslash (\), continuing lines and, in csh <1>: Statements/Lines. +* backslash (\), continuing lines and, in csh: Statements/Lines. (line 44) -* backslash (\), continuing lines and, in csh: More Complex. (line 15) -* backslash (\), gsub/gensub/sub functions and: Gory Details. (line 6) -* backslash (\), in character lists: Character Lists. (line 17) +* backslash (\), gsub()/gensub()/sub() functions and: Gory Details. + (line 6) +* backslash (\), in character lists: Character Lists. (line 16) * backslash (\), in escape sequences: Escape Sequences. (line 6) * backslash (\), in escape sequences, POSIX and: Escape Sequences. (line 113) * backslash (\), regexp constants: Computed Regexps. (line 28) +* backtrace debugger command: Dgawk Stack. (line 13) * BBS-list file: Sample Data Files. (line 6) -* Beebe, Nelson: Acknowledgments. (line 53) +* Beebe, Nelson: Acknowledgments. (line 59) * Beebe, Nelson H.F.: Other Versions. (line 93) * BEGIN pattern <1>: BEGIN/END. (line 6) -* BEGIN pattern <2>: Field Separators. (line 43) +* BEGIN pattern <2>: Field Separators. (line 44) * BEGIN pattern: Records. (line 29) * BEGIN pattern, assert user-defined function and: Assert Function. 
(line 82) @@ -23436,13 +24687,13 @@ Index * Bell Laboratories awk extensions: BTL. (line 6) * Benzinger, Michael: Contributors. (line 89) * BeOS: BeOS Installation. (line 6) -* Berry, Karl: Acknowledgments. (line 30) +* Berry, Karl: Acknowledgments. (line 32) * binary input/output: User-modified. (line 10) -* bindtextdomain function (C library): Explaining gettext. (line 47) -* bindtextdomain function (gawk) <1>: Programmer i18n. (line 45) -* bindtextdomain function (gawk): I18N Functions. (line 26) -* bindtextdomain function (gawk), portability and: I18N Portability. - (line 32) +* bindtextdomain() function (C library): Explaining gettext. (line 47) +* bindtextdomain() function (gawk) <1>: Programmer i18n. (line 45) +* bindtextdomain() function (gawk): I18N Functions. (line 26) +* bindtextdomain() function (gawk), portability and: I18N Portability. + (line 33) * BINMODE variable <1>: PC Using. (line 40) * BINMODE variable: User-modified. (line 10) * bits2str user-defined function: Bitwise Functions. (line 60) @@ -23459,6 +24710,7 @@ Index * braces ({}), pgawk program: Profiling. (line 140) * braces ({}), statements, grouping: Statements. (line 10) * bracket expressions, See character lists: Regexp Operators. (line 55) +* break debugger command: Breakpoint Control. (line 11) * break statement: Break Statement. (line 6) * Brennan, Michael <1>: Other Versions. (line 6) * Brennan, Michael <2>: Simple Sed. (line 25) @@ -23466,14 +24718,14 @@ Index * Brennan, Michael: Delete. (line 51) * Broder, Alan J.: Contributors. (line 80) * Brown, Martin: Contributors. (line 75) -* BSD portals: Portal Files. (line 6) * BSD-based operating systems: Glossary. (line 582) +* bt debugger command (alias for backtrace): Dgawk Stack. (line 13) * Buening, Andreas <1>: Bugs. (line 70) * Buening, Andreas <2>: Contributors. (line 84) -* Buening, Andreas: Acknowledgments. (line 53) +* Buening, Andreas: Acknowledgments. (line 59) * buffering, input/output <1>: Two-way I/O. 
(line 71) -* buffering, input/output: I/O Functions. (line 132) -* buffering, interactive vs. noninteractive: I/O Functions. (line 100) +* buffering, input/output: I/O Functions. (line 133) +* buffering, interactive vs. noninteractive: I/O Functions. (line 101) * buffers, flushing: I/O Functions. (line 29) * buffers, operators for: GNU Regexp Operators. (line 48) @@ -23490,12 +24742,12 @@ Index * caret (^) <1>: GNU Regexp Operators. (line 59) * caret (^): Regexp Operators. (line 22) -* caret (^), ^ operator <1>: Options. (line 216) +* caret (^), ^ operator <1>: Options. (line 208) * caret (^), ^ operator: Precedence. (line 49) -* caret (^), ^= operator <1>: Options. (line 216) +* caret (^), ^= operator <1>: Options. (line 208) * caret (^), ^= operator <2>: Precedence. (line 95) * caret (^), ^= operator: Assignment Ops. (line 129) -* caret (^), in character lists: Character Lists. (line 17) +* caret (^), in character lists: Character Lists. (line 16) * case keyword: Switch Statement. (line 6) * case sensitivity, array indices and: Array Intro. (line 93) * case sensitivity, converting case: String Functions. (line 492) @@ -23504,16 +24756,16 @@ Index * case sensitivity, regexps and <1>: User-modified. (line 82) * case sensitivity, regexps and: Case-sensitivity. (line 6) * case sensitivity, string comparisons and: User-modified. (line 82) -* CGI, awk scripts for: Options. (line 112) +* CGI, awk scripts for: Options. (line 110) * character encodings: Ordinal Functions. (line 44) * character lists <1>: Character Lists. (line 6) * character lists: Regexp Operators. (line 55) -* character lists, character classes: Character Lists. (line 30) -* character lists, collating elements: Character Lists. (line 71) -* character lists, collating symbols: Character Lists. (line 78) +* character lists, character classes: Character Lists. (line 29) +* character lists, collating elements: Character Lists. (line 70) +* character lists, collating symbols: Character Lists. 
(line 77) * character lists, complemented: Regexp Operators. (line 62) -* character lists, equivalence classes: Character Lists. (line 84) -* character lists, non-ASCII: Character Lists. (line 71) +* character lists, equivalence classes: Character Lists. (line 83) +* character lists, non-ASCII: Character Lists. (line 70) * character lists, range expressions: Character Lists. (line 6) * character sets: Ordinal Functions. (line 44) * character sets (machine character encodings): Glossary. (line 137) @@ -23521,29 +24773,30 @@ Index * characters, counting: Wc Program. (line 6) * characters, transliterating: Translate Program. (line 6) * characters, values of as numbers: Ordinal Functions. (line 6) -* Chassell, Robert J.: Acknowledgments. (line 30) +* Chassell, Robert J.: Acknowledgments. (line 32) * chdir function, implementing in gawk: Sample Library. (line 6) * chem utility: Glossary. (line 145) * chr user-defined function: Ordinal Functions. (line 16) +* clear debugger command: Breakpoint Control. (line 33) * Cliff random numbers: Cliff Random Function. (line 6) * cliff_rand user-defined function: Cliff Random Function. (line 12) -* close function <1>: I/O Functions. (line 10) -* close function <2>: Close Files And Pipes. +* close() function <1>: I/O Functions. (line 10) +* close() function <2>: Close Files And Pipes. (line 18) -* close function <3>: Getline/Pipe. (line 24) -* close function: Getline/Variable/File. +* close() function <3>: Getline/Pipe. (line 24) +* close() function: Getline/Variable/File. (line 30) -* close function, return values: Close Files And Pipes. - (line 130) -* close function, two-way pipes and: Two-way I/O. (line 78) +* close() function, return values: Close Files And Pipes. + (line 131) +* close() function, two-way pipes and: Two-way I/O. (line 78) * Close, Diane <1>: Contributors. (line 21) -* Close, Diane: Manual History. (line 40) -* close_func input method: Internals. (line 178) -* collating elements: Character Lists. 
(line 71) -* collating symbols: Character Lists. (line 78) -* Colombo, Antonio: Acknowledgments. (line 53) +* Close, Diane: Manual History. (line 39) +* close_func input method: Internals. (line 162) +* collating elements: Character Lists. (line 70) +* collating symbols: Character Lists. (line 77) +* Colombo, Antonio: Acknowledgments. (line 59) * columns, aligning: Print Examples. (line 70) * columns, cutting: Cut Program. (line 6) * comma (,), in range patterns: Ranges. (line 6) @@ -23564,8 +24817,11 @@ Index * command line, variables, assigning on: Assignment Options. (line 6) * command-line options, processing: Getopt Function. (line 6) * command-line options, string extraction: String Extraction. (line 6) +* commands debugger command: Dgawk Execution Control. + (line 10) * commenting: Comments. (line 6) * commenting, backslash continuation and: Statements/Lines. (line 75) +* common extensions, \x escape sequence: Escape Sequences. (line 61) * comp.lang.awk newsgroup: Bugs. (line 37) * comparison expressions: Typing and Comparison. (line 9) @@ -23578,22 +24834,21 @@ Index (line 60) * compatibility mode (gawk), octal numbers: Nondecimal-numbers. (line 60) -* compatibility mode (gawk), specifying: Options. (line 76) +* compatibility mode (gawk), specifying: Options. (line 75) * compiled programs <1>: Glossary. (line 155) * compiled programs: Basic High Level. (line 15) -* compl function (gawk): Bitwise Functions. (line 43) +* compl() function (gawk): Bitwise Functions. (line 43) * complement, bitwise: Bitwise Functions. (line 25) * compound statements, control statements and: Statements. (line 10) * concatenating: Concatenation. (line 9) +* condition debugger command: Breakpoint Control. (line 50) * conditional expressions: Conditional Exp. (line 6) * configuration option, --disable-lint: Additional Configuration Options. - (line 17) + (line 13) * configuration option, --disable-nls: Additional Configuration Options. 
- (line 32) -* configuration option, --enable-portals: Additional Configuration Options. - (line 9) + (line 28) * configuration option, --with-whiny-user-strftime: Additional Configuration Options. - (line 13) + (line 9) * configuration options, gawk: Additional Configuration Options. (line 6) * constants, nondecimal: Nondecimal Data. (line 6) @@ -23616,11 +24871,10 @@ Index * coprocesses, closing: Close Files And Pipes. (line 6) * coprocesses, getline from: Getline/Coprocess. (line 6) -* cos function: Numeric Functions. (line 34) +* cos() function: Numeric Functions. (line 34) * counting: Wc Program. (line 6) * csh utility: Statements/Lines. (line 44) -* csh utility, backslash continuation and: More Complex. (line 15) -* csh utility, POSIXLY_CORRECT environment variable: Options. (line 303) +* csh utility, POSIXLY_CORRECT environment variable: Options. (line 300) * csh utility, |& operator, comparison with: Two-way I/O. (line 44) * ctime user-defined function: Function Example. (line 72) * currency symbols, localization: Explaining gettext. (line 99) @@ -23628,6 +24882,7 @@ Index (line 29) * cut utility: Cut Program. (line 6) * cut.awk program: Cut Program. (line 44) +* d debugger command (alias for break): Breakpoint Control. (line 57) * d.c., See dark corner: Conventions. (line 37) * dark corner <1>: Glossary. (line 187) * dark corner <2>: Truth Values. (line 24) @@ -23638,8 +24893,8 @@ Index * dark corner, array subscripts: Uninitialized Subscripts. (line 42) * dark corner, break statement: Break Statement. (line 47) -* dark corner, close function: Close Files And Pipes. - (line 130) +* dark corner, close() function: Close Files And Pipes. + (line 131) * dark corner, command-line arguments: Assignment Options. (line 43) * dark corner, continue statement: Continue Statement. (line 43) * dark corner, CONVFMT variable: Conversion. (line 40) @@ -23649,15 +24904,15 @@ Index * dark corner, exit statement: Exit Statement. 
(line 29) * dark corner, field separators: Field Splitting Summary. (line 47) -* dark corner, FILENAME variable <1>: Auto-set. (line 93) +* dark corner, FILENAME variable <1>: Auto-set. (line 92) * dark corner, FILENAME variable: Getline Notes. (line 19) -* dark corner, FNR/NR variables: Auto-set. (line 193) +* dark corner, FNR/NR variables: Auto-set. (line 192) * dark corner, format-control characters: Control Letters. (line 18) * dark corner, FS as null string: Single Character Fields. (line 20) * dark corner, input files: Records. (line 98) * dark corner, invoking awk: Command Line. (line 16) -* dark corner, length function: String Functions. (line 88) +* dark corner, length() function: String Functions. (line 87) * dark corner, multiline records: Multiple Line. (line 35) * dark corner, NF variable, decrementing: Changing Fields. (line 107) * dark corner, OFMT variable: OFMT. (line 27) @@ -23667,8 +24922,8 @@ Index (line 148) * dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps. (line 44) -* dark corner, split function: String Functions. (line 259) -* dark corner, strings, storing: Records. (line 186) +* dark corner, split() function: String Functions. (line 259) +* dark corner, strings, storing: Records. (line 188) * data, fixed-width: Constant Size. (line 9) * data-driven languages: Basic High Level. (line 85) * database, group, reading: Group Functions. (line 6) @@ -23678,28 +24933,127 @@ Index * dates, converting to timestamps: Time Functions. (line 72) * dates, information related to, localization: Explaining gettext. (line 111) -* Davies, Stephen: Contributors. (line 69) -* dcgettext function (gawk) <1>: Programmer i18n. (line 19) -* dcgettext function (gawk): I18N Functions. (line 12) -* dcgettext function (gawk), portability and: I18N Portability. - (line 32) -* dcngettext function (gawk) <1>: Programmer i18n. (line 35) -* dcngettext function (gawk): I18N Functions. 
(line 18) -* dcngettext function (gawk), portability and: I18N Portability. - (line 32) +* Davies, Stephen <1>: Contributors. (line 69) +* Davies, Stephen: Acknowledgments. (line 59) +* dcgettext() function (gawk) <1>: Programmer i18n. (line 19) +* dcgettext() function (gawk): I18N Functions. (line 12) +* dcgettext() function (gawk), portability and: I18N Portability. + (line 33) +* dcngettext() function (gawk) <1>: Programmer i18n. (line 35) +* dcngettext() function (gawk): I18N Functions. (line 18) +* dcngettext() function (gawk), portability and: I18N Portability. + (line 33) * deadlocks: Two-way I/O. (line 71) +* debugger commands, b (break): Breakpoint Control. (line 11) +* debugger commands, backtrace: Dgawk Stack. (line 13) +* debugger commands, break: Breakpoint Control. (line 11) +* debugger commands, bt (backtrace): Dgawk Stack. (line 13) +* debugger commands, c (continue): Dgawk Execution Control. + (line 33) +* debugger commands, clear: Breakpoint Control. (line 33) +* debugger commands, commands: Dgawk Execution Control. + (line 10) +* debugger commands, condition: Breakpoint Control. (line 50) +* debugger commands, continue: Dgawk Execution Control. + (line 33) +* debugger commands, d (delete): Breakpoint Control. (line 57) +* debugger commands, delete: Breakpoint Control. (line 57) +* debugger commands, disable: Breakpoint Control. (line 62) +* debugger commands, display: Viewing And Changing Data. + (line 8) +* debugger commands, down: Dgawk Stack. (line 21) +* debugger commands, dump: Miscellaneous Dgawk Commands. + (line 9) +* debugger commands, e (enable): Breakpoint Control. (line 66) +* debugger commands, enable: Breakpoint Control. (line 66) +* debugger commands, end: Dgawk Execution Control. + (line 10) +* debugger commands, eval: Viewing And Changing Data. + (line 23) +* debugger commands, f (frame): Dgawk Stack. (line 25) +* debugger commands, finish: Dgawk Execution Control. + (line 39) +* debugger commands, frame: Dgawk Stack. 
(line 25) +* debugger commands, h (help): Miscellaneous Dgawk Commands. + (line 71) +* debugger commands, help: Miscellaneous Dgawk Commands. + (line 71) +* debugger commands, i (info): Dgawk Info. (line 12) +* debugger commands, ignore: Breakpoint Control. (line 80) +* debugger commands, info: Dgawk Info. (line 12) +* debugger commands, l (list): Miscellaneous Dgawk Commands. + (line 77) +* debugger commands, list: Miscellaneous Dgawk Commands. + (line 77) +* debugger commands, n (next): Dgawk Execution Control. + (line 43) +* debugger commands, next: Dgawk Execution Control. + (line 43) +* debugger commands, nexti: Dgawk Execution Control. + (line 49) +* debugger commands, ni (nexti): Dgawk Execution Control. + (line 49) +* debugger commands, o (option): Dgawk Info. (line 56) +* debugger commands, option: Dgawk Info. (line 56) +* debugger commands, p (print): Viewing And Changing Data. + (line 36) +* debugger commands, print: Viewing And Changing Data. + (line 36) +* debugger commands, printf: Viewing And Changing Data. + (line 54) +* debugger commands, q (quit): Miscellaneous Dgawk Commands. + (line 104) +* debugger commands, quit: Miscellaneous Dgawk Commands. + (line 104) +* debugger commands, r (run): Dgawk Execution Control. + (line 62) +* debugger commands, return: Dgawk Execution Control. + (line 54) +* debugger commands, run: Dgawk Execution Control. + (line 62) +* debugger commands, s (step): Dgawk Execution Control. + (line 68) +* debugger commands, set: Viewing And Changing Data. + (line 59) +* debugger commands, si (stepi): Dgawk Execution Control. + (line 76) +* debugger commands, silent: Dgawk Execution Control. + (line 10) +* debugger commands, step: Dgawk Execution Control. + (line 68) +* debugger commands, stepi: Dgawk Execution Control. + (line 76) +* debugger commands, t (tbreak): Breakpoint Control. (line 83) +* debugger commands, tbreak: Breakpoint Control. (line 83) +* debugger commands, trace: Miscellaneous Dgawk Commands. 
+ (line 113) +* debugger commands, u (until): Dgawk Execution Control. + (line 83) +* debugger commands, undisplay: Viewing And Changing Data. + (line 80) +* debugger commands, until: Dgawk Execution Control. + (line 83) +* debugger commands, unwatch: Viewing And Changing Data. + (line 84) +* debugger commands, up: Dgawk Stack. (line 33) +* debugger commands, w (watch): Viewing And Changing Data. + (line 67) +* debugger commands, watch: Viewing And Changing Data. + (line 67) * debugging gawk: Known Bugs. (line 6) * debugging gawk, bug reports: Bugs. (line 9) -* decimal point character, locale specific: Options. (line 224) +* decimal point character, locale specific: Options. (line 216) * decrement operators: Increment Ops. (line 35) * default keyword: Switch Statement. (line 6) * Deifik, Scott <1>: Bugs. (line 69) * Deifik, Scott <2>: Contributors. (line 53) -* Deifik, Scott: Acknowledgments. (line 53) +* Deifik, Scott: Acknowledgments. (line 59) +* delete debugger command: Breakpoint Control. (line 57) * delete statement: Delete. (line 6) * deleting elements in arrays: Delete. (line 6) * deleting entire arrays: Delete. (line 39) -* differences between gawk and awk: String Functions. (line 102) +* dgawk: Debugger. (line 6) +* differences between gawk and awk: String Functions. (line 101) * differences in awk and gawk, ARGC/ARGV variables: ARGC and ARGV. (line 86) * differences in awk and gawk, ARGIND variable: Auto-set. (line 40) @@ -23713,7 +25067,7 @@ Index (line 40) * differences in awk and gawk, BINMODE variable: User-modified. (line 23) -* differences in awk and gawk, close function: Close Files And Pipes. +* differences in awk and gawk, close() function: Close Files And Pipes. (line 81) * differences in awk and gawk, ERRNO variable: Auto-set. (line 72) * differences in awk and gawk, error messages: Special FD. (line 15) @@ -23738,27 +25092,27 @@ Index * differences in awk and gawk, line continuations: Conditional Exp. 
(line 34) * differences in awk and gawk, LINT variable: User-modified. (line 97) -* differences in awk and gawk, match function: String Functions. +* differences in awk and gawk, match() function: String Functions. (line 165) * differences in awk and gawk, next/nextfile statements: Nextfile Statement. (line 6) * differences in awk and gawk, print/printf statements: Format Modifiers. (line 13) -* differences in awk and gawk, PROCINFO array: Auto-set. (line 124) +* differences in awk and gawk, PROCINFO array: Auto-set. (line 123) * differences in awk and gawk, record separators: Records. (line 112) * differences in awk and gawk, regexp constants: Using Constant Regexps. (line 44) * differences in awk and gawk, regular expressions: Case-sensitivity. (line 26) -* differences in awk and gawk, RS/RT variables: Records. (line 162) -* differences in awk and gawk, RT variable: Auto-set. (line 182) +* differences in awk and gawk, RS/RT variables: Records. (line 164) +* differences in awk and gawk, RT variable: Auto-set. (line 181) * differences in awk and gawk, single-character fields: Single Character Fields. (line 6) -* differences in awk and gawk, split function: String Functions. +* differences in awk and gawk, split() function: String Functions. (line 248) * differences in awk and gawk, strings: Scalar Constants. (line 20) -* differences in awk and gawk, strings, storing: Records. (line 182) -* differences in awk and gawk, strtonum function (gawk): String Functions. +* differences in awk and gawk, strings, storing: Records. (line 184) +* differences in awk and gawk, strtonum() function (gawk): String Functions. (line 286) * differences in awk and gawk, TEXTDOMAIN variable: User-modified. (line 152) @@ -23769,6 +25123,9 @@ Index (line 6) * directories, searching <1>: Igawk Program. (line 358) * directories, searching: AWKPATH Variable. (line 6) +* disable debugger command: Breakpoint Control. (line 62) +* display debugger command: Viewing And Changing Data. 
+ (line 8) * division: Arithmetic Ops. (line 44) * do-while statement <1>: Do Statement. (line 6) * do-while statement: Regexp Usage. (line 19) @@ -23784,13 +25141,17 @@ Index * double quote ("): Read Terminal. (line 25) * double quote ("), regexp constants: Computed Regexps. (line 28) * double-precision floating-point: Basic Data Typing. (line 33) -* Drepper, Ulrich: Acknowledgments. (line 49) -* DuBois, John: Acknowledgments. (line 53) -* dupnode internal function: Internals. (line 97) +* down debugger command: Dgawk Stack. (line 21) +* Drepper, Ulrich: Acknowledgments. (line 51) +* DuBois, John: Acknowledgments. (line 59) +* dump debugger command: Miscellaneous Dgawk Commands. + (line 9) +* dupnode internal function: Internals. (line 87) * dupword.awk program: Dupword Program. (line 31) +* e debugger command (alias for break): Breakpoint Control. (line 66) * EBCDIC: Ordinal Functions. (line 44) * egrep utility <1>: Egrep Program. (line 6) -* egrep utility: Character Lists. (line 24) +* egrep utility: Character Lists. (line 23) * egrep.awk program: Egrep Program. (line 54) * elements in arrays: Reference to Elements. (line 6) @@ -23803,6 +25164,9 @@ Index * empty pattern: Empty. (line 6) * empty strings, See null strings: Regexp Field Splitting. (line 43) +* enable debugger command: Breakpoint Control. (line 66) +* end debugger command: Dgawk Execution Control. + (line 10) * END pattern: BEGIN/END. (line 6) * END pattern, assert user-defined function and: Assert Function. (line 74) @@ -23822,7 +25186,7 @@ Index * endgrent user-defined function: Group Functions. (line 218) * endpwent function (C library): Passwd Functions. (line 199) * endpwent user-defined function: Passwd Functions. (line 202) -* ENVIRON variable <1>: Internals. (line 165) +* ENVIRON variable <1>: Internals. (line 149) * ENVIRON variable: Auto-set. (line 60) * environment variables: Auto-set. (line 60) * epoch, definition of: Glossary. 
(line 229) @@ -23830,16 +25194,19 @@ Index * equals sign (=), == operator <1>: Precedence. (line 65) * equals sign (=), == operator: Comparison Operators. (line 11) -* EREs (Extended Regular Expressions): Character Lists. (line 24) -* ERRNO variable <1>: Internals. (line 152) +* EREs (Extended Regular Expressions): Character Lists. (line 23) +* ERRNO variable <1>: Internals. (line 136) * ERRNO variable <2>: Auto-set. (line 72) * ERRNO variable: Getline. (line 19) * error handling: Special FD. (line 15) * error handling, ERRNO variable and: Auto-set. (line 72) * error output: Special FD. (line 6) -* escape processing, gsub/gensub/sub functions: Gory Details. (line 6) +* escape processing, gsub()/gensub()/sub() functions: Gory Details. + (line 6) * escape sequences: Escape Sequences. (line 6) -* escape sequences, unrecognized: Options. (line 204) +* escape sequences, unrecognized: Options. (line 196) +* eval debugger command: Viewing And Changing Data. + (line 23) * evaluation order: Increment Ops. (line 61) * evaluation order, concatenation: Concatenation. (line 42) * evaluation order, functions: Calling Built-in. (line 30) @@ -23861,7 +25228,7 @@ Index * exclamation point (!), !~ operator: Regexp Usage. (line 19) * exit statement: Exit Statement. (line 6) * exit status, of gawk: Exit Status. (line 6) -* exp function: Numeric Functions. (line 22) +* exp() function: Numeric Functions. (line 22) * expand utility: Very Simple. (line 69) * expressions: Expressions. (line 6) * expressions, as patterns: Expression Patterns. (line 6) @@ -23873,15 +25240,17 @@ Index * expressions, matching, See comparison expressions: Typing and Comparison. (line 9) * expressions, selecting: Conditional Exp. (line 6) -* Extended Regular Expressions (EREs): Character Lists. (line 24) -* extension function (gawk): Using Internal File Ops. +* Extended Regular Expressions (EREs): Character Lists. (line 23) +* extension() function (gawk): Using Internal File Ops. 
(line 15) * extensions, Bell Laboratories awk: BTL. (line 6) +* extensions, common, \x escape sequence: Escape Sequences. (line 61) * extensions, in gawk, not in POSIX awk: POSIX/GNU. (line 6) * extensions, mawk: Other Versions. (line 51) * extract.awk program: Extract Program. (line 77) * extraction, of marked strings (internationalization): String Extraction. (line 6) +* f debugger command (alias for frame): Dgawk Stack. (line 25) * false, logical: Truth Values. (line 6) * FDL (Free Documentation License): GNU Free Documentation License. (line 6) @@ -23891,14 +25260,14 @@ Index * features, undocumented: Undocumented. (line 6) * Fenlason, Jay <1>: Contributors. (line 19) * Fenlason, Jay: History. (line 30) -* fflush function: I/O Functions. (line 25) -* fflush function, unsupported: Options. (line 227) +* fflush() function: I/O Functions. (line 25) +* fflush() function, unsupported: Options. (line 219) * field numbers: Nonconstant Fields. (line 6) * field operator $: Fields. (line 19) * field operators, dollar sign as: Fields. (line 19) * field separators <1>: User-modified. (line 56) -* field separators: Field Separators. (line 13) -* field separators, choice of: Field Separators. (line 49) +* field separators: Field Separators. (line 14) +* field separators, choice of: Field Separators. (line 50) * field separators, FIELDWIDTHS variable and: User-modified. (line 35) * field separators, FPAT variable and: User-modified. (line 45) * field separators, in multiline records: Multiple Line. (line 41) @@ -23909,7 +25278,7 @@ Index * field separators, POSIX and: Fields. (line 6) * field separators, regular expressions as <1>: Regexp Field Splitting. (line 6) -* field separators, regular expressions as: Field Separators. (line 49) +* field separators, regular expressions as: Field Separators. (line 50) * field separators, See Also OFS: Changing Fields. (line 64) * field separators, spaces as: Cut Program. (line 106) * fields <1>: Basic High Level. 
(line 73) @@ -23922,7 +25291,7 @@ Index * fields, number of: Fields. (line 33) * fields, numbers: Nonconstant Fields. (line 6) * fields, printing: Print Examples. (line 21) -* fields, separating: Field Separators. (line 13) +* fields, separating: Field Separators. (line 14) * fields, single-character: Single Character Fields. (line 6) * FIELDWIDTHS variable <1>: User-modified. (line 35) @@ -23930,8 +25299,8 @@ Index * file descriptors: Special FD. (line 6) * file names, distinguishing: Auto-set. (line 52) * file names, in compatibility mode: Special Caveats. (line 9) -* file names, standard streams in gawk: Special FD. (line 41) -* FILENAME variable <1>: Auto-set. (line 93) +* file names, standard streams in gawk: Special FD. (line 44) +* FILENAME variable <1>: Auto-set. (line 92) * FILENAME variable: Reading Files. (line 6) * FILENAME variable, getline, setting with: Getline Notes. (line 19) * filenames, assignments as: Ignoring Assigns. (line 6) @@ -23942,15 +25311,14 @@ Index * files, .po <1>: Translator i18n. (line 6) * files, .po: Explaining gettext. (line 36) * files, .po, converting to .mo: I18N Example. (line 62) -* files, /dev/... special files: Special FD. (line 41) +* files, /dev/... special files: Special FD. (line 44) * files, /inet/ (gawk): TCP/IP Networking. (line 6) * files, /inet4/ (gawk): TCP/IP Networking. (line 6) * files, /inet6/ (gawk): TCP/IP Networking. (line 6) -* files, /p (gawk): Portal Files. (line 6) -* files, as single records: Records. (line 191) +* files, as single records: Records. (line 193) * files, awk programs in: Long. (line 6) * files, awkprof.out: Profiling. (line 10) -* files, awkvars.out: Options. (line 90) +* files, awkvars.out: Options. (line 88) * files, closing: I/O Functions. (line 10) * files, descriptors, See file descriptors: Special FD. (line 6) * files, group: Group Functions. (line 6) @@ -23977,8 +25345,7 @@ Index * files, portable object: Explaining gettext. 
(line 36) * files, portable object, converting to message object files: I18N Example. (line 62) -* files, portable object, generating: Options. (line 131) -* files, portal: Portal Files. (line 6) +* files, portable object, generating: Options. (line 129) * files, processing, ARGIND variable and: Auto-set. (line 47) * files, reading: Rewind Function. (line 6) * files, reading, multiline records: Multiple Line. (line 6) @@ -23987,6 +25354,8 @@ Index * files, source, search path for: Igawk Program. (line 358) * files, splitting: Split Program. (line 6) * files, Texinfo, extracting programs from: Extract Program. (line 6) +* finish debugger command: Dgawk Execution Control. + (line 39) * Fish, Fred: Contributors. (line 50) * fixed-width data: Constant Size. (line 9) * flag variables <1>: Tee Program. (line 20) @@ -23994,9 +25363,9 @@ Index * floating-point: Unexpected Results. (line 6) * floating-point, numbers: Basic Data Typing. (line 21) * floating-point, numbers, AWKNUM internal type: Internals. (line 19) -* FNR variable <1>: Auto-set. (line 103) +* FNR variable <1>: Auto-set. (line 102) * FNR variable: Records. (line 6) -* FNR variable, changing: Auto-set. (line 193) +* FNR variable, changing: Auto-set. (line 192) * for statement: For Statement. (line 6) * for statement, in arrays: Scanning an Array. (line 20) * force_number internal function: Internals. (line 27) @@ -24004,7 +25373,7 @@ Index * format specifiers, mixing regular with positional specifiers: Printf Ordering. (line 57) * format specifiers, printf statement: Control Letters. (line 6) -* format specifiers, strftime function (gawk): Time Functions. +* format specifiers, strftime() function (gawk): Time Functions. (line 85) * format strings: Basic Printf. (line 15) * formats, numeric output: OFMT. (line 6) @@ -24017,26 +25386,28 @@ Index (line 148) * forward slash (/), patterns and: Expression Patterns. (line 24) * FPAT variable: User-modified. (line 45) +* frame debugger command: Dgawk Stack. 
(line 25) * Free Documentation License (FDL): GNU Free Documentation License. (line 6) * Free Software Foundation (FSF) <1>: Glossary. (line 286) +* Free Software Foundation (FSF) <2>: Getting. (line 10) * Free Software Foundation (FSF): Manual History. (line 6) -* free_temp internal macro: Internals. (line 102) * FreeBSD: Glossary. (line 582) * FS variable <1>: User-modified. (line 56) -* FS variable: Field Separators. (line 13) +* FS variable: Field Separators. (line 14) * FS variable, --field-separator option and: Options. (line 21) * FS variable, as null string: Single Character Fields. (line 20) -* FS variable, as TAB character: Options. (line 220) +* FS variable, as TAB character: Options. (line 212) * FS variable, changing value of <1>: Known Bugs. (line 6) -* FS variable, changing value of: Field Separators. (line 33) +* FS variable, changing value of: Field Separators. (line 34) * FS variable, running awk programs and: Cut Program. (line 66) * FS variable, setting from command line: Command Line Field Separator. (line 6) * FS, containing ^: Regexp Field Splitting. (line 59) * FSF (Free Software Foundation) <1>: Glossary. (line 286) +* FSF (Free Software Foundation) <2>: Getting. (line 10) * FSF (Free Software Foundation): Manual History. (line 6) * function calls: Function Calls. (line 6) * function calls, indirect: Indirect Calls. (line 6) @@ -24073,7 +25444,7 @@ Index * functions, names of <1>: Definition Syntax. (line 21) * functions, names of: Arrays. (line 17) * functions, recursive: Definition Syntax. (line 73) -* functions, return values, setting: Internals. (line 146) +* functions, return values, setting: Internals. (line 136) * functions, string-translation: I18N Functions. (line 6) * functions, undefined: Function Caveats. (line 79) * functions, user-defined: User-defined. (line 6) @@ -24084,14 +25455,14 @@ Index (line 39) * functions, user-defined, next/nextfile statements and: Next Statement. (line 39) -* G-d: Acknowledgments. 
(line 72) +* G-d: Acknowledgments. (line 79) * Garfinkle, Scott: Contributors. (line 37) -* gawk, awk and <1>: This Manual. (line 13) +* gawk, awk and <1>: This Manual. (line 14) * gawk, awk and: Preface. (line 22) * gawk, bitwise operations in: Bitwise Functions. (line 39) * gawk, break statement in: Break Statement. (line 47) * gawk, built-in variables and: Built-in Variables. (line 14) -* gawk, character classes and: Character Lists. (line 92) +* gawk, character classes and: Character Lists. (line 91) * gawk, coding style in: Adding Code. (line 32) * gawk, command-line options: GNU Regexp Operators. (line 70) @@ -24106,10 +25477,10 @@ Index * gawk, distribution: Distribution contents. (line 6) * gawk, escape sequences: Escape Sequences. (line 125) -* gawk, extensions, disabling: Options. (line 200) +* gawk, extensions, disabling: Options. (line 192) * gawk, features, adding: Adding Code. (line 6) * gawk, features, advanced: Advanced Features. (line 6) -* gawk, fflush function in: I/O Functions. (line 45) +* gawk, fflush() function in: I/O Functions. (line 45) * gawk, field separators and: User-modified. (line 77) * gawk, FIELDWIDTHS variable in: User-modified. (line 41) * gawk, file names in: Special Files. (line 6) @@ -24156,21 +25527,21 @@ Index * gawk, string-translation functions: I18N Functions. (line 6) * gawk, timestamps: Time Functions. (line 6) * gawk, uses for: Preface. (line 35) -* gawk, versions of, information about, printing: Options. (line 252) +* gawk, versions of, information about, printing: Options. (line 249) * gawk, word-boundary operator: GNU Regexp Operators. (line 63) * General Public License (GPL): Glossary. (line 295) * General Public License, See GPL: Manual History. (line 11) -* gensub function (gawk) <1>: String Functions. (line 400) -* gensub function (gawk): Using Constant Regexps. +* gensub() function (gawk) <1>: String Functions. (line 400) +* gensub() function (gawk): Using Constant Regexps. 
(line 44) -* gensub function (gawk), escape processing: Gory Details. (line 6) -* get_actual_argument internal function: Internals. (line 126) -* get_argument internal function: Internals. (line 121) -* get_array_argument internal macro: Internals. (line 141) +* gensub() function (gawk), escape processing: Gory Details. (line 6) +* get_actual_argument internal function: Internals. (line 116) +* get_argument internal function: Internals. (line 111) +* get_array_argument internal macro: Internals. (line 131) * get_curfunc_arg_count internal function: Internals. (line 37) -* get_record input method: Internals. (line 178) -* get_scalar_argument internal macro: Internals. (line 136) +* get_record input method: Internals. (line 162) +* get_scalar_argument internal macro: Internals. (line 126) * getgrent function (C library): Group Functions. (line 6) * getgrent user-defined function: Group Functions. (line 6) * getgrgid function (C library): Group Functions. (line 182) @@ -24206,7 +25577,7 @@ Index * gettext library, locale categories: Explaining gettext. (line 78) * gettimeofday user-defined function: Gettimeofday Function. (line 16) -* GNITS mailing list: Acknowledgments. (line 49) +* GNITS mailing list: Acknowledgments. (line 51) * GNU awk, See gawk: Preface. (line 48) * GNU Free Documentation License: GNU Free Documentation License. (line 6) @@ -24214,7 +25585,7 @@ Index * GNU Lesser General Public License: Glossary. (line 373) * GNU long options <1>: Options. (line 6) * GNU long options: Command Line. (line 13) -* GNU long options, printing list of: Options. (line 139) +* GNU long options, printing list of: Options. (line 136) * GNU Project <1>: Glossary. (line 304) * GNU Project: Manual History. (line 11) * GNU/Linux <1>: Glossary. (line 582) @@ -24223,26 +25594,31 @@ Index * GNU/Linux: Manual History. (line 28) * GPL (General Public License) <1>: Glossary. (line 295) * GPL (General Public License): Manual History. 
(line 11)
-* GPL (General Public License), printing: Options. (line 85)
+* GPL (General Public License), printing: Options. (line 83)
 * grcat program: Group Functions. (line 15)
 * Grigera, Juan: Contributors. (line 55)
 * group database, reading: Group Functions. (line 6)
 * group file: Group Functions. (line 6)
 * groups, information about: Group Functions. (line 6)
-* gsub function <1>: String Functions. (line 384)
-* gsub function: Using Constant Regexps.
+* gsub() function <1>: String Functions. (line 384)
+* gsub() function: Using Constant Regexps.
     (line 44)
-* gsub function, arguments of: String Functions. (line 364)
-* gsub function, escape processing: Gory Details. (line 6)
+* gsub() function, arguments of: String Functions. (line 364)
+* gsub() function, escape processing: Gory Details. (line 6)
+* h debugger command (alias for help): Miscellaneous Dgawk Commands.
+    (line 71)
 * Hankerson, Darrel <1>: Contributors. (line 57)
-* Hankerson, Darrel: Acknowledgments. (line 53)
-* Hartholz, Elaine: Acknowledgments. (line 35)
-* Hartholz, Marshall: Acknowledgments. (line 35)
+* Hankerson, Darrel: Acknowledgments. (line 59)
+* Haque, John: Acknowledgments. (line 59)
+* Hartholz, Elaine: Acknowledgments. (line 37)
+* Hartholz, Marshall: Acknowledgments. (line 37)
 * Hasegawa, Isamu: Contributors. (line 86)
+* help debugger command: Miscellaneous Dgawk Commands.
+    (line 71)
 * hexadecimal numbers: Nondecimal-numbers. (line 6)
-* hexadecimal values, enabling interpretation of: Options. (line 168)
+* hexadecimal values, enabling interpretation of: Options. (line 160)
 * histsort.awk program: History Sorting. (line 25)
-* Hughes, Phil: Acknowledgments. (line 40)
+* Hughes, Phil: Acknowledgments. (line 42)
 * HUP signal: Profiling. (line 207)
 * hyphen (-), - operator: Precedence. (line 52)
 * hyphen (-), -- (decrement/increment) operators: Precedence. (line 46)
@@ -24250,13 +25626,15 @@ Index
 * hyphen (-), -= operator <1>: Precedence. (line 95)
 * hyphen (-), -= operator: Assignment Ops. (line 129)
 * hyphen (-), filenames beginning with: Options. (line 56)
-* hyphen (-), in character lists: Character Lists. (line 17)
+* hyphen (-), in character lists: Character Lists. (line 16)
+* i debugger command (alias for info): Dgawk Info. (line 12)
 * id utility: Id Program. (line 6)
 * id.awk program: Id Program. (line 30)
 * if statement <1>: If Statement. (line 6)
 * if statement: Regexp Usage. (line 19)
 * if statement, actions, changing: Ranges. (line 25)
 * igawk.sh program: Igawk Program. (line 118)
+* ignore debugger command: Breakpoint Control. (line 80)
 * IGNORECASE variable <1>: User-modified. (line 82)
 * IGNORECASE variable: Case-sensitivity. (line 26)
 * IGNORECASE variable, array sorting and: Array Sorting. (line 86)
@@ -24276,9 +25654,10 @@ Index
 * in operator, arrays and: Reference to Elements.
     (line 25)
 * increment operators: Increment Ops. (line 6)
-* index function: String Functions. (line 60)
+* index() function: String Functions. (line 60)
 * indexing arrays: Array Intro. (line 51)
 * indirect function calls: Indirect Calls. (line 6)
+* info debugger command: Dgawk Info. (line 12)
 * initialization, automatic: More Complex. (line 38)
 * input files: Reading Files. (line 6)
 * input files, closing: Close Files And Pipes.
@@ -24307,8 +25686,8 @@ Index
 * installation, tandem: Tandem Installation. (line 6)
 * installation, vms: VMS Installation. (line 6)
 * installing gawk: Installation. (line 6)
-* int function: Numeric Functions. (line 11)
 * INT signal (MS-DOS): Profiling. (line 210)
+* int() function: Numeric Functions. (line 11)
 * integers: Basic Data Typing. (line 21)
 * integers, unsigned: Basic Data Typing. (line 28)
 * interacting with other programs: I/O Functions. (line 64)
@@ -24318,7 +25697,7 @@ Index
     (line 13)
 * internationalization, localization: User-modified. (line 152)
 * internationalization, localization, character classes: Character Lists.
-    (line 92)
+    (line 91)
 * internationalization, localization, gawk and: Internationalization.
     (line 13)
 * internationalization, localization, locale categories: Explaining gettext.
@@ -24332,40 +25711,42 @@ Index
 * interpreted programs: Basic High Level. (line 15)
 * interval expressions: Regexp Operators. (line 115)
 * inventory-shipped file: Sample Data Files. (line 32)
-* IOBUF internal structure: Internals. (line 178)
-* iop_alloc internal function: Internals. (line 178)
+* IOBUF internal structure: Internals. (line 162)
+* iop_alloc internal function: Internals. (line 162)
 * ISO: Glossary. (line 355)
 * ISO 8859-1: Glossary. (line 137)
 * ISO Latin-1: Glossary. (line 137)
 * Jacobs, Andrew: Passwd Functions. (line 76)
 * Jaegermann, Michal <1>: Contributors. (line 45)
-* Jaegermann, Michal: Acknowledgments. (line 53)
+* Jaegermann, Michal: Acknowledgments. (line 59)
 * Java implementation of awk: Other Versions. (line 111)
 * jawk: Other Versions. (line 111)
 * Jedi knights: Undocumented. (line 6)
 * join user-defined function: Join Function. (line 18)
 * Kahrs, Ju"rgen <1>: Contributors. (line 65)
-* Kahrs, Ju"rgen: Acknowledgments. (line 53)
-* Kasal, Stepan: Acknowledgments. (line 53)
+* Kahrs, Ju"rgen: Acknowledgments. (line 59)
+* Kasal, Stepan: Acknowledgments. (line 59)
 * Kenobi, Obi-Wan: Undocumented. (line 6)
 * Kernighan, Brian <1>: Basic Data Typing. (line 71)
 * Kernighan, Brian <2>: Other Versions. (line 13)
 * Kernighan, Brian <3>: Contributors. (line 12)
 * Kernighan, Brian <4>: BTL. (line 6)
 * Kernighan, Brian <5>: Concatenation. (line 6)
-* Kernighan, Brian <6>: Acknowledgments. (line 62)
+* Kernighan, Brian <6>: Acknowledgments. (line 73)
 * Kernighan, Brian <7>: Conventions. (line 33)
 * Kernighan, Brian: History. (line 17)
 * kill command, dynamic profiling: Profiling. (line 185)
 * Knights, jedi: Undocumented. (line 6)
 * Kwok, Conrad: Contributors. (line 37)
+* l debugger command (alias for list): Miscellaneous Dgawk Commands.
+    (line 77)
 * labels.awk program: Labels Program. (line 48)
 * languages, data-driven: Basic High Level. (line 85)
 * LC_ALL locale category: Explaining gettext. (line 116)
 * LC_COLLATE locale category: Explaining gettext. (line 89)
 * LC_CTYPE locale category: Explaining gettext. (line 93)
 * LC_MESSAGES locale category: Explaining gettext. (line 83)
-* LC_MESSAGES locale category, bindtextdomain function (gawk): Programmer i18n.
+* LC_MESSAGES locale category, bindtextdomain() function (gawk): Programmer i18n.
     (line 86)
 * LC_MONETARY locale category: Explaining gettext. (line 99)
 * LC_NUMERIC locale category: Explaining gettext. (line 103)
@@ -24380,7 +25761,7 @@ Index
     (line 11)
 * left shift, bitwise: Bitwise Functions. (line 32)
 * leftmost longest match: Multiple Line. (line 26)
-* length function: String Functions. (line 71)
+* length() function: String Functions. (line 71)
 * Lesser General Public License (LGPL): Glossary. (line 373)
 * LGPL (Lesser General Public License): Glossary. (line 373)
 * libraries of awk functions: Library Functions. (line 6)
@@ -24422,23 +25803,25 @@ Index
 * lint checking, array subscripts: Uninitialized Subscripts.
     (line 42)
 * lint checking, empty programs: Command Line. (line 16)
-* lint checking, issuing warnings: Options. (line 144)
+* lint checking, issuing warnings: Options. (line 141)
 * lint checking, POSIXLY_CORRECT environment variable: Options.
-    (line 290)
+    (line 287)
 * lint checking, undefined functions: Function Caveats. (line 96)
 * LINT variable: User-modified. (line 97)
 * Linux <1>: Glossary. (line 582)
 * Linux <2>: Atari Compiling. (line 16)
 * Linux <3>: I18N Example. (line 55)
 * Linux: Manual History. (line 28)
+* list debugger command: Miscellaneous Dgawk Commands.
+    (line 77)
 * locale categories: Explaining gettext. (line 78)
-* locale decimal point character: Options. (line 224)
+* locale decimal point character: Options. (line 216)
 * locale, definition of: Locales. (line 6)
 * localization: I18N and L10N. (line 6)
 * localization, See internationalization, localization: I18N and L10N.
     (line 6)
 * log files, timestamps in: Time Functions. (line 6)
-* log function: Numeric Functions. (line 27)
+* log() function: Numeric Functions. (line 27)
 * logical false/true: Truth Values. (line 6)
 * logical operators, See Boolean expressions: Boolean Ops. (line 6)
 * login information: Passwd Functions. (line 16)
@@ -24450,11 +25833,11 @@ Index
 * loops, See Also while statement: While Statement. (line 6)
 * Lost In Space: Dynamic Extensions. (line 6)
 * ls utility: More Complex. (line 15)
-* lshift function (gawk): Bitwise Functions. (line 45)
+* lshift() function (gawk): Bitwise Functions. (line 45)
 * lvalues/rvalues: Assignment Ops. (line 32)
 * mailing labels, printing: Labels Program. (line 6)
-* mailing list, GNITS: Acknowledgments. (line 49)
-* make_builtin internal function: Internals. (line 107)
+* mailing list, GNITS: Acknowledgments. (line 51)
+* make_builtin internal function: Internals. (line 97)
 * make_number internal function: Internals. (line 82)
 * make_string internal function: Internals. (line 77)
 * mark parity: Ordinal Functions. (line 44)
@@ -24462,15 +25845,16 @@ Index
     (line 6)
 * marked strings, extracting: String Extraction. (line 6)
 * Marx, Groucho: Increment Ops. (line 61)
-* match function: String Functions. (line 112)
-* match function, RSTART/RLENGTH variables: String Functions. (line 129)
+* match() function: String Functions. (line 111)
+* match() function, RSTART/RLENGTH variables: String Functions.
+    (line 129)
 * matching, expressions, See comparison expressions: Typing and Comparison.
     (line 9)
 * matching, leftmost longest: Multiple Line. (line 26)
 * matching, null strings: Gory Details. (line 159)
 * mawk program: Other Versions. (line 34)
 * McPhee, Patrick: Contributors. (line 92)
-* memory, releasing: Internals. (line 102)
+* memory, releasing: Internals. (line 92)
 * message object files: Explaining gettext. (line 39)
 * message object files, converting from portable object files: I18N Example.
     (line 62)
@@ -24479,10 +25863,12 @@ Index
 * message object files, specifying directory of: Explaining gettext.
     (line 51)
 * metacharacters, escape sequences for: Escape Sequences. (line 132)
-* mktime function (gawk): Time Functions. (line 30)
+* mktime() function (gawk): Time Functions. (line 30)
 * modifiers, in format specifiers: Format Modifiers. (line 6)
 * monetary information, localization: Explaining gettext. (line 99)
 * msgfmt utility: I18N Example. (line 62)
+* n debugger command (alias for next): Dgawk Execution Control.
+    (line 43)
 * names, arrays/variables <1>: Library Names. (line 6)
 * names, arrays/variables: Arrays. (line 17)
 * names, functions <1>: Library Names. (line 6)
@@ -24490,15 +25876,17 @@ Index
 * namespace issues <1>: Library Names. (line 6)
 * namespace issues: Arrays. (line 17)
 * namespace issues, functions: Definition Syntax. (line 21)
+* nargs internal variable: Internals. (line 46)
 * nawk utility: Names. (line 17)
 * negative zero: Unexpected Results. (line 28)
 * NetBSD: Glossary. (line 582)
 * networks, programming: TCP/IP Networking. (line 6)
 * networks, support for: Special Network. (line 6)
-* newlines <1>: Options. (line 207)
+* newlines <1>: Options. (line 199)
 * newlines <2>: Boolean Ops. (line 67)
 * newlines: Statements/Lines. (line 6)
-* newlines, as field separators: Field Separators. (line 63)
+* newlines, as field separators: Default Field Splitting.
+    (line 6)
 * newlines, as record separators: Records. (line 20)
 * newlines, in dynamic regexps: Computed Regexps. (line 59)
 * newlines, in regexp constants: Computed Regexps. (line 69)
@@ -24506,7 +25894,9 @@ Index
 * newlines, separating statements in actions <1>: Statements. (line 10)
 * newlines, separating statements in actions: Action Overview.
     (line 19)
-* next file statement: POSIX/GNU. (line 154)
+* next debugger command: Dgawk Execution Control.
+    (line 43)
+* next file statement: POSIX/GNU. (line 155)
 * next file statement, in gawk: Nextfile Statement. (line 46)
 * next statement <1>: Next Statement. (line 6)
 * next statement: Boolean Ops. (line 85)
@@ -24520,16 +25910,20 @@ Index
 * nextfile statement, user-defined functions and: Nextfile Statement.
     (line 39)
 * nextfile user-defined function: Nextfile Function. (line 38)
-* NF variable <1>: Auto-set. (line 108)
+* nexti debugger command: Dgawk Execution Control.
+    (line 49)
+* NF variable <1>: Auto-set. (line 107)
 * NF variable: Fields. (line 33)
 * NF variable, decrementing: Changing Fields. (line 107)
+* ni debugger command (alias for nexti): Dgawk Execution Control.
+    (line 49)
 * noassign.awk program: Ignoring Assigns. (line 15)
 * NODE internal type: Internals. (line 23)
-* nodes, duplicating: Internals. (line 97)
+* nodes, duplicating: Internals. (line 87)
 * not Boolean-logic operator: Boolean Ops. (line 6)
-* NR variable <1>: Auto-set. (line 119)
+* NR variable <1>: Auto-set. (line 118)
 * NR variable: Records. (line 6)
-* NR variable, changing: Auto-set. (line 193)
+* NR variable, changing: Auto-set. (line 192)
 * null strings <1>: Basic Data Typing. (line 47)
 * null strings <2>: Truth Values. (line 6)
 * null strings <3>: Regexp Field Splitting.
@@ -24566,10 +25960,11 @@ Index
 * numeric, output format: OFMT. (line 6)
 * numeric, strings: Variable Typing. (line 6)
 * numeric, values: Internals. (line 27)
+* o debugger command (alias for option): Dgawk Info. (line 56)
 * oawk utility: Names. (line 17)
 * obsolete features: Obsolete. (line 6)
 * octal numbers: Nondecimal-numbers. (line 6)
-* octal values, enabling interpretation of: Options. (line 168)
+* octal values, enabling interpretation of: Options. (line 160)
 * OFMT variable <1>: User-modified. (line 114)
 * OFMT variable <2>: Conversion. (line 54)
 * OFMT variable: OFMT. (line 15)
@@ -24579,7 +25974,6 @@ Index
 * OFS variable: Changing Fields. (line 64)
 * OpenBSD: Glossary. (line 582)
 * OpenSolaris: Other Versions. (line 101)
-* operating systems, BSD-based <1>: Portal Files. (line 6)
 * operating systems, BSD-based: Manual History. (line 28)
 * operating systems, PC, gawk on: PC Using. (line 6)
 * operating systems, PC, gawk on, installing: PC Installation.
@@ -24612,6 +26006,7 @@ Index
     (line 48)
 * operators, word-boundary (gawk): GNU Regexp Operators.
     (line 63)
+* option debugger command: Dgawk Info. (line 56)
 * options, command-line <1>: Options. (line 6)
 * options, command-line <2>: Command Line Field Separator.
     (line 6)
@@ -24622,10 +26017,10 @@ Index
 * options, deprecated: Obsolete. (line 6)
 * options, long <1>: Options. (line 6)
 * options, long: Command Line. (line 13)
-* options, printing list of: Options. (line 139)
+* options, printing list of: Options. (line 136)
 * OR bitwise operation: Bitwise Functions. (line 6)
 * or Boolean-logic operator: Boolean Ops. (line 6)
-* or function (gawk): Bitwise Functions. (line 39)
+* or() function (gawk): Bitwise Functions. (line 39)
 * ord user-defined function: Ordinal Functions. (line 16)
 * order of evaluation, concatenation: Concatenation. (line 42)
 * ORS variable <1>: User-modified. (line 128)
@@ -24644,13 +26039,14 @@ Index
 * output, printing, See printing: Printing. (line 6)
 * output, records: Output Separators. (line 20)
 * output, standard: Special FD. (line 6)
+* p debugger command (alias for print): Viewing And Changing Data.
+    (line 36)
 * P1003.2 POSIX standard: Glossary. (line 426)
-* param_cnt internal variable: Internals. (line 46)
 * parameters, number of: Internals. (line 46)
 * parentheses (): Regexp Operators. (line 78)
 * parentheses (), pgawk program: Profiling. (line 144)
 * password file: Passwd Functions. (line 16)
-* patsplit function: String Functions. (line 200)
+* patsplit() function: String Functions. (line 200)
 * patterns: Patterns and Actions.
     (line 6)
 * patterns, comparison expressions as: Expression Patterns. (line 14)
@@ -24678,8 +26074,8 @@ Index
     (line 6)
 * pipes, input: Getline/Pipe. (line 6)
 * pipes, output: Redirection. (line 57)
-* Pitts, Dave: Bugs. (line 72)
-* Pitts, Davi: Acknowledgments. (line 53)
+* Pitts, Dave <1>: Bugs. (line 72)
+* Pitts, Dave: Acknowledgments. (line 59)
 * plus sign (+): Regexp Operators. (line 101)
 * plus sign (+), + operator: Precedence. (line 52)
 * plus sign (+), ++ operator <1>: Precedence. (line 46)
@@ -24697,31 +26093,30 @@ Index
 * portability, backslash continuation and: Statements/Lines. (line 30)
 * portability, backslash in escape sequences: Escape Sequences.
     (line 113)
-* portability, close function and: Close Files And Pipes.
+* portability, close() function and: Close Files And Pipes.
     (line 81)
-* portability, data files as single record: Records. (line 170)
+* portability, data files as single record: Records. (line 172)
 * portability, deleting array elements: Delete. (line 51)
 * portability, example programs: Library Functions. (line 31)
-* portability, fflush function and: I/O Functions. (line 29)
+* portability, fflush() function and: I/O Functions. (line 29)
 * portability, functions, defining: Definition Syntax. (line 94)
 * portability, gawk: New Ports. (line 6)
 * portability, gettext library and: Explaining gettext. (line 10)
 * portability, internationalization and: I18N Portability. (line 6)
-* portability, length function: String Functions. (line 80)
+* portability, length() function: String Functions. (line 80)
 * portability, new awk vs. old awk: Conversion. (line 54)
 * portability, next statement in user-defined functions: Function Caveats.
     (line 99)
-* portability, NF variable, decrementing: Changing Fields. (line 116)
+* portability, NF variable, decrementing: Changing Fields. (line 115)
 * portability, operators: Increment Ops. (line 61)
 * portability, operators, not in POSIX awk: Precedence. (line 98)
-* portability, POSIXLY_CORRECT environment variable: Options. (line 308)
-* portability, substr function: String Functions. (line 482)
+* portability, POSIXLY_CORRECT environment variable: Options. (line 305)
+* portability, substr() function: String Functions. (line 482)
 * portable object files <1>: Translator i18n. (line 6)
 * portable object files: Explaining gettext. (line 36)
 * portable object files, converting to message object files: I18N Example.
     (line 62)
-* portable object files, generating: Options. (line 131)
-* portal files: Portal Files. (line 6)
+* portable object files, generating: Options. (line 129)
 * porting gawk: New Ports. (line 6)
 * positional specifiers, printf statement <1>: Printf Ordering.
     (line 6)
@@ -24730,7 +26125,7 @@ Index
     (line 57)
 * positive zero: Unexpected Results. (line 28)
 * POSIX awk <1>: Assignment Ops. (line 136)
-* POSIX awk: This Manual. (line 13)
+* POSIX awk: This Manual. (line 14)
 * POSIX awk, **= operator and: Assignment Ops. (line 142)
 * POSIX awk, < operator and: Getline/File. (line 26)
 * POSIX awk, arithmetic operators and: Arithmetic Ops. (line 36)
@@ -24739,9 +26134,9 @@ Index
 * POSIX awk, BEGIN/END patterns: I/O And BEGIN/END. (line 16)
 * POSIX awk, break statement and: Break Statement. (line 47)
 * POSIX awk, changes in awk versions: POSIX. (line 6)
-* POSIX awk, character lists and: Character Lists. (line 24)
+* POSIX awk, character lists and: Character Lists. (line 23)
 * POSIX awk, character lists and, character classes: Character Lists.
-    (line 30)
+    (line 29)
 * POSIX awk, continue statement and: Continue Statement. (line 43)
 * POSIX awk, CONVFMT variable and: User-modified. (line 28)
 * POSIX awk, date utility and: Time Functions. (line 259)
@@ -24750,8 +26145,8 @@ Index
 * POSIX awk, field separators and: Fields. (line 6)
 * POSIX awk, FS variable and: User-modified. (line 66)
 * POSIX awk, function keyword in: Definition Syntax. (line 78)
-* POSIX awk, functions and, gsub/sub: Gory Details. (line 53)
-* POSIX awk, functions and, length: String Functions. (line 80)
+* POSIX awk, functions and, gsub()/sub(): Gory Details. (line 53)
+* POSIX awk, functions and, length(): String Functions. (line 80)
 * POSIX awk, GNU long options and: Options. (line 15)
 * POSIX awk, interval expressions in: Regexp Operators. (line 134)
 * POSIX awk, next/nextfile statements and: Next Statement. (line 39)
@@ -24763,14 +26158,16 @@ Index
 * POSIX awk, regular expressions and: Regexp Operators. (line 156)
 * POSIX awk, timestamps and: Time Functions. (line 6)
 * POSIX awk, | I/O operator and: Getline/Pipe. (line 52)
-* POSIX mode: Options. (line 200)
+* POSIX mode: Options. (line 192)
 * POSIX, awk and: Preface. (line 22)
 * POSIX, gawk extensions not included in: POSIX/GNU. (line 6)
 * POSIX, programs, implementing in awk: Clones. (line 6)
-* POSIXLY_CORRECT environment variable: Options. (line 290)
+* POSIXLY_CORRECT environment variable: Options. (line 287)
 * precedence <1>: Precedence. (line 6)
 * precedence: Increment Ops. (line 61)
 * precedence, regexp operators: Regexp Operators. (line 151)
+* print debugger command: Viewing And Changing Data.
+    (line 36)
 * print statement: Printing. (line 16)
 * print statement, BEGIN/END patterns and: I/O And BEGIN/END. (line 16)
 * print statement, commas, omitting: Print Examples. (line 31)
@@ -24779,7 +26176,9 @@ Index
 * print statement, OFMT variable and: User-modified. (line 123)
 * print statement, See Also redirection, of output: Redirection.
     (line 17)
-* print statement, sprintf function and: Round Function. (line 6)
+* print statement, sprintf() function and: Round Function. (line 6)
+* printf debugger command: Viewing And Changing Data.
+    (line 54)
 * printf statement <1>: Printf. (line 6)
 * printf statement: Printing. (line 16)
 * printf statement, columns, aligning: Print Examples. (line 70)
@@ -24794,10 +26193,10 @@ Index
     (line 57)
 * printf statement, See Also redirection, of output: Redirection.
     (line 17)
-* printf statement, sprintf function and: Round Function. (line 6)
+* printf statement, sprintf() function and: Round Function. (line 6)
 * printf statement, syntax of: Basic Printf. (line 6)
 * printing: Printing. (line 6)
-* printing, list of options: Options. (line 139)
+* printing, list of options: Options. (line 136)
 * printing, mailing labels: Labels Program. (line 6)
 * printing, unduplicated lines of text: Uniq Program. (line 6)
 * printing, user information: Id Program. (line 6)
@@ -24806,9 +26205,8 @@ Index
 * processing data: Basic High Level. (line 6)
 * PROCINFO array <1>: Group Functions. (line 6)
 * PROCINFO array <2>: Passwd Functions. (line 6)
-* PROCINFO array <3>: Auto-set. (line 124)
-* PROCINFO array: Special Caveats. (line 12)
-* PROCINFO variable: Internals. (line 165)
+* PROCINFO array: Auto-set. (line 123)
+* PROCINFO variable: Internals. (line 149)
 * profiling awk programs: Profiling. (line 6)
 * profiling awk programs, dynamically: Profiling. (line 177)
 * profiling gawk, See pgawk program: Profiling. (line 6)
@@ -24824,7 +26222,7 @@ Index
     (line 10)
 * programming conventions, functions, writing: Definition Syntax.
     (line 55)
-* programming conventions, gawk internals: Internal File Ops. (line 33)
+* programming conventions, gawk internals: Internal File Ops. (line 32)
 * programming conventions, nextfile statement: Nextfile Function.
     (line 20)
 * programming conventions, private variable names: Library Names.
@@ -24835,30 +26233,37 @@ Index
 * programming, basic steps: Basic High Level. (line 20)
 * programming, concepts: Basic Concepts. (line 6)
 * pwcat program: Passwd Functions. (line 23)
+* q debugger command (alias for quit): Miscellaneous Dgawk Commands.
+    (line 104)
 * QSE Awk: Other Versions. (line 126)
 * question mark (?) <1>: GNU Regexp Operators.
     (line 59)
 * question mark (?): Regexp Operators. (line 110)
 * question mark (?), ?: operator: Precedence. (line 92)
 * QuikTrim Awk: Other Versions. (line 119)
+* quit debugger command: Miscellaneous Dgawk Commands.
+    (line 104)
 * QUIT signal (MS-DOS): Profiling. (line 210)
 * quoting <1>: Comments. (line 27)
 * quoting <2>: Long. (line 26)
 * quoting: Read Terminal. (line 25)
 * quoting, rules for: Quoting. (line 6)
 * quoting, tricks for: Quoting. (line 71)
+* r debugger command (alias for run): Dgawk Execution Control.
+    (line 62)
 * Rakitzis, Byron: History Sorting. (line 25)
-* rand function: Numeric Functions. (line 40)
+* rand() function: Numeric Functions. (line 40)
 * random numbers, Cliff: Cliff Random Function.
     (line 6)
-* random numbers, rand/srand functions: Numeric Functions. (line 40)
+* random numbers, rand()/srand() functions: Numeric Functions.
+    (line 40)
 * random numbers, seed of: Numeric Functions. (line 70)
 * range expressions: Character Lists. (line 6)
 * range patterns: Ranges. (line 6)
 * Rankin, Pat <1>: Bugs. (line 71)
 * Rankin, Pat <2>: Contributors. (line 35)
 * Rankin, Pat <3>: Assignment Ops. (line 100)
-* Rankin, Pat: Acknowledgments. (line 53)
+* Rankin, Pat: Acknowledgments. (line 59)
 * raw sockets: TCP/IP Networking. (line 36)
 * readable data files, checking: File Checking. (line 6)
 * readable.awk program: File Checking. (line 11)
@@ -24874,7 +26279,7 @@ Index
 * records, printing: Print. (line 22)
 * records, splitting input into: Records. (line 6)
 * records, terminating: Records. (line 112)
-* records, treating files as: Records. (line 191)
+* records, treating files as: Records. (line 193)
 * recursive functions: Definition Syntax. (line 73)
 * redirection of input: Getline/File. (line 6)
 * redirection of output: Redirection. (line 6)
@@ -24882,7 +26287,7 @@ Index
 * regexp constants <1>: Comparison Operators.
     (line 102)
 * regexp constants <2>: Regexp Constants. (line 6)
-* regexp constants: Regexp Usage. (line 58)
+* regexp constants: Regexp Usage. (line 57)
 * regexp constants, /=.../, /= operator and: Assignment Ops. (line 148)
 * regexp constants, as patterns: Expression Patterns. (line 36)
 * regexp constants, in gawk: Using Constant Regexps.
@@ -24890,10 +26295,10 @@ Index
 * regexp constants, slashes vs. quotes: Computed Regexps. (line 28)
 * regexp constants, vs. string constants: Computed Regexps. (line 38)
 * regexp, See regular expressions: Regexp. (line 6)
-* register_deferred_variable internal function: Internals. (line 165)
-* register_open_hook internal function: Internals. (line 178)
+* register_deferred_variable internal function: Internals. (line 149)
+* register_open_hook internal function: Internals. (line 162)
 * regular expressions: Regexp. (line 6)
-* regular expressions as field separators: Field Separators. (line 49)
+* regular expressions as field separators: Field Separators. (line 50)
 * regular expressions, anchors in: Regexp Operators. (line 22)
 * regular expressions, as field separators: Regexp Field Splitting.
     (line 6)
@@ -24904,13 +26309,13 @@ Index
 * regular expressions, case sensitivity: Case-sensitivity. (line 6)
 * regular expressions, computed: Computed Regexps. (line 6)
 * regular expressions, constants, See regexp constants: Regexp Usage.
-    (line 58)
+    (line 57)
 * regular expressions, dynamic: Computed Regexps. (line 6)
 * regular expressions, dynamic, with embedded newlines: Computed Regexps.
     (line 59)
 * regular expressions, gawk, command-line options: GNU Regexp Operators.
     (line 70)
-* regular expressions, interval expressions and: Options. (line 236)
+* regular expressions, interval expressions and: Options. (line 228)
 * regular expressions, leftmost longest match: Leftmost Longest.
     (line 6)
 * regular expressions, operators <1>: Regexp Operators. (line 6)
@@ -24926,9 +26331,11 @@ Index
 * regular expressions, searching for: Egrep Program. (line 6)
 * relational operators, See comparison operators: Typing and Comparison.
     (line 9)
+* return debugger command: Dgawk Execution Control.
+    (line 54)
 * return statement, user-defined functions: Return Statement. (line 6)
-* return values, close function: Close Files And Pipes.
-    (line 130)
+* return values, close() function: Close Files And Pipes.
+    (line 131)
 * rev user-defined function: Function Example. (line 52)
 * rewind user-defined function: Rewind Function. (line 16)
 * right angle bracket (>), > operator <1>: Precedence. (line 65)
@@ -24942,8 +26349,8 @@ Index
 * right angle bracket (>), >> operator (I/O): Redirection. (line 50)
 * right shift, bitwise: Bitwise Functions. (line 32)
 * Ritchie, Dennis: Basic Data Typing. (line 71)
-* RLENGTH variable: Auto-set. (line 169)
-* RLENGTH variable, match function and: String Functions. (line 129)
+* RLENGTH variable: Auto-set. (line 168)
+* RLENGTH variable, match() function and: String Functions. (line 129)
 * Robbins, Arnold <1>: Future Extensions. (line 6)
 * Robbins, Arnold <2>: Bugs. (line 29)
 * Robbins, Arnold <3>: Contributors. (line 95)
@@ -24953,11 +26360,11 @@ Index
 * Robbins, Arnold: Command Line Field Separator.
     (line 80)
 * Robbins, Bill: Getline/Pipe. (line 36)
-* Robbins, Harry: Acknowledgments. (line 72)
-* Robbins, Jean: Acknowledgments. (line 72)
+* Robbins, Harry: Acknowledgments. (line 79)
+* Robbins, Jean: Acknowledgments. (line 79)
 * Robbins, Miriam <1>: Passwd Functions. (line 76)
 * Robbins, Miriam <2>: Getline/Pipe. (line 36)
-* Robbins, Miriam: Acknowledgments. (line 72)
+* Robbins, Miriam: Acknowledgments. (line 79)
 * Robinson, Will: Dynamic Extensions. (line 6)
 * robot, the: Dynamic Extensions. (line 6)
 * Rommel, Kai Uwe: Contributors. (line 42)
@@ -24967,21 +26374,25 @@ Index
 * RS variable <1>: User-modified. (line 133)
 * RS variable: Records. (line 20)
 * RS variable, multiline records and: Multiple Line. (line 17)
-* rshift function (gawk): Bitwise Functions. (line 46)
-* RSTART variable: Auto-set. (line 175)
-* RSTART variable, match function and: String Functions. (line 129)
-* RT variable <1>: Auto-set. (line 182)
+* rshift() function (gawk): Bitwise Functions. (line 46)
+* RSTART variable: Auto-set. (line 174)
+* RSTART variable, match() function and: String Functions. (line 129)
+* RT variable <1>: Auto-set. (line 181)
 * RT variable <2>: Multiple Line. (line 129)
 * RT variable: Records. (line 112)
 * Rubin, Paul <1>: Contributors. (line 16)
 * Rubin, Paul: History. (line 30)
 * rule, definition of: Getting Started. (line 21)
+* run debugger command: Dgawk Execution Control.
+    (line 62)
 * rvalues/lvalues: Assignment Ops. (line 32)
-* sandbox mode: Options. (line 243)
+* s debugger command (alias for step): Dgawk Execution Control.
+    (line 68)
+* sandbox mode: Options. (line 235)
 * scalar values: Basic Data Typing. (line 13)
-* Schorr, Andrew: Acknowledgments. (line 53)
-* Schreiber, Bert: Acknowledgments. (line 35)
-* Schreiber, Rita: Acknowledgments. (line 35)
+* Schorr, Andrew: Acknowledgments. (line 59)
+* Schreiber, Bert: Acknowledgments. (line 37)
+* Schreiber, Rita: Acknowledgments. (line 37)
 * search paths <1>: VMS Running. (line 28)
 * search paths: PC Using. (line 11)
 * search paths, for source files <1>: VMS Running. (line 28)
@@ -25009,7 +26420,8 @@ Index
 * separators, for statements in actions: Action Overview. (line 19)
 * separators, record: User-modified. (line 133)
 * separators, subscript: User-modified. (line 146)
-* set_value internal function: Internals. (line 146)
+* set debugger command: Viewing And Changing Data.
+    (line 59)
 * shells, piping commands into: Redirection. (line 143)
 * shells, quoting: Using Shell Variables.
     (line 12)
@@ -25019,11 +26431,13 @@ Index
     (line 6)
 * shift, bitwise: Bitwise Functions. (line 32)
 * short-circuit operators: Boolean Ops. (line 57)
+* si debugger command (alias for stepi): Dgawk Execution Control.
+    (line 76)
 * side effects <1>: Increment Ops. (line 11)
 * side effects: Concatenation. (line 42)
 * side effects, array indexing: Reference to Elements.
     (line 30)
-* side effects, asort function: Array Sorting. (line 25)
+* side effects, asort() function: Array Sorting. (line 25)
 * side effects, assignment expressions: Assignment Ops. (line 23)
 * side effects, Boolean operators: Boolean Ops. (line 30)
 * side effects, conditional expressions: Conditional Exp. (line 22)
@@ -25035,7 +26449,9 @@ Index
 * signals, INT/SIGINT (MS-DOS): Profiling. (line 210)
 * signals, QUIT/SIGQUIT (MS-DOS): Profiling. (line 210)
 * signals, USR1/SIGUSR1: Profiling. (line 185)
-* sin function: Numeric Functions. (line 31)
+* silent debugger command: Dgawk Execution Control.
+    (line 10)
+* sin() function: Numeric Functions. (line 31)
 * single quote (') <1>: Quoting. (line 31)
 * single quote (') <2>: Long. (line 33)
 * single quote ('): One-shot. (line 15)
@@ -25057,22 +26473,22 @@ Index
 * source code, Bell Laboratories awk: Other Versions. (line 13)
 * source code, gawk: Gawk Distribution. (line 6)
 * source code, mawk: Other Versions. (line 34)
-* source code, mixing: Options. (line 104)
+* source code, mixing: Options. (line 102)
 * source files, search path for: Igawk Program. (line 358)
 * sparse arrays: Array Intro. (line 72)
 * Spencer, Henry: Glossary. (line 12)
-* split function: String Functions. (line 215)
-* split function, array elements, deleting: Delete. (line 56)
-* split utility: Split Program. (line 6)
+* split() function: String Functions. (line 215)
+* split() function, array elements, deleting: Delete. (line 56)
+* split() utility: Split Program. (line 6)
 * split.awk program: Split Program. (line 30)
-* sprintf function <1>: String Functions. (line 278)
-* sprintf function: OFMT. (line 15)
-* sprintf function, OFMT variable and: User-modified. (line 123)
-* sprintf function, print/printf statements and: Round Function.
+* sprintf() function <1>: String Functions. (line 278)
+* sprintf() function: OFMT. (line 15)
+* sprintf() function, OFMT variable and: User-modified. (line 123)
+* sprintf() function, print/printf statements and: Round Function.
     (line 6)
-* sqrt function: Numeric Functions. (line 18)
+* sqrt() function: Numeric Functions. (line 18)
 * square brackets ([]): Regexp Operators. (line 55)
-* srand function: Numeric Functions. (line 80)
+* srand() function: Numeric Functions. (line 80)
 * Stallman, Richard <1>: Glossary. (line 286)
 * Stallman, Richard <2>: Contributors. (line 24)
 * Stallman, Richard <3>: Acknowledgments. (line 18)
@@ -25084,12 +26500,16 @@ Index
 * statements, compound, control statements and: Statements. (line 10)
 * statements, control, in actions: Statements. (line 6)
 * statements, multiple: Statements/Lines. (line 90)
+* step debugger command: Dgawk Execution Control.
+    (line 68)
+* stepi debugger command: Dgawk Execution Control.
+    (line 76)
 * stlen internal variable: Internals. (line 50)
 * stptr internal variable: Internals. (line 50)
 * stream editors <1>: Simple Sed. (line 6)
 * stream editors: Field Splitting Summary.
     (line 47)
-* strftime function (gawk): Time Functions. (line 53)
+* strftime() function (gawk): Time Functions. (line 53)
 * string constants: Scalar Constants. (line 15)
 * string constants, vs. regexp constants: Computed Regexps. (line 38)
 * string extraction (internationalization): String Extraction.
@@ -25110,14 +26530,14 @@ Index
     (line 43)
 * strings, numeric: Variable Typing. (line 6)
 * strings, splitting: String Functions. (line 234)
-* strtonum function (gawk): String Functions. (line 286)
-* strtonum function (gawk), --non-decimal-data option and: Nondecimal Data.
+* strtonum() function (gawk): String Functions. (line 286)
+* strtonum() function (gawk), --non-decimal-data option and: Nondecimal Data.
     (line 36)
-* sub function <1>: String Functions. (line 307)
-* sub function: Using Constant Regexps.
+* sub() function <1>: String Functions. (line 307)
+* sub() function: Using Constant Regexps.
     (line 44)
-* sub function, arguments of: String Functions. (line 364)
-* sub function, escape processing: Gory Details. (line 6)
+* sub() function, arguments of: String Functions. (line 364)
+* sub() function, escape processing: Gory Details. (line 6)
 * subscript separators: User-modified. (line 146)
 * subscripts in arrays, multidimensional: Multi-dimensional. (line 10)
 * subscripts in arrays, multidimensional, scanning: Multi-scanning.
@@ -25129,14 +26549,16 @@ Index
 * SUBSEP variable: User-modified. (line 146)
 * SUBSEP variable, multidimensional arrays: Multi-dimensional.
     (line 16)
-* substr function: String Functions. (line 451)
+* substr() function: String Functions. (line 451)
 * Sumner, Andrew: Other Versions. (line 81)
 * switch statement: Switch Statement. (line 6)
 * syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops.
     (line 148)
-* system function: I/O Functions. (line 64)
-* systime function (gawk): Time Functions. (line 24)
+* system() function: I/O Functions. (line 64)
+* systime() function (gawk): Time Functions. (line 24)
+* t debugger command (alias for tbreak): Breakpoint Control. (line 83)
 * tandem: Tandem Installation. (line 6)
+* tbreak debugger command: Breakpoint Control. (line 83)
 * Tcl: Library Names. (line 57)
 * TCP/IP: TCP/IP Networking. (line 6)
 * TCP/IP, support for: Special Network. (line 6)
@@ -25179,17 +26601,17 @@ Index
 * timestamps, converting dates to: Time Functions. (line 72)
 * timestamps, formatted: Gettimeofday Function.
     (line 6)
-* tmp_number internal function: Internals. (line 92)
-* tmp_string internal function: Internals. (line 87)
-* tolower function: String Functions. (line 493)
-* toupper function: String Functions. (line 499)
+* tolower() function: String Functions. (line 493)
+* toupper() function: String Functions. (line 499)
 * tr utility: Translate Program. (line 6)
+* trace debugger command: Miscellaneous Dgawk Commands.
+    (line 113)
 * translate.awk program: Translate Program. (line 55)
-* troubleshooting, --non-decimal-data option: Options. (line 171)
+* troubleshooting, --non-decimal-data option: Options. (line 163)
 * troubleshooting, -F option: Known Bugs. (line 6)
 * troubleshooting, == operator: Comparison Operators.
     (line 37)
-* troubleshooting, awk uses FS not IFS: Field Separators. (line 28)
+* troubleshooting, awk uses FS not IFS: Field Separators. (line 29)
 * troubleshooting, backslash before nonspecial character: Escape Sequences.
     (line 113)
 * troubleshooting, division: Arithmetic Ops. (line 44)
@@ -25197,7 +26619,7 @@ Index
     (line 22)
 * troubleshooting, fatal errors, printf format strings: Format Modifiers.
     (line 159)
-* troubleshooting, fflush function: I/O Functions. (line 52)
+* troubleshooting, fflush() function: I/O Functions. (line 52)
 * troubleshooting, function call syntax: Function Calls. (line 28)
 * troubleshooting, gawk <1>: Compatibility Mode. (line 6)
 * troubleshooting, gawk: Known Bugs. (line 6)
@@ -25205,33 +26627,37 @@ Index
 * troubleshooting, gawk, fatal errors, function arguments: Calling Built-in.
     (line 16)
 * troubleshooting, getline function: File Checking. (line 24)
-* troubleshooting, gsub/sub functions: String Functions. (line 374)
-* troubleshooting, match function: String Functions. (line 195)
+* troubleshooting, gsub()/sub() functions: String Functions. (line 374)
+* troubleshooting, match() function: String Functions. (line 195)
 * troubleshooting, print statement, omitting commas: Print Examples.
     (line 31)
 * troubleshooting, printing: Redirection. (line 118)
-* troubleshooting, quotes with file names: Special FD. (line 63)
+* troubleshooting, quotes with file names: Special FD. (line 66)
 * troubleshooting, readable data files: File Checking. (line 6)
 * troubleshooting, regexp constants vs. string constants: Computed Regexps.
     (line 38)
 * troubleshooting, string concatenation: Concatenation. (line 27)
-* troubleshooting, substr function: String Functions. (line 469)
-* troubleshooting, system function: I/O Functions. (line 88)
+* troubleshooting, substr() function: String Functions. (line 469)
+* troubleshooting, system() function: I/O Functions. (line 88)
 * troubleshooting, typographical errors, global variables: Options.
-    (line 94)
+    (line 92)
 * true, logical: Truth Values. (line 6)
 * Trueman, David <1>: Contributors. (line 31)
-* Trueman, David <2>: Acknowledgments. (line 44)
+* Trueman, David <2>: Acknowledgments. (line 46)
 * Trueman, David: History. (line 30)
 * trunc-mod operation: Arithmetic Ops. (line 66)
 * truth values: Truth Values. (line 6)
 * type conversion: Conversion. (line 21)
 * type internal variable: Internals. (line 58)
+* u debugger command (alias for until): Dgawk Execution Control.
+    (line 83)
 * undefined functions: Function Caveats. (line 79)
 * underscore (_), _ C macro: Explaining gettext. (line 68)
 * underscore (_), in names of private variables: Library Names.
     (line 29)
 * underscore (_), translatable string: Programmer i18n. (line 67)
+* undisplay debugger command: Viewing And Changing Data.
+    (line 80)
 * undocumented features: Undocumented. (line 6)
 * uninitialized variables, as array subscripts: Uninitialized Subscripts.
     (line 6)
@@ -25240,14 +26666,20 @@ Index
 * Unix: Glossary. (line 582)
 * Unix awk, backslashes in escape sequences: Escape Sequences.
     (line 125)
-* Unix awk, close function and: Close Files And Pipes.
-    (line 130)
+* Unix awk, close() function and: Close Files And Pipes.
+    (line 131)
 * Unix awk, password files, field separators and: Command Line Field Separator.
     (line 72)
 * Unix, awk scripts and: Executable Scripts. (line 6)
+* unref internal function: Internals. (line 92)
 * unsigned integers: Basic Data Typing. (line 28)
-* update_ERRNO internal function: Internals. (line 152)
-* update_ERRNO_saved internal function: Internals. (line 157)
+* until debugger command: Dgawk Execution Control.
+    (line 83)
+* unwatch debugger command: Viewing And Changing Data.
+    (line 84)
+* up debugger command: Dgawk Stack. (line 33)
+* update_ERRNO internal function: Internals. (line 136)
+* update_ERRNO_saved internal function: Internals. (line 141)
 * user database, reading: Passwd Functions. (line 6)
 * user-defined, functions: User-defined.
(line 6) * user-defined, functions, counts: Profiling. (line 135) @@ -25276,7 +26708,7 @@ Index (line 6) * variables, getline command into, using: Getline/Variable. (line 6) * variables, global, for library functions: Library Names. (line 11) -* variables, global, printing list of: Options. (line 90) +* variables, global, printing list of: Options. (line 88) * variables, initializing: Using Variables. (line 18) * variables, names of: Arrays. (line 17) * variables, private: Library Names. (line 11) @@ -25294,26 +26726,29 @@ Index * vertical bar (|), |& I/O operator (I/O): Two-way I/O. (line 44) * vertical bar (|), |& operator (I/O) <1>: Precedence. (line 65) * vertical bar (|), |& operator (I/O): Getline/Coprocess. (line 6) -* vertical bar (|), |& operator (I/O), two-way communications: Portal Files. - (line 10) * vertical bar (|), || operator <1>: Precedence. (line 89) * vertical bar (|), || operator: Boolean Ops. (line 57) -* Vinschen, Corinna: Acknowledgments. (line 53) +* Vinschen, Corinna: Acknowledgments. (line 59) * vname internal variable: Internals. (line 62) +* w debugger command (alias for watch): Viewing And Changing Data. + (line 67) * w utility: Constant Size. (line 22) * Wall, Larry <1>: Future Extensions. (line 6) * Wall, Larry: Array Intro. (line 6) -* Wallin, Anders: Acknowledgments. (line 53) -* warnings, issuing: Options. (line 144) +* Wallin, Anders: Acknowledgments. (line 59) +* warnings, issuing: Options. (line 141) +* watch debugger command: Viewing And Changing Data. + (line 67) * wc utility: Wc Program. (line 6) * wc.awk program: Wc Program. (line 45) * Weinberger, Peter <1>: Contributors. (line 12) * Weinberger, Peter: History. (line 17) * while statement <1>: While Statement. (line 6) * while statement: Regexp Usage. (line 19) -* whitespace, as field separators: Field Separators. (line 63) +* whitespace, as field separators: Default Field Splitting. + (line 6) * whitespace, functions, calling: Calling Built-in. 
(line 10) -* whitespace, newlines as: Options. (line 207) +* whitespace, newlines as: Options. (line 199) * Williams, Kent: Contributors. (line 37) * Woehlke, Matthew: Contributors. (line 72) * Woods, John: Contributors. (line 28) @@ -25328,11 +26763,11 @@ Index * words, duplicate, searching for: Dupword Program. (line 6) * words, usage counts, generating: Word Sorting. (line 6) * xgettext utility: String Extraction. (line 13) -* XML: Internals. (line 178) +* XML: Internals. (line 162) * XOR bitwise operation: Bitwise Functions. (line 6) -* xor function (gawk): Bitwise Functions. (line 41) +* xor() function (gawk): Bitwise Functions. (line 41) * Zaretskii, Eli <1>: Bugs. (line 69) -* Zaretskii, Eli: Acknowledgments. (line 53) +* Zaretskii, Eli: Acknowledgments. (line 59) * zero, negative vs. positive: Unexpected Results. (line 28) * zerofile.awk program: Empty Files. (line 21) * Zoulas, Christos: Contributors. (line 62) @@ -25348,9 +26783,7 @@ Index * | (vertical bar), |& operator (I/O) <3>: Redirection. (line 102) * | (vertical bar), |& operator (I/O): Getline/Coprocess. (line 6) * | (vertical bar), |& operator (I/O), pipes, closing: Close Files And Pipes. - (line 117) -* | (vertical bar), |& operator (I/O), two-way communications: Portal Files. - (line 10) + (line 118) * | (vertical bar), || operator <1>: Precedence. (line 89) * | (vertical bar), || operator: Boolean Ops. (line 57) * ~ (tilde), ~ operator <1>: Expression Patterns. (line 24) @@ -25359,385 +26792,406 @@ Index (line 11) * ~ (tilde), ~ operator <4>: Regexp Constants. (line 6) * ~ (tilde), ~ operator <5>: Computed Regexps. (line 6) -* ~ (tilde), ~ operator: Case-sensitivity. (line 26) +* ~ (tilde), ~ operator <6>: Case-sensitivity. (line 26) +* ~ (tilde), ~ operator: Regexp Usage. 
(line 19) Tag Table: Node: Top1340 -Node: Foreword28547 -Node: Preface32868 -Ref: Preface-Footnote-135737 -Node: History35969 -Node: Names38185 -Ref: Names-Footnote-139657 -Node: This Manual39729 -Ref: This Manual-Footnote-144484 -Node: Conventions44584 -Node: Manual History46458 -Ref: Manual History-Footnote-149911 -Ref: Manual History-Footnote-249952 -Node: How To Contribute50026 -Node: Acknowledgments51170 -Node: Getting Started55048 -Node: Running gawk57420 -Node: One-shot58606 -Node: Read Terminal59831 -Ref: Read Terminal-Footnote-161492 -Ref: Read Terminal-Footnote-261768 -Node: Long61939 -Node: Executable Scripts63315 -Ref: Executable Scripts-Footnote-165211 -Ref: Executable Scripts-Footnote-265362 -Node: Comments65813 -Node: Quoting68181 -Node: DOS Quoting72758 -Node: Sample Data Files73430 -Node: Very Simple76462 -Node: Two Rules81067 -Node: More Complex83214 -Ref: More Complex-Footnote-186137 -Ref: More Complex-Footnote-286585 -Node: Statements/Lines86668 -Ref: Statements/Lines-Footnote-191050 -Node: Other Features91315 -Node: When92167 -Node: Regexp94423 -Node: Regexp Usage95877 -Node: Escape Sequences97929 -Node: Regexp Operators103668 -Ref: Regexp Operators-Footnote-1110877 -Ref: Regexp Operators-Footnote-2111024 -Node: Character Lists111122 -Ref: table-char-classes113079 -Node: GNU Regexp Operators115704 -Node: Case-sensitivity119434 -Ref: Case-sensitivity-Footnote-1122607 -Node: Leftmost Longest122842 -Node: Computed Regexps124033 -Node: Locales127414 -Node: Reading Files129680 -Node: Records131692 -Ref: Records-Footnote-1140250 -Node: Fields140287 -Ref: Fields-Footnote-1143317 -Node: Nonconstant Fields143403 -Node: Changing Fields145605 -Node: Field Separators150926 -Node: Regexp Field Splitting154417 -Node: Single Character Fields157970 -Node: Command Line Field Separator159021 -Node: Field Splitting Summary162460 -Ref: Field Splitting Summary-Footnote-1165646 -Node: Constant Size165747 -Node: Splitting By Content170231 -Ref: Splitting By 
Content-Footnote-1173460 -Node: Multiple Line173500 -Ref: Multiple Line-Footnote-1179238 -Node: Getline179417 -Node: Plain Getline181620 -Node: Getline/Variable183707 -Node: Getline/File184848 -Node: Getline/Variable/File186172 -Ref: Getline/Variable/File-Footnote-1187769 -Node: Getline/Pipe187856 -Node: Getline/Variable/Pipe190453 -Node: Getline/Coprocess191560 -Node: Getline/Variable/Coprocess192803 -Node: Getline Notes193517 -Node: Getline Summary195160 -Ref: table-getline-variants195444 -Node: BEGINFILE/ENDFILE196010 -Node: Command line directories198848 -Node: Printing199525 -Node: Print201154 -Node: Print Examples202480 -Node: Output Separators205275 -Node: OFMT207036 -Node: Printf208391 -Node: Basic Printf209310 -Node: Control Letters210845 -Node: Format Modifiers214728 -Node: Printf Examples220738 -Node: Redirection223455 -Node: Special Files230451 -Node: Special FD231014 -Node: Special Network234207 -Node: Special Caveats235062 -Ref: Special Caveats-Footnote-1236260 -Node: Close Files And Pipes236643 -Ref: Close Files And Pipes-Footnote-1243564 -Ref: Close Files And Pipes-Footnote-2243712 -Node: Expressions243860 -Node: Values244929 -Node: Constants245601 -Node: Scalar Constants246281 -Ref: Scalar Constants-Footnote-1247140 -Node: Nondecimal-numbers247322 -Node: Regexp Constants250384 -Node: Using Constant Regexps250859 -Node: Variables253941 -Node: Using Variables254596 -Node: Assignment Options256170 -Node: Conversion258051 -Ref: table-locale-affects263458 -Ref: Conversion-Footnote-1264082 -Node: All Operators264191 -Node: Arithmetic Ops264821 -Node: Concatenation267320 -Ref: Concatenation-Footnote-1270108 -Node: Assignment Ops270199 -Ref: table-assign-ops275183 -Node: Increment Ops276584 -Node: Truth Values and Conditions280062 -Node: Truth Values281145 -Node: Typing and Comparison282193 -Node: Variable Typing282914 -Ref: Variable Typing-Footnote-1286602 -Node: Comparison Operators286746 -Ref: table-relational-ops287124 -Node: Boolean Ops290673 -Ref: 
Boolean Ops-Footnote-1294751 -Node: Conditional Exp294842 -Node: Function Calls296574 -Node: Precedence300125 -Node: Patterns and Actions303775 -Node: Pattern Overview304829 -Node: Regexp Patterns306266 -Node: Expression Patterns306809 -Node: Ranges310359 -Node: BEGIN/END313448 -Node: Using BEGIN/END314198 -Ref: Using BEGIN/END-Footnote-1316930 -Node: I/O And BEGIN/END317044 -Node: Empty319311 -Node: Using Shell Variables319619 -Node: Action Overview321900 -Node: Statements324258 -Node: If Statement326114 -Node: While Statement327613 -Node: Do Statement329645 -Node: For Statement330794 -Node: Switch Statement333934 -Node: Break Statement335982 -Node: Continue Statement338039 -Node: Next Statement339943 -Node: Nextfile Statement342223 -Node: Exit Statement344939 -Node: Built-in Variables347210 -Node: User-modified348305 -Ref: User-modified-Footnote-1356230 -Node: Auto-set356292 -Ref: Auto-set-Footnote-1364980 -Node: ARGC and ARGV365185 -Node: Arrays368946 -Node: Array Basics370393 -Node: Array Intro371104 -Node: Reference to Elements375490 -Node: Assigning Elements377389 -Node: Array Example377880 -Node: Scanning an Array379612 -Node: Delete381887 -Ref: Delete-Footnote-1384275 -Node: Numeric Array Subscripts384332 -Node: Uninitialized Subscripts386519 -Node: Multi-dimensional388125 -Node: Multi-scanning391216 -Node: Array Sorting392796 -Node: Functions396578 -Node: Built-in397387 -Node: Calling Built-in398357 -Node: Numeric Functions400324 -Ref: Numeric Functions-Footnote-1404066 -Ref: Numeric Functions-Footnote-2404392 -Node: String Functions404661 -Ref: String Functions-Footnote-1426447 -Ref: String Functions-Footnote-2426576 -Ref: String Functions-Footnote-3426824 -Node: Gory Details426911 -Ref: table-sub-escapes428546 -Ref: table-sub-posix-92429881 -Ref: table-sub-proposed431220 -Ref: table-posix-2001-sub432572 -Ref: table-gensub-escapes433841 -Ref: Gory Details-Footnote-1435027 -Node: I/O Functions435078 -Ref: I/O Functions-Footnote-1441823 -Node: Time 
Functions441914 -Ref: Time Functions-Footnote-1452704 -Ref: Time Functions-Footnote-2452772 -Ref: Time Functions-Footnote-3452930 -Ref: Time Functions-Footnote-4453041 -Ref: Time Functions-Footnote-5453166 -Ref: Time Functions-Footnote-6453393 -Node: Bitwise Functions453655 -Ref: table-bitwise-ops454233 -Ref: Bitwise Functions-Footnote-1458467 -Node: I18N Functions458651 -Node: User-defined460372 -Node: Definition Syntax461176 -Node: Function Example465874 -Node: Function Caveats468454 -Node: Return Statement472379 -Node: Dynamic Typing475036 -Node: Indirect Calls475773 -Node: Internationalization485408 -Node: I18N and L10N486827 -Node: Explaining gettext487511 -Ref: Explaining gettext-Footnote-1492418 -Ref: Explaining gettext-Footnote-2492657 -Node: Programmer i18n492826 -Node: Translator i18n497049 -Node: String Extraction497840 -Ref: String Extraction-Footnote-1498793 -Node: Printf Ordering498919 -Ref: Printf Ordering-Footnote-1501697 -Node: I18N Portability501761 -Ref: I18N Portability-Footnote-1504189 -Node: I18N Example504252 -Ref: I18N Example-Footnote-1506866 -Node: Gawk I18N506938 -Node: Advanced Features507516 -Node: Nondecimal Data508915 -Node: Two-way I/O510474 -Ref: Two-way I/O-Footnote-1515955 -Node: TCP/IP Networking516032 -Node: Portal Files518825 -Node: Profiling519469 -Node: Invoking Gawk526925 -Node: Command Line528159 -Node: Options528944 -Ref: Options-Footnote-1542064 -Node: Other Arguments542089 -Node: AWKPATH Variable544770 -Ref: AWKPATH Variable-Footnote-1547545 -Node: Exit Status547805 -Node: Obsolete548472 -Node: Undocumented549271 -Node: Known Bugs549533 -Node: Library Functions550135 -Ref: Library Functions-Footnote-1553116 -Node: Library Names553287 -Ref: Library Names-Footnote-1556760 -Ref: Library Names-Footnote-2556979 -Node: General Functions557065 -Node: Nextfile Function558124 -Node: Strtonum Function562488 -Node: Assert Function565423 -Node: Round Function568727 -Node: Cliff Random Function570260 -Node: Ordinal Functions571273 
-Ref: Ordinal Functions-Footnote-1574333 -Node: Join Function574549 -Ref: Join Function-Footnote-1576309 -Node: Gettimeofday Function576509 -Node: Data File Management580212 -Node: Filetrans Function580844 -Node: Rewind Function584270 -Node: File Checking585716 -Node: Empty Files586746 -Node: Ignoring Assigns588971 -Node: Getopt Function590519 -Ref: Getopt Function-Footnote-1601797 -Node: Passwd Functions601998 -Ref: Passwd Functions-Footnote-1610978 -Node: Group Functions611066 -Node: Sample Programs619163 -Node: Running Examples619840 -Node: Clones620568 -Node: Cut Program621700 -Node: Egrep Program631457 -Ref: Egrep Program-Footnote-1639207 -Node: Id Program639317 -Node: Split Program642924 -Node: Tee Program646388 -Node: Uniq Program649131 -Node: Wc Program656499 -Ref: Wc Program-Footnote-1660743 -Node: Miscellaneous Programs660939 -Node: Dupword Program662059 -Node: Alarm Program664090 -Node: Translate Program668630 -Ref: Translate Program-Footnote-1672998 -Ref: Translate Program-Footnote-2673235 -Node: Labels Program673369 -Ref: Labels Program-Footnote-1676660 -Node: Word Sorting676744 -Node: History Sorting681087 -Node: Extract Program682925 -Node: Simple Sed690277 -Node: Igawk Program693332 -Ref: Igawk Program-Footnote-1708063 -Ref: Igawk Program-Footnote-2708264 -Node: Signature Program708402 -Node: Language History709482 -Node: V7/SVR3.1710866 -Node: SVR4713139 -Node: POSIX714578 -Node: BTL716186 -Node: POSIX/GNU717865 -Node: Contributors727081 -Node: Installation730680 -Node: Gawk Distribution731651 -Node: Getting732135 -Node: Extracting732961 -Node: Distribution contents734349 -Node: Unix Installation739430 -Node: Quick Installation740021 -Node: Additional Configuration Options741723 -Node: Configuration Philosophy743641 -Node: Non-Unix Installation746005 -Node: PC Installation746470 -Node: PC Binary Installation747776 -Node: PC Compiling749619 -Node: PC Dynamic754124 -Node: PC Using756485 -Node: Cygwin761035 -Node: MSYS762031 -Node: VMS 
Installation762537 -Node: VMS Compilation763141 -Node: VMS Installation Details764718 -Node: VMS Running766348 -Node: VMS POSIX767945 -Node: VMS Old Gawk769243 -Node: Unsupported769712 -Node: Atari Installation770174 -Node: Atari Compiling771461 -Node: Atari Using773346 -Node: BeOS Installation776191 -Node: Tandem Installation777338 -Node: Bugs779017 -Node: Other Versions782849 -Node: Notes788067 -Node: Compatibility Mode788759 -Node: Additions789553 -Node: Adding Code790303 -Node: New Ports796353 -Node: Dynamic Extensions800485 -Node: Internals801810 -Node: Sample Library812813 -Node: Internal File Description813472 -Node: Internal File Ops817165 -Ref: Internal File Ops-Footnote-1822491 -Node: Using Internal File Ops822639 -Node: Future Extensions824662 -Node: Basic Concepts828699 -Node: Basic High Level829456 -Ref: Basic High Level-Footnote-1833572 -Node: Basic Data Typing833766 -Node: Floating Point Issues838203 -Node: String Conversion Precision839286 -Ref: String Conversion Precision-Footnote-1840980 -Node: Unexpected Results841089 -Node: POSIX Floating Point Problems842915 -Ref: POSIX Floating Point Problems-Footnote-1846389 -Node: Glossary846427 -Node: Copying870183 -Node: GNU Free Documentation License907740 -Node: next-edition932884 -Node: unresolved933236 -Node: revision933736 -Node: consistency934159 -Node: Index937512 +Node: Foreword29812 +Node: Preface34128 +Ref: Preface-Footnote-137080 +Ref: Preface-Footnote-237186 +Node: History37418 +Node: Names39650 +Ref: Names-Footnote-141127 +Node: This Manual41199 +Ref: This Manual-Footnote-146097 +Node: Conventions46197 +Node: Manual History48256 +Ref: Manual History-Footnote-151434 +Ref: Manual History-Footnote-251475 +Node: How To Contribute51549 +Node: Acknowledgments52693 +Node: Getting Started56962 +Node: Running gawk59334 +Node: One-shot60520 +Node: Read Terminal61745 +Ref: Read Terminal-Footnote-163395 +Ref: Read Terminal-Footnote-263669 +Node: Long63840 +Node: Executable Scripts65216 +Ref: Executable 
Scripts-Footnote-167077 +Ref: Executable Scripts-Footnote-267179 +Node: Comments67630 +Node: Quoting69998 +Node: DOS Quoting74615 +Node: Sample Data Files75283 +Node: Very Simple78315 +Node: Two Rules82912 +Node: More Complex85059 +Ref: More Complex-Footnote-187989 +Node: Statements/Lines88069 +Ref: Statements/Lines-Footnote-192425 +Node: Other Features92690 +Node: When93559 +Node: Regexp95702 +Node: Regexp Usage97156 +Node: Escape Sequences99182 +Node: Regexp Operators104925 +Ref: Regexp Operators-Footnote-1112097 +Ref: Regexp Operators-Footnote-2112244 +Node: Character Lists112342 +Ref: table-char-classes114117 +Node: GNU Regexp Operators116742 +Node: Case-sensitivity120455 +Ref: Case-sensitivity-Footnote-1123410 +Ref: Case-sensitivity-Footnote-2123645 +Node: Leftmost Longest123753 +Node: Computed Regexps124954 +Node: Locales128371 +Node: Reading Files131453 +Node: Records133469 +Ref: Records-Footnote-1142035 +Node: Fields142072 +Ref: Fields-Footnote-1145104 +Node: Nonconstant Fields145190 +Node: Changing Fields147392 +Node: Field Separators152677 +Node: Default Field Splitting155306 +Node: Regexp Field Splitting156423 +Node: Single Character Fields159773 +Node: Command Line Field Separator160824 +Node: Field Splitting Summary164263 +Ref: Field Splitting Summary-Footnote-1167449 +Node: Constant Size167550 +Node: Splitting By Content172021 +Ref: Splitting By Content-Footnote-1175623 +Node: Multiple Line175663 +Ref: Multiple Line-Footnote-1181403 +Node: Getline181582 +Node: Plain Getline183803 +Node: Getline/Variable185892 +Node: Getline/File187033 +Node: Getline/Variable/File188355 +Ref: Getline/Variable/File-Footnote-1189954 +Node: Getline/Pipe190041 +Node: Getline/Variable/Pipe192589 +Node: Getline/Coprocess193696 +Node: Getline/Variable/Coprocess194939 +Node: Getline Notes195653 +Node: Getline Summary197595 +Ref: table-getline-variants197879 +Node: BEGINFILE/ENDFILE198784 +Node: Command line directories201639 +Node: Printing202274 +Node: Print203905 +Node: 
Print Examples205242 +Node: Output Separators208026 +Node: OFMT209785 +Node: Printf211143 +Node: Basic Printf212049 +Node: Control Letters213586 +Node: Format Modifiers217335 +Node: Printf Examples223346 +Node: Redirection226061 +Node: Special Files233039 +Node: Special FD233572 +Ref: Special FD-Footnote-1237147 +Node: Special Network237221 +Node: Special Caveats238076 +Node: Close Files And Pipes238870 +Ref: Close Files And Pipes-Footnote-1245814 +Ref: Close Files And Pipes-Footnote-2245962 +Node: Expressions246112 +Node: Values247181 +Node: Constants247853 +Node: Scalar Constants248533 +Ref: Scalar Constants-Footnote-1249392 +Node: Nondecimal-numbers249574 +Node: Regexp Constants252638 +Node: Using Constant Regexps253113 +Node: Variables256210 +Node: Using Variables256865 +Node: Assignment Options258439 +Node: Conversion260320 +Ref: table-locale-affects265731 +Ref: Conversion-Footnote-1266355 +Node: All Operators266464 +Node: Arithmetic Ops267094 +Node: Concatenation269593 +Ref: Concatenation-Footnote-1272381 +Node: Assignment Ops272472 +Ref: table-assign-ops277460 +Node: Increment Ops278861 +Node: Truth Values and Conditions282339 +Node: Truth Values283422 +Node: Typing and Comparison284470 +Node: Variable Typing285191 +Ref: Variable Typing-Footnote-1288883 +Node: Comparison Operators289027 +Ref: table-relational-ops289405 +Node: Boolean Ops292954 +Ref: Boolean Ops-Footnote-1297032 +Node: Conditional Exp297123 +Node: Function Calls298855 +Node: Precedence302413 +Node: Patterns and Actions306063 +Node: Pattern Overview307117 +Node: Regexp Patterns308554 +Node: Expression Patterns309097 +Node: Ranges312647 +Node: BEGIN/END315736 +Node: Using BEGIN/END316486 +Ref: Using BEGIN/END-Footnote-1319217 +Node: I/O And BEGIN/END319331 +Node: Empty321598 +Node: Using Shell Variables321906 +Node: Action Overview324187 +Node: Statements326545 +Node: If Statement328401 +Node: While Statement329900 +Node: Do Statement331932 +Node: For Statement333081 +Node: Switch 
Statement336221 +Node: Break Statement338269 +Node: Continue Statement340089 +Node: Next Statement341787 +Node: Nextfile Statement344067 +Node: Exit Statement346785 +Node: Built-in Variables349056 +Node: User-modified350151 +Ref: User-modified-Footnote-1358097 +Node: Auto-set358159 +Ref: Auto-set-Footnote-1366821 +Node: ARGC and ARGV367026 +Node: Arrays370787 +Node: Array Basics372296 +Node: Array Intro373007 +Node: Reference to Elements377394 +Node: Assigning Elements379293 +Node: Array Example379784 +Node: Scanning an Array381516 +Node: Delete383793 +Ref: Delete-Footnote-1386183 +Node: Numeric Array Subscripts386240 +Node: Uninitialized Subscripts388427 +Node: Multi-dimensional390033 +Node: Multi-scanning393124 +Node: Array Sorting394708 +Node: Arrays of Arrays398538 +Node: Functions402646 +Node: Built-in403455 +Node: Calling Built-in404469 +Node: Numeric Functions406445 +Ref: Numeric Functions-Footnote-1410199 +Ref: Numeric Functions-Footnote-2410533 +Node: String Functions410802 +Ref: String Functions-Footnote-1432636 +Ref: String Functions-Footnote-2432765 +Ref: String Functions-Footnote-3433013 +Node: Gory Details433100 +Ref: table-sub-escapes434757 +Ref: table-sub-posix-92436103 +Ref: table-sub-proposed437446 +Ref: table-posix-2001-sub438806 +Ref: table-gensub-escapes440081 +Ref: Gory Details-Footnote-1441284 +Node: I/O Functions441335 +Ref: I/O Functions-Footnote-1448123 +Node: Time Functions448214 +Ref: Time Functions-Footnote-1459026 +Ref: Time Functions-Footnote-2459094 +Ref: Time Functions-Footnote-3459252 +Ref: Time Functions-Footnote-4459363 +Ref: Time Functions-Footnote-5459490 +Ref: Time Functions-Footnote-6459717 +Node: Bitwise Functions459983 +Ref: table-bitwise-ops460561 +Ref: Bitwise Functions-Footnote-1464801 +Node: I18N Functions464985 +Node: User-defined466708 +Node: Definition Syntax467512 +Node: Function Example472210 +Node: Function Caveats474792 +Node: Return Statement478717 +Node: Dynamic Typing481374 +Node: Indirect Calls482111 +Node: 
Internationalization491746 +Node: I18N and L10N493165 +Node: Explaining gettext493849 +Ref: Explaining gettext-Footnote-1498760 +Ref: Explaining gettext-Footnote-2498999 +Node: Programmer i18n499168 +Node: Translator i18n503403 +Node: String Extraction504194 +Ref: String Extraction-Footnote-1505151 +Node: Printf Ordering505277 +Ref: Printf Ordering-Footnote-1508057 +Node: I18N Portability508121 +Ref: I18N Portability-Footnote-1510566 +Node: I18N Example510629 +Ref: I18N Example-Footnote-1513249 +Node: Gawk I18N513321 +Node: Advanced Features513899 +Node: Nondecimal Data515214 +Node: Two-way I/O516775 +Ref: Two-way I/O-Footnote-1522258 +Node: TCP/IP Networking522335 +Node: Profiling525125 +Node: Invoking Gawk532586 +Node: Command Line533893 +Node: Options534678 +Ref: Options-Footnote-1547766 +Node: Other Arguments547791 +Node: AWKPATH Variable550472 +Ref: AWKPATH Variable-Footnote-1553247 +Node: Exit Status553507 +Node: Include Files554179 +Node: Obsolete557780 +Node: Undocumented558581 +Node: Known Bugs558843 +Node: Library Functions559445 +Ref: Library Functions-Footnote-1562426 +Node: Library Names562597 +Ref: Library Names-Footnote-1566070 +Ref: Library Names-Footnote-2566289 +Node: General Functions566375 +Node: Nextfile Function567438 +Node: Strtonum Function571802 +Node: Assert Function574743 +Node: Round Function578047 +Node: Cliff Random Function579587 +Node: Ordinal Functions580602 +Ref: Ordinal Functions-Footnote-1583662 +Node: Join Function583878 +Ref: Join Function-Footnote-1585640 +Node: Gettimeofday Function585840 +Node: Data File Management589551 +Node: Filetrans Function590183 +Node: Rewind Function593609 +Node: File Checking595055 +Node: Empty Files596085 +Node: Ignoring Assigns598310 +Node: Getopt Function599858 +Ref: Getopt Function-Footnote-1611140 +Node: Passwd Functions611343 +Ref: Passwd Functions-Footnote-1620321 +Node: Group Functions620409 +Node: Sample Programs628506 +Node: Running Examples629175 +Node: Clones629903 +Node: Cut 
Program631035 +Node: Egrep Program640794 +Ref: Egrep Program-Footnote-1648544 +Node: Id Program648654 +Node: Split Program652261 +Node: Tee Program655728 +Node: Uniq Program658471 +Node: Wc Program665838 +Ref: Wc Program-Footnote-1670082 +Node: Miscellaneous Programs670278 +Node: Dupword Program671398 +Node: Alarm Program673429 +Node: Translate Program677971 +Ref: Translate Program-Footnote-1682350 +Ref: Translate Program-Footnote-2682587 +Node: Labels Program682721 +Ref: Labels Program-Footnote-1686012 +Node: Word Sorting686096 +Node: History Sorting690443 +Node: Extract Program692281 +Node: Simple Sed699639 +Node: Igawk Program702696 +Ref: Igawk Program-Footnote-1717427 +Ref: Igawk Program-Footnote-2717628 +Node: Signature Program717766 +Node: Debugger718846 +Node: Debugging719722 +Node: Debugging Concepts720036 +Node: Debugging Terms721889 +Node: Awk Debugging724437 +Node: Sample dgawk session725329 +Node: dgawk invocation725821 +Node: Finding The Bug727005 +Node: List of Debugger Commands733520 +Node: Breakpoint Control734835 +Node: Dgawk Execution Control738045 +Node: Viewing And Changing Data741394 +Node: Dgawk Stack744690 +Node: Dgawk Info746151 +Node: Miscellaneous Dgawk Commands750089 +Node: Readline Support755805 +Node: Dgawk Limitations756621 +Node: Language History758793 +Node: V7/SVR3.1760170 +Node: SVR4762465 +Node: POSIX763910 +Node: BTL765622 +Node: POSIX/GNU767312 +Node: Contributors776976 +Node: Installation780581 +Node: Gawk Distribution781552 +Node: Getting782036 +Node: Extracting782862 +Node: Distribution contents784250 +Node: Unix Installation789323 +Node: Quick Installation789914 +Node: Additional Configuration Options791616 +Node: Configuration Philosophy793379 +Node: Non-Unix Installation795743 +Node: PC Installation796208 +Node: PC Binary Installation797514 +Node: PC Compiling799357 +Node: PC Dynamic803862 +Node: PC Using806225 +Node: Cygwin810773 +Node: MSYS811757 +Node: VMS Installation812263 +Node: VMS Compilation812867 +Node: VMS 
Installation Details814444 +Node: VMS Running816074 +Node: VMS POSIX817671 +Node: VMS Old Gawk818969 +Node: Unsupported819438 +Node: Atari Installation819900 +Node: Atari Compiling821187 +Node: Atari Using823076 +Node: BeOS Installation825923 +Node: Tandem Installation827068 +Node: Bugs828747 +Node: Other Versions832579 +Node: Notes837801 +Node: Compatibility Mode838493 +Node: Additions839276 +Node: Adding Code840026 +Node: New Ports846078 +Node: Dynamic Extensions850210 +Node: Internals851535 +Node: Sample Library861940 +Node: Internal File Description862599 +Node: Internal File Ops866294 +Ref: Internal File Ops-Footnote-1871134 +Node: Using Internal File Ops871282 +Node: Future Extensions873307 +Node: Basic Concepts877344 +Node: Basic High Level878101 +Ref: Basic High Level-Footnote-1882217 +Node: Basic Data Typing882411 +Node: Floating Point Issues886848 +Node: String Conversion Precision887931 +Ref: String Conversion Precision-Footnote-1889625 +Node: Unexpected Results889734 +Node: POSIX Floating Point Problems891560 +Ref: POSIX Floating Point Problems-Footnote-1895259 +Node: Glossary895297 +Node: Copying919065 +Node: GNU Free Documentation License956622 +Node: next-edition981766 +Node: unresolved982118 +Node: revision982618 +Node: consistency983041 +Node: Index986394 End Tag Table |