Move to gawk-3.0.2.

author: Arnold D. Robbins <arnold@skeeve.com> 2010-07-16 12:47:28 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2010-07-16 12:47:28 +0300
commit: 6719bb6e1c5576e857ab6fc121ec31a75161a3e7
tree: 97cba951750ceb73899e48490dbb33674e5b29e1 /doc/gawk.info
parent: 558ba97bdeac5a68bb9248a5c4cdf2feeb24e771
1 files changed, 18232 insertions, 0 deletions
diff --git a/doc/gawk.info b/doc/gawk.info
new file mode 100644
index 00000000..680fbab3
--- /dev/null
+++ b/doc/gawk.info
@@ -0,0 +1,18232 @@
+This is gawk.info, produced by makeinfo version 4.0 from gawk.texi.
+
+INFO-DIR-SECTION Programming Languages
+START-INFO-DIR-ENTRY
+* Gawk: (gawk.info).           A Text Scanning and Processing Language.
+END-INFO-DIR-ENTRY
+
+   This file documents `awk', a program that you can use to select
+particular records in a file and perform operations upon them.
+
+   This is Edition 1.0.1 of `The GNU Awk User's Guide', for the
+3.0.1 version of the GNU implementation of AWK.
+
+   Copyright (C) 1989, 1991, 92, 93, 96 Free Software Foundation, Inc.
+
+   Permission is granted to make and distribute verbatim copies of this
+manual provided the copyright notice and this permission notice are
+preserved on all copies.
+
+   Permission is granted to copy and distribute modified versions of
+this manual under the conditions for verbatim copying, provided that
+the entire resulting derived work is distributed under the terms of a
+permission notice identical to this one.
+
+   Permission is granted to copy and distribute translations of this
+manual into another language, under the above conditions for modified
+versions, except that this permission notice may be stated in a
+translation approved by the Foundation.
+
+
+File: gawk.info,  Node: Top,  Next: Preface,  Prev: (dir),  Up: (dir)
+
+General Introduction
+********************
+
+   This file documents `awk', a program that you can use to select
+particular records in a file and perform operations upon them.
+
+   This is Edition 1.0.1 of `The GNU Awk User's Guide',
+for the 3.0.1 version of the GNU implementation
+of AWK.
+
+* Menu:
+
+* Preface::                     What this Info file is about; brief
+                                history and acknowledgements.
+* What Is Awk::                 What is the `awk' language; using this
+                                Info file.
+* Getting Started::             A basic introduction to using `awk'. How
+                                to run an `awk' program. Command line
+                                syntax.
+* One-liners::                  Short, sample `awk' programs.
+* Regexp::                      All about matching things using regular
+                                expressions.
+* Reading Files::               How to read files and manipulate fields.
+* Printing::                    How to print using `awk'.  Describes the
+                                `print' and `printf' statements.
+                                Also describes redirection of output.
+* Expressions::                 Expressions are the basic building blocks of
+                                statements.
+* Patterns and Actions::        Overviews of patterns and actions.
+* Statements::                  The various control statements are described
+                                in detail.
+* Built-in Variables::          Built-in Variables
+* Arrays::                      The description and use of arrays. Also
+                                includes array-oriented control statements.
+* Built-in::                    The built-in functions are summarized here.
+* User-defined::                User-defined functions are described in
+                                detail.
+* Invoking Gawk::               How to run `gawk'.
+* Library Functions::           A Library of `awk' Functions.
+* Sample Programs::             Many `awk' programs with complete
+                                explanations.
+* Language History::            The evolution of the `awk' language.
+* Gawk Summary::                `gawk' Options and Language Summary.
+* Installation::                Installing `gawk' under various operating
+                                systems.
+* Notes::                       Something about the implementation of
+                                `gawk'.
+* Glossary::                    An explanation of some unfamiliar terms.
+* Copying::                     Your right to copy and distribute `gawk'.
+* Index::                       Concept and Variable Index.
+
+* History::                     The history of `gawk' and `awk'.
+* Manual History::              Brief history of the GNU project and this
+                                Info file.
+* Acknowledgements::            Acknowledgements.
+* This Manual::                 Using this Info file. Includes sample
+                                input files that you can use.
+* Conventions::                 Typographical Conventions.
+* Sample Data Files::           Sample data files for use in the `awk'
+                                programs illustrated in this Info file.
+* Names::                       What name to use to find `awk'.
+* Running gawk::                How to run `gawk' programs; includes
+                                command line syntax.
+* One-shot::                    Running a short throw-away `awk' program.
+* Read Terminal::               Using no input files (input from terminal
+                                instead).
+* Long::                        Putting permanent `awk' programs in
+                                files.
+* Executable Scripts::          Making self-contained `awk' programs.
+* Comments::                    Adding documentation to `gawk' programs.
+* Very Simple::                 A very simple example.
+* Two Rules::                   A less simple one-line example with two rules.
+* More Complex::                A more complex example.
+* Statements/Lines::            Subdividing or combining statements into
+                                lines.
+* Other Features::              Other Features of `awk'.
+* When::                        When to use `gawk' and when to use other
+                                things.
+* Regexp Usage::                How to Use Regular Expressions.
+* Escape Sequences::            How to write non-printing characters.
+* Regexp Operators::            Regular Expression Operators.
+* GNU Regexp Operators::        Operators specific to GNU software.
+* Case-sensitivity::            How to do case-insensitive matching.
+* Leftmost Longest::            How much text matches.
+* Computed Regexps::            Using Dynamic Regexps.
+* Records::                     Controlling how data is split into records.
+* Fields::                      An introduction to fields.
+* Non-Constant Fields::         Non-constant Field Numbers.
+* Changing Fields::             Changing the Contents of a Field.
+* Field Separators::            The field separator and how to change it.
+* Basic Field Splitting::       How fields are split with single characters or
+                                simple strings.
+* Regexp Field Splitting::      Using regexps as the field separator.
+* Single Character Fields::     Making each character a separate field.
+* Command Line Field Separator:: Setting `FS' from the command line.
+* Field Splitting Summary::     Some final points and a summary table.
+* Constant Size::               Reading constant width data.
+* Multiple Line::               Reading multi-line records.
+* Getline::                     Reading files under explicit program control
+                                using the `getline' function.
+* Getline Intro::               Introduction to the `getline' function.
+* Plain Getline::               Using `getline' with no arguments.
+* Getline/Variable::            Using `getline' into a variable.
+* Getline/File::                Using `getline' from a file.
+* Getline/Variable/File::       Using `getline' into a variable from a
+                                file.
+* Getline/Pipe::                Using `getline' from a pipe.
+* Getline/Variable/Pipe::       Using `getline' into a variable from a
+                                pipe.
+* Getline Summary::             Summary Of `getline' Variants.
+* Print::                       The `print' statement.
+* Print Examples::              Simple examples of `print' statements.
+* Output Separators::           The output separators and how to change them.
+* OFMT::                        Controlling Numeric Output With `print'.
+* Printf::                      The `printf' statement.
+* Basic Printf::                Syntax of the `printf' statement.
+* Control Letters::             Format-control letters.
+* Format Modifiers::            Format-specification modifiers.
+* Printf Examples::             Several examples.
+* Redirection::                 How to redirect output to multiple files and
+                                pipes.
+* Special Files::               File name interpretation in `gawk'.
+                                `gawk' allows access to inherited file
+                                descriptors.
+* Close Files And Pipes::       Closing Input and Output Files and Pipes.
+* Constants::                   String, numeric, and regexp constants.
+* Scalar Constants::            Numeric and string constants.
+* Regexp Constants::            Regular Expression constants.
+* Using Constant Regexps::      When and how to use a regexp constant.
+* Variables::                   Variables give names to values for later use.
+* Using Variables::             Using variables in your programs.
+* Assignment Options::          Setting variables on the command line and a
+                                summary of command line syntax. This is an
+                                advanced method of input.
+* Conversion::                  The conversion of strings to numbers and vice
+                                versa.
+* Arithmetic Ops::              Arithmetic operations (`+', `-',
+                                etc.)
+* Concatenation::               Concatenating strings.
+* Assignment Ops::              Changing the value of a variable or a field.
+* Increment Ops::               Incrementing the numeric value of a variable.
+* Truth Values::                What is ``true'' and what is ``false''.
+* Typing and Comparison::       How variables acquire types, and how this
+                                affects comparison of numbers and strings with
+                                `<', etc.
+* Boolean Ops::                 Combining comparison expressions using boolean
+                                operators `||' (``or''), `&&'
+                                (``and'') and `!' (``not'').
+* Conditional Exp::             Conditional expressions select between two
+                                subexpressions under control of a third
+                                subexpression.
+* Function Calls::              A function call is an expression.
+* Precedence::                  How various operators nest.
+* Pattern Overview::            What goes into a pattern.
+* Kinds of Patterns::           A list of all kinds of patterns.
+* Regexp Patterns::             Using regexps as patterns.
+* Expression Patterns::         Any expression can be used as a pattern.
+* Ranges::                      Pairs of patterns specify record ranges.
+* BEGIN/END::                   Specifying initialization and cleanup rules.
+* Using BEGIN/END::             How and why to use BEGIN/END rules.
+* I/O And BEGIN/END::           I/O issues in BEGIN/END rules.
+* Empty::                       The empty pattern, which matches every record.
+* Action Overview::             What goes into an action.
+* If Statement::                Conditionally execute some `awk'
+                                statements.
+* While Statement::             Loop until some condition is satisfied.
+* Do Statement::                Do specified action while looping until some
+                                condition is satisfied.
+* For Statement::               Another looping statement, that provides
+                                initialization and increment clauses.
+* Break Statement::             Immediately exit the innermost enclosing loop.
+* Continue Statement::          Skip to the end of the innermost enclosing
+                                loop.
+* Next Statement::              Stop processing the current input record.
+* Nextfile Statement::          Stop processing the current file.
+* Exit Statement::              Stop execution of `awk'.
+* User-modified::               Built-in variables that you change to control
+                                `awk'.
+* Auto-set::                    Built-in variables where `awk' gives you
+                                information.
+* ARGC and ARGV::               Ways to use `ARGC' and `ARGV'.
+* Array Intro::                 Introduction to Arrays
+* Reference to Elements::       How to examine one element of an array.
+* Assigning Elements::          How to change an element of an array.
+* Array Example::               Basic Example of an Array
+* Scanning an Array::           A variation of the `for' statement. It
+                                loops through the indices of an array's
+                                existing elements.
+* Delete::                      The `delete' statement removes an element
+                                from an array.
+* Numeric Array Subscripts::    How to use numbers as subscripts in
+                                `awk'.
+* Uninitialized Subscripts::    Using Uninitialized variables as subscripts.
+* Multi-dimensional::           Emulating multi-dimensional arrays in
+                                `awk'.
+* Multi-scanning::              Scanning multi-dimensional arrays.
+* Calling Built-in::            How to call built-in functions.
+* Numeric Functions::           Functions that work with numbers, including
+                                `int', `sin' and `rand'.
+* String Functions::            Functions for string manipulation, such as
+                                `split', `match', and
+                                `sprintf'.
+* I/O Functions::               Functions for files and shell commands.
+* Time Functions::              Functions for dealing with time stamps.
+* Definition Syntax::           How to write definitions and what they mean.
+* Function Example::            An example function definition and what it
+                                does.
+* Function Caveats::            Things to watch out for.
+* Return Statement::            Specifying the value a function returns.
+* Options::                     Command line options and their meanings.
+* Other Arguments::             Input file names and variable assignments.
+* AWKPATH Variable::            Searching directories for `awk' programs.
+* Obsolete::                    Obsolete Options and/or features.
+* Undocumented::                Undocumented Options and Features.
+* Known Bugs::                  Known Bugs in `gawk'.
+* Portability Notes::           What to do if you don't have `gawk'.
+* Nextfile Function::           Two implementations of a `nextfile'
+                                function.
+* Assert Function::             A function for assertions in `awk'
+                                programs.
+* Round Function::              A function for rounding if `sprintf' does
+                                not do it correctly.
+* Ordinal Functions::           Functions for using characters as numbers and
+                                vice versa.
+* Join Function::               A function to join an array into a string.
+* Mktime Function::             A function to turn a date into a timestamp.
+* Gettimeofday Function::       A function to get formatted times.
+* Filetrans Function::          A function for handling data file transitions.
+* Getopt Function::             A function for processing command line
+                                arguments.
+* Passwd Functions::            Functions for getting user information.
+* Group Functions::             Functions for getting group information.
+* Library Names::               How to best name private global variables in
+                                library functions.
+* Clones::                      Clones of common utilities.
+* Cut Program::                 The `cut' utility.
+* Egrep Program::               The `egrep' utility.
+* Id Program::                  The `id' utility.
+* Split Program::               The `split' utility.
+* Tee Program::                 The `tee' utility.
+* Uniq Program::                The `uniq' utility.
+* Wc Program::                  The `wc' utility.
+* Miscellaneous Programs::      Some interesting `awk' programs.
+* Dupword Program::             Finding duplicated words in a document.
+* Alarm Program::               An alarm clock.
+* Translate Program::           A program similar to the `tr' utility.
+* Labels Program::              Printing mailing labels.
+* Word Sorting::                A program to produce a word usage count.
+* History Sorting::             Eliminating duplicate entries from a history
+                                file.
+* Extract Program::             Pulling out programs from Texinfo source
+                                files.
+* Simple Sed::                  A Simple Stream Editor.
+* Igawk Program::               A wrapper for `awk' that includes files.
+* V7/SVR3.1::                   The major changes between V7 and System V
+                                Release 3.1.
+* SVR4::                        Minor changes between System V Releases 3.1
+                                and 4.
+* POSIX::                       New features from the POSIX standard.
+* BTL::                         New features from the Bell Laboratories
+                                version of `awk'.
+* POSIX/GNU::                   The extensions in `gawk' not in POSIX
+                                `awk'.
+* Command Line Summary::        Recapitulation of the command line.
+* Language Summary::            A terse review of the language.
+* Variables/Fields::            Variables, fields, and arrays.
+* Fields Summary::              Input field splitting.
+* Built-in Summary::            `awk''s built-in variables.
+* Arrays Summary::              Using arrays.
+* Data Type Summary::           Values in `awk' are numbers or strings.
+* Rules Summary::               Patterns and Actions, and their component
+                                parts.
+* Pattern Summary::             Quick overview of patterns.
+* Regexp Summary::              Quick overview of regular expressions.
+* Actions Summary::             Quick overview of actions.
+* Operator Summary::            `awk' operators.
+* Control Flow Summary::        The control statements.
+* I/O Summary::                 The I/O statements.
+* Printf Summary::              A summary of `printf'.
+* Special File Summary::        Special file names interpreted internally.
+* Built-in Functions Summary::  Built-in numeric and string functions.
+* Time Functions Summary::      Built-in time functions.
+* String Constants Summary::    Escape sequences in strings.
+* Functions Summary::           Defining and calling functions.
+* Historical Features::         Some undocumented but supported ``features''.
+* Gawk Distribution::           What is in the `gawk' distribution.
+* Getting::                     How to get the distribution.
+* Extracting::                  How to extract the distribution.
+* Distribution contents::       What is in the distribution.
+* Unix Installation::           Installing `gawk' under various versions
+                                of Unix.
+* Quick Installation::          Compiling `gawk' under Unix.
+* Configuration Philosophy::    How it's all supposed to work.
+* VMS Installation::            Installing `gawk' on VMS.
+* VMS Compilation::             How to compile `gawk' under VMS.
+* VMS Installation Details::    How to install `gawk' under VMS.
+* VMS Running::                 How to run `gawk' under VMS.
+* VMS POSIX::                   Alternate instructions for VMS POSIX.
+* PC Installation::             Installing and Compiling `gawk' on MS-DOS
+                                and OS/2
+* Atari Installation::          Installing `gawk' on the Atari ST.
+* Atari Compiling::             Compiling `gawk' on Atari
+* Atari Using::                 Running `gawk' on Atari
+* Amiga Installation::          Installing `gawk' on an Amiga.
+* Bugs::                        Reporting Problems and Bugs.
+* Other Versions::              Other freely available `awk'
+                                implementations.
+* Compatibility Mode::          How to disable certain `gawk' extensions.
+* Additions::                   Making Additions To `gawk'.
+* Adding Code::                 Adding code to the main body of `gawk'.
+* New Ports::                   Porting `gawk' to a new operating system.
+* Future Extensions::           New features that may be implemented one day.
+* Improvements::                Suggestions for improvements by volunteers.
+
+                  To Miriam, for making me complete.
+
+
+                  To Chana, for the joy you bring us.
+
+
+                To Rivka, for the exponential increase.
+
+
+                  To Nachum, for the added dimension.
+
+
+File: gawk.info,  Node: Preface,  Next: What Is Awk,  Prev: Top,  Up: Top
+
+Preface
+*******
+
+   This Info file teaches you about the `awk' language and how you can
+use it effectively.  You should already be familiar with basic system
+commands, such as `cat' and `ls',(1) and basic shell facilities, such
+as Input/Output (I/O) redirection and pipes.
+
+   Implementations of the `awk' language are available for many
+different computing environments.  This Info file, while describing the
+`awk' language in general, also describes a particular implementation
+of `awk' called `gawk' (which stands for "GNU Awk").  `gawk' runs on a
+broad range of Unix systems, from 80386 PC-based computers up through
+large-scale systems such as Crays.  `gawk' has also been ported to
+MS-DOS and OS/2 PCs, Atari and Amiga microcomputers, and VMS.
+
+* Menu:
+
+* History::                     The history of `gawk' and `awk'.
+* Manual History::              Brief history of the GNU project and this
+                                Info file.
+* Acknowledgements::            Acknowledgements.
+
+   ---------- Footnotes ----------
+
+   (1) These commands are available on POSIX compliant systems, as well
+as on traditional Unix based systems. If you are using some other
+operating system, you still need to be familiar with the ideas of I/O
+redirection and pipes.
+
+
+File: gawk.info,  Node: History,  Next: Manual History,  Prev: Preface,  Up: Preface
+
+History of `awk' and `gawk'
+===========================
+
+   The name `awk' comes from the initials of its designers: Alfred V.
+Aho, Peter J. Weinberger, and Brian W. Kernighan.  The original version
+of `awk' was written in 1977 at AT&T Bell Laboratories.  In 1985 a new
+version made the programming language more powerful, introducing
+user-defined functions, multiple input streams, and computed regular
+expressions.  This new version became generally available with Unix
+System V Release 3.1.  The version in System V Release 4 added some new
+features and also cleaned up the behavior in some of the "dark corners"
+of the language.  The specification for `awk' in the POSIX Command
+Language and Utilities standard further clarified the language based on
+feedback from both the `gawk' designers and the original Bell Labs
+`awk' designers.
+
+   The GNU implementation, `gawk', was written in 1986 by Paul Rubin
+and Jay Fenlason, with advice from Richard Stallman.  John Woods
+contributed parts of the code as well.  In 1988 and 1989, David
+Trueman, with help from Arnold Robbins, thoroughly reworked `gawk' for
+compatibility with the newer `awk'.  Current development focuses on bug
+fixes, performance improvements, standards compliance, and,
+occasionally, new features.
+
+
+File: gawk.info,  Node: Manual History,  Next: Acknowledgements,  Prev: History,  Up: Preface
+
+The GNU Project and This Book
+=============================
+
+   The Free Software Foundation (FSF) is a non-profit organization
+dedicated to the production and distribution of freely distributable
+software.  It was founded by Richard M. Stallman, the author of the
+original Emacs editor.  GNU Emacs is the most widely used version of
+Emacs today.
+
+   The GNU project is an on-going effort on the part of the Free
+Software Foundation to create a complete, freely distributable, POSIX
+compliant computing environment.  (GNU stands for "GNU's not Unix".)
+The FSF uses the "GNU General Public License" (or GPL) to ensure that
+source code for their software is always available to the end user. A
+copy of the GPL is included for your reference (*note GNU GENERAL
+PUBLIC LICENSE: Copying.).  The GPL applies to the C language source
+code for `gawk'.
+
+   As of this writing (1995), the only major component of the GNU
+environment still uncompleted is the operating system kernel, and work
+proceeds apace on that.  A shell, an editor (Emacs), highly portable
+optimizing C, C++, and Objective-C compilers, a symbolic debugger, and
+dozens of large and small utilities (such as `gawk'), have all been
+completed and are freely available.
+
+   Until the GNU operating system is released, the FSF recommends the
+use of Linux, a freely distributable, Unix-like operating system for
+80386 and other systems.  There are many books on Linux.  One that is
+freely available is `Linux Installation and Getting Started', by Matt
+Welsh.  Many Linux distributions are available, often in computer
+stores or bundled on CD-ROM with books about Linux. Also, the FSF
+provides a Linux distribution ("Debian"); contact them for more
+information.  *Note Getting the `gawk' Distribution: Getting, for the
+FSF's contact information.  (There are two other freely available,
+Unix-like operating systems for 80386 and other systems, NetBSD and
+FreeBSD. Both are based on the 4.4-Lite Berkeley Software Distribution,
+and both use recent versions of `gawk' for their versions of `awk'.)
+
+   This Info file itself has gone through several previous, preliminary
+editions.  I started working on a preliminary draft of `The GAWK
+Manual', by Diane Close, Paul Rubin, and Richard Stallman in the fall
+of 1988.  It was around 90 pages long, and barely described the
+original, "old" version of `awk'. After substantial revision, the first
+version of `The GAWK Manual' to be released was Edition 0.11 Beta in
+October of 1989.  The manual then underwent more substantial revision
+for Edition 0.13 of December 1991.  David Trueman, Pat Rankin, and
+Michal Jaegermann contributed sections of the manual for Edition 0.13.
+That edition was published by the FSF as a bound book early in 1992.
+Since then there have been several minor revisions, notably Edition
+0.14 of November 1992 that was published by the FSF in January of 1993,
+and Edition 0.16 of August 1993.
+
+   Edition 1.0 of `The GNU Awk User's Guide' represents a significant
+re-working of `The GAWK Manual', with much additional material.  The
+FSF and I agree that I am now the primary author.  I also felt that it
+needed a more descriptive title.
+
+   `The GNU Awk User's Guide' will undoubtedly continue to evolve.  An
+electronic version comes with the `gawk' distribution from the FSF.  If
+you find an error in this Info file, please report it!  *Note Reporting
+Problems and Bugs: Bugs, for information on submitting problem reports
+electronically, or write to me in care of the FSF.
+
+
+File: gawk.info,  Node: Acknowledgements,  Prev: Manual History,  Up: Preface
+
+Acknowledgements
+================
+
+   I would like to acknowledge Richard M. Stallman, for his vision of a
+better world, and for his courage in founding the FSF and starting the
+GNU project.
+
+   The initial draft of `The GAWK Manual' had the following
+acknowledgements:
+
+     Many people need to be thanked for their assistance in producing
+     this manual.  Jay Fenlason contributed many ideas and sample
+     programs.  Richard Mlynarik and Robert Chassell gave helpful
+     comments on drafts of this manual.  The paper `A Supplemental
+     Document for `awk'' by John W. Pierce of the Chemistry Department
+     at UC San Diego, pinpointed several issues relevant both to `awk'
+     implementation and to this manual, that would otherwise have
+     escaped us.
+
+   The following people provided many helpful comments on Edition 0.13
+of `The GAWK Manual': Rick Adams, Michael Brennan, Rich Burridge, Diane
+Close, Christopher ("Topher") Eliot, Michael Lijewski, Pat Rankin,
+Miriam Robbins, and Michal Jaegermann.
+
+   The following people provided many helpful comments for Edition 1.0
+of `The GNU Awk User's Guide': Karl Berry, Michael Brennan, Darrel
+Hankerson, Michal Jaegermann, Michael Lijewski, and Miriam Robbins.
+Pat Rankin, Michal Jaegermann, Darrel Hankerson and Scott Deifik
+updated their respective sections for Edition 1.0.
+
+   Robert J. Chassell provided much valuable advice on the use of
+Texinfo.  He also deserves special thanks for convincing me _not_ to
+title this Info file `How To Gawk Politely'.  Karl Berry helped
+significantly with the TeX part of Texinfo.
+
+   David Trueman deserves special credit; he has done a yeoman job of
+evolving `gawk' so that it performs well, and without bugs.  Although
+he is no longer involved with `gawk', working with him on this project
+was a significant pleasure.
+
+   Scott Deifik, Darrel Hankerson, Kai Uwe Rommel, Pat Rankin, and
+Michal Jaegermann (in no particular order) are long time members of the
+`gawk' "crack portability team."  Without their hard work and help,
+`gawk' would not be nearly the fine program it is today.  It has been
+and continues to be a pleasure working with this team of fine people.
+
+   Jeffrey Friedl provided invaluable help in tracking down a number of
+last minute problems with regular expressions in `gawk' 3.0.
+
+   David and I would like to thank Brian Kernighan of Bell Labs for
+invaluable assistance during the testing and debugging of `gawk', and
+for help in clarifying numerous points about the language.  We could
+not have done nearly as good a job on either `gawk' or its
+documentation without his help.
+
+   I would like to thank Marshall and Elaine Hartholz of Seattle, and
+Dr. Bert and Rita Schreiber of Detroit for large amounts of quiet
+vacation time in their homes, which allowed me to make significant
+progress on this Info file and on `gawk' itself.  Phil Hughes of SSC
+contributed in a very important way by loaning me his laptop Linux
+system, not once, but twice, allowing me to do a lot of work while away
+from home.
+
+   Finally, I must thank my wonderful wife, Miriam, for her patience
+through the many versions of this project, for her proof-reading, and
+for sharing me with the computer.  I would like to thank my parents for
+their love, and for the grace with which they raised and educated me.
+I also must acknowledge my gratitude to G-d, for the many opportunities
+He has sent my way, as well as for the gifts He has given me with which
+to take advantage of those opportunities.
+
+
+
+Arnold Robbins
+Atlanta, Georgia
+January, 1996
+
+
+File: gawk.info,  Node: What Is Awk,  Next: Getting Started,  Prev: Preface,  Up: Top
+
+Introduction
+************
+
+   If you are like many computer users, you would frequently like to
+make changes in various text files wherever certain patterns appear, or
+extract data from parts of certain lines while discarding the rest.  To
+write a program to do this in a language such as C or Pascal is a
+time-consuming inconvenience that may take many lines of code.  The job
+may be easier with `awk'.
+
+   The `awk' utility interprets a special-purpose programming language
+that makes it possible to handle simple data-reformatting jobs with
+just a few lines of code.
+
+   The GNU implementation of `awk' is called `gawk'; it is fully upward
+compatible with the System V Release 4 version of `awk'.  `gawk' is
+also upward compatible with the POSIX specification of the `awk'
+language.  This means that all properly written `awk' programs should
+work with `gawk'.  Thus, we usually don't distinguish between `gawk'
+and other `awk' implementations.
+
+   Using `awk' you can:
+
+   * manage small, personal databases
+
+   * generate reports
+
+   * validate data
+
+   * produce indexes, and perform other document preparation tasks
+
+   * even experiment with algorithms that can be adapted later to other
+     computer languages
+
+* Menu:
+
+* This Manual::                 Using this Info file. Includes sample
+                                input files that you can use.
+* Conventions::                 Typographical Conventions.
+* Sample Data Files::           Sample data files for use in the `awk'
+                                programs illustrated in this Info file.
+
+
+File: gawk.info,  Node: This Manual,  Next: Conventions,  Prev: What Is Awk,  Up: What Is Awk
+
+Using This Book
+===============
+
+   The term `awk' refers to a particular program, and to the language
+you use to tell this program what to do.  When we need to be careful,
+we call the program "the `awk' utility" and the language "the `awk'
+language."  The term `gawk' refers to a version of `awk' developed as
+part of the GNU project.  The purpose of this Info file is to explain both
+the `awk' language and how to run the `awk' utility.
+
+   The main purpose of the Info file is to explain the features of
+`awk', as defined in the POSIX standard.  It does so in the context of
+one particular implementation, `gawk'. While doing so, it will also
+attempt to describe important differences between `gawk' and other
+`awk' implementations.  Finally, any `gawk' features that are not in
+the POSIX standard for `awk' will be noted.
+
+   The term "`awk' program" refers to a program written by you in the
+`awk' programming language.
+
+   *Note Getting Started with `awk': Getting Started, for the bare
+essentials you need to know to start using `awk'.
+
+   Some useful "one-liners" are included to give you a feel for the
+`awk' language (*note Useful One Line Programs: One-liners.).
+
+   Many sample `awk' programs have been provided for you (*note A
+Library of `awk' Functions: Library Functions.; also *note Practical
+`awk' Programs: Sample Programs.).
+
+   The entire `awk' language is summarized for quick reference in *Note
+`gawk' Summary: Gawk Summary.  Look there if you just need to refresh
+your memory about a particular feature.
+
+   If you find terms that you aren't familiar with, try looking them up
+in the glossary (*note Glossary::).
+
+   Most of the time complete `awk' programs are used as examples, but in
+some of the more advanced sections, only the part of the `awk' program
+that illustrates the concept being described is shown.
+
+   While this Info file is aimed principally at people who have not been
+exposed to `awk', there is a lot of information here that even the `awk'
+expert should find useful.  In particular, the description of POSIX
+`awk', and the example programs in *Note A Library of `awk' Functions:
+Library Functions, and *Note Practical `awk' Programs: Sample Programs,
+should be of interest.
+
+Dark Corners
+------------
+
+   Until the POSIX standard (and `The Gawk Manual'), many features of
+`awk' were either poorly documented, or not documented at all.
+Descriptions of such features (often called "dark corners") are noted
+in this Info file with "(d.c.)".  They also appear in the index under
+the heading "dark corner."
+
+
+File: gawk.info,  Node: Conventions,  Next: Sample Data Files,  Prev: This Manual,  Up: What Is Awk
+
+Typographical Conventions
+=========================
+
+   This Info file is written using Texinfo, the GNU documentation
+formatting language.  A single Texinfo source file is used to produce
+both the printed and on-line versions of the documentation.  This
+section briefly documents the typographical conventions used in Texinfo.
+
+   Examples you would type at the command line are preceded by the
+common shell primary and secondary prompts, `$' and `>'.  Output from
+the command is preceded by the glyph "-|".  This typically represents
+the command's standard output.  Error messages, and other output on the
+command's standard error, are preceded by the glyph "error-->".  For
+example:
+
+     $ echo hi on stdout
+     -| hi on stdout
+     $ echo hello on stderr 1>&2
+     error--> hello on stderr
+
+   Characters that you type at the keyboard look `like this'.  In
+particular, there are special characters called "control characters."
+These are characters that you type by holding down both the `CONTROL'
+key and another key, at the same time.  For example, a `Control-d' is
+typed by first pressing and holding the `CONTROL' key, next pressing
+the `d' key, and finally releasing both keys.
+
+
+File: gawk.info,  Node: Sample Data Files,  Prev: Conventions,  Up: What Is Awk
+
+Data Files for the Examples
+===========================
+
+   Many of the examples in this Info file take their input from two
+sample data files.  The first, called `BBS-list', represents a list of
+computer bulletin board systems together with information about those
+systems.  The second data file, called `inventory-shipped', contains
+information about shipments on a monthly basis.  In both files, each
+line is considered to be one "record".
+
+   In the file `BBS-list', each record contains the name of a computer
+bulletin board, its phone number, the board's baud rate(s), and a code
+for the number of hours it is operational.  An `A' in the last column
+means the board operates 24 hours a day.  A `B' in the last column
+means the board operates evening and weekend hours, only.  A `C' means
+the board operates only on weekends.
+
+     aardvark     555-5553     1200/300          B
+     alpo-net     555-3412     2400/1200/300     A
+     barfly       555-7685     1200/300          A
+     bites        555-1675     2400/1200/300     A
+     camelot      555-0542     300               C
+     core         555-2912     1200/300          C
+     fooey        555-1234     2400/1200/300     B
+     foot         555-6699     1200/300          B
+     macfoo       555-6480     1200/300          A
+     sdace        555-3430     2400/1200/300     A
+     sabafoo      555-2127     1200/300          C
+
+   The second data file, called `inventory-shipped', represents
+information about shipments during the year.  Each record contains the
+month of the year, the number of green crates shipped, the number of
+red boxes shipped, the number of orange bags shipped, and the number of
+blue packages shipped, respectively.  There are 16 entries, covering
+the 12 months of one year and four months of the next year.
+
+     Jan  13  25  15 115
+     Feb  15  32  24 226
+     Mar  15  24  34 228
+     Apr  31  52  63 420
+     May  16  34  29 208
+     Jun  31  42  75 492
+     Jul  24  34  67 436
+     Aug  15  34  47 316
+     Sep  13  55  37 277
+     Oct  29  54  68 525
+     Nov  20  87  82 577
+     Dec  17  35  61 401
+     
+     Jan  21  36  64 620
+     Feb  26  58  80 652
+     Mar  24  75  70 495
+     Apr  21  70  74 514
+
+   If you are reading this in GNU Emacs using Info, you can copy the
+regions of text showing these sample files into your own test files.
+This way you can try out the examples shown in the remainder of this
+document.  You do this by using the command `M-x write-region' to copy
+text from the Info file into a file for use with `awk' (*Note
+Miscellaneous File Operations: (emacs)Misc File Ops, for more
+information).  Using this information, create your own `BBS-list' and
+`inventory-shipped' files, and practice what you learn in this Info
+file.
+
+   If you are using the stand-alone version of Info, see *Note
+Extracting Programs from Texinfo Source Files: Extract Program, for an
+`awk' program that will extract these data files from `gawk.texi', the
+Texinfo source file for this Info file.
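+
+   If neither of those methods is convenient, one possible shortcut is
+to create the files directly from a POSIX-style shell with "here
+documents."  The sketch below is only an illustration, not something
+the `gawk' distribution provides; it assumes a shell that supports
+`<<' redirection and it writes both files into the current directory.
+

```shell
# Create the two sample data files in the current directory.
# Quoting the delimiter ('EOF') keeps the shell from expanding the text.
cat > BBS-list <<'EOF'
aardvark     555-5553     1200/300          B
alpo-net     555-3412     2400/1200/300     A
barfly       555-7685     1200/300          A
bites        555-1675     2400/1200/300     A
camelot      555-0542     300               C
core         555-2912     1200/300          C
fooey        555-1234     2400/1200/300     B
foot         555-6699     1200/300          B
macfoo       555-6480     1200/300          A
sdace        555-3430     2400/1200/300     A
sabafoo      555-2127     1200/300          C
EOF

cat > inventory-shipped <<'EOF'
Jan  13  25  15 115
Feb  15  32  24 226
Mar  15  24  34 228
Apr  31  52  63 420
May  16  34  29 208
Jun  31  42  75 492
Jul  24  34  67 436
Aug  15  34  47 316
Sep  13  55  37 277
Oct  29  54  68 525
Nov  20  87  82 577
Dec  17  35  61 401

Jan  21  36  64 620
Feb  26  58  80 652
Mar  24  75  70 495
Apr  21  70  74 514
EOF

# Sanity check: count the lines in each file.
wc -l BBS-list inventory-shipped
```

+
+   The final `wc -l' is just a sanity check: `BBS-list' should have 11
+lines, and `inventory-shipped' should have 17 (16 records plus the
+blank line that separates the two years).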
+
+
+File: gawk.info,  Node: Getting Started,  Next: One-liners,  Prev: What Is Awk,  Up: Top
+
+Getting Started with `awk'
+**************************
+
+   The basic function of `awk' is to search files for lines (or other
+units of text) that contain certain patterns.  When a line matches one
+of the patterns, `awk' performs specified actions on that line.  `awk'
+keeps processing input lines in this way until it reaches the end of the
+input files.
+
+   Programs in `awk' are different from programs in most other
+languages, because `awk' programs are "data-driven"; that is, you
+describe the data you wish to work with, and then what to do when you
+find it.  Most other languages are "procedural"; you have to describe,
+in great detail, every step the program is to take.  When working with
+procedural languages, it is usually much harder to clearly describe the
+data your program will process.  For this reason, `awk' programs are
+often refreshingly easy to both write and read.
+
+   When you run `awk', you specify an `awk' "program" that tells `awk'
+what to do.  The program consists of a series of "rules".  (It may also
+contain "function definitions", an advanced feature which we will
+ignore for now.  *Note User-defined Functions: User-defined.)  Each
+rule specifies one pattern to search for, and one action to perform
+when that pattern is found.
+
+   Syntactically, a rule consists of a pattern followed by an action.
+The action is enclosed in curly braces to separate it from the pattern.
+Rules are usually separated by newlines.  Therefore, an `awk' program
+looks like this:
+
+     PATTERN { ACTION }
+     PATTERN { ACTION }
+     ...
+
+* Menu:
+
+* Names::                       What name to use to find `awk'.
+* Running gawk::                How to run `gawk' programs; includes
+                                command line syntax.
+* Very Simple::                 A very simple example.
+* Two Rules::                   A less simple one-line example with two rules.
+* More Complex::                A more complex example.
+* Statements/Lines::            Subdividing or combining statements into
+                                lines.
+* Other Features::              Other Features of `awk'.
+* When::                        When to use `gawk' and when to use other
+                                things.
+
+
+File: gawk.info,  Node: Names,  Next: Running gawk,  Prev: Getting Started,  Up: Getting Started
+
+A Rose By Any Other Name
+========================
+
+   The `awk' language has evolved over the years. Full details are
+provided in *Note The Evolution of the `awk' Language: Language History.
+The language described in this Info file is often referred to as "new
+`awk'."
+
+   Because of this, many systems have multiple versions of `awk'.  Some
+systems have an `awk' utility that implements the original version of
+the `awk' language, and a `nawk' utility for the new version.  Others
+have an `oawk' for the "old `awk'" language, and plain `awk' for the
+new one.  Still others only have one version, usually the new one.(1)
+
+   All in all, this makes it difficult for you to know which version of
+`awk' you should run when writing your programs.  The best advice we
+can give here is to check your local documentation. Look for `awk',
+`oawk', and `nawk', as well as for `gawk'. Chances are, you will have
+some version of new `awk' on your system, and that is what you should
+use when running your programs.  (Of course, if you're reading this
+Info file, chances are good that you have `gawk'!)
+
+   Throughout this Info file, whenever we refer to a language feature
+that should be available in any complete implementation of POSIX `awk',
+we simply use the term `awk'.  When referring to a feature that is
+specific to the GNU implementation, we use the term `gawk'.
+
+   ---------- Footnotes ----------
+
+   (1) Often, these systems use `gawk' for their `awk' implementation!
+
+
+File: gawk.info,  Node: Running gawk,  Next: Very Simple,  Prev: Names,  Up: Getting Started
+
+How to Run `awk' Programs
+=========================
+
+   There are several ways to run an `awk' program.  If the program is
+short, it is easiest to include it in the command that runs `awk', like
+this:
+
+     awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ...
+
+where PROGRAM consists of a series of patterns and actions, as
+described earlier.  (The reason for the single quotes is described
+below, in *Note One-shot Throw-away `awk' Programs: One-shot.)
+
+   When the program is long, it is usually more convenient to put it in
+a file and run it with a command like this:
+
+     awk -f PROGRAM-FILE INPUT-FILE1 INPUT-FILE2 ...
+
+* Menu:
+
+* One-shot::                    Running a short throw-away `awk' program.
+* Read Terminal::               Using no input files (input from terminal
+                                instead).
+* Long::                        Putting permanent `awk' programs in
+                                files.
+* Executable Scripts::          Making self-contained `awk' programs.
+* Comments::                    Adding documentation to `gawk' programs.
+
+
+File: gawk.info,  Node: One-shot,  Next: Read Terminal,  Prev: Running gawk,  Up: Running gawk
+
+One-shot Throw-away `awk' Programs
+----------------------------------
+
+   Once you are familiar with `awk', you will often type in simple
+programs the moment you want to use them.  Then you can write the
+program as the first argument of the `awk' command, like this:
+
+     awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ...
+
+where PROGRAM consists of a series of PATTERNS and ACTIONS, as
+described earlier.
+
+   This command format instructs the "shell", or command interpreter,
+to start `awk' and use the PROGRAM to process records in the input
+file(s).  There are single quotes around PROGRAM so that the shell
+doesn't interpret any `awk' characters as special shell characters.
+They also cause the shell to treat all of PROGRAM as a single argument
+for `awk' and allow PROGRAM to be more than one line long.
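+
+   The effect of the quotes can be seen in a small experiment.  This is
+a sketch that assumes a POSIX-style shell; the input text and the
+variable names `single' and `double' are made up for the illustration.
+

```shell
# Single quotes: the shell leaves $2 alone, so awk sees { print $2 }.
single=$(echo "alpha beta" | awk '{ print $2 }')
echo "$single"

# Double quotes: the shell expands $2 itself before awk ever runs.
# With no positional parameters set, $2 expands to nothing, so awk
# sees { print  } and prints the whole line instead of one field.
set --    # clear the shell's positional parameters for a predictable demo
double=$(echo "alpha beta" | awk "{ print $2 }")
echo "$double"
```

+
+   The first command prints `beta'; the second prints `alpha beta',
+because the shell removed the `$2' before `awk' could see it.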
+
+   This format is also useful for running short or medium-sized `awk'
+programs from shell scripts, because it avoids the need for a separate
+file for the `awk' program.  A self-contained shell script is more
+reliable since there are no other files to misplace.
+
+   *Note Useful One Line Programs: One-liners, presents several short,
+self-contained programs.
+
+   As an interesting side point, the command
+
+     awk '/foo/' FILES ...
+
+is essentially the same as
+
+     egrep foo FILES ...
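+
+   A quick way to convince yourself of this is to run both commands on
+the same file and compare the results.  The sketch below assumes a
+POSIX-style shell and uses `grep -E', the POSIX spelling of `egrep';
+the file `sample.txt' and its contents are invented for the example.
+

```shell
# Build a small test file with two lines that contain "foo".
printf '%s\n' 'fooey 555-1234' 'camelot 555-0542' 'macfoo 555-6480' > sample.txt

# A pattern with no action prints every matching line, just as grep does.
awk_out=$(awk '/foo/' sample.txt)
grep_out=$(grep -E 'foo' sample.txt)

echo "$awk_out"
[ "$awk_out" = "$grep_out" ] && echo "same output"
```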
+
+
+File: gawk.info,  Node: Read Terminal,  Next: Long,  Prev: One-shot,  Up: Running gawk
+
+Running `awk' without Input Files
+---------------------------------
+
+   You can also run `awk' without any input files.  If you type the
+command line:
+
+     awk 'PROGRAM'
+
+then `awk' applies the PROGRAM to the "standard input", which usually
+means whatever you type on the terminal.  This continues until you
+indicate end-of-file by typing `Control-d'.  (On other operating
+systems, the end-of-file character may be different.  For example, on
+OS/2 and MS-DOS, it is `Control-z'.)
+
+   For example, the following program prints a friendly piece of advice
+(from Douglas Adams' `The Hitchhiker's Guide to the Galaxy'), to keep
+you from worrying about the complexities of computer programming
+(`BEGIN' is a feature we haven't discussed yet).
+
+     $ awk "BEGIN { print \"Don't Panic!\" }"
+     -| Don't Panic!
+
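+   This program does not read any input.  The `\' before each of the
+inner double quotes is necessary because of the shell's quoting rules:
+the program text contains both a single quote (the apostrophe in
+`Don't') and double quotes, so it is enclosed in double quotes and the
+inner double quotes must be escaped.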
+
+   This next simple `awk' program emulates the `cat' utility; it copies
+whatever you type at the keyboard to its standard output. (Why this
+works is explained shortly.)
+
+     $ awk '{ print }'
+     Now is the time for all good men
+     -| Now is the time for all good men
+     to come to the aid of their country.
+     -| to come to the aid of their country.
+     Four score and seven years ago, ...
+     -| Four score and seven years ago, ...
+     What, me worry?
+     -| What, me worry?
+     Control-d
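+
+   The standard input does not have to be the terminal.  As a sketch,
+assuming a POSIX-style shell, the same program can read text that
+another command writes into a pipe; the two input lines here are
+arbitrary, and `out' is just a name chosen for the example.
+

```shell
# No input files are named, so awk reads its standard input,
# which here is the pipe from printf rather than the keyboard.
out=$(printf 'Now is the time\nfor all good men\n' | awk '{ print }')
echo "$out"
```

+
+   In this case end-of-file arrives when `printf' finishes, so there is
+no need to type `Control-d'.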
+
+
+File: gawk.info,  Node: Long,  Next: Executable Scripts,  Prev: Read Terminal,  Up: Running gawk
+
+Running Long Programs
+---------------------
+
+   Sometimes your `awk' programs can be very long.  In this case it is
+more convenient to put the program into a separate file.  To tell `awk'
+to use that file for its program, you type:
+
+     awk -f SOURCE-FILE INPUT-FILE1 INPUT-FILE2 ...
+
+   The `-f' instructs the `awk' utility to get the `awk' program from
+the file SOURCE-FILE.  Any file name can be used for SOURCE-FILE.  For
+example, you could put the program:
+
+     BEGIN { print "Don't Panic!" }
+
+into the file `advice'.  Then this command:
+
+     awk -f advice
+
+does the same thing as this one:
+
+     awk "BEGIN { print \"Don't Panic!\" }"
+
+which was explained earlier (*note Running `awk' without Input Files:
+Read Terminal.).  Note that you don't usually need single quotes around
+the file name that you specify with `-f', because most file names don't
+contain any of the shell's special characters.  Notice that in
+`advice', the `awk' program did not have single quotes around it.  The
+quotes are only needed for programs that are provided on the `awk'
+command line.
+
+   If you want to identify your `awk' program files clearly as such,
+you can add the extension `.awk' to the file name.  This doesn't affect
+the execution of the `awk' program, but it does make "housekeeping"
+easier.
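+
+   Putting these pieces together, a session might look like the
+following sketch.  It assumes a POSIX-style shell; the file name
+`advice.awk' simply follows the `.awk' suggestion above, and `msg' is a
+name chosen for the example.
+

```shell
# Store the program in a file.  No quotes are needed around the awk
# program itself, because the shell never interprets the file's contents
# (the quoted 'EOF' delimiter turns off expansion in the here document).
cat > advice.awk <<'EOF'
BEGIN { print "Don't Panic!" }
EOF

# Run it with -f; no input files are needed for a BEGIN-only program.
msg=$(awk -f advice.awk)
echo "$msg"
```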
+
+
+File: gawk.info,  Node: Executable Scripts,  Next: Comments,  Prev: Long,  Up: Running gawk
+
+Executable `awk' Programs
+-------------------------
+
+   Once you have learned `awk', you may want to write self-contained
+`awk' scripts, using the `#!' script mechanism.  You can do this on
+many Unix systems(1) (and someday on the GNU system).
+
+   For example, you could update the file `advice' to look like this:
+
+     #! /bin/awk -f
+     
+     BEGIN    { print "Don't Panic!" }
+
+After making this file executable (with the `chmod' utility), you can
+simply type `advice' at the shell, and the system will arrange to run
+`awk'(2) as if you had typed `awk -f advice'.
+
+     $ advice
+     -| Don't Panic!
+
+Self-contained `awk' scripts are useful when you want to write a
+program which users can invoke without their having to know that the
+program is written in `awk'.
+
+   Some older systems do not support the `#!' mechanism. You can get a
+similar effect using a regular shell script.  It would look something
+like this:
+
+     : The colon ensures execution by the standard shell.
+     awk 'PROGRAM' "$@"
+
+   Using this technique, it is _vital_ to enclose the PROGRAM in single
+quotes to protect it from interpretation by the shell.  If you omit the
+quotes, only a shell wizard can predict the results.
+
+   The `"$@"' causes the shell to forward all the command line
+arguments to the `awk' program, without interpretation.  The first
+line, which starts with a colon, is used so that this shell script will
+work even if invoked by a user who uses the C shell.  (Not all older
+systems obey this convention, but many do.)
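+
+   Here is a sketch of such a wrapper in action.  It is an illustration
+only: the script name `firstfield', the data file `two words.txt', and
+the variable `fields' are hypothetical, and the script starts with
+`#! /bin/sh' instead of a colon, on the assumption that your system
+supports the `#!' mechanism.
+

```shell
# Write the wrapper script; the quoted 'EOF' keeps $1 and $@ literal.
cat > firstfield <<'EOF'
#! /bin/sh
# Hypothetical wrapper: print the first field of every input line.
awk '{ print $1 }' "$@"
EOF
chmod +x firstfield

# A file name containing a space shows that "$@" passes each
# argument through to awk intact, without splitting it.
printf 'aardvark 555-5553\nalpo-net 555-3412\n' > "two words.txt"
fields=$(./firstfield "two words.txt")
echo "$fields"
```

+
+   Had the script used an unquoted `$*' in place of `"$@"', the file
+name would have been split at the space, and `awk' would have looked
+for two files named `two' and `words.txt'.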
+
+   ---------- Footnotes ----------
+
+   (1) The `#!' mechanism works on Linux systems, Unix systems derived
+from Berkeley Unix, System V Release 4, and some System V Release 3
+systems.
+
+   (2) The line beginning with `#!' lists the full file name of an
+interpreter to be run, and an optional initial command line argument to
+pass to that interpreter.  The operating system then runs the
+interpreter with the given argument and the full argument list of the
+executed program.  The first argument in the list is the full file name
+of the `awk' program.  The rest of the argument list will either be
+options to `awk', or data files, or both.
+
+
+File: gawk.info,  Node: Comments,  Prev: Executable Scripts,  Up: Running gawk
+
+Comments in `awk' Programs
+--------------------------
+
+   A "comment" is some text that is included in a program for the sake
+of human readers; it is not really part of the program.  Comments can
+explain what the program does, and how it works.  Nearly all
+programming languages have provisions for comments, because programs are
+typically hard to understand without the extra help comments provide.
+
+   In the `awk' language, a comment starts with the sharp sign
+character, `#', and continues to the end of the line.  The `#' does not
+have to be the first character on the line. The `awk' language ignores
+the rest of a line following a sharp sign.  For example, we could have
+put the following into `advice':
+
+     # This program prints a nice friendly message.  It helps
+     # keep novice users from being afraid of the computer.
+     BEGIN    { print "Don't Panic!" }
+
+   You can put comment lines into keyboard-composed throw-away `awk'
+programs also, but this usually isn't very useful; the purpose of a
+comment is to help you or another person understand the program at a
+later time.
+
+
+File: gawk.info,  Node: Very Simple,  Next: Two Rules,  Prev: Running gawk,  Up: Getting Started
+
+A Very Simple Example
+=====================
+
+   The following command runs a simple `awk' program that searches the
+input file `BBS-list' for the string of characters: `foo'.  (A string
+of characters is usually called a "string".  The term "string" is
+perhaps based on similar usage in English, such as "a string of
+pearls," or, "a string of cars in a train.")
+
+     awk '/foo/ { print $0 }' BBS-list
+
+When lines containing `foo' are found, they are printed, because
+`print $0' means print the current line.  (Just `print' by itself means
+the same thing, so we could have written that instead.)
+
+   You will notice that slashes, `/', surround the string `foo' in the
+`awk' program.  The slashes indicate that `foo' is a pattern to search
+for.  This type of pattern is called a "regular expression", and is
+covered in more detail later (*note Regular Expressions: Regexp.).  The
+pattern is allowed to match parts of words.  There are single-quotes
+around the `awk' program so that the shell won't interpret any of it as
+special shell characters.
+
+   Here is what this program prints:
+
+     $ awk '/foo/ { print $0 }' BBS-list
+     -| fooey        555-1234     2400/1200/300     B
+     -| foot         555-6699     1200/300          B
+     -| macfoo       555-6480     1200/300          A
+     -| sabafoo      555-2127     1200/300          C
+
+   In an `awk' rule, either the pattern or the action can be omitted,
+but not both.  If the pattern is omitted, then the action is performed
+for _every_ input line.  If the action is omitted, the default action
+is to print all lines that match the pattern.
+
+   Thus, we could leave out the action (the `print' statement and the
+curly braces) in the above example, and the result would be the same:
+all lines matching the pattern `foo' would be printed.  By comparison,
+omitting the `print' statement but retaining the curly braces makes an
+empty action that does nothing; then no lines would be printed.
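+
+   The three cases can be compared side by side.  This sketch assumes a
+POSIX-style shell; the file `boards.txt' is a made-up, three-line
+stand-in for `BBS-list', and the variable names are chosen only for
+the example.
+

```shell
printf '%s\n' 'fooey 555-1234' 'camelot 555-0542' 'macfoo 555-6480' > boards.txt

# Action omitted: each matching line is printed.
no_action=$(awk '/foo/' boards.txt)

# Empty action: the rule matches, but nothing is printed.
empty_action=$(awk '/foo/ { }' boards.txt)

# Pattern omitted: the action runs for every input line.
no_pattern=$(awk '{ print $1 }' boards.txt)

echo "action omitted:  $(printf '%s\n' "$no_action" | wc -l) lines"
echo "empty action:    '$empty_action'"
echo "pattern omitted: $(printf '%s\n' "$no_pattern" | wc -l) lines"
```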
+
+
+File: gawk.info,  Node: Two Rules,  Next: More Complex,  Prev: Very Simple,  Up: Getting Started
+
+An Example with Two Rules
+=========================
+
+   The `awk' utility reads the input files one line at a time.  For
+each line, `awk' tries the patterns of each of the rules.  If several
+patterns match then several actions are run, in the order in which they
+appear in the `awk' program.  If no patterns match, then no actions are
+run.
+
+   After processing all the rules (perhaps none) that match the line,
+`awk' reads the next line (however, *note The `next' Statement: Next
+Statement., and also *note The `nextfile' Statement: Nextfile
+Statement.).  This continues until the end of the file is reached.
+
+   For example, the `awk' program:
+
+     /12/  { print $0 }
+     /21/  { print $0 }
+
+contains two rules.  The first rule has the string `12' as the pattern
+and `print $0' as the action.  The second rule has the string `21' as
+the pattern and also has `print $0' as the action.  Each rule's action
+is enclosed in its own pair of braces.
+
+   This `awk' program prints every line that contains the string `12'
+_or_ the string `21'.  If a line contains both strings, it is printed
+twice, once by each rule.
+
+   This is what happens if we run this program on our two sample data
+files, `BBS-list' and `inventory-shipped', as shown here:
+
+     $ awk '/12/ { print $0 }
+     >      /21/ { print $0 }' BBS-list inventory-shipped
+     -| aardvark     555-5553     1200/300          B
+     -| alpo-net     555-3412     2400/1200/300     A
+     -| barfly       555-7685     1200/300          A
+     -| bites        555-1675     2400/1200/300     A
+     -| core         555-2912     1200/300          C
+     -| fooey        555-1234     2400/1200/300     B
+     -| foot         555-6699     1200/300          B
+     -| macfoo       555-6480     1200/300          A
+     -| sdace        555-3430     2400/1200/300     A
+     -| sabafoo      555-2127     1200/300          C
+     -| sabafoo      555-2127     1200/300          C
+     -| Jan  21  36  64 620
+     -| Apr  21  70  74 514
+
+Note how the line in `BBS-list' beginning with `sabafoo' was printed
+twice, once for each rule.
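+
+   If the duplicate is not wanted, one possible approach is to combine
+the two patterns into a single rule with the Boolean `||' operator,
+which standard `awk' accepts in patterns.  The following sketch assumes
+a POSIX-style shell; the file `lines.txt' is a made-up three-line
+excerpt of the sample data, and the variable names are illustrative.
+

```shell
printf '%s\n' 'core 555-2912' 'sabafoo 555-2127' 'Jan 21 36' > lines.txt

# Two rules: a line that matches both patterns is printed twice.
two_rules=$(awk '/12/ { print $0 }
/21/ { print $0 }' lines.txt)

# One rule with ||: each line is printed at most once.
one_rule=$(awk '/12/ || /21/ { print $0 }' lines.txt)

printf '%s\n' "$two_rules" | wc -l    # four lines: sabafoo appears twice
printf '%s\n' "$one_rule" | wc -l     # three lines
```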
+
+
+File: gawk.info,  Node: More Complex,  Next: Statements/Lines,  Prev: Two Rules,  Up: Getting Started
+
+A More Complex Example
+======================
+
+   Here is an example to give you an idea of what typical `awk'
+programs do.  This example shows how `awk' can be used to summarize,
+select, and rearrange the output of another utility.  It uses features
+that haven't been covered yet, so don't worry if you don't understand
+all the details.
+
+     ls -lg | awk '$6 == "Nov" { sum += $5 }
+                  END { print sum }'
+
+   This command prints the total number of bytes in all the files in the
+current directory that were last modified in November (of any year).
+(In the C shell you would need to type a semicolon and then a backslash
+at the end of the first line; in a POSIX-compliant shell, such as the
+Bourne shell or Bash, the GNU Bourne-Again shell, you can type the
+example as shown.)
+
+   The `ls -lg' part of this example is a system command that gives you
+a listing of the files in a directory, including file size and the date
+the file was last modified. Its output looks like this:
+
+     -rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile
+     -rw-r--r--  1 arnold   user  10809 Nov  7 13:03 gawk.h
+     -rw-r--r--  1 arnold   user    983 Apr 13 12:14 gawk.tab.h
+     -rw-r--r--  1 arnold   user  31869 Jun 15 12:20 gawk.y
+     -rw-r--r--  1 arnold   user  22414 Nov  7 13:03 gawk1.c
+     -rw-r--r--  1 arnold   user  37455 Nov  7 13:03 gawk2.c
+     -rw-r--r--  1 arnold   user  27511 Dec  9 13:07 gawk3.c
+     -rw-r--r--  1 arnold   user   7989 Nov  7 13:03 gawk4.c
+
+The first field contains read-write permissions, the second field
+contains the number of links to the file, and the third field
+identifies the owner of the file. The fourth field identifies the group
+of the file.  The fifth field contains the size of the file in bytes.
+The sixth, seventh and eighth fields contain the month, day, and time,
+respectively, that the file was last modified.  Finally, the ninth field
+contains the name of the file.
+
+   The `$6 == "Nov"' in our `awk' program is an expression that tests
+whether the sixth field of the output from `ls -lg' matches the string
+`Nov'.  Each time a line has the string `Nov' for its sixth field, the
+action `sum += $5' is performed.  This adds the fifth field (the file
+size) to the variable `sum'.  As a result, when `awk' has finished
+reading all the input lines, `sum' is the sum of the sizes of files
+whose lines matched the pattern.  (This works because `awk' variables
+are automatically initialized to zero.)
+
+   After the last line of output from `ls' has been processed, the
+`END' rule is executed, and the value of `sum' is printed.  In this
+example, the value of `sum' would be 80600.
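+
+   Since the format of `ls' output varies from system to system, a
+reproducible way to check this figure is to save the listing shown
+above in a file and run the same program on it.  This is a sketch
+assuming a POSIX-style shell; `listing.txt' and `total' are names
+chosen for the example.
+

```shell
# Save the sample listing from the text in a file.
cat > listing.txt <<'EOF'
-rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile
-rw-r--r--  1 arnold   user  10809 Nov  7 13:03 gawk.h
-rw-r--r--  1 arnold   user    983 Apr 13 12:14 gawk.tab.h
-rw-r--r--  1 arnold   user  31869 Jun 15 12:20 gawk.y
-rw-r--r--  1 arnold   user  22414 Nov  7 13:03 gawk1.c
-rw-r--r--  1 arnold   user  37455 Nov  7 13:03 gawk2.c
-rw-r--r--  1 arnold   user  27511 Dec  9 13:07 gawk3.c
-rw-r--r--  1 arnold   user   7989 Nov  7 13:03 gawk4.c
EOF

# Add up field 5 (size) for lines whose field 6 (month) is "Nov".
total=$(awk '$6 == "Nov" { sum += $5 }
             END { print sum }' listing.txt)
echo "$total"
```

+
+   The five November files contribute 1933 + 10809 + 22414 + 37455 +
+7989 bytes, which is the 80600 mentioned above.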
+
+   These more advanced `awk' techniques are covered in later sections
+(*note Overview of Actions: Action Overview.).  Before you can move on
+to more advanced `awk' programming, you have to know how `awk'
+interprets your input and displays your output.  By manipulating fields
+and using `print' statements, you can produce some very useful and
+impressive looking reports.
+
+
+File: gawk.info,  Node: Statements/Lines,  Next: Other Features,  Prev: More Complex,  Up: Getting Started
+
+`awk' Statements Versus Lines
+=============================
+
+   Most often, each line in an `awk' program is a separate statement or
+separate rule, like this:
+
+     awk '/12/  { print $0 }
+          /21/  { print $0 }' BBS-list inventory-shipped
+
+   However, `gawk' will ignore newlines after any of the following:
+
+     ,    {    ?    :    ||    &&    do    else
+
+A newline at any other point is considered the end of the statement.
+(Splitting lines after `?' and `:' is a minor `gawk' extension.  The
+`?' and `:' referred to here are those of the three-operand conditional
+expression described in *Note Conditional Expressions: Conditional Exp.)
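+
+   As a sketch of how this looks in practice, the program below breaks
+lines after `{', `&&', and a comma, none of which ends the statement.
+It assumes a POSIX-style shell; the variable `x' and the message text
+are invented for the example.
+

```shell
out=$(awk 'BEGIN {
    x = 5
    # The newline after && does not end the condition.
    if (x > 1 &&
        x < 10)
        # The newline after a comma does not end the print statement.
        print "x is", x,
              "and in range"
}')
echo "$out"
```

+
+   This prints `x is 5 and in range' on a single line.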
+
+   If you would like to split a single statement into two lines at a
+point where a newline would terminate it, you can "continue" it by
+ending the first line with a backslash character, `\'.  The backslash
+must be the final character on the line to be recognized as a
+continuation character.  This is allowed absolutely anywhere in the
+statement, even in the middle of a string or regular expression.  For
+example:
+
+     awk '/This regular expression is too long, so continue it\
+      on the next line/ { print $1 }'
+
+We have generally not used backslash continuation in the sample programs
+in this Info file.  Since in `gawk' there is no limit on the length of
+a line, it is never strictly necessary; it just makes programs more
+readable.  For this same reason, as well as for clarity, we have kept
+most statements short in the sample programs presented throughout the
+Info file.  Backslash continuation is most useful when your `awk'
+program is in a separate source file, instead of typed in on the
+command line.  You should also note that many `awk' implementations are
+more particular about where you may use backslash continuation. For
+example, they may not allow you to split a string constant using
+backslash continuation.  Thus, for maximal portability of your `awk'
+programs, it is best not to split your lines in the middle of a regular
+expression or a string.
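+
+   A portable use of continuation, then, splits a statement between
+tokens.  The following sketch assumes a POSIX-compliant shell, where a
+backslash inside single quotes is passed to `awk' unchanged; `out' is a
+name chosen for the example.
+

```shell
# The backslash is the last character on the line, so awk joins the
# two lines into the single statement: print "hello, world"
out=$(awk 'BEGIN { print \
    "hello, world" }')
echo "$out"
```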
+
+   *Caution: backslash continuation does not work as described above
+with the C shell.*  Continuation with backslash works for `awk'
+programs in files, and also for one-shot programs _provided_ you are
+using a POSIX-compliant shell, such as the Bourne shell or Bash, the
+GNU Bourne-Again shell.  But the C shell (`csh') behaves differently!
+There, you must use two backslashes in a row, followed by a newline.
+Note also that when using the C shell, _every_ newline in your `awk'
+program must be escaped with a backslash. To illustrate:
+
+     % awk 'BEGIN { \
+     ?   print \\
+     ?       "hello, world" \
+     ? }'
+     -| hello, world
+
+Here, the `%' and `?' are the C shell's primary and secondary prompts,
+analogous to the standard shell's `$' and `>'.
+
+   `awk' is a line-oriented language.  Each rule's action has to begin
+on the same line as the pattern.  To have the pattern and action on
+separate lines, you _must_ use backslash continuation--there is no
+other way.
+
+   Note that backslash continuation and comments do not mix. As soon as
+`awk' sees the `#' that starts a comment, it ignores _everything_ on
+the rest of the line. For example:
+
+     $ gawk 'BEGIN { print "dont panic" # a friendly \
+     >                                    BEGIN rule
+     > }'
+     error--> gawk: cmd. line:2:                BEGIN rule
+     error--> gawk: cmd. line:2:                ^ parse error
+
+Here, it looks like the backslash would continue the comment onto the
+next line. However, the backslash-newline combination is never even
+noticed, since it is "hidden" inside the comment. Thus, the `BEGIN' is
+noted as a syntax error.
+
+   When `awk' statements within one rule are short, you might want to
+put more than one of them on a line.  You do this by separating the
+statements with a semicolon, `;'.
+
+   This also applies to the rules themselves.  Thus, the previous
+program could have been written:
+
+     /12/ { print $0 } ; /21/ { print $0 }
+
+*Note:* the requirement that rules on the same line must be separated
+with a semicolon was not in the original `awk' language; it was added
+for consistency with the treatment of statements within an action.
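+
+   To see the one-line form in action, here is a small sketch that
+feeds a few made-up records to the two rules.  The input and the shell
+variable `out' are assumptions for illustration only:
+
```shell
# Two rules on one line, separated by a semicolon.
out=$(printf '12\n21\n1221\n33\n' |
      awk '/12/ { print $0 } ; /21/ { print $0 }')
echo "$out"
```
+
+A record such as `1221' matches both rules, so it is printed twice,
+once by each rule.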
+
+
+File: gawk.info,  Node: Other Features,  Next: When,  Prev: Statements/Lines,  Up: Getting Started
+
+Other Features of `awk'
+=======================
+
+   The `awk' language provides a number of predefined, or built-in,
+variables, which your programs can use to get information from `awk'.
+There are other variables your program can set to control how `awk'
+processes your data.
+
+   In addition, `awk' provides a number of built-in functions for doing
+common computational and string-related operations.
+
+   As we develop our presentation of the `awk' language, we introduce
+most of the variables and many of the functions. They are defined
+systematically in *Note Built-in Variables::, and *Note Built-in
+Functions: Built-in.
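+
+   As a brief preview, this sketch uses two built-in variables (`NR',
+the record number, and `NF', the number of fields) together with the
+built-in function `toupper'.  The input text and the shell variable
+`out' are made up for illustration:
+
```shell
# Print the record number, the field count, and the first field in
# upper case for each input line.
out=$(printf 'a b c\nd e\n' | awk '{ print NR, NF, toupper($1) }')
echo "$out"
```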
+
+
+File: gawk.info,  Node: When,  Prev: Other Features,  Up: Getting Started
+
+When to Use `awk'
+=================
+
+   You might wonder how `awk' could be useful to you.  By combining
+utility programs, advanced patterns, field separators, arithmetic
+statements, and other selection criteria, you can produce much more
+complex output.
+The `awk' language is very useful for producing reports from large
+amounts of raw data, such as summarizing information from the output of
+other utility programs like `ls'.  (*Note A More Complex Example: More
+Complex.)
+
+   Programs written with `awk' are usually much smaller than they would
+be in other languages.  This makes `awk' programs easy to compose and
+use.  Often, `awk' programs can be quickly composed at your terminal,
+used once, and thrown away.  Since `awk' programs are interpreted, you
+can avoid the (usually lengthy) compilation part of the typical
+edit-compile-test-debug cycle of software development.
+
+   Complex programs have been written in `awk', including a complete
+retargetable assembler for eight-bit microprocessors (*note Glossary::,
+for more information) and a microcode assembler for a special purpose
+Prolog computer.  However, `awk''s capabilities are strained by tasks of
+such complexity.
+
+   If you find yourself writing `awk' scripts of more than, say, a few
+hundred lines, you might consider using a different programming
+language.  Emacs Lisp is a good choice if you need sophisticated string
+or pattern matching capabilities.  The shell is also good at string and
+pattern matching; in addition, it allows powerful use of the system
+utilities.  More conventional languages, such as C, C++, and Lisp, offer
+better facilities for system programming and for managing the complexity
+of large programs.  Programs in these languages may require more lines
+of source code than the equivalent `awk' programs, but they are easier
+to maintain and usually run more efficiently.
+
+
+File: gawk.info,  Node: One-liners,  Next: Regexp,  Prev: Getting Started,  Up: Top
+
+Useful One Line Programs
+************************
+
+   Many useful `awk' programs are short, just a line or two.  Here is a
+collection of useful, short programs to get you started.  Some of these
+programs contain constructs that haven't been covered yet.  The
+description of the program will give you a good idea of what is going
+on, but please read the rest of the Info file to become an `awk' expert!
+
+   Most of the examples use a data file named `data'.  This is just a
+placeholder; if you were to use these programs yourself, you would
+substitute your own file names for `data'.
+
+   Since you are reading this in Info, each line of the example code is
+enclosed in quotes, to represent text that you would type literally.
+The examples themselves represent shell commands that use single quotes
+to keep the shell from interpreting the contents of the program.  When
+reading the examples, focus on the text between the open and close
+quotes.
+
+`awk '{ if (length($0) > max) max = length($0) }'
+`     END { print max }' data'
+     This program prints the length of the longest input line.
+
+`awk 'length($0) > 80' data'
+     This program prints every line that is longer than 80 characters.
+     The sole rule has a relational expression as its pattern, and has
+     no action (so the default action, printing the record, is used).
+
+`expand data | awk '{ if (x < length()) x = length() }'
+`                   END { print "maximum line length is " x }''
+     This program prints the length of the longest line in `data'.  The
+     input is processed by the `expand' program to change tabs into
+     spaces, so the widths compared are actually the right-margin
+     columns.
+
+`awk 'NF > 0' data'
+     This program prints every line that has at least one field.  This
+     is an easy way to delete blank lines from a file (or rather, to
+     create a new file similar to the old file but from which the blank
+     lines have been deleted).
+
+`awk 'BEGIN { for (i = 1; i <= 7; i++)'
+`               print int(101 * rand()) }''
+     This program prints seven random numbers from zero to 100,
+     inclusive.
+
+`ls -lg FILES | awk '{ x += $5 } ; END { print "total bytes: " x }''
+     This program prints the total number of bytes used by FILES.
+
+`ls -lg FILES | awk '{ x += $5 }'
+`                 END { print "total K-bytes: " (x + 1023)/1024 }''
+     This program prints the total number of kilobytes used by FILES.
+
+`awk -F: '{ print $1 }' /etc/passwd | sort'
+     This program prints a sorted list of the login names of all users.
+
+`awk 'END { print NR }' data'
+     This program counts lines in a file.
+
+`awk 'NR % 2 == 0' data'
+     This program prints the even numbered lines in the data file.  If
+     you were to use the expression `NR % 2 == 1' instead, it would
+     print the odd numbered lines.
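+
+   A few of the programs above can be tried without a data file by
+piping in some text.  In this sketch the sample text and the shell
+variables `sample', `nonblank', `even', and `count' are illustrative
+names, not part of `awk':
+
```shell
# Made-up input: four lines, the third one blank.
sample='one two
three

four five six'

nonblank=$(printf '%s\n' "$sample" | awk 'NF > 0')         # drop blank lines
even=$(printf '%s\n' "$sample" | awk 'NR % 2 == 0')        # lines 2 and 4
count=$(printf '%s\n' "$sample" | awk 'END { print NR }')  # number of lines

printf '%s\n' "$nonblank" "$even" "$count"
```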
+
+
+File: gawk.info,  Node: Regexp,  Next: Reading Files,  Prev: One-liners,  Up: Top
+
+Regular Expressions
+*******************
+
+   A "regular expression", or "regexp", is a way of describing a set of
+strings.  Because regular expressions are such a fundamental part of
+`awk' programming, their format and use deserve a separate chapter.
+
+   A regular expression enclosed in slashes (`/') is an `awk' pattern
+that matches every input record whose text belongs to that set.
+
+   The simplest regular expression is a sequence of letters, numbers, or
+both.  Such a regexp matches any string that contains that sequence.
+Thus, the regexp `foo' matches any string containing `foo'.  Therefore,
+the pattern `/foo/' matches any input record containing the three
+characters `foo', _anywhere_ in the record.  Other kinds of regexps let
+you specify more complicated classes of strings.
+
+* Menu:
+
+* Regexp Usage::                How to Use Regular Expressions.
+* Escape Sequences::            How to write non-printing characters.
+* Regexp Operators::            Regular Expression Operators.
+* GNU Regexp Operators::        Operators specific to GNU software.
+* Case-sensitivity::            How to do case-insensitive matching.
+* Leftmost Longest::            How much text matches.
+* Computed Regexps::            Using Dynamic Regexps.
+
+
+File: gawk.info,  Node: Regexp Usage,  Next: Escape Sequences,  Prev: Regexp,  Up: Regexp
+
+How to Use Regular Expressions
+==============================
+
+   A regular expression can be used as a pattern by enclosing it in
+slashes.  Then the regular expression is tested against the entire text
+of each record.  (Normally, it only needs to match some part of the
+text in order to succeed.)  For example, this prints the second field
+of each record that contains the three characters `foo' anywhere in it:
+
+     $ awk '/foo/ { print $2 }' BBS-list
+     -| 555-1234
+     -| 555-6699
+     -| 555-6480
+     -| 555-2127
+
+   Regular expressions can also be used in matching expressions.  These
+expressions allow you to specify the string to match against; it need
+not be the entire current input record.  The two operators, `~' and
+`!~', perform regular expression comparisons.  Expressions using these
+operators can be used as patterns or in `if', `while', `for', and `do'
+statements.  (*Note Control Statements in Actions: Statements.)
+
+`EXP ~ /REGEXP/'
+     This is true if the expression EXP (taken as a string) is matched
+     by REGEXP.  The following example matches, or selects, all input
+     records with the upper-case letter `J' somewhere in the first
+     field:
+
+          $ awk '$1 ~ /J/' inventory-shipped
+          -| Jan  13  25  15 115
+          -| Jun  31  42  75 492
+          -| Jul  24  34  67 436
+          -| Jan  21  36  64 620
+
+     So does this:
+
+          awk '{ if ($1 ~ /J/) print }' inventory-shipped
+
+`EXP !~ /REGEXP/'
+     This is true if the expression EXP (taken as a character string)
+     is _not_ matched by REGEXP.  The following example matches, or
+     selects, all input records whose first field _does not_ contain
+     the upper-case letter `J':
+
+          $ awk '$1 !~ /J/' inventory-shipped
+          -| Feb  15  32  24 226
+          -| Mar  15  24  34 228
+          -| Apr  31  52  63 420
+          -| May  16  34  29 208
+          ...
+
+   When a regexp is written enclosed in slashes, like `/foo/', we call
+it a "regexp constant", much like `5.27' is a numeric constant, and
+`"foo"' is a string constant.
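+
+   The matching operators also work on ordinary strings, not just on
+fields.  The following sketch tests a made-up string held in the
+variable `s'; the names `s' and `out' are assumptions for illustration:
+
```shell
# Apply the ~ and !~ operators to a string variable inside if statements.
out=$(awk 'BEGIN {
    s = "foobar"
    if (s ~ /foo/)   print "contains foo"
    if (s !~ /^bar/) print "does not start with bar"
}')
echo "$out"
```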
+
+
+File: gawk.info,  Node: Escape Sequences,  Next: Regexp Operators,  Prev: Regexp Usage,  Up: Regexp
+
+Escape Sequences
+================
+
+   Some characters cannot be included literally in string constants
+(`"foo"') or regexp constants (`/foo/').  You represent them instead
+with "escape sequences", which are character sequences beginning with a
+backslash (`\').
+
+   One use of an escape sequence is to include a double-quote character
+in a string constant.  Since a plain double-quote would end the string,
+you must use `\"' to represent an actual double-quote character as a
+part of the string.  For example:
+
+     $ awk 'BEGIN { print "He said \"hi!\" to her." }'
+     -| He said "hi!" to her.
+
+   The backslash character itself is another character that cannot be
+included normally; you write `\\' to put one backslash in the string or
+regexp.  Thus, the string whose contents are the two characters `"' and
+`\' must be written `"\"\\"'.
+
+   Another use of backslash is to represent unprintable characters such
+as tab or newline.  While there is nothing to stop you from entering
+most unprintable characters directly in a string constant or regexp
+constant, they may look ugly.
+
+   Here is a table of all the escape sequences used in `awk', and what
+they represent. Unless noted otherwise, all of these escape sequences
+apply to both string constants and regexp constants.
+
+`\\'
+     A literal backslash, `\'.
+
+`\a'
+     The "alert" character, `Control-g', ASCII code 7 (BEL).
+
+`\b'
+     Backspace, `Control-h', ASCII code 8 (BS).
+
+`\f'
+     Formfeed, `Control-l', ASCII code 12 (FF).
+
+`\n'
+     Newline, `Control-j', ASCII code 10 (LF).
+
+`\r'
+     Carriage return, `Control-m', ASCII code 13 (CR).
+
+`\t'
+     Horizontal tab, `Control-i', ASCII code 9 (HT).
+
+`\v'
+     Vertical tab, `Control-k', ASCII code 11 (VT).
+
+`\NNN'
+     The octal value NNN, where NNN are one to three digits between `0'
+     and `7'.  For example, the code for the ASCII ESC (escape)
+     character is `\033'.
+
+`\xHH...'
+     The hexadecimal value HH, where HH are hexadecimal digits (`0'
+     through `9' and either `A' through `F' or `a' through `f').  Like
+     the same construct in ANSI C, the escape sequence continues until
+     the first non-hexadecimal digit is seen.  However, using more than
+     two hexadecimal digits produces undefined results. (The `\x'
+     escape sequence is not allowed in POSIX `awk'.)
+
+`\/'
+     A literal slash (necessary for regexp constants only).  You use
+     this when you wish to write a regexp constant that contains a
+     slash. Since the regexp is delimited by slashes, you need to
+     escape the slash that is part of the pattern, in order to tell
+     `awk' to keep processing the rest of the regexp.
+
+`\"'
+     A literal double-quote (necessary for string constants only).  You
+     use this when you wish to write a string constant that contains a
+     double-quote. Since the string is delimited by double-quotes, you
+     need to escape the quote that is part of the string, in order to
+     tell `awk' to keep processing the rest of the string.
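+
+   A few entries from the table can be checked with a short sketch
+like the one below.  The shell variable names are made up, and the
+octal example assumes an ASCII-based character set, in which octal 101
+is the letter `A':
+
```shell
# The two characters " and \ written as a string constant.
quote_backslash=$(awk 'BEGIN { print "\"\\" }')

# A horizontal tab between two letters.
tabbed=$(awk 'BEGIN { print "a\tb" }')

# An octal escape; 101 octal is 65 decimal.
octal=$(awk 'BEGIN { print "\101" }')

printf '%s\n' "$quote_backslash" "$tabbed" "$octal"
```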
+
+   In `gawk', there are additional two-character sequences that begin
+with backslash that have special meaning in regexps.  *Note Additional
+Regexp Operators Only in `gawk': GNU Regexp Operators.
+
+   In a string constant, what happens if you place a backslash before
+something that is not one of the characters listed above?  POSIX `awk'
+purposely leaves this case undefined.  There are two choices.
+
+   * Strip the backslash out.  This is what Unix `awk' and `gawk' both
+     do.  For example, `"a\qc"' is the same as `"aqc"'.
+
+   * Leave the backslash alone.  Some other `awk' implementations do
+     this.  In such implementations, `"a\qc"' is the same as if you had
+     typed `"a\\qc"'.
+
+   In a regexp, a backslash before any character that is not in the
+above table, and not listed in *Note Additional Regexp Operators Only
+in `gawk': GNU Regexp Operators, means that the next character should
+be taken literally, even if it would normally be a regexp operator.
+E.g., `/a\+b/' matches the three characters `a+b'.
+
+   For complete portability, do not use a backslash before any
+character not listed in the table above.
+
+   Another interesting question arises. Suppose you use an octal or
+hexadecimal escape to represent a regexp metacharacter (*note Regular
+Expression Operators: Regexp Operators.).  Does `awk' treat the
+character as a literal character, or as a regexp operator?
+
+   It turns out that historically, such characters were taken literally
+(d.c.).  However, the POSIX standard indicates that they should be
+treated as real metacharacters, and this is what `gawk' does.  In
+compatibility mode (*note Command Line Options: Options.), however,
+`gawk' treats the characters represented by octal and hexadecimal
+escape sequences literally when used in regexp constants.  Thus, in
+that mode, `/a\52b/' is equivalent to `/a\*b/'.
+
+   To summarize:
+
+  1. The escape sequences in the table above are always processed first,
+     for both string constants and regexp constants. This happens very
+     early, as soon as `awk' reads your program.
+
+  2. `gawk' processes both regexp constants and dynamic regexps (*note
+     Using Dynamic Regexps: Computed Regexps.), for the special
+     operators listed in *Note Additional Regexp Operators Only in
+     `gawk': GNU Regexp Operators.
+
+  3. A backslash before any other character means to treat that
+     character literally.
+
+
+File: gawk.info,  Node: Regexp Operators,  Next: GNU Regexp Operators,  Prev: Escape Sequences,  Up: Regexp
+
+Regular Expression Operators
+============================
+
+   You can combine regular expressions with the following characters,
+called "regular expression operators", or "metacharacters", to increase
+the power and versatility of regular expressions.
+
+   The escape sequences described in *Note Escape Sequences::, are
+valid inside a regexp.  They are introduced by a `\'.  They are
+recognized and converted into the corresponding real characters as the
+very first step in processing regexps.
+
+   Here is a table of metacharacters.  All characters that are not
+escape sequences and that are not listed in the table stand for
+themselves.
+
+`\'
+     This is used to suppress the special meaning of a character when
+     matching.  For example:
+
+          \$
+
+     matches the character `$'.
+
+`^'
+     This matches the beginning of a string.  For example:
+
+          ^@chapter
+
+     matches the `@chapter' at the beginning of a string, and can be
+     used to identify chapter beginnings in Texinfo source files.  The
+     `^' is known as an "anchor", since it anchors the pattern to
+     matching only at the beginning of the string.
+
+     It is important to realize that `^' does not match the beginning of
+     a line embedded in a string.  In this example the condition is not
+     true:
+
+          if ("line1\nLINE 2" ~ /^L/) ...
+
+`$'
+     This is similar to `^', but it matches only at the end of a string.
+     For example:
+
+          p$
+
+     matches a record that ends with a `p'.  The `$' is also an anchor,
+     and also does not match the end of a line embedded in a string.
+     In this example the condition is not true:
+
+          if ("line1\nLINE 2" ~ /1$/) ...
+
+`.'
+     The period, or dot, matches any single character, _including_ the
+     newline character.  For example:
+
+          .P
+
+     matches any single character followed by a `P' in a string.  Using
+     concatenation we can make a regular expression like `U.A', which
+     matches any three-character sequence that begins with `U' and ends
+     with `A'.
+
+     In strict POSIX mode (*note Command Line Options: Options.), `.'
+     does not match the NUL character, which is a character with all
+     bits equal to zero.  Otherwise, NUL is just another character.
+     Other versions of `awk' may not be able to match the NUL character.
+
+`[...]'
+     This is called a "character list".  It matches any _one_ of the
+     characters that are enclosed in the square brackets.  For example:
+
+          [MVX]
+
+     matches any one of the characters `M', `V', or `X' in a string.
+
+     Ranges of characters are indicated by using a hyphen between the
+     beginning and ending characters, and enclosing the whole thing in
+     brackets.  For example:
+
+          [0-9]
+
+     matches any digit.  Multiple ranges are allowed. E.g., the list
+     `[A-Za-z0-9]' is a common way to express the idea of "all
+     alphanumeric characters."
+
+     To include one of the characters `\', `]', `-' or `^' in a
+     character list, put a `\' in front of it.  For example:
+
+          [d\]]
+
+     matches either `d', or `]'.
+
+     This treatment of `\' in character lists is compatible with other
+     `awk' implementations, and is also mandated by POSIX.  The regular
+     expressions in `awk' are a superset of the POSIX specification for
+     Extended Regular Expressions (EREs).  POSIX EREs are based on the
+     regular expressions accepted by the traditional `egrep' utility.
+
+     "Character classes" are a new feature introduced in the POSIX
+     standard.  A character class is a special notation for describing
+     lists of characters that have a specific attribute, but where the
+     actual characters themselves can vary from country to country
+     and/or from character set to character set.  For example, the
+     notion of what is an alphabetic character differs in the USA and
+     in France.
+
+     A character class is only valid in a regexp _inside_ the brackets
+     of a character list.  Character classes consist of `[:', a keyword
+     denoting the class, and `:]'.  Here are the character classes
+     defined by the POSIX standard.
+
+    `[:alnum:]'
+          Alphanumeric characters.
+
+    `[:alpha:]'
+          Alphabetic characters.
+
+    `[:blank:]'
+          Space and tab characters.
+
+    `[:cntrl:]'
+          Control characters.
+
+    `[:digit:]'
+          Numeric characters.
+
+    `[:graph:]'
+          Characters that are printable and are also visible.  (A space
+          is printable, but not visible, while an `a' is both.)
+
+    `[:lower:]'
+          Lower-case alphabetic characters.
+
+    `[:print:]'
+          Printable characters (characters that are not control
+          characters).
+
+    `[:punct:]'
+          Punctuation characters (characters that are not letters,
+          digits, control characters, or space characters).
+
+    `[:space:]'
+          Space characters (such as space, tab, and formfeed, to name a
+          few).
+
+    `[:upper:]'
+          Upper-case alphabetic characters.
+
+    `[:xdigit:]'
+          Characters that are hexadecimal digits.
+
+     For example, before the POSIX standard, to match alphanumeric
+     characters, you had to write `/[A-Za-z0-9]/'.  If your character
+     set had other alphabetic characters in it, this would not match
+     them.  With the POSIX character classes, you can write
+     `/[[:alnum:]]/', and this will match _all_ the alphabetic and
+     numeric characters in your character set.
+
+     Two additional special sequences can appear in character lists.
+     These apply to non-ASCII character sets, which can have single
+     symbols (called "collating elements") that are represented with
+     more than one character, as well as several characters that are
+     equivalent for "collating", or sorting, purposes.  (E.g., in
+     French, a plain "e" and a grave-accented "e`" are equivalent.)
+
+    Collating Symbols
+          A "collating symbol" is a multi-character collating element
+          enclosed in `[.' and `.]'.  For example, if `ch' is a
+          collating element, then `[[.ch.]]' is a regexp that matches
+          this collating element, while `[ch]' is a regexp that matches
+          either `c' or `h'.
+
+    Equivalence Classes
+          An "equivalence class" is a locale-specific name for a list of
+          characters that are equivalent. The name is enclosed in `[='
+          and `=]'.  For example, the name `e' might be used to
+          represent all of "e," "e`," and "e'." In this case, `[[=e=]]'
+          is a regexp that matches any of `e', `e'',  or `e`'.
+
+     These features are very valuable in non-English speaking locales.
+
+     *Caution:* The library functions that `gawk' uses for regular
+     expression matching currently only recognize POSIX character
+     classes; they do not recognize collating symbols or equivalence
+     classes.
+
+`[^ ...]'
+     This is a "complemented character list".  The first character after
+     the `[' _must_ be a `^'.  It matches any characters _except_ those
+     in the square brackets.  For example:
+
+          [^0-9]
+
+     matches any character that is not a digit.
+
+`|'
+     This is the "alternation operator", and it is used to specify
+     alternatives.  For example:
+
+          ^P|[0-9]
+
+     matches any string that matches either `^P' or `[0-9]'.  This
+     means it matches any string that starts with `P' or contains a
+     digit.
+
+     The alternation applies to the largest possible regexps on either
+     side.  In other words, `|' has the lowest precedence of all the
+     regular expression operators.
+
+`(...)'
+     Parentheses are used for grouping in regular expressions as in
+     arithmetic.  They can be used to concatenate regular expressions
+     containing the alternation operator, `|'.  For example,
+     `@(samp|code)\{[^}]+\}' matches both `@code{foo}' and
+     `@samp{bar}'. (These are Texinfo formatting control sequences.)
+
+`*'
+     This symbol means that the preceding regular expression is to be
+     repeated as many times as necessary to find a match.  For example:
+
+          ph*
+
+     applies the `*' symbol to the preceding `h' and looks for matches
+     of one `p' followed by any number of `h's.  This will also match
+     just `p' if no `h's are present.
+
+     The `*' repeats the _smallest_ possible preceding expression.
+     (Use parentheses if you wish to repeat a larger expression.)  It
+     finds as many repetitions as possible.  For example:
+
+          awk '/\(c[ad][ad]*r x\)/ { print }' sample
+
+     prints every record in `sample' containing a string of the form
+     `(car x)', `(cdr x)', `(cadr x)', and so on.  Notice the escaping
+     of the parentheses by preceding them with backslashes.
+
+`+'
+     This symbol is similar to `*', but the preceding expression must be
+     matched at least once.  This means that:
+
+          wh+y
+
+     would match `why' and `whhy' but not `wy', whereas `wh*y' would
+     match all three of these strings.  This is a simpler way of
+     writing the last `*' example:
+
+          awk '/\(c[ad]+r x\)/ { print }' sample
+
+`?'
+     This symbol is similar to `*', but the preceding expression can be
+     matched either once or not at all.  For example:
+
+          fe?d
+
+     will match `fed' and `fd', but nothing else.
+
+`{N}'
+`{N,}'
+`{N,M}'
+     One or two numbers inside braces denote an "interval expression".
+     If there is one number in the braces, the preceding regexp is
+     repeated N times.  If there are two numbers separated by a comma,
+     the preceding regexp is repeated N to M times.  If there is one
+     number followed by a comma, then the preceding regexp is repeated
+     at least N times.
+
+    `wh{3}y'
+          matches `whhhy' but not `why' or `whhhhy'.
+
+    `wh{3,5}y'
+          matches `whhhy' or `whhhhy' or `whhhhhy', only.
+
+    `wh{2,}y'
+          matches `whhy' or `whhhy', and so on.
+
+     Interval expressions were not traditionally available in `awk'.
+     As part of the POSIX standard they were added, to make `awk' and
+     `egrep' consistent with each other.
+
+     However, since old programs may use `{' and `}' in regexp
+     constants, by default `gawk' does _not_ match interval expressions
+     in regexps.  If either `--posix' or `--re-interval' is specified
+     (*note Command Line Options: Options.), then interval expressions
+     are allowed in regexps.
+
+   In regular expressions, the `*', `+', and `?' operators, as well as
+the braces `{' and `}', have the highest precedence, followed by
+concatenation, and finally by `|'.  As in arithmetic, parentheses can
+change how operators are grouped.
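+
+   The repetition operators and the precedence rules can be explored
+with a small sketch such as the following.  The shell function
+`matches' is a hypothetical helper, not part of `awk'; it passes the
+regexp in as a string (a dynamic regexp), and all of the test strings
+are made up:
+
```shell
# matches STRING REGEXP: print "yes" if STRING matches REGEXP, else "no".
# (Hypothetical helper; REGEXP is passed in as a dynamic regexp.)
matches() {
    awk -v s="$1" -v re="$2" 'BEGIN { if (s ~ re) print "yes"; else print "no" }'
}

q1=$(matches 'why'  'wh+y')       # yes: at least one h
q2=$(matches 'wy'   'wh+y')       # no:  + needs one h
q3=$(matches 'fd'   '^fe?d$')     # yes: the e is optional
q4=$(matches 'feed' '^fe?d$')     # no:  ? allows at most one e

p1=$(matches 'xxcd' '^ab|cd')     # yes: ^ binds to ab only
p2=$(matches 'xxcd' '^(ab|cd)')   # no:  ^ now applies to both alternatives
p3=$(matches 'abab' '^ab+$')      # no:  + repeats only the b
p4=$(matches 'abab' '^(ab)+$')    # yes: + repeats the whole group

echo "$q1 $q2 $q3 $q4 $p1 $p2 $p3 $p4"
```
+
+The last four tests differ only in their parentheses, which is enough
+to change whether the string matches.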
+
+   If `gawk' is in compatibility mode (*note Command Line Options:
+Options.), character classes and interval expressions are not available
+in regular expressions.
+
+   The next node discusses the GNU-specific regexp operators, and
+provides more detail concerning how command line options affect the way
+`gawk' interprets the characters in regular expressions.
+
+
+File: gawk.info,  Node: GNU Regexp Operators,  Next: Case-sensitivity,  Prev: Regexp Operators,  Up: Regexp
+
+Additional Regexp Operators Only in `gawk'
+==========================================
+
+   GNU software that deals with regular expressions provides a number of
+additional regexp operators.  These operators are described in this
+section, and are specific to `gawk'; they are not available in other
+`awk' implementations.
+
+   Most of the additional operators are for dealing with word matching.
+For our purposes, a "word" is a sequence of one or more letters, digits,
+or underscores (`_').
+
+`\w'
+     This operator matches any word-constituent character, i.e. any
+     letter, digit, or underscore. Think of it as a short-hand for
+     `[[:alnum:]_]'.
+
+`\W'
+     This operator matches any character that is not word-constituent.
+     Think of it as a short-hand for `[^[:alnum:]_]'.
+
+`\<'
+     This operator matches the empty string at the beginning of a word.
+     For example, `/\<away/' matches `away', but not `stowaway'.
+
+`\>'
+     This operator matches the empty string at the end of a word.  For
+     example, `/stow\>/' matches `stow', but not `stowaway'.
+
+`\y'
+     This operator matches the empty string at either the beginning or
+     the end of a word (the word boundar*y*).  For example, `\yballs?\y'
+     matches either `ball' or `balls' as a separate word.
+
+`\B'
+     This operator matches the empty string within a word. In other
+     words, `\B' matches the empty string that occurs between two
+     word-constituent characters. For example, `/\Brat\B/' matches
+     `crate', but it does not match `dirty rat'.  `\B' is essentially
+     the opposite of `\y'.
+
+   There are two other operators that work on buffers.  In Emacs, a
+"buffer" is, naturally, an Emacs buffer.  For other programs, the
+regexp library routines that `gawk' uses consider the entire string to
+be matched as the buffer.
+
+   For `awk', since `^' and `$' always work in terms of the beginning
+and end of strings, these operators don't add any new capabilities.
+They are provided for compatibility with other GNU software.
+
+`\`'
+     This operator matches the empty string at the beginning of the
+     buffer.
+
+`\''
+     This operator matches the empty string at the end of the buffer.
+
+   In other GNU software, the word boundary operator is `\b'. However,
+that conflicts with the `awk' language's definition of `\b' as
+backspace, so `gawk' uses a different letter.
+
+   An alternative method would have been to require two backslashes in
+the GNU operators, but this was deemed to be too confusing, and the
+current method of using `\y' for the GNU `\b' appears to be the lesser
+of two evils.
+
+   The various command line options (*note Command Line Options:
+Options.)  control how `gawk' interprets characters in regexps.
+
+No options
+     In the default case, `gawk' provides all the facilities of POSIX
+     regexps and the GNU regexp operators described in *Note Regular
+     Expression Operators: Regexp Operators.  However, interval
+     expressions are not supported.
+
+`--posix'
+     Only POSIX regexps are supported; the GNU operators are not special
+     (e.g., `\w' matches a literal `w').  Interval expressions are
+     allowed.
+
+`--traditional'
+     Traditional Unix `awk' regexps are matched. The GNU operators are
+     not special, interval expressions are not available, and neither
+     are the POSIX character classes (`[[:alnum:]]' and so on).
+     Characters described by octal and hexadecimal escape sequences are
+     treated literally, even if they represent regexp metacharacters.
+
+`--re-interval'
+     Allow interval expressions in regexps, even if `--traditional' has
+     been provided.
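+
+   As a rough illustration of the word-boundary operator under the
+default options, the sketch below tries `\y' on two made-up strings.
+It assumes that a program named `gawk' is installed, and skips the
+test otherwise; the names `a', `b', and `out' are illustrative:
+
```shell
# Uses a gawk-specific operator, so skip quietly when gawk is missing.
if command -v gawk >/dev/null 2>&1; then
    out=$(gawk 'BEGIN {
        a = ("balls" ~ /\yballs?\y/)      # a separate word: matches
        b = ("football" ~ /\yballs?\y/)   # inside a longer word: no match
        print a, b
    }')
else
    out="skipped"
fi
echo "$out"
```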
+
+
+File: gawk.info,  Node: Case-sensitivity,  Next: Leftmost Longest,  Prev: GNU Regexp Operators,  Up: Regexp
+
+Case-sensitivity in Matching
+============================
+
+   Case is normally significant in regular expressions, both when
+matching ordinary characters (i.e. not metacharacters), and inside
+character lists.  Thus a `w' in a regular expression matches only a
+lower-case `w' and not an upper-case `W'.
+
+   The simplest way to do a case-independent match is to use a character
+list: `[Ww]'.  However, this can be cumbersome if you need to use it
+often; and it can make the regular expressions harder to read.  There
+are two alternatives that you might prefer.
+
+   One way to do a case-insensitive match at a particular point in the
+program is to convert the data to a single case, using the `tolower' or
+`toupper' built-in string functions (which we haven't discussed yet;
+*note Built-in Functions for String Manipulation: String Functions.).
+For example:
+
+     tolower($1) ~ /foo/  { ... }
+
+converts the first field to lower-case before matching against it.
+This will work in any POSIX-compliant implementation of `awk'.
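+
+   As a minimal runnable sketch of this technique, the following shell
+command feeds one made-up record to `awk'.  The sample input and the
+shell variable name `matched' are illustrative assumptions:
+
```shell
# The sample input line and the variable name `matched` are illustrative.
# tolower() returns a lower-cased copy; the field $1 itself is unchanged.
matched=$(echo 'FooBar 555-0000' | awk 'tolower($1) ~ /foo/ { print $1 }')
echo "$matched"
```
+
+The rule matches even though the input spells `Foo' with a capital
+letter, and the field is printed with its original capitalization.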
+
+   Another method, specific to `gawk', is to set the variable
+`IGNORECASE' to a non-zero value (*note Built-in Variables::).  When
+`IGNORECASE' is not zero, _all_ regexp and string operations ignore
+case.  Changing the value of `IGNORECASE' dynamically controls the case
+sensitivity of your program as it runs.  Case is significant by default
+because `IGNORECASE' (like most variables) is initialized to zero.
+
+     x = "aB"
+     if (x ~ /ab/) ...   # this test will fail
+     
+     IGNORECASE = 1
+     if (x ~ /ab/) ...   # now it will succeed
+
+   In general, you cannot use `IGNORECASE' to make certain rules
+case-insensitive and other rules case-sensitive, because there is no way
+to set `IGNORECASE' just for the pattern of a particular rule.  To do
+this, you must use character lists or `tolower'.  However, one thing
+you can do only with `IGNORECASE' is turn case-sensitivity on or off
+dynamically for all the rules at once.
+
+   `IGNORECASE' can be set on the command line, or in a `BEGIN' rule
+(*note Other Command Line Arguments: Other Arguments.; also *note
+Startup and Cleanup Actions: Using BEGIN/END.).  Setting `IGNORECASE'
+from the command line is a way to make a program case-insensitive
+without having to edit it.
+
+   Prior to version 3.0 of `gawk', the value of `IGNORECASE' only
+affected regexp operations. It did not affect string comparison with
+`==', `!=', and so on.  Beginning with version 3.0, both regexp and
+string comparison operations are affected by `IGNORECASE'.
+
+   Beginning with version 3.0 of `gawk', the equivalences between
+upper-case and lower-case characters are based on the ISO-8859-1 (ISO
+Latin-1) character set. This character set is a superset of the
+traditional 128 ASCII characters, which also provides a number of
+characters suitable for use with European languages.
+
+   The value of `IGNORECASE' has no effect if `gawk' is in
+compatibility mode (*note Command Line Options: Options.).  Case is
+always significant in compatibility mode.
+
+
+File: gawk.info,  Node: Leftmost Longest,  Next: Computed Regexps,  Prev: Case-sensitivity,  Up: Regexp
+
+How Much Text Matches?
+======================
+
+   Consider the following example:
+
+     echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
+
+   This example uses the `sub' function (which we haven't discussed yet,
+*note Built-in Functions for String Manipulation: String Functions.)
+to make a change to the input record. Here, the regexp `/a+/' indicates
+"one or more `a' characters," and the replacement text is `<A>'.
+
+   The input contains four `a' characters.  What will the output be?
+In other words, how many is "one or more"--will `awk' match two, three,
+or all four `a' characters?
+
+   The answer is, `awk' (and POSIX) regular expressions always match
+the leftmost, _longest_ sequence of input characters that can match.
+Thus, in this example, all four `a' characters are replaced with `<A>'.
+
+     $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
+     -| <A>bcd
+
+   For simple match/no-match tests, this is not so important. But when
+doing text matching and substitutions with the `match', `sub', `gsub',
+and `gensub' functions, it is very important.  *Note Built-in Functions
+for String Manipulation: String Functions, for more information on
+these functions.  Understanding this principle is also important for
+regexp-based record and field splitting (*note How Input is Split into
+Records: Records., and also *note Specifying How Fields are Separated:
+Field Separators.).
+
+
+File: gawk.info,  Node: Computed Regexps,  Prev: Leftmost Longest,  Up: Regexp
+
+Using Dynamic Regexps
+=====================
+
+   The right hand side of a `~' or `!~' operator need not be a regexp
+constant (i.e. a string of characters between slashes).  It may be any
+expression.  The expression is evaluated, and converted if necessary to
+a string; the contents of the string are used as the regexp.  A regexp
+that is computed in this way is called a "dynamic regexp".  For example:
+
+     BEGIN { identifier_regexp = "[A-Za-z_][A-Za-z_0-9]*" }
+     $0 ~ identifier_regexp    { print }
+
+sets `identifier_regexp' to a regexp that describes `awk' variable
+names, and tests if the input record matches this regexp.
+
+   *Caution:* When using the `~' and `!~' operators, there is a
+difference between a regexp constant enclosed in slashes, and a string
+constant enclosed in double quotes.  If you are going to use a string
+constant, you have to understand that the string is in essence scanned
+_twice_; the first time when `awk' reads your program, and the second
+time when it goes to match the string on the left-hand side of the
+operator with the pattern on the right.  This is true of any string
+valued expression (such as `identifier_regexp' above), not just string
+constants.
+
+   What difference does it make if the string is scanned twice? The
+answer has to do with escape sequences, and particularly with
+backslashes.  To get a backslash into a regular expression inside a
+string, you have to type two backslashes.
+
+   For example, `/\*/' is a regexp constant for a literal `*'.  Only
+one backslash is needed.  To do the same thing with a string, you would
+have to type `"\\*"'.  The first backslash escapes the second one, so
+that the string actually contains the two characters `\' and `*'.
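+
+   As a minimal sketch of the point above, the following shell command
+applies both forms to the same sample string.  The string `2*3' and
+the names `constant_match', `dynamic_match' and `result' are
+illustrative assumptions, not part of any `awk' interface:
+
```shell
# Both tests look for a literal asterisk in the sample string "2*3".
result=$(awk 'BEGIN {
    s = "2*3"
    constant_match = (s ~ /\*/)     # regexp constant: one backslash
    dynamic_match  = (s ~ "\\*")    # string constant: two backslashes
    print constant_match, dynamic_match
}')
echo "$result"
```
+
+Each comparison yields one (true) when the literal `*' is found, so
+both forms behave identically here; only the spelling differs.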
+
+   Given that you can use both regexp and string constants to describe
+regular expressions, which should you use?  The answer is "regexp
+constants," for several reasons.
+
+  1. String constants are more complicated to write, and more difficult
+     to read. Using regexp constants makes your programs less
+     error-prone.  Not understanding the difference between the two
+     kinds of constants is a common source of errors.
+
+  2. It is also more efficient to use regexp constants: `awk' can note
+     that you have supplied a regexp and store it internally in a form
+     that makes pattern matching more efficient.  When using a string
+     constant, `awk' must first convert the string into this internal
+     form, and then perform the pattern matching.
+
+  3. Using regexp constants is better style; it shows clearly that you
+     intend a regexp match.
+
+
+File: gawk.info,  Node: Reading Files,  Next: Printing,  Prev: Regexp,  Up: Top
+
+Reading Input Files
+*******************
+
+   In the typical `awk' program, all input is read either from the
+standard input (by default the keyboard, but often a pipe from another
+command) or from files whose names you specify on the `awk' command
+line.  If you specify input files, `awk' reads them in order, reading
+all the data from one before going on to the next.  The name of the
+current input file can be found in the built-in variable `FILENAME'
+(*note Built-in Variables::).
+
+   The input is read in units called "records", and processed by the
+rules of your program one record at a time.  By default, each record is
+one line.  Each record is automatically split into chunks called
+"fields".  This makes it more convenient for programs to work on the
+parts of a record.
+
+   On rare occasions you will need to use the `getline' command.  The
+`getline' command is valuable, both because it can do explicit input
+from any number of files, and because the files used with it do not
+have to be named on the `awk' command line (*note Explicit Input with
+`getline': Getline.).
+
+* Menu:
+
+* Records::                     Controlling how data is split into records.
+* Fields::                      An introduction to fields.
+* Non-Constant Fields::         Non-constant Field Numbers.
+* Changing Fields::             Changing the Contents of a Field.
+* Field Separators::            The field separator and how to change it.
+* Constant Size::               Reading constant width data.
+* Multiple Line::               Reading multi-line records.
+* Getline::                     Reading files under explicit program control
+                                using the `getline' function.
+
+
+File: gawk.info,  Node: Records,  Next: Fields,  Prev: Reading Files,  Up: Reading Files
+
+How Input is Split into Records
+===============================
+
+   The `awk' utility divides the input for your `awk' program into
+records and fields.  Records are separated by a character called the
+"record separator".  By default, the record separator is the newline
+character.  This is why records are, by default, single lines.  You can
+use a different character for the record separator by assigning the
+character to the built-in variable `RS'.
+
+   You can change the value of `RS' in the `awk' program, like any
+other variable, with the assignment operator, `=' (*note Assignment
+Expressions: Assignment Ops.).  The new record-separator character
+should be enclosed in quotation marks, which indicate a string
+constant.  Often the right time to do this is at the beginning of
+execution, before any input has been processed, so that the very first
+record will be read with the proper separator.  To do this, use the
+special `BEGIN' pattern (*note The `BEGIN' and `END' Special Patterns:
+BEGIN/END.).  For example:
+
+     awk 'BEGIN { RS = "/" } ; { print $0 }' BBS-list
+
+changes the value of `RS' to `"/"', before reading any input.  This is
+a string whose first character is a slash; as a result, records are
+separated by slashes.  Then the input file is read, and the second rule
+in the `awk' program (the action with no pattern) prints each record.
+Since each `print' statement adds a newline at the end of its output,
+the effect of this `awk' program is to copy the input with each slash
+changed to a newline.  Here are the results of running the program on
+`BBS-list':
+
+     $ awk 'BEGIN { RS = "/" } ; { print $0 }' BBS-list
+     -| aardvark     555-5553     1200
+     -| 300          B
+     -| alpo-net     555-3412     2400
+     -| 1200
+     -| 300     A
+     -| barfly       555-7685     1200
+     -| 300          A
+     -| bites        555-1675     2400
+     -| 1200
+     -| 300     A
+     -| camelot      555-0542     300               C
+     -| core         555-2912     1200
+     -| 300          C
+     -| fooey        555-1234     2400
+     -| 1200
+     -| 300     B
+     -| foot         555-6699     1200
+     -| 300          B
+     -| macfoo       555-6480     1200
+     -| 300          A
+     -| sdace        555-3430     2400
+     -| 1200
+     -| 300     A
+     -| sabafoo      555-2127     1200
+     -| 300          C
+     -|
+
+Note that the entry for the `camelot' BBS is not split.  In the
+original data file (*note Data Files for the Examples: Sample Data
+Files.), the line looks like this:
+
+     camelot      555-0542     300               C
+
+It only has one baud rate; there are no slashes in the record.
+
+   Another way to change the record separator is on the command line,
+using the variable-assignment feature (*note Other Command Line
+Arguments: Other Arguments.).
+
+     awk '{ print $0 }' RS="/" BBS-list
+
+This sets `RS' to `/' before processing `BBS-list'.
+
+   Using an unusual character such as `/' for the record separator
+produces correct behavior in the vast majority of cases.  However, the
+following (extreme) pipeline prints a surprising `1'.  There is one
+field, consisting of a newline.  The value of the built-in variable
+`NF' is the number of fields in the current record.
+
+     $ echo | awk 'BEGIN { RS = "a" } ; { print NF }'
+     -| 1
+
+Reaching the end of an input file terminates the current input record,
+even if the last character in the file is not the character in `RS'
+(d.c.).
+
+   The empty string, `""' (a string of no characters), has a special
+meaning as the value of `RS': it means that records are separated by
+one or more blank lines, and nothing else.  *Note Multiple-Line
+Records: Multiple Line, for more details.
+
+   If you change the value of `RS' in the middle of an `awk' run, the
+new value is used to delimit subsequent records, but the record
+currently being processed (and records already processed) are not
+affected.
+
+   After the end of the record has been determined, `gawk' sets the
+variable `RT' to the text in the input that matched `RS'.
+
+   The value of `RS' is in fact not limited to a one-character string.
+It can be any regular expression (*note Regular Expressions: Regexp.).
+In general, each record ends at the next string that matches the
+regular expression; the next record starts at the end of the matching
+string.  This general rule is actually at work in the usual case, where
+`RS' contains just a newline: a record ends at the beginning of the
+next matching string (the next newline in the input) and the following
+record starts just after the end of this string (at the first character
+of the following line).  The newline, since it matches `RS', is not
+part of either record.
+
+   When `RS' is a single character, `RT' will contain the same single
+character. However, when `RS' is a regular expression, then `RT'
+becomes more useful; it contains the actual input text that matched the
+regular expression.
+
+   The following example illustrates both of these features.  It sets
+`RS' equal to a regular expression that matches either a newline, or a
+series of one or more upper-case letters with optional leading and/or
+trailing white space (*note Regular Expressions: Regexp.).
+
+     $ echo record 1 AAAA record 2 BBBB record 3 |
+     > gawk 'BEGIN { RS = "\n|( *[[:upper:]]+ *)" }
+     >             { print "Record =", $0, "and RT =", RT }'
+     -| Record = record 1 and RT =  AAAA
+     -| Record = record 2 and RT =  BBBB
+     -| Record = record 3 and RT =
+     -|
+
+The final line of output has an extra blank line. This is because the
+value of `RT' is a newline, and then the `print' statement supplies its
+own terminating newline.
+
+   *Note A Simple Stream Editor: Simple Sed, for a more useful example
+of `RS' as a regexp and `RT'.
+
+   The use of `RS' as a regular expression and the `RT' variable are
+`gawk' extensions; they are not available in compatibility mode (*note
+Command Line Options: Options.).  In compatibility mode, only the first
+character of the value of `RS' is used to determine the end of the
+record.
+
+   The `awk' utility keeps track of the number of records that have
+been read so far from the current input file.  This value is stored in a
+built-in variable called `FNR'.  It is reset to zero when a new file is
+started.  Another built-in variable, `NR', is the total number of input
+records read so far from all data files.  It starts at zero but is
+never automatically reset to zero.
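+
+   The difference between the two counters can be sketched with two
+small scratch files.  The file names `first.txt' and `second.txt', and
+the shell variables `tmpdir' and `counts', are made up for this
+illustration:
+
```shell
# Work in a scratch directory; the file names below are made up.
tmpdir=$(mktemp -d)
printf 'a\nb\n'    > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# Print the per-file count (FNR) next to the running total (NR).
counts=$(awk '{ print FNR, NR }' "$tmpdir/first.txt" "$tmpdir/second.txt")
echo "$counts"

rm -rf "$tmpdir"
```
+
+For the three records of the second file, `FNR' starts over at one
+while `NR' continues with three, four and five.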
+
+
+File: gawk.info,  Node: Fields,  Next: Non-Constant Fields,  Prev: Records,  Up: Reading Files
+
+Examining Fields
+================
+
+   When `awk' reads an input record, the record is automatically
+separated or "parsed" by the interpreter into chunks called "fields".
+By default, fields are separated by whitespace, like words in a line.
+Whitespace in `awk' means any string of one or more spaces, tabs or
+newlines;(1) other characters such as formfeed, and so on, that are
+considered whitespace by other languages are _not_ considered
+whitespace by `awk'.
+
+   The purpose of fields is to make it more convenient for you to refer
+to these pieces of the record.  You don't have to use them--you can
+operate on the whole record if you wish--but fields are what make
+simple `awk' programs so powerful.
+
+   To refer to a field in an `awk' program, you use a dollar-sign, `$',
+followed by the number of the field you want.  Thus, `$1' refers to the
+first field, `$2' to the second, and so on.  For example, suppose the
+following is a line of input:
+
+     This seems like a pretty nice example.
+
+Here the first field, or `$1', is `This'; the second field, or `$2', is
+`seems'; and so on.  Note that the last field, `$7', is `example.'.
+Because there is no space between the `e' and the `.', the period is
+considered part of the seventh field.
+
+   `NF' is a built-in variable whose value is the number of fields in
+the current record.  `awk' updates the value of `NF' automatically,
+each time a record is read.
+
+   No matter how many fields there are, the last field in a record can
+be represented by `$NF'.  So, in the example above, `$NF' would be the
+same as `$7', which is `example.'.  Why this works is explained below
+(*note Non-constant Field Numbers: Non-Constant Fields.).  If you try
+to reference a field beyond the last one, such as `$8' when the record
+has only seven fields, you get the empty string.
+
+   `$0', which looks like a reference to the "zeroth" field, is a
+special case: it represents the whole input record.  `$0' is used when
+you are not interested in fields.
+
+   Here are some more examples:
+
+     $ awk '$1 ~ /foo/ { print $0 }' BBS-list
+     -| fooey        555-1234     2400/1200/300     B
+     -| foot         555-6699     1200/300          B
+     -| macfoo       555-6480     1200/300          A
+     -| sabafoo      555-2127     1200/300          C
+
+This example prints each record in the file `BBS-list' whose first
+field contains the string `foo'.  The operator `~' is called a
+"matching operator" (*note How to Use Regular Expressions: Regexp
+Usage.); it tests whether a string (here, the field `$1') matches a
+given regular expression.
+
+   By contrast, the following example looks for `foo' in _the entire
+record_ and prints the first field and the last field for each input
+record containing a match.
+
+     $ awk '/foo/ { print $1, $NF }' BBS-list
+     -| fooey B
+     -| foot B
+     -| macfoo A
+     -| sabafoo C
+
+   ---------- Footnotes ----------
+
+   (1) In POSIX `awk', newlines are not considered whitespace for
+separating fields.
+
+
+File: gawk.info,  Node: Non-Constant Fields,  Next: Changing Fields,  Prev: Fields,  Up: Reading Files
+
+Non-constant Field Numbers
+==========================
+
+   The number of a field does not need to be a constant.  Any
+expression in the `awk' language can be used after a `$' to refer to a
+field.  The value of the expression specifies the field number.  If the
+value is a string, rather than a number, it is converted to a number.
+Consider this example:
+
+     awk '{ print $NR }'
+
+Recall that `NR' is the number of records read so far: one in the first
+record, two in the second, etc.  So this example prints the first field
+of the first record, the second field of the second record, and so on.
+For the twentieth record, field number 20 is printed; most likely, the
+record has fewer than 20 fields, so this prints a blank line.
+
+   Here is another example of using expressions as field numbers:
+
+     awk '{ print $(2*2) }' BBS-list
+
+   `awk' must evaluate the expression `(2*2)' and use its value as the
+number of the field to print.  The `*' sign represents multiplication,
+so the expression `2*2' evaluates to four.  The parentheses are used so
+that the multiplication is done before the `$' operation; they are
+necessary whenever there is a binary operator in the field-number
+expression.  This example, then, prints the hours of operation (the
+fourth field) for every line of the file `BBS-list'.  (All of the `awk'
+operators are listed, in order of decreasing precedence, in *Note
+Operator Precedence (How Operators Nest): Precedence.)
+
+   If the field number you compute is zero, you get the entire record.
+Thus, `$(2-2)' has the same value as `$0'.  Negative field numbers are
+not allowed; trying to reference one will usually terminate your
+running `awk' program.  (The POSIX standard does not define what
+happens when you reference a negative field number.  `gawk' will notice
+this and terminate your program.  Other `awk' implementations may
+behave differently.)
+
+   As mentioned in *Note Examining Fields: Fields, the number of fields
+in the current record is stored in the built-in variable `NF' (also
+*note Built-in Variables::).  The expression `$NF' is not a special
+feature: it is the direct consequence of evaluating `NF' and using its
+value as a field number.
+
+
+File: gawk.info,  Node: Changing Fields,  Next: Field Separators,  Prev: Non-Constant Fields,  Up: Reading Files
+
+Changing the Contents of a Field
+================================
+
+   You can change the contents of a field as seen by `awk' within an
+`awk' program; this changes what `awk' perceives as the current input
+record.  (The actual input is untouched; `awk' _never_ modifies the
+input file.)
+
+   Consider this example and its output:
+
+     $ awk '{ $3 = $2 - 10; print $2, $3 }' inventory-shipped
+     -| 13 3
+     -| 15 5
+     -| 15 5
+     ...
+
+The `-' sign represents subtraction, so this program reassigns field
+three, `$3', to be the value of field two minus ten, `$2 - 10'.  (*Note
+Arithmetic Operators: Arithmetic Ops.)  Then field two, and the new
+value for field three, are printed.
+
+   In order for this to work, the text in field `$2' must make sense as
+a number; the string of characters must be converted to a number in
+order for the computer to do arithmetic on it.  The number resulting
+from the subtraction is converted back to a string of characters which
+then becomes field three.  *Note Conversion of Strings and Numbers:
+Conversion.
+
+   When you change the value of a field (as perceived by `awk'), the
+text of the input record is recalculated to contain the new field where
+the old one was.  Therefore, `$0' changes to reflect the altered field.
+Thus, this program prints a copy of the input file, with 10 subtracted
+from the second field of each line.
+
+     $ awk '{ $2 = $2 - 10; print $0 }' inventory-shipped
+     -| Jan 3 25 15 115
+     -| Feb 5 32 24 226
+     -| Mar 5 24 34 228
+     ...
+
+   You can also assign contents to fields that are out of range.  For
+example:
+
+     $ awk '{ $6 = ($5 + $4 + $3 + $2)
+     >        print $6 }' inventory-shipped
+     -| 168
+     -| 297
+     -| 301
+     ...
+
+We've just created `$6', whose value is the sum of fields `$2', `$3',
+`$4', and `$5'.  The `+' sign represents addition.  For the file
+`inventory-shipped', `$6' represents the total number of parcels
+shipped for a particular month.
+
+   Creating a new field changes `awk''s internal copy of the current
+input record--the value of `$0'.  Thus, if you do `print $0' after
+adding a field, the record printed includes the new field, with the
+appropriate number of field separators between it and the previously
+existing fields.
+
+   This recomputation affects and is affected by `NF' (the number of
+fields; *note Examining Fields: Fields.), and by a feature that has not
+been discussed yet, the "output field separator", `OFS', which is used
+to separate the fields (*note Output Separators::).  For example, the
+value of `NF' is set to the number of the highest field you create.
+
+   Note, however, that merely _referencing_ an out-of-range field does
+_not_ change the value of either `$0' or `NF'.  Referencing an
+out-of-range field only produces an empty string.  For example:
+
+     if ($(NF+1) != "")
+         print "can't happen"
+     else
+         print "everything is normal"
+
+should print `everything is normal', because `NF+1' is certain to be
+out of range.  (*Note The `if'-`else' Statement: If Statement, for more
+information about `awk''s `if-else' statements.  *Note Variable Typing
+and Comparison Expressions: Typing and Comparison, for more information
+about the `!=' operator.)
+
+   It is important to note that making an assignment to an existing
+field will change the value of `$0', but will not change the value of
+`NF', even when you assign the empty string to a field.  For example:
+
+     $ echo a b c d | awk '{ OFS = ":"; $2 = ""
+     >                       print $0; print NF }'
+     -| a::c:d
+     -| 4
+
+The field is still there; it just has an empty value.  You can tell
+because there are two colons in a row.
+
+   This example shows what happens if you create a new field.
+
+     $ echo a b c d | awk '{ OFS = ":"; $2 = ""; $6 = "new"
+     >                       print $0; print NF }'
+     -| a::c:d::new
+     -| 6
+
+The intervening field, `$5', is created with an empty value (indicated
+by the second pair of adjacent colons), and `NF' is updated with the
+value six.
+
+   Finally, decrementing `NF' will lose the values of the fields after
+the new value of `NF', and `$0' will be recomputed.  Here is an example:
+
+     $ echo a b c d e f | gawk '{ print "NF =", NF;
+     >                            NF = 3; print $0 }'
+     -| NF = 6
+     -| a b c
+
+
+File: gawk.info,  Node: Field Separators,  Next: Constant Size,  Prev: Changing Fields,  Up: Reading Files
+
+Specifying How Fields are Separated
+===================================
+
+   This section is rather long; it describes one of the most fundamental
+operations in `awk'.
+
+* Menu:
+
+* Basic Field Splitting::        How fields are split with single characters
+                                 or simple strings.
+* Regexp Field Splitting::       Using regexps as the field separator.
+* Single Character Fields::      Making each character a separate field.
+* Command Line Field Separator:: Setting `FS' from the command line.
+* Field Splitting Summary::      Some final points and a summary table.
+
+
+File: gawk.info,  Node: Basic Field Splitting,  Next: Regexp Field Splitting,  Prev: Field Separators,  Up: Field Separators
+
+The Basics of Field Separating
+------------------------------
+
+   The "field separator", which is either a single character or a
+regular expression, controls the way `awk' splits an input record into
+fields.  `awk' scans the input record for character sequences that
+match the separator; the fields themselves are the text between the
+matches.
+
+   In the examples below, we use the bullet symbol "*" to represent
+spaces in the output.
+
+   If the field separator is `oo', then the following line:
+
+     moo goo gai pan
+
+would be split into three fields: `m', `*g' and `*gai*pan'.  Note the
+leading spaces in the values of the second and third fields.
+
+   The field separator is represented by the built-in variable `FS'.
+Shell programmers take note!  `awk' does _not_ use the name `IFS' which
+is used by the POSIX compatible shells (such as the Bourne shell, `sh',
+or the GNU Bourne-Again Shell, Bash).
+
+   You can change the value of `FS' in the `awk' program with the
+assignment operator, `=' (*note Assignment Expressions: Assignment
+Ops.).  Often the right time to do this is at the beginning of
+execution, before any input has been processed, so that the very first
+record will be read with the proper separator.  To do this, use the
+special `BEGIN' pattern (*note The `BEGIN' and `END' Special Patterns:
+BEGIN/END.).  For example, here we set the value of `FS' to the string
+`","':
+
+     awk 'BEGIN { FS = "," } ; { print $2 }'
+
+Given the input line,
+
+     John Q. Smith, 29 Oak St., Walamazoo, MI 42139
+
+this `awk' program extracts and prints the string `*29*Oak*St.'.
+
+   Sometimes your input data will contain separator characters that
+don't separate fields the way you thought they would.  For instance, the
+person's name in the example we just used might have a title or suffix
+attached, such as `John Q. Smith, LXIX'.  From input containing such a
+name:
+
+     John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
+
+the above program would extract `*LXIX', instead of `*29*Oak*St.'.  If
+you were expecting the program to print the address, you would be
+surprised.  The moral is: choose your data layout and separator
+characters carefully to prevent such problems.
+
+   Normally, fields are separated by whitespace sequences (spaces, tabs
+and newlines), not by single spaces: two spaces in a row do not delimit
+an empty field.  The default value of the field separator `FS' is a
+string containing a single space, `" "'.  If this value were
+interpreted in the usual way, each space character would separate
+fields, so two spaces in a row would make an empty field between them.
+The reason this does not happen is that a single space as the value of
+`FS' is a special case: it is taken to specify the default manner of
+delimiting fields.
+
+   If `FS' is any other single character, such as `","', then each
+occurrence of that character separates two fields.  Two consecutive
+occurrences delimit an empty field.  If the character occurs at the
+beginning or the end of the line, that too delimits an empty field.  The
+space character is the only single character which does not follow these
+rules.
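+
+   As a small sketch of these rules, the following command counts the
+fields in a made-up record that has a leading comma, two adjacent
+commas, and a trailing comma.  The record and the shell variable name
+`field_count' are assumptions chosen for illustration:
+
```shell
# The record ",a,,b," has a leading, a doubled, and a trailing comma.
# Each comma separates two fields, so empty fields are counted too.
field_count=$(echo ',a,,b,' | awk 'BEGIN { FS = "," } { print NF }')
echo "$field_count"
```
+
+The four commas delimit five fields, three of which are empty.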
+
+
+File: gawk.info,  Node: Regexp Field Splitting,  Next: Single Character Fields,  Prev: Basic Field Splitting,  Up: Field Separators
+
+Using Regular Expressions to Separate Fields
+--------------------------------------------
+
+   The previous node discussed the use of single characters or simple
+strings as the value of `FS'.  More generally, the value of `FS' may be
+a string containing any regular expression.  In this case, each match
+in the record for the regular expression separates fields.  For
+example, the assignment:
+
+     FS = ", \t"
+
+makes every area of an input line that consists of a comma followed by a
+space and a tab, into a field separator.  (`\t' is an "escape sequence"
+that stands for a tab; *note Escape Sequences::, for the complete list
+of similar escape sequences.)
+
+   For a less trivial example of a regular expression, suppose you want
+single spaces to separate fields the way single commas were used above.
+You can set `FS' to `"[ ]"' (left bracket, space, right bracket).  This
+regular expression matches a single space and nothing else (*note
+Regular Expressions: Regexp.).
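+
+   A minimal sketch of the contrast, using a made-up record in which
+two spaces separate `a' and `b' (the shell variable names
+`default_count' and `bracket_count' are illustrative):
+
```shell
# Two spaces separate "a" and "b" in the sample record.
default_count=$(echo 'a  b' | awk '{ print NF }')
bracket_count=$(echo 'a  b' | awk 'BEGIN { FS = "[ ]" } { print NF }')
echo "$default_count $bracket_count"
```
+
+With the default `FS' the run of spaces is one separator, giving two
+fields; with `"[ ]"' each space separates fields, so an empty field
+appears between them and the count is three.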
+
+   There is an important difference between the two cases of `FS = " "'
+(a single space) and `FS = "[ \t\n]+"' (left bracket, space, backslash,
+"t", backslash, "n", right bracket, plus sign, which is a regular
+expression matching one or more spaces, tabs, or newlines).  For both
+values of `FS', fields are separated by runs of spaces, tabs and/or
+newlines.  However, when the value of `FS' is `" "', `awk' will first
+strip leading and trailing whitespace from the record, and then decide
+where the fields are.
+
+   For example, the following pipeline prints `b':
+
+     $ echo ' a b c d ' | awk '{ print $2 }'
+     -| b
+
+However, this pipeline prints `a' (note the extra spaces around each
+letter):
+
+     $ echo ' a  b  c  d ' | awk 'BEGIN { FS = "[ \t]+" }
+     >                                  { print $2 }'
+     -| a
+
+In this case, the first field is "null", or empty.
+
+   The stripping of leading and trailing whitespace also comes into
+play whenever `$0' is recomputed.  For instance, study this pipeline:
+
+     $ echo '   a b c d' | awk '{ print; $2 = $2; print }'
+     -|    a b c d
+     -| a b c d
+
+The first `print' statement prints the record as it was read, with
+leading whitespace intact.  The assignment to `$2' rebuilds `$0' by
+concatenating `$1' through `$NF' together, separated by the value of
+`OFS'.  Since the leading whitespace was ignored when finding `$1', it
+is not part of the new `$0'.  Finally, the last `print' statement
+prints the new `$0'.
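+
+   The same rebuild can be triggered on purpose to normalize a record.
+The following is a minimal sketch, not one of the manual's own
+examples; the shell variable `rebuilt' and the choice of `-' for `OFS'
+are illustrative assumptions:
+
```shell
# Assigning a field to itself forces awk to rebuild $0 from $1..$NF,
# joined by OFS. The leading whitespace is not part of any field, so it
# disappears from the rebuilt record.
rebuilt=$(echo '   a b c d' | awk 'BEGIN { OFS = "-" } { $1 = $1; print }')
echo "$rebuilt"    # prints a-b-c-d
```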
+
+
+File: gawk.info,  Node: Single Character Fields,  Next: Command Line Field Separator,  Prev: Regexp Field Splitting,  Up: Field Separators
+
+Making Each Character a Separate Field
+--------------------------------------
+
+   There are times when you may want to examine each character of a
+record separately.  In `gawk', this is easy to do: simply assign the
+null string (`""') to `FS'.  Each individual character in the record
+then becomes a separate field.  Here is an example:
+
+     echo a b | gawk 'BEGIN { FS = "" }
+                      {
+                          for (i = 1; i <= NF; i = i + 1)
+                              print "Field", i, "is", $i
+                      }'
+
+The output from this is:
+
+     Field 1 is a
+     Field 2 is
+     Field 3 is b
+
+   Traditionally, the behavior for `FS' equal to `""' was not defined.
+In this case, Unix `awk' would simply treat the entire record as only
+having one field (d.c.).  In compatibility mode (*note Command Line
+Options: Options.), if `FS' is the null string, then `gawk' will also
+behave this way.
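+
+   Since the null-string behavior is not portable, a program that must
+run under other `awk' implementations can walk the record with
+`length' and `substr' instead.  This is only a sketch of that
+alternative; the shell variable `chars' and the output wording are
+assumptions made for illustration:
+
```shell
# Examine each character of the record without relying on FS = "".
chars=$(echo 'a b' | awk '{
    n = length($0)
    for (i = 1; i <= n; i++)
        printf "Char %d is <%s>\n", i, substr($0, i, 1)
}')
echo "$chars"
```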
+
+
+File: gawk.info,  Node: Command Line Field Separator,  Next: Field Splitting Summary,  Prev: Single Character Fields,  Up: Field Separators
+
+Setting `FS' from the Command Line
+----------------------------------
+
+   `FS' can be set on the command line.  You use the `-F' option to do
+so.  For example:
+
+     awk -F, 'PROGRAM' INPUT-FILES
+
+sets `FS' to be the `,' character.  Notice that the option uses a
+capital `F'.  Contrast this with `-f', which specifies a file
+containing an `awk' program.  Case is significant in command line
+options: the `-F' and `-f' options have nothing to do with each other.
+You can use both options at the same time to set the `FS' variable
+_and_ get an `awk' program from a file.
+
+   The value used for the argument to `-F' is processed in exactly the
+same way as assignments to the built-in variable `FS'.  This means that
+if the field separator contains special characters, they must be escaped
+appropriately.  For example, to use a `\' as the field separator, you
+would have to type:
+
+     # same as FS = "\\"
+     awk -F\\\\ '...' files ...
+
+Since `\' is used for quoting in the shell, `awk' will see `-F\\'.
+Then `awk' processes the `\\' for escape characters (*note Escape
+Sequences::), finally yielding a single `\' to be used for the field
+separator.
+
+   As a special case, in compatibility mode (*note Command Line
+Options: Options.), if the argument to `-F' is `t', then `FS' is set to
+the tab character.  This is because if you type `-F\t' at the shell,
+without any quotes, the `\' gets deleted, so `awk' figures that you
+really want your fields to be separated with tabs, and not `t's.  Use
+`-v FS="t"' on the command line if you really do want to separate your
+fields with `t's (*note Command Line Options: Options.).
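+
+   If you quote the argument, the shell passes the backslash through
+and `awk' processes `\t' as an escape sequence, so the special case is
+not needed.  A small sketch (the input text and the shell variable
+`second' are assumptions made for illustration):
+
```shell
# The single quotes keep the shell from removing the backslash, so awk
# receives \t and turns it into a tab, just as in FS = "\t".
second=$(printf 'one two\tthree\n' | awk -F'\t' '{ print $2 }')
echo "$second"    # prints three
```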
+
+   For example, let's use an `awk' program file called `baud.awk' that
+contains the pattern `/300/', and the action `print $1'.  Here is the
+program:
+
+     /300/   { print $1 }
+
+   Let's also set `FS' to be the `-' character, and run the program on
+the file `BBS-list'.  The following command prints a list of the names
+of the bulletin boards that operate at 300 baud and the first three
+digits of their phone numbers:
+
+     $ awk -F- -f baud.awk BBS-list
+     -| aardvark     555
+     -| alpo
+     -| barfly       555
+     ...
+
+Note the second line of output.  In the original file (*note Data Files
+for the Examples: Sample Data Files.), the second line looked like this:
+
+     alpo-net     555-3412     2400/1200/300     A
+
+   The `-' as part of the system's name was used as the field
+separator, instead of the `-' in the phone number that was originally
+intended.  This demonstrates why you have to be careful in choosing
+your field and record separators.
+
+   On many Unix systems, each user has a separate entry in the system
+password file, one line per user.  The information in these lines is
+separated by colons.  The first field is the user's logon name, and the
+second is the user's encrypted password.  A password file entry might
+look like this:
+
+     arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
+
+   The following program searches the system password file, and prints
+the entries for users who have no password:
+
+     awk -F: '$2 == ""' /etc/passwd
+
+
+File: gawk.info,  Node: Field Splitting Summary,  Prev: Command Line Field Separator,  Up: Field Separators
+
+Field Splitting Summary
+-----------------------
+
+   According to the POSIX standard, `awk' is supposed to behave as if
+each record is split into fields at the time that it is read.  In
+particular, this means that you can change the value of `FS' after a
+record is read, and the value of the fields (i.e. how they were split)
+should reflect the old value of `FS', not the new one.
+
+   However, many implementations of `awk' do not work this way.
+Instead, they defer splitting the fields until a field is actually
+referenced.  The fields will be split using the _current_ value of
+`FS'! (d.c.)  This behavior can be difficult to diagnose. The following
+example illustrates the difference between the two methods.  (The
+`sed'(1) command prints just the first line of `/etc/passwd'.)
+
+     sed 1q /etc/passwd | awk '{ FS = ":" ; print $1 }'
+
+will usually print
+
+     root
+
+on an incorrect implementation of `awk', while `gawk' will print
+something like
+
+     root:nSijPlPhZZwgE:0:0:Root:/:
+
+   The following table summarizes how fields are split, based on the
+value of `FS'. (`==' means "is equal to.")
+
+`FS == " "'
+     Fields are separated by runs of whitespace.  Leading and trailing
+     whitespace are ignored.  This is the default.
+
+`FS == ANY OTHER SINGLE CHARACTER'
+     Fields are separated by each occurrence of the character.  Multiple
+     successive occurrences delimit empty fields, as do leading and
+     trailing occurrences.  The character can even be a regexp
+     metacharacter; it does not need to be escaped.
+
+`FS == REGEXP'
+     Fields are separated by occurrences of characters that match
+     REGEXP.  Leading and trailing matches of REGEXP delimit empty
+     fields.
+
+`FS == ""'
+     Each individual character in the record becomes a separate field.
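+
+   The first three rows of the table can be checked with a small
+helper.  This is a sketch only; the shell function `count_fields' is a
+hypothetical name introduced here, and it prints `NF' for one input
+line under a given value of `FS':
+
```shell
# count_fields LINE FS-VALUE: print the number of fields awk finds.
count_fields() {
    printf '%s\n' "$1" | awk -v FS="$2" '{ print NF }'
}

count_fields ' a  b ' ' '       # single space: prints 2
count_fields ',a,,b,' ','       # any other single character: prints 5
count_fields ' a  b ' '[ ]+'    # regexp: prints 4
```
+
+   In the last two calls the leading and trailing separators delimit
+empty fields, which is why the counts are larger than the number of
+letters.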
+
+   ---------- Footnotes ----------
+
+   (1) The `sed' utility is a "stream editor."  Its behavior is also
+defined by the POSIX standard.
+
+
+File: gawk.info,  Node: Constant Size,  Next: Multiple Line,  Prev: Field Separators,  Up: Reading Files
+
+Reading Fixed-width Data
+========================
+
+   (This section discusses an advanced, experimental feature.  If you
+are a novice `awk' user, you may wish to skip it on the first reading.)
+
+   `gawk' version 2.13 introduced a new facility for dealing with
+fixed-width fields with no distinctive field separator.  Data of this
+nature arises, for example, in the input for old FORTRAN programs where
+numbers are run together; or in the output of programs that did not
+anticipate the use of their output as input for other programs.
+
+   An example of the latter is a table where all the columns are lined
+up by the use of a variable number of spaces and _empty fields are just
+spaces_.  Clearly, `awk''s normal field splitting based on `FS' will
+not work well in this case.  Although a portable `awk' program can use
+a series of `substr' calls on `$0' (*note Built-in Functions for String
+Manipulation: String Functions.), this is awkward and inefficient for a
+large number of fields.
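+
+   For comparison, here is a sketch of the portable `substr' approach.
+The column widths (9, 6, 10 and 6), the sample line built with
+`printf', and the variable names are all assumptions made for this
+illustration:
+
```shell
# Build one fixed-width line: user (9), tty (6), login (10), idle (6).
line=$(printf '%-9s%-6s%10s%6s' dportein ttyV6 8:17pm 1:47)

# Pull each column out by position, then trim the padding by hand.
summary=$(printf '%s\n' "$line" | awk '{
    user = substr($0, 1, 9);  sub(/ +$/, "", user)
    tty  = substr($0, 10, 6); sub(/ +$/, "", tty)
    idle = substr($0, 26, 6); sub(/^ +/, "", idle)
    print user, tty, idle
}')
echo "$summary"    # prints dportein ttyV6 1:47
```
+
+   Every column needs its own starting offset, which is the bookkeeping
+that becomes tedious as the number of fields grows.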
+
+   The splitting of an input record into fixed-width fields is
+specified by assigning a string containing space-separated numbers to
+the built-in variable `FIELDWIDTHS'.  Each number specifies the width
+of the field _including_ columns between fields.  If you want to ignore
+the columns between fields, you can specify the width as a separate
+field that is subsequently ignored.
+
+   The following data is the output of the Unix `w' utility.  It is
+useful to illustrate the use of `FIELDWIDTHS'.
+
+      10:06pm  up 21 days, 14:04,  23 users
+     User     tty       login  idle   JCPU   PCPU  what
+     hzuo     ttyV0     8:58pm            9      5  vi p24.tex
+     hzang    ttyV3     6:37pm    50                -csh
+     eklye    ttyV5     9:53pm            7      1  em thes.tex
+     dportein ttyV6     8:17pm  1:47                -csh
+     gierd    ttyD3    10:00pm     1                elm
+     dave     ttyD4     9:47pm            4      4  w
+     brent    ttyp0    26Jun91  4:46  26:46   4:41  bash
+     dave     ttyq4    26Jun9115days     46     46  wnewmail
+
+   The following program takes the above input, converts the idle time
+to number of seconds and prints out the first two fields and the
+calculated idle time.  (This program uses a number of `awk' features
+that haven't been introduced yet.)
+
+     BEGIN  { FIELDWIDTHS = "9 6 10 6 7 7 35" }
+     NR > 2 {
+         idle = $4
+         sub(/^  */, "", idle)   # strip leading spaces
+         if (idle == "")
+             idle = 0
+         if (idle ~ /:/) {
+             split(idle, t, ":")
+             idle = t[1] * 60 + t[2]
+         }
+         if (idle ~ /days/)
+             idle *= 24 * 60 * 60
+     
+         print $1, $2, idle
+     }
+
+   Here is the result of running the program on the data:
+
+     hzuo      ttyV0  0
+     hzang     ttyV3  50
+     eklye     ttyV5  0
+     dportein  ttyV6  107
+     gierd     ttyD3  1
+     dave      ttyD4  0
+     brent     ttyp0  286
+     dave      ttyq4  1296000
+
+   Another (possibly more practical) example of fixed-width input data
+would be the input from a deck of balloting cards.  In some parts of
+the United States, voters mark their choices by punching holes in
+computer cards.  These cards are then processed to count the votes for
+any particular candidate or on any particular issue.  Since a voter may
+choose not to vote on some issue, any column on the card may be empty.
+An `awk' program for processing such data could use the `FIELDWIDTHS'
+feature to simplify reading the data.  (Of course, getting `gawk' to
+run on a system with card readers is another story!)
+
+   Assigning a value to `FS' causes `gawk' to return to using `FS' for
+field splitting.  Use `FS = FS' to make this happen, without having to
+know the current value of `FS'.
+
+   This feature is still experimental, and may evolve over time.  Note
+in particular that `gawk' does not attempt to verify the sanity of the
+values assigned to `FIELDWIDTHS'.
+
+
+File: gawk.info,  Node: Multiple Line,  Next: Getline,  Prev: Constant Size,  Up: Reading Files
+
+Multiple-Line Records
+=====================
+
+   In some databases, a single line cannot conveniently hold all the
+information in one entry.  In such cases, you can use multi-line
+records.
+
+   The first step in doing this is to choose your data format: when
+records are not defined as single lines, how do you want to define them?
+What should separate records?
+
+   One technique is to use an unusual character or string to separate
+records.  For example, you could use the formfeed character (written
+`\f' in `awk', as in C) to separate them, making each record a page of
+the file.  To do this, just set the variable `RS' to `"\f"' (a string
+containing the formfeed character).  Any other character could equally
+well be used, as long as it won't be part of the data in a record.
+
+   Another technique is to have blank lines separate records.  By a
+special dispensation, an empty string as the value of `RS' indicates
+that records are separated by one or more blank lines.  If you set `RS'
+to the empty string, a record always ends at the first blank line
+encountered.  And the next record doesn't start until the first
+non-blank line that follows--no matter how many blank lines appear in a
+row, they are considered one record-separator.
+
+   You can achieve the same effect as `RS = ""' by assigning the string
+`"\n\n+"' to `RS'. This regexp matches the newline at the end of the
+record, and one or more blank lines after the record.  In addition, a
+regular expression always matches the longest possible sequence when
+there is a choice (*note How Much Text Matches?: Leftmost Longest.)  So
+the next record doesn't start until the first non-blank line that
+follows--no matter how many blank lines appear in a row, they are
+considered one record-separator.
+
+   There is an important difference between `RS = ""' and `RS =
+"\n\n+"'. In the first case, leading newlines in the input data file
+are ignored, and if a file ends without extra blank lines after the
+last record, the final newline is removed from the record.  In the
+second case, this special processing is not done (d.c.).
+
+   Now that the input is separated into records, the second step is to
+separate the fields in the record.  One way to do this is to divide each
+of the lines into fields in the normal manner.  This happens by default
+as the result of a special feature: when `RS' is set to the empty
+string, the newline character _always_ acts as a field separator.  This
+is in addition to whatever field separations result from `FS'.
+
+   The original motivation for this special exception was probably to
+provide useful behavior in the default case (i.e. `FS' is equal to
+`" "').  This feature can be a problem if you really don't want the
+newline character to separate fields, since there is no way to prevent
+it.  However, you can work around this by using the `split' function to
+break up the record manually (*note Built-in Functions for String
+Manipulation: String Functions.).
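+
+   A sketch of that workaround follows.  The input text, the array name
+`lines' and the shell variable `counts' are assumptions made for
+illustration.  For each record it prints `NF' (where newline has acted
+as a separator) next to the number of lines found by `split':
+
```shell
# With RS = "", newline always separates fields, so NF counts words
# across all lines of the record. split() on "\n" recovers the lines.
counts=$(printf 'a b\nc d\n\ne f\n' | awk 'BEGIN { RS = "" }
{
    n = split($0, lines, "\n")
    print NF, n
}')
echo "$counts"
```
+
+   The first record yields `4 2' (four fields, two lines) and the
+second yields `2 1'.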
+
+   Another way to separate fields is to put each field on a separate
+line: to do this, just set the variable `FS' to the string `"\n"'.
+(This simple regular expression matches a single newline.)
+
+   A practical example of a data file organized this way might be a
+mailing list, where each entry is separated by blank lines.  If we have
+a mailing list in a file named `addresses', that looks like this:
+
+     Jane Doe
+     123 Main Street
+     Anywhere, SE 12345-6789
+     
+     John Smith
+     456 Tree-lined Avenue
+     Smallville, MW 98765-4321
+     
+     ...
+
+A simple program to process this file would look like this:
+
+     # addrs.awk --- simple mailing list program
+     
+     # Records are separated by blank lines.
+     # Each line is one field.
+     BEGIN { RS = "" ; FS = "\n" }
+     
+     {
+           print "Name is:", $1
+           print "Address is:", $2
+           print "City and State are:", $3
+           print ""
+     }
+
+   Running the program produces the following output:
+
+     $ awk -f addrs.awk addresses
+     -| Name is: Jane Doe
+     -| Address is: 123 Main Street
+     -| City and State are: Anywhere, SE 12345-6789
+     -|
+     -| Name is: John Smith
+     -| Address is: 456 Tree-lined Avenue
+     -| City and State are: Smallville, MW 98765-4321
+     -|
+     ...
+
+   *Note Printing Mailing Labels: Labels Program, for a more realistic
+program that deals with address lists.
+
+   The following table summarizes how records are split, based on the
+value of `RS'. (`==' means "is equal to.")
+
+`RS == "\n"'
+     Records are separated by the newline character (`\n').  In effect,
+     every line in the data file is a separate record, including blank
+     lines.  This is the default.
+
+`RS == ANY SINGLE CHARACTER'
+     Records are separated by each occurrence of the character.
+     Multiple successive occurrences delimit empty records.
+
+`RS == ""'
+     Records are separated by runs of blank lines.  The newline
+     character always serves as a field separator, in addition to
+     whatever value `FS' may have. Leading and trailing newlines in a
+     file are ignored.
+
+`RS == REGEXP'
+     Records are separated by occurrences of characters that match
+     REGEXP.  Leading and trailing matches of REGEXP delimit empty
+     records.
+
+   In all cases, `gawk' sets `RT' to the input text that matched the
+value specified by `RS'.
+
+
+File: gawk.info,  Node: Getline,  Prev: Multiple Line,  Up: Reading Files
+
+Explicit Input with `getline'
+=============================
+
+   So far we have been getting our input data from `awk''s main input
+stream--either the standard input (usually your terminal, sometimes the
+output from another program) or from the files specified on the command
+line.  The `awk' language has a special built-in command called
+`getline' that can be used to read input under your explicit control.
+
+* Menu:
+
+* Getline Intro::            Introduction to the `getline' function.
+* Plain Getline::            Using `getline' with no arguments.
+* Getline/Variable::         Using `getline' into a variable.
+* Getline/File::             Using `getline' from a file.
+* Getline/Variable/File::    Using `getline' into a variable from a
+                             file.
+* Getline/Pipe::             Using `getline' from a pipe.
+* Getline/Variable/Pipe::    Using `getline' into a variable from a
+                             pipe.
+* Getline Summary::          Summary of `getline' variants.
+
+
+File: gawk.info,  Node: Getline Intro,  Next: Plain Getline,  Prev: Getline,  Up: Getline
+
+Introduction to `getline'
+-------------------------
+
+   This command is used in several different ways, and should _not_ be
+used by beginners.  It is covered here because this is the chapter on
+input.  The examples that follow the explanation of the `getline'
+command include material that has not been covered yet.  Therefore,
+come back and study the `getline' command _after_ you have reviewed the
+rest of this Info file and have a good knowledge of how `awk' works.
+
+   `getline' returns one if it finds a record, and zero if the end of
+the file is encountered.  If there is some error in getting a record,
+such as a file that cannot be opened, then `getline' returns -1.  In
+this case, `gawk' sets the variable `ERRNO' to a string describing the
+error that occurred.
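+
+   The three return values can be observed directly.  In this sketch
+the shell function `probe' and the path `/nonexistent/hypothetical-file'
+are names invented for illustration; the path is assumed not to exist:
+
```shell
# probe FILE: print the value returned by one getline from FILE.
probe() {
    awk -v file="$1" 'BEGIN {
        result = (getline line < file)
        print result
    }'
}

probe /nonexistent/hypothetical-file    # cannot be opened: prints -1
probe /dev/null                         # immediate end of file: prints 0
```
+
+   Because -1 counts as true, a loop written as `while (getline line <
+file)' never terminates when the file cannot be opened.  Comparing the
+result with `> 0', as the examples in this chapter do, avoids that.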
+
+   In the following examples, COMMAND stands for a string value that
+represents a shell command.
+
+
+File: gawk.info,  Node: Plain Getline,  Next: Getline/Variable,  Prev: Getline Intro,  Up: Getline
+
+Using `getline' with No Arguments
+---------------------------------
+
+   The `getline' command can be used without arguments to read input
+from the current input file.  All it does in this case is read the next
+input record and split it up into fields.  This is useful if you've
+finished processing the current record, but you want to do some special
+processing _right now_ on the next record.  Here's an example:
+
+     awk '{
+          if ((t = index($0, "/*")) != 0) {
+               # value will be "" if t is 1
+               tmp = substr($0, 1, t - 1)
+               u = index(substr($0, t + 2), "*/")
+               while (u == 0) {
+                    if (getline <= 0) {
+                         m = "unexpected EOF or error"
+                         m = (m ": " ERRNO)
+                         print m > "/dev/stderr"
+                         exit
+                    }
+                    t = -1
+                    u = index($0, "*/")
+               }
+               # substr expression will be "" if */
+               # occurred at end of line
+               $0 = tmp substr($0, t + u + 3)
+          }
+          print $0
+     }'
+
+   This `awk' program deletes all C-style comments, `/* ...  */', from
+the input.  By replacing the `print $0' with other statements, you
+could perform more complicated processing on the decommented input,
+like searching for matches of a regular expression.  This program has a
+subtle problem--it does not work if one comment ends and another begins
+on the same line.
+
+   This form of the `getline' command sets `NF' (the number of fields;
+*note Examining Fields: Fields.), `NR' (the number of records read so
+far; *note How Input is Split into Records: Records.), `FNR' (the
+number of records read from this input file), and the value of `$0'.
+
+   *Note:* the new value of `$0' is used in testing the patterns of any
+subsequent rules.  The original value of `$0' that triggered the rule
+which executed `getline' is lost (d.c.).  By contrast, the `next'
+statement reads a new record but immediately begins processing it
+normally, starting with the first rule in the program.  *Note The
+`next' Statement: Next Statement.
+
+
+File: gawk.info,  Node: Getline/Variable,  Next: Getline/File,  Prev: Plain Getline,  Up: Getline
+
+Using `getline' Into a Variable
+-------------------------------
+
+   You can use `getline VAR' to read the next record from `awk''s input
+into the variable VAR.  No other processing is done.
+
+   For example, suppose the next line is a comment, or a special string,
+and you want to read it, without triggering any rules.  This form of
+`getline' allows you to read that line and store it in a variable so
+that the main read-a-line-and-check-each-rule loop of `awk' never sees
+it.
+
+   The following example swaps every two lines of input.  For example,
+given:
+
+     wan
+     tew
+     free
+     phore
+
+it outputs:
+
+     tew
+     wan
+     phore
+     free
+
+Here's the program:
+
+     awk '{
+          if ((getline tmp) > 0) {
+               print tmp
+               print $0
+          } else
+               print $0
+     }'
+
+   The `getline' command used in this way sets only the variables `NR'
+and `FNR' (and of course, VAR).  The record is not split into fields,
+so the values of the fields (including `$0') and the value of `NF' do
+not change.
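+
+   The following sketch makes this visible; the variable names `nxt'
+and `state' are assumptions made for illustration.  After the call,
+`$0' and `NF' still describe the first line, while `NR' has advanced:
+
```shell
# Read the second line into nxt while the first line is still in $0.
state=$(printf 'one\ntwo\n' | awk 'NR == 1 {
    getline nxt
    print $0, nxt, NF, NR
}')
echo "$state"    # prints one two 1 2
```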
+
+
+File: gawk.info,  Node: Getline/File,  Next: Getline/Variable/File,  Prev: Getline/Variable,  Up: Getline
+
+Using `getline' from a File
+---------------------------
+
+   Use `getline < FILE' to read the next record from the file FILE.
+Here FILE is a string-valued expression that specifies the file name.
+`< FILE' is called a "redirection" since it directs input to come from
+a different place.
+
+   For example, the following program reads its input record from the
+file `secondary.input' when it encounters a first field with a value
+equal to 10 in the current input file.
+
+     awk '{
+         if ($1 == 10) {
+              getline < "secondary.input"
+              print
+         } else
+              print
+     }'
+
+   Since the main input stream is not used, the values of `NR' and
+`FNR' are not changed.  But the record read is split into fields in the
+normal manner, so the values of `$0' and other fields are changed.  So
+is the value of `NF'.
+
+   According to POSIX, `getline < EXPRESSION' is ambiguous if
+EXPRESSION contains unparenthesized operators other than `$'; for
+example, `getline < dir "/" file' is ambiguous because the
+concatenation operator is not parenthesized, and you should write it as
+`getline < (dir "/" file)' if you want your program to be portable to
+other `awk' implementations.
+
+
+File: gawk.info,  Node: Getline/Variable/File,  Next: Getline/Pipe,  Prev: Getline/File,  Up: Getline
+
+Using `getline' Into a Variable from a File
+-------------------------------------------
+
+   Use `getline VAR < FILE' to read input from the file FILE and put it
+in the variable VAR.  As above, FILE is a string-valued expression that
+specifies the file from which to read.
+
+   In this version of `getline', none of the built-in variables are
+changed, and the record is not split into fields.  The only variable
+changed is VAR.
+
+   According to POSIX, `getline VAR < EXPRESSION' is ambiguous if
+EXPRESSION contains unparenthesized operators other than `$'; for
+example, `getline VAR < dir "/" file' is ambiguous because the
+concatenation operator is not parenthesized, and you should write it as
+`getline VAR < (dir "/" file)' if you want your program to be portable
+to other `awk' implementations.
+
+   For example, the following program copies all the input files to the
+output, except for records that say `@include FILENAME'.  Such a record
+is replaced by the contents of the file FILENAME.
+
+     awk '{
+          if (NF == 2 && $1 == "@include") {
+               while ((getline line < $2) > 0)
+                    print line
+               close($2)
+          } else
+               print
+     }'
+
+   Note here how the name of the extra input file is not built into the
+program; it is taken directly from the data, from the second field on
+the `@include' line.
+
+   The `close' function is called to ensure that if two identical
+`@include' lines appear in the input, the entire specified file is
+included twice.  *Note Closing Input and Output Files and Pipes: Close
+Files And Pipes.
+
+   One deficiency of this program is that it does not process nested
+`@include' statements (`@include' statements in included files) the way
+a true macro preprocessor would.  *Note An Easy Way to Use Library
+Functions: Igawk Program, for a program that does handle nested
+`@include' statements.
+
+
+File: gawk.info,  Node: Getline/Pipe,  Next: Getline/Variable/Pipe,  Prev: Getline/Variable/File,  Up: Getline
+
+Using `getline' from a Pipe
+---------------------------
+
+   You can pipe the output of a command into `getline', using `COMMAND
+| getline'.  In this case, the string COMMAND is run as a shell command
+and its output is piped into `awk' to be used as input.  This form of
+`getline' reads one record at a time from the pipe.
+
+   For example, the following program copies its input to its output,
+except for lines that begin with `@execute', which are replaced by the
+output produced by running the rest of the line as a shell command:
+
+     awk '{
+          if ($1 == "@execute") {
+               tmp = substr($0, 10)
+               while ((tmp | getline) > 0)
+                    print
+               close(tmp)
+          } else
+               print
+     }'
+
+The `close' function is called to ensure that if two identical
+`@execute' lines appear in the input, the command is run for each one.
+*Note Closing Input and Output Files and Pipes: Close Files And Pipes.
+
+   Given the input:
+
+     foo
+     bar
+     baz
+     @execute who
+     bletch
+
+the program might produce:
+
+     foo
+     bar
+     baz
+     arnold     ttyv0   Jul 13 14:22
+     miriam     ttyp0   Jul 13 14:23     (murphy:0)
+     bill       ttyp1   Jul 13 14:23     (murphy:0)
+     bletch
+
+Notice that this program ran the command `who' and printed the result.
+(If you try this program yourself, you will of course get different
+results, showing you who is logged in on your system.)
+
+   This variation of `getline' splits the record into fields, sets the
+value of `NF' and recomputes the value of `$0'.  The values of `NR' and
+`FNR' are not changed.
+
+   According to POSIX, `EXPRESSION | getline' is ambiguous if
+EXPRESSION contains unparenthesized operators other than `$'; for
+example, `"echo " "date" | getline' is ambiguous because the
+concatenation operator is not parenthesized, and you should write it as
+`("echo " "date") | getline' if you want your program to be portable to
+other `awk' implementations.
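+
+   Here is a sketch of the parenthesized form in use.  The command
+`echo hello' and the shell variable `word' are assumptions made for
+illustration:
+
```shell
# Parenthesize the concatenation so the command string is unambiguous.
word=$(awk 'BEGIN {
    ("echo " "hello") | getline
    close("echo hello")     # same string as the command that was run
    print $0
}')
echo "$word"    # prints hello
```
+
+   Building the command in a variable first, and then using that
+variable both with `getline' and with `close', is another way to keep
+the two strings identical.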
+
+
+File: gawk.info,  Node: Getline/Variable/Pipe,  Next: Getline Summary,  Prev: Getline/Pipe,  Up: Getline
+
+Using `getline' Into a Variable from a Pipe
+-------------------------------------------
+
+   When you use `COMMAND | getline VAR', the output of the command
+COMMAND is sent through a pipe to `getline' and into the variable VAR.
+For example, the following program reads the current date and time into
+the variable `current_time', using the `date' utility, and then prints
+it.
+
+     awk 'BEGIN {
+          "date" | getline current_time
+          close("date")
+          print "Report printed on " current_time
+     }'
+
+   In this version of `getline', none of the built-in variables are
+changed, and the record is not split into fields.
+
+   According to POSIX, `EXPRESSION | getline VAR' is ambiguous if
+EXPRESSION contains unparenthesized operators other than `$'; for
+example, `"echo " "date" | getline VAR' is ambiguous because the
+concatenation operator is not parenthesized, and you should write it as
+`("echo " "date") | getline VAR' if you want your program to be
+portable to other `awk' implementations.
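+
+   When a command produces several lines, the same form can be used in
+a loop.  This is a sketch only; the command string and the variable
+names `cmd', `line', `n' and `total' are assumptions made for
+illustration:
+
```shell
# Read every line the command prints, one record per getline call.
total=$(awk 'BEGIN {
    cmd = "echo a; echo b; echo c"
    while ((cmd | getline line) > 0)
        n++
    close(cmd)
    print n
}')
echo "$total"    # prints 3
```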
+
+
+File: gawk.info,  Node: Getline Summary,  Prev: Getline/Variable/Pipe,  Up: Getline
+
+Summary of `getline' Variants
+-----------------------------
+
+   With all the forms of `getline', even though `$0' and `NF' may be
+updated, the record will not be tested against all the patterns in the
+`awk' program, in the way that would happen if the record were read
+normally by the main processing loop of `awk'.  However, the new record
+is tested against any subsequent rules.
+
+   Many `awk' implementations limit the number of pipelines an `awk'
+program may have open to just one!  In `gawk', there is no such limit.
+You can open as many pipelines as the underlying operating system will
+permit.
+
+   An interesting side-effect occurs if you use `getline' (without a
+redirection) inside a `BEGIN' rule. Since an unredirected `getline'
+reads from the command line data files, the first `getline' command
+causes `awk' to set the value of `FILENAME'. Normally, `FILENAME' does
+not have a value inside `BEGIN' rules, since you have not yet started
+to process the command line data files (d.c.).  (*Note The `BEGIN' and
+`END' Special Patterns: BEGIN/END, also *note Built-in Variables that
+Convey Information: Auto-set..)
+
+   The following table summarizes the six variants of `getline',
+listing which built-in variables are set by each one.
+
+`getline'
+     sets `$0', `NF', `FNR', and `NR'.
+
+`getline VAR'
+     sets VAR, `FNR', and `NR'.
+
+`getline < FILE'
+     sets `$0' and `NF'.
+
+`getline VAR < FILE'
+     sets VAR.
+
+`COMMAND | getline'
+     sets `$0' and `NF'.
+
+`COMMAND | getline VAR'
+     sets VAR.
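+
+   Two rows of the table can be checked from a `BEGIN' rule, where `NR'
+starts at zero.  In this sketch the commands and the names `v' and
+`vars' are assumptions made for illustration:
+
```shell
# COMMAND | getline sets $0 and NF; COMMAND | getline VAR sets only VAR.
vars=$(awk 'BEGIN {
    "echo a b c" | getline
    print NF, NR
    "echo d e" | getline v
    print NF, NR, v
}')
echo "$vars"
```
+
+   The first line of output is `3 0': `NF' was set and `NR' was not.
+The second is `3 0 d e': reading into `v' left `NF' at its earlier
+value.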
+
+
+File: gawk.info,  Node: Printing,  Next: Expressions,  Prev: Reading Files,  Up: Top
+
+Printing Output
+***************
+
+   One of the most common actions is to "print", or output, some or all
+of the input.  You use the `print' statement for simple output.  You
+use the `printf' statement for fancier formatting.  Both are described
+in this chapter.
+
+* Menu:
+
+* Print::                       The `print' statement.
+* Print Examples::              Simple examples of `print' statements.
+* Output Separators::           The output separators and how to change them.
+* OFMT::                        Controlling Numeric Output With `print'.
+* Printf::                      The `printf' statement.
+* Redirection::                 How to redirect output to multiple files and
+                                pipes.
+* Special Files::               File name interpretation in `gawk'.
+                                `gawk' allows access to inherited file
+                                descriptors.
+* Close Files And Pipes::       Closing Input and Output Files and Pipes.
+
+
+File: gawk.info,  Node: Print,  Next: Print Examples,  Prev: Printing,  Up: Printing
+
+The `print' Statement
+=====================
+
+   The `print' statement does output with simple, standardized
+formatting.  You specify only the strings or numbers to be printed, in a
+list separated by commas.  They are output, separated by single spaces,
+followed by a newline.  The statement looks like this:
+
+     print ITEM1, ITEM2, ...
+
+The entire list of items may optionally be enclosed in parentheses.  The
+parentheses are necessary if any of the item expressions uses the `>'
+relational operator; otherwise it could be confused with a redirection
+(*note Redirecting Output of `print' and `printf': Redirection.).
+
+   The items to be printed can be constant strings or numbers, fields
+of the current record (such as `$1'), variables, or any `awk'
+expressions.  Numeric values are converted to strings, and then printed.
+
+   The `print' statement is completely general for computing _what_
+values to print. However, with two exceptions, you cannot specify _how_
+to print them--how many columns, whether to use exponential notation or
+not, and so on.  (For the exceptions, *note Output Separators::, and
+*Note Controlling Numeric Output with `print': OFMT.)  For that, you
+need the `printf' statement (*note Using `printf' Statements for
+Fancier Printing: Printf.).
+
+   The simple statement `print' with no items is equivalent to `print
+$0': it prints the entire current record.  To print a blank line, use
+`print ""', where `""' is the empty string.
+
+   To print a fixed piece of text, use a string constant such as
+`"Don't Panic"' as one item.  If you forget to use the double-quote
+characters, your text will be taken as an `awk' expression, and you
+will probably get an error.  Keep in mind that a space is printed
+between any two items.
+
+   Each `print' statement makes at least one line of output.  But it
+isn't limited to one line.  If an item value is a string that contains a
+newline, the newline is output along with the rest of the string.  A
+single `print' can make any number of lines this way.
+
+
+File: gawk.info,  Node: Print Examples,  Next: Output Separators,  Prev: Print,  Up: Printing
+
+Examples of `print' Statements
+==============================
+
+   Here is an example of printing a string that contains embedded
+newlines (the `\n' is an escape sequence, used to represent the newline
+character; see *Note Escape Sequences::):
+
+     $ awk 'BEGIN { print "line one\nline two\nline three" }'
+     -| line one
+     -| line two
+     -| line three
+
+   Here is an example that prints the first two fields of each input
+record, with a space between them:
+
+     $ awk '{ print $1, $2 }' inventory-shipped
+     -| Jan 13
+     -| Feb 15
+     -| Mar 15
+     ...
+
+   A common mistake in using the `print' statement is to omit the comma
+between two items.  This often has the effect of making the items run
+together in the output, with no space.  The reason for this is that
+juxtaposing two string expressions in `awk' means to concatenate them.
+Here is the same program, without the comma:
+
+     $ awk '{ print $1 $2 }' inventory-shipped
+     -| Jan13
+     -| Feb15
+     -| Mar15
+     ...
+
+   To someone unfamiliar with the file `inventory-shipped', neither
+example's output makes much sense.  A heading line at the beginning
+would make it clearer.  Let's add some headings to our table of months
+(`$1') and green crates shipped (`$2').  We do this using the `BEGIN'
+pattern (*note The `BEGIN' and `END' Special Patterns: BEGIN/END.)  to
+force the headings to be printed only once:
+
+     awk 'BEGIN {  print "Month Crates"
+                   print "----- ------" }
+                {  print $1, $2 }' inventory-shipped
+
+Did you already guess what happens? When run, the program prints the
+following:
+
+     Month Crates
+     ----- ------
+     Jan 13
+     Feb 15
+     Mar 15
+     ...
+
+The headings and the table data don't line up!  We can fix this by
+printing some spaces between the two fields:
+
+     awk 'BEGIN { print "Month Crates"
+                  print "----- ------" }
+                { print $1, "     ", $2 }' inventory-shipped
+
+   You can imagine that this way of lining up columns can get pretty
+complicated when you have many columns to fix.  Counting spaces for two
+or three columns can be simple, but more than this and you can get lost
+quite easily.  This is why the `printf' statement was created (*note
+Using `printf' Statements for Fancier Printing: Printf.); one of its
+specialties is lining up columns of data.
+
+   As a side point, you can continue either a `print' or `printf'
+statement simply by putting a newline after any comma (*note `awk'
+Statements Versus Lines: Statements/Lines.).
+
+
+File: gawk.info,  Node: Output Separators,  Next: OFMT,  Prev: Print Examples,  Up: Printing
+
+Output Separators
+=================
+
+   As mentioned previously, a `print' statement contains a list of
+items, separated by commas.  In the output, the items are normally
+separated by single spaces.  This need not be the case; a single space
+is only the default.  You can specify any string of characters to use
+as the "output field separator" by setting the built-in variable `OFS'.
+The initial value of this variable is the string `" "', that is, a
+single space.
+
+   The output from an entire `print' statement is called an "output
+record".  Each `print' statement outputs one output record and then
+outputs a string called the "output record separator".  The built-in
+variable `ORS' specifies this string.  The initial value of `ORS' is
+the string `"\n"', i.e., a newline character; thus, normally each
+`print' statement makes a separate line.
+
+   You can change how output fields and records are separated by
+assigning new values to the variables `OFS' and/or `ORS'.  The usual
+place to do this is in the `BEGIN' rule (*note The `BEGIN' and `END'
+Special Patterns: BEGIN/END.), so that it happens before any input is
+processed.  You may also do this with assignments on the command line,
+before the names of your input files, or using the `-v' command line
+option (*note Command Line Options: Options.).
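+
+   As a brief sketch of the `-v' form (the input line `a b c' is made up
+for illustration), the separator can be set without a `BEGIN' rule:
+
```shell
# OFS is assigned before the program starts, so print joins items with "-".
echo "a b c" | awk -v OFS="-" '{ print $1, $2, $3 }'   # prints a-b-c
```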
+
+   The following example prints the first and second fields of each
+input record separated by a semicolon, with a blank line added after
+each line:
+
+     $ awk 'BEGIN { OFS = ";"; ORS = "\n\n" }
+     >            { print $1, $2 }' BBS-list
+     -| aardvark;555-5553
+     -|
+     -| alpo-net;555-3412
+     -|
+     -| barfly;555-7685
+     ...
+
+   If the value of `ORS' does not contain a newline, all your output
+will be run together on a single line, unless you output newlines some
+other way.
+
+
+File: gawk.info,  Node: OFMT,  Next: Printf,  Prev: Output Separators,  Up: Printing
+
+Controlling Numeric Output with `print'
+=======================================
+
+   When you use the `print' statement to print numeric values, `awk'
+internally converts the number to a string of characters, and prints
+that string.  `awk' uses the `sprintf' function to do this conversion
+(*note Built-in Functions for String Manipulation: String Functions.).
+For now, it suffices to say that the `sprintf' function accepts a
+"format specification" that tells it how to format numbers (or
+strings), and that there are a number of different ways in which
+numbers can be formatted.  The different format specifications are
+discussed more fully in *Note Format-Control Letters: Control Letters.
+
+   The built-in variable `OFMT' contains the default format
+specification that `print' uses with `sprintf' when it wants to convert
+a number to a string for printing.  The default value of `OFMT' is
+`"%.6g"'.  By supplying different format specifications as the value of
+`OFMT', you can change how `print' will print your numbers.  As a brief
+example:
+
+     $ awk 'BEGIN {
+     >   OFMT = "%.0f"  # print numbers as integers (rounds)
+     >   print 17.23 }'
+     -| 17
+
+According to the POSIX standard, `awk''s behavior will be undefined if
+`OFMT' contains anything but a floating point conversion specification
+(d.c.).
+
+
+File: gawk.info,  Node: Printf,  Next: Redirection,  Prev: OFMT,  Up: Printing
+
+Using `printf' Statements for Fancier Printing
+==============================================
+
+   If you want more precise control over the output format than `print'
+gives you, use `printf'.  With `printf' you can specify the width to
+use for each item, and you can specify various formatting choices for
+numbers (such as what radix to use, whether to print an exponent,
+whether to print a sign, and how many digits to print after the decimal
+point).  You do this by supplying a string, called the "format string",
+which controls how and where to print the other arguments.
+
+* Menu:
+
+* Basic Printf::                Syntax of the `printf' statement.
+* Control Letters::             Format-control letters.
+* Format Modifiers::            Format-specification modifiers.
+* Printf Examples::             Several examples.
+
+
+File: gawk.info,  Node: Basic Printf,  Next: Control Letters,  Prev: Printf,  Up: Printf
+
+Introduction to the `printf' Statement
+--------------------------------------
+
+   The `printf' statement looks like this:
+
+     printf FORMAT, ITEM1, ITEM2, ...
+
+The entire list of arguments may optionally be enclosed in parentheses.
+The parentheses are necessary if any of the item expressions use the
+`>' relational operator; otherwise it could be confused with a
+redirection (*note Redirecting Output of `print' and `printf':
+Redirection.).
+
+   The difference between `printf' and `print' is the FORMAT argument.
+This is an expression whose value is taken as a string; it specifies
+how to output each of the other arguments.  It is called the "format
+string".
+
+   The format string is very similar to that in the ANSI C library
+function `printf'.  Most of FORMAT is text to be output verbatim.
+Scattered among this text are "format specifiers", one per item.  Each
+format specifier says to output the next item in the argument list at
+that place in the format.
+
+   The `printf' statement does not automatically append a newline to its
+output.  It outputs only what the format string specifies.  So if you
+want a newline, you must include one in the format string.  The output
+separator variables `OFS' and `ORS' have no effect on `printf'
+statements. For example:
+
+     BEGIN {
+        ORS = "\nOUCH!\n"; OFS = "!"
+        msg = "Don't Panic!"; printf "%s\n", msg
+     }
+
+   This program still prints the familiar `Don't Panic!' message.
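+
+   To make the contrast with `print' concrete, here is a small sketch
+(the separator values are arbitrary) that sends the same two items
+through both statements:
+
```shell
awk 'BEGIN {
    OFS = "-"; ORS = "!\n"
    print "a", "b"              # uses OFS and ORS: a-b!
    printf "%s %s\n", "a", "b"  # ignores both:     a b
}'
```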
+
+
+File: gawk.info,  Node: Control Letters,  Next: Format Modifiers,  Prev: Basic Printf,  Up: Printf
+
+Format-Control Letters
+----------------------
+
+   A format specifier starts with the character `%' and ends with a
+"format-control letter"; it tells the `printf' statement how to output
+one item.  (If you actually want to output a `%', write `%%'.)  The
+format-control letter specifies what kind of value to print.  The rest
+of the format specifier is made up of optional "modifiers" which are
+parameters to use, such as the field width.
+
+   Here is a list of the format-control letters:
+
+`c'
+     This prints a number as an ASCII character.  Thus, `printf "%c",
+     65' outputs the letter `A'.  The output for a string value is the
+     first character of the string.
+
+`d'
+`i'
+     These are equivalent. They both print a decimal integer.  The `%i'
+     specification is for compatibility with ANSI C.
+
+`e'
+`E'
+     This prints a number in scientific (exponential) notation.  For
+     example,
+
+          printf "%4.3e\n", 1950
+
+     prints `1.950e+03', with a total of four significant figures of
+     which three follow the decimal point.  The `4.3' are modifiers,
+     discussed below. `%E' uses `E' instead of `e' in the output.
+
+`f'
+     This prints a number in floating point notation.  For example,
+
+          printf "%4.3f", 1950
+
+     prints `1950.000'.  The `4' is the minimum field width and the
+     `.3' requests three digits after the decimal point.  The `4.3'
+     are modifiers, discussed below.
+
+`g'
+`G'
+     This prints a number in either scientific notation or floating
+     point notation, whichever uses fewer characters. If the result is
+     printed in scientific notation, `%G' uses `E' instead of `e'.
+
+`o'
+     This prints an unsigned octal integer.  (In octal, or base-eight
+     notation, the digits run from `0' to `7'; the decimal number eight
+     is represented as `10' in octal.)
+
+`s'
+     This prints a string.
+
+`x'
+`X'
+     This prints an unsigned hexadecimal integer.  (In hexadecimal, or
+     base-16 notation, the digits are `0' through `9' and `a' through
+     `f'.  The hexadecimal digit `f' represents the decimal number 15.)
+     `%X' uses the letters `A' through `F' instead of `a' through `f'.
+
+`%'
+     This isn't really a format-control letter, but it does have a
+     meaning when used after a `%': the sequence `%%' outputs one `%'.
+     It does not consume an argument, and it ignores any modifiers.
+
+   When using the integer format-control letters for values that are
+outside the range of a C `long' integer, `gawk' will switch to the `%g'
+format specifier. Other versions of `awk' may print invalid values, or
+do something else entirely (d.c.).
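+
+   The following sketch exercises several of the control letters in one
+format string; the argument values are arbitrary, chosen only so that
+each conversion is easy to check by hand:
+
```shell
# %c: character for code 65, %d: decimal, %o: octal, %x/%X: hexadecimal,
# %e: exponential notation, %s: string, %%: a literal percent sign.
awk 'BEGIN { printf "%c %d %o %x %X %e %s%%\n", 65, 42, 8, 255, 255, 1950, "ok" }'
# prints: A 42 10 ff FF 1.950000e+03 ok%
```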
+
+
+File: gawk.info,  Node: Format Modifiers,  Next: Printf Examples,  Prev: Control Letters,  Up: Printf
+
+Modifiers for `printf' Formats
+------------------------------
+
+   A format specification can also include "modifiers" that can control
+how much of the item's value is printed and how much space it gets.  The
+modifiers come between the `%' and the format-control letter.  In the
+examples below, we use the bullet symbol "*" to represent spaces in the
+output. Here are the possible modifiers, in the order in which they may
+appear:
+
+`-'
+     The minus sign, used before the width modifier (see below), says
+     to left-justify the argument within its specified width.  Normally
+     the argument is printed right-justified in the specified width.
+     Thus,
+
+          printf "%-4s", "foo"
+
+     prints `foo*'.
+
+`SPACE'
+     For numeric conversions, prefix positive values with a space, and
+     negative values with a minus sign.
+
+`+'
+     The plus sign, used before the width modifier (see below), says to
+     always supply a sign for numeric conversions, even if the data to
+     be formatted is positive. The `+' overrides the space modifier.
+
+`#'
+     Use an "alternate form" for certain control letters.  For `%o',
+     supply a leading zero.  For `%x' and `%X', supply a leading `0x'
+     or `0X' for a non-zero result.  For `%e', `%E', and `%f', the
+     result will always contain a decimal point.  For `%g' and `%G',
+     trailing zeros are not removed from the result.
+
+`0'
+     A leading `0' (zero) acts as a flag that indicates output should
+     be padded with zeros instead of spaces.  This applies even to
+     non-numeric output formats (d.c.).  This flag only has an effect
+     when the field width is wider than the value to be printed.
+
+`WIDTH'
+     This is a number specifying the desired minimum width of a field.
+     Inserting any number between the `%' sign and the format control
+     character forces the field to be expanded to this width.  The
+     default way to do this is to pad with spaces on the left.  For
+     example,
+
+          printf "%4s", "foo"
+
+     prints `*foo'.
+
+     The value of WIDTH is a minimum width, not a maximum.  If the item
+     value requires more than WIDTH characters, it can be as wide as
+     necessary.  Thus,
+
+          printf "%4s", "foobar"
+
+     prints `foobar'.
+
+     Preceding the WIDTH with a minus sign causes the output to be
+     padded with spaces on the right, instead of on the left.
+
+`.PREC'
+     This is a number that specifies the precision to use when printing.
+     For the `e', `E', and `f' formats, this specifies the number of
+     digits you want printed to the right of the decimal point.  For
+     the `g' and `G' formats, it specifies the maximum number of
+     significant digits.  For the `d', `o', `i', `u', `x', and `X'
+     formats, it specifies the minimum number of digits to print.  For
+     a string, it specifies the maximum number of characters from the
+     string that should be printed.  Thus,
+
+          printf "%.4s", "foobar"
+
+     prints `foob'.
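+
+   As a sketch of how these modifiers combine, the example below wraps
+each conversion in square brackets so the padding is visible (the
+brackets and the sample values are illustrative only):
+
```shell
awk 'BEGIN {
    # width, left-justify, zero-pad, forced sign, precision, alternate forms
    printf "[%5d] [%-5d] [%05d] [%+d] [%.2f] [%#o] [%#x]\n",
           42, 42, 42, 42, 3.14159, 8, 255
}'
# prints: [   42] [42   ] [00042] [+42] [3.14] [010] [0xff]
```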
+
+   The C library `printf''s dynamic WIDTH and PREC capability (for
+example, `"%*.*s"') is supported.  Instead of supplying explicit WIDTH
+and/or PREC values in the format string, you pass them in the argument
+list.  For example:
+
+     w = 5
+     p = 3
+     s = "abcdefg"
+     printf "%*.*s\n", w, p, s
+
+is exactly equivalent to
+
+     s = "abcdefg"
+     printf "%5.3s\n", s
+
+Both programs output `**abc'.
+
+   Earlier versions of `awk' did not support this capability.  If you
+must use such a version, you may simulate this feature by using
+concatenation to build up the format string, like so:
+
+     w = 5
+     p = 3
+     s = "abcdefg"
+     printf "%" w "." p "s\n", s
+
+This is not particularly easy to read, but it does work.
+
+   C programmers may be used to supplying additional `l' and `h' flags
+in `printf' format strings. These are not valid in `awk'.  Most `awk'
+implementations silently ignore these flags.  If `--lint' is provided
+on the command line (*note Command Line Options: Options.), `gawk' will
+warn about their use. If `--posix' is supplied, their use is a fatal
+error.
+
+
+File: gawk.info,  Node: Printf Examples,  Prev: Format Modifiers,  Up: Printf
+
+Examples Using `printf'
+-----------------------
+
+   Here is how to use `printf' to make an aligned table:
+
+     awk '{ printf "%-10s %s\n", $1, $2 }' BBS-list
+
+This prints the names of the bulletin boards (`$1') in the file
+`BBS-list' as strings of 10 characters, left justified.  It also prints
+the phone numbers (`$2') afterward on the line.  This produces an
+aligned two-column table of names and phone numbers:
+
+     $ awk '{ printf "%-10s %s\n", $1, $2 }' BBS-list
+     -| aardvark   555-5553
+     -| alpo-net   555-3412
+     -| barfly     555-7685
+     -| bites      555-1675
+     -| camelot    555-0542
+     -| core       555-2912
+     -| fooey      555-1234
+     -| foot       555-6699
+     -| macfoo     555-6480
+     -| sdace      555-3430
+     -| sabafoo    555-2127
+
+   Did you notice that we did not specify that the phone numbers be
+printed as numbers?  They had to be printed as strings because the
+numbers are separated by a dash.  If we had tried to print the phone
+numbers as numbers, all we would have gotten would have been the first
+three digits, `555'.  This would have been pretty confusing.
+
+   We did not specify a width for the phone numbers because they are the
+last things on their lines.  We don't need to put spaces after them.
+
+   We could make our table look even nicer by adding headings to the
+tops of the columns.  To do this, we use the `BEGIN' pattern (*note The
+`BEGIN' and `END' Special Patterns: BEGIN/END.)  to force the header to
+be printed only once, at the beginning of the `awk' program:
+
+     awk 'BEGIN { print "Name      Number"
+                  print "----      ------" }
+          { printf "%-10s %s\n", $1, $2 }' BBS-list
+
+   Did you notice that we mixed `print' and `printf' statements in the
+above example?  We could have used just `printf' statements to get the
+same results:
+
+     awk 'BEGIN { printf "%-10s %s\n", "Name", "Number"
+                  printf "%-10s %s\n", "----", "------" }
+          { printf "%-10s %s\n", $1, $2 }' BBS-list
+
+By printing each column heading with the same format specification used
+for the elements of the column, we have made sure that the headings are
+aligned just like the columns.
+
+   The fact that the same format specification is used three times can
+be emphasized by storing it in a variable, like this:
+
+     awk 'BEGIN { format = "%-10s %s\n"
+                  printf format, "Name", "Number"
+                  printf format, "----", "------" }
+          { printf format, $1, $2 }' BBS-list
+
+   See if you can use the `printf' statement to line up the headings and
+table data for our `inventory-shipped' example covered earlier in the
+section on the `print' statement (*note The `print' Statement: Print.).
+
+
+File: gawk.info,  Node: Redirection,  Next: Special Files,  Prev: Printf,  Up: Printing
+
+Redirecting Output of `print' and `printf'
+==========================================
+
+   So far we have been dealing only with output that prints to the
+standard output, usually your terminal.  Both `print' and `printf' can
+also send their output to other places.  This is called "redirection".
+
+   A redirection appears after the `print' or `printf' statement.
+Redirections in `awk' are written just like redirections in shell
+commands, except that they are written inside the `awk' program.
+
+   There are three forms of output redirection: output to a file,
+output appended to a file, and output through a pipe to another command.
+They are all shown for the `print' statement, but they work identically
+for `printf'.
+
+`print ITEMS > OUTPUT-FILE'
+     This type of redirection prints the items into the output file
+     OUTPUT-FILE.  The file name OUTPUT-FILE can be any expression.
+     Its value is changed to a string and then used as a file name
+     (*note Expressions::).
+
+     When this type of redirection is used, the OUTPUT-FILE is erased
+     before the first output is written to it.  Subsequent writes to
+     the same OUTPUT-FILE do not erase OUTPUT-FILE, but append to it.
+     If OUTPUT-FILE does not exist, then it is created.
+
+     For example, here is how an `awk' program can write a list of BBS
+     names to a file `name-list' and a list of phone numbers to a file
+     `phone-list'.  Each output file contains one name or number per
+     line.
+
+          $ awk '{ print $2 > "phone-list"
+          >        print $1 > "name-list" }' BBS-list
+          $ cat phone-list
+          -| 555-5553
+          -| 555-3412
+          ...
+          $ cat name-list
+          -| aardvark
+          -| alpo-net
+          ...
+
+`print ITEMS >> OUTPUT-FILE'
+     This type of redirection prints the items into the pre-existing
+     output file OUTPUT-FILE.  The difference between this and the
+     single-`>' redirection is that the old contents (if any) of
+     OUTPUT-FILE are not erased.  Instead, the `awk' output is appended
+     to the file.  If OUTPUT-FILE does not exist, then it is created.
+
+`print ITEMS | COMMAND'
+     It is also possible to send output to another program through a
+     pipe instead of into a file.   This type of redirection opens a
+     pipe to COMMAND and writes the values of ITEMS through this pipe,
+     to another process created to execute COMMAND.
+
+     The redirection argument COMMAND is actually an `awk' expression.
+     Its value is converted to a string, whose contents give the shell
+     command to be run.
+
+     For example, this produces two files, one unsorted list of BBS
+     names and one list sorted in reverse alphabetical order:
+
+          awk '{ print $1 > "names.unsorted"
+                 command = "sort -r > names.sorted"
+                 print $1 | command }' BBS-list
+
+     Here the unsorted list is written with an ordinary redirection
+     while the sorted list is written by piping through the `sort'
+     utility.
+
+     This example uses redirection to mail a message to a mailing list
+     `bug-system'.  This might be useful when trouble is encountered in
+     an `awk' script run periodically for system maintenance.
+
+          report = "mail bug-system"
+          print "Awk script failed:", $0 | report
+          m = ("at record number " FNR " of " FILENAME)
+          print m | report
+          close(report)
+
+     The message is built using string concatenation and saved in the
+     variable `m'.  It is then sent down the pipeline to the `mail'
+     program.
+
+     We call the `close' function here because it's a good idea to close
+     the pipe as soon as all the intended output has been sent to it.
+     *Note Closing Input and Output Files and Pipes: Close Files And
+     Pipes, for more information on this.  This example also
+     illustrates the use of a variable to represent a FILE or COMMAND:
+     it is not necessary to always use a string constant.  Using a
+     variable is generally a good idea, since `awk' requires you to
+     spell the string value identically every time.
+
+   Redirecting output using `>', `>>', or `|' asks the system to open a
+file or pipe only if the particular FILE or COMMAND you've specified
+has not already been written to by your program, or if it has been
+closed since it was last written to.
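+
+   A small sketch of this behavior, assuming a scratch file created with
+`mktemp' (the file and its contents are hypothetical): the old contents
+are erased when the file is first opened, and the second `print' then
+appends rather than erasing again.
+
```shell
tmp=$(mktemp)
echo "old contents" > "$tmp"
# Both statements use the same file name, so the file is opened only once.
awk -v out="$tmp" 'BEGIN { print "first" > out; print "second" > out }'
cat "$tmp"      # shows "first" and "second"; "old contents" is gone
rm -f "$tmp"
```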
+
+   Many `awk' implementations limit the number of pipelines an `awk'
+program may have open to just one!  In `gawk', there is no such limit.
+You can open as many pipelines as the underlying operating system will
+permit.
+
+
+File: gawk.info,  Node: Special Files,  Next: Close Files And Pipes,  Prev: Redirection,  Up: Printing
+
+Special File Names in `gawk'
+============================
+
+   Running programs conventionally have three input and output streams
+already available to them for reading and writing.  These are known as
+the "standard input", "standard output", and "standard error output".
+These streams are, by default, connected to your terminal, but they are
+often redirected with the shell, via the `<', `<<', `>', `>>', `>&' and
+`|' operators.  Standard error is typically used for writing error
+messages; the reason we have two separate streams, standard output and
+standard error, is so that they can be redirected separately.
+
+   In other implementations of `awk', the only way to write an error
+message to standard error in an `awk' program is as follows:
+
+     print "Serious error detected!" | "cat 1>&2"
+
+This works by opening a pipeline to a shell command which can access the
+standard error stream which it inherits from the `awk' process.  This
+is far from elegant, and is also inefficient, since it requires a
+separate process.  So people writing `awk' programs often neglect to do
+this.  Instead, they send the error messages to the terminal, like this:
+
+     print "Serious error detected!" > "/dev/tty"
+
+This usually has the same effect, but not always: although the standard
+error stream is usually the terminal, it can be redirected, and when
+that happens, writing to the terminal is not correct.  In fact, if
+`awk' is run from a background job, it may not have a terminal at all.
+Then opening `/dev/tty' will fail.
+
+   `gawk' provides special file names for accessing the three standard
+streams.  When you redirect input or output in `gawk', if the file name
+matches one of these special names, then `gawk' directly uses the
+stream it stands for.
+
+`/dev/stdin'
+     The standard input (file descriptor 0).
+
+`/dev/stdout'
+     The standard output (file descriptor 1).
+
+`/dev/stderr'
+     The standard error output (file descriptor 2).
+
+`/dev/fd/N'
+     The file associated with file descriptor N.  Such a file must have
+     been opened by the program initiating the `awk' execution
+     (typically the shell).  Unless you take special pains in the shell
+     from which you invoke `gawk', only descriptors 0, 1 and 2 are
+     available.
+
+   The file names `/dev/stdin', `/dev/stdout', and `/dev/stderr' are
+aliases for `/dev/fd/0', `/dev/fd/1', and `/dev/fd/2', respectively,
+but they are more self-explanatory.
+
+   The proper way to write an error message in a `gawk' program is to
+use `/dev/stderr', like this:
+
+     print "Serious error detected!" > "/dev/stderr"
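+
+   As a quick check that the two streams really are separate, this
+sketch discards standard error in the shell, so only the first message
+remains visible (the message texts are illustrative):
+
```shell
# 2>/dev/null throws away whatever awk writes to /dev/stderr.
awk 'BEGIN {
    print "normal output"
    print "Serious error detected!" > "/dev/stderr"
}' 2>/dev/null
# prints only: normal output
```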
+
+   `gawk' also provides special file names that give access to
+information about the running `gawk' process.  Each of these "files"
+provides a single record of information.  To read them more than once,
+you must first close them with the `close' function (*note Closing
+Input and Output Files and Pipes: Close Files And Pipes.).  The
+filenames are:
+
+`/dev/pid'
+     Reading this file returns the process ID of the current process,
+     in decimal, terminated with a newline.
+
+`/dev/ppid'
+     Reading this file returns the parent process ID of the current
+     process, in decimal, terminated with a newline.
+
+`/dev/pgrpid'
+     Reading this file returns the process group ID of the current
+     process, in decimal, terminated with a newline.
+
+`/dev/user'
+     Reading this file returns a single record terminated with a
+     newline.  The fields are separated with spaces.  The fields
+     represent the following information:
+
+    `$1'
+          The return value of the `getuid' system call (the real user
+          ID number).
+
+    `$2'
+          The return value of the `geteuid' system call (the effective
+          user ID number).
+
+    `$3'
+          The return value of the `getgid' system call (the real group
+          ID number).
+
+    `$4'
+          The return value of the `getegid' system call (the effective
+          group ID number).
+
+     If there are any additional fields, they are the group IDs
+     returned by the `getgroups' system call.  (Multiple groups may not
+     be supported on all systems.)
+
+   These special file names may be used on the command line as data
+files, as well as for I/O redirections within an `awk' program.  They
+may not be used as source files with the `-f' option.
+
+   Recognition of these special file names is disabled if `gawk' is in
+compatibility mode (*note Command Line Options: Options.).
+
+   *Caution*:  Unless your system actually has a `/dev/fd' directory
+(or any of the other above listed special files), the interpretation of
+these file names is done by `gawk' itself.  For example, using
+`/dev/fd/4' for output will actually write on file descriptor 4, and
+not on a new file descriptor that was `dup''ed from file descriptor 4.
+Most of the time this does not matter; however, it is important to
+_not_ close any of the files related to file descriptors 0, 1, and 2.
+If you do close one of these files, unpredictable behavior will result.
+
+   The special files that provide process-related information may
+disappear in a future version of `gawk'.  *Note Probable Future
+Extensions: Future Extensions.
+
+
+File: gawk.info,  Node: Close Files And Pipes,  Prev: Special Files,  Up: Printing
+
+Closing Input and Output Files and Pipes
+========================================
+
+   If the same file name or the same shell command is used with
+`getline' (*note Explicit Input with `getline': Getline.)  more than
+once during the execution of an `awk' program, the file is opened (or
+the command is executed) only the first time.  At that time, the first
+record of input is read from that file or command.  The next time the
+same file or command is used in `getline', another record is read from
+it, and so on.
+
+   Similarly, when a file or pipe is opened for output, the file name
+or command associated with it is remembered by `awk' and subsequent
+writes to the same file or command are appended to the previous writes.
+The file or pipe stays open until `awk' exits.
+
+   This implies that if you want to start reading the same file again
+from the beginning, or if you want to rerun a shell command (rather than
+reading more output from the command), you must take special steps.
+What you must do is use the `close' function, as follows:
+
+     close(FILENAME)
+
+or
+
+     close(COMMAND)
+
+   The argument FILENAME or COMMAND can be any expression.  Its value
+must _exactly_ match the string that was used to open the file or start
+the command (spaces and other "irrelevant" characters included). For
+example, if you open a pipe with this:
+
+     "sort -r names" | getline foo
+
+then you must close it with this:
+
+     close("sort -r names")
+
+   Once this function call is executed, the next `getline' from that
+file or command, or the next `print' or `printf' to that file or
+command, will reopen the file or rerun the command.
+
+   Because the expression that you use to close a file or pipeline must
+exactly match the expression used to open the file or run the command,
+it is good practice to use a variable to store the file name or command.
+The previous example would become
+
+     sortcom = "sort -r names"
+     sortcom | getline foo
+     ...
+     close(sortcom)
+
+This helps avoid hard-to-find typographical errors in your `awk'
+programs.
+
+   Here are some reasons why you might need to close an output file:
+
+   * To write a file and read it back later on in the same `awk'
+     program.  Close the file when you are finished writing it; then
+     you can start reading it with `getline'.
+
+   * To write numerous files, successively, in the same `awk' program.
+     If you don't close the files, eventually you may exceed a system
+     limit on the number of open files in one process.  So close each
+     one when you are finished writing it.
+
+   * To make a command finish.  When you redirect output through a pipe,
+     the command reading the pipe normally continues to try to read
+     input as long as the pipe is open.  Often this means the command
+     cannot really do its work until the pipe is closed.  For example,
+     if you redirect output to the `mail' program, the message is not
+     actually sent until the pipe is closed.
+
+   * To run the same program a second time, with the same arguments.
+     This is not the same thing as giving more input to the first run!
+
+     For example, suppose you pipe output to the `mail' program.  If you
+     output several lines redirected to this pipe without closing it,
+     they make a single message of several lines.  By contrast, if you
+     close the pipe after each line of output, then each line makes a
+     separate message.
+
+   `close' returns a value of zero if the close succeeded.  Otherwise,
+the value will be non-zero.  In this case, `gawk' sets the variable
+`ERRNO' to a string describing the error that occurred.
+
+   If you use more files than the system allows you to have open,
+`gawk' will attempt to multiplex the available open files among your
+data files.  `gawk''s ability to do this depends upon the facilities of
+your operating system: it may not always work.  It is therefore both
+good practice and good portability advice to always use `close' on your
+files when you are done with them.
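+
+   The first reason above (writing a file and then reading it back)
+can be sketched as follows.  This is only an illustration: the file
+name `/tmp/gawk-demo.txt' and the variable names are assumptions made
+for the example, not anything `gawk' requires.  Storing the name in a
+variable ensures that every `close' call uses exactly the same string:
+
+     BEGIN {
+         tmpfile = "/tmp/gawk-demo.txt"   # hypothetical scratch file
+         print "first line" > tmpfile
+         print "second line" > tmpfile    # appended; file is still open
+         close(tmpfile)                   # finish writing before reading
+
+         while ((getline line < tmpfile) > 0)
+             print "read back:", line
+         close(tmpfile)                   # done reading
+     }
+
+Run with `gawk', this prints:
+
+     -| read back: first line
+     -| read back: second line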
+
+
+File: gawk.info,  Node: Expressions,  Next: Patterns and Actions,  Prev: Printing,  Up: Top
+
+Expressions
+***********
+
+   Expressions are the basic building blocks of `awk' patterns and
+actions.  An expression evaluates to a value, which you can print, test,
+store in a variable or pass to a function.  Additionally, an expression
+can assign a new value to a variable or a field, with an assignment
+operator.
+
+   An expression can serve as a pattern or action statement on its own.
+Most other kinds of statements contain one or more expressions which
+specify data on which to operate.  As in other languages, expressions
+in `awk' include variables, array references, constants, and function
+calls, as well as combinations of these with various operators.
+
+* Menu:
+
+* Constants::                   String, numeric, and regexp constants.
+* Using Constant Regexps::      When and how to use a regexp constant.
+* Variables::                   Variables give names to values for later use.
+* Conversion::                  The conversion of strings to numbers and vice
+                                versa.
+* Arithmetic Ops::              Arithmetic operations (`+', `-',
+                                etc.)
+* Concatenation::               Concatenating strings.
+* Assignment Ops::              Changing the value of a variable or a field.
+* Increment Ops::               Incrementing the numeric value of a variable.
+* Truth Values::                What is ``true'' and what is ``false''.
+* Typing and Comparison::       How variables acquire types, and how this
+                                affects comparison of numbers and strings with
+                                `<', etc.
+* Boolean Ops::                 Combining comparison expressions using boolean
+                                operators `||' (``or''), `&&'
+                                (``and'') and `!' (``not'').
+* Conditional Exp::             Conditional expressions select between two
+                                subexpressions under control of a third
+                                subexpression.
+* Function Calls::              A function call is an expression.
+* Precedence::                  How various operators nest.
+
+
+File: gawk.info,  Node: Constants,  Next: Using Constant Regexps,  Prev: Expressions,  Up: Expressions
+
+Constant Expressions
+====================
+
+   The simplest type of expression is the "constant", which always has
+the same value.  There are three types of constants: numeric constants,
+string constants, and regular expression constants.
+
+* Menu:
+
+* Scalar Constants::            Numeric and string constants.
+* Regexp Constants::            Regular Expression constants.
+
+
+File: gawk.info,  Node: Scalar Constants,  Next: Regexp Constants,  Prev: Constants,  Up: Constants
+
+Numeric and String Constants
+----------------------------
+
+   A "numeric constant" stands for a number.  This number can be an
+integer, a decimal fraction, or a number in scientific (exponential)
+notation.(1) Here are some examples of numeric constants, which all
+have the same value:
+
+     105
+     1.05e+2
+     1050e-1
+
+   A string constant consists of a sequence of characters enclosed in
+double-quote marks.  For example:
+
+     "parrot"
+
+represents the string whose contents are `parrot'.  Strings in `gawk'
+can be of any length and they can contain any of the possible eight-bit
+ASCII characters including ASCII NUL (character code zero).  Other `awk'
+implementations may have difficulty with some character codes.
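+
+   As a quick check of the claim that the three numeric constants above
+denote the same value, the following one-line sketch prints each of
+them; nothing here goes beyond the constants already shown:
+
+     $ awk 'BEGIN { print 105, 1.05e+2, 1050e-1 }'
+     -| 105 105 105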
+
+   ---------- Footnotes ----------
+
+   (1) The internal representation uses double-precision floating point
+numbers. If you don't know what that means, then don't worry about it.
+
+
+File: gawk.info,  Node: Regexp Constants,  Prev: Scalar Constants,  Up: Constants
+
+Regular Expression Constants
+----------------------------
+
+   A regexp constant is a regular expression description enclosed in
+slashes, such as `/^beginning and end$/'.  Most regexps used in `awk'
+programs are constant, but the `~' and `!~' matching operators can also
+match computed or "dynamic" regexps (which are just ordinary strings or
+variables that contain a regexp).
+
+
+File: gawk.info,  Node: Using Constant Regexps,  Next: Variables,  Prev: Constants,  Up: Expressions
+
+Using Regular Expression Constants
+==================================
+
+   When used on the right hand side of the `~' or `!~' operators, a
+regexp constant merely stands for the regexp that is to be matched.
+
+   Regexp constants (such as `/foo/') may be used like simple
+expressions.  When a regexp constant appears by itself, it has the same
+meaning as if it appeared in a pattern, i.e. `($0 ~ /foo/)' (d.c.)
+(*note Expressions as Patterns: Expression Patterns.).  This means that
+the two code segments,
+
+     if ($0 ~ /barfly/ || $0 ~ /camelot/)
+         print "found"
+
+and
+
+     if (/barfly/ || /camelot/)
+         print "found"
+
+are exactly equivalent.
+
+   One rather bizarre consequence of this rule is that the following
+boolean expression is valid, but does not do what the user probably
+intended:
+
+     # note that /foo/ is on the left of the ~
+     if (/foo/ ~ $1) print "found foo"
+
+This code is "obviously" testing `$1' for a match against the regexp
+`/foo/'.  But in fact, the expression `/foo/ ~ $1' actually means `($0
+~ /foo/) ~ $1'.  In other words, first match the input record against
+the regexp `/foo/'.  The result will be either zero or one, depending
+upon the success or failure of the match.  Then match that result
+against the first field in the record.
+
+   Since it is unlikely that you would ever really wish to make this
+kind of test, `gawk' will issue a warning when it sees this construct in
+a program.
+
+   Another consequence of this rule is that the assignment statement
+
+     matches = /foo/
+
+will assign either zero or one to the variable `matches', depending
+upon the contents of the current input record.
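+
+   To see this for yourself, here is a small illustrative session (the
+input text is made up for the example).  Each run prints the value
+that the bare regexp constant yields for the current record:
+
+     $ echo "some food" | awk '{ matches = /foo/; print matches }'
+     -| 1
+     $ echo "no match here" | awk '{ matches = /foo/; print matches }'
+     -| 0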
+
+   This feature of the language was never well documented until the
+POSIX specification.
+
+   Constant regular expressions are also used as the first argument for
+the `gensub', `sub' and `gsub' functions, and as the second argument of
+the `match' function (*note Built-in Functions for String Manipulation:
+String Functions.).  Modern implementations of `awk', including `gawk',
+allow the third argument of `split' to be a regexp constant, while some
+older implementations do not (d.c.).
+
+   This can lead to confusion when attempting to use regexp constants
+as arguments to user defined functions (*note User-defined Functions:
+User-defined.).  For example:
+
+     function mysub(pat, repl, str, global)
+     {
+         if (global)
+             gsub(pat, repl, str)
+         else
+             sub(pat, repl, str)
+         return str
+     }
+     
+     {
+         ...
+         text = "hi! hi yourself!"
+         mysub(/hi/, "howdy", text, 1)
+         ...
+     }
+
+   In this example, the programmer wishes to pass a regexp constant to
+the user-defined function `mysub', which will in turn pass it on to
+either `sub' or `gsub'.  However, what really happens is that the `pat'
+parameter will be either one or zero, depending upon whether or not
+`$0' matches `/hi/'.
+
+   As it is unlikely that you would ever really wish to pass a truth
+value in this way, `gawk' will issue a warning when it sees a regexp
+constant used as a parameter to a user-defined function.
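+
+   One way around the problem, sketched here under the assumption that
+a dynamic regexp is acceptable for your purposes, is to pass the
+pattern as a string instead of a regexp constant.  The string is
+handed on unchanged to `sub' or `gsub', which then treats it as a
+regexp.  The `BEGIN' rule below is only a hypothetical caller:
+
+     function mysub(pat, repl, str, global)
+     {
+         if (global)
+             gsub(pat, repl, str)
+         else
+             sub(pat, repl, str)
+         return str
+     }
+
+     BEGIN {
+         text = "hi! hi yourself!"
+         # "hi" is a string, so `pat' really holds the pattern
+         print mysub("hi", "howdy", text, 1)
+     }
+
+This prints `howdy! howdy yourself!'.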
+
+
+File: gawk.info,  Node: Variables,  Next: Conversion,  Prev: Using Constant Regexps,  Up: Expressions
+
+Variables
+=========
+
+   Variables are ways of storing values at one point in your program for
+use later in another part of your program.  You can manipulate them
+entirely within your program text, and you can also assign values to
+them on the `awk' command line.
+
+* Menu:
+
+* Using Variables::             Using variables in your programs.
+* Assignment Options::          Setting variables on the command line and a
+                                summary of command line syntax. This is an
+                                advanced method of input.
+
+
+File: gawk.info,  Node: Using Variables,  Next: Assignment Options,  Prev: Variables,  Up: Variables
+
+Using Variables in a Program
+----------------------------
+
+   Variables let you give names to values and refer to them later.  You
+have already seen variables in many of the examples.  The name of a
+variable must be a sequence of letters, digits and underscores, but it
+may not begin with a digit.  Case is significant in variable names; `a'
+and `A' are distinct variables.
+
+   A variable name is a valid expression by itself; it represents the
+variable's current value.  Variables are given new values with
+"assignment operators", "increment operators" and "decrement operators".
+*Note Assignment Expressions: Assignment Ops.
+
+   A few variables have special built-in meanings, such as `FS', the
+field separator, and `NF', the number of fields in the current input
+record.  *Note Built-in Variables::, for a list of them.  These
+built-in variables can be used and assigned just like all other
+variables, but their values are also used or changed automatically by
+`awk'.  All built-in variable names are entirely upper-case.
+
+   Variables in `awk' can be assigned either numeric or string values.
+By default, variables are initialized to the empty string, which is
+zero if converted to a number.  There is no need to "initialize" each
+variable explicitly in `awk', the way you would in C and in most other
+traditional languages.
+
+
+File: gawk.info,  Node: Assignment Options,  Prev: Using Variables,  Up: Variables
+
+Assigning Variables on the Command Line
+---------------------------------------
+
+   You can set any `awk' variable by including a "variable assignment"
+among the arguments on the command line when you invoke `awk' (*note
+Other Command Line Arguments: Other Arguments.).  Such an assignment has
+this form:
+
+     VARIABLE=TEXT
+
+With it, you can set a variable either at the beginning of the `awk'
+run or in between input files.
+
+   If you precede the assignment with the `-v' option, like this:
+
+     -v VARIABLE=TEXT
+
+then the variable is set at the very beginning, before even the `BEGIN'
+rules are run.  The `-v' option and its assignment must precede all the
+file name arguments, as well as the program text.  (*Note Command Line
+Options: Options, for more information about the `-v' option.)
+
+   Otherwise, the variable assignment is performed at a time determined
+by its position among the input file arguments: after the processing of
+the preceding input file argument.  For example:
+
+     awk '{ print $n }' n=4 inventory-shipped n=2 BBS-list
+
+prints the value of field number `n' for all input records.  Before the
+first file is read, the command line sets the variable `n' equal to
+four.  This causes the fourth field to be printed in lines from the
+file `inventory-shipped'.  After the first file has finished, but
+before the second file is started, `n' is set to two, so that the
+second field is printed in lines from `BBS-list'.
+
+     $ awk '{ print $n }' n=4 inventory-shipped n=2 BBS-list
+     -| 15
+     -| 24
+     ...
+     -| 555-5553
+     -| 555-3412
+     ...
+
+   Command line arguments are made available for explicit examination by
+the `awk' program in an array named `ARGV' (*note Using `ARGC' and
+`ARGV': ARGC and ARGV.).
+
+   `awk' processes the values of command line assignments for escape
+sequences (d.c.) (*note Escape Sequences::).
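+
+   The difference in timing between the two forms can be seen with a
+short experiment.  This is a sketch only; the variable name `n' and
+the use of `echo' to supply a one-line input are choices made for the
+example:
+
+     $ echo x | awk 'BEGIN { print "BEGIN: [" n "]" }
+     >               { print "main:  [" n "]" }' n=4
+     -| BEGIN: []
+     -| main:  [4]
+     $ echo x | awk -v n=4 'BEGIN { print "BEGIN: [" n "]" }
+     >                      { print "main:  [" n "]" }'
+     -| BEGIN: [4]
+     -| main:  [4]
+
+Without `-v', the assignment has not yet happened when the `BEGIN'
+rule runs; with `-v', it has.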
+
+
+File: gawk.info,  Node: Conversion,  Next: Arithmetic Ops,  Prev: Variables,  Up: Expressions
+
+Conversion of Strings and Numbers
+=================================
+
+   Strings are converted to numbers, and numbers to strings, if the
+context of the `awk' program demands it.  For example, if the value of
+either `foo' or `bar' in the expression `foo + bar' happens to be a
+string, it is converted to a number before the addition is performed.
+If numeric values appear in string concatenation, they are converted to
+strings.  Consider this:
+
+     two = 2; three = 3
+     print (two three) + 4
+
+This prints the (numeric) value 27.  The numeric values of the
+variables `two' and `three' are converted to strings and concatenated
+together, and the resulting string is converted back to the number 23,
+to which four is then added.
+
+   If, for some reason, you need to force a number to be converted to a
+string, concatenate the empty string, `""', with that number.  To force
+a string to be converted to a number, add zero to that string.
+
+   A string is converted to a number by interpreting any numeric prefix
+of the string as numerals: `"2.5"' converts to 2.5, `"1e3"' converts to
+1000, and `"25fix"' has a numeric value of 25.  Strings that can't be
+interpreted as valid numbers are converted to zero.
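+
+   The conversion rules just described can be checked with a short
+program.  Adding zero forces each string to a number; the strings
+themselves are the ones used in the text, plus `"fix25"', which is
+added here purely for illustration:
+
+     BEGIN {
+         print "2.5" + 0       # numeric prefix is the whole string
+         print "1e3" + 0       # exponent notation is recognized
+         print "25fix" + 0     # only the leading "25" is used
+         print "fix25" + 0     # no numeric prefix at all
+     }
+
+This prints:
+
+     -| 2.5
+     -| 1000
+     -| 25
+     -| 0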
+
+   The exact manner in which numbers are converted into strings is
+controlled by the `awk' built-in variable `CONVFMT' (*note Built-in
+Variables::).  Numbers are converted using the `sprintf' function
+(*note Built-in Functions for String Manipulation: String Functions.)
+with `CONVFMT' as the format specifier.
+
+   `CONVFMT''s default value is `"%.6g"', which prints a value with at
+most six significant digits.  For some applications you will want to
+change it to specify more precision.  Double precision on most modern
+machines gives you 16 or 17 decimal digits of precision.
+
+   Strange results can happen if you set `CONVFMT' to a string that
+doesn't tell `sprintf' how to format floating point numbers in a useful
+way.  For example, if you forget the `%' in the format, all numbers
+will be converted to the same constant string.
+
+   As a special case, if a number is an integer, then the result of
+converting it to a string is _always_ an integer, no matter what the
+value of `CONVFMT' may be.  Given the following code fragment:
+
+     CONVFMT = "%2.2f"
+     a = 12
+     b = a ""
+
+`b' has the value `"12"', not `"12.00"' (d.c.).
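+
+   The effect of `CONVFMT' on non-integer values can be sketched with
+a short program.  The variable names and the particular numbers are
+chosen only for illustration:
+
+     BEGIN {
+         a = 3.14159265
+         s = a ""              # converted with the default "%.6g"
+         print s
+
+         CONVFMT = "%.2f"
+         b = 2.71828
+         t = b ""              # converted with "%.2f"
+         print t
+
+         c = 17
+         u = c ""              # integers ignore CONVFMT
+         print u
+     }
+
+This prints:
+
+     -| 3.14159
+     -| 2.72
+     -| 17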
+
+   Prior to the POSIX standard, `awk' specified that the value of
+`OFMT' was used for converting numbers to strings.  `OFMT' specifies
+the output format to use when printing numbers with `print'.  `CONVFMT'
+was introduced in order to separate the semantics of conversion from
+the semantics of printing.  Both `CONVFMT' and `OFMT' have the same
+default value: `"%.6g"'.  In the vast majority of cases, old `awk'
+programs will not change their behavior.  However, this use of `OFMT'
+is something to keep in mind if you must port your program to other
+implementations of `awk'; we recommend that instead of changing your
+programs, you just port `gawk' itself!  *Note The `print' Statement:
+Print, for more information on the `print' statement.
+
+
+File: gawk.info,  Node: Arithmetic Ops,  Next: Concatenation,  Prev: Conversion,  Up: Expressions
+
+Arithmetic Operators
+====================
+
+   The `awk' language uses the common arithmetic operators when
+evaluating expressions.  All of these arithmetic operators follow normal
+precedence rules, and work as you would expect them to.
+
+   Here is a file `grades' containing a list of student names and three
+test scores per student (it's a small class):
+
+     Pat   100 97 58
+     Sandy  84 72 93
+     Chris  72 92 89
+
+This program takes the file `grades', and prints the average of the
+scores.
+
+     $ awk '{ sum = $2 + $3 + $4 ; avg = sum / 3
+     >        print $1, avg }' grades
+     -| Pat 85
+     -| Sandy 83
+     -| Chris 84.3333
+
+   This table lists the arithmetic operators in `awk', in order from
+highest precedence to lowest:
+
+`X ^ Y'
+`X ** Y'
+     Exponentiation: X raised to the Y power.  `2 ^ 3' has the value
+     eight.  The character sequence `**' is equivalent to `^'.  (The
+     POSIX standard only specifies the use of `^' for exponentiation.)
+
+`- X'
+     Negation.
+
+`+ X'
+     Unary plus.  The expression is converted to a number.
+
+`X * Y'
+     Multiplication.
+
+`X / Y'
+     Division.  Since all numbers in `awk' are real numbers, the result
+     is not rounded to an integer: `3 / 4' has the value 0.75.
+
+`X % Y'
+     Remainder.  The quotient is rounded toward zero to an integer,
+     multiplied by Y and this result is subtracted from X.  This
+     operation is sometimes known as "trunc-mod."  The following
+     relation always holds:
+
+          b * int(a / b) + (a % b) == a
+
+     One possibly undesirable effect of this definition of remainder is
+     that `X % Y' is negative if X is negative.  Thus,
+
+          -17 % 8 = -1
+
+     In other `awk' implementations, the signedness of the remainder
+     may be machine dependent.
+
+`X + Y'
+     Addition.
+
+`X - Y'
+     Subtraction.
+
+   For maximum portability, do not use the `**' operator.
+
+   Unary plus and minus have the same precedence, the multiplication
+operators all have the same precedence, and addition and subtraction
+have the same precedence.
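+
+   The following sketch exercises a few of the operators from the table
+and checks the remainder identity given above.  The values of `a' and
+`b' are simply the ones from the `-17 % 8' example:
+
+     BEGIN {
+         print 2 ^ 3           # exponentiation
+         print 3 / 4           # division is not truncated
+         print 17 % 8          # remainder
+         a = -17; b = 8
+         print a % b           # negative, because `a' is negative
+         print (b * int(a / b) + (a % b) == a)
+     }
+
+With `gawk', this prints:
+
+     -| 8
+     -| 0.75
+     -| 1
+     -| -1
+     -| 1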
+
+
+File: gawk.info,  Node: Concatenation,  Next: Assignment Ops,  Prev: Arithmetic Ops,  Up: Expressions
+
+String Concatenation
+====================
+
+   There is only one string operation: concatenation.  It does not have
+a specific operator to represent it.  Instead, concatenation is
+performed by writing expressions next to one another, with no operator.
+For example:
+
+     $ awk '{ print "Field number one: " $1 }' BBS-list
+     -| Field number one: aardvark
+     -| Field number one: alpo-net
+     ...
+
+   Without the space in the string constant after the `:', the line
+would run together.  For example:
+
+     $ awk '{ print "Field number one:" $1 }' BBS-list
+     -| Field number one:aardvark
+     -| Field number one:alpo-net
+     ...
+
+   Since string concatenation does not have an explicit operator, it is
+often necessary to ensure that it happens where you want it to by using
+parentheses to enclose the items to be concatenated.  For example, the
+following code fragment does not concatenate `file' and `name' as you
+might expect:
+
+     file = "file"
+     name = "name"
+     print "something meaningful" > file name
+
+It is necessary to use the following:
+
+     print "something meaningful" > (file name)
+
+   We recommend that you use parentheses around concatenation in all
+but the most common contexts (such as on the right-hand side of `=').
+
+
+File: gawk.info,  Node: Assignment Ops,  Next: Increment Ops,  Prev: Concatenation,  Up: Expressions
+
+Assignment Expressions
+======================
+
+   An "assignment" is an expression that stores a new value into a
+variable.  For example, let's assign the value one to the variable `z':
+
+     z = 1
+
+   After this expression is executed, the variable `z' has the value
+one.  Whatever old value `z' had before the assignment is forgotten.
+
+   Assignments can store string values also.  For example, this would
+store the value `"this food is good"' in the variable `message':
+
+     thing = "food"
+     predicate = "good"
+     message = "this " thing " is " predicate
+
+(This also illustrates string concatenation.)
+
+   The `=' sign is called an "assignment operator".  It is the simplest
+assignment operator because the value of the right-hand operand is
+stored unchanged.
+
+   Most operators (addition, concatenation, and so on) have no effect
+except to compute a value.  If you ignore the value, you might as well
+not use the operator.  An assignment operator is different; it does
+produce a value, but even if you ignore the value, the assignment still
+makes itself felt through the alteration of the variable.  We call this
+a "side effect".
+
+   The left-hand operand of an assignment need not be a variable (*note
+Variables::); it can also be a field (*note Changing the Contents of a
+Field: Changing Fields.) or an array element (*note Arrays in `awk':
+Arrays.).  These are all called "lvalues", which means they can appear
+on the left-hand side of an assignment operator.  The right-hand
+operand may be any expression; it produces the new value which the
+assignment stores in the specified variable, field or array element.
+(Such values are called "rvalues").
+
+   It is important to note that variables do _not_ have permanent types.
+The type of a variable is simply the type of whatever value it happens
+to hold at the moment.  In the following program fragment, the variable
+`foo' has a numeric value at first, and a string value later on:
+
+     foo = 1
+     print foo
+     foo = "bar"
+     print foo
+
+When the second assignment gives `foo' a string value, the fact that it
+previously had a numeric value is forgotten.
+
+   String values that do not begin with a number have a numeric value
+of zero.  After executing this code, the value of `foo' is five:
+
+     foo = "a string"
+     foo = foo + 5
+
+(Note that using a variable as a number and then later as a string can
+be confusing and is poor programming style.  The above examples
+illustrate how `awk' works, _not_ how you should write your own
+programs!)
+
+   An assignment is an expression, so it has a value: the same value
+that is assigned.  Thus, `z = 1' as an expression has the value one.
+One consequence of this is that you can write multiple assignments
+together:
+
+     x = y = z = 0
+
+stores the value zero in all three variables.  It does this because the
+value of `z = 0', which is zero, is stored into `y', and then the value
+of `y = z = 0', which is zero, is stored into `x'.
+
+   You can use an assignment anywhere an expression is called for.  For
+example, it is valid to write `x != (y = 1)' to set `y' to one and then
+test whether `x' equals one.  But this style tends to make programs
+hard to read; except in a one-shot program, you should not use such
+nesting of assignments.
+
+   Aside from `=', there are several other assignment operators that do
+arithmetic with the old value of the variable.  For example, the
+operator `+=' computes a new value by adding the right-hand value to
+the old value of the variable.  Thus, the following assignment adds
+five to the value of `foo':
+
+     foo += 5
+
+This is equivalent to the following:
+
+     foo = foo + 5
+
+Use whichever one makes the meaning of your program clearer.
+
+   There are situations where using `+=' (or any assignment operator)
+is _not_ the same as simply repeating the left-hand operand in the
+right-hand expression.  For example:
+
+     # Thanks to Pat Rankin for this example
+     BEGIN  {
+         foo[rand()] += 5
+         for (x in foo)
+            print x, foo[x]
+     
+         bar[rand()] = bar[rand()] + 5
+         for (x in bar)
+            print x, bar[x]
+     }
+
+The indices of `bar' are practically guaranteed to be different,
+because `rand' will return different values each time it is called.
+(Arrays and the `rand' function haven't been covered yet.  *Note
+Arrays in `awk': Arrays, and see *Note Numeric Built-in Functions:
+Numeric Functions, for more information).  This example illustrates an
+important fact about the assignment operators: the left-hand expression
+is only evaluated _once_.
+
+   It is also up to the implementation as to which expression is
+evaluated first, the left-hand one or the right-hand one.  Consider
+this example:
+
+     i = 1
+     a[i += 2] = i + 1
+
+The value of `a[3]' could be either two or four.
+
+   Here is a table of the arithmetic assignment operators.  In each
+case, the right-hand operand is an expression whose value is converted
+to a number.
+
+`LVALUE += INCREMENT'
+     Adds INCREMENT to the value of LVALUE to make the new value of
+     LVALUE.
+
+`LVALUE -= DECREMENT'
+     Subtracts DECREMENT from the value of LVALUE.
+
+`LVALUE *= COEFFICIENT'
+     Multiplies the value of LVALUE by COEFFICIENT.
+
+`LVALUE /= DIVISOR'
+     Divides the value of LVALUE by DIVISOR.
+
+`LVALUE %= MODULUS'
+     Sets LVALUE to its remainder by MODULUS.
+
+`LVALUE ^= POWER'
+`LVALUE **= POWER'
+     Raises LVALUE to the power POWER.  (Only the `^=' operator is
+     specified by POSIX.)
+
+   For maximum portability, do not use the `**=' operator.
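+
+   As an illustration, this sketch applies each operator in turn to a
+single variable (the name `n' and the starting value are arbitrary)
+and prints the result after every step:
+
+     BEGIN {
+         n = 10
+         n += 5;  print n      # 10 + 5
+         n -= 3;  print n      # 15 - 3
+         n *= 2;  print n      # 12 * 2
+         n /= 4;  print n      # 24 / 4
+         n %= 4;  print n      # 6 % 4
+         n ^= 3;  print n      # 2 ^ 3
+     }
+
+This prints:
+
+     -| 15
+     -| 12
+     -| 24
+     -| 6
+     -| 2
+     -| 8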
+
+
+File: gawk.info,  Node: Increment Ops,  Next: Truth Values,  Prev: Assignment Ops,  Up: Expressions
+
+Increment and Decrement Operators
+=================================
+
+   "Increment" and "decrement operators" increase or decrease the value
+of a variable by one.  You could do the same thing with an assignment
+operator, so the increment operators add no power to the `awk'
+language; but they are convenient abbreviations for very common
+operations.
+
+   The operator to add one is written `++'.  It can be used to increment
+a variable either before or after taking its value.
+
+   To pre-increment a variable V, write `++V'.  This adds one to the
+value of V and that new value is also the value of this expression.
+The assignment expression `V += 1' is completely equivalent.
+
+   Writing the `++' after the variable specifies post-increment.  This
+increments the variable value just the same; the difference is that the
+value of the increment expression itself is the variable's _old_ value.
+Thus, if `foo' has the value four, then the expression `foo++' has the
+value four, but it changes the value of `foo' to five.
+
+   The post-increment `foo++' is nearly equivalent to writing `(foo +=
+1) - 1'.  It is not perfectly equivalent because all numbers in `awk'
+are floating point: in floating point, `foo + 1 - 1' does not
+necessarily equal `foo'.  But the difference is minute as long as you
+stick to numbers that are fairly small (less than 10e12).
+
+   Any lvalue can be incremented.  Fields and array elements are
+incremented just like variables.  (Use `$(i++)' when you wish to do a
+field reference and a variable increment at the same time.  The
+parentheses are necessary because of the precedence of the field
+reference operator, `$'.)
+
+   The decrement operator `--' works just like `++' except that it
+subtracts one instead of adding.  Like `++', it can be used before the
+lvalue to pre-decrement or after it to post-decrement.
+
+   Here is a summary of increment and decrement expressions.
+
+`++LVALUE'
+     This expression increments LVALUE and the new value becomes the
+     value of the expression.
+
+`LVALUE++'
+     This expression increments LVALUE, but the value of the expression
+     is the _old_ value of LVALUE.
+
+`--LVALUE'
+     Like `++LVALUE', but instead of adding, it subtracts.  It
+     decrements LVALUE and delivers the value that results.
+
+`LVALUE--'
+     Like `LVALUE++', but instead of adding, it subtracts.  It
+     decrements LVALUE.  The value of the expression is the _old_ value
+     of LVALUE.
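+
+   A short sketch makes the difference between the prefix and postfix
+forms concrete.  It starts from the value four, as in the text above;
+the parentheses are there only for readability:
+
+     BEGIN {
+         foo = 4
+         print (foo++)         # old value; foo is now 5
+         print (++foo)         # new value; foo is now 6
+         print (foo--)         # old value; foo is now 5
+         print (--foo)         # new value; foo is now 4
+     }
+
+This prints:
+
+     -| 4
+     -| 6
+     -| 6
+     -| 4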
+
+
+File: gawk.info,  Node: Truth Values,  Next: Typing and Comparison,  Prev: Increment Ops,  Up: Expressions
+
+True and False in `awk'
+=======================
+
+   Many programming languages have a special representation for the
+concepts of "true" and "false."  Such languages usually use the special
+constants `true' and `false', or perhaps their upper-case equivalents.
+
+   `awk' is different.  It borrows a very simple concept of true and
+false from C.  In `awk', any non-zero numeric value, _or_ any non-empty
+string value is true.  Any other value (zero or the null string, `""')
+is false.  The following program will print `A strange truth value'
+three times:
+
+     BEGIN {
+        if (3.1415927)
+            print "A strange truth value"
+        if ("Four Score And Seven Years Ago")
+            print "A strange truth value"
+        if (j = 57)
+            print "A strange truth value"
+     }
+
+   There is a surprising consequence of the "non-zero or non-null" rule:
+The string constant `"0"' is actually true, since it is non-null (d.c.).
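+
+   The contrast between the number zero and the string `"0"' can be
+sketched like this (the messages are, of course, arbitrary):
+
+     BEGIN {
+         if (0)
+             print "numeric 0 is true"
+         else
+             print "numeric 0 is false"
+         if ("")
+             print "the null string is true"
+         else
+             print "the null string is false"
+         if ("0")
+             print "the string constant \"0\" is true"
+         else
+             print "the string constant \"0\" is false"
+     }
+
+With `gawk', this prints:
+
+     -| numeric 0 is false
+     -| the null string is false
+     -| the string constant "0" is true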
+
+
+File: gawk.info,  Node: Typing and Comparison,  Next: Boolean Ops,  Prev: Truth Values,  Up: Expressions
+
+Variable Typing and Comparison Expressions
+==========================================
+
+   Unlike variables in many other programming languages, `awk'
+variables do not have a fixed type.  Instead, they can be either a
+number or a string, depending upon the value that is assigned to them.
+
+   The 1992 POSIX standard introduced the concept of a "numeric
+string", which is simply a string that looks like a number, for
+example, `" +2"'.  This concept is used for determining the type of a
+variable.
+
+   The type of the variable is important, since the types of two
+variables determine how they are compared.
+
+   In `gawk', variable typing follows these rules.
+
+  1. A numeric literal or the result of a numeric operation has the
+     NUMERIC attribute.
+
+  2. A string literal or the result of a string operation has the STRING
+     attribute.
+
+  3. Fields, `getline' input, `FILENAME', `ARGV' elements, `ENVIRON'
+     elements and the elements of an array created by `split' that are
+     numeric strings have the STRNUM attribute.  Otherwise, they have
+     the STRING attribute.  Uninitialized variables also have the
+     STRNUM attribute.
+
+  4. Attributes propagate across assignments, but are not changed by
+     any use.
+
+   The last rule is particularly important. In the following program,
+`a' has numeric type, even though it is later used in a string
+operation.
+
+     BEGIN {
+              a = 12.345
+              b = a " is a cute number"
+              print b
+     }
+
+   When two operands are compared, either string comparison or numeric
+comparison may be used, depending on the attributes of the operands,
+according to the following, symmetric, matrix:
+
+             +----------------------------------------------
+             |       STRING          NUMERIC         STRNUM
+     --------+----------------------------------------------
+             |
+     STRING  |       string          string          string
+             |
+     NUMERIC |       string          numeric         numeric
+             |
+     STRNUM  |       string          numeric         numeric
+     --------+----------------------------------------------
+
+   The basic idea is that user input that looks numeric, and _only_
+user input, should be treated as numeric, even though it is actually
+made of characters, and is therefore also a string.
+
+   "Comparison expressions" compare strings or numbers for
+relationships such as equality.  They are written using "relational
+operators", which are a superset of those in C.  Here is a table of
+them:
+
+`X < Y'
+     True if X is less than Y.
+
+`X <= Y'
+     True if X is less than or equal to Y.
+
+`X > Y'
+     True if X is greater than Y.
+
+`X >= Y'
+     True if X is greater than or equal to Y.
+
+`X == Y'
+     True if X is equal to Y.
+
+`X != Y'
+     True if X is not equal to Y.
+
+`X ~ Y'
+     True if the string X matches the regexp denoted by Y.
+
+`X !~ Y'
+     True if the string X does not match the regexp denoted by Y.
+
+`SUBSCRIPT in ARRAY'
+     True if the array ARRAY has an element with the subscript
+     SUBSCRIPT.
+
+   Comparison expressions have the value one if true and zero if false.
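+
+   Because the result is an ordinary number, it can be printed or used
+in arithmetic.  A small sketch (the array `a' and its subscripts are
+made up for illustration):
+
+     $ awk 'BEGIN {
+     >     a["x"] = 1
+     >     printf "%d %d %d\n", (2 < 3), ("x" in a), ("y" in a)
+     > }'
+     -| 1 1 0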
+
+   When comparing operands of mixed types, numeric operands are
+converted to strings using the value of `CONVFMT' (*note Conversion of
+Strings and Numbers: Conversion.).
+
+   Strings are compared by comparing the first character of each, then
+the second character of each, and so on.  Thus `"10"' is less than
+`"9"'.  If there are two strings where one is a prefix of the other,
+the shorter string is less than the longer one.  Thus `"abc"' is less
+than `"abcd"'.
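+
+   The difference from numeric comparison can be seen in a short
+sketch along these lines (the operands are simply the values mentioned
+above, written as constants):
+
+     $ awk 'BEGIN {
+     >     printf "%d %d %d\n", ("10" < "9"), (10 < 9), ("abc" < "abcd")
+     > }'
+     -| 1 0 1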
+
+   It is very easy to accidentally mistype the `==' operator, and leave
+off one of the `='s.  The result is still valid `awk' code, but the
+program will not do what you mean:
+
+     if (a = b)   # oops! should be a == b
+        ...
+     else
+        ...
+
+Unless `b' happens to be zero or the null string, the `if' part of the
+test will always succeed.  Because the operators are so similar, this
+kind of error is very difficult to spot when scanning the source code.
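+
+   A short sketch of the effect (the values of `a' and `b' are
+arbitrary): the assignment both changes `a' and makes the test succeed.
+
+     $ awk 'BEGIN {
+     >     a = 1; b = 2
+     >     if (a = b)
+     >         print "test succeeded; a is now", a
+     > }'
+     -| test succeeded; a is now 2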
+
+   Here are some sample expressions, how `gawk' compares them, and what
+the result of the comparison is.
+
+`1.5 <= 2.0'
+     numeric comparison (true)
+
+`"abc" >= "xyz"'
+     string comparison (false)
+
+`1.5 != " +2"'
+     string comparison (true)
+
+`"1e2" < "3"'
+     string comparison (true)
+
+`a = 2; b = "2"'
+`a == b'
+     string comparison (true)
+
+`a = 2; b = " +2"'
+`a == b'
+     string comparison (false)
+
+   In this example,
+
+     $ echo 1e2 3 | awk '{ print ($1 < $2) ? "true" : "false" }'
+     -| false
+
+the result is `false' since both `$1' and `$2' are numeric strings and
+thus both have the STRNUM attribute, dictating a numeric comparison.
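+
+   By contrast, if the same two values are written as string constants
+in the program, they have the STRING attribute and a string comparison
+is done.  A sketch of that case, wrapped in a `BEGIN' rule so that no
+input is needed (the variable `r' is just a scratch name):
+
+     $ awk 'BEGIN { r = ("1e2" < "3") ? "true" : "false"; print r }'
+     -| true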
+
+   The purpose of the comparison rules and the use of numeric strings is
+to attempt to produce the behavior that is "least surprising," while
+still "doing the right thing."
+
+   String comparisons and regular expression comparisons are very
+different.  For example,
+
+     x == "foo"
+
+has the value of one, or is true, if the variable `x' is precisely
+`foo'.  By contrast,
+
+     x ~ /foo/
+
+has the value one if `x' contains `foo', such as `"Oh, what a fool am
+I!"'.
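+
+   For example, with the (made-up) input `fool', the two tests give
+different answers:
+
+     $ echo fool | awk '{ printf "%d %d\n", ($0 == "foo"), ($0 ~ /foo/) }'
+     -| 0 1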
+
+   The right hand operand of the `~' and `!~' operators may be either a
+regexp constant (`/.../'), or an ordinary expression, in which case the
+value of the expression as a string is used as a dynamic regexp (*note
+How to Use Regular Expressions: Regexp Usage.; also *note Using Dynamic
+Regexps: Computed Regexps.).
+
+   In recent implementations of `awk', a constant regular expression in
+slashes by itself is also an expression.  The regexp `/REGEXP/' is an
+abbreviation for this comparison expression:
+
+     $0 ~ /REGEXP/
+
+   One special place where `/foo/' is _not_ an abbreviation for `$0 ~
+/foo/' is when it is the right-hand operand of `~' or `!~'!  *Note
+Using Regular Expression Constants: Using Constant Regexps, where this
+is discussed in more detail.
+
+
+File: gawk.info,  Node: Boolean Ops,  Next: Conditional Exp,  Prev: Typing and Comparison,  Up: Expressions
+
+Boolean Expressions
+===================
+
+   A "boolean expression" is a combination of comparison expressions or
+matching expressions, using the boolean operators "or" (`||'), "and"
+(`&&'), and "not" (`!'), along with parentheses to control nesting.
+The truth value of the boolean expression is computed by combining the
+truth values of the component expressions.  Boolean expressions are
+also referred to as "logical expressions".  The terms are equivalent.
+
+   Boolean expressions can be used wherever comparison and matching
+expressions can be used.  They can be used in `if', `while', `do' and
+`for' statements (*note Control Statements in Actions: Statements.).
+They have numeric values (one if true, zero if false), which come into
+play if the result of the boolean expression is stored in a variable, or
+used in arithmetic.
+
+   In addition, every boolean expression is also a valid pattern, so
+you can use one as a pattern to control the execution of rules.
+
+   Here are descriptions of the three boolean operators, with examples.
+
+`BOOLEAN1 && BOOLEAN2'
+     True if both BOOLEAN1 and BOOLEAN2 are true.  For example, the
+     following statement prints the current input record if it contains
+     both `2400' and `foo'.
+
+          if ($0 ~ /2400/ && $0 ~ /foo/) print
+
+     The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is true.
+     This can make a difference when BOOLEAN2 contains expressions that
+     have side effects: in the case of `$0 ~ /foo/ && ($2 == bar++)',
+     the variable `bar' is not incremented if there is no `foo' in the
+     record.
+
+`BOOLEAN1 || BOOLEAN2'
+     True if at least one of BOOLEAN1 or BOOLEAN2 is true.  For
+     example, the following statement prints all records in the input
+     that contain _either_ `2400' or `foo', or both.
+
+          if ($0 ~ /2400/ || $0 ~ /foo/) print
+
+     The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is false.
+     This can make a difference when BOOLEAN2 contains expressions
+     that have side effects.
+
+`! BOOLEAN'
+     True if BOOLEAN is false.  For example, the following program
+     prints all records in the input file `BBS-list' that do _not_
+     contain the string `foo'.
+
+          awk '{ if (! ($0 ~ /foo/)) print }' BBS-list
+
+   The `&&' and `||' operators are called "short-circuit" operators
+because of the way they work.  Evaluation of the full expression is
+"short-circuited" if the result can be determined part way through its
+evaluation.
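+
+   A minimal sketch of short-circuiting, using a made-up counter `n' as
+the side effect; neither increment is ever executed:
+
+     $ awk 'BEGIN {
+     >     n = 0
+     >     if (0 && n++) print "not reached"
+     >     if (1 || n++) print "n is still", n
+     > }'
+     -| n is still 0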
+
+   You can continue a statement that uses `&&' or `||' simply by
+putting a newline after them.  But you cannot put a newline in front of
+either of these operators without using backslash continuation (*note
+`awk' Statements Versus Lines: Statements/Lines.).
+
+   The actual value of an expression using the `!' operator will be
+either one or zero, depending upon the truth value of the expression it
+is applied to.
+
+   The `!' operator is often useful for changing the sense of a flag
+variable from false to true and back again. For example, the following
+program is one way to print lines in between special bracketing lines:
+
+     $1 == "START"   { interested = ! interested }
+     interested == 1 { print }
+     $1 == "END"     { interested = ! interested }
+
+The variable `interested', like all `awk' variables, starts out
+initialized to zero, which is also false.  When a line is seen whose
+first field is `START', the value of `interested' is toggled to true,
+using `!'. The next rule prints lines as long as `interested' is true.
+When a line is seen whose first field is `END', `interested' is toggled
+back to false.
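+
+   Here is a sketch of a run with some made-up input (it assumes a
+shell that provides the `printf' utility).  Note that, because of the
+order of the rules, the bracketing lines themselves are printed too:
+
+     $ printf 'one\nSTART\ntwo\nthree\nEND\nfour\n' | awk '
+     > $1 == "START"   { interested = ! interested }
+     > interested == 1 { print }
+     > $1 == "END"     { interested = ! interested }'
+     -| START
+     -| two
+     -| three
+     -| END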
+
+
+File: gawk.info,  Node: Conditional Exp,  Next: Function Calls,  Prev: Boolean Ops,  Up: Expressions
+
+Conditional Expressions
+=======================
+
+   A "conditional expression" is a special kind of expression with
+three operands.  It allows you to use one expression's value to select
+one of two other expressions.
+
+   The conditional expression is the same as in the C language:
+
+     SELECTOR ? IF-TRUE-EXP : IF-FALSE-EXP
+
+There are three subexpressions.  The first, SELECTOR, is always
+computed first.  If it is "true" (not zero and not null) then
+IF-TRUE-EXP is computed next and its value becomes the value of the
+whole expression.  Otherwise, IF-FALSE-EXP is computed next and its
+value becomes the value of the whole expression.
+
+   For example, this expression produces the absolute value of `x':
+
+     x > 0 ? x : -x
+
+   Each time the conditional expression is computed, exactly one of
+IF-TRUE-EXP and IF-FALSE-EXP is computed; the other is ignored.  This
+is important when the expressions contain side effects.  For example,
+this conditional expression examines element `i' of either array `a' or
+array `b', and increments `i'.
+
+     x == y ? a[i++] : b[i++]
+
+This is guaranteed to increment `i' exactly once, because each time
+only one of the two increment expressions is executed, and the other is
+not.  *Note Arrays in `awk': Arrays, for more information about arrays.
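+
+   A sketch of this, with made-up array contents and with `x' and `y'
+chosen to be unequal, so that only the `b' branch is evaluated:
+
+     $ awk 'BEGIN {
+     >     a[0] = "from a"; b[0] = "from b"
+     >     i = 0; x = 1; y = 2
+     >     v = (x == y ? a[i++] : b[i++])
+     >     print v, i
+     > }'
+     -| from b 1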
+
+   As a minor `gawk' extension, you can continue a statement that uses
+`?:' simply by putting a newline after either character.  However, you
+cannot put a newline in front of either character without using
+backslash continuation (*note `awk' Statements Versus Lines:
+Statements/Lines.).
+
+
+File: gawk.info,  Node: Function Calls,  Next: Precedence,  Prev: Conditional Exp,  Up: Expressions
+
+Function Calls
+==============
+
+   A "function" is a name for a particular calculation.  Because it has
+a name, you can ask for it by name at any point in the program.  For
+example, the function `sqrt' computes the square root of a number.
+
+   A fixed set of functions are "built-in", which means they are
+available in every `awk' program.  The `sqrt' function is one of these.
+*Note Built-in Functions: Built-in, for a list of built-in functions
+and their descriptions.  In addition, you can define your own functions
+for use in your program.  *Note User-defined Functions: User-defined,
+for how to do this.
+
+   The way to use a function is with a "function call" expression,
+which consists of the function name followed immediately by a list of
+"arguments" in parentheses.  The arguments are expressions which
+provide the raw materials for the function's calculations.  When there
+is more than one argument, they are separated by commas.  If there are
+no arguments, write just `()' after the function name.  Here are some
+examples:
+
+     sqrt(x^2 + y^2)        one argument
+     atan2(y, x)            two arguments
+     rand()                 no arguments
+
+   *Do not put any space between the function name and the
+open-parenthesis!*  A user-defined function name looks just like the
+name of a variable, and space would make the expression look like
+concatenation of a variable with an expression inside parentheses.
+Space before the parenthesis is harmless with built-in functions, but
+it is best not to get into the habit of using it; otherwise you are
+likely to make mistakes with user-defined functions.
+
+   Each function expects a particular number of arguments.  For
+example, the `sqrt' function must be called with a single argument, the
+number to take the square root of:
+
+     sqrt(ARGUMENT)
+
+   Some of the built-in functions allow you to omit the final argument.
+If you do so, they use a reasonable default.  *Note Built-in Functions:
+Built-in, for full details.  If arguments are omitted in calls to
+user-defined functions, then those arguments are treated as local
+variables, initialized to the empty string (*note User-defined
+Functions: User-defined.).
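+
+   As an illustration of the last point, here is a sketch with a
+hypothetical user-defined function `greet' that is called with its
+second argument omitted:
+
+     $ awk 'function greet(name, suffix) {
+     >     # suffix was omitted by the caller, so it is the null string
+     >     return "hello, " name suffix
+     > }
+     > BEGIN { print greet("world") }'
+     -| hello, world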
+
+   Like every other expression, the function call has a value, which is
+computed by the function based on the arguments you give it.  In this
+example, the value of `sqrt(ARGUMENT)' is the square root of ARGUMENT.
+A function can also have side effects, such as assigning values to
+certain variables or doing I/O.
+
+   Here is a command to read numbers, one number per line, and print the
+square root of each one:
+
+     $ awk '{ print "The square root of", $1, "is", sqrt($1) }'
+     1
+     -| The square root of 1 is 1
+     3
+     -| The square root of 3 is 1.73205
+     5
+     -| The square root of 5 is 2.23607
+     Control-d
+
+
+File: gawk.info,  Node: Precedence,  Prev: Function Calls,  Up: Expressions
+
+Operator Precedence (How Operators Nest)
+========================================
+
+   "Operator precedence" determines how operators are grouped, when
+different operators appear close by in one expression.  For example,
+`*' has higher precedence than `+'; thus, `a + b * c' means to multiply
+`b' and `c', and then add `a' to the product (i.e. `a + (b * c)').
+
+   You can overrule the precedence of the operators by using
+parentheses.  You can think of the precedence rules as saying where the
+parentheses are assumed to be if you do not write parentheses yourself.
+In fact, it is wise to always use parentheses whenever you have an
+unusual combination of operators, because other people who read the
+program may not remember what the precedence is in this case.  You
+might forget, too; then you could make a mistake.  Explicit parentheses
+will help prevent any such mistake.
+
+   When operators of equal precedence are used together, the leftmost
+operator groups first, except for the assignment, conditional and
+exponentiation operators, which group in the opposite order.  Thus, `a
+- b + c' groups as `(a - b) + c', and `a = b = c' groups as `a = (b =
+c)'.
+
+   The precedence of prefix unary operators does not matter as long as
+only unary operators are involved, because there is only one way to
+interpret them--innermost first.  Thus, `$++i' means `$(++i)' and
+`++$x' means `++($x)'.  However, when another operator follows the
+operand, then the precedence of the unary operators can matter.  Thus,
+`$x^2' means `($x)^2', but `-x^2' means `-(x^2)', because `-' has lower
+precedence than `^' while `$' has higher precedence.
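+
+   The grouping rules above can be checked with a few constants (a
+sketch; the numbers are arbitrary):
+
+     $ awk 'BEGIN {
+     >     x = 3
+     >     print 10 - 4 + 3, 2^3^2, -x^2
+     > }'
+     -| 9 512 -9
+
+Here `10 - 4 + 3' groups as `(10 - 4) + 3', `2^3^2' groups as
+`2^(3^2)', and `-x^2' groups as `-(x^2)'.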
+
+   Here is a table of `awk''s operators, in order from highest
+precedence to lowest:
+
+`(...)'
+     Grouping.
+
+`$'
+     Field.
+
+`++ --'
+     Increment, decrement.
+
+`^ **'
+     Exponentiation.  These operators group right-to-left.  (The `**'
+     operator is not specified by POSIX.)
+
+`+ - !'
+     Unary plus, minus, logical "not".
+
+`* / %'
+     Multiplication, division, modulus.
+
+`+ -'
+     Addition, subtraction.
+
+`Concatenation'
+     No special token is used to indicate concatenation.  The operands
+     are simply written side by side.
+
+`< <= == !='
+`> >= >> |'
+     Relational, and redirection.  The relational operators and the
+     redirections have the same precedence level.  Characters such as
+     `>' serve both as relationals and as redirections; the context
+     distinguishes between the two meanings.
+
+     Note that the I/O redirection operators in `print' and `printf'
+     statements belong to the statement level, not to expressions.  The
+     redirection does not produce an expression which could be the
+     operand of another operator.  As a result, it does not make sense
+     to use a redirection operator near another operator of lower
+     precedence, without parentheses.  Such combinations, for example
+     `print foo > a ? b : c', result in syntax errors.  The correct way
+     to write this statement is `print foo > (a ? b : c)'.
+
+`~ !~'
+     Matching, non-matching.
+
+`in'
+     Array membership.
+
+`&&'
+     Logical "and".
+
+`||'
+     Logical "or".
+
+`?:'
+     Conditional.  This operator groups right-to-left.
+
+`= += -= *='
+`/= %= ^= **='
+     Assignment.  These operators group right-to-left.  (The `**='
+     operator is not specified by POSIX.)
+
+
+File: gawk.info,  Node: Patterns and Actions,  Next: Statements,  Prev: Expressions,  Up: Top
+
+Patterns and Actions
+********************
+
+   As you have already seen, each `awk' rule consists of a pattern
+with an associated action.  This chapter describes how you build
+patterns and actions.
+
+* Menu:
+
+* Pattern Overview::            What goes into a pattern.
+* Action Overview::             What goes into an action.
+
+
+File: gawk.info,  Node: Pattern Overview,  Next: Action Overview,  Prev: Patterns and Actions,  Up: Patterns and Actions
+
+Pattern Elements
+================
+
+   Patterns in `awk' control the execution of rules: a rule is executed
+when its pattern matches the current input record.  This section
+explains all about how to write patterns.
+
+* Menu:
+
+* Kinds of Patterns::           A list of all kinds of patterns.
+* Regexp Patterns::             Using regexps as patterns.
+* Expression Patterns::         Any expression can be used as a pattern.
+* Ranges::                      Pairs of patterns specify record ranges.
+* BEGIN/END::                   Specifying initialization and cleanup rules.
+* Empty::                       The empty pattern, which matches every record.
+
+
+File: gawk.info,  Node: Kinds of Patterns,  Next: Regexp Patterns,  Prev: Pattern Overview,  Up: Pattern Overview
+
+Kinds of Patterns
+-----------------
+
+   Here is a summary of the types of patterns supported in `awk'.
+
+`/REGULAR EXPRESSION/'
+     A regular expression as a pattern.  It matches when the text of the
+     input record fits the regular expression.  (*Note Regular
+     Expressions: Regexp.)
+
+`EXPRESSION'
+     A single expression.  It matches when its value is non-zero (if a
+     number) or non-null (if a string).  (*Note Expressions as
+     Patterns: Expression Patterns.)
+
+`PAT1, PAT2'
+     A pair of patterns separated by a comma, specifying a range of
+     records.  The range includes both the initial record that matches
+     PAT1, and the final record that matches PAT2.  (*Note Specifying
+     Record Ranges with Patterns: Ranges.)
+
+`BEGIN'
+`END'
+     Special patterns for you to supply start-up or clean-up actions
+     for your `awk' program.  (*Note The `BEGIN' and `END' Special
+     Patterns: BEGIN/END.)
+
+`EMPTY'
+     The empty pattern matches every input record.  (*Note The Empty
+     Pattern: Empty.)
+
+
+File: gawk.info,  Node: Regexp Patterns,  Next: Expression Patterns,  Prev: Kinds of Patterns,  Up: Pattern Overview
+
+Regular Expressions as Patterns
+-------------------------------
+
+   We have been using regular expressions as patterns since our early
+examples.  This kind of pattern is simply a regexp constant in the
+pattern part of a rule.  Its meaning is `$0 ~ /PATTERN/'.  The pattern
+matches when the input record matches the regexp.  For example:
+
+     /foo|bar|baz/  { buzzwords++ }
+     END            { print buzzwords, "buzzwords seen" }
+
+
+File: gawk.info,  Node: Expression Patterns,  Next: Ranges,  Prev: Regexp Patterns,  Up: Pattern Overview
+
+Expressions as Patterns
+-----------------------
+
+   Any `awk' expression is valid as an `awk' pattern.  Then the pattern
+matches if the expression's value is non-zero (if a number) or non-null
+(if a string).
+
+   The expression is reevaluated each time the rule is tested against a
+new input record.  If the expression uses fields such as `$1', the
+value depends directly on the new input record's text; otherwise, it
+depends only on what has happened so far in the execution of the `awk'
+program, but that may still be useful.
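+
+   For example, this sketch (with made-up input) uses a pattern that
+does not look at the record's text at all; it depends only on the
+record number `NR':
+
+     $ printf 'a\nb\nc\nd\n' | awk 'NR % 2 == 0'
+     -| b
+     -| d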
+
+   A very common kind of expression used as a pattern is the comparison
+expression, using the comparison operators described in *Note Variable
+Typing and Comparison Expressions: Typing and Comparison.
+
+   Regexp matching and non-matching are also very common expressions.
+The left operand of the `~' and `!~' operators is a string.  The right
+operand is either a constant regular expression enclosed in slashes
+(`/REGEXP/'), or any expression, whose string value is used as a
+dynamic regular expression (*note Using Dynamic Regexps: Computed
+Regexps.).
+
+   The following example prints the second field of each input record
+whose first field is precisely `foo'.
+
+     $ awk '$1 == "foo" { print $2 }' BBS-list
+
+(There is no output, since there is no BBS site named "foo".)  Contrast
+this with the following regular expression match, which would accept
+any record with a first field that contains `foo':
+
+     $ awk '$1 ~ /foo/ { print $2 }' BBS-list
+     -| 555-1234
+     -| 555-6699
+     -| 555-6480
+     -| 555-2127
+
+   Boolean expressions are also commonly used as patterns.  Whether the
+pattern matches an input record depends on whether its subexpressions
+match.
+
+   For example, the following command prints all records in `BBS-list'
+that contain both `2400' and `foo'.
+
+     $ awk '/2400/ && /foo/' BBS-list
+     -| fooey        555-1234     2400/1200/300     B
+
+   The following command prints all records in `BBS-list' that contain
+_either_ `2400' or `foo', or both.
+
+     $ awk '/2400/ || /foo/' BBS-list
+     -| alpo-net     555-3412     2400/1200/300     A
+     -| bites        555-1675     2400/1200/300     A
+     -| fooey        555-1234     2400/1200/300     B
+     -| foot         555-6699     1200/300          B
+     -| macfoo       555-6480     1200/300          A
+     -| sdace        555-3430     2400/1200/300     A
+     -| sabafoo      555-2127     1200/300          C
+
+   The following command prints all records in `BBS-list' that do _not_
+contain the string `foo'.
+
+     $ awk '! /foo/' BBS-list
+     -| aardvark     555-5553     1200/300          B
+     -| alpo-net     555-3412     2400/1200/300     A
+     -| barfly       555-7685     1200/300          A
+     -| bites        555-1675     2400/1200/300     A
+     -| camelot      555-0542     300               C
+     -| core         555-2912     1200/300          C
+     -| sdace        555-3430     2400/1200/300     A
+
+   The subexpressions of a boolean operator in a pattern can be
+constant regular expressions, comparisons, or any other `awk'
+expressions.  Range patterns are not expressions, so they cannot appear
+inside boolean patterns.  Likewise, the special patterns `BEGIN' and
+`END', which never match any input record, are not expressions and
+cannot appear inside boolean patterns.
+
+   A regexp constant as a pattern is also a special case of an
+expression pattern.  `/foo/' as an expression has the value one if `foo'
+appears in the current input record; thus, as a pattern, `/foo/'
+matches any record containing `foo'.
+
+
+File: gawk.info,  Node: Ranges,  Next: BEGIN/END,  Prev: Expression Patterns,  Up: Pattern Overview
+
+Specifying Record Ranges with Patterns
+--------------------------------------
+
+   A "range pattern" is made of two patterns separated by a comma, of
+the form `BEGPAT, ENDPAT'.  It matches ranges of consecutive input
+records.  The first pattern, BEGPAT, controls where the range begins,
+and the second one, ENDPAT, controls where it ends.  For example,
+
+     awk '$1 == "on", $1 == "off"'
+
+prints every record between `on'/`off' pairs, inclusive.
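+
+   With some made-up input, a run might look like this:
+
+     $ printf 'x\non\ny\noff\nz\n' | awk '$1 == "on", $1 == "off"'
+     -| on
+     -| y
+     -| off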
+
+   A range pattern starts out by matching BEGPAT against every input
+record; when a record matches BEGPAT, the range pattern becomes "turned
+on".  The range pattern matches this record.  As long as it stays
+turned on, it automatically matches every input record read.  It also
+matches ENDPAT against every input record; when that succeeds, the
+range pattern is turned off again for the following record.  Then it
+goes back to checking BEGPAT against each record.
+
+   The record that turns on the range pattern and the one that turns it
+off both match the range pattern.  If you don't want to operate on
+these records, you can write `if' statements in the rule's action to
+distinguish them from the records you are interested in.
+
+   It is possible for a pattern to be turned both on and off by the same
+record, if the record satisfies both conditions.  Then the action is
+executed for just that record.
+
+   For example, suppose you have text between two identical markers (say
+the `%' symbol) that you wish to ignore.  You might try to combine a
+range pattern that describes the delimited text with the `next'
+statement (not discussed yet, *note The `next' Statement: Next
+Statement.), which causes `awk' to skip any further processing of the
+current record and start over again with the next input record. Such a
+program would look like this:
+
+     /^%$/,/^%$/    { next }
+                    { print }
+
+This program fails because the range pattern is both turned on and
+turned off by the first line with just a `%' on it.  To accomplish this
+task, you must write the program this way, using a flag:
+
+     /^%$/     { skip = ! skip; next }
+     skip == 1 { next } # skip lines with `skip' set
+               { print }
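+
+   Here is a sketch of a complete run, with a final rule that prints
+the records that are kept (the input is made up; `%%' in the `printf'
+format produces a single `%'):
+
+     $ printf 'keep 1\n%%\nhidden\n%%\nkeep 2\n' | awk '
+     > /^%$/     { skip = ! skip; next }
+     > skip == 1 { next }
+     >           { print }'
+     -| keep 1
+     -| keep 2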
+
+   Note that in a range pattern, the `,' has the lowest precedence (is
+evaluated last) of all the operators.  Thus, for example, the following
+program attempts to combine a range pattern with another, simpler test.
+
+     echo Yes | awk '/1/,/2/ || /Yes/'
+
+   The author of this program intended it to mean `(/1/,/2/) || /Yes/'.
+However, `awk' interprets this as `/1/, (/2/ || /Yes/)'.  Since the
+input contains no `1', the range never begins and nothing is printed.
+This cannot be changed or worked around; range patterns do not combine
+with other patterns.
+
+
+File: gawk.info,  Node: BEGIN/END,  Next: Empty,  Prev: Ranges,  Up: Pattern Overview
+
+The `BEGIN' and `END' Special Patterns
+--------------------------------------
+
+   `BEGIN' and `END' are special patterns.  They are not used to match
+input records.  Rather, they supply start-up or clean-up actions for
+your `awk' script.
+
+* Menu:
+
+* Using BEGIN/END::             How and why to use BEGIN/END rules.
+* I/O And BEGIN/END::           I/O issues in BEGIN/END rules.
+
+
+File: gawk.info,  Node: Using BEGIN/END,  Next: I/O And BEGIN/END,  Prev: BEGIN/END,  Up: BEGIN/END
+
+Startup and Cleanup Actions
+...........................
+
+   A `BEGIN' rule is executed, once, before the first input record has
+been read.  An `END' rule is executed, once, after all the input has
+been read.  For example:
+
+     $ awk '
+     > BEGIN { print "Analysis of \"foo\"" }
+     > /foo/ { ++n }
+     > END   { print "\"foo\" appears " n " times." }' BBS-list
+     -| Analysis of "foo"
+     -| "foo" appears 4 times.
+
+   This program finds the number of records in the input file `BBS-list'
+that contain the string `foo'.  The `BEGIN' rule prints a title for the
+report.  There is no need to use the `BEGIN' rule to initialize the
+counter `n' to zero, as `awk' does this automatically (*note
+Variables::).
+
+   The second rule increments the variable `n' every time a record
+containing the pattern `foo' is read.  The `END' rule prints the value
+of `n' at the end of the run.
+
+   The special patterns `BEGIN' and `END' cannot be used in ranges or
+with boolean operators (indeed, they cannot be used with any operators).
+
+   An `awk' program may have multiple `BEGIN' and/or `END' rules.  They
+are executed in the order they appear, all the `BEGIN' rules at
+start-up and all the `END' rules at termination.  `BEGIN' and `END'
+rules may be intermixed with other rules.  This feature was added in
+the 1987 version of `awk', and is included in the POSIX standard.  The
+original (1978) version of `awk' required you to put the `BEGIN' rule
+at the beginning of the program, and the `END' rule at the end, and
+only allowed one of each.  This is no longer required, but it is a good
+idea in terms of program organization and readability.
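+
+   A sketch of the ordering, run against `/dev/null' so that there is
+no input to read:
+
+     $ awk 'BEGIN { print "first" }
+     > END   { print "last" }
+     > BEGIN { print "second" }' /dev/null
+     -| first
+     -| second
+     -| last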
+
+   Multiple `BEGIN' and `END' rules are useful for writing library
+functions, since each library file can have its own `BEGIN' and/or
+`END' rule to do its own initialization and/or cleanup.  Note that the
+order in which library files are named on the command line controls
+the order in which their `BEGIN' and `END' rules are executed.
+Therefore you have to be careful to write such rules in library files
+so that the order in which they are executed doesn't matter.  *Note
+Command Line Options: Options, for more information on using library
+functions.  *Note A Library of `awk' Functions: Library Functions, for
+a number of useful library functions.
+
+   If an `awk' program only has a `BEGIN' rule, and no other rules,
+then the program exits after the `BEGIN' rule has been run.  (The
+original version of `awk' used to keep reading and ignoring input until
+end of file was seen.)  However, if an `END' rule exists, then the
+input will be read, even if there are no other rules in the program.
+This is necessary in case the `END' rule checks the `FNR' and `NR'
+variables (d.c.).
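+
+   For instance, this sketch (with made-up input) has only an `END'
+rule, yet the input is still read so that `NR' can be counted:
+
+     $ printf 'a\nb\nc\n' | awk 'END { print NR }'
+     -| 3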
+
+   `BEGIN' and `END' rules must have actions; there is no default
+action for these rules since there is no current record when they run.
+
+
+File: gawk.info,  Node: I/O And BEGIN/END,  Prev: Using BEGIN/END,  Up: BEGIN/END
+
+Input/Output from `BEGIN' and `END' Rules
+.........................................
+
+   There are several (sometimes subtle) issues involved when doing I/O
+from a `BEGIN' or `END' rule.
+
+   The first has to do with the value of `$0' in a `BEGIN' rule.  Since
+`BEGIN' rules are executed before any input is read, there simply is no
+input record, and therefore no fields, when executing `BEGIN' rules.
+References to `$0' and the fields yield a null string or zero,
+depending upon the context.  One way to give `$0' a real value is to
+execute a `getline' command without a variable (*note Explicit Input
+with `getline': Getline.).  Another way is to simply assign a value to
+it.
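+
+   A sketch of the second approach; assigning to `$0' causes the
+record to be split into fields in the usual way (the text `a b c' is
+arbitrary):
+
+     $ awk 'BEGIN { $0 = "a b c"; print NF, $2 }'
+     -| 3 b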
+
+   The second point is similar to the first, but from the other
+direction.  Inside an `END' rule, what is the value of `$0' and `NF'?
+Traditionally, due largely to implementation issues, `$0' and `NF' were
+_undefined_ inside an `END' rule.  The POSIX standard specified that
+`NF' was available in an `END' rule, containing the number of fields
+from the last input record.  Due most probably to an oversight, the
+standard does not say that `$0' is also preserved, although logically
+one would think that it should be.  In fact, `gawk' does preserve the
+value of `$0' for use in `END' rules.  Be aware, however, that Unix
+`awk', and possibly other implementations, do not.
+
+   The third point follows from the first two.  What is the meaning of
+`print' inside a `BEGIN' or `END' rule?  The meaning is the same as
+always, `print $0'.  If `$0' is the null string, then this prints an
+empty line.  Many long-time `awk' programmers use `print' in `BEGIN'
+and `END' rules, to mean `print ""', relying on `$0' being null.  While
+you might generally get away with this in `BEGIN' rules, in `gawk' at
+least, it is a very bad idea in `END' rules.  It is also poor style,
+since if you want an empty line in the output, you should say so
+explicitly in your program.
+
+
+File: gawk.info,  Node: Empty,  Prev: BEGIN/END,  Up: Pattern Overview
+
+The Empty Pattern
+-----------------
+
+   An empty (i.e. non-existent) pattern is considered to match _every_
+input record.  For example, the program:
+
+     awk '{ print $1 }' BBS-list
+
+prints the first field of every record.
+
+
+File: gawk.info,  Node: Action Overview,  Prev: Pattern Overview,  Up: Patterns and Actions
+
+Overview of Actions
+===================
+
+   An `awk' program or script consists of a series of rules and
+function definitions, interspersed.  (Functions are described later.
+*Note User-defined Functions: User-defined.)
+
+   A rule contains a pattern and an action, either of which (but not
+both) may be omitted.  The purpose of the "action" is to tell `awk'
+what to do once a match for the pattern is found.  Thus, in outline, an
+`awk' program generally looks like this:
+
+     [PATTERN] [{ ACTION }]
+     [PATTERN] [{ ACTION }]
+     ...
+     function NAME(ARGS) { ... }
+     ...
+
+   An action consists of one or more `awk' "statements", enclosed in
+curly braces (`{' and `}').  Each statement specifies one thing to be
+done.  The statements are separated by newlines or semicolons.
+
+   The curly braces around an action must be used even if the action
+contains only one statement, or even if it contains no statements at
+all.  However, if you omit the action entirely, omit the curly braces as
+well.  An omitted action is equivalent to `{ print $0 }'.
+
+     /foo/  { }  # match foo, do nothing - empty action
+     /foo/       # match foo, print the record - omitted action
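+
+   The difference between the two forms can be seen by running both
+on the same input.  This is a small sketch; the three sample records
+and the shell variable names are assumptions chosen for illustration:
+

```shell
input='foo one
bar two
foo three'

# Empty action: matching records are selected, but nothing is done.
empty=$(printf '%s\n' "$input" | awk '/foo/ { }')

# Omitted action: equivalent to { print $0 }.
omitted=$(printf '%s\n' "$input" | awk '/foo/')

printf 'empty action:   [%s]\n' "$empty"
printf 'omitted action: [%s]\n' "$omitted"
```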
+
+   Here are the kinds of statements supported in `awk':
+
+   * Expressions, which can call functions or assign values to variables
+     (*note Expressions::).  Executing this kind of statement simply
+     computes the value of the expression.  This is useful when the
+     expression has side effects (*note Assignment Expressions:
+     Assignment Ops.).
+
+   * Control statements, which specify the control flow of `awk'
+     programs.  The `awk' language gives you C-like constructs (`if',
+     `for', `while', and `do') as well as a few special ones (*note
+     Control Statements in Actions: Statements.).
+
+   * Compound statements, which consist of one or more statements
+     enclosed in curly braces.  A compound statement is used in order
+     to put several statements together in the body of an `if',
+     `while', `do' or `for' statement.
+
+   * Input statements, using the `getline' command (*note Explicit
+     Input with `getline': Getline.), the `next' statement (*note The
+     `next' Statement: Next Statement.), and the `nextfile' statement
+     (*note The `nextfile' Statement: Nextfile Statement.).
+
+   * Output statements, `print' and `printf'.  *Note Printing Output:
+     Printing.
+
+   * Deletion statements, for deleting array elements.  *Note The
+     `delete' Statement: Delete.
+
+
+File: gawk.info,  Node: Statements,  Next: Built-in Variables,  Prev: Patterns and Actions,  Up: Top
+
+Control Statements in Actions
+*****************************
+
+   "Control statements" such as `if', `while', and so on control the
+flow of execution in `awk' programs.  Most of the control statements in
+`awk' are patterned on similar statements in C.
+
+   All the control statements start with special keywords such as `if'
+and `while', to distinguish them from simple expressions.
+
+   Many control statements contain other statements; for example, the
+`if' statement contains another statement which may or may not be
+executed.  The contained statement is called the "body".  If you want
+to include more than one statement in the body, group them into a
+single "compound statement" with curly braces, separating them with
+newlines or semicolons.
+
+* Menu:
+
+* If Statement::                Conditionally execute some `awk'
+                                statements.
+* While Statement::             Loop until some condition is satisfied.
+* Do Statement::                Do specified action while looping until some
+                                condition is satisfied.
+* For Statement::               Another looping statement, that provides
+                                initialization and increment clauses.
+* Break Statement::             Immediately exit the innermost enclosing loop.
+* Continue Statement::          Skip to the end of the innermost enclosing
+                                loop.
+* Next Statement::              Stop processing the current input record.
+* Nextfile Statement::          Stop processing the current file.
+* Exit Statement::              Stop execution of `awk'.
+
+
+File: gawk.info,  Node: If Statement,  Next: While Statement,  Prev: Statements,  Up: Statements
+
+The `if'-`else' Statement
+=========================
+
+   The `if'-`else' statement is `awk''s decision-making statement.  It
+looks like this:
+
+     if (CONDITION) THEN-BODY [else ELSE-BODY]
+
+The CONDITION is an expression that controls what the rest of the
+statement will do.  If CONDITION is true, THEN-BODY is executed;
+otherwise, ELSE-BODY is executed.  The `else' part of the statement is
+optional.  The condition is considered false if its value is zero or
+the null string, and true otherwise.
+
+   Here is an example:
+
+     if (x % 2 == 0)
+         print "x is even"
+     else
+         print "x is odd"
+
+   In this example, if the expression `x % 2 == 0' is true (that is,
+the value of `x' is evenly divisible by two), then the first `print'
+statement is executed, otherwise the second `print' statement is
+executed.
+
+   If the `else' appears on the same line as THEN-BODY, and THEN-BODY
+is not a compound statement (i.e. not surrounded by curly braces), then
+a semicolon must separate THEN-BODY from `else'.  To illustrate this,
+let's rewrite the previous example:
+
+     if (x % 2 == 0) print "x is even"; else
+             print "x is odd"
+
+If you forget the `;', `awk' won't be able to interpret the statement,
+and you will get a syntax error.
+
+   We would not actually write this example this way, because a human
+reader might fail to see the `else' if it were not the first thing on
+its line.
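+
+   For comparison, when THEN-BODY is a compound statement, no
+semicolon is needed before an `else' on the same line as the closing
+brace.  The following sketch assumes one number per input record; the
+sample input is illustrative only:
+

```shell
out=$(printf '4\n7\n' | awk '{
    x = $1
    if (x % 2 == 0) {
        print x, "is even"
    } else {               # follows a closing brace: no semicolon needed
        print x, "is odd"
    }
}')
printf '%s\n' "$out"
```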
+
+
+File: gawk.info,  Node: While Statement,  Next: Do Statement,  Prev: If Statement,  Up: Statements
+
+The `while' Statement
+=====================
+
+   In programming, a "loop" means a part of a program that can be
+executed two or more times in succession.
+
+   The `while' statement is the simplest looping statement in `awk'.
+It repeatedly executes a statement as long as a condition is true.  It
+looks like this:
+
+     while (CONDITION)
+       BODY
+
+Here BODY is a statement that we call the "body" of the loop, and
+CONDITION is an expression that controls how long the loop keeps
+running.
+
+   The first thing the `while' statement does is test CONDITION.  If
+CONDITION is true, it executes the statement BODY.  (The CONDITION is
+true when the value is not zero and not a null string.)  After BODY has
+been executed, CONDITION is tested again, and if it is still true, BODY
+is executed again.  This process repeats until CONDITION is no longer
+true.  If CONDITION is initially false, the body of the loop is never
+executed, and `awk' continues with the statement following the loop.
+
+   This example prints the first three fields of each record, one per
+line.
+
+     awk '{ i = 1
+            while (i <= 3) {
+                print $i
+                i++
+            }
+     }' inventory-shipped
+
+Here the body of the loop is a compound statement enclosed in braces,
+containing two statements.
+
+   The loop works like this: first, the value of `i' is set to one.
+Then, the `while' tests whether `i' is less than or equal to three.
+This is true when `i' equals one, so the `i'-th field is printed.  Then
+the `i++' increments the value of `i' and the loop repeats.  The loop
+terminates when `i' reaches four.
+
+   As you can see, a newline is not required between the condition and
+the body; but using one makes the program clearer unless the body is a
+compound statement or is very simple.  The newline after the open-brace
+that begins the compound statement is not required either, but the
+program would be harder to read without it.
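+
+   Because CONDITION is tested before the body ever runs, a `while'
+loop handles the "nothing to do" case without extra code.  As a
+sketch (the input data here is invented), this program sums the
+fields of each record; for the empty record, the body is never
+executed and the sum stays at zero:
+

```shell
out=$(printf '1 2 3\n\n10 20\n' | awk '{
    sum = 0
    i = 1
    while (i <= NF) {      # false at once for an empty record
        sum += $i
        i++
    }
    print NR ": " sum
}')
printf '%s\n' "$out"
```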
+
+
+File: gawk.info,  Node: Do Statement,  Next: For Statement,  Prev: While Statement,  Up: Statements
+
+The `do'-`while' Statement
+==========================
+
+   The `do' loop is a variation of the `while' looping statement.  The
+`do' loop executes the BODY once, and then repeats BODY as long as
+CONDITION is true.  It looks like this:
+
+     do
+       BODY
+     while (CONDITION)
+
+   Even if CONDITION is false at the start, BODY is executed at least
+once (and only once, unless executing BODY makes CONDITION true).
+Contrast this with the corresponding `while' statement:
+
+     while (CONDITION)
+       BODY
+
+This statement does not execute BODY even once if CONDITION is false to
+begin with.
+
+   Here is an example of a `do' statement:
+
+     awk '{ i = 1
+            do {
+               print $0
+               i++
+            } while (i <= 10)
+     }'
+
+This program prints each input record ten times.  It isn't a very
+realistic example, since in this case an ordinary `while' would do just
+as well.  But this reflects actual experience; there is only
+occasionally a real use for a `do' statement.
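+
+   The difference from `while' shows up only when CONDITION is false
+at the start.  The following sketch (variable names and messages are
+illustrative) runs both kinds of loop with a condition that is never
+true; only the `do' body executes:
+

```shell
out=$(awk 'BEGIN {
    i = 5
    do {                   # body runs once before the test
        print "do body ran with i =", i
        i++
    } while (i <= 3)

    j = 5
    while (j <= 3) {       # test comes first, so the body never runs
        print "while body ran with j =", j
        j++
    }
    print "done"
}')
printf '%s\n' "$out"
```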
+
+
+File: gawk.info,  Node: For Statement,  Next: Break Statement,  Prev: Do Statement,  Up: Statements
+
+The `for' Statement
+===================
+
+   The `for' statement makes it more convenient to count iterations of a
+loop.  The general form of the `for' statement looks like this:
+
+     for (INITIALIZATION; CONDITION; INCREMENT)
+       BODY
+
+The INITIALIZATION, CONDITION and INCREMENT parts are arbitrary `awk'
+expressions, and BODY stands for any `awk' statement.
+
+   The `for' statement starts by executing INITIALIZATION.  Then, as
+long as CONDITION is true, it repeatedly executes BODY and then
+INCREMENT.  Typically INITIALIZATION sets a variable to either zero or
+one, INCREMENT adds one to it, and CONDITION compares it against the
+desired number of iterations.
+
+   Here is an example of a `for' statement:
+
+     awk '{ for (i = 1; i <= 3; i++)
+               print $i
+     }' inventory-shipped
+
+This prints the first three fields of each input record, one field per
+line.
+
+   You cannot set more than one variable in the INITIALIZATION part
+unless you use a multiple assignment statement such as `x = y = 0',
+which is possible only if all the initial values are equal.  (But you
+can initialize additional variables by writing their assignments as
+separate statements preceding the `for' loop.)
+
+   The same is true of the INCREMENT part; to increment additional
+variables, you must write separate statements at the end of the loop.
+The C compound expression, using C's comma operator, would be useful in
+this context, but it is not supported in `awk'.
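+
+   One way to follow this advice is sketched here.  A second
+variable, `j' (a name chosen only for this illustration), is
+initialized before the loop and updated by a separate statement at
+the end of the body:
+

```shell
out=$(awk 'BEGIN {
    j = 10                       # second variable, initialized separately
    for (i = 1; i <= 3; i++) {
        print i, j
        j -= 2                   # second "increment", done inside the body
    }
}')
printf '%s\n' "$out"
```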
+
+   Most often, INCREMENT is an increment expression, as in the example
+above.  But this is not required; it can be any expression whatever.
+For example, this statement prints all the powers of two between one
+and 100:
+
+     for (i = 1; i <= 100; i *= 2)
+       print i
+
+   Any of the three expressions in the parentheses following the `for'
+may be omitted if there is nothing to be done there.  Thus,
+`for (; x > 0;)' is equivalent to `while (x > 0)'.  If the CONDITION is
+omitted, it is treated as TRUE, effectively yielding an "infinite loop"
+(i.e. a loop that will never terminate).
+
+   In most cases, a `for' loop is an abbreviation for a `while' loop,
+as shown here:
+
+     INITIALIZATION
+     while (CONDITION) {
+       BODY
+       INCREMENT
+     }
+
+The only exception is when the `continue' statement (*note The
+`continue' Statement: Continue Statement.) is used inside the loop;
+changing a `for' statement to a `while' statement in this way can
+change the effect of the `continue' statement inside the loop.
+
+   There is an alternate version of the `for' loop, for iterating over
+all the indices of an array:
+
+     for (i in array)
+         DO SOMETHING WITH array[i]
+
+*Note Scanning All Elements of an Array: Scanning an Array, for more
+information on this version of the `for' loop.
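+
+   As a brief sketch of this form, the following program counts how
+often each word appears in the first field.  The array name `count'
+and the input are illustrative.  The order in which `awk' visits the
+indices is not specified, so the output is passed through `sort' to
+make it predictable:
+

```shell
out=$(printf 'apple\npear\napple\napple\npear\n' | awk '
    { count[$1]++ }                  # one array element per distinct word
    END {
        for (word in count)          # visits every index, in no fixed order
            print word, count[word]
    }' | sort)
printf '%s\n' "$out"
```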
+
+   The `awk' language has a `for' statement in addition to a `while'
+statement because often a `for' loop is both less work to type and more
+natural to think of.  Counting the number of iterations is very common
+in loops.  It can be easier to think of this counting as part of
+looping rather than as something to do inside the loop.
+
+   The next section has more complicated examples of `for' loops.
+
+
+File: gawk.info,  Node: Break Statement,  Next: Continue Statement,  Prev: For Statement,  Up: Statements
+
+The `break' Statement
+=====================
+
+   The `break' statement jumps out of the innermost `for', `while', or
+`do' loop that encloses it.  The following example finds the smallest
+divisor of any integer, and also identifies prime numbers:
+
+     awk '# find smallest divisor of num
+          { num = $1
+            for (div = 2; div*div <= num; div++)
+              if (num % div == 0)
+                break
+            if (num % div == 0)
+              printf "Smallest divisor of %d is %d\n", num, div
+            else
+              printf "%d is prime\n", num
+          }'
+
+   When the remainder is zero in the first `if' statement, `awk'
+immediately "breaks out" of the containing `for' loop.  This means that
+`awk' proceeds immediately to the statement following the loop and
+continues processing.  (This is very different from the `exit'
+statement which stops the entire `awk' program.  *Note The `exit'
+Statement: Exit Statement.)
+
+   Here is another program equivalent to the previous one.  It
+illustrates how the CONDITION of a `for' or `while' could just as well
+be replaced with a `break' inside an `if':
+
+     awk '# find smallest divisor of num
+          { num = $1
+            for (div = 2; ; div++) {
+              if (num % div == 0) {
+                printf "Smallest divisor of %d is %d\n", num, div
+                break
+              }
+              if (div*div > num) {
+                printf "%d is prime\n", num
+                break
+              }
+            }
+          }'
+
+   As described above, the `break' statement has no meaning when used
+outside the body of a loop.  However, although it was never documented,
+historical implementations of `awk' have treated the `break' statement
+outside of a loop as if it were a `next' statement (*note The `next'
+Statement: Next Statement.).  Recent versions of Unix `awk' no longer
+allow this usage.  `gawk' will support this use of `break' only if
+`--traditional' has been specified on the command line (*note Command
+Line Options: Options.).  Otherwise, it will be treated as an error,
+since the POSIX standard specifies that `break' should only be used
+inside the body of a loop (d.c.).
+
+
+File: gawk.info,  Node: Continue Statement,  Next: Next Statement,  Prev: Break Statement,  Up: Statements
+
+The `continue' Statement
+========================
+
+   The `continue' statement, like `break', is used only inside `for',
+`while', and `do' loops.  It skips over the rest of the loop body,
+causing the next cycle around the loop to begin immediately.  Contrast
+this with `break', which jumps out of the loop altogether.
+
+   The `continue' statement in a `for' loop directs `awk' to skip the
+rest of the body of the loop, and resume execution with the
+increment-expression of the `for' statement.  The following program
+illustrates this fact:
+
+     awk 'BEGIN {
+          for (x = 0; x <= 20; x++) {
+              if (x == 5)
+                  continue
+              printf "%d ", x
+          }
+          print ""
+     }'
+
+This program prints all the numbers from zero to 20, except for five,
+for which the `printf' is skipped.  Since the increment `x++' is not
+skipped, `x' does not remain stuck at five.  Contrast the `for' loop
+above with this `while' loop:
+
+     awk 'BEGIN {
+          x = 0
+          while (x <= 20) {
+              if (x == 5)
+                  continue
+              printf "%d ", x
+              x++
+          }
+          print ""
+     }'
+
+This program loops forever once `x' gets to five.
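+
+   One possible repair, shown here only as a sketch, is to advance
+`x' before the `continue', so that the `while' version behaves like
+the `for' version:
+

```shell
out=$(awk 'BEGIN {
    x = 0
    while (x <= 20) {
        if (x == 5) {
            x++            # advance before skipping, or the loop never ends
            continue
        }
        printf "%d ", x
        x++
    }
    print ""
}')
printf '%s\n' "$out"
```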
+
+   As described above, the `continue' statement has no meaning when
+used outside the body of a loop.  However, although it was never
+documented, historical implementations of `awk' have treated the
+`continue' statement outside of a loop as if it were a `next' statement
+(*note The `next' Statement: Next Statement.).  Recent versions of Unix
+`awk' no longer allow this usage.  `gawk' will support this use of
+`continue' only if `--traditional' has been specified on the command
+line (*note Command Line Options: Options.).  Otherwise, it will be
+treated as an error, since the POSIX standard specifies that `continue'
+should only be used inside the body of a loop (d.c.).
+
+
+File: gawk.info,  Node: Next Statement,  Next: Nextfile Statement,  Prev: Continue Statement,  Up: Statements
+
+The `next' Statement
+====================
+
+   The `next' statement forces `awk' to immediately stop processing the
+current record and go on to the next record.  This means that no
+further rules are executed for the current record.  The rest of the
+current rule's action is not executed either.
+
+   Contrast this with the effect of the `getline' function (*note
+Explicit Input with `getline': Getline.).  That too causes `awk' to
+read the next record immediately, but it does not alter the flow of
+control in any way.  So the rest of the current action executes with a
+new input record.
+
+   At the highest level, `awk' program execution is a loop that reads
+an input record and then tests each rule's pattern against it.  If you
+think of this loop as a `for' statement whose body contains the rules,
+then the `next' statement is analogous to a `continue' statement: it
+skips to the end of the body of this implicit loop, and executes the
+increment (which reads another record).
+
+   For example, if your `awk' program works only on records with four
+fields, and you don't want it to fail when given bad input, you might
+use this rule near the beginning of the program:
+
+     NF != 4 {
+       err = sprintf("%s:%d: skipped: NF != 4", FILENAME, FNR)
+       print err > "/dev/stderr"
+       next
+     }
+
+so that the following rules will not see the bad record.  The error
+message is redirected to the standard error output stream, as error
+messages should be.  *Note Special File Names in `gawk': Special Files.
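+
+   The same idiom works for any kind of record that later rules
+should not see.  As a small sketch (the input and the counter `n' are
+invented for illustration), this program skips comment lines and
+numbers the remaining records:
+

```shell
out=$(printf '# header\nalpha 1\n# note\nbeta 2\n' | awk '
    /^#/ { next }               # comment line: skip all remaining rules
    { n++; print n ": " $0 }    # only reached for non-comment records
')
printf '%s\n' "$out"
```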
+
+   According to the POSIX standard, the behavior is undefined if the
+`next' statement is used in a `BEGIN' or `END' rule.  `gawk' will treat
+it as a syntax error.  Although POSIX permits it, some other `awk'
+implementations don't allow the `next' statement inside function bodies
+(*note User-defined Functions: User-defined.).  Just like any other
+`next' statement, a `next' inside a function body reads the next record
+and starts processing it with the first rule in the program.
+
+   If the `next' statement causes the end of the input to be reached,
+then the code in any `END' rules will be executed.  *Note The `BEGIN'
+and `END' Special Patterns: BEGIN/END.
+
+   *Caution:* Some `awk' implementations generate a run-time error if
+you use the `next' statement inside a user-defined function (*note
+User-defined Functions: User-defined.).  `gawk' does not have this
+problem.
+
+
+File: gawk.info,  Node: Nextfile Statement,  Next: Exit Statement,  Prev: Next Statement,  Up: Statements
+
+The `nextfile' Statement
+========================
+
+   `gawk' provides the `nextfile' statement, which is similar to the
+`next' statement.  However, instead of abandoning processing of the
+current record, the `nextfile' statement instructs `gawk' to stop
+processing the current data file.
+
+   Upon execution of the `nextfile' statement, `FILENAME' is updated to
+the name of the next data file listed on the command line, `FNR' is
+reset to one, `ARGIND' is incremented, and processing starts over with
+the first rule in the program.  *Note Built-in Variables::.
+
+   If the `nextfile' statement causes the end of the input to be
+reached, then the code in any `END' rules will be executed.  *Note The
+`BEGIN' and `END' Special Patterns: BEGIN/END.
+
+   The `nextfile' statement is a `gawk' extension; it is not
+(currently) available in any other `awk' implementation.  *Note
+Implementing `nextfile' as a Function: Nextfile Function, for a
+user-defined function you can use to simulate the `nextfile' statement.
+
+   The `nextfile' statement is useful if you have many data files to
+process, and you do not expect to need every record in every file.
+Normally, in order to move on to the next data file, you would have to
+continue scanning the unwanted records.  The `nextfile' statement
+accomplishes this much more efficiently.
+
+   *Caution:*  Versions of `gawk' prior to 3.0 used two words (`next
+file') for the `nextfile' statement.  This was changed in 3.0 to one
+word, since the treatment of `file' was inconsistent.  When it appeared
+after `next', it was a keyword.  Otherwise, it was a regular
+identifier.  The old usage is still accepted.  However, `gawk' will
+generate a warning message, and support for `next file' will be
+discontinued in a future version of `gawk'.
+
+
+File: gawk.info,  Node: Exit Statement,  Prev: Nextfile Statement,  Up: Statements
+
+The `exit' Statement
+====================
+
+   The `exit' statement causes `awk' to immediately stop executing the
+current rule and to stop processing input; any remaining input is
+ignored.  It looks like this:
+
+     exit [RETURN CODE]
+
+   If an `exit' statement is executed from a `BEGIN' rule, the program
+stops processing everything immediately.  No input records are read.
+However, if an `END' rule is present, it is executed (*note The `BEGIN'
+and `END' Special Patterns: BEGIN/END.).
+
+   If `exit' is used as part of an `END' rule, it causes the program to
+stop immediately.
+
+   An `exit' statement that is not part of a `BEGIN' or `END' rule
+stops the execution of any further automatic rules for the current
+record, skips reading any remaining input records, and executes the
+`END' rule if there is one.
+
+   If you do not want the `END' rule to do its job in this case, you
+can set a variable to non-zero before the `exit' statement, and check
+that variable in the `END' rule.  *Note Assertions: Assert Function,
+for an example that does this.
+
+   If an argument is supplied to `exit', its value is used as the exit
+status code for the `awk' process.  If no argument is supplied, `exit'
+returns status zero (success).  In the case where an argument is
+supplied to a first `exit' statement, and then `exit' is called a
+second time with no argument, the previously supplied exit value is
+used (d.c.).
+
+   For example, let's say you've discovered an error condition you
+really don't know how to handle.  Conventionally, programs report this
+by exiting with a non-zero status.  Your `awk' program can do this
+using an `exit' statement with a non-zero argument.  Here is an example:
+
+     BEGIN {
+            if (("date" | getline date_now) <= 0) {
+              print "Can't get system date" > "/dev/stderr"
+              exit 1
+            }
+            print "current date is", date_now
+            close("date")
+     }
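+
+   Two of the points above, skipping the work of the `END' rule and
+returning a non-zero status, can be combined.  The following is a
+sketch under stated assumptions: the flag variable `failed', the
+status value 2, and the input are all chosen for illustration:
+

```shell
status=0
out=$(printf 'ok\nbad\nok\n' | awk '
    /bad/ { failed = 1; exit 2 }     # stop reading input; END still runs
    { n++ }
    END {
        if (!failed)                 # skip the summary after an error exit
            print n, "records"
    }') || status=$?
printf 'output: [%s], exit status: %s\n' "$out" "$status"
```

+   The shell sees the status given to the first `exit', and the
+summary line is not printed.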
+
+
+File: gawk.info,  Node: Built-in Variables,  Next: Arrays,  Prev: Statements,  Up: Top
+
+Built-in Variables
+******************
+
+   Most `awk' variables are available for you to use for your own
+purposes; they never change except when your program assigns values to
+them, and never affect anything except when your program examines them.
+However, a few variables in `awk' have special built-in meanings.  Some
+of them `awk' examines automatically, so that they enable you to tell
+`awk' how to do certain things.  Others are set automatically by `awk',
+so that they carry information from the internal workings of `awk' to
+your program.
+
+   This chapter documents all the built-in variables of `gawk'.  Most
+of them are also documented in the chapters describing their areas of
+activity.
+
+* Menu:
+
+* User-modified::               Built-in variables that you change to control
+                                `awk'.
+* Auto-set::                    Built-in variables where `awk' gives you
+                                information.
+* ARGC and ARGV::               Ways to use `ARGC' and `ARGV'.
+
+
+File: gawk.info,  Node: User-modified,  Next: Auto-set,  Prev: Built-in Variables,  Up: Built-in Variables
+
+Built-in Variables that Control `awk'
+=====================================
+
+   This is an alphabetical list of the variables which you can change to
+control how `awk' does certain things. Those variables that are
+specific to `gawk' are marked with an asterisk, `*'.
+
+`CONVFMT'
+     This string controls conversion of numbers to strings (*note
+     Conversion of Strings and Numbers: Conversion.).  It works by
+     being passed, in effect, as the first argument to the `sprintf'
+     function (*note Built-in Functions for String Manipulation: String
+     Functions.).  Its default value is `"%.6g"'.  `CONVFMT' was
+     introduced by the POSIX standard.
+
+`FIELDWIDTHS *'
+     This is a space-separated list of column widths that tells `gawk'
+     how to split input with fixed, columnar boundaries.  It is an
+     experimental feature.  Assigning to `FIELDWIDTHS' overrides the
+     use of `FS' for field splitting.  *Note Reading Fixed-width Data:
+     Constant Size, for more information.
+
+     If `gawk' is in compatibility mode (*note Command Line Options:
+     Options.), then `FIELDWIDTHS' has no special meaning, and field
+     splitting operations are done based exclusively on the value of
+     `FS'.
+
+`FS'
+     `FS' is the input field separator (*note Specifying How Fields are
+     Separated: Field Separators.).  The value is a single-character
+     string or a multi-character regular expression that matches the
+     separations between fields in an input record.  If the value is
+     the null string (`""'), then each character in the record becomes
+     a separate field.
+
+     The default value is `" "', a string consisting of a single space.
+     As a special exception, this value means that any sequence of
+     spaces, tabs, and/or newlines is a single separator.(1)  It also
+     causes spaces, tabs, and newlines at the beginning and end of a
+     record to be ignored.
+
+     You can set the value of `FS' on the command line using the `-F'
+     option:
+
+          awk -F, 'PROGRAM' INPUT-FILES
+
+     If `gawk' is using `FIELDWIDTHS' for field-splitting, assigning a
+     value to `FS' will cause `gawk' to return to the normal,
+     `FS'-based, field splitting. An easy way to do this is to simply
+     say `FS = FS', perhaps with an explanatory comment.
+
+`IGNORECASE *'
+     If `IGNORECASE' is non-zero or non-null, then all string
+     comparisons, and all regular expression matching are
+     case-independent.  Thus, regexp matching with `~' and `!~', and
+     the `gensub', `gsub', `index', `match', `split' and `sub'
+     functions, record termination with `RS', and field splitting with
+     `FS' all ignore case when doing their particular regexp operations.
+     *Note Case-sensitivity in Matching: Case-sensitivity.
+
+     If `gawk' is in compatibility mode (*note Command Line Options:
+     Options.), then `IGNORECASE' has no special meaning, and string
+     and regexp operations are always case-sensitive.
+
+`OFMT'
+     This string controls conversion of numbers to strings (*note
+     Conversion of Strings and Numbers: Conversion.) for printing with
+     the `print' statement.  It works by being passed, in effect, as
+     the first argument to the `sprintf' function (*note Built-in
+     Functions for String Manipulation: String Functions.).  Its
+     default value is `"%.6g"'.  Earlier versions of `awk' also used
+     `OFMT' to specify the format for converting numbers to strings in
+     general expressions; this is now done by `CONVFMT'.
+
+`OFS'
+     This is the output field separator (*note Output Separators::).
+     It is output between the fields output by a `print' statement.  Its
+     default value is `" "', a string consisting of a single space.
+
+`ORS'
+     This is the output record separator.  It is output at the end of
+     every `print' statement.  Its default value is `"\n"'.  (*Note
+     Output Separators::.)
+
+`RS'
+     This is `awk''s input record separator.  Its default value is a
+     string containing a single newline character, which means that an
+     input record consists of a single line of text.  It can also be
+     the null string, in which case records are separated by runs of
+     blank lines, or a regexp, in which case records are separated by
+     matches of the regexp in the input text.  (*Note How Input is
+     Split into Records: Records.)
+
+`SUBSEP'
+     `SUBSEP' is the subscript separator.  It has the default value of
+     `"\034"', and is used to separate the parts of the indices of a
+     multi-dimensional array.  Thus, the expression `foo["A", "B"]'
+     really accesses `foo["A\034B"]' (*note Multi-dimensional Arrays:
+     Multi-dimensional.).
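+
+   Several of these variables can be seen at work in a short sketch.
+The input records, the separator values, and the shell variable
+names are assumptions made for illustration; only variables that are
+not marked as `gawk' extensions are used:
+

```shell
# FS, OFS, ORS, and SUBSEP.
seps=$(printf 'root:x:0\nbin:x:1\n' | awk '
    BEGIN { FS = ":"; OFS = "-"; ORS = ";\n" }
    { print $1, $3 }                 # OFS between fields, ORS after each record
    END {
        a["A", "B"] = 1              # stored under the key "A" SUBSEP "B"
        if (("A" SUBSEP "B") in a)
            print "SUBSEP key found"
    }')
printf '%s\n' "$seps"

# CONVFMT versus OFMT.
fmt=$(awk 'BEGIN {
    CONVFMT = "%.2g"; OFMT = "%.3f"
    x = 3.14159
    s = x ""          # number-to-string conversion uses CONVFMT
    print s
    print x           # print uses OFMT for numbers
}')
printf '%s\n' "$fmt"
```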
+
+   ---------- Footnotes ----------
+
+   (1) In POSIX `awk', newline does not count as whitespace.
+
+
+File: gawk.info,  Node: Auto-set,  Next: ARGC and ARGV,  Prev: User-modified,  Up: Built-in Variables
+
+Built-in Variables that Convey Information
+==========================================
+
+   This is an alphabetical list of the variables that are set
+automatically by `awk' on certain occasions in order to provide
+information to your program.  Those variables that are specific to
+`gawk' are marked with an asterisk, `*'.
+
+`ARGC'
+`ARGV'
+     The command-line arguments available to `awk' programs are stored
+     in an array called `ARGV'.  `ARGC' is the number of command-line
+     arguments present.  *Note Other Command Line Arguments: Other
+     Arguments.  Unlike most `awk' arrays, `ARGV' is indexed from zero
+     to `ARGC' - 1.  For example:
+
+          $ awk 'BEGIN {
+          >        for (i = 0; i < ARGC; i++)
+          >            print ARGV[i]
+          >      }' inventory-shipped BBS-list
+          -| awk
+          -| inventory-shipped
+          -| BBS-list
+
+     In this example, `ARGV[0]' contains `"awk"', `ARGV[1]' contains
+     `"inventory-shipped"', and `ARGV[2]' contains `"BBS-list"'.  The
+     value of `ARGC' is three, one more than the index of the last
+     element in `ARGV', since the elements are numbered from zero.
+
+     The names `ARGC' and `ARGV', as well as the convention of indexing
+     the array from zero to `ARGC' - 1, are derived from the C
+     language's method of accessing command line arguments.  *Note
+     Using `ARGC' and `ARGV': ARGC and ARGV, for information about how
+     `awk' uses these variables.
+
+`ARGIND *'
+     The index in `ARGV' of the current file being processed.  Every
+     time `gawk' opens a new data file for processing, it sets `ARGIND'
+     to the index in `ARGV' of the file name.  When `gawk' is
+     processing the input files, it is always true that `FILENAME ==
+     ARGV[ARGIND]'.
+
+     This variable is useful in file processing; it allows you to tell
+     how far along you are in the list of data files, and to
+     distinguish between successive instances of the same filename on
+     the command line.
+
+     While you can change the value of `ARGIND' within your `awk'
+     program, `gawk' will automatically set it to a new value when the
+     next file is opened.
+
+     This variable is a `gawk' extension. In other `awk'
+     implementations, or if `gawk' is in compatibility mode (*note
+     Command Line Options: Options.), it is not special.
+
+`ENVIRON'
+     An associative array that contains the values of the environment.
+     The array indices are the environment variable names; the values
+     are the values of the particular environment variables.  For
+     example, `ENVIRON["HOME"]' might be `/home/arnold'.  Changing this
+     array does not affect the environment passed on to any programs
+     that `awk' may spawn via redirection or the `system' function.
+     (In a future version of `gawk', it may do so.)
+
+     Some operating systems may not have environment variables.  On
+     such systems, the `ENVIRON' array is empty (except for
+     `ENVIRON["AWKPATH"]').
+
+`ERRNO *'
+     If a system error occurs while doing a redirection for `getline',
+     during a read for `getline', or during a `close' operation, then
+     `ERRNO' will contain a string describing the error.
+
+     This variable is a `gawk' extension. In other `awk'
+     implementations, or if `gawk' is in compatibility mode (*note
+     Command Line Options: Options.), it is not special.
+
+`FILENAME'
+     This is the name of the file that `awk' is currently reading.
+     When no data files are listed on the command line, `awk' reads
+     from the standard input, and `FILENAME' is set to `"-"'.
+     `FILENAME' is changed each time a new file is read (*note Reading
+     Input Files: Reading Files.).  Inside a `BEGIN' rule, the value of
+     `FILENAME' is `""', since there are no input files being processed
+     yet.(1) (d.c.)
+
+`FNR'
+     `FNR' is the current record number in the current file.  `FNR' is
+     incremented each time a new record is read (*note Explicit Input
+     with `getline': Getline.).  It is reinitialized to zero each time
+     a new input file is started.
+
+`NF'
+     `NF' is the number of fields in the current input record.  `NF' is
+     set each time a new record is read, when a new field is created,
+     or when `$0' changes (*note Examining Fields: Fields.).
+
+`NR'
+     This is the number of input records `awk' has processed since the
+     beginning of the program's execution (*note How Input is Split
+     into Records: Records.).  `NR' is set each time a new record is
+     read.
+
+`RLENGTH'
+     `RLENGTH' is the length of the substring matched by the `match'
+     function (*note Built-in Functions for String Manipulation: String
+     Functions.).  `RLENGTH' is set by invoking the `match' function.
+     Its value is the length of the matched string, or -1 if no match
+     was found.
+
+`RSTART'
+     `RSTART' is the start-index in characters of the substring matched
+     by the `match' function (*note Built-in Functions for String
+     Manipulation: String Functions.).  `RSTART' is set by invoking the
+     `match' function.  Its value is the position of the string where
+     the matched substring starts, or zero if no match was found.
+
+`RT *'
+     `RT' is set each time a record is read. It contains the input text
+     that matched the text denoted by `RS', the record separator.
+
+     This variable is a `gawk' extension. In other `awk'
+     implementations, or if `gawk' is in compatibility mode (*note
+     Command Line Options: Options.), it is not special.
+
+   A side note about `NR' and `FNR'.  `awk' simply increments both of
+these variables each time it reads a record, instead of setting them to
+the absolute value of the number of records read.  This means that your
+program can change these variables, and their new values will be
+incremented for each record (d.c.).  For example:
+
+     $ echo '1
+     > 2
+     > 3
+     > 4' | awk 'NR == 2 { NR = 17 }
+     > { print NR }'
+     -| 1
+     -| 17
+     -| 18
+     -| 19
+
+Before `FNR' was added to the `awk' language (*note Major Changes
+between V7 and SVR3.1: V7/SVR3.1.), many `awk' programs used this
+feature to track the number of records in a file by resetting `NR' to
+zero when `FILENAME' changed.
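+
+   A rough sketch of that older idiom follows.  The variable name
+`prev' and the two scratch files are made up for this illustration,
+and the count is restarted at one rather than zero because the rule
+runs after the record has already been read:
+
```shell
# Hypothetical scratch files, created only for this demonstration.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one"
printf 'c\n' > "$dir/two"

# Restart NR at the first record of each new file (prev is a made-up name).
out=$(cd "$dir" && awk 'FILENAME != prev { NR = 1; prev = FILENAME }
{ print FILENAME ": " NR }' one two)
printf '%s\n' "$out"
rm -r "$dir"
```
+
+The output is `one: 1', `one: 2' and `two: 1', which is what `FNR'
+now provides directly.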
+
+   ---------- Footnotes ----------
+
+   (1) Some early implementations of Unix `awk' initialized `FILENAME'
+to `"-"', even if there were data files to be processed. This behavior
+was incorrect, and should not be relied upon in your programs.
+
+
+File: gawk.info,  Node: ARGC and ARGV,  Prev: Auto-set,  Up: Built-in Variables
+
+Using `ARGC' and `ARGV'
+=======================
+
+   In *Note Built-in Variables that Convey Information: Auto-set, you
+saw this program describing the information contained in `ARGC' and
+`ARGV':
+
+     $ awk 'BEGIN {
+     >        for (i = 0; i < ARGC; i++)
+     >            print ARGV[i]
+     >      }' inventory-shipped BBS-list
+     -| awk
+     -| inventory-shipped
+     -| BBS-list
+
+In this example, `ARGV[0]' contains `"awk"', `ARGV[1]' contains
+`"inventory-shipped"', and `ARGV[2]' contains `"BBS-list"'.
+
+   Notice that the `awk' program is not entered in `ARGV'.  The other
+special command line options, with their arguments, are also not
+entered.  But variable assignments on the command line _are_ treated as
+arguments, and do show up in the `ARGV' array.
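+
+   As a small illustration, the following sketch lists every argument
+after `ARGV[0]'.  The assignment `n=1' is an arbitrary example; since
+the program has only a `BEGIN' rule, no file is actually opened:
+
```shell
# BEGIN-only program: no input is read, so BBS-list need not exist.
out=$(awk 'BEGIN {
    for (i = 1; i < ARGC; i++)
        print i, ARGV[i]
}' n=1 BBS-list)
printf '%s\n' "$out"
```
+
+This prints `1 n=1' and `2 BBS-list'; the program text itself does not
+appear.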
+
+   Your program can alter `ARGC' and the elements of `ARGV'.  Each time
+`awk' reaches the end of an input file, it uses the next element of
+`ARGV' as the name of the next input file.  By storing a different
+string there, your program can change which files are read.  You can
+use `"-"' to represent the standard input.  By storing additional
+elements and incrementing `ARGC' you can cause additional files to be
+read.
+
+   If you decrease the value of `ARGC', that eliminates input files
+from the end of the list.  By recording the old value of `ARGC'
+elsewhere, your program can treat the eliminated arguments as something
+other than file names.
+
+   To eliminate a file from the middle of the list, store the null
+string (`""') into `ARGV' in place of the file's name.  As a special
+feature, `awk' ignores file names that have been replaced with the null
+string.  You may also use the `delete' statement to remove elements from
+`ARGV' (*note The `delete' Statement: Delete.).
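+
+   A minimal sketch of the null-string technique, assuming that the
+first file named on the command line should be skipped.  The file
+names `first' and `second' are hypothetical scratch files:
+
```shell
# Hypothetical scratch files, created only for this demonstration.
dir=$(mktemp -d)
printf 'skipped\n' > "$dir/first"
printf 'kept\n' > "$dir/second"

# Replacing ARGV[1] with the null string makes awk ignore that slot.
out=$(cd "$dir" && awk 'BEGIN { ARGV[1] = "" }
{ print FILENAME ": " $0 }' first second)
printf '%s\n' "$out"
rm -r "$dir"
```
+
+Only `second: kept' is printed, because `first' is never opened.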
+
+   All of these actions are typically done from the `BEGIN' rule,
+before actual processing of the input begins.  *Note Splitting a Large
+File Into Pieces: Split Program, and see *Note Duplicating Output Into
+Multiple Files: Tee Program, for an example of each way of removing
+elements from `ARGV'.
+
+   The following fragment processes `ARGV' in order to examine, and
+then remove, command line options.
+
+     BEGIN {
+         for (i = 1; i < ARGC; i++) {
+             if (ARGV[i] == "-v")
+                 verbose = 1
+             else if (ARGV[i] == "-d")
+                 debug = 1
+             else if (ARGV[i] ~ /^-./) {
+                 e = sprintf("%s: unrecognized option -- %c",
+                         ARGV[0], substr(ARGV[i], 2, 1))
+                 print e > "/dev/stderr"
+             } else
+                 break
+             delete ARGV[i]
+         }
+     }
+
+
+File: gawk.info,  Node: Arrays,  Next: Built-in,  Prev: Built-in Variables,  Up: Top
+
+Arrays in `awk'
+***************
+
+   An "array" is a table of values, called "elements".  The elements of
+an array are distinguished by their indices.  "Indices" may be either
+numbers or strings.  `awk' maintains a single set of names that may be
+used for naming variables, arrays and functions (*note User-defined
+Functions: User-defined.).  Thus, you cannot have a variable and an
+array with the same name in the same `awk' program.
+
+* Menu:
+
+* Array Intro::                 Introduction to Arrays
+* Reference to Elements::       How to examine one element of an array.
+* Assigning Elements::          How to change an element of an array.
+* Array Example::               Basic Example of an Array
+* Scanning an Array::           A variation of the `for' statement. It
+                                loops through the indices of an array's
+                                existing elements.
+* Delete::                      The `delete' statement removes an element
+                                from an array.
+* Numeric Array Subscripts::    How to use numbers as subscripts in
+                                `awk'.
+* Uninitialized Subscripts::    Using uninitialized variables as subscripts.
+* Multi-dimensional::           Emulating multi-dimensional arrays in
+                                `awk'.
+* Multi-scanning::              Scanning multi-dimensional arrays.
+
+
+File: gawk.info,  Node: Array Intro,  Next: Reference to Elements,  Prev: Arrays,  Up: Arrays
+
+Introduction to Arrays
+======================
+
+   The `awk' language provides one-dimensional "arrays" for storing
+groups of related strings or numbers.
+
+   Every `awk' array must have a name.  Array names have the same
+syntax as variable names; any valid variable name would also be a valid
+array name.  But you cannot use one name in both ways (as an array and
+as a variable) in one `awk' program.
+
+   Arrays in `awk' superficially resemble arrays in other programming
+languages; but there are fundamental differences.  In `awk', you don't
+need to specify the size of an array before you start to use it.
+Additionally, any number or string in `awk' may be used as an array
+index, not just consecutive integers.
+
+   In most other languages, you have to "declare" an array and specify
+how many elements or components it contains.  In such languages, the
+declaration causes a contiguous block of memory to be allocated for that
+many elements.  An index in the array usually must be a non-negative
+integer; for example, the index zero specifies the first element in the
+array, which is actually stored at the beginning of the block of
+memory.  Index one specifies the second element, which is stored in
+memory right after the first element, and so on.  It is impossible to
+add more elements to the array, because it has room for only as many
+elements as you declared.  (Some languages allow arbitrary starting and
+ending indices, e.g., `15 .. 27', but the size of the array is still
+fixed when the array is declared.)
+
+   A contiguous array of four elements might look like this,
+conceptually, if the element values are eight, `"foo"', `""' and 30:
+
+     +---------+---------+--------+---------+
+     |    8    |  "foo"  |   ""   |    30   |    value
+     +---------+---------+--------+---------+
+          0         1         2         3        index
+
+Only the values are stored; the indices are implicit from the order of
+the values.  Eight is the value at index zero, because eight appears in
+the position with zero elements before it.
+
+   Arrays in `awk' are different: they are "associative".  This means
+that each array is a collection of pairs: an index, and its
+corresponding array element value:
+
+     Element 4     Value 30
+     Element 2     Value "foo"
+     Element 1     Value 8
+     Element 3     Value ""
+
+We have shown the pairs in jumbled order because their order is
+irrelevant.
+
+   One advantage of associative arrays is that new pairs can be added
+at any time.  For example, suppose we add to the above array a tenth
+element whose value is `"number ten"'.  The result is this:
+
+     Element 10    Value "number ten"
+     Element 4     Value 30
+     Element 2     Value "foo"
+     Element 1     Value 8
+     Element 3     Value ""
+
+Now the array is "sparse", which just means some indices are missing:
+it has elements 1-4 and 10, but doesn't have elements 5, 6, 7, 8, or 9.
+
+   Another consequence of associative arrays is that the indices don't
+have to be positive integers.  Any number, or even a string, can be an
+index.  For example, here is an array which translates words from
+English into French:
+
+     Element "dog" Value "chien"
+     Element "cat" Value "chat"
+     Element "one" Value "un"
+     Element 1     Value "un"
+
+Here we decided to translate the number one in both spelled-out and
+numeric form--thus illustrating that a single array can have both
+numbers and strings as indices.  (In fact, array subscripts are always
+strings; this is discussed in more detail in *Note Using Numbers to
+Subscript Arrays: Numeric Array Subscripts.)
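+
+   A minimal sketch of how such a table could be built and consulted.
+The array name `french' is chosen just for this example:
+
```shell
out=$(awk 'BEGIN {
    french["dog"] = "chien"
    french["cat"] = "chat"
    french["one"] = "un"
    french[1]     = "un"     # numeric index, stored as the string "1"
    print french["one"], french[1]
}')
printf '%s\n' "$out"
```
+
+This prints `un un'.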
+
+   When `awk' creates an array for you, e.g., with the `split' built-in
+function, that array's indices are consecutive integers starting at one.
+(*Note Built-in Functions for String Manipulation: String Functions.)
+
+
+File: gawk.info,  Node: Reference to Elements,  Next: Assigning Elements,  Prev: Array Intro,  Up: Arrays
+
+Referring to an Array Element
+=============================
+
+   The principal way of using an array is to refer to one of its
+elements.  An array reference is an expression which looks like this:
+
+     ARRAY[INDEX]
+
+Here, ARRAY is the name of an array.  The expression INDEX is the index
+of the element of the array that you want.
+
+   The value of the array reference is the current value of that array
+element.  For example, `foo[4.3]' is an expression for the element of
+array `foo' at index `4.3'.
+
+   If you refer to an array element that has no recorded value, the
+value of the reference is `""', the null string.  This includes elements
+to which you have not assigned any value, and elements that have been
+deleted (*note The `delete' Statement: Delete.).  Such a reference
+automatically creates that array element, with the null string as its
+value.  (In some cases, this is unfortunate, because it might waste
+memory inside `awk'.)
+
+   You can find out if an element exists in an array at a certain index
+with the expression:
+
+     INDEX in ARRAY
+
+This expression tests whether or not the particular index exists,
+without the side effect of creating that element if it is not present.
+The expression has the value one (true) if `ARRAY[INDEX]' exists, and
+zero (false) if it does not exist.
+
+   For example, to test whether the array `frequencies' contains the
+index `2', you could write this statement:
+
+     if (2 in frequencies)
+         print "Subscript 2 is present."
+
+   Note that this is _not_ a test of whether or not the array
+`frequencies' contains an element whose _value_ is two.  (There is no
+way to do that except to scan all the elements.)  Also, this _does not_
+create `frequencies[2]', while the following (incorrect) alternative
+would do so:
+
+     if (frequencies[2] != "")
+         print "Subscript 2 is present."
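+
+   The difference can be seen in a small experiment.  The array name
+`seen' is hypothetical:
+
```shell
out=$(awk 'BEGIN {
    if ("x" in seen)
        print "present before the reference"
    if (seen["x"] == "")        # this reference creates seen["x"]
        print "value is null"
    if ("x" in seen)
        print "present after the reference"
}')
printf '%s\n' "$out"
```
+
+The first test prints nothing; the program prints `value is null' and
+then `present after the reference'.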
+
+
+File: gawk.info,  Node: Assigning Elements,  Next: Array Example,  Prev: Reference to Elements,  Up: Arrays
+
+Assigning Array Elements
+========================
+
+   Array elements are lvalues: they can be assigned values just like
+`awk' variables:
+
+     ARRAY[SUBSCRIPT] = VALUE
+
+Here ARRAY is the name of your array.  The expression SUBSCRIPT is the
+index of the element of the array that you want to assign a value.  The
+expression VALUE is the value you are assigning to that element of the
+array.
+
+
+File: gawk.info,  Node: Array Example,  Next: Scanning an Array,  Prev: Assigning Elements,  Up: Arrays
+
+Basic Array Example
+===================
+
+   The following program takes a list of lines, each beginning with a
+line number, and prints them out in order of line number.  The line
+numbers are not in order, however, when they are first read:  they are
+scrambled.  This program sorts the lines by making an array using the
+line numbers as subscripts.  It then prints out the lines in sorted
+order of their numbers.  It is a very simple program, and gets confused
+if it encounters repeated numbers, gaps, or lines that don't begin with
+a number.
+
+     {
+       if ($1 > max)
+         max = $1
+       arr[$1] = $0
+     }
+     
+     END {
+       for (x = 1; x <= max; x++)
+         print arr[x]
+     }
+
+   The first rule keeps track of the largest line number seen so far;
+it also stores each line into the array `arr', at an index that is the
+line's number.
+
+   The second rule runs after all the input has been read, to print out
+all the lines.
+
+   When this program is run with the following input:
+
+     5  I am the Five man
+     2  Who are you?  The new number two!
+     4  . . . And four on the floor
+     1  Who is number one?
+     3  I three you.
+
+its output is this:
+
+     1  Who is number one?
+     2  Who are you?  The new number two!
+     3  I three you.
+     4  . . . And four on the floor
+     5  I am the Five man
+
+   If a line number is repeated, the last line with a given number
+overrides the others.
+
+   Gaps in the line numbers can be handled with an easy improvement to
+the program's `END' rule:
+
+     END {
+       for (x = 1; x <= max; x++)
+         if (x in arr)
+           print arr[x]
+     }
+
+
+File: gawk.info,  Node: Scanning an Array,  Next: Delete,  Prev: Array Example,  Up: Arrays
+
+Scanning All Elements of an Array
+=================================
+
+   In programs that use arrays, you often need a loop that executes
+once for each element of an array.  In other languages, where arrays are
+contiguous and indices are limited to positive integers, this is easy:
+you can find all the valid indices by counting from the lowest index up
+to the highest.  This technique won't do the job in `awk', since any
+number or string can be an array index.  So `awk' has a special kind of
+`for' statement for scanning an array:
+
+     for (VAR in ARRAY)
+       BODY
+
+This loop executes BODY once for each index in ARRAY that your program
+has previously used, with the variable VAR set to that index.
+
+   Here is a program that uses this form of the `for' statement.  The
+first rule scans the input records and notes which words appear (at
+least once) in the input, by storing a one into the array `used' with
+the word as index.  The second rule scans the elements of `used' to
+find all the distinct words that appear in the input.  It prints each
+word that is more than 10 characters long, and also prints the number of
+such words.  *Note Built-in Functions for String Manipulation: String
+Functions, for more information on the built-in function `length'.
+
+     # Record a 1 for each word that is used at least once.
+     {
+         for (i = 1; i <= NF; i++)
+             used[$i] = 1
+     }
+     
+     # Find number of distinct words more than 10 characters long.
+     END {
+         for (x in used)
+             if (length(x) > 10) {
+                 ++num_long_words
+                 print x
+             }
+         print num_long_words, "words longer than 10 characters"
+     }
+
+*Note Generating Word Usage Counts: Word Sorting, for a more detailed
+example of this type.
+
+   The order in which elements of the array are accessed by this
+statement is determined by the internal arrangement of the array
+elements within `awk' and cannot be controlled or changed.  This can
+lead to problems if new elements are added to ARRAY by statements in
+the loop body; you cannot predict whether or not the `for' loop will
+reach them.  Similarly, changing VAR inside the loop may produce
+strange results.  It is best to avoid such things.
+
+
+File: gawk.info,  Node: Delete,  Next: Numeric Array Subscripts,  Prev: Scanning an Array,  Up: Arrays
+
+The `delete' Statement
+======================
+
+   You can remove an individual element of an array using the `delete'
+statement:
+
+     delete ARRAY[INDEX]
+
+   Once you have deleted an array element, you can no longer obtain any
+value the element once had.  It is as if you had never referred to it
+and had never given it any value.
+
+   Here is an example of deleting elements in an array:
+
+     for (i in frequencies)
+       delete frequencies[i]
+
+This example removes all the elements from the array `frequencies'.
+
+   If you delete an element, a subsequent `for' statement to scan the
+array will not report that element, and the `in' operator to check for
+the presence of that element will return zero (i.e., false):
+
+     delete foo[4]
+     if (4 in foo)
+         print "This will never be printed"
+
+   It is important to note that deleting an element is _not_ the same
+as assigning it a null value (the empty string, `""').
+
+     foo[4] = ""
+     if (4 in foo)
+       print "This is printed, even though foo[4] is empty"
+
+   It is not an error to delete an element that does not exist.
+
+   You can delete all the elements of an array with a single statement,
+by leaving off the subscript in the `delete' statement.
+
+     delete ARRAY
+
+   This ability is a `gawk' extension; it is not available in
+compatibility mode (*note Command Line Options: Options.).
+
+   Using this version of the `delete' statement is about three times
+more efficient than the equivalent loop that deletes each element one
+at a time.
+
+   The following statement provides a portable, but non-obvious way to
+clear out an array.
+
+     # thanks to Michael Brennan for pointing this out
+     split("", array)
+
+   The `split' function (*note Built-in Functions for String
+Manipulation: String Functions.)  clears out the target array first.
+This call asks it to split apart the null string. Since there is no
+data to split out, the function simply clears the array and then
+returns.
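+
+   One way to check this behavior is to count the elements left after
+the call.  This is only a sketch; the names `data' and `n' are
+arbitrary:
+
```shell
out=$(awk 'BEGIN {
    data["a"] = 1
    data["b"] = 2
    split("", data)          # portable way to empty the array
    n = 0
    for (i in data)
        n++
    print n
}')
printf '%s\n' "$out"
```
+
+This prints `0', since no elements remain to be scanned.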
+
+
+File: gawk.info,  Node: Numeric Array Subscripts,  Next: Uninitialized Subscripts,  Prev: Delete,  Up: Arrays
+
+Using Numbers to Subscript Arrays
+=================================
+
+   An important aspect of arrays to remember is that _array subscripts
+are always strings_.  If you use a numeric value as a subscript, it
+will be converted to a string value before it is used for subscripting
+(*note Conversion of Strings and Numbers: Conversion.).
+
+   This means that the value of the built-in variable `CONVFMT' can
+potentially affect how your program accesses elements of an array.  For
+example:
+
+     xyz = 12.153
+     data[xyz] = 1
+     CONVFMT = "%2.2f"
+     if (xyz in data)
+         printf "%s is in data\n", xyz
+     else
+         printf "%s is not in data\n", xyz
+
+This prints `12.15 is not in data'.  The first statement gives `xyz' a
+numeric value.  Assigning to `data[xyz]' subscripts `data' with the
+string value `"12.153"' (using the default conversion value of
+`CONVFMT', `"%.6g"'), and assigns one to `data["12.153"]'.  The program
+then changes the value of `CONVFMT'.  The test `(xyz in data)'
+generates a new string value from `xyz', this time `"12.15"', since the
+value of `CONVFMT' now allows only two digits after the decimal point.
+This test fails, since `"12.15"' is a different string from `"12.153"'.
+
+   According to the rules for conversions (*note Conversion of Strings
+and Numbers: Conversion.), integer values are always converted to
+strings as integers, no matter what the value of `CONVFMT' may happen
+to be.  So the usual case of:
+
+     for (i = 1; i <= maxsub; i++)
+         do something with array[i]
+
+will work, no matter what the value of `CONVFMT'.
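+
+   The following sketch, which reuses the `CONVFMT' value from the
+earlier example, suggests how to confirm this.  The string `"12.00"'
+is what `"%2.2f"' would have produced had it been applied:
+
```shell
out=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    data[12] = "twelve"      # integer value: the subscript is "12"
    if (12 in data)
        print "12 is in data"
    if ("12.00" in data)
        print "12.00 is in data"
}')
printf '%s\n' "$out"
```
+
+Only `12 is in data' is printed.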
+
+   As with many things in `awk', most of the time things work as you
+would expect.  But it is useful to know the actual rules precisely,
+since they can sometimes have a subtle effect on your programs.
+
+
+File: gawk.info,  Node: Uninitialized Subscripts,  Next: Multi-dimensional,  Prev: Numeric Array Subscripts,  Up: Arrays
+
+Using Uninitialized Variables as Subscripts
+===========================================
+
+   Suppose you want to print your input data in reverse order.  A
+reasonable attempt at a program to do so (with some test data) might
+look like this:
+
+     $ echo 'line 1
+     > line 2
+     > line 3' | awk '{ l[lines] = $0; ++lines }
+     > END {
+     >     for (i = lines-1; i >= 0; --i)
+     >        print l[i]
+     > }'
+     -| line 3
+     -| line 2
+
+   Unfortunately, the very first line of input data did not come out in
+the output!
+
+   At first glance, this program should have worked.  The variable
+`lines' is uninitialized, and uninitialized variables have the numeric
+value zero.  So, the value of `l[0]' should have been printed.
+
+   The issue here is that subscripts for `awk' arrays are _always_
+strings. And uninitialized variables, when used as strings, have the
+value `""', not zero.  Thus, `line 1' ended up stored in `l[""]'.
+
+   The following version of the program works correctly:
+
+     { l[lines++] = $0 }
+     END {
+         for (i = lines - 1; i >= 0; --i)
+            print l[i]
+     }
+
+   Here, the `++' forces `lines' to be numeric, thus making the "old
+value" numeric zero, which is then converted to `"0"' as the array
+subscript.
+
+   As we have just seen, even though it is somewhat unusual, the null
+string (`""') is a valid array subscript (d.c.). If `--lint' is provided
+on the command line (*note Command Line Options: Options.), `gawk' will
+warn about the use of the null string as a subscript.
+
+
+File: gawk.info,  Node: Multi-dimensional,  Next: Multi-scanning,  Prev: Uninitialized Subscripts,  Up: Arrays
+
+Multi-dimensional Arrays
+========================
+
+   A multi-dimensional array is an array in which an element is
+identified by a sequence of indices, instead of a single index.  For
+example, a two-dimensional array requires two indices.  The usual way
+(in most languages, including `awk') to refer to an element of a
+two-dimensional array named `grid' is with `grid[X,Y]'.
+
+   Multi-dimensional arrays are supported in `awk' through
+concatenation of indices into one string.  What happens is that `awk'
+converts the indices into strings (*note Conversion of Strings and
+Numbers: Conversion.) and concatenates them together, with a separator
+between them.  This creates a single string that describes the values
+of the separate indices.  The combined string is used as a single index
+into an ordinary, one-dimensional array.  The separator used is the
+value of the built-in variable `SUBSEP'.
+
+   For example, suppose we evaluate the expression `foo[5,12] = "value"'
+when the value of `SUBSEP' is `"@"'.  The numbers five and 12 are
+converted to strings and concatenated with an `@' between them,
+yielding `"5@12"'; thus, the array element `foo["5@12"]' is set to
+`"value"'.
+
+   Once the element's value is stored, `awk' has no record of whether
+it was stored with a single index or a sequence of indices.  The two
+expressions `foo[5,12]' and `foo[5 SUBSEP 12]' are always equivalent.
+
+   The default value of `SUBSEP' is the string `"\034"', which contains
+a non-printing character that is unlikely to appear in an `awk' program
+or in most input data.
+
+   The usefulness of choosing an unlikely character comes from the fact
+that index values that contain a string matching `SUBSEP' lead to
+combined strings that are ambiguous.  Suppose that `SUBSEP' were `"@"';
+then `foo["a@b", "c"]' and `foo["a", "b@c"]' would be indistinguishable
+because both would actually be stored as `foo["a@b@c"]'.
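+
+   A short sketch of the collision, using the same `"@"' separator.
+Only one element is ever assigned, yet the other index sequence finds
+it:
+
```shell
out=$(awk 'BEGIN {
    SUBSEP = "@"
    foo["a@b", "c"] = "value"
    if (("a", "b@c") in foo)     # both sequences combine to "a@b@c"
        print "the two index sequences collide"
}')
printf '%s\n' "$out"
```
+
+This prints `the two index sequences collide'.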
+
+   You can test whether a particular index-sequence exists in a
+"multi-dimensional" array with the same operator `in' used for single
+dimensional arrays.  Instead of a single index as the left-hand operand,
+write the whole sequence of indices, separated by commas, in
+parentheses:
+
+     (SUBSCRIPT1, SUBSCRIPT2, ...) in ARRAY
+
+   The following example treats its input as a two-dimensional array of
+fields; it rotates this array 90 degrees clockwise and prints the
+result.  It assumes that all lines have the same number of elements.
+
+     awk '{
+          if (max_nf < NF)
+               max_nf = NF
+          max_nr = NR
+          for (x = 1; x <= NF; x++)
+               vector[x, NR] = $x
+     }
+     
+     END {
+          for (x = 1; x <= max_nf; x++) {
+               for (y = max_nr; y >= 1; --y)
+                    printf("%s ", vector[x, y])
+               printf("\n")
+          }
+     }'
+
+When given the input:
+
+     1 2 3 4 5 6
+     2 3 4 5 6 1
+     3 4 5 6 1 2
+     4 5 6 1 2 3
+
+it produces:
+
+     4 3 2 1
+     5 4 3 2
+     6 5 4 3
+     1 6 5 4
+     2 1 6 5
+     3 2 1 6
+
+
+File: gawk.info,  Node: Multi-scanning,  Prev: Multi-dimensional,  Up: Arrays
+
+Scanning Multi-dimensional Arrays
+=================================
+
+   There is no special `for' statement for scanning a
+"multi-dimensional" array; there cannot be one, because in truth there
+are no multi-dimensional arrays or elements; there is only a
+multi-dimensional _way of accessing_ an array.
+
+   However, if your program has an array that is always accessed as
+multi-dimensional, you can get the effect of scanning it by combining
+the scanning `for' statement (*note Scanning All Elements of an Array:
+Scanning an Array.) with the `split' built-in function (*note Built-in
+Functions for String Manipulation: String Functions.).  It works like
+this:
+
+     for (combined in array) {
+       split(combined, separate, SUBSEP)
+       ...
+     }
+
+This sets `combined' to each concatenated, combined index in the array,
+and splits it into the individual indices by breaking it apart where
+the value of `SUBSEP' appears.  The split-out indices become the
+elements of the array `separate'.
+
+   Thus, suppose you have previously stored a value in `array[1,
+"foo"]'; then an element with index `"1\034foo"' exists in `array'.
+(Recall that the default value of `SUBSEP' is the character with code
+034.)  Sooner or later the `for' statement will find that index and do
+an iteration with `combined' set to `"1\034foo"'.  Then the `split'
+function is called as follows:
+
+     split("1\034foo", separate, "\034")
+
+The result of this is to set `separate[1]' to `"1"' and `separate[2]'
+to `"foo"'.  Presto, the original sequence of separate indices has been
+recovered.
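+
+   Put together, a complete scan might look like the following
+sketch.  The array `grid' and its contents are invented for the
+example, and the output is piped through `sort' only because the scan
+order is not predictable:
+
```shell
out=$(awk 'BEGIN {
    grid[1, "foo"] = "x"
    grid[2, "bar"] = "y"
    for (combined in grid) {
        split(combined, separate, SUBSEP)
        print separate[1], separate[2], grid[combined]
    }
}' | sort)
printf '%s\n' "$out"
```
+
+After sorting, the output is `1 foo x' followed by `2 bar y'.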
+
+
+File: gawk.info,  Node: Built-in,  Next: User-defined,  Prev: Arrays,  Up: Top
+
+Built-in Functions
+******************
+
+   "Built-in" functions are functions that are always available for
+your `awk' program to call.  This chapter defines all the built-in
+functions in `awk'; some of them are mentioned in other sections, but
+they are summarized here for your convenience.  (You can also define
+new functions yourself.  *Note User-defined Functions: User-defined.)
+
+* Menu:
+
+* Calling Built-in::            How to call built-in functions.
+* Numeric Functions::           Functions that work with numbers, including
+                                `int', `sin' and `rand'.
+* String Functions::            Functions for string manipulation, such as
+                                `split', `match', and
+                                `sprintf'.
+* I/O Functions::               Functions for files and shell commands.
+* Time Functions::              Functions for dealing with time stamps.
+
+
+File: gawk.info,  Node: Calling Built-in,  Next: Numeric Functions,  Prev: Built-in,  Up: Built-in
+
+Calling Built-in Functions
+==========================
+
+   To call a built-in function, write the name of the function followed
+by arguments in parentheses.  For example, `atan2(y + z, 1)' is a call
+to the function `atan2', with two arguments.
+
+   Whitespace is ignored between the built-in function name and the
+open-parenthesis, but we recommend that you avoid using whitespace
+there.  User-defined functions do not permit whitespace in this way, and
+you will find it easier to avoid mistakes by following a simple
+convention which always works: no whitespace after a function name.
+
+   Each built-in function accepts a certain number of arguments.  In
+some cases, arguments can be omitted. The defaults for omitted
+arguments vary from function to function and are described under the
+individual functions.  In some `awk' implementations, extra arguments
+given to built-in functions are ignored.  However, in `gawk', it is a
+fatal error to give extra arguments to a built-in function.
+
+   When a function is called, expressions that create the function's
+actual parameters are evaluated completely before the function call is
+performed.  For example, in the code fragment:
+
+     i = 4
+     j = sqrt(i++)
+
+the variable `i' is set to five before `sqrt' is called with a value of
+four for its actual parameter.
+
+   The order of evaluation of the expressions used for the function's
+parameters is undefined.  Thus, you should not write programs that
+assume that parameters are evaluated from left to right or from right
+to left.  For example,
+
+     i = 5
+     j = atan2(i++, i *= 2)
+
+   If the order of evaluation is left to right, then `i' first becomes
+six, and then 12, and `atan2' is called with the two arguments five and
+12.  But if the order of evaluation is right to left, `i' first becomes
+10, and then 11, and `atan2' is called with the two arguments 10 and 10.
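+
+   One way to sidestep the question, shown here only as a sketch, is to
+evaluate each argument in a separate statement first.  The temporary
+variables `y' and `x' are names made up for this illustration:
+
+     i = 5
+     y = i++         # y is 5; i is now 6
+     x = (i *= 2)    # i and x are both 12
+     j = atan2(y, x) # same arguments whatever the evaluation order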
+
+
+File: gawk.info,  Node: Numeric Functions,  Next: String Functions,  Prev: Calling Built-in,  Up: Built-in
+
+Numeric Built-in Functions
+==========================
+
+   Here is a full list of built-in functions that work with numbers.
+Optional parameters are enclosed in square brackets ("[" and "]").
+
+`int(X)'
+     This produces the integer part of X, truncated toward zero; that
+     is, the nearest integer to X that lies between X and zero.
+
+     For example, `int(3)' is three, `int(3.9)' is three, `int(-3.9)'
+     is -3, and `int(-3)' is -3 as well.
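+
+     Because `int' truncates rather than rounds, rounding to the
+     nearest integer takes a little extra work.  The following is one
+     possible sketch; the function name `round' is chosen for this
+     illustration and is not a built-in:
+
+          function round(x) {
+               # shift by one half away from zero, then truncate
+               return (x >= 0) ? int(x + 0.5) : int(x - 0.5)
+          }
+
+     With this definition, `round(3.9)' is four and `round(-3.9)' is -4.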
+
+`sqrt(X)'
+     This gives you the positive square root of X.  For example,
+     `sqrt(4)' is two.  It reports an error if X is negative.
+
+`exp(X)'
+     This gives you the exponential of X (`e ^ X'), or reports an error
+     if X is out of range.  The range of values X can have depends on
+     your machine's floating point representation.
+
+`log(X)'
+     This gives you the natural logarithm of X, if X is positive;
+     otherwise, it reports an error.
+
+`sin(X)'
+     This gives you the sine of X, with X in radians.
+
+`cos(X)'
+     This gives you the cosine of X, with X in radians.
+
+`atan2(Y, X)'
+     This gives you the arctangent of `Y / X' in radians.
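+
+     As an illustration, since the arctangent of one is pi / 4, a
+     common idiom computes pi this way:
+
+          $ awk 'BEGIN { printf "%.5f\n", 4 * atan2(1, 1) }'
+          -| 3.14159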
+
+`rand()'
+     This gives you a random number.  The values of `rand' are
+     uniformly-distributed between zero and one.  The value can be
+     zero, but it is never one.
+
+     Often you want random integers instead.  Here is a user-defined
+     function you can use to obtain a random non-negative integer less
+     than N:
+
+          function randint(n) {
+               return int(n * rand())
+          }
+
+     The multiplication produces a random real number greater than or
+     equal to zero and less than `n'.  We then truncate it (using
+     `int') to an integer between zero and `n' - 1, inclusive.
+
+     Here is an example where a similar function is used to produce
+     random integers between one and N.  This program prints a new
+     random number for each input record.
+
+          awk '
+          # Function to roll a simulated die.
+          function roll(n) { return 1 + int(rand() * n) }
+          
+          # Roll 3 six-sided dice and
+          # print total number of points.
+          {
+                printf("%d points\n",
+                       roll(6)+roll(6)+roll(6))
+          }'
+
+     *Caution:* In most `awk' implementations, including `gawk', `rand'
+     starts generating numbers from the same starting number, or
+     "seed", each time you run `awk'.  Thus, a program will generate
+     the same results each time you run it.  The numbers are random
+     within one `awk' run, but predictable from run to run.  This is
+     convenient for debugging, but if you want a program to do
+     different things each time it is used, you must change the seed to
+     a value that will be different in each run.  To do this, use
+     `srand'.
+
+`srand([X])'
+     The function `srand' sets the starting point, or seed, for
+     generating random numbers to the value X.
+
+     Each seed value leads to a particular sequence of random
+     numbers.(1) Thus, if you set the seed to the same value a second
+     time, you will get the same sequence of random numbers again.
+
+     If you omit the argument X, as in `srand()', then the current date
+     and time of day are used for a seed.  This is the way to get random
+     numbers that are unpredictable from one run to the next.
+
+     The return value of `srand' is the previous seed.  This makes it
+     easy to keep track of the seeds for use in consistently reproducing
+     sequences of random numbers.
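+
+     The following sketch seeds the generator twice with the same
+     value (the seed 42 is arbitrary) and checks that the first number
+     produced is the same both times.  It also prints the value
+     returned by the second call to `srand'.  The variable names are
+     chosen for this illustration:
+
+          $ awk 'BEGIN {
+          >      srand(42); first = rand()
+          >      old = srand(42); second = rand()
+          >      same = (first == second)
+          >      print same, old
+          > }'
+          -| 1 42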
+
+   ---------- Footnotes ----------
+
+   (1) Computer generated random numbers really are not truly random.
+They are technically known as "pseudo-random."  This means that while
+the numbers in a sequence appear to be random, you can in fact generate
+the same sequence of random numbers over and over again.
+
+
+File: gawk.info,  Node: String Functions,  Next: I/O Functions,  Prev: Numeric Functions,  Up: Built-in
+
+Built-in Functions for String Manipulation
+==========================================
+
+   The functions in this section look at or change the text of one or
+more strings.  Optional parameters are enclosed in square brackets ("["
+and "]").
+
+`index(IN, FIND)'
+     This searches the string IN for the first occurrence of the string
+     FIND, and returns the position in characters where that occurrence
+     begins in the string IN.  For example:
+
+          $ awk 'BEGIN { print index("peanut", "an") }'
+          -| 3
+
+     If FIND is not found, `index' returns zero.  (Remember that string
+     indices in `awk' start at one.)
+
+`length([STRING])'
+     This gives you the number of characters in STRING.  If STRING is a
+     number, the length of the digit string representing that number is
+     returned.  For example, `length("abcde")' is five.  By contrast,
+     `length(15 * 35)' works out to three.  How?  Well, 15 * 35 = 525,
+     and 525 is then converted to the string `"525"', which has three
+     characters.
+
+     If no argument is supplied, `length' returns the length of `$0'.
+
+     In older versions of `awk', you could call the `length' function
+     without any parentheses.  Doing so is marked as "deprecated" in the
+     POSIX standard.  This means that while you can do this in your
+     programs, it is a feature that can eventually be removed from a
+     future version of the standard.  Therefore, for maximal
+     portability of your `awk' programs, you should always supply the
+     parentheses.
+
+`match(STRING, REGEXP)'
+     The `match' function searches the string, STRING, for the longest,
+     leftmost substring matched by the regular expression, REGEXP.  It
+     returns the character position, or "index", of where that
+     substring begins (one, if it starts at the beginning of STRING).
+     If no match is found, it returns zero.
+
+     The `match' function sets the built-in variable `RSTART' to the
+     index.  It also sets the built-in variable `RLENGTH' to the length
+     in characters of the matched substring.  If no match is found,
+     `RSTART' is set to zero, and `RLENGTH' to -1.
+
+     For example:
+
+          awk '{
+                 if ($1 == "FIND")
+                   regex = $2
+                 else {
+                   where = match($0, regex)
+                   if (where != 0)
+                     print "Match of", regex, "found at", \
+                               where, "in", $0
+                 }
+          }'
+
+     This program looks for lines that match the regular expression
+     stored in the variable `regex'.  This regular expression can be
+     changed.  If the first word on a line is `FIND', `regex' is
+     changed to be the second word on that line.  Therefore, given:
+
+          FIND ru+n
+          My program runs
+          but not very quickly
+          FIND Melvin
+          JF+KM
+          This line is property of Reality Engineering Co.
+          Melvin was here.
+
+     `awk' prints:
+
+          Match of ru+n found at 12 in My program runs
+          Match of Melvin found at 1 in Melvin was here.
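+
+     Since `match' sets `RSTART' and `RLENGTH', the matched text itself
+     can be extracted with `substr'.  A small sketch, reusing one line
+     of the sample data above:
+
+          $ awk 'BEGIN {
+          >      s = "My program runs"
+          >      if (match(s, /ru+n/))
+          >           print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
+          > }'
+          -| 12 3 run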
+
+`split(STRING, ARRAY [, FIELDSEP])'
+     This divides STRING into pieces separated by FIELDSEP, and stores
+     the pieces in ARRAY.  The first piece is stored in `ARRAY[1]', the
+     second piece in `ARRAY[2]', and so forth.  The string value of the
+     third argument, FIELDSEP, is a regexp describing where to split
+     STRING (much as `FS' can be a regexp describing where to split
+     input records).  If the FIELDSEP is omitted, the value of `FS' is
+     used.  `split' returns the number of elements created.
+
+     The `split' function splits strings into pieces in a manner
+     similar to the way input lines are split into fields.  For example:
+
+          split("cul-de-sac", a, "-")
+
+     splits the string `cul-de-sac' into three fields using `-' as the
+     separator.  It sets the contents of the array `a' as follows:
+
+          a[1] = "cul"
+          a[2] = "de"
+          a[3] = "sac"
+
+     The value returned by this call to `split' is three.
+
+     As with input field-splitting, when the value of FIELDSEP is
+     `" "', leading and trailing whitespace is ignored, and the elements
+     are separated by runs of whitespace.
+
+     Also as with input field-splitting, if FIELDSEP is the null
+     string, each individual character in the string is split into its
+     own array element.  (This is a `gawk'-specific extension.)
+
+     Recent implementations of `awk', including `gawk', allow the third
+     argument to be a regexp constant (`/abc/'), as well as a string
+     (d.c.).  The POSIX standard allows this as well.
+
+     Before splitting the string, `split' deletes any previously
+     existing elements in the array ARRAY (d.c.).
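+
+     The return value is convenient as a loop bound.  As a sketch, the
+     following walks the pieces of the `cul-de-sac' example in order
+     (the names `n' and `i' are chosen for this illustration):
+
+          $ awk 'BEGIN {
+          >      n = split("cul-de-sac", a, "-")
+          >      for (i = 1; i <= n; i++)
+          >           print i, a[i]
+          > }'
+          -| 1 cul
+          -| 2 de
+          -| 3 sac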
+
+`sprintf(FORMAT, EXPRESSION1,...)'
+     This returns (without printing) the string that `printf' would
+     have printed out with the same arguments (*note Using `printf'
+     Statements for Fancier Printing: Printf.).  For example:
+
+          sprintf("pi = %.2f (approx.)", 22/7)
+
+     returns the string `"pi = 3.14 (approx.)"'.
+
+`sub(REGEXP, REPLACEMENT [, TARGET])'
+     The `sub' function alters the value of TARGET.  It searches this
+     value, which is treated as a string, for the leftmost longest
+     substring matched by the regular expression, REGEXP, extending
+     this match as far as possible.  Then the entire string is changed
+     by replacing the matched text with REPLACEMENT.  The modified
+     string becomes the new value of TARGET.
+
+     This function is peculiar because TARGET is not simply used to
+     compute a value, and not just any expression will do: it must be a
+     variable, field or array element, so that `sub' can store a
+     modified value there.  If this argument is omitted, then the
+     default is to use and alter `$0'.
+
+     For example:
+
+          str = "water, water, everywhere"
+          sub(/at/, "ith", str)
+
+     sets `str' to `"wither, water, everywhere"', by replacing the
+     leftmost, longest occurrence of `at' with `ith'.
+
+     The `sub' function returns the number of substitutions made (either
+     one or zero).
+
+     If the special character `&' appears in REPLACEMENT, it stands for
+     the precise substring that was matched by REGEXP.  (If the regexp
+     can match more than one string, then this precise substring may
+     vary.)  For example:
+
+          awk '{ sub(/candidate/, "& and his wife"); print }'
+
+     changes the first occurrence of `candidate' to `candidate and his
+     wife' on each input line.
+
+     Here is another example:
+
+          awk 'BEGIN {
+                  str = "daabaaa"
+                  sub(/a*/, "c&c", str)
+                  print str
+          }'
+          -| dcaacbaaa
+
+     This shows how `&' can represent a non-constant string, and also
+     illustrates the "leftmost, longest" rule in regexp matching (*note
+     How Much Text Matches?: Leftmost Longest.).
+
+     The effect of this special character (`&') can be turned off by
+     putting a backslash before it in the string.  As usual, to insert
+     one backslash in the string, you must write two backslashes.
+     Therefore, write `\\&' in a string constant to include a literal
+     `&' in the replacement.  For example, here is how to replace the
+     first `|' on each line with an `&':
+
+          awk '{ sub(/\|/, "\\&"); print }'
+
+     *Note:* As mentioned above, the third argument to `sub' must be a
+     variable, field or array reference.  Some versions of `awk' allow
+     the third argument to be an expression which is not an lvalue.  In
+     such a case, `sub' would still search for the pattern and return
+     zero or one, but the result of the substitution (if any) would be
+     thrown away because there is no place to put it.  Such versions of
+     `awk' accept expressions like this:
+
+          sub(/USA/, "United States", "the USA and Canada")
+
+     For historical compatibility, `gawk' will accept erroneous code,
+     such as in the above example. However, using any other
+     non-changeable object as the third parameter will cause a fatal
+     error, and your program will not run.
+
+`gsub(REGEXP, REPLACEMENT [, TARGET])'
+     This is similar to the `sub' function, except `gsub' replaces
+     _all_ of the longest, leftmost, _non-overlapping_ matching
+     substrings it can find.  The `g' in `gsub' stands for "global,"
+     which means replace everywhere.  For example:
+
+          awk '{ gsub(/Britain/, "United Kingdom"); print }'
+
+     replaces all occurrences of the string `Britain' with `United
+     Kingdom' for all input records.
+
+     The `gsub' function returns the number of substitutions made.  If
+     the variable to be searched and altered, TARGET, is omitted, then
+     the entire input record, `$0', is used.
+
+     As in `sub', the characters `&' and `\' are special, and the third
+     argument must be an lvalue.
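+
+     To illustrate the return value, here is the earlier `sub' example
+     repeated with `gsub'; the variable `n' is introduced just for this
+     sketch:
+
+          $ awk 'BEGIN {
+          >      str = "water, water, everywhere"
+          >      n = gsub(/at/, "ith", str)
+          >      print n, str
+          > }'
+          -| 2 wither, wither, everywhere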
+
+`gensub(REGEXP, REPLACEMENT, HOW [, TARGET])'
+     `gensub' is a general substitution function.  Like `sub' and
+     `gsub', it searches the target string TARGET for matches of the
+     regular expression REGEXP.  Unlike `sub' and `gsub', the modified
+     string is returned as the result of the function, and the original
+     target string is _not_ changed.  If HOW is a string beginning with
+     `g' or `G', then it replaces all matches of REGEXP with
+     REPLACEMENT.  Otherwise, HOW is a number indicating which match of
+     REGEXP to replace. If no TARGET is supplied, `$0' is used instead.
+
+     `gensub' provides an additional feature that is not available in
+     `sub' or `gsub': the ability to specify components of a regexp in
+     the replacement text.  This is done by using parentheses in the
+     regexp to mark the components, and then specifying `\N' in the
+     replacement text, where N is a digit from one to nine.  For
+     example:
+
+          $ gawk '
+          > BEGIN {
+          >      a = "abc def"
+          >      b = gensub(/(.+) (.+)/, "\\2 \\1", "g", a)
+          >      print b
+          > }'
+          -| def abc
+
+     As described above for `sub', you must type two backslashes in
+     order to get one into the string.
+
+     In the replacement text, the sequence `\0' represents the entire
+     matched text, as does the character `&'.
+
+     This example shows how you can use the third argument to control
+     which match of the regexp should be changed.
+
+          $ echo a b c a b c |
+          > gawk '{ print gensub(/a/, "AA", 2) }'
+          -| a b c AA b c
+
+     In this case, `$0' is used as the default target string.  `gensub'
+     returns the new string as its result, which is passed directly to
+     `print' for printing.
+
+     If the HOW argument is a string that does not begin with `g' or
+     `G', or if it is a number that is less than zero, only one
+     substitution is performed.
+
+     `gensub' is a `gawk' extension; it is not available in
+     compatibility mode (*note Command Line Options: Options.).
+
+`substr(STRING, START [, LENGTH])'
+     This returns a LENGTH-character-long substring of STRING, starting
+     at character number START.  The first character of a string is
+     character number one.  For example, `substr("washington", 5, 3)'
+     returns `"ing"'.
+
+     If LENGTH is not present, this function returns the whole suffix of
+     STRING that begins at character number START.  For example,
+     `substr("washington", 5)' returns `"ington"'.  The whole suffix is
+     also returned if LENGTH is greater than the number of characters
+     remaining in the string, counting from character number START.
+
+     *Note:* The string returned by `substr' _cannot_ be assigned to.
+     Thus, it is a mistake to attempt to change a portion of a string,
+     like this:
+
+          string = "abcdef"
+          # try to get "abCDEf", won't work
+          substr(string, 3, 3) = "CDE"
+
+     or to use `substr' as the third argument of `sub' or `gsub':
+
+          gsub(/xyz/, "pdq", substr($0, 5, 20))  # WRONG
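+
+     One way to get the intended effect, given here as a sketch, is to
+     build a new string from the unchanged pieces and assign it back:
+
+          string = "abcdef"
+          string = substr(string, 1, 2) "CDE" substr(string, 6)
+          # string is now "abCDEf"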
+
+`tolower(STRING)'
+     This returns a copy of STRING, with each upper-case character in
+     the string replaced with its corresponding lower-case character.
+     Non-alphabetic characters are left unchanged.  For example,
+     `tolower("MiXeD cAsE 123")' returns `"mixed case 123"'.
+
+`toupper(STRING)'
+     This returns a copy of STRING, with each lower-case character in
+     the string replaced with its corresponding upper-case character.
+     Non-alphabetic characters are left unchanged.  For example,
+     `toupper("MiXeD cAsE 123")' returns `"MIXED CASE 123"'.
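+
+     A typical use, sketched here, is comparing strings without regard
+     to case by converting both sides first:
+
+          $ awk 'BEGIN {
+          >      if (tolower("MiXeD") == tolower("mixed"))
+          >           print "same"
+          > }'
+          -| same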
+
+More About `\' and `&' with `sub', `gsub' and `gensub'
+------------------------------------------------------
+
+   When using `sub', `gsub' or `gensub', and trying to get literal
+backslashes and ampersands into the replacement text, you need to
+remember that there are several levels of "escape processing" going on.
+
+   First, there is the "lexical" level, which is when `awk' reads your
+program, and builds an internal copy of your program that can be
+executed.
+
+   Then there is the run-time level, when `awk' actually scans the
+replacement string to determine what to generate.
+
+   At both levels, `awk' looks for a defined set of characters that can
+come after a backslash.  At the lexical level, it looks for the escape
+sequences listed in *Note Escape Sequences::.  Thus, for every `\' that
+`awk' will process at the run-time level, you type two `\'s at the
+lexical level.  When a character that is not valid for an escape
+sequence follows the `\', Unix `awk' and `gawk' both simply remove the
+initial `\', and put the following character into the string. Thus, for
+example, `"a\qb"' is treated as `"aqb"'.
+
+   At the run-time level, the various functions handle sequences of `\'
+and `&' differently.  The situation is (sadly) somewhat complex.
+
+   Historically, the `sub' and `gsub' functions treated the two
+character sequence `\&' specially; this sequence was replaced in the
+generated text with a single `&'.  Any other `\' within the REPLACEMENT
+string that did not precede an `&' was passed through unchanged.  To
+illustrate with a table:
+
+      You type         `sub' sees          `sub' generates
+      --------         ----------          ---------------
+          `\&'              `&'            the matched text
+         `\\&'             `\&'            a literal `&'
+        `\\\&'             `\&'            a literal `&'
+       `\\\\&'            `\\&'            a literal `\&'
+      `\\\\\&'            `\\&'            a literal `\&'
+     `\\\\\\&'           `\\\&'            a literal `\\&'
+         `\\q'             `\q'            a literal `\q'
+
+This table shows both the lexical level processing, where an odd number
+of backslashes is treated like the next lower even number, and the
+run-time processing done by `sub'.  (For the sake of simplicity, the
+rest of the tables below only show the case of even numbers of `\'s
+entered at the lexical level.)
+
+   The problem with the historical approach is that there is no way to
+get a literal `\' followed by the matched text.
+
+   The 1992 POSIX standard attempted to fix this problem. The standard
+says that `sub' and `gsub' look for either a `\' or an `&' after the
+`\'. If either one follows a `\', that character is output literally.
+The interpretation of `\' and `&' then becomes like this:
+
+      You type         `sub' sees          `sub' generates
+      --------         ----------          ---------------
+           `&'              `&'            the matched text
+         `\\&'             `\&'            a literal `&'
+       `\\\\&'            `\\&'            a literal `\', then the matched text
+     `\\\\\\&'           `\\\&'            a literal `\&'
+
+This would appear to solve the problem.  Unfortunately, the phrasing of
+the standard is unusual. It says, in effect, that `\' turns off the
+special meaning of any following character, but that for anything other
+than `\' and `&', such special meaning is undefined.  This wording
+leads to two problems.
+
+  1. Backslashes must now be doubled in the REPLACEMENT string, breaking
+     historical `awk' programs.
+
+  2. To make sure that an `awk' program is portable, _every_ character
+     in the REPLACEMENT string must be preceded with a backslash.(1)
+
+   The POSIX standard is under revision.(2) Because of the above
+problems, proposed text for the revised standard reverts to rules that
+correspond more closely to the original existing practice. The proposed
+rules have special cases that make it possible to produce a `\'
+preceding the matched text.
+
+      You type         `sub' sees         `sub' generates
+      --------         ----------         ---------------
+     `\\\\\\&'           `\\\&'            a literal `\&'
+       `\\\\&'            `\\&'            a literal `\', followed by the matched text
+         `\\&'             `\&'            a literal `&'
+         `\\q'             `\q'            a literal `\q'
+
+   In a nutshell, at the run-time level, there are now three special
+sequences of characters, `\\\&', `\\&' and `\&', whereas historically,
+there was only one.  However, as in the historical case, any `\' that
+is not part of one of these three sequences is not special, and appears
+in the output literally.
+
+   `gawk' 3.0 follows these proposed POSIX rules for `sub' and `gsub'.
+Whether these proposed rules will actually become codified into the
+standard is unknown at this point. Subsequent `gawk' releases will
+track the standard and implement whatever the final version specifies;
+this Info file will be updated as well.
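+
+   As a sketch of these rules in action, the following should produce a
+backslash followed by the matched text, assuming a `gawk' that follows
+the proposed rules.  The four backslashes typed in the program text
+become two at the run-time level; the string `"abc"' is an arbitrary
+choice for the illustration:
+
+     $ gawk 'BEGIN { s = "abc"; sub(/b/, "\\\\&", s); print s }'
+     -| a\bc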
+
+   The rules for `gensub' are considerably simpler. At the run-time
+level, whenever `gawk' sees a `\', if the following character is a
+digit, then the text that matched the corresponding parenthesized
+subexpression is placed in the generated output.  Otherwise, no matter
+what the character after the `\' is, that character will appear in the
+generated text, and the `\' will not.
+
+       You type          `gensub' sees         `gensub' generates
+       --------          -------------         ------------------
+           `&'                    `&'            the matched text
+         `\\&'                   `\&'            a literal `&'
+        `\\\\'                   `\\'            a literal `\'
+       `\\\\&'                  `\\&'            a literal `\', then the matched text
+     `\\\\\\&'                 `\\\&'            a literal `\&'
+         `\\q'                   `\q'            a literal `q'
+
+   Because of the complexity of the lexical and run-time level
+processing, and the special cases for `sub' and `gsub', we recommend
+using `gawk' and `gensub' when you have to do substitutions.
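+
+   For comparison, here is a sketch of two rows of the `gensub' table.
+The string `"abc"' and the regexp `/b/' are arbitrary choices for the
+illustration:
+
+     $ gawk 'BEGIN {
+     >      print gensub(/b/, "\\&", "g", "abc")
+     >      print gensub(/b/, "\\\\&", "g", "abc")
+     > }'
+     -| a&c
+     -| a\bc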
+
+   ---------- Footnotes ----------
+
+   (1) This consequence was certainly unintended.
+
+   (2) As of December 1995, with final approval and publication
+hopefully sometime in 1996.
+
+
+File: gawk.info,  Node: I/O Functions,  Next: Time Functions,  Prev: String Functions,  Up: Built-in
+
+Built-in Functions for Input/Output
+===================================
+
+   The following functions are related to Input/Output (I/O).  Optional
+parameters are enclosed in square brackets ("[" and "]").
+
+`close(FILENAME)'
+     Close the file FILENAME, for input or output.  The argument may
+     alternatively be a shell command that was used for redirecting to
+     or from a pipe; then the pipe is closed.  *Note Closing Input and
+     Output Files and Pipes: Close Files And Pipes, for more
+     information.
+
+`fflush([FILENAME])'
+     Flush any buffered output associated with FILENAME, which is
+     either a file opened for writing, or a shell command for
+     redirecting output to a pipe.
+
+     Many utility programs will "buffer" their output; they save
+     information to be written to a disk file or terminal in memory,
+     until there is enough for it to be worthwhile to send the data to
+     the output device.  This is often more efficient than writing every
+     little bit of information as soon as it is ready.  However,
+     sometimes it is necessary to force a program to "flush" its
+     buffers; that is, write the information to its destination, even
+     if a buffer is not full.  This is the purpose of the `fflush'
+     function; `gawk' too buffers its output, and the `fflush' function
+     can be used to force `gawk' to flush its buffers.
+
+     `fflush' is a recent (1994) addition to the Bell Labs research
+     version of `awk'; it is not part of the POSIX standard, and will
+     not be available if `--posix' has been specified on the command
+     line (*note Command Line Options: Options.).
+
+     `gawk' extends the `fflush' function in two ways.  The first is to
+     allow no argument at all. In this case, the buffer for the
+     standard output is flushed.  The second way is to allow the null
+     string (`""') as the argument. In this case, the buffers for _all_
+     open output files and pipes are flushed.
+
+     `fflush' returns zero if the buffer was successfully flushed, and
+     nonzero otherwise.
+
+`system(COMMAND)'
+     The `system' function allows the user to execute operating system
+     commands and then return to the `awk' program.  The `system'
+     function executes the command given by the string COMMAND.  It
+     returns, as its value, the status returned by the command that was
+     executed.
+
+     For example, if the following fragment of code is put in your `awk'
+     program:
+
+          END {
+               system("date | mail -s 'awk run done' root")
+          }
+
+     the system administrator will be sent mail when the `awk' program
+     finishes processing input and begins its end-of-input processing.
+
+     Note that redirecting `print' or `printf' into a pipe is often
+     enough to accomplish your task.  However, if your `awk' program is
+     interactive, `system' is useful for cranking up large
+     self-contained programs, such as a shell or an editor.
+
+     Some operating systems cannot implement the `system' function.
+     `system' causes a fatal error if it is not supported.
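+
+     The return value can be tested like any other number.  In this
+     sketch, the `test' command and the directory `/tmp' are only
+     illustrative choices, and a zero status is assumed to mean
+     success, as is conventional on Unix systems:
+
+          BEGIN {
+               if (system("test -d /tmp") == 0)
+                    print "/tmp is a directory"
+               else
+                    print "/tmp is missing or not a directory"
+          }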
+
+Interactive vs. Non-Interactive Buffering
+-----------------------------------------
+
+   As a side point, buffering issues can be even more confusing
+depending upon whether or not your program is "interactive", i.e.,
+communicating with a user sitting at a keyboard.(1)
+
+   Interactive programs generally "line buffer" their output; they
+write out every line.  Non-interactive programs wait until they have a
+full buffer, which may be many lines of output.
+
+   Here is an example of the difference.
+
+     $ awk '{ print $1 + $2 }'
+     1 1
+     -| 2
+     2 3
+     -| 5
+     Control-d
+
+Each line of output is printed immediately. Compare that behavior with
+this example.
+
+     $ awk '{ print $1 + $2 }' | cat
+     1 1
+     2 3
+     Control-d
+     -| 2
+     -| 5
+
+Here, no output is printed until after the `Control-d' is typed, since
+it is all buffered, and sent down the pipe to `cat' in one shot.
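+
+   As a sketch, adding a call to `fflush' after each `print' should
+restore the line-at-a-time behavior even when the output goes to a
+pipe.  This relies on the `gawk' extension of calling `fflush' with no
+argument, described above:
+
+     $ gawk '{ print $1 + $2; fflush() }' | cat
+     1 1
+     -| 2
+     2 3
+     -| 5
+     Control-d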
+
+Controlling Output Buffering with `system'
+------------------------------------------
+
+   The `fflush' function provides explicit control over output
+buffering for individual files and pipes.  However, its use is not
+portable to many other `awk' implementations.  An alternative method to
+flush output buffers is by calling `system' with a null string as its
+argument:
+
+     system("")   # flush output
+
+`gawk' treats this use of the `system' function as a special case, and
+is smart enough not to run a shell (or other command interpreter) with
+the empty command.  Therefore, with `gawk', this idiom is not only
+useful, it is efficient.  While this method should work with other
+`awk' implementations, it will not necessarily avoid starting an
+unnecessary shell.  (Other implementations may only flush the buffer
+associated with the standard output, and not necessarily all buffered
+output.)
+
+   If you think about what a programmer expects, it makes sense that
+`system' should flush any pending output.  The following program:
+
+     BEGIN {
+          print "first print"
+          system("echo system echo")
+          print "second print"
+     }
+
+must print
+
+     first print
+     system echo
+     second print
+
+and not
+
+     system echo
+     first print
+     second print
+
+   If `awk' did not flush its buffers before calling `system', the
+latter (undesirable) output is what you would see.
+
+   ---------- Footnotes ----------
+
+   (1) A program is interactive if the standard output is connected to
+a terminal device.
+
+
+File: gawk.info,  Node: Time Functions,  Prev: I/O Functions,  Up: Built-in
+
+Functions for Dealing with Time Stamps
+======================================
+
+   A common use for `awk' programs is the processing of log files
+containing time stamp information, indicating when a particular log
+record was written.  Many programs log their time stamp in the form
+returned by the `time' system call, which is the number of seconds
+since a particular epoch.  On POSIX systems, it is the number of
+seconds since Midnight, January 1, 1970, UTC.
+
+   In order to make it easier to process such log files, and to produce
+useful reports, `gawk' provides two functions for working with time
+stamps.  Both of these are `gawk' extensions; they are not specified in
+the POSIX standard, nor are they in any other known version of `awk'.
+
+   Optional parameters are enclosed in square brackets ("[" and "]").
+
+`systime()'
+     This function returns the current time as the number of seconds
+     since the system epoch.  On POSIX systems, this is the number of
+     seconds since Midnight, January 1, 1970, UTC.  It may be a
+     different number on other systems.
+
+`strftime([FORMAT [, TIMESTAMP]])'
+     This function returns a string.  It is similar to the function of
+     the same name in ANSI C.  The time specified by TIMESTAMP is used
+     to produce a string, based on the contents of the FORMAT string.
+     The TIMESTAMP is in the same format as the value returned by the
+     `systime' function.  If no TIMESTAMP argument is supplied, `gawk'
+     will use the current time of day as the time stamp.  If no FORMAT
+     argument is supplied, `strftime' uses `"%a %b %d %H:%M:%S %Z %Y"'.
+     This format string produces output (almost) equivalent to that of
+     the `date' utility.  (Versions of `gawk' prior to 3.0 require the
+     FORMAT argument.)
+
+   The `systime' function allows you to compare a time stamp from a log
+file with the current time of day.  In particular, it is easy to
+determine how long ago a particular record was logged.  It also allows
+you to produce log records using the "seconds since the epoch" format.
+
+   The `strftime' function allows you to easily turn a time stamp into
+human-readable information.  It is similar in nature to the `sprintf'
+function (*note Built-in Functions for String Manipulation: String
+Functions.), in that it copies non-format specification characters
+verbatim to the returned string, while substituting date and time
+values for format specifications in the FORMAT string.
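+
+   As a sketch of both functions together, the following program
+assumes that the first field of each input record is a time stamp in
+"seconds since the epoch" form; the output layout and the variable
+name `age' are made up for this illustration:
+
+     gawk '{
+          # age of the record, in seconds
+          age = systime() - $1
+          print strftime("%Y-%m-%d %H:%M:%S", $1), "(" age " seconds ago)"
+     }'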
+
+   `strftime' is guaranteed by the ANSI C standard to support the
+following date format specifications:
+
+`%a'
+     The locale's abbreviated weekday name.
+
+`%A'
+     The locale's full weekday name.
+
+`%b'
+     The locale's abbreviated month name.
+
+`%B'
+     The locale's full month name.
+
+`%c'
+     The locale's "appropriate" date and time representation.
+
+`%d'
+     The day of the month as a decimal number (01-31).
+
+`%H'
+     The hour (24-hour clock) as a decimal number (00-23).
+
+`%I'
+     The hour (12-hour clock) as a decimal number (01-12).
+
+`%j'
+     The day of the year as a decimal number (001-366).
+
+`%m'
+     The month as a decimal number (01-12).
+
+`%M'
+     The minute as a decimal number (00-59).
+
+`%p'
+     The locale's equivalent of the AM/PM designations associated with
+     a 12-hour clock.
+
+`%S'
+     The second as a decimal number (00-60).(1)
+
+`%U'
+     The week number of the year (the first Sunday as the first day of
+     week one) as a decimal number (00-53).
+
+`%w'
+     The weekday as a decimal number (0-6).  Sunday is day zero.
+
+`%W'
+     The week number of the year (the first Monday as the first day of
+     week one) as a decimal number (00-53).
+
+`%x'
+     The locale's "appropriate" date representation.
+
+`%X'
+     The locale's "appropriate" time representation.
+
+`%y'
+     The year without century as a decimal number (00-99).
+
+`%Y'
+     The year with century as a decimal number (e.g., 1995).
+
+`%Z'
+     The time zone name or abbreviation, or no characters if no time
+     zone is determinable.
+
+`%%'
+     A literal `%'.
+
+   If a conversion specifier is not one of the above, the behavior is
+undefined.(2)
+
+   Informally, a "locale" is the geographic place in which a program is
+meant to run.  For example, a common way to abbreviate the date
+September 4, 1991 in the United States would be "9/4/91".  In many
+countries in Europe, however, it would be abbreviated "4.9.91".  Thus,
+the `%x' specification in a `"US"' locale might produce `9/4/91', while
+in a `"EUROPE"' locale, it might produce `4.9.91'.  The ANSI C standard
+defines a default `"C"' locale, which is an environment that is typical
+of what most C programmers are used to.
+
+   A public-domain C version of `strftime' is supplied with `gawk' for
+systems that are not yet fully ANSI-compliant.  If that version is used
+to compile `gawk' (*note Installing `gawk': Installation.), then the
+following additional format specifications are available:
+
+`%D'
+     Equivalent to specifying `%m/%d/%y'.
+
+`%e'
+     The day of the month, padded with a space if it is only one digit.
+
+`%h'
+     Equivalent to `%b', above.
+
+`%n'
+     A newline character (ASCII LF).
+
+`%r'
+     Equivalent to specifying `%I:%M:%S %p'.
+
+`%R'
+     Equivalent to specifying `%H:%M'.
+
+`%T'
+     Equivalent to specifying `%H:%M:%S'.
+
+`%t'
+     A tab character.
+
+`%k'
+     The hour (24-hour clock) as a decimal number (0-23).  Single digit
+     numbers are padded with a space.
+
+`%l'
+     The hour (12-hour clock) as a decimal number (1-12).  Single digit
+     numbers are padded with a space.
+
+`%C'
+     The century, as a number between 00 and 99.
+
+`%u'
+     The weekday as a decimal number [1 (Monday)-7].
+
+`%V'
+     The week number of the year (the first Monday as the first day of
+     week one) as a decimal number (01-53).  The method for determining
+     the week number is as specified by ISO 8601 (to wit: if the week
+     containing January 1 has four or more days in the new year, then
+     it is week one, otherwise it is the last week (52 or 53) of the
+     previous year and the next week is week one).
+
+`%G'
+     The year with century of the ISO week number, as a decimal number.
+
+     For example, January 1, 1993, is in week 53 of 1992. Thus, the year
+     of its ISO week number is 1992, even though its year is 1993.
+     Similarly, December 31, 1973, is in week 1 of 1974. Thus, the year
+     of its ISO week number is 1974, even though its year is 1973.
+
+`%g'
+     The year without century of the ISO week number, as a decimal
+     number (00-99).
+
+`%Ec %EC %Ex %Ey %EY %Od %Oe %OH %OI'
+`%Om %OM %OS %Ou %OU %OV %Ow %OW %Oy'
+     These are "alternate representations" for the specifications that
+     use only the second letter (`%c', `%C', and so on).  They are
+     recognized, but their normal representations are used.(3) (These
+     facilitate compliance with the POSIX `date' utility.)
+
+`%v'
+     The date in VMS format (e.g., 20-JUN-1991).
+
+`%z'
+     The time zone offset in a +HHMM format (e.g., the format necessary
+     to produce RFC-822/RFC-1036 date headers).
+
+   This example is an `awk' implementation of the POSIX `date' utility.
+Normally, the `date' utility prints the current date and time of day
+in a well-known format.  However, if you provide an argument to it that
+begins with a `+', `date' will copy non-format specifier characters to
+the standard output, and will interpret the current time according to
+the format specifiers in the string.  For example:
+
+     $ date '+Today is %A, %B %d, %Y.'
+     -| Today is Thursday, July 11, 1991.
+
+   Here is the `gawk' version of the `date' utility.  It has a shell
+"wrapper", to handle the `-u' option, which requires that `date' run as
+if the time zone were set to UTC.
+
+     #! /bin/sh
+     #
+     # date --- approximate the P1003.2 'date' command
+     
+     case $1 in
+     -u)  TZ=GMT0     # use UTC
+          export TZ
+          shift ;;
+     esac
+     
+     gawk 'BEGIN  {
+         format = "%a %b %d %H:%M:%S %Z %Y"
+         exitval = 0
+     
+         if (ARGC > 2)
+             exitval = 1
+         else if (ARGC == 2) {
+             format = ARGV[1]
+             if (format ~ /^\+/)
+                 format = substr(format, 2)   # remove leading +
+         }
+         print strftime(format)
+         exit exitval
+     }' "$@"
+
+   ---------- Footnotes ----------
+
+   (1) Occasionally there are minutes in a year with a leap second,
+which is why the seconds can go up to 60.
+
+   (2) This is because ANSI C leaves the behavior of the C version of
+`strftime' undefined, and `gawk' will use the system's version of
+`strftime' if it's there.  Typically, the conversion specifier will
+either not appear in the returned string, or it will appear literally.
+
+   (3) If you don't understand any of this, don't worry about it; these
+facilities are meant to make it easier to "internationalize" programs.
+
+
+File: gawk.info,  Node: User-defined,  Next: Invoking Gawk,  Prev: Built-in,  Up: Top
+
+User-defined Functions
+**********************
+
+   Complicated `awk' programs can often be simplified by defining your
+own functions.  User-defined functions can be called just like built-in
+ones (*note Function Calls::), but it is up to you to define them--to
+tell `awk' what they should do.
+
+* Menu:
+
+* Definition Syntax::           How to write definitions and what they mean.
+* Function Example::            An example function definition and what it
+                                does.
+* Function Caveats::            Things to watch out for.
+* Return Statement::            Specifying the value a function returns.
+
+
+File: gawk.info,  Node: Definition Syntax,  Next: Function Example,  Prev: User-defined,  Up: User-defined
+
+Function Definition Syntax
+==========================
+
+   Definitions of functions can appear anywhere between the rules of an
+`awk' program.  Thus, the general form of an `awk' program is extended
+to include sequences of rules _and_ user-defined function definitions.
+There is no need in `awk' to put the definition of a function before
+all uses of the function.  This is because `awk' reads the entire
+program before starting to execute any of it.
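+
+   A small sketch of this point: the `BEGIN' rule below calls a
+function whose definition only appears later in the program text.  The
+function name `sq' and the shell variable `out' are made up for this
+illustration.

```shell
# The rule calls sq() before its definition appears in the program text.
out=$(awk 'BEGIN { print sq(7) }
function sq(x) { return x * x }')
echo "$out"
```

+   This prints `49', since the whole program is read before any rule
+runs.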
+
+   The definition of a function named NAME looks like this:
+
+     function NAME(PARAMETER-LIST)
+     {
+          BODY-OF-FUNCTION
+     }
+
+NAME is the name of the function to be defined.  A valid function name
+is like a valid variable name: a sequence of letters, digits and
+underscores, not starting with a digit.  Within a single `awk' program,
+any particular name can only be used as a variable, array or function.
+
+   PARAMETER-LIST is a list of the function's arguments and local
+variable names, separated by commas.  When the function is called, the
+argument names are used to hold the argument values given in the call.
+The local variables are initialized to the empty string.  A function
+cannot have two parameters with the same name.
+
+   The BODY-OF-FUNCTION consists of `awk' statements.  It is the most
+important part of the definition, because it says what the function
+should actually _do_.  The argument names exist to give the body a way
+to talk about the arguments; local variables, to give the body places
+to keep temporary values.
+
+   Argument names are not distinguished syntactically from local
+variable names; instead, the number of arguments supplied when the
+function is called determines how many argument variables there are.
+Thus, if three argument values are given, the first three names in
+PARAMETER-LIST are arguments, and the rest are local variables.
+
+   It follows that if the number of arguments is not the same in all
+calls to the function, some of the names in PARAMETER-LIST may be
+arguments on some occasions and local variables on others.  Another way
+to think of this is that omitted arguments default to the null string.
+
+   Usually when you write a function you know how many names you intend
+to use for arguments and how many you intend to use as local variables.
+It is conventional to place some extra space between the arguments and
+the local variables, to document how your function is supposed to be
+used.
+
+   During execution of the function body, the arguments and local
+variable values hide or "shadow" any variables of the same names used
+in the rest of the program.  The shadowed variables are not accessible
+in the function definition, because there is no way to name them while
+their names have been taken away for the local variables.  All other
+variables used in the `awk' program can be referenced or set normally
+in the function's body.
+
+   The arguments and local variables last only as long as the function
+body is executing.  Once the body finishes, you can once again access
+the variables that were shadowed while the function was running.
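+
+   The following sketch pulls these rules together.  The function
+`sum3' and its parameter names are hypothetical: `a', `b' and `c' are
+meant as arguments, while `total' is meant as a local variable.  The
+call omits `c', and a global variable that is also named `total' is
+shadowed only while the function runs.

```shell
out=$(awk '
# a, b and c are intended as arguments; total is a local variable,
# set off by extra space as the convention suggests.
function sum3(a, b, c,    total)
{
    total = a + b + c      # c is the null string (0 here) if omitted
    return total
}

BEGIN {
    total = "global"       # shadowed while sum3 runs, then visible again
    print sum3(1, 2), total
}')
echo "$out"
```

+   This prints `3 global': the omitted argument counts as zero, and the
+assignment inside the function leaves the global `total' untouched.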
+
+   The function body can contain expressions which call functions.  They
+can even call this function, either directly or by way of another
+function.  When this happens, we say the function is "recursive".
+
+   In many `awk' implementations, including `gawk', the keyword
+`function' may be abbreviated `func'.  However, POSIX only specifies
+the use of the keyword `function'.  This actually has some practical
+implications.  If `gawk' is in POSIX-compatibility mode (*note Command
+Line Options: Options.), then the following statement will _not_ define
+a function:
+
+     func foo() { a = sqrt($1) ; print a }
+
+Instead it defines a rule that, for each record, concatenates the value
+of the variable `func' with the return value of the function `foo'.  If
+the resulting string is non-null, the action is executed.  This is
+probably not what was desired.  (`awk' accepts this input as
+syntactically valid, since functions may be used before they are defined
+in `awk' programs.)
+
+   To ensure that your `awk' programs are portable, always use the
+keyword `function' when defining a function.
+
+
+File: gawk.info,  Node: Function Example,  Next: Function Caveats,  Prev: Definition Syntax,  Up: User-defined
+
+Function Definition Examples
+============================
+
+   Here is an example of a user-defined function, called `myprint', that
+takes a number and prints it in a specific format.
+
+     function myprint(num)
+     {
+          printf "%6.3g\n", num
+     }
+
+To illustrate, here is an `awk' rule which uses our `myprint' function:
+
+     $3 > 0     { myprint($3) }
+
+This program prints, in our special format, all the third fields that
+contain a positive number in our input.  Therefore, when given:
+
+      1.2   3.4    5.6   7.8
+      9.10 11.12 -13.14 15.16
+     17.18 19.20  21.22 23.24
+
+this program, using our function to format the results, prints:
+
+        5.6
+       21.2
+
+   This function deletes all the elements in an array.
+
+     function delarray(a,    i)
+     {
+         for (i in a)
+            delete a[i]
+     }
+
+   When working with arrays, it is often necessary to delete all the
+elements in an array and start over with a new list of elements (*note
+The `delete' Statement: Delete.).  Instead of having to repeat this
+loop everywhere in your program that you need to clear out an array,
+your program can just call `delarray'.
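+
+   As a rough usage sketch, the program below fills an array, counts
+its elements, clears it with `delarray', and counts again.  The helper
+function `count' and the array name `seen' are assumptions made for
+this illustration; they are not part of `gawk'.

```shell
out=$(awk '
function delarray(a,    i)
{
    for (i in a)
        delete a[i]
}

# count is a hypothetical helper: it returns the number of elements in a.
function count(a,    i, n)
{
    n = 0
    for (i in a)
        n++
    return n
}

BEGIN {
    seen["x"] = 1; seen["y"] = 2
    before = count(seen)
    delarray(seen)
    print before, count(seen)
}')
echo "$out"
```

+   This prints `2 0'.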
+
+   Here is an example of a recursive function.  It takes a string as an
+input parameter, and returns the string in backwards order.
+
+     function rev(str, start)
+     {
+         if (start == 0)
+             return ""
+     
+         return (substr(str, start, 1) rev(str, start - 1))
+     }
+
+   If this function is in a file named `rev.awk', we can test it this
+way:
+
+     $ echo "Don't Panic!" |
+     > gawk --source '{ print rev($0, length($0)) }' -f rev.awk
+     -| !cinaP t'noD
+
+   Here is an example that uses the built-in function `strftime'.
+(*Note Functions for Dealing with Time Stamps: Time Functions, for more
+information on `strftime'.)  The C `ctime' function takes a timestamp
+and returns it in a string, formatted in a well-known fashion.  Here is
+an `awk' version:
+
+     # ctime.awk
+     #
+     # awk version of C ctime(3) function
+     
+     function ctime(ts,    format)
+     {
+         format = "%a %b %d %H:%M:%S %Z %Y"
+         if (ts == 0)
+             ts = systime()       # use current time as default
+         return strftime(format, ts)
+     }
+
+
+File: gawk.info,  Node: Function Caveats,  Next: Return Statement,  Prev: Function Example,  Up: User-defined
+
+Calling User-defined Functions
+==============================
+
+   "Calling a function" means causing the function to run and do its
+job.  A function call is an expression, and its value is the value
+returned by the function.
+
+   A function call consists of the function name followed by the
+arguments in parentheses.  What you write in the call for the arguments
+are `awk' expressions; each time the call is executed, these
+expressions are evaluated, and the values are the actual arguments.  For
+example, here is a call to `foo' with three arguments (the first being
+a string concatenation):
+
+     foo(x y, "lose", 4 * z)
+
+   *Caution:* whitespace characters (spaces and tabs) are not allowed
+between the function name and the open-parenthesis of the argument list.
+If you write whitespace by mistake, `awk' might think that you mean to
+concatenate a variable with an expression in parentheses.  However, it
+notices that you used a function name and not a variable name, and
+reports an error.
+
+   When a function is called, it is given a _copy_ of the values of its
+arguments.  This is known as "call by value".  The caller may use a
+variable as the expression for the argument, but the called function
+does not know this: it only knows what value the argument had.  For
+example, if you write this code:
+
+     foo = "bar"
+     z = myfunc(foo)
+
+then you should not think of the argument to `myfunc' as being "the
+variable `foo'."  Instead, think of the argument as the string value,
+`"bar"'.
+
+   If the function `myfunc' alters the values of its local variables,
+this has no effect on any other variables.  Thus, if `myfunc' does this:
+
+     function myfunc(str)
+     {
+       print str
+       str = "zzz"
+       print str
+     }
+
+to change its first argument variable `str', this _does not_ change the
+value of `foo' in the caller.  The role of `foo' in calling `myfunc'
+ended when its value, `"bar"', was computed.  If `str' also exists
+outside of `myfunc', the function body cannot alter this outer value,
+because it is shadowed during the execution of `myfunc' and cannot be
+seen or changed from there.
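+
+   A complete program along these lines might look as follows.  It is
+only a sketch; this variant of `myfunc' returns its modified local copy
+so that both values can be printed side by side.

```shell
out=$(awk '
function myfunc(str)
{
    str = "zzz"        # changes only the local copy
    return str
}

BEGIN {
    foo = "bar"
    z = myfunc(foo)
    print foo, z       # foo still holds its original value
}')
echo "$out"
```

+   This prints `bar zzz'.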
+
+   However, when arrays are the parameters to functions, they are _not_
+copied.  Instead, the array itself is made available for direct
+manipulation by the function.  This is usually called "call by
+reference".  Changes made to an array parameter inside the body of a
+function _are_ visible outside that function.  This can be *very*
+dangerous if you do not watch what you are doing.  For example:
+
+     function changeit(array, ind, nvalue)
+     {
+          array[ind] = nvalue
+     }
+     
+     BEGIN {
+         a[1] = 1; a[2] = 2; a[3] = 3
+         changeit(a, 2, "two")
+         printf "a[1] = %s, a[2] = %s, a[3] = %s\n",
+                 a[1], a[2], a[3]
+     }
+
+This program prints `a[1] = 1, a[2] = two, a[3] = 3', because
+`changeit' stores `"two"' in the second element of `a'.
+
+   Some `awk' implementations allow you to call a function that has not
+been defined, and only report a problem at run-time when the program
+actually tries to call the function. For example:
+
+     BEGIN {
+         if (0)
+             foo()
+         else
+             bar()
+     }
+     function bar() { ... }
+     # note that `foo' is not defined
+
+Since the `if' statement will never be true, it is not really a problem
+that `foo' has not been defined.  Usually though, it is a problem if a
+program calls an undefined function.
+
+   If `--lint' has been specified (*note Command Line Options:
+Options.), `gawk' will report calls to undefined functions.
+
+   Some `awk' implementations generate a run-time error if you use the
+`next' statement (*note The `next' Statement: Next Statement.)  inside
+a user-defined function.  `gawk' does not have this problem.
+
+
+File: gawk.info,  Node: Return Statement,  Prev: Function Caveats,  Up: User-defined
+
+The `return' Statement
+======================
+
+   The body of a user-defined function can contain a `return' statement.
+This statement returns control to the rest of the `awk' program.  It
+can also be used to return a value for use in the rest of the `awk'
+program.  It looks like this:
+
+     return [EXPRESSION]
+
+   The EXPRESSION part is optional.  If it is omitted, then the returned
+value is undefined and, therefore, unpredictable.
+
+   A `return' statement with no value expression is assumed at the end
+of every function definition.  So if control reaches the end of the
+function body, then the function returns an unpredictable value.  `awk'
+will _not_ warn you if you use the return value of such a function.
+
+   Sometimes, you want to write a function for what it does, not for
+what it returns.  Such a function corresponds to a `void' function in C
+or to a `procedure' in Pascal.  Thus, it may be appropriate to not
+return any value; you should simply bear in mind that if you use the
+return value of such a function, you do so at your own risk.
+
+   Here is an example of a user-defined function that returns a value
+for the largest number among the elements of an array:
+
+     function maxelt(vec,   i, ret)
+     {
+          for (i in vec) {
+               if (ret == "" || vec[i] > ret)
+                    ret = vec[i]
+          }
+          return ret
+     }
+
+You call `maxelt' with one argument, which is an array name.  The local
+variables `i' and `ret' are not intended to be arguments; while there
+is nothing to stop you from passing two or three arguments to `maxelt',
+the results would be strange.  The extra space before `i' in the
+function parameter list indicates that `i' and `ret' are not supposed
+to be arguments.  This is a convention that you should follow when you
+define functions.
+
+   Here is a program that uses our `maxelt' function.  It loads an
+array, calls `maxelt', and then reports the maximum number in that
+array:
+
+     awk '
+     function maxelt(vec,   i, ret)
+     {
+          for (i in vec) {
+               if (ret == "" || vec[i] > ret)
+                    ret = vec[i]
+          }
+          return ret
+     }
+     
+     # Load all fields of each record into nums.
+     {
+          for(i = 1; i <= NF; i++)
+               nums[NR, i] = $i
+     }
+     
+     END {
+          print maxelt(nums)
+     }'
+
+   Given the following input:
+
+      1 5 23 8 16
+     44 3 5 2 8 26
+     256 291 1396 2962 100
+     -6 467 998 1101
+     99385 11 0 225
+
+our program tells us (predictably) that `99385' is the largest number
+in our array.
+
+
+File: gawk.info,  Node: Invoking Gawk,  Next: Library Functions,  Prev: User-defined,  Up: Top
+
+Running `awk'
+*************
+
+   There are two ways to run `awk': with an explicit program, or with
+one or more program files.  Here are templates for both of them; items
+enclosed in `[...]' in these templates are optional.
+
+   Besides traditional one-letter POSIX-style options, `gawk' also
+supports GNU long options.
+
+     awk [OPTIONS] -f progfile [`--'] FILE ...
+     awk [OPTIONS] [`--'] 'PROGRAM' FILE ...
+
+   It is possible to invoke `awk' with an empty program:
+
+     $ awk '' datafile1 datafile2
+
+Doing so makes little sense though; `awk' will simply exit silently
+when given an empty program (d.c.).  If `--lint' has been specified on
+the command line, `gawk' will issue a warning that the program is empty.
+
+* Menu:
+
+* Options::                     Command line options and their meanings.
+* Other Arguments::             Input file names and variable assignments.
+* AWKPATH Variable::            Searching directories for `awk' programs.
+* Obsolete::                    Obsolete Options and/or features.
+* Undocumented::                Undocumented Options and Features.
+* Known Bugs::                  Known Bugs in `gawk'.
+
+
+File: gawk.info,  Node: Options,  Next: Other Arguments,  Prev: Invoking Gawk,  Up: Invoking Gawk
+
+Command Line Options
+====================
+
+   Options begin with a dash, and consist of a single character.  GNU
+style long options consist of two dashes and a keyword.  The keyword
+can be abbreviated, as long as the abbreviation allows the option to be
+uniquely identified.  If the option takes an argument, then the keyword
+is either immediately followed by an equals sign (`=') and the
+argument's value, or the keyword and the argument's value are separated
+by whitespace.  For brevity, the discussion below only refers to the
+traditional short options; however the long and short options are
+interchangeable in all contexts.
+
+   Each long option for `gawk' has a corresponding POSIX-style option.
+The options and their meanings are as follows:
+
+`-F FS'
+`--field-separator FS'
+     Sets the `FS' variable to FS (*note Specifying How Fields are
+     Separated: Field Separators.).
+
+`-f SOURCE-FILE'
+`--file SOURCE-FILE'
+     Indicates that the `awk' program is to be found in SOURCE-FILE
+     instead of in the first non-option argument.
+
+`-v VAR=VAL'
+`--assign VAR=VAL'
+     Sets the variable VAR to the value VAL *before* execution of the
+     program begins.  Such variable values are available inside the
+     `BEGIN' rule (*note Other Command Line Arguments: Other
+     Arguments.).
+
+     The `-v' option can only set one variable, but you can use it more
+     than once, setting another variable each time, like this: `awk
+     -v foo=1 -v bar=2 ...'.
+
+`-mf NNN'
+`-mr NNN'
+     Set various memory limits to the value NNN.  The `f' flag sets the
+     maximum number of fields, and the `r' flag sets the maximum record
+     size.  These two flags and the `-m' option are from the Bell Labs
+     research version of Unix `awk'.  They are provided for
+     compatibility, but otherwise ignored by `gawk', since `gawk' has
+     no predefined limits.
+
+`-W GAWK-OPT'
+     Following the POSIX standard, options that are implementation
+     specific are supplied as arguments to the `-W' option.  These
+     options also have corresponding GNU style long options.  See below.
+
+`--'
+     Signals the end of the command line options.  The following
+     arguments are not treated as options even if they begin with `-'.
+     This interpretation of `--' follows the POSIX argument parsing
+     conventions.
+
+     This is useful if you have file names that start with `-', or in
+     shell scripts, if you have file names that will be specified by
+     the user which could start with `-'.
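+
+   As a brief sketch, the following command combines three of the
+options described above.  The input line and the variable name `n' are
+made up for this example.

```shell
# -F sets FS to a colon, -v sets n before the program starts,
# and -- marks the end of the options.
out=$(printf 'a:b:c\n' | awk -F: -v n=2 -- '{ print $n }')
echo "$out"
```

+   This prints `b', the second colon-separated field.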
+
+   The following `gawk'-specific options are available:
+
+`-W traditional'
+`-W compat'
+`--traditional'
+`--compat'
+     Specifies "compatibility mode", in which the GNU extensions to the
+     `awk' language are disabled, so that `gawk' behaves just like the
+     Bell Labs research version of Unix `awk'.  `--traditional' is the
+     preferred form of this option.  *Note Extensions in `gawk' Not in
+     POSIX `awk': POSIX/GNU, which summarizes the extensions.  Also see
+     *Note Downward Compatibility and Debugging: Compatibility Mode.
+
+`-W copyleft'
+`-W copyright'
+`--copyleft'
+`--copyright'
+     Print the short version of the General Public License, and then
+     exit.  This option may disappear in a future version of `gawk'.
+
+`-W help'
+`-W usage'
+`--help'
+`--usage'
+     Print a "usage" message summarizing the short and long style
+     options that `gawk' accepts, and then exit.
+
+`-W lint'
+`--lint'
+     Warn about constructs that are dubious or non-portable to other
+     `awk' implementations.  Some warnings are issued when `gawk' first
+     reads your program.  Others are issued at run-time, as your
+     program executes.
+
+`-W lint-old'
+`--lint-old'
+     Warn about constructs that are not available in the original
+     Version 7 Unix version of `awk' (*note Major Changes between V7
+     and SVR3.1: V7/SVR3.1.).
+
+`-W posix'
+`--posix'
+     Operate in strict POSIX mode.  This disables all `gawk' extensions
+     (just like `--traditional'), and adds the following additional
+     restrictions:
+
+        * `\x' escape sequences are not recognized (*note Escape
+          Sequences::).
+
+        * Newlines do not act as whitespace to separate fields when
+          `FS' is equal to a single space.
+
+        * The synonym `func' for the keyword `function' is not
+          recognized (*note Function Definition Syntax: Definition
+          Syntax.).
+
+        * The operators `**' and `**=' cannot be used in place of `^'
+          and `^=' (*note Arithmetic Operators: Arithmetic Ops., and
+          also *note Assignment Expressions: Assignment Ops.).
+
+        * Specifying `-Ft' on the command line does not set the value
+          of `FS' to be a single tab character (*note Specifying How
+          Fields are Separated: Field Separators.).
+
+        * The `fflush' built-in function is not supported (*note
+          Built-in Functions for Input/Output: I/O Functions.).
+
+     If you supply both `--traditional' and `--posix' on the command
+     line, `--posix' will take precedence. `gawk' will also issue a
+     warning if both options are supplied.
+
+`-W re-interval'
+`--re-interval'
+     Allow interval expressions (*note Regular Expression Operators:
+     Regexp Operators.) in regexps.  Because interval expressions were
+     traditionally not available in `awk', `gawk' does not provide them
+     by default. This prevents old `awk' programs from breaking.
+
+`-W source PROGRAM-TEXT'
+`--source PROGRAM-TEXT'
+     Program source code is taken from the PROGRAM-TEXT.  This option
+     allows you to mix source code in files with source code that you
+     enter on the command line. This is particularly useful when you
+     have library functions that you wish to use from your command line
+     programs (*note The `AWKPATH' Environment Variable: AWKPATH
+     Variable.).
+
+`-W version'
+`--version'
+     Prints version information for this particular copy of `gawk'.
+     This allows you to determine if your copy of `gawk' is up to date
+     with respect to whatever the Free Software Foundation is currently
+     distributing.  It is also useful for bug reports (*note Reporting
+     Problems and Bugs: Bugs.).
+
+   Any other options are flagged as invalid with a warning message, but
+are otherwise ignored.
+
+   In compatibility mode, as a special case, if the value of FS supplied
+to the `-F' option is `t', then `FS' is set to the tab character
+(`"\t"').  This is only true for `--traditional', and not for `--posix'
+(*note Specifying How Fields are Separated: Field Separators.).
+
+   The `-f' option may be used more than once on the command line.  If
+it is, `awk' reads its program source from all of the named files, as
+if they had been concatenated together into one big file.  This is
+useful for creating libraries of `awk' functions.  Useful functions can
+be written once, and then retrieved from a standard place, instead of
+having to be included into each individual program.
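+
+   Here is a small sketch of a library file used this way.  The file
+names `lib.awk' and `main.awk' and the function `twice' are
+hypothetical, and the fragment assumes that the `mktemp' utility is
+available for creating a scratch directory.

```shell
dir=$(mktemp -d)       # scratch directory; assumes mktemp is available

cat > "$dir/lib.awk" <<'EOF'
function twice(x) { return 2 * x }
EOF

cat > "$dir/main.awk" <<'EOF'
BEGIN { print twice(21) }
EOF

# awk reads both files as if they were one program.
out=$(awk -f "$dir/lib.awk" -f "$dir/main.awk")
echo "$out"
rm -rf "$dir"
```

+   This prints `42'.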
+
+   You can type in a program at the terminal and still use library
+functions, by specifying `-f /dev/tty'.  `awk' will read a file from
+the terminal to use as part of the `awk' program.  After typing your
+program, type `Control-d' (the end-of-file character) to terminate it.
+(You may also use `-f -' to read program source from the standard
+input, but then you will not be able to also use the standard input as a
+source of data.)
+
+   Because it is clumsy to use the standard `awk' mechanisms to mix
+source file and command line `awk' programs, `gawk' provides the
+`--source' option.  This does not require you to pre-empt the standard
+input for your source code, and allows you to easily mix command line
+and library source code (*note The `AWKPATH' Environment Variable:
+AWKPATH Variable.).
+
+   If no `-f' or `--source' option is specified, then `gawk' will use
+the first non-option command line argument as the text of the program
+source code.
+
+   If the environment variable `POSIXLY_CORRECT' exists, then `gawk'
+will behave in strict POSIX mode, exactly as if you had supplied the
+`--posix' command line option.  Many GNU programs look for this
+environment variable to turn on strict POSIX mode. If you supply
+`--lint' on the command line, and `gawk' turns on POSIX mode because of
+`POSIXLY_CORRECT', then it will print a warning message indicating that
+POSIX mode is in effect.
+
+   You would typically set this variable in your shell's startup file.
+For a Bourne compatible shell (such as Bash), you would add these lines
+to the `.profile' file in your home directory:
+
+     POSIXLY_CORRECT=true
+     export POSIXLY_CORRECT
+
+   For a `csh' compatible shell,(1) you would add this line to the
+`.login' file in your home directory:
+
+     setenv POSIXLY_CORRECT true
+
+   ---------- Footnotes ----------
+
+   (1) Not recommended.
+
+
+File: gawk.info,  Node: Other Arguments,  Next: AWKPATH Variable,  Prev: Options,  Up: Invoking Gawk
+
+Other Command Line Arguments
+============================
+
+   Any additional arguments on the command line are normally treated as
+input files to be processed in the order specified.  However, an
+argument that has the form `VAR=VALUE' assigns the value VALUE to the
+variable VAR--it does not specify a file at all.
+
+   All these arguments are made available to your `awk' program in the
+`ARGV' array (*note Built-in Variables::).  Command line options and
+the program text (if present) are omitted from `ARGV'.  All other
+arguments, including variable assignments, are included.  As each
+element of `ARGV' is processed, `gawk' sets the variable `ARGIND' to
+the index in `ARGV' of the current element.
+
+   The distinction between file name arguments and variable-assignment
+arguments is made when `awk' is about to open the next input file.  At
+that point in execution, it checks the "file name" to see whether it is
+really a variable assignment; if so, `awk' sets the variable instead of
+reading a file.
+
+   Therefore, the variables actually receive the given values after all
+previously specified files have been read.  In particular, the values of
+variables assigned in this fashion are _not_ available inside a `BEGIN'
+rule (*note The `BEGIN' and `END' Special Patterns: BEGIN/END.), since
+such rules are run before `awk' begins scanning the argument list.
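+
+   As a minimal sketch of this timing (the scratch file and the
+variable `n' are assumptions made up for the illustration), the
+following shell commands read the same file twice and assign `n'
+before each occurrence.  The `BEGIN' rule should see an empty `n',
+while the main rule sees each value in turn.
+

```shell
# Sketch: show when command line assignments take effect.
data=$(mktemp)                 # scratch input file (hypothetical data)
printf 'a\nb\n' > "$data"

assign_out=$(awk '
BEGIN { print "BEGIN sees n=[" n "]" }   # n is not assigned yet
      { print n, $0 }                    # n changes between the two passes
' n=1 "$data" n=2 "$data")

printf '%s\n' "$assign_out"
rm -f "$data"
```

+
+   With a POSIX-conforming `awk', this should print `BEGIN sees n=[]'
+followed by `1 a', `1 b', `2 a', and `2 b'.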
+
+   The variable values given on the command line are processed for
+escape sequences (d.c.) (*note Escape Sequences::).
+
+   In some earlier implementations of `awk', when a variable assignment
+occurred before any file names, the assignment would happen _before_
+the `BEGIN' rule was executed.  `awk''s behavior was thus inconsistent;
+some command line assignments were available inside the `BEGIN' rule,
+while others were not.  However, some applications came to depend upon
+this "feature."  When `awk' was changed to be more consistent, the `-v'
+option was added to accommodate applications that depended upon the old
+behavior.
+
+   The variable assignment feature is most useful for assigning to
+variables such as `RS', `OFS', and `ORS', which control input and
+output formats, before scanning the data files.  It is also useful for
+controlling state if multiple passes are needed over a data file.  For
+example:
+
+     awk 'pass == 1  { PASS 1 STUFF }
+          pass == 2  { PASS 2 STUFF }' pass=1 mydata pass=2 mydata
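+
+   A concrete instance of this two-pass pattern might look like the
+following sketch.  The data values and the variable name `total' are
+assumptions chosen for the illustration: the first pass sums the
+first field, and the second pass prints each value as a percentage of
+that sum.
+

```shell
# Sketch: two passes over one file, switched by a command line variable.
data=$(mktemp)                 # scratch input file (hypothetical data)
printf '10\n30\n60\n' > "$data"

two_pass_out=$(awk '
pass == 1 { total += $1 }                               # pass 1: add up
pass == 2 { printf "%d %d%%\n", $1, 100 * $1 / total }  # pass 2: report
' pass=1 "$data" pass=2 "$data")

printf '%s\n' "$two_pass_out"
rm -f "$data"
```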
+
+   Given the variable assignment feature, the `-F' option for setting
+the value of `FS' is not strictly necessary.  It remains for historical
+compatibility.
+
+
+File: gawk.info,  Node: AWKPATH Variable,  Next: Obsolete,  Prev: Other Arguments,  Up: Invoking Gawk
+
+The `AWKPATH' Environment Variable
+==================================
+
+   The previous section described how `awk' program files can be named
+on the command line with the `-f' option.  In most `awk'
+implementations, you must supply a precise path name for each program
+file, unless the file is in the current directory.
+
+   But in `gawk', if the file name supplied to the `-f' option does not
+contain a `/', then `gawk' searches a list of directories (called the
+"search path"), one by one, looking for a file with the specified name.
+
+   The search path is a string consisting of directory names separated
+by colons.  `gawk' gets its search path from the `AWKPATH' environment
+variable.  If that variable does not exist, `gawk' uses a default path,
+which is `.:/usr/local/share/awk'.(1) (Programs written for use by
+system administrators should use an `AWKPATH' variable that does not
+include the current directory, `.'.)
+
+   The search path feature is particularly useful for building up
+libraries of useful `awk' functions.  The library files can be placed
+in a standard directory that is in the default path, and then specified
+on the command line with a short file name.  Otherwise, the full file
+name would have to be typed for each file.
+
+   By using both the `--source' and `-f' options, your command line
+`awk' programs can use facilities in `awk' library files.  *Note A
+Library of `awk' Functions: Library Functions.
+
+   Path searching is not done if `gawk' is in compatibility mode.  This
+is true for both `--traditional' and `--posix'.  *Note Command Line
+Options: Options.
+
+   *Note:* if you want files in the current directory to be found, you
+must include the current directory in the path, either by including `.'
+explicitly in the path, or by writing a null entry in the path.  (A
+null entry is indicated by starting or ending the path with a colon, or
+by placing two colons next to each other (`::').)  If the current
+directory is not included in the path, then files cannot be found in
+the current directory.  This path search mechanism is identical to the
+shell's.
+
+   Starting with version 3.0, if `AWKPATH' is not defined in the
+environment, `gawk' will place its default search path into
+`ENVIRON["AWKPATH"]'. This makes it easy to determine the actual search
+path `gawk' will use.
+
+   ---------- Footnotes ----------
+
+   (1) Your version of `gawk' may use a directory that is different
+than `/usr/local/share/awk'; it will depend upon how `gawk' was built
+and installed. The actual directory will be the value of `$(datadir)'
+generated when `gawk' was configured.  You probably don't need to worry
+about this though.
+
+
+File: gawk.info,  Node: Obsolete,  Next: Undocumented,  Prev: AWKPATH Variable,  Up: Invoking Gawk
+
+Obsolete Options and/or Features
+================================
+
+   This section describes features and/or command line options from
+previous releases of `gawk' that are either not available in the
+current version, or that are still supported but deprecated (meaning
+that they will _not_ be in the next release).
+
+   For version 3.0.1 of `gawk', there are no command line options or
+other deprecated features from the previous version of `gawk'.  This
+node is thus essentially a place holder, in case some option becomes
+obsolete in a future version of `gawk'.
+
+
+File: gawk.info,  Node: Undocumented,  Next: Known Bugs,  Prev: Obsolete,  Up: Invoking Gawk
+
+Undocumented Options and Features
+=================================
+
+   This section intentionally left blank.
+
+
+File: gawk.info,  Node: Known Bugs,  Prev: Undocumented,  Up: Invoking Gawk
+
+Known Bugs in `gawk'
+====================
+
+   * The `-F' option for changing the value of `FS' (*note Command Line
+     Options: Options.)  is not necessary given the command line
+     variable assignment feature; it remains only for backwards
+     compatibility.
+
+   * If your system actually has support for `/dev/fd' and the
+     associated `/dev/stdin', `/dev/stdout', and `/dev/stderr' files,
+     you may get different output from `gawk' than you would get on a
+     system without those files.  When `gawk' interprets these files
+     internally, it synchronizes output to the standard output with
+     output to `/dev/stdout', while on a system with those files, the
+     output is actually to different open files (*note Special File
+     Names in `gawk': Special Files.).
+
+   * Syntactically invalid single character programs tend to overflow
+     the parse stack, generating a rather unhelpful message.  Such
+     programs are surprisingly difficult to diagnose in the completely
+     general case, and the effort to do so really is not worth it.
+
+
+File: gawk.info,  Node: Library Functions,  Next: Sample Programs,  Prev: Invoking Gawk,  Up: Top
+
+A Library of `awk' Functions
+****************************
+
+   This chapter presents a library of useful `awk' functions.  The
+sample programs presented later (*note Practical `awk' Programs: Sample
+Programs.)  use these functions.  The functions are presented here in a
+progression from simple to complex.
+
+   *Note Extracting Programs from Texinfo Source Files: Extract Program,
+presents a program that you can use to extract the source code for
+these example library functions and programs from the Texinfo source
+for this Info file.  (This has already been done as part of the `gawk'
+distribution.)
+
+   If you have written one or more useful, general purpose `awk'
+functions, and would like to contribute them for a subsequent edition
+of this Info file, please contact the author.  *Note Reporting Problems
+and Bugs: Bugs, for information on doing this.  Don't just send code,
+as you will be required to either place your code in the public domain,
+publish it under the GPL (*note GNU GENERAL PUBLIC LICENSE: Copying.),
+or assign the copyright in it to the Free Software Foundation.
+
+* Menu:
+
+* Portability Notes::           What to do if you don't have `gawk'.
+* Nextfile Function::           Two implementations of a `nextfile'
+                                function.
+* Assert Function::             A function for assertions in `awk'
+                                programs.
+* Round Function::              A function for rounding if `sprintf' does
+                                not do it correctly.
+* Ordinal Functions::           Functions for using characters as numbers and
+                                vice versa.
+* Join Function::               A function to join an array into a string.
+* Mktime Function::             A function to turn a date into a timestamp.
+* Gettimeofday Function::       A function to get formatted times.
+* Filetrans Function::          A function for handling data file transitions.
+* Getopt Function::             A function for processing command line
+                                arguments.
+* Passwd Functions::            Functions for getting user information.
+* Group Functions::             Functions for getting group information.
+* Library Names::               How to best name private global variables in
+                                library functions.
+
+
+File: gawk.info,  Node: Portability Notes,  Next: Nextfile Function,  Prev: Library Functions,  Up: Library Functions
+
+Simulating `gawk'-specific Features
+===================================
+
+   The programs in this chapter and in *Note Practical `awk' Programs:
+Sample Programs, freely use features that are specific to `gawk'.  This
+section briefly discusses how you can rewrite these programs for
+different implementations of `awk'.
+
+   Diagnostic error messages are sent to `/dev/stderr'.  Use `| "cat
+1>&2"' instead of `> "/dev/stderr"', if your system does not have a
+`/dev/stderr', or if you cannot use `gawk'.
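+
+   The substitution can be tried from the shell with a sketch such as
+the following.  The message text is made up, and the `close' call is
+an extra precaution so that the diagnostic is flushed before any later
+output; only standard error is captured here.
+

```shell
# Sketch: send a diagnostic to standard error without using /dev/stderr.
# The redirections capture stderr and discard stdout.
stderr_out=$(awk 'BEGIN {
    print "warning: something looks wrong" | "cat 1>&2"
    close("cat 1>&2")          # flush the pipe before continuing
    print "normal output"
}' 2>&1 >/dev/null)

printf '%s\n' "$stderr_out"
```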
+
+   A number of programs use `nextfile' (*note The `nextfile' Statement:
+Nextfile Statement.), to skip any remaining input in the input file.
+*Note Implementing `nextfile' as a Function: Nextfile Function, shows
+you how to write a function that will do the same thing.
+
+   Finally, some of the programs choose to ignore upper-case and
+lower-case distinctions in their input. They do this by assigning one
+to `IGNORECASE'.  You can achieve the same effect by adding the
+following rule to the beginning of the program:
+
+     # ignore case
+     { $0 = tolower($0) }
+
+Also, verify that all regexp and string constants used in comparisons
+only use lower-case letters.
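+
+   For instance, the following sketch counts records that mention
+`error' in any mixture of case.  The input lines and the counter `n'
+are assumptions made up for the illustration.
+

```shell
# Sketch: case-insensitive matching without IGNORECASE.
case_out=$(printf 'Error: disk\nERROR: net\nall fine\nerror: cpu\n' |
    awk '{ $0 = tolower($0) }     # fold the record to lower case first
         /error/ { n++ }          # the regexp constant is all lower case
         END { print n + 0 }')

printf '%s\n' "$case_out"
```

+
+   Note that this rule changes `$0' itself, so any output taken from
+the record is in lower case as well.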
+
+
+File: gawk.info,  Node: Nextfile Function,  Next: Assert Function,  Prev: Portability Notes,  Up: Library Functions
+
+Implementing `nextfile' as a Function
+=====================================
+
+   The `nextfile' statement presented in *Note The `nextfile'
+Statement: Nextfile Statement, is a `gawk'-specific extension.  It is
+not available in other implementations of `awk'.  This section shows
+two versions of a `nextfile' function that you can use to simulate
+`gawk''s `nextfile' statement if you cannot use `gawk'.
+
+   Here is a first attempt at writing a `nextfile' function.
+
+     # nextfile --- skip remaining records in current file
+     
+     # this should be read in before the "main" awk program
+     
+     function nextfile()    { _abandon_ = FILENAME; next }
+     
+     _abandon_ == FILENAME  { next }
+
+   This file should be included before the main program, because it
+supplies a rule that must be executed first.  This rule compares the
+current data file's name (which is always in the `FILENAME' variable)
+to a private variable named `_abandon_'.  If the file name matches,
+then the action part of the rule executes a `next' statement, to go on
+to the next record.  (The use of `_' in the variable name is a
+convention.  It is discussed more fully in *Note Naming Library
+Function Global Variables: Library Names.)
+
+   The use of the `next' statement effectively creates a loop that reads
+all the records from the current data file.  Eventually, the end of the
+file is reached, and a new data file is opened, changing the value of
+`FILENAME'.  Once this happens, the comparison of `_abandon_' to
+`FILENAME' fails, and execution continues with the first rule of the
+"real" program.
+
+   The `nextfile' function itself simply sets the value of `_abandon_'
+and then executes a `next' statement to start the loop going.(1)
+
+   This initial version has a subtle problem.  What happens if the same
+data file is listed _twice_ on the command line, one right after the
+other, or even with just a variable assignment between the two
+occurrences of the file name?
+
+   In such a case, this code will skip right through the file, a second
+time, even though it should stop when it gets to the end of the first
+occurrence.  Here is a second version of `nextfile' that remedies this
+problem.
+
+     # nextfile --- skip remaining records in current file
+     # correctly handle successive occurrences of the same file
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May, 1993
+     
+     # this should be read in before the "main" awk program
+     
+     function nextfile()   { _abandon_ = FILENAME; next }
+     
+     _abandon_ == FILENAME {
+           if (FNR == 1)
+               _abandon_ = ""
+           else
+               next
+     }
+
+   The `nextfile' function has not changed.  It sets `_abandon_' equal
+to the current file name and then executes a `next' statement.  The
+`next' statement reads the next record and increments `FNR', so `FNR'
+is guaranteed to have a value of at least two.  However, if `nextfile'
+is called for the last record in the file, then `awk' will close the
+current data file and move on to the next one.  Upon doing so,
+`FILENAME' will be set to the name of the new file, and `FNR' will be
+reset to one.  If this next file is the same as the previous one,
+`_abandon_' will still be equal to `FILENAME'.  However, `FNR' will be
+equal to one, telling us that this is a new occurrence of the file, and
+not the one we were reading when the `nextfile' function was executed.
+In that case, `_abandon_' is reset to the empty string, so that further
+executions of this rule will fail (until the next time that `nextfile'
+is called).
+
+   If `FNR' is not one, then we are still in the original data file,
+and the program executes a `next' statement to skip through it.
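+
+   The following sketch exercises the second version on a file that is
+named twice in a row.  It departs from the code above in two ways,
+both of which are assumptions of the sketch: the function is called
+`skipfile' (a hypothetical name, since some current `awk'
+implementations treat `nextfile' as a reserved word), and the `next'
+statement is executed by the calling rule instead of inside the
+function body, for versions of `awk' that do not allow `next' there.
+

```shell
# Sketch: skip the rest of a file, even if the same file comes next.
data=$(mktemp)                 # scratch input file (hypothetical data)
printf 'a\nb\nc\n' > "$data"

skip_out=$(awk '
# skipfile is a hypothetical name; it only records the file to abandon
function skipfile()   { _abandon_ = FILENAME }

_abandon_ == FILENAME {
    if (FNR == 1)
        _abandon_ = ""         # same name, but a new occurrence
    else
        next
}

FNR == 2 { skipfile(); next }  # the caller, not the function, runs next
         { print FNR ": " $0 }
' "$data" "$data")

printf '%s\n' "$skip_out"
rm -f "$data"
```

+
+   Each occurrence of the file should contribute only its first
+record, so the output is `1: a' printed twice.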
+
+   An important question to ask at this point is: "Given that the
+functionality of `nextfile' can be provided with a library file, why is
+it built into `gawk'?"  The question matters because adding features
+for little reason leads to larger, slower programs that are harder to
+maintain.
+
+   The answer is that building `nextfile' into `gawk' provides
+significant gains in efficiency.  If the `nextfile' function is executed
+at the beginning of a large data file, `awk' still has to scan the
+entire file, splitting it up into records, just to skip over it.  The
+built-in `nextfile' can simply close the file immediately and proceed
+to the next one, saving a lot of time.  This is particularly important
+in `awk', since `awk' programs are generally I/O bound (i.e., they
+spend most of their time doing input and output, instead of performing
+computations).
+
+   ---------- Footnotes ----------
+
+   (1) Some implementations of `awk' do not allow you to execute `next'
+from within a function body. Some other work-around will be necessary
+if you use such a version.
+
+
+File: gawk.info,  Node: Assert Function,  Next: Round Function,  Prev: Nextfile Function,  Up: Library Functions
+
+Assertions
+==========
+
+   When writing large programs, it is often useful to be able to know
+that a condition or set of conditions is true.  Before proceeding with a
+particular computation, you make a statement about what you believe to
+be the case.  Such a statement is known as an "assertion."  The C
+language provides an `<assert.h>' header file and corresponding
+`assert' macro that the programmer can use to make assertions.  If an
+assertion fails, the `assert' macro arranges to print a diagnostic
+message describing the condition that should have been true but was
+not, and then it kills the program.  In C, using `assert' looks like this:
+
+     #include <assert.h>
+     
+     int myfunc(int a, double b)
+     {
+          assert(a <= 5 && b >= 17);
+          ...
+     }
+
+   If the assertion failed, the program would print a message similar to
+this:
+
+     prog.c:5: assertion failed: a <= 5 && b >= 17
+
+   The ANSI C language makes it possible to turn the condition into a
+string for use in printing the diagnostic message.  This is not
+possible in `awk', so this `assert' function also requires a string
+version of the condition that is being tested.
+
+     # assert --- assert that a condition is true. Otherwise exit.
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May, 1993
+     
+     function assert(condition, string)
+     {
+         if (! condition) {
+             printf("%s:%d: assertion failed: %s\n",
+                 FILENAME, FNR, string) > "/dev/stderr"
+             _assert_exit = 1
+             exit 1
+         }
+     }
+     
+     END {
+         if (_assert_exit)
+             exit 1
+     }
+
+   The `assert' function tests the `condition' parameter. If it is
+false, it prints a message to standard error, using the `string'
+parameter to describe the failed condition.  It then sets the variable
+`_assert_exit' to one, and executes the `exit' statement.  The `exit'
+statement jumps to the `END' rule. If the `END' rule finds
+`_assert_exit' to be true, then it exits immediately.
+
+   The purpose of the `END' rule with its test is to keep any other
+`END' rules from running.  When an assertion fails, the program should
+exit immediately.  If no assertions fail, then `_assert_exit' will
+still be false when the `END' rule is run normally, and the rest of the
+program's `END' rules will execute.  For all of this to work correctly,
+`assert.awk' must be the first source file read by `awk'.
+
+   You would use this function in your programs this way:
+
+     function myfunc(a, b)
+     {
+          assert(a <= 5 && b >= 17, "a <= 5 && b >= 17")
+          ...
+     }
+
+If the assertion failed, you would see a message like this:
+
+     mydata:1357: assertion failed: a <= 5 && b >= 17
+
+   There is a problem with this version of `assert' that it may not be
+possible to work around.  An `END' rule is automatically added to the
+program calling `assert'.  Normally, if a program consists of just a
+`BEGIN' rule, the input files and/or standard input are not read.
+However, now that the program has an `END' rule, `awk' will attempt to
+read the input data files, or standard input (*note Startup and Cleanup
+Actions: Using BEGIN/END.), most likely causing the program to hang,
+waiting for input.
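+
+   The interaction between `assert' and later `END' rules can be
+observed with a sketch like the following.  The input values and the
+second `END' rule are made up for the illustration, and the message
+is written through `| "cat 1>&2"' instead of `/dev/stderr' so that the
+sketch does not depend on `gawk'.
+

```shell
# Sketch: a failed assertion stops the program and skips later END rules.
assert_out=$(printf '3\n9\n4\n' | awk '
function assert(condition, string)
{
    if (! condition) {
        printf("%s:%d: assertion failed: %s\n",
            FILENAME, FNR, string) | "cat 1>&2"
        _assert_exit = 1
        exit 1
    }
}

END { if (_assert_exit) exit 1 }     # must come before other END rules

    { assert($1 <= 5, "$1 <= 5") }   # fails on the second record
END { print "records read:", NR }    # skipped when an assertion fails
' 2>/dev/null) && assert_status=0 || assert_status=$?

printf 'status=%s output=[%s]\n' "$assert_status" "$assert_out"
```

+
+   The exit status should be one and the captured standard output
+should be empty, since the final `END' rule never runs.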
+
+
+File: gawk.info,  Node: Round Function,  Next: Ordinal Functions,  Prev: Assert Function,  Up: Library Functions
+
+Rounding Numbers
+================
+
+   The way `printf' and `sprintf' (*note Using `printf' Statements for
+Fancier Printing: Printf.)  do rounding will often depend upon the
+system's C `sprintf' subroutine.  On many machines, `sprintf' rounding
+is "unbiased," which means it doesn't always round a trailing `.5' up,
+contrary to naive expectations.  In unbiased rounding, `.5' rounds to
+even, rather than always up, so 1.5 rounds to 2 but 4.5 rounds to 4.
+The result is that if you are using a format that does rounding (e.g.,
+`"%.0f"') you should check what your system does.  The following
+function does traditional rounding; it might be useful if your awk's
+`printf' does unbiased rounding.
+
+     # round --- do normal rounding
+     #
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, August, 1996
+     # Public Domain
+     
+     function round(x,   ival, aval, fraction)
+     {
+        ival = int(x)    # integer part, int() truncates
+     
+        # see if fractional part
+        if (ival == x)   # no fraction
+           return x
+     
+        if (x < 0) {
+           aval = -x     # absolute value
+           ival = int(aval)
+           fraction = aval - ival
+           if (fraction >= .5)
+              return int(x) - 1   # -2.5 --> -3
+           else
+              return int(x)       # -2.3 --> -2
+        } else {
+           fraction = x - ival
+           if (fraction >= .5)
+              return ival + 1
+           else
+              return ival
+        }
+     }
+     
+     # test harness
+     { print $0, round($0) }
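+
+   As a usage sketch, the following commands call a condensed copy of
+`round' from a `BEGIN' rule instead of the test harness.  The sample
+values are assumptions chosen because they are exactly representable
+in binary floating point.
+

```shell
# Sketch: traditional rounding, with halves rounded away from zero.
round_out=$(awk '
function round(x,   ival, aval, fraction)
{
    ival = int(x)               # integer part, int() truncates
    if (ival == x)              # no fraction
        return x
    if (x < 0) {
        aval = -x               # absolute value
        fraction = aval - int(aval)
        return (fraction >= .5) ? int(x) - 1 : int(x)
    }
    fraction = x - ival
    return (fraction >= .5) ? ival + 1 : ival
}
BEGIN {
    print round(2.5), round(4.5), round(-2.5), round(-2.3), round(7)
}')

printf '%s\n' "$round_out"
```

+
+   This should print `3 5 -3 -2 7'.  Comparing these values with the
+output of `printf "%.0f"' for the same inputs is one way to check what
+your system does.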
+
+
+File: gawk.info,  Node: Ordinal Functions,  Next: Join Function,  Prev: Round Function,  Up: Library Functions
+
+Translating Between Characters and Numbers
+==========================================
+
+   One commercial implementation of `awk' supplies a built-in function,
+`ord', which takes a character and returns the numeric value for that
+character in the machine's character set.  If the string passed to
+`ord' has more than one character, only the first one is used.
+
+   The inverse of this function is `chr' (from the function of the same
+name in Pascal), which takes a number and returns the corresponding
+character.
+
+   Both functions can be written very nicely in `awk'; there is no real
+reason to build them into the `awk' interpreter.
+
+     # ord.awk --- do ord and chr
+     #
+     # Global identifiers:
+     #    _ord_:        numerical values indexed by characters
+     #    _ord_init:    function to initialize _ord_
+     #
+     # Arnold Robbins
+     # arnold@gnu.ai.mit.edu
+     # Public Domain
+     # 16 January, 1992
+     # 20 July, 1992, revised
+     
+     BEGIN    { _ord_init() }
+     
+     function _ord_init(    low, high, i, t)
+     {
+         low = sprintf("%c", 7) # BEL is ascii 7
+         if (low == "\a") {    # regular ascii
+             low = 0
+             high = 127
+         } else if (sprintf("%c", 128 + 7) == "\a") {
+             # ascii, mark parity
+             low = 128
+             high = 255
+         } else {        # ebcdic(!)
+             low = 0
+             high = 255
+         }
+     
+         for (i = low; i <= high; i++) {
+             t = sprintf("%c", i)
+             _ord_[t] = i
+         }
+     }
+
+   Some explanation of the numbers used by `_ord_init' is worthwhile.  The
+most prominent character set in use today is ASCII. Although an
+eight-bit byte can hold 256 distinct values (from zero to 255), ASCII
+only defines characters that use the values from zero to 127.(1) At
+least one computer manufacturer that we know of uses ASCII, but with
+mark parity, meaning that the leftmost bit in the byte is always one.
+What this means is that on those systems, characters have numeric
+values from 128 to 255.  Finally, large mainframe systems use the
+EBCDIC character set, which uses all 256 values.  While there are other
+character sets in use on some older systems, they are not really worth
+worrying about.
+
+     function ord(str,    c)
+     {
+         # only first character is of interest
+         c = substr(str, 1, 1)
+         return _ord_[c]
+     }
+     
+     function chr(c)
+     {
+         # force c to be numeric by adding 0
+         return sprintf("%c", c + 0)
+     }
+     
+     #### test code ####
+     # BEGIN    \
+     # {
+     #    for (;;) {
+     #        printf("enter a character: ")
+     #        if (getline var <= 0)
+     #            break
+     #        printf("ord(%s) = %d\n", var, ord(var))
+     #    }
+     # }
+
+   An obvious improvement to these functions would be to move the code
+for the `_ord_init' function into the body of the `BEGIN' rule.  It was
+written this way initially for ease of development.
+
+   There is a "test program" in a `BEGIN' rule, for testing the
+function.  It is commented out for production use.
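+
+   A non-interactive usage sketch follows.  It assumes a plain ASCII
+system, so the table is filled with a simple loop over the codes one
+through 127 instead of calling `_ord_init'; the sample strings are
+made up for the illustration.
+

```shell
# Sketch: ord and chr on an ASCII system.
ord_out=$(awk '
BEGIN {
    # assumption: plain ASCII, so codes 1 through 127 are enough here
    for (i = 1; i <= 127; i++)
        _ord_[sprintf("%c", i)] = i
}

function ord(str) { return _ord_[substr(str, 1, 1)] }   # first char only
function chr(c)   { return sprintf("%c", c + 0) }       # force numeric

BEGIN { print ord("A"), ord("awk"), chr(97), chr(ord("z")) }
')

printf '%s\n' "$ord_out"
```

+
+   On an ASCII system this should print `65 97 a z'.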
+
+   ---------- Footnotes ----------
+
+   (1) ASCII has been extended in many countries to use the values from
+128 to 255 for country-specific characters.  If your system uses these
+extensions, you can simplify `_ord_init' to loop from zero to 255.
+
+
+File: gawk.info,  Node: Join Function,  Next: Mktime Function,  Prev: Ordinal Functions,  Up: Library Functions
+
+Merging an Array Into a String
+==============================
+
+   When doing string processing, it is often useful to be able to join
+all the strings in an array into one long string.  The following
+function, `join', accomplishes this task.  It is used later in several
+of the application programs (*note Practical `awk' Programs: Sample
+Programs.).
+
+   Good function design is important; this function needs to be
+general, but it should also have a reasonable default behavior.  It is
+called with an array and the beginning and ending indices of the
+elements in the array to be merged.  This assumes that the array
+indices are numeric--a reasonable assumption since the array was likely
+created with `split' (*note Built-in Functions for String Manipulation:
+String Functions.).
+
+     # join.awk --- join an array into a string
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     function join(array, start, end, sep,    result, i)
+     {
+         if (sep == "")
+            sep = " "
+         else if (sep == SUBSEP) # magic value
+            sep = ""
+         result = array[start]
+         for (i = start + 1; i <= end; i++)
+             result = result sep array[i]
+         return result
+     }
+
+   An optional additional argument is the separator to use when joining
+the strings back together.  If the caller supplies a non-empty value,
+`join' uses it.  If it is not supplied, it will have a null value.  In
+this case, `join' uses a single blank as a default separator for the
+strings.  If the value is equal to `SUBSEP', then `join' joins the
+strings with no separator between them.  `SUBSEP' serves as a "magic"
+value to indicate that there should be no separation between the
+component strings.
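+
+   The three cases can be seen side by side in the following sketch,
+which repeats `join' so that it is self-contained.  The sample string
+and the array name `parts' are assumptions made up for the
+illustration.
+

```shell
# Sketch: default separator, explicit separator, and no separator.
join_out=$(awk '
function join(array, start, end, sep,    result, i)
{
    if (sep == "")
        sep = " "
    else if (sep == SUBSEP)     # magic value
        sep = ""
    result = array[start]
    for (i = start + 1; i <= end; i++)
        result = result sep array[i]
    return result
}

BEGIN {
    n = split("usr local share awk", parts, " ")
    print join(parts, 1, n)            # default: single blank
    print join(parts, 1, n, "/")       # caller-supplied separator
    print join(parts, 2, 3, SUBSEP)    # magic value: no separator
}')

printf '%s\n' "$join_out"
```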
+
+   It would be nice if `awk' had an assignment operator for
+concatenation.  The lack of an explicit operator for concatenation
+makes string operations more difficult than they really need to be.
+
+
+File: gawk.info,  Node: Mktime Function,  Next: Gettimeofday Function,  Prev: Join Function,  Up: Library Functions
+
+Turning Dates Into Timestamps
+=============================
+
+   The `systime' function built in to `gawk' returns the current time
+of day as a timestamp in "seconds since the Epoch."  This timestamp can
+be converted into a printable date of almost infinitely variable format
+using the built-in `strftime' function.  (For more information on
+`systime' and `strftime', *note Functions for Dealing with Time Stamps:
+Time Functions..)
+
+   An interesting but difficult problem is to convert a readable
+representation of a date back into a timestamp.  The ANSI C library
+provides a `mktime' function that does the basic job, converting a
+canonical representation of a date into a timestamp.
+
+   It would appear at first glance that `gawk' would have to supply a
+`mktime' built-in function that was simply a "hook" to the C language
+version.  In fact though, `mktime' can be implemented entirely in `awk'.
+
+   Here is a version of `mktime' for `awk'.  It takes a simple
+representation of the date and time, and converts it into a timestamp.
+
+   The code is presented here intermixed with explanatory prose.  In
+*Note Extracting Programs from Texinfo Source Files: Extract Program,
+you will see how the Texinfo source file for this Info file can be
+processed to extract the code into a single source file.
+
+   The program begins with a descriptive comment and a `BEGIN' rule
+that initializes a table `_tm_months'.  This table is a two-dimensional
+array that has the lengths of the months.  The first index is zero for
+regular years, and one for leap years.  The values are the same for all
+the months in both kinds of years, except for February; thus the use of
+multiple assignment.
+
+     # mktime.awk --- convert a canonical date representation
+     #                into a timestamp
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     BEGIN    \
+     {
+         # Initialize table of month lengths
+         _tm_months[0,1] = _tm_months[1,1] = 31
+         _tm_months[0,2] = 28; _tm_months[1,2] = 29
+         _tm_months[0,3] = _tm_months[1,3] = 31
+         _tm_months[0,4] = _tm_months[1,4] = 30
+         _tm_months[0,5] = _tm_months[1,5] = 31
+         _tm_months[0,6] = _tm_months[1,6] = 30
+         _tm_months[0,7] = _tm_months[1,7] = 31
+         _tm_months[0,8] = _tm_months[1,8] = 31
+         _tm_months[0,9] = _tm_months[1,9] = 30
+         _tm_months[0,10] = _tm_months[1,10] = 31
+         _tm_months[0,11] = _tm_months[1,11] = 30
+         _tm_months[0,12] = _tm_months[1,12] = 31
+     }
+
+   The benefit of merging multiple `BEGIN' rules (*note The `BEGIN' and
+`END' Special Patterns: BEGIN/END.)  is particularly clear when writing
+library files.  Functions in library files can cleanly initialize their
+own private data and also provide clean-up actions in private `END'
+rules.
+
+   The next function is a simple one that computes whether a given year
+is or is not a leap year.  If a year is evenly divisible by four, but
+not evenly divisible by 100, or if it is evenly divisible by 400, then
+it is a leap year.  Thus, 1904 was a leap year, 1900 was not, but 2000
+will be.
+
+     # decide if a year is a leap year
+     function _tm_isleap(year,    ret)
+     {
+         ret = (year % 4 == 0 && year % 100 != 0) ||
+                 (year % 400 == 0)
+     
+         return ret
+     }
+
+   This function is only used a few times in this file, and its
+computation could have been written "in-line" (at the point where it's
+used).  Making it a separate function made the original development
+easier, and also avoids the possibility of typing errors when
+duplicating the code in multiple places.
+
+   The next function is more interesting.  It does most of the work of
+generating a timestamp, which is converting a date and time into some
+number of seconds since the Epoch.  The caller passes an array (rather
+imaginatively named `a') containing six values: the year including
+century, the month as a number between one and 12, the day of the
+month, the hour as a number between zero and 23, the minute in the
+hour, and the seconds within the minute.
+
+   The function uses several local variables to precompute the number of
+seconds in an hour, seconds in a day, and seconds in a year.  Often,
+similar C code simply writes out the expression in-line, expecting the
+compiler to do "constant folding".  E.g., most C compilers would turn
+`60 * 60' into `3600' at compile time, instead of recomputing it every
+time at run time.  Precomputing these values makes the function more
+efficient.
+
+     # convert a date into seconds
+     function _tm_addup(a,    total, yearsecs, daysecs,
+                              hoursecs, i, j)
+     {
+         hoursecs = 60 * 60
+         daysecs = 24 * hoursecs
+         yearsecs = 365 * daysecs
+     
+         total = (a[1] - 1970) * yearsecs
+     
+         # extra day for leap years
+         for (i = 1970; i < a[1]; i++)
+             if (_tm_isleap(i))
+                 total += daysecs
+     
+         j = _tm_isleap(a[1])
+         for (i = 1; i < a[2]; i++)
+             total += _tm_months[j, i] * daysecs
+     
+         total += (a[3] - 1) * daysecs
+         total += a[4] * hoursecs
+         total += a[5] * 60
+         total += a[6]
+     
+         return total
+     }
+
+   The function starts with a first approximation of all the seconds
+between Midnight, January 1, 1970,(1) and the beginning of the current
+year.  It then goes through all those years, and for every leap year,
+adds an additional day's worth of seconds.
+
+   The variable `j' holds one if the current year is a leap year, and
+zero otherwise.  For every month in the current year prior to the
+current month, the function adds the number of seconds in the month,
+using the appropriate entry in the `_tm_months' array.
+
+   Finally, it adds in the seconds for the number of days prior to the
+current day, and the number of hours, minutes, and seconds in the
+current day.
+
+   The result is a count of seconds since January 1, 1970.  This value
+is not yet what is needed though.  The reason why is described shortly.
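+
+   The arithmetic can be checked by hand with a sketch such as the
+following.  It repeats `_tm_isleap' and `_tm_addup', fills
+`_tm_months' with a loop instead of twelve assignments, and adds a
+hypothetical helper named `secs' that splits a date string and forces
+the fields to be numeric; the helper and the sample dates are
+assumptions of the sketch.
+

```shell
# Sketch: seconds between the Epoch and a date, ignoring time zones.
addup_out=$(awk '
BEGIN {
    # first index: 0 for regular years, 1 for leap years
    n = split("31 28 31 30 31 30 31 31 30 31 30 31", days, " ")
    for (m = 1; m <= n; m++)
        _tm_months[0, m] = _tm_months[1, m] = days[m] + 0
    _tm_months[1, 2] = 29
}

function _tm_isleap(year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0)
}

function _tm_addup(a,    total, yearsecs, daysecs, hoursecs, i, j)
{
    hoursecs = 60 * 60
    daysecs = 24 * hoursecs
    yearsecs = 365 * daysecs

    total = (a[1] - 1970) * yearsecs
    for (i = 1970; i < a[1]; i++)       # extra day for leap years
        if (_tm_isleap(i))
            total += daysecs

    j = _tm_isleap(a[1])
    for (i = 1; i < a[2]; i++)          # whole months in this year
        total += _tm_months[j, i] * daysecs

    total += (a[3] - 1) * daysecs
    total += a[4] * hoursecs
    total += a[5] * 60
    total += a[6]
    return total
}

# secs is a hypothetical helper, not part of the library file
function secs(str,    d, k)
{
    split(str, d, " ")
    for (k in d)
        d[k] += 0                       # force numeric
    return _tm_addup(d)
}

BEGIN {
    printf("%d %d %d\n", secs("1970 1 2 0 0 0"),
        secs("1972 3 1 0 0 0"), secs("2000 1 1 0 0 0"))
}')

printf '%s\n' "$addup_out"
```

+
+   January 2, 1970 is one day (86400 seconds) after the starting
+point.  March 1, 1972 is 730 days for the two preceding years plus 60
+days for January and February of a leap year, or 68256000 seconds.
+January 1, 2000 is 30 years of 365 days plus seven leap days, or
+946684800 seconds.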
+
+   The main `mktime' function takes a single character string argument.
+This string is a representation of a date and time in a "canonical"
+(fixed) form.  This string should be `"YEAR MONTH DAY HOUR MINUTE
+SECOND"'.
+
+     # mktime --- convert a date into seconds,
+     #            compensate for time zone
+     
+     function mktime(str,    res1, res2, a, b, i, j, t, diff)
+     {
+         i = split(str, a, " ")    # don't rely on FS
+     
+         if (i != 6)
+             return -1
+     
+         # force numeric
+         for (j in a)
+             a[j] += 0
+     
+         # validate
+         if (a[1] < 1970 ||
+             a[2] < 1 || a[2] > 12 ||
+             a[3] < 1 || a[3] > 31 ||
+             a[4] < 0 || a[4] > 23 ||
+             a[5] < 0 || a[5] > 59 ||
+             a[6] < 0 || a[6] > 60 )
+                 return -1
+     
+         res1 = _tm_addup(a)
+         t = strftime("%Y %m %d %H %M %S", res1)
+     
+         if (_tm_debug)
+             printf("(%s) -> (%s)\n", str, t) > "/dev/stderr"
+     
+         split(t, b, " ")
+         res2 = _tm_addup(b)
+     
+         diff = res1 - res2
+     
+         if (_tm_debug)
+             printf("diff = %d seconds\n", diff) > "/dev/stderr"
+     
+         res1 += diff
+     
+         return res1
+     }
+
+   The function first splits the string into an array, using spaces and
+tabs as separators.  If there are not six elements in the array, it
+returns an error, signaled as the value -1.  Next, it forces each
+element of the array to be numeric, by adding zero to it.  The
+following `if' statement then makes sure that each element is within an
+allowable range.  (This checking could be extended further, e.g., to
+make sure that the day of the month is within the correct range for the
+particular month supplied.)  All of this is essentially preliminary
+set-up and error checking.
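+
+   The extended check mentioned above might look like the following
+sketch.  The name `_tm_daycheck' is hypothetical; it is not part of the
+library as written.  It assumes that `_tm_months' is indexed by leap
+year flag and month number, as it is used in `_tm_addup'.
+
+     # hypothetical helper: true if the day of the month in a[3]
+     # exists in month a[2] of year a[1]
+     function _tm_daycheck(a)
+     {
+         return (a[3] >= 1 &&
+                 a[3] <= _tm_months[_tm_isleap(a[1]), a[2]])
+     }
+
+   With such a helper, `mktime' could return -1 when `_tm_daycheck(a)'
+is false, right after the range checks.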
+
+   Recall that `_tm_addup' generated a value in seconds since Midnight,
+January 1, 1970.  This value is not directly usable as the result we
+want, _since the calculation does not account for the local timezone_.
+In other words, the value represents the count in seconds since the
+Epoch, but only for UTC (Coordinated Universal Time).  If the local
+timezone is east or west of UTC, then some number of hours should be
+either added to, or subtracted from the resulting timestamp.
+
+   For example, Atlanta, Georgia (USA), is normally five hours west of
+(behind) UTC.  It is only four hours behind UTC if daylight saving
+time is in effect.  If you are calling `mktime' in Atlanta, with the
+argument `"1993 5 23 18 23 12"', the result from `_tm_addup' will be
+for 6:23 p.m. UTC, which is only 2:23 p.m. in Atlanta.  It is
+necessary to add another four hours worth of seconds to the result.
+
+   How can `mktime' determine how far away it is from UTC?  This is
+surprisingly easy.  The returned timestamp represents the time passed to
+`mktime' _as UTC_.  This timestamp can be fed back to `strftime', which
+will format it as a _local_ time; i.e. as if it already had the UTC
+difference added in to it.  This is done by giving
+`"%Y %m %d %H %M %S"' to `strftime' as the format argument.  It returns
+the computed timestamp in the original string format.  The result
+represents a time that accounts for the UTC difference.  When the new
+time is converted back to a timestamp, the difference between the two
+timestamps is the difference (in seconds) between the local timezone
+and UTC.  This difference is then added back to the original result.
+An example demonstrating this is presented below.
+
+   Finally, there is a "main" program for testing the function.
+
+     BEGIN  {
+         if (_tm_test) {
+             printf "Enter date as yyyy mm dd hh mm ss: "
+             getline _tm_test_date
+     
+             t = mktime(_tm_test_date)
+             r = strftime("%Y %m %d %H %M %S", t)
+             printf "Got back (%s)\n", r
+         }
+     }
+
+   The entire program uses two variables that can be set on the command
+line to control debugging output and to enable the test in the final
+`BEGIN' rule.  Here is the result of a test run. (Note that debugging
+output is to standard error, and test output is to standard output.)
+
+     $ gawk -f mktime.awk -v _tm_test=1 -v _tm_debug=1
+     -| Enter date as yyyy mm dd hh mm ss: 1993 5 23 15 35 10
+     error--> (1993 5 23 15 35 10) -> (1993 05 23 11 35 10)
+     error--> diff = 14400 seconds
+     -| Got back (1993 05 23 15 35 10)
+
+   The time entered was 3:35 p.m. (15:35 on a 24-hour clock), on May
+23, 1993.  The first line of debugging output shows the entered time,
+treated as UTC, converted back to local time, which is four hours
+behind UTC.  The second line shows that the difference is 14400
+seconds, which is four hours.  (The difference is only four hours,
+since daylight saving time is in effect during May.)  The final line
+of test output shows that the timezone compensation algorithm works;
+the returned time is the same as the entered time.
+
+   This program does not solve the general problem of turning an
+arbitrary date representation into a timestamp.  That problem is very
+involved.  However, the `mktime' function provides a foundation upon
+which to build. Other software can convert month names into numeric
+months, and AM/PM times into 24-hour clocks, to generate the
+"canonical" format that `mktime' requires.
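+
+   As one illustration of such supporting software, here is a minimal
+sketch that converts an English month name or abbreviation into a
+month number.  The function name `_tm_monthnum' is hypothetical and is
+not part of the library; it returns zero for a name it does not
+recognize.
+
+     # hypothetical helper: "May" or "may" => 5, unknown name => 0
+     function _tm_monthnum(name,    pos)
+     {
+         if (length(name) < 3)
+             return 0
+         pos = index("janfebmaraprmayjunjulaugsepoctnovdec",
+                     tolower(substr(name, 1, 3)))
+         # a real month abbreviation starts at position 1, 4, 7, ...
+         if (pos % 3 != 1)
+             return 0
+         return (pos + 2) / 3
+     }
+
+   Only the first three letters are examined, so both `Sep' and
+`September' yield nine.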
+
+   ---------- Footnotes ----------
+
+   (1) This is the Epoch on POSIX systems.  It may be different on
+other systems.
+
+
+File: gawk.info,  Node: Gettimeofday Function,  Next: Filetrans Function,  Prev: Mktime Function,  Up: Library Functions
+
+Managing the Time of Day
+========================
+
+   The `systime' and `strftime' functions described in *Note Functions
+for Dealing with Time Stamps: Time Functions, provide the minimum
+functionality necessary for dealing with the time of day in human
+readable form.  While `strftime' is extensive, the control formats are
+not necessarily easy to remember or intuitively obvious when reading a
+program.
+
+   The following function, `gettimeofday', populates a user-supplied
+array with pre-formatted time information.  It returns a string with
+the current time formatted in the same way as the `date' utility.
+
+     # gettimeofday --- get the time of day in a usable format
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain, May 1993
+     #
+     # Returns a string in the format of output of date(1)
+     # Populates the array argument time with individual values:
+     #    time["second"]       -- seconds (0 - 59)
+     #    time["minute"]       -- minutes (0 - 59)
+     #    time["hour"]         -- hours (0 - 23)
+     #    time["althour"]      -- hours (1 - 12)
+     #    time["monthday"]     -- day of month (1 - 31)
+     #    time["month"]        -- month of year (1 - 12)
+     #    time["monthname"]    -- name of the month
+     #    time["shortmonth"]   -- short name of the month
+     #    time["year"]         -- year within century (0 - 99)
+     #    time["fullyear"]     -- year with century (19xx or 20xx)
+     #    time["weekday"]      -- day of week (Sunday = 0)
+     #    time["altweekday"]   -- day of week (Monday = 1, Sunday = 7)
+     #    time["weeknum"]      -- week number, Sunday first day
+     #    time["altweeknum"]   -- week number, Monday first day
+     #    time["dayname"]      -- name of weekday
+     #    time["shortdayname"] -- short name of weekday
+     #    time["yearday"]      -- day of year (1 - 366)
+     #    time["timezone"]     -- abbreviation of timezone name
+     #    time["ampm"]         -- AM or PM designation
+     
+     function gettimeofday(time,    ret, now, i)
+     {
+         # get time once, avoids unnecessary system calls
+         now = systime()
+     
+         # return date(1)-style output
+         ret = strftime("%a %b %d %H:%M:%S %Z %Y", now)
+     
+         # clear out target array
+         for (i in time)
+             delete time[i]
+     
+         # fill in values, force numeric values to be
+         # numeric by adding 0
+         time["second"]       = strftime("%S", now) + 0
+         time["minute"]       = strftime("%M", now) + 0
+         time["hour"]         = strftime("%H", now) + 0
+         time["althour"]      = strftime("%I", now) + 0
+         time["monthday"]     = strftime("%d", now) + 0
+         time["month"]        = strftime("%m", now) + 0
+         time["monthname"]    = strftime("%B", now)
+         time["shortmonth"]   = strftime("%b", now)
+         time["year"]         = strftime("%y", now) + 0
+         time["fullyear"]     = strftime("%Y", now) + 0
+         time["weekday"]      = strftime("%w", now) + 0
+         time["altweekday"]   = strftime("%u", now) + 0
+         time["dayname"]      = strftime("%A", now)
+         time["shortdayname"] = strftime("%a", now)
+         time["yearday"]      = strftime("%j", now) + 0
+         time["timezone"]     = strftime("%Z", now)
+         time["ampm"]         = strftime("%p", now)
+         time["weeknum"]      = strftime("%U", now) + 0
+         time["altweeknum"]   = strftime("%W", now) + 0
+     
+         return ret
+     }
+
+   The string indices are easier to use and read than the various
+formats required by `strftime'.  The `alarm' program presented in *Note
+An Alarm Clock Program: Alarm Program, uses this function.
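+
+   A short usage sketch follows.  It assumes that the file defining
+`gettimeofday' has been loaded with `-f'; the array name `now' and the
+message text are arbitrary choices for the example.
+
+     BEGIN {
+         stamp = gettimeofday(now)
+         printf "Today is %s, %s %d, %d\n", now["dayname"],
+                now["monthname"], now["monthday"], now["fullyear"]
+         if (now["hour"] < 12)
+             print "Good morning; the full date is", stamp
+     }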
+
+   The `gettimeofday' function is presented above as it was written. A
+more general design for this function would have allowed the user to
+supply an optional timestamp value that would have been used instead of
+the current time.
+
+
+File: gawk.info,  Node: Filetrans Function,  Next: Getopt Function,  Prev: Gettimeofday Function,  Up: Library Functions
+
+Noting Data File Boundaries
+===========================
+
+   The `BEGIN' and `END' rules are each executed exactly once, at the
+beginning and end respectively of your `awk' program (*note The `BEGIN'
+and `END' Special Patterns: BEGIN/END.).  We (the `gawk' authors) once
+had a user who mistakenly thought that the `BEGIN' rule was executed at
+the beginning of each data file and the `END' rule was executed at the
+end of each data file.  When informed that this was not the case, the
+user requested that we add new special patterns to `gawk', named
+`BEGIN_FILE' and `END_FILE', that would have the desired behavior.  He
+even supplied us the code to do so.
+
+   However, after a little thought, I came up with the following
+library program.  It arranges to call two user-supplied functions,
+`beginfile' and `endfile', at the beginning and end of each data file.
+Besides solving the problem in only nine(!) lines of code, it does so
+_portably_; this will work with any implementation of `awk'.
+
+     # transfile.awk
+     #
+     # Give the user a hook for filename transitions
+     #
+     # The user must supply functions beginfile() and endfile()
+     # that each take the name of the file being started or
+     # finished, respectively.
+     #
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, January 1992
+     # Public Domain
+     
+     FILENAME != _oldfilename \
+     {
+         if (_oldfilename != "")
+             endfile(_oldfilename)
+         _oldfilename = FILENAME
+         beginfile(FILENAME)
+     }
+     
+     END   { endfile(FILENAME) }
+
+   This file must be loaded before the user's "main" program, so that
+the rule it supplies will be executed first.
+
+   This rule relies on `awk''s `FILENAME' variable that automatically
+changes for each new data file.  The current file name is saved in a
+private variable, `_oldfilename'.  If `FILENAME' does not equal
+`_oldfilename', then a new data file is being processed, and it is
+necessary to call `endfile' for the old file.  Since `endfile' should
+only be called if a file has been processed, the program first checks
+to make sure that `_oldfilename' is not the null string.  The program
+then assigns the current file name to `_oldfilename', and calls
+`beginfile' for the file.  Since, like all `awk' variables,
+`_oldfilename' will be initialized to the null string, this rule
+executes correctly even for the first data file.
+
+   The program also supplies an `END' rule, to do the final processing
+for the last file.  Since this `END' rule comes before any `END' rules
+supplied in the "main" program, `endfile' will be called first.  Once
+again the value of multiple `BEGIN' and `END' rules should be clear.
+
+   This version has the same problem as the first version of `nextfile'
+(*note Implementing `nextfile' as a Function: Nextfile Function.).  If
+the same data file occurs twice in a row on the command line, then
+`endfile' and `beginfile' will not be executed at the end of the first
+pass and at the beginning of the second pass.  The following version
+solves the problem by testing `FNR', which is reset to one at the
+start of every data file.
+
+     # ftrans.awk --- handle data file transitions
+     #
+     # user supplies beginfile() and endfile() functions
+     #
+     # Arnold Robbins, arnold@gnu.ai.mit.edu. November 1992
+     # Public Domain
+     
+     FNR == 1 {
+         if (_filename_ != "")
+             endfile(_filename_)
+         _filename_ = FILENAME
+         beginfile(FILENAME)
+     }
+     
+     END  { endfile(_filename_) }
+
+   In *Note Counting Things: Wc Program, you will see how this library
+function can be used, and how it simplifies writing the main program.
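+
+   In the meantime, here is a minimal sketch of a main program that
+supplies the two functions.  The file name `count.awk' and the
+variable `_nlines' are made up for this example; the program prints
+the number of lines in each data file.
+
+     # count.awk (hypothetical)
+     function beginfile(name)
+     {
+         _nlines = 0
+     }
+
+     function endfile(name)
+     {
+         printf "%s: %d lines\n", name, _nlines
+     }
+
+     { _nlines++ }
+
+   It would be run as `gawk -f ftrans.awk -f count.awk data1 data2',
+with the library file named first so that its rule runs before the
+counting rule.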
+
+
+File: gawk.info,  Node: Getopt Function,  Next: Passwd Functions,  Prev: Filetrans Function,  Up: Library Functions
+
+Processing Command Line Options
+===============================
+
+   Most utilities on POSIX compatible systems take options or
+"switches" on the command line that can be used to change the way a
+program behaves.  `awk' is an example of such a program (*note Command
+Line Options: Options.).  Often, options take "arguments", data that
+the program needs to correctly obey the command line option.  For
+example, `awk''s `-F' option requires a string to use as the field
+separator.  The first occurrence on the command line of either `--' or a
+string that does not begin with `-' ends the options.
+
+   Most Unix systems provide a C function named `getopt' for processing
+command line arguments.  The programmer provides a string describing
+the one letter options. If an option requires an argument, it is
+followed in the string with a colon.  `getopt' is also passed the count
+and values of the command line arguments, and is called in a loop.
+`getopt' processes the command line arguments for option letters.  Each
+time around the loop, it returns a single character representing the
+next option letter that it found, or `?' if it found an invalid option.
+When it returns -1, there are no options left on the command line.
+
+   When using `getopt', options that do not take arguments can be
+grouped together.  Furthermore, options that take arguments require
+that the argument be present.  The argument can immediately follow the
+option letter, or it can be a separate command line argument.
+
+   Given a hypothetical program that takes three command line options,
+`-a', `-b', and `-c', and `-b' requires an argument, all of the
+following are valid ways of invoking the program:
+
+     prog -a -b foo -c data1 data2 data3
+     prog -ac -bfoo -- data1 data2 data3
+     prog -acbfoo data1 data2 data3
+
+   Notice that when the argument is grouped with its option, the rest of
+the command line argument is considered to be the option's argument.
+In the above example, `-acbfoo' indicates that all of the `-a', `-b',
+and `-c' options were supplied, and that `foo' is the argument to the
+`-b' option.
+
+   `getopt' provides four external variables that the programmer can
+use.
+
+`optind'
+     The index in the argument value array (`argv') where the first
+     non-option command line argument can be found.
+
+`optarg'
+     The string value of the argument to an option.
+
+`opterr'
+     Usually `getopt' prints an error message when it finds an invalid
+     option.  Setting `opterr' to zero disables this feature.  (An
+     application might wish to print its own error message.)
+
+`optopt'
+     The letter representing the command line option.  While not
+     usually documented, most versions supply this variable.
+
+   The following C fragment shows how `getopt' might process command
+line arguments for `awk'.
+
+     int
+     main(int argc, char *argv[])
+     {
+         ...
+         /* print our own message */
+         opterr = 0;
+         while ((c = getopt(argc, argv, "v:f:F:W:")) != -1) {
+             switch (c) {
+             case 'f':    /* file */
+                 ...
+                 break;
+             case 'F':    /* field separator */
+                 ...
+                 break;
+             case 'v':    /* variable assignment */
+                 ...
+                 break;
+             case 'W':    /* extension */
+                 ...
+                 break;
+             case '?':
+             default:
+                 usage();
+                 break;
+             }
+         }
+         ...
+     }
+
+   As a side point, `gawk' actually uses the GNU `getopt_long' function
+to process both normal and GNU-style long options (*note Command Line
+Options: Options.).
+
+   The abstraction provided by `getopt' is very useful, and would be
+quite handy in `awk' programs as well.  Here is an `awk' version of
+`getopt'.  This function highlights one of the greatest weaknesses in
+`awk', which is that it is very poor at manipulating single characters.
+Repeated calls to `substr' are necessary for accessing individual
+characters (*note Built-in Functions for String Manipulation: String
+Functions.).
+
+   The discussion walks through the code a bit at a time.
+
+     # getopt --- do C library getopt(3) function in awk
+     #
+     # arnold@gnu.ai.mit.edu
+     # Public domain
+     #
+     # Initial version: March, 1991
+     # Revised: May, 1993
+     
+     # External variables:
+     #    Optind -- index of ARGV for first non-option argument
+     #    Optarg -- string value of argument to current option
+     #    Opterr -- if non-zero, print our own diagnostic
+     #    Optopt -- current option letter
+     
+     # Returns
+     #    -1     at end of options
+     #    ?      for unrecognized option
+     #    <c>    a character representing the current option
+     
+     # Private Data
+     #    _opti  index in multi-flag option, e.g., -abc
+
+   The function starts out with some documentation: who wrote the code,
+and when it was revised, followed by a list of the global variables it
+uses, what the return values are and what they mean, and any global
+variables that are "private" to this library function.  Such
+documentation is essential for any program, and particularly for
+library functions.
+
+     function getopt(argc, argv, options,    optl, thisopt, i)
+     {
+         optl = length(options)
+         if (optl == 0)        # no options given
+             return -1
+     
+         if (argv[Optind] == "--") {  # all done
+             Optind++
+             _opti = 0
+             return -1
+         } else if (argv[Optind] !~ /^-[^: \t\n\f\r\v\b]/) {
+             _opti = 0
+             return -1
+         }
+
+   The function first checks that it was indeed called with a string of
+options (the `options' parameter).  If `options' has a zero length,
+`getopt' immediately returns -1.
+
+   The next thing to check for is the end of the options.  A `--' ends
+the command line options, as does any command line argument that does
+not begin with a `-'.  `Optind' is used to step through the array of
+command line arguments; it retains its value across calls to `getopt',
+since it is a global variable.
+
+   The regexp used, `/^-[^: \t\n\f\r\v\b]/', is perhaps a bit of
+overkill; it checks for a `-' followed by anything that is not
+whitespace and not a colon.  If the current command line argument does
+not match this pattern, it is not an option, and it ends option
+processing.
+
+         if (_opti == 0)
+             _opti = 2
+         thisopt = substr(argv[Optind], _opti, 1)
+         Optopt = thisopt
+         i = index(options, thisopt)
+         if (i == 0) {
+             if (Opterr)
+                 printf("%c -- invalid option\n",
+                                       thisopt) > "/dev/stderr"
+             if (_opti >= length(argv[Optind])) {
+                 Optind++
+                 _opti = 0
+             } else
+                 _opti++
+             return "?"
+         }
+
+   The `_opti' variable tracks the position in the current command line
+argument (`argv[Optind]').  In the case that multiple options were
+grouped together with one `-' (e.g., `-abx'), it is necessary to return
+them to the user one at a time.
+
+   If `_opti' is equal to zero, it is set to two, the index in the
+string of the next character to look at (we skip the `-', which is at
+position one).  The variable `thisopt' holds the character, obtained
+with `substr'.  It is saved in `Optopt' for the main program to use.
+
+   If `thisopt' is not in the `options' string, then it is an invalid
+option.  If `Opterr' is non-zero, `getopt' prints an error message on
+the standard error that is similar to the message from the C version of
+`getopt'.
+
+   Since the option is invalid, it is necessary to skip it and move on
+to the next option character.  If `_opti' is greater than or equal to
+the length of the current command line argument, then it is necessary
+to move on to the next one, so `Optind' is incremented and `_opti' is
+reset to zero. Otherwise, `Optind' is left alone and `_opti' is merely
+incremented.
+
+   In any case, since the option was invalid, `getopt' returns `?'.
+The main program can examine `Optopt' if it needs to know what the
+invalid option letter actually was.
+
+         if (substr(options, i + 1, 1) == ":") {
+             # get option argument
+             if (length(substr(argv[Optind], _opti + 1)) > 0)
+                 Optarg = substr(argv[Optind], _opti + 1)
+             else
+                 Optarg = argv[++Optind]
+             _opti = 0
+         } else
+             Optarg = ""
+
+   If the option requires an argument, the option letter is followed by
+a colon in the `options' string.  If there are remaining characters in
+the current command line argument (`argv[Optind]'), then the rest of
+that string is assigned to `Optarg'.  Otherwise, the next command line
+argument is used (`-xFOO' vs. `-x FOO'). In either case, `_opti' is
+reset to zero, since there are no more characters left to examine in
+the current command line argument.
+
+         if (_opti == 0 || _opti >= length(argv[Optind])) {
+             Optind++
+             _opti = 0
+         } else
+             _opti++
+         return thisopt
+     }
+
+   Finally, if `_opti' is either zero or greater than or equal to the
+length of the current command line argument, it means this element in
+`argv' is through being processed, so `Optind' is incremented to point
+to the next element in `argv'.  If neither condition is true, then
+only `_opti' is incremented, so that the next option letter can be
+processed on the next call to `getopt'.
+
+     BEGIN {
+         Opterr = 1    # default is to diagnose
+         Optind = 1    # skip ARGV[0]
+     
+         # test program
+         if (_getopt_test) {
+             while ((_go_c = getopt(ARGC, ARGV, "ab:cd")) != -1)
+                 printf("c = <%c>, optarg = <%s>\n",
+                                            _go_c, Optarg)
+             printf("non-option arguments:\n")
+             for (; Optind < ARGC; Optind++)
+                 printf("\tARGV[%d] = <%s>\n",
+                                         Optind, ARGV[Optind])
+         }
+     }
+
+   The `BEGIN' rule initializes both `Opterr' and `Optind' to one.
+`Opterr' is set to one, since the default behavior is for `getopt' to
+print a diagnostic message upon seeing an invalid option.  `Optind' is
+set to one, since there's no reason to look at the program name, which
+is in `ARGV[0]'.
+
+   The rest of the `BEGIN' rule is a simple test program.  Here is the
+result of two sample runs of the test program.
+
+     $ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x
+     -| c = <a>, optarg = <>
+     -| c = <c>, optarg = <>
+     -| c = <b>, optarg = <ARG>
+     -| non-option arguments:
+     -|         ARGV[3] = <bax>
+     -|         ARGV[4] = <-x>
+     
+     $ awk -f getopt.awk -v _getopt_test=1 -- -a -x -- xyz abc
+     -| c = <a>, optarg = <>
+     error--> x -- invalid option
+     -| c = <?>, optarg = <>
+     -| non-option arguments:
+     -|         ARGV[4] = <xyz>
+     -|         ARGV[5] = <abc>
+
+   The first `--' terminates the arguments to `awk', so that it does
+not try to interpret the `-a' etc. as its own options.
+
+   Several of the sample programs presented in *Note Practical `awk'
+Programs: Sample Programs, use `getopt' to process their arguments.
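+
+   The general shape of such a program is sketched below.  The option
+string `"vo:"', the variable names, and the usage message are
+assumptions made for the example.  The library file must be loaded
+first, so that its `BEGIN' rule initializes `Optind' and `Opterr'.
+
+     BEGIN {
+         while ((c = getopt(ARGC, ARGV, "vo:")) != -1) {
+             if (c == "v")
+                 verbose = 1
+             else if (c == "o")
+                 outfile = Optarg
+             else {
+                 print "usage: prog [-v] [-o file] [files]" > "/dev/stderr"
+                 exit 1
+             }
+         }
+         # clear the options, so that awk does not
+         # try to open them as data files
+         for (i = 1; i < Optind; i++)
+             ARGV[i] = ""
+     }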
+
+
+File: gawk.info,  Node: Passwd Functions,  Next: Group Functions,  Prev: Getopt Function,  Up: Library Functions
+
+Reading the User Database
+=========================
+
+   The `/dev/user' special file (*note Special File Names in `gawk':
+Special Files.)  provides access to the current user's real and
+effective user and group id numbers, and if available, the user's
+supplementary group set.  However, since these are numbers, they do not
+provide very useful information to the average user.  There needs to be
+some way to find the user information associated with the user and
+group numbers.  This section presents a suite of functions for
+retrieving information from the user database.  *Note Reading the Group
+Database: Group Functions, for a similar suite that retrieves
+information from the group database.
+
+   The POSIX standard does not define the file where user information is
+kept.  Instead, it provides the `<pwd.h>' header file and several C
+language subroutines for obtaining user information.  The primary
+function is `getpwent', for "get password entry."  The "password" comes
+from the original user database file, `/etc/passwd', which kept user
+information, along with the encrypted passwords (hence the name).
+
+   While an `awk' program could simply read `/etc/passwd' directly (the
+format is well known), because of the way password files are handled on
+networked systems, this file may not contain complete information about
+the system's set of users.
+
+   To be sure of being able to produce a readable, complete version of
+the user database, it is necessary to write a small C program that
+calls `getpwent'.  `getpwent' is defined to return a pointer to a
+`struct passwd'.  Each time it is called, it returns the next entry in
+the database.  When there are no more entries, it returns `NULL', the
+null pointer.  When this happens, the C program should call `endpwent'
+to close the database.  Here is `pwcat', a C program that "cats" the
+password database.
+
+     /*
+      * pwcat.c
+      *
+      * Generate a printable version of the password database
+      *
+      * Arnold Robbins
+      * arnold@gnu.ai.mit.edu
+      * May 1993
+      * Public Domain
+      */
+     
+     #include <stdio.h>
+     #include <stdlib.h>
+     #include <pwd.h>
+     
+     int
+     main(argc, argv)
+     int argc;
+     char **argv;
+     {
+         struct passwd *p;
+     
+         while ((p = getpwent()) != NULL)
+             printf("%s:%s:%d:%d:%s:%s:%s\n",
+                 p->pw_name, p->pw_passwd, p->pw_uid,
+                 p->pw_gid, p->pw_gecos, p->pw_dir, p->pw_shell);
+     
+         endpwent();
+         exit(0);
+     }
+
+   If you don't understand C, don't worry about it.  The output from
+`pwcat' is the user database, in the traditional `/etc/passwd' format
+of colon-separated fields.  The fields are:
+
+Login name
+     The user's login name.
+
+Encrypted password
+     The user's encrypted password.  This may not be available on some
+     systems.
+
+User-ID
+     The user's numeric user-id number.
+
+Group-ID
+     The user's numeric group-id number.
+
+Full name
+     The user's full name, and perhaps other information associated
+     with the user.
+
+Home directory
+     The user's login, or "home" directory (familiar to shell
+     programmers as `$HOME').
+
+Login shell
+     The program that will be run when the user logs in.  This is
+     usually a shell, such as Bash (the GNU Bourne-Again shell).
+
+   Here are a few lines representative of `pwcat''s output.
+
+     $ pwcat
+     -| root:3Ov02d5VaUPB6:0:1:Operator:/:/bin/sh
+     -| nobody:*:65534:65534::/:
+     -| daemon:*:1:1::/:
+     -| sys:*:2:2::/:/bin/csh
+     -| bin:*:3:3::/bin:
+     -| arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
+     -| miriam:yxaay:112:10:Miriam Robbins:/home/miriam:/bin/sh
+     -| andy:abcca2:113:10:Andy Jacobs:/home/andy:/bin/sh
+     ...
+
+   With that introduction, here is a group of functions for getting user
+information.  There are several functions here, corresponding to the C
+functions of the same name.
+
+     # passwd.awk --- access password file information
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     BEGIN {
+         # tailor this to suit your system
+         _pw_awklib = "/usr/local/libexec/awk/"
+     }
+     
+     function _pw_init(    oldfs, oldrs, olddol0, pwcat)
+     {
+         if (_pw_inited)
+             return
+         oldfs = FS
+         oldrs = RS
+         olddol0 = $0
+         FS = ":"
+         RS = "\n"
+         pwcat = _pw_awklib "pwcat"
+         while ((pwcat | getline) > 0) {
+             _pw_byname[$1] = $0
+             _pw_byuid[$3] = $0
+             _pw_bycount[++_pw_total] = $0
+         }
+         close(pwcat)
+         _pw_count = 0
+         _pw_inited = 1
+         FS = oldfs
+         RS = oldrs
+         $0 = olddol0
+     }
+
+   The `BEGIN' rule sets a private variable to the directory where
+`pwcat' is stored.  Since it is used to help out an `awk' library
+routine, we have chosen to put it in `/usr/local/libexec/awk'.  You
+might want it to be in a different directory on your system.
+
+   The function `_pw_init' keeps three copies of the user information
+in three associative arrays.  The arrays are indexed by user name
+(`_pw_byname'), by user-id number (`_pw_byuid'), and by order of
+occurrence (`_pw_bycount').
+
+   The variable `_pw_inited' is used for efficiency; the database only
+needs to be read once, no matter how often `_pw_init' is called.
+
+   Since this function uses `getline' to read information from `pwcat',
+it first saves the values of `FS', `RS', and `$0'.  Doing so is
+necessary, since these functions could be called from anywhere within a
+user's program, and the user may have his or her own values for `FS'
+and `RS'.
+
+   The main part of the function uses a loop to read database lines,
+split the line into fields, and then store the line into each array as
+necessary.  When the loop is done, `_pw_init' cleans up by closing the
+pipeline, setting `_pw_inited' to one, and restoring `FS', `RS', and
+`$0'.  The use of `_pw_count' will be explained below.
+
+     function getpwnam(name)
+     {
+         _pw_init()
+         if (name in _pw_byname)
+             return _pw_byname[name]
+         return ""
+     }
+
+   The `getpwnam' function takes a user name as a string argument. If
+that user is in the database, it returns the appropriate line.
+Otherwise it returns the null string.
+
+     function getpwuid(uid)
+     {
+         _pw_init()
+         if (uid in _pw_byuid)
+             return _pw_byuid[uid]
+         return ""
+     }
+
+   Similarly, the `getpwuid' function takes a user-id number argument.
+If that user number is in the database, it returns the appropriate
+line. Otherwise it returns the null string.
+
+     function getpwent()
+     {
+         _pw_init()
+         if (_pw_count < _pw_total)
+             return _pw_bycount[++_pw_count]
+         return ""
+     }
+
+   The `getpwent' function simply steps through the database, one entry
+at a time.  It uses `_pw_count' to track its current position in the
+`_pw_bycount' array.
+
+     function endpwent()
+     {
+         _pw_count = 0
+     }
+
+   The `endpwent' function resets `_pw_count' to zero, so that
+subsequent calls to `getpwent' will start over again.
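+
+   As an illustration of how a main program might use these two
+functions together, here is a minimal sketch that prints every login
+name in the database.  It assumes that the library file shown above is
+loaded along with it; the variable names `line' and `fields' are
+arbitrary.
+
+     BEGIN {
+         while ((line = getpwent()) != "") {
+             split(line, fields, ":")    # fields[1] is the user name
+             print fields[1]
+         }
+         endpwent()    # rewind, so a later scan starts over
+     }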
+
+   A conscious design decision in this suite is that each subroutine
+calls `_pw_init' to initialize the database arrays.  The overhead of
+running a separate process to generate the user database, and the I/O
+to scan it, will only be incurred if the user's main program actually
+calls one of these functions.  If this library file is loaded along
+with a user's program, but none of the routines are ever called, then
+there is no extra run-time overhead.  (The alternative would be to move
+the body of `_pw_init' into a `BEGIN' rule, which would always run
+`pwcat'.  This simplifies the code but runs an extra process that may
+never be needed.)
+
+   In turn, calling `_pw_init' is not too expensive, since the
+`_pw_inited' variable keeps the program from reading the data more than
+once.  If you are worried about squeezing every last cycle out of your
+`awk' program, the check of `_pw_inited' could be moved out of
+`_pw_init' and duplicated in all the other functions.  In practice,
+this is not necessary, since most `awk' programs are I/O bound, and it
+would clutter up the code.
+
+   The `id' program in *Note Printing Out User Information: Id Program,
+uses these functions.
+
+
+File: gawk.info,  Node: Group Functions,  Next: Library Names,  Prev: Passwd Functions,  Up: Library Functions
+
+Reading the Group Database
+==========================
+
+   Much of the discussion presented in *Note Reading the User Database:
+Passwd Functions, applies to the group database as well.  Although
+there has traditionally been a well known file, `/etc/group', in a well
+known format, the POSIX standard only provides a set of C library
+routines (`<grp.h>' and `getgrent') for accessing the information.
+Even though this file may exist, it likely does not have complete
+information.  Therefore, as with the user database, it is necessary to
+have a small C program that generates the group database as its output.
+
+   Here is `grcat', a C program that "cats" the group database.
+
+     /*
+      * grcat.c
+      *
+      * Generate a printable version of the group database
+      *
+      * Arnold Robbins, arnold@gnu.ai.mit.edu
+      * May 1993
+      * Public Domain
+      */
+     
+     #include <stdio.h>
+     #include <stdlib.h>
+     #include <grp.h>
+     
+     int
+     main(argc, argv)
+     int argc;
+     char **argv;
+     {
+         struct group *g;
+         int i;
+     
+         while ((g = getgrent()) != NULL) {
+             printf("%s:%s:%d:", g->gr_name, g->gr_passwd,
+                                                 g->gr_gid);
+             for (i = 0; g->gr_mem[i] != NULL; i++) {
+                 printf("%s", g->gr_mem[i]);
+                 if (g->gr_mem[i+1] != NULL)
+                     putchar(',');
+             }
+             putchar('\n');
+         }
+         endgrent();
+         exit(0);
+     }
+
+   Each line in the group database represents one group.  The fields
+are separated with colons, and represent the following information.
+
+Group Name
+     The name of the group.
+
+Group Password
+     The encrypted group password. In practice, this field is never
+     used. It is usually empty, or set to `*'.
+
+Group ID Number
+     The numeric group-id number. This number should be unique within
+     the file.
+
+Group Member List
+     A comma-separated list of user names.  These users are members of
+     the group.  Most Unix systems allow users to be members of several
+     groups simultaneously.  If your system does, then reading
+     `/dev/user' will return those group-id numbers in `$5' through
+     `$NF'.  (Note that `/dev/user' is a `gawk' extension; *note
+     Special File Names in `gawk': Special Files..)
+
+   Here is what running `grcat' might produce:
+
+     $ grcat
+     -| wheel:*:0:arnold
+     -| nogroup:*:65534:
+     -| daemon:*:1:
+     -| kmem:*:2:
+     -| staff:*:10:arnold,miriam,andy
+     -| other:*:20:
+     ...
+
+   Here are the functions for obtaining information from the group
+database.  There are several, modeled after the C library functions of
+the same names.
+
+     # group.awk --- functions for dealing with the group file
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     BEGIN    \
+     {
+         # Change to suit your system
+         _gr_awklib = "/usr/local/libexec/awk/"
+     }
+     
+     function _gr_init(    oldfs, oldrs, olddol0, grcat, n, a, i)
+     {
+         if (_gr_inited)
+             return
+     
+         oldfs = FS
+         oldrs = RS
+         olddol0 = $0
+         FS = ":"
+         RS = "\n"
+     
+         grcat = _gr_awklib "grcat"
+         while ((grcat | getline) > 0) {
+             if ($1 in _gr_byname)
+                 _gr_byname[$1] = _gr_byname[$1] "," $4
+             else
+                 _gr_byname[$1] = $0
+             if ($3 in _gr_bygid)
+                 _gr_bygid[$3] = _gr_bygid[$3] "," $4
+             else
+                 _gr_bygid[$3] = $0
+     
+             n = split($4, a, "[ \t]*,[ \t]*")
+             for (i = 1; i <= n; i++)
+                 if (a[i] in _gr_groupsbyuser)
+                     _gr_groupsbyuser[a[i]] = \
+                         _gr_groupsbyuser[a[i]] " " $1
+                 else
+                     _gr_groupsbyuser[a[i]] = $1
+     
+             _gr_bycount[++_gr_count] = $0
+         }
+         close(grcat)
+         _gr_count = 0
+         _gr_inited++
+         FS = oldfs
+         RS = oldrs
+         $0 = olddol0
+     }
+
+   The `BEGIN' rule sets a private variable to the directory where
+`grcat' is stored.  Since it is used to help out an `awk' library
+routine, we have chosen to put it in `/usr/local/libexec/awk'.  You
+might want it to be in a different directory on your system.
+
+   These routines follow the same general outline as the user database
+routines (*note Reading the User Database: Passwd Functions.).  The
+`_gr_inited' variable is used to ensure that the database is scanned no
+more than once.  The `_gr_init' function first saves `FS', `RS', and
+`$0', and then sets `FS' and `RS' to the correct values for scanning
+the group information.
+
+   The group information is stored in several associative arrays.  The
+arrays are indexed by group name (`_gr_byname'), by group-id number
+(`_gr_bygid'), and by position in the database (`_gr_bycount').  There
+is an additional array indexed by user name (`_gr_groupsbyuser'), that
+is a space separated list of groups that each user belongs to.
+
+   Unlike the user database, it is possible to have multiple records in
+the database for the same group.  This is common when a group has a
+large number of members.  Such a pair of entries might look like:
+
+     tvpeople:*:101:johny,jay,arsenio
+     tvpeople:*:101:david,conan,tom,joan
+
+   For this reason, `_gr_init' looks to see if a group name or group-id
+number has already been seen.  If it has, then the user names are
+simply concatenated onto the previous list of users.  (There is
+actually a subtle problem with the code presented above.  Suppose that
+the first record for a group lists no members.  The code then adds the
+names from later records with a leading comma.  It also doesn't check
+that `$4' is non-empty.)
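+
+   One possible way to avoid both problems is to do the concatenation
+in a small helper.  This is only a sketch; the function `_gr_merge' is
+hypothetical and is not part of `group.awk'.  It assumes that OLD is a
+record already stored in one of the arrays, and that MEMBERS is the
+`$4' of a later record for the same group.
+
+     # _gr_merge --- append a member list to a stored group record
+     function _gr_merge(old, members)
+     {
+         if (members == "")    # nothing to add
+             return old
+         if (old ~ /:$/)       # no members stored so far
+             return old members
+         return old "," members
+     }
+
+   With such a helper, the assignment in `_gr_init' could be written as
+`_gr_byname[$1] = _gr_merge(_gr_byname[$1], $4)', and similarly for
+`_gr_bygid'.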
+
+   Finally, `_gr_init' closes the pipeline to `grcat', restores `FS',
+`RS', and `$0', initializes `_gr_count' to zero (it is used later), and
+makes `_gr_inited' non-zero.
+
+     function getgrnam(group)
+     {
+         _gr_init()
+         if (group in _gr_byname)
+             return _gr_byname[group]
+         return ""
+     }
+
+   The `getgrnam' function takes a group name as its argument, and if
+that group exists, it is returned. Otherwise, `getgrnam' returns the
+null string.
+
+     function getgrgid(gid)
+     {
+         _gr_init()
+         if (gid in _gr_bygid)
+             return _gr_bygid[gid]
+         return ""
+     }
+
+   The `getgrgid' function is similar; it takes a numeric group-id and
+looks up the information associated with that group-id.
+
+     function getgruser(user)
+     {
+         _gr_init()
+         if (user in _gr_groupsbyuser)
+             return _gr_groupsbyuser[user]
+         return ""
+     }
+
+   The `getgruser' function does not have a C counterpart. It takes a
+user name, and returns the list of groups that have the user as a
+member.
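+
+   For example, a main program could list the groups of one user, one
+per line.  This is a sketch that assumes `group.awk' is loaded with
+it; the user name `arnold' is just the one from the sample `grcat'
+output shown earlier.
+
+     BEGIN {
+         # getgruser returns a space-separated list of group names
+         n = split(getgruser("arnold"), groups, " ")
+         for (i = 1; i <= n; i++)
+             print groups[i]
+     }
+
+   If the user is in no group, `getgruser' returns the null string,
+`split' returns zero, and nothing is printed.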
+
+     function getgrent()
+     {
+         _gr_init()
+         if (++_gr_count in _gr_bycount)
+             return _gr_bycount[_gr_count]
+         return ""
+     }
+
+   The `getgrent' function steps through the database one entry at a
+time.  It uses `_gr_count' to track its position in the list.
+
+     function endgrent()
+     {
+         _gr_count = 0
+     }
+
+   `endgrent' resets `_gr_count' to zero so that `getgrent' can start
+over again.
+
+   As with the user database routines, each function calls `_gr_init' to
+initialize the arrays.  Doing so only incurs the extra overhead of
+running `grcat' if these functions are used (as opposed to moving the
+body of `_gr_init' into a `BEGIN' rule).
+
+   Most of the work is in scanning the database and building the various
+associative arrays.  The functions that the user calls are themselves
+very simple, relying on `awk''s associative arrays to do the work.
+
+   The `id' program in *Note Printing Out User Information: Id Program,
+uses these functions.
+
+
+File: gawk.info,  Node: Library Names,  Prev: Group Functions,  Up: Library Functions
+
+Naming Library Function Global Variables
+========================================
+
+   Due to the way the `awk' language evolved, variables are either
+"global" (usable by the entire program), or "local" (usable just by a
+specific function).  There is no intermediate state analogous to
+`static' variables in C.
+
+   Library functions often need to have global variables that they can
+use to preserve state information between calls to the function. For
+example, `getopt''s variable `_opti' (*note Processing Command Line
+Options: Getopt Function.), and the `_tm_months' array used by `mktime'
+(*note Turning Dates Into Timestamps: Mktime Function.).  Such
+variables are called "private", since the only functions that need to
+use them are the ones in the library.
+
+   When writing a library function, you should try to choose names for
+your private variables so that they will not conflict with any
+variables used by either another library function or a user's main
+program.  For example, a name like `i' or `j' is not a good choice,
+since user programs often use variable names like these for their own
+purposes.
+
+   The example programs shown in this chapter all start the names of
+their private variables with an underscore (`_').  Users generally
+don't use leading underscores in their variable names, so this
+convention immediately decreases the chances that the variable name
+will be accidentally shared with the user's program.
+
+   In addition, several of the library functions use a prefix that helps
+indicate what function or set of functions uses the variables. For
+example, `_tm_months' in `mktime' (*note Turning Dates Into Timestamps:
+Mktime Function.), and `_pw_byname' in the user database routines
+(*note Reading the User Database: Passwd Functions.).  This convention
+is recommended, since it even further decreases the chance of
+inadvertent conflict among variable names.  Note that this convention
+can be used equally well both for variable names and for private
+function names too.
+
+   While I could have re-written all the library routines to use this
+convention, I did not do so, in order to show how my own `awk'
+programming style has evolved, and to provide some basis for this
+discussion.
+
+   As a final note on variable naming, if a function makes global
+variables available for use by a main program, it is a good convention
+to start that variable's name with a capital letter.  For example,
+`getopt''s `Opterr' and `Optind' variables (*note Processing Command
+Line Options: Getopt Function.).  The leading capital letter indicates
+that it is global, while the fact that the variable name is not all
+capital letters indicates that the variable is not one of `awk''s
+built-in variables, like `FS'.
+
+   It is also important that _all_ variables in library functions that
+do not need to save state are in fact declared local.  If this is not
+done, the variable could accidentally be used in the user's program,
+leading to bugs that are very difficult to track down.
+
+     function lib_func(x, y,    l1, l2)
+     {
+         ...
+         USE VARIABLE some_var  # some_var could be local
+         ...                   # but is not by oversight
+     }
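+
+   Here is a small, self-contained sketch of such a bug.  The function
+name `sum_to' is made up for the illustration.  The loop variable `i'
+should have been listed as a local variable, but was not.
+
+     function sum_to(n,    total)
+     {
+         total = 0
+         for (i = 1; i <= n; i++)    # i is global by oversight
+             total += i
+         return total
+     }
+     
+     BEGIN {
+         for (i = 1; i <= 3; i++) {  # the caller's own i
+             s = sum_to(2)           # leaves i set to 3
+             print s
+         }
+     }
+
+   The loop in the `BEGIN' rule runs only once instead of three times,
+since the call to `sum_to' changes the caller's `i'.  Declaring the
+function as `sum_to(n,    total, i)' fixes the problem.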
+
+   A different convention, common in the Tcl community, is to use a
+single associative array to hold the values needed by the library
+function(s), or "package."  This significantly decreases the number of
+actual global names in use.  For example, the functions described in
+*Note Reading the User Database: Passwd Functions, might have used
+`PW_data["inited"]', `PW_data["awklib"]', `PW_data["total"]', and
+`PW_data["count"]', instead of `_pw_inited', `_pw_awklib', `_pw_total',
+and `_pw_count'.
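+
+   A minimal sketch of this convention follows.  The package shown
+here, with its functions `nextid' and `resetid' and its array
+`ID_data', is invented for the illustration; it is not one of the
+library files in this chapter.
+
+     # All state for the package lives in the single array ID_data
+     function nextid()
+     {
+         return ++ID_data["count"]
+     }
+     
+     function resetid()
+     {
+         ID_data["count"] = 0
+     }
+     
+     BEGIN {
+         print nextid()    # first call
+         print nextid()    # second call
+         resetid()
+         print nextid()    # counting starts over
+     }
+
+   Only one global variable name, `ID_data', is visible to the user's
+program, no matter how many values the package needs to keep.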
+
+   The conventions presented in this section are exactly that,
+conventions.  You are not required to write your programs this way; we
+merely recommend that you do so.
+
+
+File: gawk.info,  Node: Sample Programs,  Next: Language History,  Prev: Library Functions,  Up: Top
+
+Practical `awk' Programs
+************************
+
+   This chapter presents a potpourri of `awk' programs for your reading
+enjoyment.
+
+   Many of these programs use the library functions presented in *Note
+A Library of `awk' Functions: Library Functions.
+
+* Menu:
+
+* Clones::                    Clones of common utilities.
+* Miscellaneous Programs::    Some interesting `awk' programs.
+
+
+File: gawk.info,  Node: Clones,  Next: Miscellaneous Programs,  Prev: Sample Programs,  Up: Sample Programs
+
+Re-inventing Wheels for Fun and Profit
+======================================
+
+   This section presents a number of POSIX utilities that are
+implemented in `awk'.  Re-inventing these programs in `awk' is often
+enjoyable, since the algorithms can be very clearly expressed, and
+usually the code is very concise and simple.  This is true because
+`awk' does so much for you.
+
+   It should be noted that these programs are not necessarily intended
+to replace the installed versions on your system.  Instead, their
+purpose is to illustrate `awk' language programming for "real world"
+tasks.
+
+   The programs are presented in alphabetical order.
+
+* Menu:
+
+* Cut Program::             The `cut' utility.
+* Egrep Program::           The `egrep' utility.
+* Id Program::              The `id' utility.
+* Split Program::           The `split' utility.
+* Tee Program::             The `tee' utility.
+* Uniq Program::            The `uniq' utility.
+* Wc Program::              The `wc' utility.
+
+
+File: gawk.info,  Node: Cut Program,  Next: Egrep Program,  Prev: Clones,  Up: Clones
+
+Cutting Out Fields and Columns
+------------------------------
+
+   The `cut' utility selects, or "cuts," either characters or fields
+from its standard input and sends them to its standard output.  `cut'
+can cut out either a list of characters, or a list of fields.  By
+default, fields are separated by tabs, but you may supply a command
+line option to change the field "delimiter", i.e. the field separator
+character. `cut''s definition of fields is less general than `awk''s.
+
+   A common use of `cut' might be to pull out just the login name of
+logged-on users from the output of `who'.  For example, the following
+pipeline generates a sorted, unique list of the logged on users:
+
+     who | cut -c1-8 | sort | uniq
+
+   The options for `cut' are:
+
+`-c LIST'
+     Use LIST as the list of characters to cut out.  Items within the
+     list may be separated by commas, and ranges of characters can be
+     separated with dashes.  The list `1-8,15,22-35' specifies
+     characters one through eight, 15, and 22 through 35.
+
+`-f LIST'
+     Use LIST as the list of fields to cut out.
+
+`-d DELIM'
+     Use DELIM as the field separator character instead of the tab
+     character.
+
+`-s'
+     Suppress printing of lines that do not contain the field delimiter.
+
+   The `awk' implementation of `cut' uses the `getopt' library function
+(*note Processing Command Line Options: Getopt Function.), and the
+`join' library function (*note Merging an Array Into a String: Join
+Function.).
+
+   The program begins with a comment describing the options and a
+`usage' function which prints out a usage message and exits.  `usage'
+is called if invalid arguments are supplied.
+
+     # cut.awk --- implement cut in awk
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     # Options:
+     #    -f list        Cut fields
+     #    -d c           Field delimiter character
+     #    -c list        Cut characters
+     #
+     #    -s        Suppress lines without the delimiter character
+     
+     function usage(    e1, e2)
+     {
+         e1 = "usage: cut [-f list] [-d c] [-s] [files...]"
+         e2 = "usage: cut [-c list] [files...]"
+         print e1 > "/dev/stderr"
+         print e2 > "/dev/stderr"
+         exit 1
+     }
+
+The variables `e1' and `e2' are used so that the function fits nicely
+on the screen.
+
+   Next comes a `BEGIN' rule that parses the command line options.  It
+sets `FS' to a single tab character, since that is `cut''s default
+field separator.  The output field separator is also set to be the same
+as the input field separator.  Then `getopt' is used to step through
+the command line options.  One or the other of the variables
+`by_fields' or `by_chars' is set to true, to indicate that processing
+should be done by fields or by characters respectively.  When cutting
+by characters, the output field separator is set to the null string.
+
+     BEGIN    \
+     {
+         FS = "\t"    # default
+         OFS = FS
+         while ((c = getopt(ARGC, ARGV, "sf:c:d:")) != -1) {
+             if (c == "f") {
+                 by_fields = 1
+                 fieldlist = Optarg
+             } else if (c == "c") {
+                 by_chars = 1
+                 fieldlist = Optarg
+                 OFS = ""
+             } else if (c == "d") {
+                 if (length(Optarg) > 1) {
+                     printf("Using first character of %s" \
+                     " for delimiter\n", Optarg) > "/dev/stderr"
+                     Optarg = substr(Optarg, 1, 1)
+                 }
+                 FS = Optarg
+                 OFS = FS
+                 if (FS == " ")    # defeat awk semantics
+                     FS = "[ ]"
+             } else if (c == "s")
+                 suppress++
+             else
+                 usage()
+         }
+     
+         for (i = 1; i < Optind; i++)
+             ARGV[i] = ""
+
+   Special care is taken when the field delimiter is a space. Using
+`" "' (a single space) for the value of `FS' is incorrect--`awk' would
+separate fields with runs of spaces, tabs and/or newlines, and we want
+them to be separated with individual spaces.  Also, note that after
+`getopt' is through, we have to clear out all the elements of `ARGV'
+from one up to, but not including, `Optind', so that `awk' will not try
+to process the command line options as file names.
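+
+   The difference between the two values of `FS' can be seen with
+`split', which follows the same rules for its third argument.  This
+stand-alone sketch is separate from `cut.awk'; the variable names are
+arbitrary.
+
+     BEGIN {
+         line = "a  b"    # two spaces between a and b
+         n1 = split(line, parts, " ")      # runs of whitespace
+         n2 = split(line, parts, "[ ]")    # each space separates
+         print n1, n2
+     }
+
+   This prints `2 3'.  With `"[ ]"', the two adjacent spaces delimit an
+empty field, which is what `cut' expects.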
+
+   After dealing with the command line options, the program verifies
+that the options make sense.  Only one or the other of `-c' and `-f'
+should be used, and both require a field list.  Then either
+`set_fieldlist' or `set_charlist' is called to pull apart the list of
+fields or characters.
+
+         if (by_fields && by_chars)
+             usage()
+     
+         if (by_fields == 0 && by_chars == 0)
+             by_fields = 1    # default
+     
+         if (fieldlist == "") {
+             print "cut: needs list for -c or -f" > "/dev/stderr"
+             exit 1
+         }
+     
+         if (by_fields)
+             set_fieldlist()
+         else
+             set_charlist()
+     }
+
+   Here is `set_fieldlist'.  It first splits the field list apart at
+the commas, into an array.  Then, for each element of the array, it
+looks to see if it is actually a range, and if so splits it apart. The
+range is verified to make sure the first number is smaller than the
+second.  Each number in the list is added to the `flist' array, which
+simply lists the fields that will be printed.  Normal field splitting
+is used.  The program lets `awk' handle the job of doing the field
+splitting.
+
+     function set_fieldlist(        n, m, i, j, k, f, g)
+     {
+         n = split(fieldlist, f, ",")
+         j = 1    # index in flist
+         for (i = 1; i <= n; i++) {
+             if (index(f[i], "-") != 0) { # a range
+                 m = split(f[i], g, "-")
+                 if (m != 2 || g[1] >= g[2]) {
+                     printf("bad field list: %s\n",
+                                       f[i]) > "/dev/stderr"
+                     exit 1
+                 }
+                 for (k = g[1]; k <= g[2]; k++)
+                     flist[j++] = k
+             } else
+                 flist[j++] = f[i]
+         }
+         nfields = j - 1
+     }
+
+   The `set_charlist' function is more complicated than `set_fieldlist'.
+The idea here is to use `gawk''s `FIELDWIDTHS' variable (*note Reading
+Fixed-width Data: Constant Size.), which describes constant width
+input.  When using a character list, that is exactly what we have.
+
+   Setting up `FIELDWIDTHS' is more complicated than simply listing the
+fields that need to be printed.  We have to keep track of the fields to
+be printed, and also the intervening characters that have to be skipped.
+For example, suppose you wanted characters one through eight, 15, and
+22 through 35.  You would use `-c 1-8,15,22-35'.  The necessary value
+for `FIELDWIDTHS' would be `"8 6 1 6 14"'.  This gives us five fields,
+and what should be printed are `$1', `$3', and `$5'.  The intermediate
+fields are "filler," stuff in between the desired data.
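+
+   The effect of this value can be tried on its own.  The following
+sketch is not part of `cut.awk'; it simply hard-codes the value from
+the example above, and it requires `gawk'.
+
+     BEGIN { FIELDWIDTHS = "8 6 1 6 14" }
+     
+     # $2 and $4 are the filler fields; they are not printed
+     { print $1 $3 $5 }
+
+   Given the input line `abcdefghijklmnopqrstuvwxyz0123456789', this
+prints `abcdefghovwxyz012345678', i.e., characters one through eight,
+15, and 22 through 35.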
+
+   `flist' lists the fields to be printed, and `t' tracks the complete
+field list, including filler fields.
+
+     function set_charlist(    field, i, j, f, g, t,
+                               filler, last, len, n, m)
+     {
+         field = 1   # count total fields
+         n = split(fieldlist, f, ",")
+         j = 1       # index in flist
+         for (i = 1; i <= n; i++) {
+             if (index(f[i], "-") != 0) { # range
+                 m = split(f[i], g, "-")
+                 if (m != 2 || g[1] >= g[2]) {
+                     printf("bad character list: %s\n",
+                                    f[i]) > "/dev/stderr"
+                     exit 1
+                 }
+                 len = g[2] - g[1] + 1
+                 if (g[1] > 1)  # compute length of filler
+                     filler = g[1] - last - 1
+                 else
+                     filler = 0
+                 if (filler)
+                     t[field++] = filler
+                 t[field++] = len  # length of field
+                 last = g[2]
+                 flist[j++] = field - 1
+             } else {
+                 if (f[i] > 1)
+                     filler = f[i] - last - 1
+                 else
+                     filler = 0
+                 if (filler)
+                     t[field++] = filler
+                 t[field++] = 1
+                 last = f[i]
+                 flist[j++] = field - 1
+             }
+         }
+         FIELDWIDTHS = join(t, 1, field - 1)
+         nfields = j - 1
+     }
+
+   Here is the rule that actually processes the data.  If the `-s'
+option was given, then `suppress' will be true.  The first `if'
+statement makes sure that the input record does have the field
+separator.  If `cut' is processing fields, `suppress' is true, and the
+field separator character is not in the record, then the record is
+skipped.
+
+   If the record is valid, then at this point, `gawk' has split the data
+into fields, either using the character in `FS' or using fixed-length
+fields and `FIELDWIDTHS'.  The loop goes through the list of fields
+that should be printed.  If the corresponding field has data in it, it
+is printed.  If the next field also has data, then the separator
+character is written out in between the fields.
+
+     {
+         if (by_fields && suppress && $0 !~ FS)
+             next
+     
+         for (i = 1; i <= nfields; i++) {
+             if ($flist[i] != "") {
+                 printf "%s", $flist[i]
+                 if (i < nfields && $flist[i+1] != "")
+                     printf "%s", OFS
+             }
+         }
+         print ""
+     }
+
+   This version of `cut' relies on `gawk''s `FIELDWIDTHS' variable to
+do the character-based cutting.  While it would be possible in other
+`awk' implementations to use `substr' (*note Built-in Functions for
+String Manipulation: String Functions.), it would also be extremely
+painful to do so.  The `FIELDWIDTHS' variable supplies an elegant
+solution to the problem of picking the input line apart by characters.
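+
+   For comparison, here is a rough sketch of what a `substr' based
+approach might look like in other versions of `awk'.  It is not part of
+`cut.awk'.  The function name `cut_chars' is hypothetical, and the
+sketch assumes that LIST has already been validated.
+
+     # cut_chars --- return the characters of line named by list,
+     # e.g., "1-8,15,22-35"
+     function cut_chars(line, list,    n, i, f, g, out)
+     {
+         out = ""
+         n = split(list, f, ",")
+         for (i = 1; i <= n; i++) {
+             if (split(f[i], g, "-") == 2)    # a range
+                 out = out substr(line, g[1], g[2] - g[1] + 1)
+             else                             # a single character
+                 out = out substr(line, f[i], 1)
+         }
+         return out
+     }
+
+   A rule such as `{ print cut_chars($0, "1-8,15,22-35") }' would then
+do the cutting, at the cost of re-parsing the list for every record.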
+
+
+File: gawk.info,  Node: Egrep Program,  Next: Id Program,  Prev: Cut Program,  Up: Clones
+
+Searching for Regular Expressions in Files
+------------------------------------------
+
+   The `egrep' utility searches files for patterns.  It uses regular
+expressions that are almost identical to those available in `awk'
+(*note Regular Expression Constants: Regexp Constants.).  It is used
+this way:
+
+     egrep [ OPTIONS ] 'PATTERN' FILES ...
+
+   The PATTERN is a regexp.  In typical usage, the regexp is quoted to
+prevent the shell from expanding any of the special characters as file
+name wildcards.  Normally, `egrep' prints the lines that matched.  If
+multiple file names are provided on the command line, each output line
+is preceded by the name of the file and a colon.
+
+   The options are:
+
+`-c'
+     Print out a count of the lines that matched the pattern, instead
+     of the lines themselves.
+
+`-s'
+     Be silent.  No output is produced, and the exit value indicates
+     whether or not the pattern was matched.
+
+`-v'
+     Invert the sense of the test. `egrep' prints the lines that do
+     _not_ match the pattern, and exits successfully if the pattern was
+     not matched.
+
+`-i'
+     Ignore case distinctions in both the pattern and the input data.
+
+`-l'
+     Only print the names of the files that matched, not the lines that
+     matched.
+
+`-e PATTERN'
+     Use PATTERN as the regexp to match.  The purpose of the `-e'
+     option is to allow patterns that start with a `-'.
+
+   This version uses the `getopt' library function (*note Processing
+Command Line Options: Getopt Function.), and the file transition
+library program (*note Noting Data File Boundaries: Filetrans
+Function.).
+
+   The program begins with a descriptive comment, and then a `BEGIN'
+rule that processes the command line arguments with `getopt'.  The `-i'
+(ignore case) option is particularly easy with `gawk'; we just use the
+`IGNORECASE' built-in variable (*note Built-in Variables::).
+
+     # egrep.awk --- simulate egrep in awk
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     # Options:
+     #    -c    count of lines
+     #    -s    silent - use exit value
+     #    -v    invert test, success if no match
+     #    -i    ignore case
+     #    -l    print filenames only
+     #    -e    argument is pattern
+     
+     BEGIN {
+         while ((c = getopt(ARGC, ARGV, "ce:svil")) != -1) {
+             if (c == "c")
+                 count_only++
+             else if (c == "s")
+                 no_print++
+             else if (c == "v")
+                 invert++
+             else if (c == "i")
+                 IGNORECASE = 1
+             else if (c == "l")
+                 filenames_only++
+             else if (c == "e")
+                 pattern = Optarg
+             else
+                 usage()
+         }
+
+   Next comes the code that handles the `egrep' specific behavior. If no
+pattern was supplied with `-e', the first non-option on the command
+line is used.  The `awk' command line arguments up to `ARGV[Optind]'
+are cleared, so that `awk' won't try to process them as files.  If no
+files were specified, the standard input is used, and if multiple files
+were specified, we make sure to note this so that the file names can
+precede the matched lines in the output.
+
+   The last two lines are commented out, since they are not needed in
+`gawk'.  They should be uncommented if you have to use another version
+of `awk'.
+
+         if (pattern == "")
+             pattern = ARGV[Optind++]
+     
+         for (i = 1; i < Optind; i++)
+             ARGV[i] = ""
+         if (Optind >= ARGC) {
+             ARGV[1] = "-"
+             ARGC = 2
+         } else if (ARGC - Optind > 1)
+             do_filenames++
+     
+     #    if (IGNORECASE)
+     #        pattern = tolower(pattern)
+     }
+
+   The next set of lines should be uncommented if you are not using
+`gawk'.  This rule translates all the characters in the input line into
+lower-case if the `-i' option was specified.  The rule is commented out
+since it is not necessary with `gawk'.
+
+     #{
+     #    if (IGNORECASE)
+     #        $0 = tolower($0)
+     #}
+
+   The `beginfile' function is called by the rule in `ftrans.awk' when
+each new file is processed.  In this case, it is very simple; all it
+does is initialize a variable `fcount' to zero. `fcount' tracks how
+many lines in the current file matched the pattern.
+
+     function beginfile(junk)
+     {
+         fcount = 0
+     }
+
+   The `endfile' function is called after each file has been processed.
+It is used only when the user wants a count of the number of lines that
+matched.  `no_print' will be true only if the exit status is desired.
+`count_only' will be true if line counts are desired.  `egrep' will
+therefore only print line counts if printing and counting are enabled.
+The output format must be adjusted depending upon the number of files
+to be processed.  Finally, `fcount' is added to `total', so that we
+know how many lines altogether matched the pattern.
+
+     function endfile(file)
+     {
+         if (! no_print && count_only)
+             if (do_filenames)
+                 print file ":" fcount
+             else
+                 print fcount
+     
+         total += fcount
+     }
+
+   This rule does most of the work of matching lines. The variable
+`matches' will be true if the line matched the pattern. If the user
+wants lines that did not match, the sense of `matches' is inverted
+using the `!' operator. `fcount' is incremented with the value of
+`matches', which will be either one or zero, depending upon a
+successful or unsuccessful match.  If the line did not match, the
+`next' statement just moves on to the next record.
+
+   There are several optimizations for performance in the following few
+lines of code. If the user only wants exit status (`no_print' is true),
+and we don't have to count lines, then it is enough to know that one
+line in this file matched, and we can skip on to the next file with
+`nextfile'.  Along similar lines, if we are only printing file names,
+and we don't need to count lines, we can print the file name, and then
+skip to the next file with `nextfile'.
+
+   Finally, each line is printed, with a leading filename and colon if
+necessary.
+
+     {
+         matches = ($0 ~ pattern)
+         if (invert)
+             matches = ! matches
+     
+         fcount += matches    # 1 or 0
+     
+         if (! matches)
+             next
+     
+         if (no_print && ! count_only)
+             nextfile
+     
+         if (filenames_only && ! count_only) {
+             print FILENAME
+             nextfile
+         }
+     
+         if (do_filenames && ! count_only)
+             print FILENAME ":" $0
+         else if (! count_only)
+             print
+     }
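+
+   As a small, separate illustration of the counting idiom used above
+(it is not part of `egrep.awk'), the following sketch shows that a
+match expression yields one or zero, so it can be added directly to a
+counter.  The variable names `line' and `hits' are made up for this
+example.
+
+     BEGIN {
+         line = "gawk is GNU awk"
+         hits = 0
+         hits += (line ~ /awk/)       # match: adds 1
+         hits += (line ~ /perl/)      # no match: adds 0
+         hits += ! (line ~ /perl/)    # inverted sense: adds 1
+         print hits                   # prints 2
+     }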
+
+   The `END' rule takes care of producing the correct exit status. If
+there were no matches, the exit status is one, otherwise it is zero.
+
+     END    \
+     {
+         if (total == 0)
+             exit 1
+         exit 0
+     }
+
+   The `usage' function prints a usage message in case of invalid
+options and then exits.
+
+     function usage(    e)
+     {
+         e = "Usage: egrep [-csvil] [-e pat] [files ...]"
+         print e > "/dev/stderr"
+         exit 1
+     }
+
+   The variable `e' is used so that the function fits nicely on the
+printed page.
+
+   Just a note on programming style. You may have noticed that the `END'
+rule uses backslash continuation, with the open brace on a line by
+itself.  This is so that it more closely resembles the way functions
+are written.  Many of the examples use this style. You can decide for
+yourself if you like writing your `BEGIN' and `END' rules this way, or
+not.
+
+
+File: gawk.info,  Node: Id Program,  Next: Split Program,  Prev: Egrep Program,  Up: Clones
+
+Printing Out User Information
+-----------------------------
+
+   The `id' utility lists a user's real and effective user-id numbers,
+real and effective group-id numbers, and the user's group set, if any.
+`id' will only print the effective user-id and group-id if they are
+different from the real ones.  If possible, `id' will also supply the
+corresponding user and group names.  The output might look like this:
+
+     $ id
+     -| uid=2076(arnold) gid=10(staff) groups=10(staff),4(tty)
+
+   This information is exactly what is provided by `gawk''s `/dev/user'
+special file (*note Special File Names in `gawk': Special Files.).
+However, the `id' utility provides a more palatable output than just a
+string of numbers.
+
+   Here is a simple version of `id' written in `awk'.  It uses the user
+database library functions (*note Reading the User Database: Passwd
+Functions.), and the group database library functions (*note Reading
+the Group Database: Group Functions.).
+
+   The program is fairly straightforward.  All the work is done in the
+`BEGIN' rule.  The user and group id numbers are obtained from
+`/dev/user'.  If there is no support for `/dev/user', the program gives
+up.
+
+   The code is repetitive.  The entry in the user database for the real
+user-id number is split into parts at the `:'. The name is the first
+field.  Similar code is used for the effective user-id number, and the
+group numbers.
+
+     # id.awk --- implement id in awk
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     # output is:
+     # uid=12(foo) euid=34(bar) gid=3(baz) \
+     #             egid=5(blat) groups=9(nine),2(two),1(one)
+     
+     BEGIN    \
+     {
+         if ((getline < "/dev/user") < 0) {
+             err = "id: no /dev/user support - cannot run"
+             print err > "/dev/stderr"
+             exit 1
+         }
+         close("/dev/user")
+     
+         uid = $1
+         euid = $2
+         gid = $3
+         egid = $4
+     
+         printf("uid=%d", uid)
+         pw = getpwuid(uid)
+         if (pw != "") {
+             split(pw, a, ":")
+             printf("(%s)", a[1])
+         }
+     
+         if (euid != uid) {
+             printf(" euid=%d", euid)
+             pw = getpwuid(euid)
+             if (pw != "") {
+                 split(pw, a, ":")
+                 printf("(%s)", a[1])
+             }
+         }
+     
+         printf(" gid=%d", gid)
+         pw = getgrgid(gid)
+         if (pw != "") {
+             split(pw, a, ":")
+             printf("(%s)", a[1])
+         }
+     
+         if (egid != gid) {
+             printf(" egid=%d", egid)
+             pw = getgrgid(egid)
+             if (pw != "") {
+                 split(pw, a, ":")
+                 printf("(%s)", a[1])
+             }
+         }
+     
+         if (NF > 4) {
+             printf(" groups=");
+             for (i = 5; i <= NF; i++) {
+                 printf("%d", $i)
+                 pw = getgrgid($i)
+                 if (pw != "") {
+                     split(pw, a, ":")
+                     printf("(%s)", a[1])
+                 }
+                 if (i < NF)
+                     printf(",")
+             }
+         }
+         print ""
+     }
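+
+   One way to reduce the repetition, shown here only as a sketch, is to
+move the split-and-print step into a helper function.  The function
+name `pr_name' and the sample database entry are hypothetical; in
+`id.awk' the entry would come from `getpwuid' or `getgrgid'.
+
+     function pr_name(entry,    a)
+     {
+         if (entry != "") {
+             split(entry, a, ":")
+             printf("(%s)", a[1])    # the name is the first field
+         }
+     }
+     
+     BEGIN {
+         # made-up entry, for illustration only
+         pw = "arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh"
+         printf("uid=%d", 2076)
+         pr_name(pw)
+         print ""                    # prints uid=2076(arnold)
+     }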
+
+
+File: gawk.info,  Node: Split Program,  Next: Tee Program,  Prev: Id Program,  Up: Clones
+
+Splitting a Large File Into Pieces
+----------------------------------
+
+   The `split' program splits large text files into smaller pieces. By
+default, the output files are named `xaa', `xab', and so on. Each file
+has 1000 lines in it, with the likely exception of the last file. To
+change the number of lines in each file, you supply a number on the
+command line preceded with a minus, e.g., `-500' for files with 500
+lines in them instead of 1000.  To change the name of the output files
+to something like `myfileaa', `myfileab', and so on, you supply an
+additional argument that specifies the filename.
+
+   Here is a version of `split' in `awk'. It uses the `ord' and `chr'
+functions presented in *Note Translating Between Characters and
+Numbers: Ordinal Functions.
+
+   The program first sets its defaults, and then tests to make sure
+there are not too many arguments.  It then looks at each argument in
+turn.  The first argument could be a minus followed by a number. If it
+is, this happens to look like a negative number, so it is made
+positive, and that is the count of lines.  The data file name is
+skipped over, and the final argument is used as the prefix for the
+output file names.
+
+     # split.awk --- do split in awk
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     # usage: split [-num] [file] [outname]
+     
+     BEGIN    \
+     {
+         outfile = "x"    # default
+         count = 1000
+         if (ARGC > 4)
+             usage()
+     
+         i = 1
+         if (ARGV[i] ~ /^-[0-9]+$/) {
+             count = -ARGV[i]
+             ARGV[i] = ""
+             i++
+         }
+         # test argv in case reading from stdin instead of file
+         if (i in ARGV)
+             i++    # skip data file name
+         if (i in ARGV) {
+             outfile = ARGV[i]
+             ARGV[i] = ""
+         }
+     
+         s1 = s2 = "a"
+         out = (outfile s1 s2)
+     }
+
+   The next rule does most of the work. `tcount' (temporary count)
+tracks how many lines have been printed to the output file so far. If
+it is greater than `count', it is time to close the current file and
+start a new one.  `s1' and `s2' track the current suffixes for the file
+name. If they are both `z', the file is just too big.  Otherwise, if
+`s2' is `z', `s1' moves to the next letter in the alphabet and `s2'
+starts over again at `a'; if not, `s2' moves to the next letter.
+
+     {
+         if (++tcount > count) {
+             close(out)
+             if (s2 == "z") {
+                 if (s1 == "z") {
+                     printf("split: %s is too large to split\n", \
+                            FILENAME) > "/dev/stderr"
+                     exit 1
+                 }
+                 s1 = chr(ord(s1) + 1)
+                 s2 = "a"
+             } else
+                 s2 = chr(ord(s2) + 1)
+             out = (outfile s1 s2)
+             tcount = 1
+         }
+         print > out
+     }
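+
+   The suffix handling can be tried in isolation.  The following sketch
+is not taken from `split.awk'; the function name `next_suffix' is
+hypothetical, and it uses `index' and `substr' on a string of letters
+instead of `ord' and `chr', so that it runs without the library
+functions.
+
+     function next_suffix(s,    letters, p1, p2)
+     {
+         letters = "abcdefghijklmnopqrstuvwxyz"
+         p1 = index(letters, substr(s, 1, 1))
+         p2 = index(letters, substr(s, 2, 1))
+         if (p2 < 26)     # second letter can still advance
+             return substr(s, 1, 1) substr(letters, p2 + 1, 1)
+         if (p1 < 26)     # second letter wraps, first advances
+             return substr(letters, p1 + 1, 1) "a"
+         return ""        # "zz" has no successor
+     }
+     
+     BEGIN {
+         print next_suffix("aa")    # prints ab
+         print next_suffix("az")    # prints ba
+         print next_suffix("zz")    # prints an empty line
+     }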
+
+   The `usage' function simply prints an error message and exits.
+
+     function usage(   e)
+     {
+         e = "usage: split [-num] [file] [outname]"
+         print e > "/dev/stderr"
+         exit 1
+     }
+
+The variable `e' is used so that the function fits nicely on the screen.
+
+   This program is a bit sloppy; it relies on `awk' to close the last
+file for it automatically, instead of doing it in an `END' rule.
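+
+   A tidier version might close the file explicitly.  Here is a minimal
+sketch of such a rule, assuming that the variable `out' still holds the
+name of the current output file, as it does in the rules above.  It is
+an addition for illustration, not part of the original program.
+
+     END    \
+     {
+         close(out)    # close the last output file
+     }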
+
+
+File: gawk.info,  Node: Tee Program,  Next: Uniq Program,  Prev: Split Program,  Up: Clones
+
+Duplicating Output Into Multiple Files
+--------------------------------------
+
+   The `tee' program is known as a "pipe fitting."  `tee' copies its
+standard input to its standard output, and also duplicates it to the
+files named on the command line.  Its usage is:
+
+     tee [-a] file ...
+
+   The `-a' option tells `tee' to append to the named files, instead of
+truncating them and starting over.
+
+   The `BEGIN' rule first makes a copy of all the command line
+arguments, into an array named `copy'.  `ARGV[0]' is not copied, since
+it is not needed.  `tee' cannot use `ARGV' directly, since `awk' will
+attempt to process each file named in `ARGV' as input data.
+
+   If the first argument is `-a', then the flag variable `append' is
+set to true, and both `ARGV[1]' and `copy[1]' are deleted. If `ARGC' is
+less than two, then no file names were supplied, and `tee' prints a
+usage message and exits.  Finally, `awk' is forced to read the standard
+input by setting `ARGV[1]' to `"-"', and `ARGC' to two.
+
+     # tee.awk --- tee in awk
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     # Revised December 1995
+     
+     BEGIN    \
+     {
+         for (i = 1; i < ARGC; i++)
+             copy[i] = ARGV[i]
+     
+         if (ARGV[1] == "-a") {
+             append = 1
+             delete ARGV[1]
+             delete copy[1]
+             ARGC--
+         }
+         if (ARGC < 2) {
+             print "usage: tee [-a] file ..." > "/dev/stderr"
+             exit 1
+         }
+         ARGV[1] = "-"
+         ARGC = 2
+     }
+
+   The single rule does all the work.  Since there is no pattern, it is
+executed for each line of input.  The body of the rule simply prints the
+line into each file on the command line, and then to the standard
+output.
+
+     {
+         # moving the if outside the loop makes it run faster
+         if (append)
+             for (i in copy)
+                 print >> copy[i]
+         else
+             for (i in copy)
+                 print > copy[i]
+         print
+     }
+
+   It would have been possible to code the loop this way:
+
+     for (i in copy)
+         if (append)
+             print >> copy[i]
+         else
+             print > copy[i]
+
+This is more concise, but it is also less efficient.  The `if' is
+tested for each record and for each output file.  By duplicating the
+loop body, the `if' is only tested once for each input record.  If
+there are N input records and M output files, the first method only
+executes N `if' statements, while the second would execute N`*'M `if'
+statements.
+
+   Finally, the `END' rule cleans up, by closing all the output files.
+
+     END    \
+     {
+         for (i in copy)
+             close(copy[i])
+     }
+
+
+File: gawk.info,  Node: Uniq Program,  Next: Wc Program,  Prev: Tee Program,  Up: Clones
+
+Printing Non-duplicated Lines of Text
+-------------------------------------
+
+   The `uniq' utility reads sorted lines of data on its standard input,
+and (by default) removes duplicate lines.  In other words, only unique
+lines are printed, hence the name.  `uniq' has a number of options. The
+usage is:
+
+     uniq [-udc [-N]] [+N] [ INPUT FILE [ OUTPUT FILE ]]
+
+   The option meanings are:
+
+`-d'
+     Only print repeated lines.
+
+`-u'
+     Only print non-repeated lines.
+
+`-c'
+     Count lines. This option overrides `-d' and `-u'.  Both repeated
+     and non-repeated lines are counted.
+
+`-N'
+     Skip N fields before comparing lines.  The definition of fields is
+     similar to `awk''s default: non-whitespace characters separated by
+     runs of spaces and/or tabs.
+
+`+N'
+     Skip N characters before comparing lines.  Any fields specified
+     with `-N' are skipped first.
+
+`INPUT FILE'
+     Data is read from the input file named on the command line,
+     instead of from the standard input.
+
+`OUTPUT FILE'
+     The generated output is sent to the named output file, instead of
+     to the standard output.
+
+   Normally `uniq' behaves as if both the `-d' and `-u' options had
+been provided.
+
+   Here is an `awk' implementation of `uniq'. It uses the `getopt'
+library function (*note Processing Command Line Options: Getopt
+Function.), and the `join' library function (*note Merging an Array
+Into a String: Join Function.).
+
+   The program begins with a `usage' function and then a brief outline
+of the options and their meanings in a comment.
+
+   The `BEGIN' rule deals with the command line arguments and options.
+It uses a trick to get `getopt' to handle options of the form `-25',
+treating such an option as the option letter `2' with an argument of
+`5'. If indeed two or more digits were supplied (`Optarg' looks like a
+number), `Optarg' is concatenated with the option digit, and then the
+result is added to zero to make it into a number.  If there is only one
+digit in the option, then `Optarg' is not needed, and `Optind' must be
+decremented so that `getopt' will process it next time.  This code is
+admittedly a bit tricky.
+
+   If no options were supplied, then the default is taken, to print both
+repeated and non-repeated lines.  The output file, if provided, is
+assigned to `outputfile'.  Earlier, `outputfile' was initialized to the
+standard output, `/dev/stdout'.
+
+     # uniq.awk --- do uniq in awk
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     function usage(    e)
+     {
+         e = "Usage: uniq [-udc [-n]] [+n] [ in [ out ]]"
+         print e > "/dev/stderr"
+         exit 1
+     }
+     
+     # -c    count lines. overrides -d and -u
+     # -d    only repeated lines
+     # -u    only non-repeated lines
+     # -n    skip n fields
+     # +n    skip n characters, skip fields first
+     
+     BEGIN    \
+     {
+         count = 1
+         outputfile = "/dev/stdout"
+         opts = "udc0:1:2:3:4:5:6:7:8:9:"
+         while ((c = getopt(ARGC, ARGV, opts)) != -1) {
+             if (c == "u")
+                 non_repeated_only++
+             else if (c == "d")
+                 repeated_only++
+             else if (c == "c")
+                 do_count++
+             else if (index("0123456789", c) != 0) {
+                 # getopt requires args to options
+                 # this messes us up for things like -5
+                 if (Optarg ~ /^[0-9]+$/)
+                     fcount = (c Optarg) + 0
+                 else {
+                     fcount = c + 0
+                     Optind--
+                 }
+             } else
+                 usage()
+         }
+     
+         if (ARGV[Optind] ~ /^\+[0-9]+$/) {
+             charcount = substr(ARGV[Optind], 2) + 0
+             Optind++
+         }
+     
+         for (i = 1; i < Optind; i++)
+             ARGV[i] = ""
+     
+         if (repeated_only == 0 && non_repeated_only == 0)
+             repeated_only = non_repeated_only = 1
+     
+         if (ARGC - Optind == 2) {
+             outputfile = ARGV[ARGC - 1]
+             ARGV[ARGC - 1] = ""
+         }
+     }
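+
+   The digit handling can be seen in isolation in the following
+sketch, which is separate from `uniq.awk'.  The values assigned to `c'
+and `Optarg' are assumptions about what `getopt' would report for the
+options `-25' and `-5 file'; `getopt' itself is not called here.
+
+     BEGIN {
+         # -25: option letter 2 with argument 5
+         c = "2"; Optarg = "5"
+         if (Optarg ~ /^[0-9]+$/) {
+             fcount = (c Optarg) + 0
+             print fcount            # prints 25
+         }
+     
+         # -5 file: the argument is not a number
+         c = "5"; Optarg = "file"
+         if (Optarg !~ /^[0-9]+$/) {
+             fcount = c + 0
+             print fcount            # prints 5
+         }
+     }
+
+   In the second case `uniq.awk' also decrements `Optind', so that the
+file name is not lost.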
+
+   The following function, `are_equal', compares the current line,
+`$0', to the previous line, `last'.  It handles skipping fields and
+characters.
+
+   If no field count and no character count were specified, `are_equal'
+simply returns one or zero depending upon the result of a simple string
+comparison of `last' and `$0'.  Otherwise, things get more complicated.
+
+   If fields have to be skipped, each line is broken into an array using
+`split' (*note Built-in Functions for String Manipulation: String
+Functions.), and then the desired fields are joined back into a line
+using `join'.  The joined lines are stored in `clast' and `cline'.  If
+no fields are skipped, `clast' and `cline' are set to `last' and `$0'
+respectively.
+
+   Finally, if characters are skipped, `substr' is used to strip off the
+leading `charcount' characters in `clast' and `cline'.  The two strings
+are then compared, and `are_equal' returns the result.
+
+     function are_equal(    n, m, clast, cline, alast, aline)
+     {
+         if (fcount == 0 && charcount == 0)
+             return (last == $0)
+     
+         if (fcount > 0) {
+             n = split(last, alast)
+             m = split($0, aline)
+             clast = join(alast, fcount+1, n)
+             cline = join(aline, fcount+1, m)
+         } else {
+             clast = last
+             cline = $0
+         }
+         if (charcount) {
+             clast = substr(clast, charcount + 1)
+             cline = substr(cline, charcount + 1)
+         }
+     
+         return (clast == cline)
+     }
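+
+   The effect of skipping fields and characters can be checked with a
+short, self-contained sketch.  It does not use the `join' library
+function; the helper `skip_fields' is a hypothetical stand-in that
+rebuilds a line from the fields after the first N, separated by single
+spaces.
+
+     function skip_fields(line, n,    parts, cnt, i, res)
+     {
+         cnt = split(line, parts)
+         res = ""
+         for (i = n + 1; i <= cnt; i++) {
+             if (i > n + 1)
+                 res = res " "
+             res = res parts[i]
+         }
+         return res
+     }
+     
+     BEGIN {
+         # skip one field: only "apple pie" is compared
+         a = skip_fields("1 apple pie", 1)
+         b = skip_fields("2 apple pie", 1)
+         same = (a == b)
+         print same    # prints 1
+     
+         # skip one character: only "pple pie" is compared
+         same = (substr("Apple pie", 2) == substr("apple pie", 2))
+         print same    # prints 1
+     }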
+
+   The following two rules are the body of the program.  The first one
+is executed only for the very first line of data.  It sets `last' equal
+to `$0', so that subsequent lines of text have something to be compared
+to.
+
+   The second rule does the work. The variable `equal' will be one or
+zero depending upon the results of `are_equal''s comparison. If `uniq'
+is counting lines (`-c'), then the `count' variable is incremented if
+the lines are equal. Otherwise the previous line is printed with its
+count, and `count' is reset, since the two lines are not equal.
+
+   If `uniq' is not counting, `count' is incremented if the lines are
+equal. Otherwise, if `uniq' is printing repeated lines, and more than
+one line has been seen, or if `uniq' is printing non-repeated lines,
+and only one line has been seen, then the line is printed, and `count'
+is reset.
+
+   Finally, similar logic is used in the `END' rule to print the final
+line of input data.
+
+     NR == 1 {
+         last = $0
+         next
+     }
+     
+     {
+         equal = are_equal()
+     
+         if (do_count) {    # overrides -d and -u
+             if (equal)
+                 count++
+             else {
+                 printf("%4d %s\n", count, last) > outputfile
+                 last = $0
+                 count = 1    # reset
+             }
+             next
+         }
+     
+         if (equal)
+             count++
+         else {
+             if ((repeated_only && count > 1) ||
+                 (non_repeated_only && count == 1))
+                     print last > outputfile
+             last = $0
+             count = 1
+         }
+     }
+     
+     END {
+         if (do_count)
+             printf("%4d %s\n", count, last) > outputfile
+         else if ((repeated_only && count > 1) ||
+                 (non_repeated_only && count == 1))
+             print last > outputfile
+     }
+
+
+File: gawk.info,  Node: Wc Program,  Prev: Uniq Program,  Up: Clones
+
+Counting Things
+---------------
+
+   The `wc' (word count) utility counts lines, words, and characters in
+one or more input files. Its usage is:
+
+     wc [-lwc] [ FILES ... ]
+
+   If no files are specified on the command line, `wc' reads its
+standard input. If there are multiple files, it will also print total
+counts for all the files.  The options and their meanings are:
+
+`-l'
+     Only count lines.
+
+`-w'
+     Only count words.  A "word" is a contiguous sequence of
+     non-whitespace characters, separated by spaces and/or tabs.
+     Happily, this is the normal way `awk' separates fields in its
+     input data.
+
+`-c'
+     Only count characters.
+
+   Implementing `wc' in `awk' is particularly elegant, since `awk' does
+a lot of the work for us; it splits lines into words (i.e., fields) and
+counts them, it counts lines (i.e., records) for us, and it can easily
+tell us how long a line is.
+
+   This version uses the `getopt' library function (*note Processing
+Command Line Options: Getopt Function.), and the file transition
+functions (*note Noting Data File Boundaries: Filetrans Function.).
+
+   This version has one major difference from traditional versions of
+`wc'.  Our version always prints the counts in the order lines, words,
+and characters.  Traditional versions note the order of the `-l', `-w',
+and `-c' options on the command line, and print the counts in that
+order.
+
+   The `BEGIN' rule does the argument processing.  The variable
+`print_total' will be true if more than one file was named on the
+command line.
+
+     # wc.awk --- count lines, words, characters
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     # Options:
+     #    -l    only count lines
+     #    -w    only count words
+     #    -c    only count characters
+     #
+     # Default is to count lines, words, characters
+     
+     BEGIN {
+         # let getopt print a message about
+         # invalid options. we ignore them
+         while ((c = getopt(ARGC, ARGV, "lwc")) != -1) {
+             if (c == "l")
+                 do_lines = 1
+             else if (c == "w")
+                 do_words = 1
+             else if (c == "c")
+                 do_chars = 1
+         }
+         for (i = 1; i < Optind; i++)
+             ARGV[i] = ""
+     
+         # if no options, do all
+         if (! do_lines && ! do_words && ! do_chars)
+             do_lines = do_words = do_chars = 1
+     
+         print_total = (ARGC - i > 1)
+     }
+
+   The `beginfile' function is simple; it just resets the counts of
+lines, words, and characters to zero, and saves the current file name in
+`fname'.
+
+   The `endfile' function adds the current file's numbers to the running
+totals of lines, words, and characters.  It then prints out those
+numbers for the file that was just read. It relies on `beginfile' to
+reset the numbers for the following data file.
+
+     function beginfile(file)
+     {
+         chars = lines = words = 0
+         fname = FILENAME
+     }
+     
+     function endfile(file)
+     {
+         tchars += chars
+         tlines += lines
+         twords += words
+         if (do_lines)
+             printf "\t%d", lines
+         if (do_words)
+             printf "\t%d", words
+         if (do_chars)
+             printf "\t%d", chars
+         printf "\t%s\n", fname
+     }
+
+   There is one rule that is executed for each line. It adds the length
+of the record to `chars'.  It has to add one, since the newline
+character separating records (the value of `RS') is not part of the
+record itself.  `lines' is incremented for each line read, and `words'
+is incremented by the value of `NF', the number of "words" on this
+line.(1)
+
+   Finally, the `END' rule simply prints the totals for all the files.
+
+     # do per line
+     {
+         chars += length($0) + 1    # get newline
+         lines++
+         words += NF
+     }
+     
+     END {
+         if (print_total) {
+             if (do_lines)
+                 printf "\t%d", tlines
+             if (do_words)
+                 printf "\t%d", twords
+             if (do_chars)
+                 printf "\t%d", tchars
+             print "\ttotal"
+         }
+     }
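+
+   As a quick check of the per-line arithmetic, the following separate
+sketch computes the same quantities for a single sample line held in a
+variable.  The names `line', `w', and `n' are made up for this example,
+and `split' with the default field separator stands in for `NF'.
+
+     BEGIN {
+         line = "the quick brown fox"
+         n = split(line, w)            # 4 words
+         print length(line) + 1, n     # prints 20 4
+     }
+
+   The line is 19 characters long; the extra one accounts for the
+newline that ended the record.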
+
+   ---------- Footnotes ----------
+
+   (1) Examine the code in *Note Noting Data File Boundaries: Filetrans
+Function.  Why must `wc' use a separate `lines' variable, instead of
+using the value of `FNR' in `endfile'?
+
+
+File: gawk.info,  Node: Miscellaneous Programs,  Prev: Clones,  Up: Sample Programs
+
+A Grab Bag of `awk' Programs
+============================
+
+   This section is a large "grab bag" of miscellaneous programs.  We
+hope you find them both interesting and enjoyable.
+
+* Menu:
+
+* Dupword Program::         Finding duplicated words in a document.
+* Alarm Program::           An alarm clock.
+* Translate Program::       A program similar to the `tr' utility.
+* Labels Program::          Printing mailing labels.
+* Word Sorting::            A program to produce a word usage count.
+* History Sorting::         Eliminating duplicate entries from a history
+                            file.
+* Extract Program::         Pulling out programs from Texinfo source
+                            files.
+* Simple Sed::              A Simple Stream Editor.
+* Igawk Program::           A wrapper for `awk' that includes files.
+
+
+File: gawk.info,  Node: Dupword Program,  Next: Alarm Program,  Prev: Miscellaneous Programs,  Up: Miscellaneous Programs
+
+Finding Duplicated Words in a Document
+--------------------------------------
+
+   A common error when writing large amounts of prose is to accidentally
+duplicate words.  Often you will see this in text as something like "the
+the program does the following ...."  When the text is on-line, often
+the duplicated words occur at the end of one line and the beginning of
+another, making them very difficult to spot.
+
+   This program, `dupword.awk', scans through a file one line at a time,
+and looks for adjacent occurrences of the same word.  It also saves the
+last word on a line (in the variable `prev') for comparison with the
+first word on the next line.
+
+   The first statement makes sure that the line is all lower-case,
+so that, for example, "The" and "the" compare equal to each other.  The
+second statement removes all non-alphanumeric and non-whitespace
+characters from the line, so that punctuation does not affect the
+comparison either.  This sometimes leads to reports of duplicated words
+that really are different, but this is unusual.
+
+     # dupword --- find duplicate words in text
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # December 1991
+     
+     {
+         $0 = tolower($0)
+         gsub(/[^A-Za-z0-9 \t]/, "");
+         if ($1 == prev)
+             printf("%s:%d: duplicate %s\n",
+                 FILENAME, FNR, $1)
+         for (i = 2; i <= NF; i++)
+             if ($i == $(i-1))
+                 printf("%s:%d: duplicate %s\n",
+                     FILENAME, FNR, $i)
+         prev = $NF
+     }
+
+
+File: gawk.info,  Node: Alarm Program,  Next: Translate Program,  Prev: Dupword Program,  Up: Miscellaneous Programs
+
+An Alarm Clock Program
+----------------------
+
+   The following program is a simple "alarm clock" program.  You give
+it a time of day, and an optional message.  At the given time, it
+prints the message on the standard output. In addition, you can give it
+the number of times to repeat the message, and also a delay between
+repetitions.
+
+   This program uses the `gettimeofday' function from *Note Managing
+the Time of Day: Gettimeofday Function.
+
+   All the work is done in the `BEGIN' rule.  The first part is argument
+checking and setting of defaults: the delay, the count, and the message
+to print.  If the user supplied a message, but it does not contain the
+ASCII BEL character (known as the "alert" character, `\a'), then it is
+added to the message.  (On many systems, printing the ASCII BEL
+generates some sort of audible alert. Thus, when the alarm goes off,
+the system calls attention to itself, in case the user is not looking
+at their computer or terminal.)
+
+     # alarm --- set an alarm
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     # usage: alarm time [ "message" [ count [ delay ] ] ]
+     
+     BEGIN    \
+     {
+         # Initial argument sanity checking
+         usage1 = "usage: alarm time ['message' [count [delay]]]"
+         usage2 = sprintf("\t(%s) time ::= hh:mm", ARGV[1])
+     
+         if (ARGC < 2) {
+             print usage1 > "/dev/stderr"
+             exit 1
+         } else if (ARGC == 5) {
+             delay = ARGV[4] + 0
+             count = ARGV[3] + 0
+             message = ARGV[2]
+         } else if (ARGC == 4) {
+             count = ARGV[3] + 0
+             message = ARGV[2]
+         } else if (ARGC == 3) {
+             message = ARGV[2]
+         } else if (ARGV[1] !~ /[0-9]?[0-9]:[0-9][0-9]/) {
+             print usage1 > "/dev/stderr"
+             print usage2 > "/dev/stderr"
+             exit 1
+         }
+     
+         # set defaults for once we reach the desired time
+         if (delay == 0)
+             delay = 180    # 3 minutes
+         if (count == 0)
+             count = 5
+         if (message == "")
+             message = sprintf("\aIt is now %s!\a", ARGV[1])
+         else if (index(message, "\a") == 0)
+             message = "\a" message "\a"
+
+   The next section of code turns the alarm time into hours and minutes,
+and converts it if necessary to a 24-hour clock.  Then it turns that
+time into a count of the seconds since midnight.  Next it turns the
+current time into a count of seconds since midnight.  The difference
+between the two is how long to wait before setting off the alarm.
+
+         # split up dest time
+         split(ARGV[1], atime, ":")
+         hour = atime[1] + 0    # force numeric
+         minute = atime[2] + 0  # force numeric
+     
+         # get current broken down time
+         gettimeofday(now)
+     
+         # if time given is 12-hour hours and it's after that
+         # hour, e.g., `alarm 5:30' at 9 a.m. means 5:30 p.m.,
+         # then add 12 to real hour
+         if (hour < 12 && now["hour"] > hour)
+             hour += 12
+     
+         # set target time in seconds since midnight
+         target = (hour * 60 * 60) + (minute * 60)
+     
+         # get current time in seconds since midnight
+         current = (now["hour"] * 60 * 60) + \
+                    (now["minute"] * 60) + now["second"]
+     
+         # how long to sleep for
+         naptime = target - current
+         if (naptime <= 0) {
+             print "time is in the past!" > "/dev/stderr"
+             exit 1
+         }
+
+   Finally, the program uses the `system' function (*note Built-in
+Functions for Input/Output: I/O Functions.)  to call the `sleep'
+utility.  The `sleep' utility simply pauses for the given number of
+seconds.  If the exit status is not zero, the program assumes that
+`sleep' was interrupted, and exits. If `sleep' exited with an OK status
+(zero), then the program prints the message in a loop, again using
+`sleep' to delay for however many seconds are necessary.
+
+         # zzzzzz..... go away if interrupted
+         if (system(sprintf("sleep %d", naptime)) != 0)
+             exit 1
+     
+         # time to notify!
+         command = sprintf("sleep %d", delay)
+         for (i = 1; i <= count; i++) {
+             print message
+             # if sleep command interrupted, go away
+             if (system(command) != 0)
+                 break
+         }
+     
+         exit 0
+     }
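+
+   The time arithmetic in the middle of the program can be worked
+through by hand.  The following sketch is separate from `alarm.awk' and
+does not call `gettimeofday'; instead it assumes a current time of
+9:15:00 a.m., held in the made-up variables `now_h', `now_m', and
+`now_s', and an alarm time of `5:30'.
+
+     BEGIN {
+         hour = 5; minute = 30                # alarm 5:30
+         now_h = 9; now_m = 15; now_s = 0     # assumed current time
+     
+         # 5 is before 12 and already past, so it means 5:30 p.m.
+         if (hour < 12 && now_h > hour)
+             hour += 12                       # hour is now 17
+     
+         target = (hour * 60 * 60) + (minute * 60)           # 63000
+         current = (now_h * 60 * 60) + (now_m * 60) + now_s  # 33300
+         print target - current               # prints 29700
+     }
+
+   That is, the program would sleep for 29700 seconds (8 hours and 15
+minutes) before printing the message.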
+
+
+File: gawk.info,  Node: Translate Program,  Next: Labels Program,  Prev: Alarm Program,  Up: Miscellaneous Programs
+
+Transliterating Characters
+--------------------------
+
+   The system `tr' utility transliterates characters.  For example, it
+is often used to map upper-case letters into lower-case, for further
+processing.
+
+     GENERATE DATA | tr '[A-Z]' '[a-z]' | PROCESS DATA ...
+
+   You give `tr' two lists of characters enclosed in square brackets.
+Usually, the lists are quoted to keep the shell from attempting to do a
+filename expansion.(1)  When processing the input, the first character
+in the first list is replaced with the first character in the second
+list, the second character in the first list is replaced with the
+second character in the second list, and so on.  If there are more
+characters in the "from" list than in the "to" list, the last character
+of the "to" list is used for the remaining characters in the "from"
+list.
+
+   Some time ago, a user proposed to us that we add a transliteration
+function to `gawk'.  Being opposed to "creeping featurism," I wrote the
+following program to prove that character transliteration could be done
+with a user-level function.  This program is not as complete as the
+system `tr' utility, but it will do most of the job.
+
+   The `translate' program demonstrates one of the few weaknesses of
+standard `awk': dealing with individual characters is very painful,
+requiring repeated use of the `substr', `index', and `gsub' built-in
+functions (*note Built-in Functions for String Manipulation: String
+Functions.).(2)
+
+   There are two functions.  The first, `stranslate', takes three
+arguments.
+
+`from'
+     A list of characters to translate from.
+
+`to'
+     A list of characters to translate to.
+
+`target'
+     The string to do the translation on.
+
+   Associative arrays make the translation part fairly easy. `t_ar'
+holds the "to" characters, indexed by the "from" characters.  Then a
+simple loop goes through `from', one character at a time.  For each
+character in `from', if the character appears in `target', `gsub' is
+used to change it to the corresponding `to' character.
+
+   The `translate' function simply calls `stranslate' using `$0' as the
+target.  The main program sets two global variables, `FROM' and `TO',
+from the command line, and then changes `ARGV' so that `awk' will read
+from the standard input.
+
+   Finally, the processing rule simply calls `translate' for each
+record.
+
+     # translate --- do tr like stuff
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # August 1989
+     
+     # bugs: does not handle things like: tr A-Z a-z, it has
+     # to be spelled out. However, if `to' is shorter than `from',
+     # the last character in `to' is used for the rest of `from'.
+     
+     function stranslate(from, to, target,     lf, lt, t_ar, i, c)
+     {
+         lf = length(from)
+         lt = length(to)
+         for (i = 1; i <= lt; i++)
+             t_ar[substr(from, i, 1)] = substr(to, i, 1)
+         if (lt < lf)
+             for (; i <= lf; i++)
+                 t_ar[substr(from, i, 1)] = substr(to, lt, 1)
+         for (i = 1; i <= lf; i++) {
+             c = substr(from, i, 1)
+             if (index(target, c) > 0)
+                 gsub(c, t_ar[c], target)
+         }
+         return target
+     }
+     
+     function translate(from, to)
+     {
+         return $0 = stranslate(from, to, $0)
+     }
+     
+     # main program
+     BEGIN {
+         if (ARGC < 3) {
+             print "usage: translate from to" > "/dev/stderr"
+             exit
+         }
+         FROM = ARGV[1]
+         TO = ARGV[2]
+         ARGC = 2
+         ARGV[1] = "-"
+     }
+     
+     {
+         translate(FROM, TO)
+         print
+     }
+
+   While it is possible to do character transliteration in a user-level
+function, it is not necessarily efficient, and we started to consider
+adding a built-in function.  However, shortly after writing this
+program, we learned that the System V Release 4 `awk' had added the
+`toupper' and `tolower' functions.  These functions handle the vast
+majority of the cases where character transliteration is necessary, and
+so we chose to simply add those functions to `gawk' as well, and then
+leave well enough alone.
+
+   An obvious improvement to this program would be to set up the `t_ar'
+array only once, in a `BEGIN' rule. However, this assumes that the
+"from" and "to" lists will never change throughout the lifetime of the
+program.
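+
+   As one possible answer to the question posed in the second footnote,
+here is a minimal sketch that translates one character at a time.  It
+assumes an `awk' (such as `gawk') in which `split' with a null
+separator puts each character into its own array element.  The shell
+wrapper `tr_chars' and the array name `map' are invented for this
+illustration; they are not part of the original program.
+
```shell
# Sketch only: tr_chars is a hypothetical wrapper, not part of gawk.
# Usage: tr_chars FROM TO < input
# Caveat: awk processes backslash escapes in -v assignments.
tr_chars() {
    awk -v from="$1" -v to="$2" '
    BEGIN {
        lf = length(from)
        lt = length(to)
        # map[c] is the replacement for c; a short "to" list is
        # padded with its last character, as in the original program
        for (i = 1; i <= lf; i++)
            map[substr(from, i, 1)] = substr(to, (i <= lt ? i : lt), 1)
    }
    {
        n = split($0, ch, "")    # one character per element
        out = ""
        for (i = 1; i <= n; i++)
            out = out ((ch[i] in map) ? map[ch[i]] : ch[i])
        print out
    }'
}
```
+
+   Because each character is looked up in `map' instead of being handed
+to `gsub', characters such as `.' are not interpreted as regexps.  Note
+also that this sketch maps all characters in a single pass, whereas
+`stranslate' applies its substitutions one after another, so the two
+can give different results when a character appears in both lists.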
+
+   ---------- Footnotes ----------
+
+   (1) On older, non-POSIX systems, `tr' often does not require that
+the lists be enclosed in square brackets and quoted.  This is a feature.
+
+   (2) This program was written before `gawk' acquired the ability to
+split each character in a string into separate array elements.  How
+might this ability simplify the program?
+
+
+File: gawk.info,  Node: Labels Program,  Next: Word Sorting,  Prev: Translate Program,  Up: Miscellaneous Programs
+
+Printing Mailing Labels
+-----------------------
+
+   Here is a "real world"(1) program.  This script reads lists of names
+and addresses, and generates mailing labels.  Each page of labels has
+20 labels on it, two across and ten down.  The addresses are guaranteed
+to be no more than five lines of data.  Each address is separated from
+the next by a blank line.
+
+   The basic idea is to read 20 labels worth of data.  Each line of
+each label is stored in the `line' array.  The single rule takes care
+of filling the `line' array and printing the page when 20 labels have
+been read.
+
+   The `BEGIN' rule simply sets `RS' to the empty string, so that `awk'
+will split records at blank lines (*note How Input is Split into
+Records: Records.).  It sets `MAXLINES' to 100, since that is the
+maximum number of lines on the page (20 * 5 = 100).
+
+   Most of the work is done in the `printpage' function.  The label
+lines are stored sequentially in the `line' array.  But they have to be
+printed horizontally; `line[1]' next to `line[6]', `line[2]' next to
+`line[7]', and so on.  Two loops are used to accomplish this.  The
+outer loop, controlled by `i', steps through every 10 lines of data;
+this is each row of labels.  The inner loop, controlled by `j', goes
+through the lines within the row.  As `j' goes from zero to four, `i+j'
+is the `j''th line in the row, and `i+j+5' is the entry next to it.
+The output ends up looking something like this:
+
+     line 1          line 6
+     line 2          line 7
+     line 3          line 8
+     line 4          line 9
+     line 5          line 10
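+
+   The index arithmetic can be checked in isolation.  The following
+sketch prints only the subscripts that the two loops would pair up; the
+wrapper name `label_pairs' and its argument (the number of stored
+lines) are assumptions made for this illustration.
+
```shell
# Sketch only: label_pairs is a hypothetical helper.
# Usage: label_pairs NLINES
label_pairs() {
    awk -v nlines="$1" '
    BEGIN {
        # outer loop: one row of labels (10 stored lines) per pass
        for (i = 1; i <= nlines; i += 10)
            # inner loop: the five printed lines within the row
            for (j = 0; j < 5; j++)
                printf "line[%d] line[%d]\n", i + j, i + j + 5
    }'
}
```
+
+   Running `label_pairs 20' lists the pairs for two rows of labels: the
+first row pairs `line[1]' through `line[5]' with `line[6]' through
+`line[10]', and the second row starts at `line[11]'.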
+
+   As a final note, at lines 21 and 61, an extra blank line is printed,
+to keep the output lined up on the labels.  This is dependent on the
+particular brand of labels in use when the program was written.  You
+will also note that there are two blank lines at the top and two blank
+lines at the bottom.
+
+   The `END' rule arranges to flush the final page of labels; there may
+not have been an even multiple of 20 labels in the data.
+
+     # labels.awk
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # June 1992
+     
+     # Program to print labels.  Each label is 5 lines of data
+     # that may have blank lines.  The label sheets have 2
+     # blank lines at the top and 2 at the bottom.
+     
+     BEGIN    { RS = "" ; MAXLINES = 100 }
+     
+     function printpage(    i, j)
+     {
+         if (Nlines <= 0)
+             return
+     
+         printf "\n\n"        # header
+     
+         for (i = 1; i <= Nlines; i += 10) {
+             if (i == 21 || i == 61)
+                 print ""
+             for (j = 0; j < 5; j++) {
+                 if (i + j > MAXLINES)
+                     break
+                 printf "   %-41s %s\n", line[i+j], line[i+j+5]
+             }
+             print ""
+         }
+     
+         printf "\n\n"        # footer
+     
+         for (i in line)
+             line[i] = ""
+     }
+     
+     # main rule
+     {
+         if (Count >= 20) {
+             printpage()
+             Count = 0
+             Nlines = 0
+         }
+         n = split($0, a, "\n")
+         for (i = 1; i <= n; i++)
+             line[++Nlines] = a[i]
+         for (; i <= 5; i++)
+             line[++Nlines] = ""
+         Count++
+     }
+     
+     END    \
+     {
+         printpage()
+     }
+
+   ---------- Footnotes ----------
+
+   (1) "Real world" is defined as "a program actually used to get
+something done."
+
+
+File: gawk.info,  Node: Word Sorting,  Next: History Sorting,  Prev: Labels Program,  Up: Miscellaneous Programs
+
+Generating Word Usage Counts
+----------------------------
+
+   The following `awk' program prints the number of occurrences of each
+word in its input.  It illustrates the associative nature of `awk'
+arrays by using strings as subscripts.  It also demonstrates the `for X
+in ARRAY' construction.  Finally, it shows how `awk' can be used in
+conjunction with other utility programs to do a useful task of some
+complexity with a minimum of effort.  Some explanations follow the
+program listing.
+
+     awk '
+     # Print list of word frequencies
+     {
+         for (i = 1; i <= NF; i++)
+             freq[$i]++
+     }
+     
+     END {
+         for (word in freq)
+             printf "%s\t%d\n", word, freq[word]
+     }'
+
+   The first thing to notice about this program is that it has two
+rules.  The first rule, because it has an empty pattern, is executed on
+every line of the input.  It uses `awk''s field-accessing mechanism
+(*note Examining Fields: Fields.) to pick out the individual words from
+the line, and the built-in variable `NF' (*note Built-in Variables::)
+to know how many fields are available.
+
+   For each input word, an element of the array `freq' is incremented to
+reflect that the word has been seen an additional time.
+
+   The second rule, because it has the pattern `END', is not executed
+until the input has been exhausted.  It prints out the contents of the
+`freq' table that has been built up inside the first action.
+
+   This program has several problems that would prevent it from being
+useful by itself on real text files:
+
+   * Words are detected using the `awk' convention that fields are
+     separated by whitespace and that other characters in the input
+     (except newlines) don't have any special meaning to `awk'.  This
+     means that punctuation characters count as part of words.
+
+   * The `awk' language considers upper- and lower-case characters to be
+     distinct.  Therefore, `bartender' and `Bartender' are not treated
+     as the same word.  This is undesirable since, in normal text, words
+     are capitalized if they begin sentences, and a frequency analyzer
+     should not be sensitive to capitalization.
+
+   * The output does not come out in any useful order.  You're more
+     likely to be interested in which words occur most frequently, or
+     having an alphabetized table of how frequently each word occurs.
+
+   The way to solve these problems is to use some of the more advanced
+features of the `awk' language.  First, we use `tolower' to remove case
+distinctions.  Next, we use `gsub' to remove punctuation characters.
+Finally, we use the system `sort' utility to process the output of the
+`awk' script.  Here is the new version of the program:
+
+     # Print list of word frequencies
+     {
+         $0 = tolower($0)    # remove case distinctions
+         gsub(/[^a-z0-9_ \t]/, "", $0)  # remove punctuation
+         for (i = 1; i <= NF; i++)
+             freq[$i]++
+     }
+     
+     END {
+         for (word in freq)
+             printf "%s\t%d\n", word, freq[word]
+     }
+
+   Assuming we have saved this program in a file named `wordfreq.awk',
+and that the data is in `file1', the following pipeline
+
+     awk -f wordfreq.awk file1 | sort +1 -nr
+
+produces a table of the words appearing in `file1' in order of
+decreasing frequency.
+
+   The `awk' program suitably massages the data and produces a word
+frequency table, which is not ordered.
+
+   The `awk' script's output is then sorted by the `sort' utility and
+printed on the terminal.  The options given to `sort' in this example
+specify to sort using the second field of each input line (skipping one
+field), that the sort keys should be treated as numeric quantities
+(otherwise `15' would come before `5'), and that the sorting should be
+done in descending (reverse) order.
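+
+   Many current `sort' implementations no longer accept the `+1' form
+of key specification.  The following sketch shows the same pipeline
+using the POSIX `-k' option instead.  The wrapper name `wordfreq' is
+invented here, and the second key (`-k 1,1') is an addition that merely
+makes the order of equally frequent words predictable.
+
```shell
# Sketch only: wordfreq is a hypothetical wrapper around the
# program shown above.  Usage: wordfreq [FILE...]
wordfreq() {
    awk '
    {
        $0 = tolower($0)                # remove case distinctions
        gsub(/[^a-z0-9_ \t]/, "", $0)   # remove punctuation
        for (i = 1; i <= NF; i++)
            freq[$i]++
    }
    END {
        for (word in freq)
            printf "%s\t%d\n", word, freq[word]
    }' "$@" | sort -k 2nr -k 1,1    # count descending, then word
}
```
+
+   Here `-k 2nr' plays the role of `+1 -nr': it selects the second
+field as the key, compares it numerically, and reverses the order.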
+
+   We could have even done the `sort' from within the program, by
+changing the `END' action to:
+
+     END {
+         sort = "sort +1 -nr"
+         for (word in freq)
+             printf "%s\t%d\n", word, freq[word] | sort
+         close(sort)
+     }
+
+   You would have to use this way of sorting on systems that do not
+have true pipes.
+
+   See the general operating system documentation for more information
+on how to use the `sort' program.
+
+
+File: gawk.info,  Node: History Sorting,  Next: Extract Program,  Prev: Word Sorting,  Up: Miscellaneous Programs
+
+Removing Duplicates from Unsorted Text
+--------------------------------------
+
+   The `uniq' program (*note Printing Non-duplicated Lines of Text:
+Uniq Program.) removes duplicate lines from _sorted_ data.
+
+   Suppose, however, that you need to remove duplicate lines from a
+data file, but wish to preserve the order the lines are in.  A good
+example of this might be a shell history file.  The history file keeps
+a copy of all the commands you have entered, and it is not unusual to
+repeat a command several times in a row.  Occasionally you might wish
+to compact the history by removing duplicate entries.  Yet it is
+desirable to maintain the order of the original commands.
+
+   This simple program does the job.  It uses two arrays.  The `data'
+array is indexed by the text of each line.  For each line, `data[$0]'
+is incremented.
+
+   If a particular line has not been seen before, then `data[$0]' will
+be zero.  In that case, the text of the line is stored in
+`lines[count]'.  Each element of `lines' is a unique command, and the
+indices of `lines' indicate the order in which those lines were
+encountered.  The `END' rule simply prints out the lines, in order.
+
+     # histsort.awk --- compact a shell history file
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     # Thanks to Byron Rakitzis for the general idea
+     {
+         if (data[$0]++ == 0)
+             lines[++count] = $0
+     }
+     
+     END {
+         for (i = 1; i <= count; i++)
+             print lines[i]
+     }
+
+   This program also provides a foundation for generating other useful
+information.  For example, using the following `print' statement in the
+`END' rule would indicate how often a particular command was used.
+
+     print data[lines[i]], lines[i]
+
+   This works because `data[$0]' was incremented each time a line was
+seen.
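+
+   Put together, the counting variant might look like the following
+sketch.  The shell wrapper `histcount' is a name invented for this
+illustration; the `awk' text is the program above with the alternative
+`print' statement substituted into the `END' rule.
+
```shell
# Sketch only: histcount is a hypothetical wrapper.
# Usage: histcount [FILE...]
histcount() {
    awk '
    {
        # remember each distinct line once, in order of appearance
        if (data[$0]++ == 0)
            lines[++count] = $0
    }
    END {
        # print the number of occurrences before each line
        for (i = 1; i <= count; i++)
            print data[lines[i]], lines[i]
    }' "$@"
}
```
+
+   Given the four input lines `ls', `cd /tmp', `ls', and `ls', this
+prints `3 ls' followed by `1 cd /tmp'.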
+
+
+File: gawk.info,  Node: Extract Program,  Next: Simple Sed,  Prev: History Sorting,  Up: Miscellaneous Programs
+
+Extracting Programs from Texinfo Source Files
+---------------------------------------------
+
+   The nodes *Note A Library of `awk' Functions: Library Functions, and
+*Note Practical `awk' Programs: Sample Programs, are the top level
+nodes for a large number of `awk' programs.  If you wish to experiment
+with these programs, it is tedious to have to type them in by hand.
+Here we present a program that can extract parts of a Texinfo input
+file into separate files.
+
+   This Info file is written in Texinfo, the GNU project's document
+formatting language.  A single Texinfo source file can be used to
+produce both printed and on-line documentation.  The Texinfo language
+is described fully, starting with *Note Introduction: (texi)Top.
+
+   For our purposes, it is enough to know three things about Texinfo
+input files.
+
+   * The "at" symbol, `@', is special in Texinfo, much like `\' in C or
+     `awk'.  Literal `@' symbols are represented in Texinfo source
+     files as `@@'.
+
+   * Comments start with either `@c' or `@comment'.  The file
+     extraction program will work by using special comments that start
+     at the beginning of a line.
+
+   * Example text that should not be split across a page boundary is
+     bracketed between lines containing `@group' and `@end group'
+     commands.
+
+   The following program, `extract.awk', reads through a Texinfo source
+file, and does two things, based on the special comments.  Upon seeing
+`@c system ...', it runs a command, by extracting the command text from
+the control line and passing it on to the `system' function (*note
+Built-in Functions for Input/Output: I/O Functions.).  Upon seeing `@c
+file FILENAME', each subsequent line is sent to the file FILENAME,
+until `@c endfile' is encountered.  The rules in `extract.awk' will
+match either `@c' or `@comment' by letting the `omment' part be
+optional.  Lines containing `@group' and `@end group' are simply
+removed.  `extract.awk' uses the `join' library function (*note Merging
+an Array Into a String: Join Function.).
+
+   The example programs in the on-line Texinfo source for `The GNU Awk
+User's Guide' (`gawk.texi') have all been bracketed inside `file', and
+`endfile' lines.  The `gawk' distribution uses a copy of `extract.awk'
+to extract the sample programs and install many of them in a standard
+directory, where `gawk' can find them.
+
+   `extract.awk' begins by setting `IGNORECASE' to one, so that mixed
+upper-case and lower-case letters in the directives won't matter.
+
+   The first rule handles calling `system', checking that a command was
+given (`NF' is at least three), and also checking that the command
+exited with a zero exit status, signifying OK.
+
+     # extract.awk --- extract files and run programs
+     #                 from texinfo files
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # May 1993
+     
+     BEGIN    { IGNORECASE = 1 }
+     
+     /^@c(omment)?[ \t]+system/    \
+     {
+         if (NF < 3) {
+             e = (FILENAME ":" FNR)
+             e = (e  ": badly formed `system' line")
+             print e > "/dev/stderr"
+             next
+         }
+         $1 = ""
+         $2 = ""
+         stat = system($0)
+         if (stat != 0) {
+             e = (FILENAME ":" FNR)
+             e = (e ": warning: system returned " stat)
+             print e > "/dev/stderr"
+         }
+     }
+
+The variable `e' is used so that the rule fits nicely on the screen.
+
+   The second rule handles moving data into files.  It verifies that a
+file name was given in the directive.  If the file named is not the
+current file, then the current file is closed.  This means that an `@c
+endfile' was not given for that file.  (We should probably print a
+diagnostic in this case, although at the moment we do not.)
+
+   The `for' loop does the work.  It reads lines using `getline' (*note
+Explicit Input with `getline': Getline.).  For an unexpected end of
+file, it calls the `unexpected_eof' function.  If the line is an
+"endfile" line, then it breaks out of the loop.  If the line is an
+`@group' or `@end group' line, then it ignores it, and goes on to the
+next line.
+
+   Most of the work is in the following few lines.  If the line has no
+`@' symbols, it can be printed directly.  Otherwise, each leading `@'
+must be stripped off.
+
+   To remove the `@' symbols, the line is split into separate elements
+of the array `a', using the `split' function (*note Built-in Functions
+for String Manipulation: String Functions.).  Each element of `a' that
+is empty indicates two successive `@' symbols in the original line.
+For each two empty elements (`@@' in the original file), we have to add
+back in a single `@' symbol.
+
+   When the processing of the array is finished, `join' is called with
+the value of `SUBSEP', to rejoin the pieces back into a single line.
+That line is then printed to the output file.
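+
+   The `@' handling just described can be tried in isolation.  The
+following sketch applies the same `split' loop to each input line.  It
+does not use the `join' library function; a plain concatenation loop
+stands in for it.  The names `unat' and `unat_line' are invented for
+this illustration.
+
```shell
# Sketch only: unat and unat_line are hypothetical names.
# Usage: unat < texinfo-lines
unat() {
    awk '
    function unat_line(line,    a, n, i, out)
    {
        n = split(line, a, "@")
        # a[1] == "" means a leading @; do not add one back in
        for (i = 2; i <= n; i++) {
            if (a[i] == "") {       # was an @@
                a[i] = "@"
                if (a[i+1] == "")
                    i++
            }
        }
        # stand-in for join(a, 1, n, SUBSEP): no separator
        out = ""
        for (i = 1; i <= n; i++)
            out = out a[i]
        return out
    }
    { print unat_line($0) }'
}
```
+
+   For example, the input line `user@@host' comes out as `user@host',
+while a line without any `@' symbols passes through unchanged.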
+
+     /^@c(omment)?[ \t]+file/    \
+     {
+         if (NF != 3) {
+             e = (FILENAME ":" FNR ": badly formed `file' line")
+             print e > "/dev/stderr"
+             next
+         }
+         if ($3 != curfile) {
+             if (curfile != "")
+                 close(curfile)
+             curfile = $3
+         }
+     
+         for (;;) {
+             if ((getline line) <= 0)
+                 unexpected_eof()
+             if (line ~ /^@c(omment)?[ \t]+endfile/)
+                 break
+             else if (line ~ /^@(end[ \t]+)?group/)
+                 continue
+             if (index(line, "@") == 0) {
+                 print line > curfile
+                 continue
+             }
+             n = split(line, a, "@")
+             # if a[1] == "", means leading @,
+             # don't add one back in.
+             for (i = 2; i <= n; i++) {
+                 if (a[i] == "") { # was an @@
+                     a[i] = "@"
+                     if (a[i+1] == "")
+                         i++
+                 }
+             }
+             print join(a, 1, n, SUBSEP) > curfile
+         }
+     }
+
+   An important thing to note is the use of the `>' redirection.
+Output done with `>' only opens the file once; it stays open and
+subsequent output is appended to the file (*note Redirecting Output of
+`print' and `printf': Redirection.).  This allows us to easily mix
+program text and explanatory prose for the same sample source file (as
+has been done here!) without any hassle.  The file is only closed when
+a new data file name is encountered, or at the end of the input file.
+
+   Finally, the function `unexpected_eof' prints an appropriate error
+message and then exits.
+
+   The `END' rule handles the final cleanup, closing the open file.
+
+     function unexpected_eof()
+     {
+         printf("%s:%d: unexpected EOF or error\n", \
+             FILENAME, FNR) > "/dev/stderr"
+         exit 1
+     }
+     
+     END {
+         if (curfile)
+             close(curfile)
+     }
+
+
+File: gawk.info,  Node: Simple Sed,  Next: Igawk Program,  Prev: Extract Program,  Up: Miscellaneous Programs
+
+A Simple Stream Editor
+----------------------
+
+   The `sed' utility is a "stream editor," a program that reads a
+stream of data, makes changes to it, and passes the modified data on.
+It is often used to make global changes to a large file, or to a stream
+of data generated by a pipeline of commands.
+
+   While `sed' is a complicated program in its own right, its most
+common use is to perform global substitutions in the middle of a
+pipeline:
+
+     command1 < orig.data | sed 's/old/new/g' | command2 > result
+
+   Here, the `s/old/new/g' tells `sed' to look for the regexp `old' on
+each input line, and replace it with the text `new', globally (i.e. all
+the occurrences on a line).  This is similar to `awk''s `gsub' function
+(*note Built-in Functions for String Manipulation: String Functions.).
+
+   The following program, `awksed.awk', accepts at least two command
+line arguments; the pattern to look for and the text to replace it
+with. Any additional arguments are treated as data file names to
+process. If none are provided, the standard input is used.
+
+     # awksed.awk --- do s/foo/bar/g using just print
+     #    Thanks to Michael Brennan for the idea
+     
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # August 1995
+     
+     function usage()
+     {
+         print "usage: awksed pat repl [files...]" > "/dev/stderr"
+         exit 1
+     }
+     
+     BEGIN {
+         # validate arguments
+         if (ARGC < 3)
+             usage()
+     
+         RS = ARGV[1]
+         ORS = ARGV[2]
+     
+         # don't use arguments as files
+         ARGV[1] = ARGV[2] = ""
+     }
+     
+     # look ma, no hands!
+     {
+         if (RT == "")
+             printf "%s", $0
+         else
+             print
+     }
+
+   The program relies on `gawk''s ability to have `RS' be a regexp and
+on the setting of `RT' to the actual text that terminated the record
+(*note How Input is Split into Records: Records.).
+
+   The idea is to have `RS' be the pattern to look for. `gawk' will
+automatically set `$0' to the text between matches of the pattern.
+This is text that we wish to keep, unmodified.  Then, by setting `ORS'
+to the replacement text, a simple `print' statement will output the
+text we wish to keep, followed by the replacement text.
+
+   There is one wrinkle to this scheme: what to do if the last record
+doesn't end with text that matches `RS'.  Using a `print' statement
+unconditionally prints the replacement text, which is not correct.
+
+   However, if the file did not end in text that matches `RS', `RT'
+will be set to the null string.  In this case, we can print `$0' using
+`printf' (*note Using `printf' Statements for Fancier Printing:
+Printf.).
+
+   The `BEGIN' rule handles the setup, checking for the right number of
+arguments, and calling `usage' if there is a problem. Then it sets `RS'
+and `ORS' from the command line arguments, and sets `ARGV[1]' and
+`ARGV[2]' to the null string, so that they will not be treated as file
+names (*note Using `ARGC' and `ARGV': ARGC and ARGV.).
+
+   The `usage' function prints an error message and exits.
+
+   Finally, the single rule handles the printing scheme outlined above,
+using `print' or `printf' as appropriate, depending upon the value of
+`RT'.
+
+
+File: gawk.info,  Node: Igawk Program,  Prev: Simple Sed,  Up: Miscellaneous Programs
+
+An Easy Way to Use Library Functions
+------------------------------------
+
+   Using library functions in `awk' can be very beneficial. It
+encourages code re-use and the writing of general functions. Programs
+are smaller, and therefore clearer.  However, using library functions
+is only easy when writing `awk' programs; it is painful when running
+them, requiring multiple `-f' options.  If `gawk' is unavailable, then
+so too is the `AWKPATH' environment variable and the ability to put
+`awk' functions into a library directory (*note Command Line Options:
+Options.).
+
+   It would be nice to be able to write programs like so:
+
+     # library functions
+     @include getopt.awk
+     @include join.awk
+     ...
+     
+     # main program
+     BEGIN {
+         while ((c = getopt(ARGC, ARGV, "a:b:cde")) != -1)
+             ...
+         ...
+     }
+
+   The following program, `igawk.sh', provides this service.  It
+simulates `gawk''s searching of the `AWKPATH' variable, and also allows
+"nested" includes; i.e. a file that has been included with `@include'
+can contain further `@include' statements.  `igawk' will make an effort
+to only include files once, so that nested includes don't accidentally
+include a library function twice.
+
+   `igawk' should behave externally just like `gawk'.  This means it
+should accept all of `gawk''s command line arguments, including the
+ability to have multiple source files specified via `-f', and the
+ability to mix command line and library source files.
+
+   The program is written using the POSIX Shell (`sh') command language.
+The way the program works is as follows:
+
+  1. Loop through the arguments, saving anything that doesn't represent
+     `awk' source code for later, when the expanded program is run.
+
+  2. For any arguments that do represent `awk' text, put the arguments
+     into a temporary file that will be expanded.  There are two cases.
+
+       a. Literal text, provided with `--source' or `--source='.  This
+          text is just echoed directly.  The `echo' program will
+          automatically supply a trailing newline.
+
+       b. File names provided with `-f'.  We use a neat trick, and echo
+          `@include FILENAME' into the temporary file.  Since the file
+          inclusion program will work the way `gawk' does, this will
+          get the text of the file included into the program at the
+          correct point.
+
+  3. Run an `awk' program (naturally) over the temporary file to expand
+     `@include' statements.  The expanded program is placed in a second
+     temporary file.
+
+  4. Run the expanded program with `gawk' and any other original
+     command line arguments that the user supplied (such as the data
+     file names).
+
+   The initial part of the program turns on shell tracing if the first
+argument was `debug'.  Otherwise, a shell `trap' statement arranges to
+clean up any temporary files on program exit or upon an interrupt.
+
+   The next part loops through all the command line arguments.  There
+are several cases of interest.
+
+`--'
+     This ends the arguments to `igawk'.  Anything else should be
+     passed on to the user's `awk' program without being evaluated.
+
+`-W'
+     This indicates that the next option is specific to `gawk'.  To make
+     argument processing easier, the `-W' is prepended to the
+     remaining arguments and the loop continues.  (This is an `sh'
+     programming trick.  Don't worry about it if you are not familiar
+     with `sh'.)
+
+`-v'
+`-F'
+     These are saved and passed on to `gawk'.
+
+`-f'
+`--file'
+`--file='
+`-Wfile='
+     The file name is saved to the temporary file `/tmp/ig.s.$$' with an
+     `@include' statement.  The `sed' utility is used to remove the
+     leading option part of the argument (e.g., `--file=').
+
+`--source'
+`--source='
+`-Wsource='
+     The source text is echoed into `/tmp/ig.s.$$'.
+
+`--version'
+`-Wversion'
+     `igawk' prints its version number, and runs `gawk --version' to
+     get the `gawk' version information, and then exits.
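+
+   The `-W' case in the list above can be sketched on its own.  The
+function name `glue_w' is invented for this illustration; it only shows
+what `set -- -W"$@"' does to the argument list and is not part of
+`igawk'.
+
```shell
# Sketch only: glue_w is a hypothetical helper.
# If the first argument is -W, glue it onto the argument that
# follows it, then print the resulting arguments one per line.
glue_w() {
    if [ "$1" = -W ]
    then
        shift
        set -- -W"$@"    # -W joins the first remaining argument
    fi
    printf '%s\n' "$@"
}
```
+
+   With the arguments `-W version prog.awk', the list becomes
+`-Wversion prog.awk', so the option can then be matched by a single
+`case' pattern.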
+
+   If none of `-f', `--file', `-Wfile', `--source', or `-Wsource' were
+supplied, then the first non-option argument should be the `awk'
+program.  If there are no command line arguments left, `igawk' prints
+an error message and exits.  Otherwise, the first argument is echoed
+into `/tmp/ig.s.$$'.
+
+   In any case, after the arguments have been processed, `/tmp/ig.s.$$'
+contains the complete text of the original `awk' program.
+
+   The `$$' in `sh' represents the current process ID number.  It is
+often used in shell programs to generate unique temporary file names.
+This allows multiple users to run `igawk' without worrying that the
+temporary file names will clash.
+
+   Here's the program:
+
+     #! /bin/sh
+     
+     # igawk --- like gawk but do @include processing
+     # Arnold Robbins, arnold@gnu.ai.mit.edu, Public Domain
+     # July 1993
+     
+     if [ "$1" = debug ]
+     then
+         set -x
+         shift
+     else
+         # cleanup on exit, hangup, interrupt, quit, termination
+         trap 'rm -f /tmp/ig.[se].$$' 0 1 2 3 15
+     fi
+     
+     while [ $# -ne 0 ] # loop over arguments
+     do
+         case $1 in
+         --)     shift; break;;
+     
+         -W)     shift
+                 set -- -W"$@"
+                 continue;;
+     
+         -[vF])  opts="$opts $1 '$2'"
+                 shift;;
+     
+         -[vF]*) opts="$opts '$1'" ;;
+     
+         -f)     echo @include "$2" >> /tmp/ig.s.$$
+                 shift;;
+     
+         -f*)    f=`echo "$1" | sed 's/-f//'`
+                 echo @include "$f" >> /tmp/ig.s.$$ ;;
+     
+         -?file=*)    # -Wfile or --file
+                 f=`echo "$1" | sed 's/-.file=//'`
+                 echo @include "$f" >> /tmp/ig.s.$$ ;;
+     
+         -?file)    # get arg, $2
+                 echo @include "$2" >> /tmp/ig.s.$$
+                 shift;;
+     
+         -?source=*)    # -Wsource or --source
+                 t=`echo "$1" | sed 's/-.source=//'`
+                 echo "$t" >> /tmp/ig.s.$$ ;;
+     
+         -?source)  # get arg, $2
+                 echo "$2" >> /tmp/ig.s.$$
+                 shift;;
+     
+         -?version)
+                 echo igawk: version 1.0 1>&2
+                 gawk --version
+                 exit 0 ;;
+     
+         -[W-]*)    opts="$opts '$1'" ;;
+     
+         *)      break;;
+         esac
+         shift
+     done
+     
+     if [ ! -s /tmp/ig.s.$$ ]
+     then
+         if [ -z "$1" ]
+         then
+              echo igawk: no program! 1>&2
+              exit 1
+         else
+             echo "$1" > /tmp/ig.s.$$
+             shift
+         fi
+     fi
+     
+     # at this point, /tmp/ig.s.$$ has the program
+
+   The `awk' program to process `@include' directives reads through the
+program, one line at a time using `getline' (*note Explicit Input with
+`getline': Getline.).  The input file names and `@include' statements
+are managed using a stack.  As each `@include' is encountered, the
+current file name is "pushed" onto the stack, and the file named in the
+`@include' directive becomes the current file name.  As each file is
+finished, the stack is "popped," and the previous input file becomes
+the current input file again.  The process is started by making the
+original file the first one on the stack.
+
+   The `pathto' function does the work of finding the full path to a
+file.  It simulates `gawk''s behavior when searching the `AWKPATH'
+environment variable (*note The `AWKPATH' Environment Variable: AWKPATH
+Variable.).  If a file name has a `/' in it, no path search is done.
+Otherwise, the file name is concatenated with the name of each
+directory in the path, and an attempt is made to open the generated file
+name.  The only way in `awk' to test if a file can be read is to go
+ahead and try to read it with `getline'; that is what `pathto' does.
+If the file can be read, it is closed, and the file name is returned.
+
+     gawk -- '
+     # process @include directives
+     
+     function pathto(file,    i, t, junk)
+     {
+         if (index(file, "/") != 0)
+             return file
+     
+         for (i = 1; i <= ndirs; i++) {
+             t = (pathlist[i] "/" file)
+             if ((getline junk < t) > 0) {
+                 # found it
+                 close(t)
+                 return t
+             }
+         }
+         return ""
+     }
+
+   The main program is contained inside one `BEGIN' rule.  The first
+thing it does is set up the `pathlist' array that `pathto' uses.  After
+splitting the path on `:', null elements are replaced with `"."', which
+represents the current directory.
+
+     BEGIN {
+         path = ENVIRON["AWKPATH"]
+         ndirs = split(path, pathlist, ":")
+         for (i = 1; i <= ndirs; i++) {
+             if (pathlist[i] == "")
+                 pathlist[i] = "."
+         }
+
+   The stack is initialized with `ARGV[1]', which will be
+`/tmp/ig.s.$$'.  The main loop comes next.  Input lines are read in
+succession. Lines that do not start with `@include' are printed
+verbatim.
+
+   If the line does start with `@include', the file name is in `$2'.
+`pathto' is called to generate the full path.  If it cannot, then we
+print an error message and continue.
+
+   The next thing to check is if the file has been included already.
+The `processed' array is indexed by the full file name of each included
+file, and it tracks this information for us.  If the file has been
+seen, a warning message is printed. Otherwise, the new file name is
+pushed onto the stack and processing continues.
+
+   Finally, when `getline' encounters the end of the input file, the
+file is closed and the stack is popped.  When `stackptr' is less than
+zero, the program is done.
+
+         stackptr = 0
+         input[stackptr] = ARGV[1] # ARGV[1] is first file
+     
+         for (; stackptr >= 0; stackptr--) {
+             while ((getline < input[stackptr]) > 0) {
+                 if (tolower($1) != "@include") {
+                     print
+                     continue
+                 }
+                 fpath = pathto($2)
+                 if (fpath == "") {
+                     printf("igawk:%s:%d: cannot find %s\n", \
+                         input[stackptr], FNR, $2) > "/dev/stderr"
+                     continue
+                 }
+                 if (! (fpath in processed)) {
+                     processed[fpath] = input[stackptr]
+                     input[++stackptr] = fpath
+                 } else
+                     print $2, "included in", input[stackptr], \
+                         "already included in", \
+                         processed[fpath] > "/dev/stderr"
+             }
+             close(input[stackptr])
+         }
+     }' /tmp/ig.s.$$ > /tmp/ig.e.$$
+
+   The last step is to call `gawk' with the expanded program and the
+original options and command line arguments that the user supplied.
+`gawk''s exit status is passed back on to `igawk''s calling program.
+
+     eval gawk -f /tmp/ig.e.$$ $opts -- "$@"
+     
+     exit $?
+
+   This version of `igawk' represents my third attempt at this program.
+There are three key simplifications that made the program work better.
+
+  1. Using `@include' even for the files named with `-f' makes building
+     the initial collected `awk' program much simpler; all the
+     `@include' processing can be done once.
+
+  2. The `pathto' function doesn't try to save the line read with
+     `getline' when testing for the file's accessibility.  Trying to
+     save this line for use with the main program complicates things
+     considerably.
+
+  3. Using a `getline' loop in the `BEGIN' rule does it all in one
+     place.  It is not necessary to call out to a separate loop for
+     processing nested `@include' statements.
+
+   Also, this program illustrates that it is often worthwhile to combine
+`sh' and `awk' programming.  You can usually accomplish quite a lot
+without having to resort to low-level programming in C or C++, and it
+is frequently easier to do certain kinds of string and argument
+manipulation using the shell than it is in `awk'.
+
+   Finally, `igawk' shows that it is not always necessary to add new
+features to a program; they can often be layered on top.  With `igawk',
+there is no real reason to build `@include' processing into `gawk'
+itself.
+
+   As an additional example of this, consider the idea of having two
+files in a directory in the search path.
+
+`default.awk'
+     This file would contain a set of default library functions, such
+     as `getopt' and `assert'.
+
+`site.awk'
+     This file would contain library functions that are specific to a
+     site or installation, i.e., locally developed functions.  Having a
+     separate file allows `default.awk' to change with new `gawk'
+     releases, without requiring the system administrator to update it
+     each time by adding the local functions.
+
+   One user suggested that `gawk' be modified to automatically read
+these files upon startup.  Instead, it would be very simple to modify
+`igawk' to do this. Since `igawk' can process nested `@include'
+directives, `default.awk' could simply contain `@include' statements
+for the desired library functions.
+
+
+File: gawk.info,  Node: Language History,  Next: Gawk Summary,  Prev: Sample Programs,  Up: Top
+
+The Evolution of the `awk' Language
+***********************************
+
+   This Info file describes the GNU implementation of `awk', which
+follows the POSIX specification.  Many `awk' users are only familiar
+with the original `awk' implementation in Version 7 Unix.  (This
+implementation was the basis for `awk' in Berkeley Unix, through
+4.3-Reno.  The 4.4 release of Berkeley Unix uses `gawk' 2.15.2 for its
+version of `awk'.) This chapter briefly describes the evolution of the
+`awk' language, with cross references to other parts of the Info file
+where you can find more information.
+
+* Menu:
+
+* V7/SVR3.1::                   The major changes between V7 and System V
+                                Release 3.1.
+* SVR4::                        Minor changes between System V Releases 3.1
+                                and 4.
+* POSIX::                       New features from the POSIX standard.
+* BTL::                         New features from the Bell Laboratories
+                                version of `awk'.
+* POSIX/GNU::                   The extensions in `gawk' not in POSIX
+                                `awk'.
+
+
+File: gawk.info,  Node: V7/SVR3.1,  Next: SVR4,  Prev: Language History,  Up: Language History
+
+Major Changes between V7 and SVR3.1
+===================================
+
+   The `awk' language evolved considerably between the release of
+Version 7 Unix (1978) and the new version first made generally
+available in System V Release 3.1 (1987).  This section summarizes the
+changes, with cross-references to further details.
+
+   * The requirement for `;' to separate rules on a line (*note `awk'
+     Statements Versus Lines: Statements/Lines.).
+
+   * User-defined functions, and the `return' statement (*note
+     User-defined Functions: User-defined.).
+
+   * The `delete' statement (*note The `delete' Statement: Delete.).
+
+   * The `do'-`while' statement (*note The `do'-`while' Statement: Do
+     Statement.).
+
+   * The built-in functions `atan2', `cos', `sin', `rand' and `srand'
+     (*note Numeric Built-in Functions: Numeric Functions.).
+
+   * The built-in functions `gsub', `sub', and `match' (*note Built-in
+     Functions for String Manipulation: String Functions.).
+
+   * The built-in functions `close' and `system' (*note Built-in
+     Functions for Input/Output: I/O Functions.).
+
+   * The `ARGC', `ARGV', `FNR', `RLENGTH', `RSTART', and `SUBSEP'
+     built-in variables (*note Built-in Variables::).
+
+   * The conditional expression using the ternary operator `?:' (*note
+     Conditional Expressions: Conditional Exp.).
+
+   * The exponentiation operator `^' (*note Arithmetic Operators:
+     Arithmetic Ops.) and its assignment operator form `^=' (*note
+     Assignment Expressions: Assignment Ops.).
+
+   * C-compatible operator precedence, which breaks some old `awk'
+     programs (*note Operator Precedence (How Operators Nest):
+     Precedence.).
+
+   * Regexps as the value of `FS' (*note Specifying How Fields are
+     Separated: Field Separators.), and as the third argument to the
+     `split' function (*note Built-in Functions for String
+     Manipulation: String Functions.).
+
+   * Dynamic regexps as operands of the `~' and `!~' operators (*note
+     How to Use Regular Expressions: Regexp Usage.).
+
+   * The escape sequences `\b', `\f', and `\r' (*note Escape
+     Sequences::).  (Some vendors have updated their old versions of
+     `awk' to recognize `\r', `\b', and `\f', but this is not something
+     you can rely on.)
+
+   * Redirection of input for the `getline' function (*note Explicit
+     Input with `getline': Getline.).
+
+   * Multiple `BEGIN' and `END' rules (*note The `BEGIN' and `END'
+     Special Patterns: BEGIN/END.).
+
+   * Multi-dimensional arrays (*note Multi-dimensional Arrays:
+     Multi-dimensional.).
+
+
+File: gawk.info,  Node: SVR4,  Next: POSIX,  Prev: V7/SVR3.1,  Up: Language History
+
+Changes between SVR3.1 and SVR4
+===============================
+
+   The System V Release 4 version of Unix `awk' added these features
+(some of which originated in `gawk'):
+
+   * The `ENVIRON' variable (*note Built-in Variables::).
+
+   * Multiple `-f' options on the command line (*note Command Line
+     Options: Options.).
+
+   * The `-v' option for assigning variables before program execution
+     begins (*note Command Line Options: Options.).
+
+   * The `--' option for terminating command line options.
+
+   * The `\a', `\v', and `\x' escape sequences (*note Escape
+     Sequences::).
+
+   * A defined return value for the `srand' built-in function (*note
+     Numeric Built-in Functions: Numeric Functions.).
+
+   * The `toupper' and `tolower' built-in string functions for case
+     translation (*note Built-in Functions for String Manipulation:
+     String Functions.).
+
+   * A cleaner specification for the `%c' format-control letter in the
+     `printf' function (*note Format-Control Letters: Control Letters.).
+
+   * The ability to dynamically pass the field width and precision
+     (`"%*.*d"') in the argument list of the `printf' function (*note
+     Format-Control Letters: Control Letters.).
+
+   * The use of regexp constants such as `/foo/' as expressions, where
+     they are equivalent to using the matching operator, as in `$0 ~
+     /foo/' (*note Using Regular Expression Constants: Using Constant
+     Regexps.).
+
+
+File: gawk.info,  Node: POSIX,  Next: BTL,  Prev: SVR4,  Up: Language History
+
+Changes between SVR4 and POSIX `awk'
+====================================
+
+   The POSIX Command Language and Utilities standard for `awk'
+introduced the following changes into the language:
+
+   * The use of `-W' for implementation-specific options.
+
+   * The use of `CONVFMT' for controlling the conversion of numbers to
+     strings (*note Conversion of Strings and Numbers: Conversion.).
+
+   * The concept of a numeric string, and tighter comparison rules to go
+     with it (*note Variable Typing and Comparison Expressions: Typing
+     and Comparison.).
+
+   * More complete documentation of many of the previously undocumented
+     features of the language.
+
+   The following common extensions are not permitted by the POSIX
+standard:
+
+   * `\x' escape sequences are not recognized (*note Escape
+     Sequences::).
+
+   * Newlines do not act as whitespace to separate fields when `FS' is
+     equal to a single space.
+
+   * The synonym `func' for the keyword `function' is not recognized
+     (*note Function Definition Syntax: Definition Syntax.).
+
+   * The operators `**' and `**=' cannot be used in place of `^' and
+     `^=' (*note Arithmetic Operators: Arithmetic Ops., and also *note
+     Assignment Expressions: Assignment Ops.).
+
+   * Specifying `-Ft' on the command line does not set the value of
+     `FS' to be a single tab character (*note Specifying How Fields are
+     Separated: Field Separators.).
+
+   * The `fflush' built-in function is not supported (*note Built-in
+     Functions for Input/Output: I/O Functions.).
+
+
+File: gawk.info,  Node: BTL,  Next: POSIX/GNU,  Prev: POSIX,  Up: Language History
+
+Extensions in the Bell Laboratories `awk'
+=========================================
+
+   Brian Kernighan, one of the original designers of Unix `awk', has
+made his version available via anonymous `ftp' (*note Other Freely
+Available `awk' Implementations: Other Versions.).  This section
+describes extensions in his version of `awk' that are not in POSIX
+`awk'.
+
+   * The `-mf NNN' and `-mr NNN' command line options to set the
+     maximum number of fields, and the maximum record size, respectively
+     (*note Command Line Options: Options.).
+
+   * The `fflush' built-in function for flushing buffered output (*note
+     Built-in Functions for Input/Output: I/O Functions.).
+
+
+
+File: gawk.info,  Node: POSIX/GNU,  Prev: BTL,  Up: Language History
+
+Extensions in `gawk' Not in POSIX `awk'
+=======================================
+
+   The GNU implementation, `gawk', adds a number of features.  This
+section lists them in the order they were added to `gawk'.  They can
+all be disabled with either the `--traditional' or `--posix' option
+(*note Command Line Options: Options.).
+
+   Version 2.10 of `gawk' introduced these features:
+
+   * The `AWKPATH' environment variable for specifying a path search for
+     the `-f' command line option (*note Command Line Options:
+     Options.).
+
+   * The `IGNORECASE' variable and its effects (*note Case-sensitivity
+     in Matching: Case-sensitivity.).
+
+   * The `/dev/stdin', `/dev/stdout', `/dev/stderr', and `/dev/fd/N'
+     file name interpretation (*note Special File Names in `gawk':
+     Special Files.).
+
+   Version 2.13 of `gawk' introduced these features:
+
+   * The `FIELDWIDTHS' variable and its effects (*note Reading
+     Fixed-width Data: Constant Size.).
+
+   * The `systime' and `strftime' built-in functions for obtaining and
+     printing time stamps (*note Functions for Dealing with Time
+     Stamps: Time Functions.).
+
+   * The `-W lint' option to provide source code and run time error and
+     portability checking (*note Command Line Options: Options.).
+
+   * The `-W compat' option to turn off these extensions (*note Command
+     Line Options: Options.).
+
+   * The `-W posix' option for full POSIX compliance (*note Command
+     Line Options: Options.).
+
+   Version 2.14 of `gawk' introduced these features:
+
+   * The `next file' statement for skipping to the next data file
+     (*note The `nextfile' Statement: Nextfile Statement.).
+
+   Version 2.15 of `gawk' introduced these features:
+
+   * The `ARGIND' variable, which tracks the movement of `FILENAME'
+     through `ARGV' (*note Built-in Variables::).
+
+   * The `ERRNO' variable, which contains the system error message when
+     `getline' returns -1, or when `close' fails (*note Built-in
+     Variables::).
+
+   * The ability to use GNU-style long named options that start with
+     `--' (*note Command Line Options: Options.).
+
+   * The `--source' option for mixing command line and library file
+     source code (*note Command Line Options: Options.).
+
+   * The `/dev/pid', `/dev/ppid', `/dev/pgrpid', and `/dev/user' file
+     name interpretation (*note Special File Names in `gawk': Special
+     Files.).
+
+   Version 3.0 of `gawk' introduced these features:
+
+   * The `next file' statement became `nextfile' (*note The `nextfile'
+     Statement: Nextfile Statement.).
+
+   * The `--lint-old' option to warn about constructs that are not
+     available in the original Version 7 Unix version of `awk' (*note
+     Major Changes between V7 and SVR3.1: V7/SVR3.1.).
+
+   * The `--traditional' option was added as a better name for
+     `--compat' (*note Command Line Options: Options.).
+
+   * The ability for `FS' to be a null string, and for the third
+     argument to `split' to be the null string (*note Making Each
+     Character a Separate Field: Single Character Fields.).
+
+   * The ability for `RS' to be a regexp (*note How Input is Split into
+     Records: Records.).
+
+   * The `RT' variable (*note How Input is Split into Records:
+     Records.).
+
+   * The `gensub' function for more powerful text manipulation (*note
+     Built-in Functions for String Manipulation: String Functions.).
+
+   * The `strftime' function acquired a default time format, allowing
+     it to be called with no arguments (*note Functions for Dealing
+     with Time Stamps: Time Functions.).
+
+   * Full support for both POSIX and GNU regexps (*note Regular
+     Expressions: Regexp.).
+
+   * The `--re-interval' option to provide interval expressions in
+     regexps (*note Regular Expression Operators: Regexp Operators.).
+
+   * `IGNORECASE' changed, now applying to string comparison as well as
+     regexp operations (*note Case-sensitivity in Matching:
+     Case-sensitivity.).
+
+   * The `-m' option and the `fflush' function from the Bell Labs
+     research version of `awk' (*note Command Line Options: Options.;
+     also *note Built-in Functions for Input/Output: I/O Functions.).
+
+   * The use of GNU Autoconf to control the configuration process
+     (*note Compiling `gawk' for Unix: Quick Installation.).
+
+   * Amiga support (*note Installing `gawk' on an Amiga: Amiga
+     Installation.).
+
+
+
+File: gawk.info,  Node: Gawk Summary,  Next: Installation,  Prev: Language History,  Up: Top
+
+`gawk' Summary
+**************
+
+   This appendix provides a brief summary of the `gawk' command line
+and the `awk' language.  It is designed to serve as a "quick
+reference."  It is therefore terse, but complete.
+
+* Menu:
+
+* Command Line Summary::        Recapitulation of the command line.
+* Language Summary::            A terse review of the language.
+* Variables/Fields::            Variables, fields, and arrays.
+* Rules Summary::               Patterns and Actions, and their component
+                                parts.
+* Actions Summary::             Quick overview of actions.
+* Functions Summary::           Defining and calling functions.
+* Historical Features::         Some undocumented but supported ``features''.
+
+
+File: gawk.info,  Node: Command Line Summary,  Next: Language Summary,  Prev: Gawk Summary,  Up: Gawk Summary
+
+Command Line Options Summary
+============================
+
+   The command line consists of options to `gawk' itself, the `awk'
+program text (if not supplied via the `-f' option), and values to be
+made available in the `ARGC' and `ARGV' predefined `awk' variables:
+
+     gawk [POSIX OR GNU STYLE OPTIONS] -f SOURCE-FILE [`--'] FILE ...
+     gawk [POSIX OR GNU STYLE OPTIONS] [`--'] 'PROGRAM' FILE ...
+
+   The options that `gawk' accepts are:
+
+`-F FS'
+`--field-separator FS'
+     Use FS for the input field separator (the value of the `FS'
+     predefined variable).
+
+`-f PROGRAM-FILE'
+`--file PROGRAM-FILE'
+     Read the `awk' program source from the file PROGRAM-FILE, instead
+     of from the first command line argument.
+
+`-mf NNN'
+`-mr NNN'
+     The `f' flag sets the maximum number of fields, and the `r' flag
+     sets the maximum record size.  These options are ignored by
+     `gawk', since `gawk' has no predefined limits; they are only for
+     compatibility with the Bell Labs research version of Unix `awk'.
+
+`-v VAR=VAL'
+`--assign VAR=VAL'
+     Assign the variable VAR the value VAL before program execution
+     begins.
+
+`-W traditional'
+`-W compat'
+`--traditional'
+`--compat'
+     Use compatibility mode, in which `gawk' extensions are turned off.
+
+`-W copyleft'
+`-W copyright'
+`--copyleft'
+`--copyright'
+     Print the short version of the General Public License on the
+     standard output, and exit.  This option may disappear in a future
+     version of `gawk'.
+
+`-W help'
+`-W usage'
+`--help'
+`--usage'
+     Print a relatively short summary of the available options on the
+     standard output, and exit.
+
+`-W lint'
+`--lint'
+     Give warnings about dubious or non-portable `awk' constructs.
+
+`-W lint-old'
+`--lint-old'
+     Warn about constructs that are not available in the original
+     Version 7 Unix version of `awk'.
+
+`-W posix'
+`--posix'
+     Use POSIX compatibility mode, in which `gawk' extensions are
+     turned off and additional restrictions apply.
+
+`-W re-interval'
+`--re-interval'
+     Allow interval expressions in regexps (*note Regular Expression
+     Operators: Regexp Operators.).
+
+`-W source=PROGRAM-TEXT'
+`--source PROGRAM-TEXT'
+     Use PROGRAM-TEXT as `awk' program source code.  This option allows
+     mixing command line source code with source code from files, and is
+     particularly useful for mixing command line programs with library
+     functions.
+
+`-W version'
+`--version'
+     Print version information for this particular copy of `gawk' on
+     the error output.
+
+`--'
+     Signal the end of options.  This is useful to allow further
+     arguments to the `awk' program itself to start with a `-'.  This
+     is mainly for consistency with POSIX argument parsing conventions.
+
+   Any other options are flagged as invalid, but are otherwise ignored.
+*Note Command Line Options: Options, for more details.
+
+
+File: gawk.info,  Node: Language Summary,  Next: Variables/Fields,  Prev: Command Line Summary,  Up: Gawk Summary
+
+Language Summary
+================
+
+   An `awk' program consists of a sequence of zero or more
+pattern-action statements and optional function definitions.  One or
+the other of the pattern and action may be omitted.
+
+     PATTERN    { ACTION STATEMENTS }
+     PATTERN
+               { ACTION STATEMENTS }
+     
+     function NAME(PARAMETER LIST)     { ACTION STATEMENTS }
+
+   `gawk' first reads the program source from the PROGRAM-FILE(s), if
+specified, or from the first non-option argument on the command line.
+The `-f' option may be used multiple times on the command line.  `gawk'
+reads the program text from all the PROGRAM-FILE files, effectively
+concatenating them in the order they are specified.  This is useful for
+building libraries of `awk' functions, without having to include them
+in each new `awk' program that uses them.  To use a library function in
+a file from a program typed in on the command line, specify `--source
+'PROGRAM'', and type your program in between the single quotes.  *Note
+Command Line Options: Options.
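+
+   The effect of giving `-f' more than once can be sketched as follows.
+This is only an illustration: the file names `lib.awk' and `main.awk'
+and the function `double' are hypothetical, and the files are created
+in a scratch directory just for the demonstration.
+

```shell
# Sketch: a function kept in a "library" file is visible to a program
# loaded with a second -f, since the sources are effectively concatenated.
# File names and the function `double' are hypothetical.
run_with_library() {
    dir=$(mktemp -d) || return 1
    printf 'function double(x) { return 2 * x }\n' > "$dir/lib.awk"
    printf 'BEGIN { print double(21) }\n' > "$dir/main.awk"
    awk -f "$dir/lib.awk" -f "$dir/main.awk"
    status=$?
    rm -rf "$dir"
    return $status
}

run_with_library
```
+
+   With `gawk', the second file could instead be given directly on the
+command line using `--source', as described above.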
+
+   The environment variable `AWKPATH' specifies a search path to use
+when finding source files named with the `-f' option.  The default
+path, which is `.:/usr/local/share/awk'(1), is used if `AWKPATH' is not
+set.  If a file name given to the `-f' option contains a `/' character,
+no path search is performed.  *Note The `AWKPATH' Environment Variable:
+AWKPATH Variable.
+
+   `gawk' compiles the program into an internal form, and then proceeds
+to read each file named in the `ARGV' array.  The initial values of
+`ARGV' come from the command line arguments.  If there are no files
+named on the command line, `gawk' reads the standard input.
+
+   If a "file" named on the command line has the form `VAR=VAL', it is
+treated as a variable assignment: the variable VAR is assigned the
+value VAL.  If any element of `ARGV' is the null string, that element
+is skipped.
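+
+   The following sketch shows one way the `VAR=VAL' form might be used
+to label records by the file they came from.  The variable `src', the
+function name, and the two scratch files are assumptions made for the
+example.
+

```shell
# Sketch: a VAR=VAL argument is processed when it is reached in the
# argument list, so `src' changes between the two data files.
# The variable and file names are hypothetical.
assign_between_files() {
    dir=$(mktemp -d) || return 1
    printf 'one\n' > "$dir/a.txt"
    printf 'two\n' > "$dir/b.txt"
    awk '{ print src ": " $0 }' src=A "$dir/a.txt" src=B "$dir/b.txt"
    status=$?
    rm -rf "$dir"
    return $status
}

assign_between_files
```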
+
+   For each record in the input, `gawk' tests to see if it matches any
+PATTERN in the `awk' program.  For each pattern that the record
+matches, the associated ACTION is executed.
+
+   ---------- Footnotes ----------
+
+   (1) The path may use a directory other than `/usr/local/share/awk',
+depending upon how `gawk' was built and installed.
+
+
+File: gawk.info,  Node: Variables/Fields,  Next: Rules Summary,  Prev: Language Summary,  Up: Gawk Summary
+
+Variables and Fields
+====================
+
+   `awk' variables are not declared; they come into existence when they
+are first used.  Their values are either floating-point numbers or
+strings.  `awk' also has one-dimensional arrays; multi-dimensional
+arrays may be simulated.  There are several predefined variables that
+`awk' sets as a program runs; these are summarized below.
+
+* Menu:
+
+* Fields Summary::              Input field splitting.
+* Built-in Summary::            `awk''s built-in variables.
+* Arrays Summary::              Using arrays.
+* Data Type Summary::           Values in `awk' are numbers or strings.
+
+
+File: gawk.info,  Node: Fields Summary,  Next: Built-in Summary,  Prev: Variables/Fields,  Up: Variables/Fields
+
+Fields
+------
+
+   As each input line is read, `gawk' splits the line into FIELDS,
+using the value of the `FS' variable as the field separator.  If `FS'
+is a single character, fields are separated by that character.
+Otherwise, `FS' is expected to be a full regular expression.  In the
+special case that `FS' is a single space, fields are separated by runs
+of spaces, tabs and/or newlines.(1) If `FS' is the null string (`""'),
+then each individual character in the record becomes a separate field.
+Note that the value of `IGNORECASE' (*note Case-sensitivity in
+Matching: Case-sensitivity.)  also affects how fields are split when
+`FS' is a regular expression.
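+
+   The three main cases can be compared side by side.  This is a
+minimal sketch; the function names and input strings are invented, and
+`-F' is used simply as a convenient way to set `FS'.
+

```shell
# Sketch: three ways FS can split a record.  Names and data are made up.

# FS is a single character: fields are separated by that character.
split_on_char()  { echo 'a:b:c'  | awk -F: '{ print $2 }'; }

# FS is longer than one character: it is treated as a regular expression.
split_on_regex() { echo 'a1b22c' | awk -F'[0-9]+' '{ print $3 }'; }

# FS is a single space (the default): runs of spaces and tabs separate
# fields, and leading and trailing whitespace is ignored.
split_default()  { printf '  a \t b  \n' | awk '{ print NF ": " $1 "," $2 }'; }

split_on_char
split_on_regex
split_default
```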
+
+   Each field in the input line may be referenced by its position, `$1',
+`$2', and so on.  `$0' is the whole line.  The value of a field may be
+assigned to as well.  Field numbers need not be constants:
+
+     n = 5
+     print $n
+
+prints the fifth field in the input line.  The variable `NF' is set to
+the total number of fields in the input line.
+
+   References to non-existent fields (i.e., fields after `$NF') return
+the null string.  However, assigning to a non-existent field (e.g.,
+`$(NF+2) = 5') increases the value of `NF', creates any intervening
+fields with the null string as their value, and causes the value of
+`$0' to be recomputed, with the fields being separated by the value of
+`OFS'.  Decrementing `NF' causes the values of fields past the new
+value to be lost, and the value of `$0' to be recomputed, with the
+fields being separated by the value of `OFS'.  *Note Reading Input
+Files: Reading Files.
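+
+   Both effects can be seen in a short experiment.  The sketch below
+assumes an `awk' that rebuilds `$0' when `NF' is changed, as `gawk'
+does; the function names and the choice of `-' for `OFS' are only for
+illustration.
+

```shell
# Sketch: assigning past $NF grows the record; lowering NF truncates it.
# In both cases $0 is rebuilt with OFS between the fields.

# $(NF+2) is field 5, so field 4 is created with a null value.
grow_record()   { echo 'a b c'   | awk -v OFS=- '{ $(NF+2) = "e"; print; print NF }'; }

# Fields 3 and 4 are discarded when NF is set to 2.
shrink_record() { echo 'a b c d' | awk -v OFS=- '{ NF = 2; print }'; }

grow_record
shrink_record
```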
+
+   ---------- Footnotes ----------
+
+   (1) In POSIX `awk', newline does not separate fields.
+
+
+File: gawk.info,  Node: Built-in Summary,  Next: Arrays Summary,  Prev: Fields Summary,  Up: Variables/Fields
+
+Built-in Variables
+------------------
+
+   `gawk''s built-in variables are:
+
+`ARGC'
+     The number of elements in `ARGV'. See below for what is actually
+     included in `ARGV'.
+
+`ARGIND'
+     The index in `ARGV' of the current file being processed.  When
+     `gawk' is processing the input data files, it is always true that
+     `FILENAME == ARGV[ARGIND]'.
+
+`ARGV'
+     The array of command line arguments.  The array is indexed from
+     zero to `ARGC' - 1.  Dynamically changing `ARGC' and the contents
+     of `ARGV' can control the files used for data.  A null-valued
+     element in `ARGV' is ignored. `ARGV' does not include the options
+     to `awk' or the text of the `awk' program itself.
+
+`CONVFMT'
+     The conversion format to use when converting numbers to strings.
+
+`FIELDWIDTHS'
+     A space-separated list of numbers describing the fixed-width input
+     data.
+
+`ENVIRON'
+     An array of environment variable values.  The array is indexed by
+     variable name, each element being the value of that variable.
+     Thus, the environment variable `HOME' is `ENVIRON["HOME"]'.  One
+     possible value might be `/home/arnold'.
+
+     Changing this array does not affect the environment seen by
+     programs which `gawk' spawns via redirection or the `system'
+     function.  (This may change in a future version of `gawk'.)
+
+     Some operating systems do not have environment variables.  The
+     `ENVIRON' array is empty when running on these systems.
+
+`ERRNO'
+     The system error message when an error occurs using `getline' or
+     `close'.
+
+`FILENAME'
+     The name of the current input file.  If no files are specified on
+     the command line, the value of `FILENAME' is the null string.
+
+`FNR'
+     The input record number in the current input file.
+
+`FS'
+     The input field separator, a space by default.
+
+`IGNORECASE'
+     The case-sensitivity flag for string comparisons and regular
+     expression operations.  If `IGNORECASE' has a non-zero value, then
+     pattern matching in rules, record separating with `RS', field
+     splitting with `FS', regular expression matching with `~' and
+     `!~', and the `gensub', `gsub', `index', `match', `split' and
+     `sub' built-in functions all ignore case when doing regular
+     expression operations, and all string comparisons are done
+     ignoring case.
+
+`NF'
+     The number of fields in the current input record.
+
+`NR'
+     The total number of input records seen so far.
+
+`OFMT'
+     The output format for numbers for the `print' statement, `"%.6g"'
+     by default.
+
+`OFS'
+     The output field separator, a space by default.
+
+`ORS'
+     The output record separator, by default a newline.
+
+`RS'
+     The input record separator, by default a newline.  If `RS' is set
+     to the null string, then records are separated by blank lines.
+     When `RS' is set to the null string, then the newline character
+     always acts as a field separator, in addition to whatever value
+     `FS' may have.  If `RS' is set to a multi-character string, it
+     denotes a regexp; input text matching the regexp separates records.
+
+`RT'
+     The input text that matched the text denoted by `RS', the record
+     separator.
+
+`RSTART'
+     The index of the first character last matched by `match'; zero if
+     no match.
+
+`RLENGTH'
+     The length of the string last matched by `match'; -1 if no match.
+
+`SUBSEP'
+     The string used to separate multiple subscripts in array elements,
+     by default `"\034"'.
+
+   *Note Built-in Variables::, for more information.
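+
+   Two of the entries above lend themselves to a quick experiment: the
+null-string setting of `RS', and the values `match' leaves in `RSTART'
+and `RLENGTH'.  The following is a sketch only, with invented function
+names and sample data.
+

```shell
# Sketch: RS = "" makes blank lines separate records, and newline then
# also separates fields.  Names and data are made up.
paragraph_mode() {
    printf 'a b\nc\n\nd\n' | awk 'BEGIN { RS = "" } { print NR ": " NF }'
}

# Sketch: match sets RSTART and RLENGTH; a failed match gives 0 and -1.
match_positions() {
    awk 'BEGIN {
        match("foobar", /o+/); print RSTART, RLENGTH
        match("foobar", /z/);  print RSTART, RLENGTH
    }'
}

paragraph_mode
match_positions
```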
+
+
+File: gawk.info,  Node: Arrays Summary,  Next: Data Type Summary,  Prev: Built-in Summary,  Up: Variables/Fields
+
+Arrays
+------
+
+   Arrays are subscripted with an expression between square brackets
+(`[' and `]').  Array subscripts are _always_ strings; numbers are
+converted to strings as necessary, following the standard conversion
+rules (*note Conversion of Strings and Numbers: Conversion.).
+
+   If you use multiple expressions separated by commas inside the square
+brackets, then the array subscript is a string consisting of the
+concatenation of the individual subscript values, converted to strings,
+separated by the subscript separator (the value of `SUBSEP').
+
+   The special operator `in' may be used in a conditional context to
+see if an array has an index consisting of a particular value.
+
+     if (val in array)
+             print array[val]
+
+   If the array has multiple subscripts, use `(i, j, ...) in ARRAY' to
+test for existence of an element.
+
+   The `in' construct may also be used in a `for' loop to iterate over
+all the elements of an array.  *Note Scanning All Elements of an Array:
+Scanning an Array.
+
+   You can remove an element from an array using the `delete' statement.
+
+   You can clear an entire array using `delete ARRAY'.
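+
+   The following is a minimal sketch of these array operations.  The
+array name `grid' and the helper array `parts' are hypothetical names
+chosen for the example:
+

```shell
array_demo=$(awk 'BEGIN {
    grid[1, 2] = "x"                 # subscript is "1" SUBSEP "2"
    if ((1, 2) in grid)
        print "found"
    for (key in grid) {              # iterate over all subscripts
        split(key, parts, SUBSEP)    # recover the two components
        print parts[1], parts[2]
    }
    delete grid[1, 2]                # remove the single element
    if (!((1, 2) in grid))
        print "deleted"
}')
echo "$array_demo"
```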
+
+   *Note Arrays in `awk': Arrays.
+
+
+File: gawk.info,  Node: Data Type Summary,  Prev: Arrays Summary,  Up: Variables/Fields
+
+Data Types
+----------
+
+   The value of an `awk' expression is always either a number or a
+string.
+
+   Some contexts (such as arithmetic operators) require numeric values.
+They convert strings to numbers by interpreting the text of the string
+as a number.  If the string does not look like a number, it converts to
+zero.
+
+   Other contexts (such as concatenation) require string values.  They
+convert numbers to strings by effectively printing them with `sprintf'.
+*Note Conversion of Strings and Numbers: Conversion, for the details.
+
+   To force conversion of a string value to a number, simply add zero
+to it.  If the value you start with is already a number, this does not
+change it.
+
+   To force conversion of a numeric value to a string, concatenate it
+with the null string.
+
+   Comparisons are done numerically if both operands are numeric, or if
+one is numeric and the other is a numeric string.  Otherwise one or
+both operands are converted to strings and a string comparison is
+performed.  Fields, `getline' input, `FILENAME', `ARGV' elements,
+`ENVIRON' elements and the elements of an array created by `split' are
+the only items that can be numeric strings.  String constants, such as
+`"3.1415927"', are not numeric strings; they are string constants.  The
+full rules for comparisons are described in *Note Variable Typing and
+Comparison Expressions: Typing and Comparison.
+
+   Uninitialized variables have the string value `""' (the null, or
+empty, string).  In contexts where a number is required, this is
+equivalent to zero.
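+
+   The comparison and conversion rules above can be illustrated with a
+short sketch.  The variable names (`fields', `consts', `forced', and
+`never_set') are invented for this example:
+

```shell
typing_demo=$(echo '10 9' | awk '{
    fields = ($1 > $2)              # numeric strings: compared as numbers
    consts = ("10" > "9")           # string constants: compared as strings
    forced = ("10" + 0 > "9" + 0)   # adding zero forces numeric values
    print fields, consts, forced, "abc" + 0, never_set + 0
}')
echo "$typing_demo"
```

+   The two fields compare numerically (10 is greater than 9), while
+the string constants compare as strings (`"10"' sorts before `"9"').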
+
+   *Note Variables::, for more information on variable naming and
+initialization; *note Conversion of Strings and Numbers: Conversion.,
+for more information on how variable values are interpreted.
+
+
+File: gawk.info,  Node: Rules Summary,  Next: Actions Summary,  Prev: Variables/Fields,  Up: Gawk Summary
+
+Patterns
+========
+
+* Menu:
+
+* Pattern Summary::             Quick overview of patterns.
+* Regexp Summary::              Quick overview of regular expressions.
+
+   An `awk' program is mostly composed of rules, each consisting of a
+pattern followed by an action.  The action is enclosed in `{' and `}'.
+Either the pattern may be missing, or the action may be missing, but
+not both.  If the pattern is missing, the action is executed for every
+input record.  A missing action is equivalent to `{ print }', which
+prints the entire line.
+
+   Comments begin with the `#' character, and continue until the end of
+the line.  Blank lines may be used to separate statements.  Statements
+normally end with a newline; however, this is not the case for lines
+ending in a `,', `{', `?', `:', `&&', or `||'.  Lines ending in `do' or
+`else' also have their statements automatically continued on the
+following line.  In other cases, a line can be continued by ending it
+with a `\', in which case the newline is ignored.
+
+   Multiple statements may be put on one line by separating each one
+with a `;'.  This applies to both the statements within the action part
+of a rule (the usual case), and to the rule statements.
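+
+   As a small illustration of rules with a missing action or a missing
+pattern (the input lines are arbitrary sample data):
+

```shell
# Pattern with no action: matching records are printed.
no_action=$(printf 'apple\nbanana\n' | awk '/an/')
# Action with no pattern: the action runs for every record.
no_pattern=$(printf 'apple\nbanana\n' | awk '{ print NR }')
echo "$no_action"
echo "$no_pattern"
```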
+
+   *Note Comments in `awk' Programs: Comments, for information on
+`awk''s commenting convention; *note `awk' Statements Versus Lines:
+Statements/Lines., for a description of the line continuation mechanism
+in `awk'.
+
+
+File: gawk.info,  Node: Pattern Summary,  Next: Regexp Summary,  Prev: Rules Summary,  Up: Rules Summary
+
+Pattern Summary
+---------------
+
+   `awk' patterns may be one of the following:
+
+     /REGULAR EXPRESSION/
+     RELATIONAL EXPRESSION
+     PATTERN && PATTERN
+     PATTERN || PATTERN
+     PATTERN ? PATTERN : PATTERN
+     (PATTERN)
+     ! PATTERN
+     PATTERN1, PATTERN2
+     BEGIN
+     END
+
+   `BEGIN' and `END' are two special kinds of patterns that are not
+tested against the input.  The action parts of all `BEGIN' rules are
+concatenated as if all the statements had been written in a single
+`BEGIN' rule.  They are executed before any of the input is read.
+Similarly, all the `END' rules are concatenated, and executed when all
+the input is exhausted (or when an `exit' statement is executed).
+`BEGIN' and `END' patterns cannot be combined with other patterns in
+pattern expressions.  `BEGIN' and `END' rules cannot have missing
+action parts.
+
+   For `/REGULAR-EXPRESSION/' patterns, the associated statement is
+executed for each input record that matches the regular expression.
+Regular expressions are summarized below.
+
+   A RELATIONAL EXPRESSION may use any of the operators defined below in
+the section on actions.  These generally test whether certain fields
+match certain regular expressions.
+
+   The `&&', `||', and `!' operators are logical "and," logical "or,"
+and logical "not," respectively, as in C.  They do short-circuit
+evaluation, also as in C, and are used for combining more primitive
+pattern expressions.  As in most languages, parentheses may be used to
+change the order of evaluation.
+
+   The `?:' operator is like the same operator in C.  If the first
+pattern matches, then the second pattern is matched against the input
+record; otherwise, the third is matched.  Only one of the second and
+third patterns is matched.
+
+   The `PATTERN1, PATTERN2' form of a pattern is called a range
+pattern.  It matches all input lines starting with a line that matches
+PATTERN1, and continuing until a line that matches PATTERN2, inclusive.
+A range pattern cannot be used as an operand of any of the pattern
+operators.
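+
+   The sketch below combines `BEGIN', `END', and a range pattern in
+one program.  The marker words `start' and `end' in the input are
+assumptions made for the example:
+

```shell
range_demo=$(printf '1\nstart\n2\n3\nend\n4\n' | awk '
    BEGIN          { print "begin" }        # runs before any input is read
    /start/, /end/ { print }                # range pattern, inclusive
    END            { print NR " records" }  # runs after all input
')
echo "$range_demo"
```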
+
+   *Note Pattern Elements: Pattern Overview.
+
+
+File: gawk.info,  Node: Regexp Summary,  Prev: Pattern Summary,  Up: Rules Summary
+
+Regular Expressions
+-------------------
+
+   Regular expressions are based on POSIX EREs (extended regular
+expressions).  The escape sequences allowed in string constants are
+also valid in regular expressions (*note Escape Sequences::).  Regexps
+are composed of characters as follows:
+
+`C'
+     matches the character C (assuming C is none of the characters
+     listed below).
+
+`\C'
+     matches the literal character C.
+
+`.'
+     matches any character, _including_ newline.  In strict POSIX mode,
+     `.' does not match the NUL character, which is a character with
+     all bits equal to zero.
+
+`^'
+     matches the beginning of a string.
+
+`$'
+     matches the end of a string.
+
+`[ABC...]'
+     matches any of the characters ABC... (character list).
+
+`[[:CLASS:]]'
+     matches any character in the character class CLASS. Allowable
+     classes are `alnum', `alpha', `blank', `cntrl', `digit', `graph',
+     `lower', `print', `punct', `space', `upper', and `xdigit'.
+
+`[[.SYMBOL.]]'
+     matches the multi-character collating symbol SYMBOL.  `gawk' does
+     not currently support collating symbols.
+
+`[[=CLASSNAME=]]'
+     matches any of the equivalent characters in the current locale
+     named by the equivalence class CLASSNAME.  `gawk' does not
+     currently support equivalence classes.
+
+`[^ABC...]'
+     matches any character except ABC... (negated character list).
+
+`R1|R2'
+     matches either R1 or R2 (alternation).
+
+`R1R2'
+     matches R1, and then R2 (concatenation).
+
+`R+'
+     matches one or more R's.
+
+`R*'
+     matches zero or more R's.
+
+`R?'
+     matches zero or one R's.
+
+`(R)'
+     matches R (grouping).
+
+`R{N}'
+`R{N,}'
+`R{N,M}'
+     matches exactly N, at least N, or from N to M occurrences of R
+     (interval expressions).
+
+`\y'
+     matches the empty string at either the beginning or the end of a
+     word.
+
+`\B'
+     matches the empty string within a word.
+
+`\<'
+     matches the empty string at the beginning of a word.
+
+`\>'
+     matches the empty string at the end of a word.
+
+`\w'
+     matches any word-constituent character (alphanumeric characters and
+     the underscore).
+
+`\W'
+     matches any character that is not word-constituent.
+
+`\`'
+     matches the empty string at the beginning of a buffer (same as a
+     string in `gawk').
+
+`\''
+     matches the empty string at the end of a buffer.
+
+   The various command line options control how `gawk' interprets
+characters in regexps.
+
+No options
+     In the default case, `gawk' provides all the facilities of POSIX
+     regexps and the GNU regexp operators described above.  However,
+     interval expressions are not supported.
+
+`--posix'
+     Only POSIX regexps are supported; the GNU operators are not special
+     (e.g., `\w' matches a literal `w').  Interval expressions are
+     allowed.
+
+`--traditional'
+     Traditional Unix `awk' regexps are matched. The GNU operators are
+     not special, interval expressions are not available, and neither
+     are the POSIX character classes (`[[:alnum:]]' and so on).
+     Characters described by octal and hexadecimal escape sequences are
+     treated literally, even if they represent regexp metacharacters.
+
+`--re-interval'
+     Allow interval expressions in regexps, even if `--traditional' has
+     been provided.
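+
+   A brief sketch using only operators common to POSIX `awk'
+implementations (anchors, a character list, `+', grouping, and
+alternation); the sample input is invented:
+

```shell
regexp_demo=$(printf '42\nx7\ncat\n' | awk '
    /^[0-9]+$/    { print $0 ": all digits" }
    /^(cat|dog)$/ { print $0 ": cat or dog" }
')
echo "$regexp_demo"
```

+   The line `x7' matches neither rule, so nothing is printed for it.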
+
+   *Note Regular Expressions: Regexp.
+
+
+File: gawk.info,  Node: Actions Summary,  Next: Functions Summary,  Prev: Rules Summary,  Up: Gawk Summary
+
+Actions
+=======
+
+   Action statements are enclosed in braces, `{' and `}'.  A missing
+action statement is equivalent to `{ print }'.
+
+   Action statements consist of the usual assignment, conditional, and
+looping statements found in most languages.  The operators, control
+statements, and Input/Output statements available are similar to those
+in C.
+
+   Comments begin with the `#' character, and continue until the end of
+the line.  Blank lines may be used to separate statements.  Statements
+normally end with a newline; however, this is not the case for lines
+ending in a `,', `{', `?', `:', `&&', or `||'.  Lines ending in `do' or
+`else' also have their statements automatically continued on the
+following line.  In other cases, a line can be continued by ending it
+with a `\', in which case the newline is ignored.
+
+   Multiple statements may be put on one line by separating each one
+with a `;'.  This applies to both the statements within the action part
+of a rule (the usual case), and to the rule statements.
+
+   *Note Comments in `awk' Programs: Comments, for information on
+`awk''s commenting convention; *note `awk' Statements Versus Lines:
+Statements/Lines., for a description of the line continuation mechanism
+in `awk'.
+
+* Menu:
+
+* Operator Summary::            `awk' operators.
+* Control Flow Summary::        The control statements.
+* I/O Summary::                 The I/O statements.
+* Printf Summary::              A summary of `printf'.
+* Special File Summary::        Special file names interpreted internally.
+* Built-in Functions Summary::  Built-in numeric and string functions.
+* Time Functions Summary::      Built-in time functions.
+* String Constants Summary::    Escape sequences in strings.
+
+
+File: gawk.info,  Node: Operator Summary,  Next: Control Flow Summary,  Prev: Actions Summary,  Up: Actions Summary
+
+Operators
+---------
+
+   The operators in `awk', in order of decreasing precedence, are:
+
+`(...)'
+     Grouping.
+
+`$'
+     Field reference.
+
+`++ --'
+     Increment and decrement, both prefix and postfix.
+
+`^'
+     Exponentiation (`**' may also be used, and `**=' for the assignment
+     operator, but they are not specified in the POSIX standard).
+
+`+ - !'
+     Unary plus, unary minus, and logical negation.
+
+`* / %'
+     Multiplication, division, and modulus.
+
+`+ -'
+     Addition and subtraction.
+
+`SPACE'
+     String concatenation.
+
+`< <= > >= != =='
+     The usual relational operators.
+
+`~ !~'
+     Regular expression match, negated match.
+
+`in'
+     Array membership.
+
+`&&'
+     Logical "and".
+
+`||'
+     Logical "or".
+
+`?:'
+     A conditional expression.  This has the form `EXPR1 ?  EXPR2 :
+     EXPR3'.  If EXPR1 is true, the value of the expression is EXPR2;
+     otherwise it is EXPR3.  Only one of EXPR2 and EXPR3 is evaluated.
+
+`= += -= *= /= %= ^='
+     Assignment.  Both absolute assignment (`VAR=VALUE') and operator
+     assignment (the other forms) are supported.
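+
+   The precedence table can be checked with a few expressions.  This
+is only a sketch; the variable names `a' through `d' are arbitrary:
+

```shell
precedence_demo=$(awk 'BEGIN {
    a = -2 ^ 2                     # exponentiation before unary minus
    b = 2 + 3 * 4                  # multiplication before addition
    c = 1 " " 2 + 3                # addition before concatenation
    d = (a < 0) ? "neg" : "pos"    # conditional expression
    print a
    print b
    print c
    print d
}')
echo "$precedence_demo"
```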
+
+   *Note Expressions::.
+
+
+File: gawk.info,  Node: Control Flow Summary,  Next: I/O Summary,  Prev: Operator Summary,  Up: Actions Summary
+
+Control Statements
+------------------
+
+   The control statements are as follows:
+
+     if (CONDITION) STATEMENT [ else STATEMENT ]
+     while (CONDITION) STATEMENT
+     do STATEMENT while (CONDITION)
+     for (EXPR1; EXPR2; EXPR3) STATEMENT
+     for (VAR in ARRAY) STATEMENT
+     break
+     continue
+     delete ARRAY[INDEX]
+     delete ARRAY
+     exit [ EXPRESSION ]
+     { STATEMENTS }
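+
+   A minimal sketch exercising several of these statements (the
+variables `sum' and `n' are hypothetical):
+

```shell
control_demo=$(awk 'BEGIN {
    for (i = 1; i <= 5; i++) {
        if (i == 2)
            continue             # skip 2
        if (i == 5)
            break                # stop before 5
        sum += i                 # adds 1, 3, and 4
    }
    n = 0
    do {
        n++                      # body runs before the test
    } while (n < 3)
    print sum, n
}')
echo "$control_demo"
```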
+
+   *Note Control Statements in Actions: Statements.
+
+
+File: gawk.info,  Node: I/O Summary,  Next: Printf Summary,  Prev: Control Flow Summary,  Up: Actions Summary
+
+I/O Statements
+--------------
+
+   The Input/Output statements are as follows:
+
+`getline'
+     Set `$0' from next input record; set `NF', `NR', `FNR'.  *Note
+     Explicit Input with `getline': Getline.
+
+`getline <FILE'
+     Set `$0' from next record of FILE; set `NF'.
+
+`getline VAR'
+     Set VAR from next input record; set `NR', `FNR'.
+
+`getline VAR <FILE'
+     Set VAR from next record of FILE.
+
+`COMMAND | getline'
+     Run COMMAND, piping its output into `getline'; sets `$0', `NF',
+     `NR'.
+
+`COMMAND | getline VAR'
+     Run COMMAND, piping its output into `getline'; sets VAR.
+
+`next'
+     Stop processing the current input record.  The next input record
+     is read and processing starts over with the first pattern in the
+     `awk' program.  If the end of the input data is reached, the `END'
+     rule(s), if any, are executed.  *Note The `next' Statement: Next
+     Statement.
+
+`nextfile'
+     Stop processing the current input file.  The next input record
+     read comes from the next input file.  `FILENAME' is updated, `FNR'
+     is set to one, `ARGIND' is incremented, and processing starts over
+     with the first pattern in the `awk' program.  If the end of the
+     input data is reached, the `END' rule(s), if any, are executed.
+     Earlier versions of `gawk' used `next file'; this usage is still
+     supported, but is considered to be deprecated.  *Note The
+     `nextfile' Statement: Nextfile Statement.
+
+`print'
+     Prints the current record.  *Note Printing Output: Printing.
+
+`print EXPR-LIST'
+     Prints expressions.
+
+`print EXPR-LIST > FILE'
+     Prints expressions to FILE. If FILE does not exist, it is created.
+     If it does exist, its contents are deleted the first time the
+     `print' is executed.
+
+`print EXPR-LIST >> FILE'
+     Prints expressions to FILE.  The previous contents of FILE are
+     retained, and the output of `print' is appended to the file.
+
+`print EXPR-LIST | COMMAND'
+     Prints expressions, sending the output down a pipe to COMMAND.
+     The pipeline to the command stays open until the `close' function
+     is called.
+
+`printf FMT, EXPR-LIST'
+     Format and print.
+
+`printf FMT, EXPR-LIST > FILE'
+     Format and print to FILE. If FILE does not exist, it is created.
+     If it does exist, its contents are deleted the first time the
+     `printf' is executed.
+
+`printf FMT, EXPR-LIST >> FILE'
+     Format and print to FILE.  The previous contents of FILE are
+     retained, and the output of `printf' is appended to the file.
+
+`printf FMT, EXPR-LIST | COMMAND'
+     Format and print, sending the output down a pipe to COMMAND.  The
+     pipeline to the command stays open until the `close' function is
+     called.
+
+   `getline' returns one if it reads a record, zero on end of file,
+and -1 on an error.  In the event of an error, `getline' sets `ERRNO'
+to a system-dependent string that describes the error.
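+
+   As a sketch of the `COMMAND | getline VAR' form, the loop below
+reads every line a command produces and then closes the pipe.  The
+command string and the names `cmd', `line', and `count' are
+assumptions made for the example:
+

```shell
getline_demo=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)   # stops at end of file or on error
        count++
    close(cmd)                         # close the pipe when done
    print count, line                  # line keeps the last value read
}')
echo "$getline_demo"
```

+   Comparing the return value with `> 0' keeps the loop from running
+forever if `getline' returns -1.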
+
+
+File: gawk.info,  Node: Printf Summary,  Next: Special File Summary,  Prev: I/O Summary,  Up: Actions Summary
+
+`printf' Summary
+----------------
+
+   Conversion specifications have the form
+`%'[FLAG][WIDTH][`.'PREC]FORMAT.  Items in brackets are optional.
+
+   The `awk' `printf' statement and `sprintf' function accept the
+following conversion specification formats:
+
+`%c'
+     An ASCII character.  If the argument used for `%c' is numeric, it
+     is treated as a character and printed.  Otherwise, the argument is
+     assumed to be a string, and only the first character of that
+     string is printed.
+
+`%d'
+`%i'
+     A decimal number (the integer part).
+
+`%e'
+`%E'
+     A floating point number of the form `[-]d.dddddde[+-]dd'.  The
+     `%E' format uses `E' instead of `e'.
+
+`%f'
+     A floating point number of the form [`-']`ddd.dddddd'.
+
+`%g'
+`%G'
+     Use either the `%e' or `%f' formats, whichever produces a shorter
+     string, with non-significant zeros suppressed.  `%G' will use `%E'
+     instead of `%e'.
+
+`%o'
+     An unsigned octal number (again, an integer).
+
+`%s'
+     A character string.
+
+`%x'
+`%X'
+     An unsigned hexadecimal number (an integer).  The `%X' format uses
+     `A' through `F' instead of `a' through `f' for decimal 10 through
+     15.
+
+`%%'
+     A single `%' character; no argument is converted.
+
+   There are optional, additional parameters that may lie between the
+`%' and the control letter:
+
+`-'
+     The expression should be left-justified within its field.
+
+`SPACE'
+     For numeric conversions, prefix positive values with a space, and
+     negative values with a minus sign.
+
+`+'
+     The plus sign, used before the width modifier (see below), says to
+     always supply a sign for numeric conversions, even if the data to
+     be formatted is positive. The `+' overrides the space modifier.
+
+`#'
+     Use an "alternate form" for certain control letters.  For `o',
+     supply a leading zero.  For `x' and `X', supply a leading `0x' or
+     `0X' for a non-zero result.  For `e', `E', and `f', the result
+     will always contain a decimal point.  For `g' and `G', trailing
+     zeros are not removed from the result.
+
+`0'
+     A leading `0' (zero) acts as a flag that indicates output should
+     be padded with zeros instead of spaces.  This applies even to
+     non-numeric output formats.  This flag only has an effect when the
+     field width is wider than the value to be printed.
+
+`WIDTH'
+     The field should be padded to this width. The field is normally
+     padded with spaces.  If the `0' flag has been used, it is padded
+     with zeros.
+
+`.PREC'
+     A number that specifies the precision to use when printing.  For
+     the `e', `E', and `f' formats, this specifies the number of digits
+     you want printed to the right of the decimal point.  For the `g'
+     and `G' formats, it specifies the maximum number of significant
+     digits.  For the `d', `o', `i', `u', `x', and `X' formats, it
+     specifies the minimum number of digits to print.  For the `s'
+     format, it specifies the maximum number of characters from the
+     string that should be printed.
+
+   Either or both of the WIDTH and PREC values may be specified as `*'.
+In that case, the particular value is taken from the argument list.
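+
+   The following sketch shows a few conversions, flags, widths, and
+precisions together, including a `*' width taken from the argument
+list.  The values being formatted are arbitrary:
+

```shell
printf_demo=$(awk 'BEGIN {
    # width and precision, left-justify, zero padding, hex, first char
    printf "%5.2f|%-5s|%05d|%x|%c\n", 3.14159, "ab", 42, 255, "hello"
    # width from the argument list, string precision, forced sign
    printf "%*d|%.3s|%+d\n", 4, 7, "abcdef", 5
}')
echo "$printf_demo"
```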
+
+   *Note Using `printf' Statements for Fancier Printing: Printf.
+
+
+File: gawk.info,  Node: Special File Summary,  Next: Built-in Functions Summary,  Prev: Printf Summary,  Up: Actions Summary
+
+Special File Names
+------------------
+
+   When doing I/O redirection from either `print' or `printf' into a
+file, or via `getline' from a file, `gawk' recognizes certain special
+file names internally.  These file names allow access to open file
+descriptors inherited from `gawk''s parent process (usually the shell).
+The file names are:
+
+`/dev/stdin'
+     The standard input.
+
+`/dev/stdout'
+     The standard output.
+
+`/dev/stderr'
+     The standard error output.
+
+`/dev/fd/N'
+     The file denoted by the open file descriptor N.
+
+   In addition, reading the following files provides process-related
+information about the running `gawk' program.  All returned records are
+terminated with a newline.
+
+`/dev/pid'
+     Returns the process ID of the current process.
+
+`/dev/ppid'
+     Returns the parent process ID of the current process.
+
+`/dev/pgrpid'
+     Returns the process group ID of the current process.
+
+`/dev/user'
+     At least four space-separated fields, containing the return values
+     of the `getuid', `geteuid', `getgid', and `getegid' system calls.
+     If there are any additional fields, they are the group IDs
+     returned by the `getgroups' system call.  (Multiple groups may not be
+     supported on all systems.)
+
+These file names may also be used on the command line to name data
+files.  These file names are only recognized internally if you do not
+actually have files with these names on your system.
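+
+   For instance, a diagnostic can be sent to the standard error while
+normal output goes to the standard output.  This is only a sketch, and
+the message texts are invented:
+

```shell
stdout_only=$(awk 'BEGIN {
    print "result"                       # goes to standard output
    print "warning" > "/dev/stderr"      # goes to standard error
}' 2>/dev/null)
echo "$stdout_only"
```

+   Because the shell discards the standard error here, only `result'
+is captured.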
+
+   *Note Special File Names in `gawk': Special Files, for a longer
+description that provides the motivation for this feature.
+
+
+File: gawk.info,  Node: Built-in Functions Summary,  Next: Time Functions Summary,  Prev: Special File Summary,  Up: Actions Summary
+
+Built-in Functions
+------------------
+
+   `awk' provides a number of built-in functions for performing numeric
+operations, string related operations, and I/O related operations.
+
+   The built-in arithmetic functions are:
+
+`atan2(Y, X)'
+     the arctangent of Y/X in radians.
+
+`cos(EXPR)'
+     the cosine of EXPR, which is in radians.
+
+`exp(EXPR)'
+     the exponential function (`e ^ EXPR').
+
+`int(EXPR)'
+     truncates to integer.
+
+`log(EXPR)'
+     the natural logarithm of EXPR.
+
+`rand()'
+     a random number between zero and one.
+
+`sin(EXPR)'
+     the sine of EXPR, which is in radians.
+
+`sqrt(EXPR)'
+     the square root function.
+
+`srand([EXPR])'
+     use EXPR as a new seed for the random number generator.  If no EXPR
+     is provided, the time of day is used.  The return value is the
+     previous seed for the random number generator.
+
+   `awk' has the following built-in string functions:
+
+`gensub(REGEX, SUBST, HOW [, TARGET])'
+     If HOW is a string beginning with `g' or `G', then replace each
+     match of REGEX in TARGET with SUBST.  Otherwise, replace the
+     HOW'th occurrence. If TARGET is not supplied, use `$0'.  The
+     return value is the changed string; the original TARGET is not
+     modified. Within SUBST, `\N', where N is a digit from one to nine,
+     can be used to indicate the text that matched the N'th
+     parenthesized subexpression.  This function is `gawk'-specific.
+
+`gsub(REGEX, SUBST [, TARGET])'
+     for each substring matching the regular expression REGEX in the
+     string TARGET, substitute the string SUBST, and return the number
+     of substitutions. If TARGET is not supplied, use `$0'.
+
+`index(STR, SEARCH)'
+     returns the index of the string SEARCH in the string STR, or zero
+     if SEARCH is not present.
+
+`length([STR])'
+     returns the length of the string STR.  The length of `$0' is
+     returned if no argument is supplied.
+
+`match(STR, REGEX)'
+     returns the position in STR where the regular expression REGEX
+     occurs, or zero if REGEX is not present, and sets the values of
+     `RSTART' and `RLENGTH'.
+
+`split(STR, ARR [, REGEX])'
+     splits the string STR into the array ARR on the regular expression
+     REGEX, and returns the number of elements.  If REGEX is omitted,
+     `FS' is used instead. REGEX can be the null string, causing each
+     character to be placed into its own array element.  The array ARR
+     is cleared first.
+
+`sprintf(FMT, EXPR-LIST)'
+     prints EXPR-LIST according to FMT, and returns the resulting
+     string.
+
+`sub(REGEX, SUBST [, TARGET])'
+     just like `gsub', but only the first matching substring is
+     replaced.
+
+`substr(STR, INDEX [, LEN])'
+     returns the LEN-character substring of STR starting at INDEX.  If
+     LEN is omitted, the rest of STR is used.
+
+`tolower(STR)'
+     returns a copy of the string STR, with all the upper-case
+     characters in STR translated to their corresponding lower-case
+     counterparts.  Non-alphabetic characters are left unchanged.
+
+`toupper(STR)'
+     returns a copy of the string STR, with all the lower-case
+     characters in STR translated to their corresponding upper-case
+     counterparts.  Non-alphabetic characters are left unchanged.
+
+   The I/O related functions are:
+
+`close(EXPR)'
+     Close the open file or pipe denoted by EXPR.
+
+`fflush([EXPR])'
+     Flush any buffered output for the output file or pipe denoted by
+     EXPR.  If EXPR is omitted, standard output is flushed.  If EXPR is
+     the null string (`""'), all output buffers are flushed.
+
+`system(CMD-LINE)'
+     Execute the command CMD-LINE, and return the exit status.  If your
+     operating system does not support `system', calling it will
+     generate a fatal error.
+
+     `system("")' can be used to force `awk' to flush any pending
+     output.  This is more portable, but less obvious, than calling
+     `fflush'.
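+
+   A short sketch of several of the string and arithmetic functions
+listed above, limited to functions that are not `gawk'-specific.  The
+strings and the names `s', `n', and `parts' are chosen only for
+illustration:
+

```shell
string_demo=$(awk 'BEGIN {
    s = "hello world"
    print index(s, "wor")               # position of "wor" in s
    print substr(s, 1, 5)               # first five characters
    n = gsub(/o/, "0", s)               # replace every "o" in s
    print n, s
    if (match("foobar", /o+/))          # sets RSTART and RLENGTH
        print RSTART, RLENGTH
    n = split("a:b:c", parts, ":")      # returns the element count
    print n, parts[3]
    print toupper("abc"), int(-3.7), sqrt(16)
}')
echo "$string_demo"
```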
+
+
+File: gawk.info,  Node: Time Functions Summary,  Next: String Constants Summary,  Prev: Built-in Functions Summary,  Up: Actions Summary
+
+Time Functions
+--------------
+
+   The following two functions are available for getting the current
+time of day, and for formatting time stamps.  They are specific to
+`gawk'.
+
+`systime()'
+     returns the current time of day as the number of seconds since a
+     particular epoch (Midnight, January 1, 1970 UTC, on POSIX systems).
+
+`strftime([FORMAT[, TIMESTAMP]])'
+     formats TIMESTAMP according to the specification in FORMAT.  The
+     current time of day is used if no TIMESTAMP is supplied.  A
+     default format equivalent to the output of the `date' utility is
+     used if no FORMAT is supplied.  *Note Functions for Dealing with
+     Time Stamps: Time Functions, for the details on the conversion
+     specifiers that `strftime' accepts.
+
+
+File: gawk.info,  Node: String Constants Summary,  Prev: Time Functions Summary,  Up: Actions Summary
+
+String Constants
+----------------
+
+   String constants in `awk' are sequences of characters enclosed in
+double quotes (`"').  Within strings, certain "escape sequences" are
+recognized, as in C.  These are:
+
+`\\'
+     A literal backslash.
+
+`\a'
+     The "alert" character; usually the ASCII BEL character.
+
+`\b'
+     Backspace.
+
+`\f'
+     Formfeed.
+
+`\n'
+     Newline.
+
+`\r'
+     Carriage return.
+
+`\t'
+     Horizontal tab.
+
+`\v'
+     Vertical tab.
+
+`\xHEX DIGITS'
+     The character represented by the string of hexadecimal digits
+     following the `\x'.  As in ANSI C, all following hexadecimal
+     digits are considered part of the escape sequence.  E.g., `"\x1B"'
+     is a string containing the ASCII ESC (escape) character.  (The `\x'
+     escape sequence is not in POSIX `awk'.)
+
+`\DDD'
+     The character represented by the one, two, or three digit sequence
+     of octal digits.  Thus, `"\033"' is also a string containing the
+     ASCII ESC (escape) character.
+
+`\C'
+     The literal character C, if C is not one of the above.
+
+   The escape sequences may also be used inside constant regular
+expressions (e.g., the regexp `/[ \t\f\n\r\v]/' matches whitespace
+characters).
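+
+   As a sketch of escape sequences in both a string constant and a
+constant regexp (the strings themselves are arbitrary examples):
+

```shell
escape_demo=$(awk 'BEGIN {
    s = "a\tb\\c"                         # a, TAB, b, backslash, c
    print length(s)
    print "\101\102"                      # octal escapes for A and B
    n = split("x y\tz", parts, /[ \t]/)   # split on a space or a TAB
    print n
}')
echo "$escape_demo"
```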
+
+   *Note Escape Sequences::.
+
+
+File: gawk.info,  Node: Functions Summary,  Next: Historical Features,  Prev: Actions Summary,  Up: Gawk Summary
+
+User-defined Functions
+======================
+
+   Functions in `awk' are defined as follows:
+
+     function NAME(PARAMETER LIST) { STATEMENTS }
+
+   Actual parameters supplied in the function call are used to
+instantiate the formal parameters declared in the function.  Arrays are
+passed by reference, other variables are passed by value.
+
+   If there are fewer arguments passed than there are names in
+PARAMETER-LIST, the extra names are given the null string as their
+value.  Extra names have the effect of local variables.
+
+   The open-parenthesis in a function call of a user-defined function
+must immediately follow the function name, without any intervening
+white space.  This is to avoid a syntactic ambiguity with the
+concatenation operator.
+
+   The word `func' may be used in place of `function' (but not in POSIX
+`awk').
+
+   Use the `return' statement to return a value from a function.
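+
+   The parameter-passing rules can be seen in the sketch below.  The
+function names `fill' and `bump', and all variable names, are
+hypothetical:
+

```shell
function_demo=$(awk '
    # The extra parameter i acts as a local variable.
    function fill(arr, n,    i) {
        for (i = 1; i <= n; i++)
            arr[i] = i * i           # arrays are passed by reference
    }
    function bump(x) {
        x++                          # scalars are passed by value
        return x
    }
    BEGIN {
        fill(squares, 3)
        v = 5
        w = bump(v)
        print squares[3], v, w       # v is unchanged by the call
    }
')
echo "$function_demo"
```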
+
+   *Note User-defined Functions: User-defined.
+
+
+File: gawk.info,  Node: Historical Features,  Prev: Functions Summary,  Up: Gawk Summary
+
+Historical Features
+===================
+
+   There are two features of historical `awk' implementations that
+`gawk' supports.
+
+   First, it is possible to call the `length' built-in function not only
+with no arguments, but even without parentheses!
+
+     a = length
+
+is the same as either of
+
+     a = length()
+     a = length($0)
+
+For example:
+
+     $ echo abcdef | awk '{ print length }'
+     -| 6
+
+This feature is marked as "deprecated" in the POSIX standard, and
+`gawk' will issue a warning about its use if `--lint' is specified on
+the command line.  (The ability to use `length' this way was actually
+an accident of the original Unix `awk' implementation.  If any built-in
+function used `$0' as its default argument, it was possible to call
+that function without the parentheses.  In particular, it was common
+practice to use the `length' function in this fashion, and this usage
+was documented in the `awk' manual page.)
+
+   The other historical feature is the use of either the `break'
+statement, or the `continue' statement outside the body of a `while',
+`for', or `do' loop.  Traditional `awk' implementations have treated
+such usage as equivalent to the `next' statement.  More recent versions
+of Unix `awk' do not allow it. `gawk' supports this usage if
+`--traditional' has been specified.
+
+   *Note Command Line Options: Options, for more information about the
+`--posix' and `--lint' options.
+
+
+File: gawk.info,  Node: Installation,  Next: Notes,  Prev: Gawk Summary,  Up: Top
+
+Installing `gawk'
+*****************
+
+   This appendix provides instructions for installing `gawk' on the
+various platforms that are supported by the developers.  The primary
+developers support Unix (and one day, GNU), while the other ports were
+contributed.  The file `ACKNOWLEDGMENT' in the `gawk' distribution
+lists the electronic mail addresses of the people who did the
+respective ports, and they are also provided in *Note Reporting
+Problems and Bugs: Bugs.
+
+* Menu:
+
+* Gawk Distribution::           What is in the `gawk' distribution.
+* Unix Installation::           Installing `gawk' under various versions
+                                of Unix.
+* VMS Installation::            Installing `gawk' on VMS.
+* PC Installation::             Installing and Compiling `gawk' on MS-DOS
+                                and OS/2
+* Atari Installation::          Installing `gawk' on the Atari ST.
+* Amiga Installation::          Installing `gawk' on an Amiga.
+* Bugs::                        Reporting Problems and Bugs.
+* Other Versions::              Other freely available `awk'
+                                implementations.
+
+
+File: gawk.info,  Node: Gawk Distribution,  Next: Unix Installation,  Prev: Installation,  Up: Installation
+
+The `gawk' Distribution
+=======================
+
+   This section describes how to get the `gawk' distribution, how to
+extract it, and what is in the various files and subdirectories.
+
+* Menu:
+
+* Getting::                     How to get the distribution.
+* Extracting::                  How to extract the distribution.
+* Distribution contents::       What is in the distribution.
+
+
+File: gawk.info,  Node: Getting,  Next: Extracting,  Prev: Gawk Distribution,  Up: Gawk Distribution
+
+Getting the `gawk' Distribution
+-------------------------------
+
+   There are three ways you can get GNU software.
+
+  1. You can copy it from someone else who already has it.
+
+  2. You can order `gawk' directly from the Free Software Foundation.
+     Software distributions are available for Unix, MS-DOS, and VMS, on
+     tape, CD-ROM, or floppies (MS-DOS only).  The address is:
+
+          Free Software Foundation
+          59 Temple Place--Suite 330
+          Boston, MA  02111-1307 USA
+          Phone: +1-617-542-5942
+          Fax (including Japan): +1-617-542-2652
+          E-mail: `gnu@prep.ai.mit.edu'
+
+     Ordering from the FSF directly contributes to the support of the
+     foundation and to the production of more free software.
+
+  3. You can get `gawk' by using anonymous `ftp' to the Internet host
+     `ftp.gnu.ai.mit.edu', in the directory `/pub/gnu'.
+
+     Here is a list of alternate `ftp' sites from which you can obtain
+     GNU software.  When a site is listed as "SITE`:'DIRECTORY" the
+     DIRECTORY indicates the directory where GNU software is kept.  You
+     should use a site that is geographically close to you.
+
+    Asia:
+
+         `cair-archive.kaist.ac.kr:/pub/gnu'
+         `ftp.cs.titech.ac.jp'
+         `ftp.nectec.or.th:/pub/mirrors/gnu'
+         `utsun.s.u-tokyo.ac.jp:/ftpsync/prep'
+
+    Australia:
+
+         `archie.au:/gnu'
+               (`archie.oz' or `archie.oz.au' for ACSnet)
+
+    Africa:
+
+         `ftp.sun.ac.za:/pub/gnu'
+
+    Middle East:
+
+         `ftp.technion.ac.il:/pub/unsupported/gnu'
+
+    Europe:
+
+         `archive.eu.net'
+         `ftp.denet.dk'
+         `ftp.eunet.ch'
+         `ftp.funet.fi:/pub/gnu'
+         `ftp.ieunet.ie:pub/gnu'
+         `ftp.informatik.rwth-aachen.de:/pub/gnu'
+         `ftp.informatik.tu-muenchen.de'
+         `ftp.luth.se:/pub/unix/gnu'
+         `ftp.mcc.ac.uk'
+         `ftp.stacken.kth.se'
+         `ftp.sunet.se:/pub/gnu'
+         `ftp.univ-lyon1.fr:pub/gnu'
+         `ftp.win.tue.nl:/pub/gnu'
+         `irisa.irisa.fr:/pub/gnu'
+         `isy.liu.se'
+         `nic.switch.ch:/mirror/gnu'
+         `src.doc.ic.ac.uk:/gnu'
+         `unix.hensa.ac.uk:/pub/uunet/systems/gnu'
+
+    South America:
+
+         `ftp.inf.utfsm.cl:/pub/gnu'
+         `ftp.unicamp.br:/pub/gnu'
+
+    Western Canada:
+
+         `ftp.cs.ubc.ca:/mirror2/gnu'
+
+    USA:
+
+         `col.hp.com:/mirrors/gnu'
+         `f.ms.uky.edu:/pub3/gnu'
+         `ftp.cc.gatech.edu:/pub/gnu'
+         `ftp.cs.columbia.edu:/archives/gnu/prep'
+         `ftp.digex.net:/pub/gnu'
+         `ftp.hawaii.edu:/mirrors/gnu'
+         `ftp.kpc.com:/pub/mirror/gnu'
+         `ftp.uu.net:/systems/gnu'
+         `gatekeeper.dec.com:/pub/GNU'
+         `jaguar.utah.edu:/gnustuff'
+         `labrea.stanford.edu'
+         `mrcnext.cso.uiuc.edu:/pub/gnu'
+         `vixen.cso.uiuc.edu:/gnu'
+         `wuarchive.wustl.edu:/systems/gnu'
+
+
+File: gawk.info,  Node: Extracting,  Next: Distribution contents,  Prev: Getting,  Up: Gawk Distribution
+
+Extracting the Distribution
+---------------------------
+
+   `gawk' is distributed as a `tar' file compressed with the GNU Zip
+program, `gzip'.
+
+   Once you have the distribution (for example, `gawk-3.0.1.tar.gz'),
+first use `gzip' to expand the file, and then use `tar' to extract it.
+You can use the following pipeline to produce the `gawk' distribution:
+
+     # Under System V, add 'o' to the tar flags
+     gzip -d -c gawk-3.0.1.tar.gz | tar -xvpf -
+
+This will create a directory named `gawk-3.0.1' in the current
+directory.
+
+   The distribution file name is of the form `gawk-V.R.N.tar.gz'.  The
+V represents the major version of `gawk', the R represents the current
+release of version V, and the N represents a "patch level", meaning
+that minor bugs have been fixed in the release.  The current patch
+level is 1, but when retrieving distributions, you should get the
+version with the highest version, release, and patch level.  (Note that
+release levels greater than or equal to 90 denote "beta," or
+non-production software; you may not wish to retrieve such a version
+unless you don't mind experimenting.)
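+
+   As an illustration of the naming scheme, here is a minimal sketch
+that splits a distribution file name into its three parts and applies
+the "release 90 or above means beta" rule.  It assumes a POSIX shell
+and a well-formed name; the function `parse_gawk_name' is hypothetical
+and is not part of the `gawk' distribution.
+
```shell
# parse_gawk_name FILE: report the version, release, and patch level
# encoded in a name of the form gawk-V.R.N.tar.gz.
# Hypothetical helper, for illustration only.
parse_gawk_name() {
    ver=${1##*/}            # drop any leading directory
    ver=${ver#gawk-}        # drop the "gawk-" prefix
    ver=${ver%.tar.gz}      # drop the ".tar.gz" suffix
    major=${ver%%.*}        # V: text before the first dot
    rest=${ver#*.}          # R.N
    release=${rest%%.*}     # R
    patch=${rest#*.}        # N
    if [ "$release" -ge 90 ]; then
        kind=beta
    else
        kind=production
    fi
    echo "version=$major release=$release patch=$patch kind=$kind"
}

parse_gawk_name gawk-3.0.1.tar.gz
```
+
+   For `gawk-3.0.1.tar.gz' this prints `version=3 release=0 patch=1
+kind=production'.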
+
+   If you are not on a Unix system, you will need to make other
+arrangements for getting and extracting the `gawk' distribution.  You
+should consult a local expert.
+
+
+File: gawk.info,  Node: Distribution contents,  Prev: Extracting,  Up: Gawk Distribution
+
+Contents of the `gawk' Distribution
+-----------------------------------
+
+   The `gawk' distribution has a number of C source files,
+documentation files, subdirectories and files related to the
+configuration process (*note Compiling and Installing `gawk' on Unix:
+Unix Installation.), and several subdirectories related to different,
+non-Unix, operating systems.
+
+various `.c', `.y', and `.h' files
+     These files are the actual `gawk' source code.
+
+`README'
+`README_d/README.*'
+     Descriptive files: `README' for `gawk' under Unix, and the rest
+     for the various hardware and software combinations.
+
+`INSTALL'
+     A file providing an overview of the configuration and installation
+     process.
+
+`PORTS'
+     A list of systems to which `gawk' has been ported, and which have
+     successfully run the test suite.
+
+`ACKNOWLEDGMENT'
+     A list of the people who contributed major parts of the code or
+     documentation.
+
+`ChangeLog'
+     A detailed list of source code changes as bugs are fixed or
+     improvements made.
+
+`NEWS'
+     A list of changes to `gawk' since the last release or patch.
+
+`COPYING'
+     The GNU General Public License.
+
+`FUTURES'
+     A brief list of features and/or changes being contemplated for
+     future releases, with some indication of the time frame for the
+     feature, based on its difficulty.
+
+`LIMITATIONS'
+     A list of those factors that limit `gawk''s performance.  Most of
+     these depend on the hardware or operating system software, and are
+     not limits in `gawk' itself.
+
+`POSIX.STD'
+     A description of one area where the POSIX standard for `awk' is
+     incorrect, and how `gawk' handles the problem.
+
+`PROBLEMS'
+     A file describing known problems with the current release.
+
+`doc/awkforai.txt'
+     A short article describing why `gawk' is a good language for AI
+     (Artificial Intelligence) programming.
+
+`doc/README.card'
+`doc/ad.block'
+`doc/awkcard.in'
+`doc/cardfonts'
+`doc/colors'
+`doc/macros'
+`doc/no.colors'
+`doc/setter.outline'
+     The `troff' source for a five-color `awk' reference card.  A
+     modern version of `troff', such as GNU Troff (`groff') is needed
+     to produce the color version. See the file `README.card' for
+     instructions if you have an older `troff'.
+
+`doc/gawk.1'
+     The `troff' source for a manual page describing `gawk'.  This is
+     distributed for the convenience of Unix users.
+
+`doc/gawk.texi'
+     The Texinfo source file for this Info file.  It should be
+     processed with TeX to produce a printed document, and with
+     `makeinfo' to produce an Info file.
+
+`doc/gawk.info'
+     The generated Info file for this document.
+
+`doc/igawk.1'
+     The `troff' source for a manual page describing the `igawk'
+     program presented in *Note An Easy Way to Use Library Functions:
+     Igawk Program.
+
+`doc/Makefile.in'
+     The input file used during the configuration process to generate
+     the actual `Makefile' for creating the documentation.
+
+`Makefile.in'
+`acconfig.h'
+`aclocal.m4'
+`configh.in'
+`configure.in'
+`configure'
+`custom.h'
+`missing/*'
+     These files and subdirectory are used when configuring `gawk' for
+     various Unix systems.  They are explained in detail in *Note
+     Compiling and Installing `gawk' on Unix: Unix Installation.
+
+`awklib/extract.awk'
+`awklib/Makefile.in'
+     The `awklib' directory contains a copy of `extract.awk' (*note
+     Extracting Programs from Texinfo Source Files: Extract Program.),
+     which can be used to extract the sample programs from the Texinfo
+     source file for this Info file, and a `Makefile.in' file, which
+     `configure' uses to generate a `Makefile'.  As part of the process
+     of building `gawk', the library functions from *Note A Library of
+     `awk' Functions: Library Functions, and the `igawk' program from
+     *Note An Easy Way to Use Library Functions: Igawk Program, are
+     extracted into ready-to-use files.  They are installed as part of
+     the installation process.
+
+`amiga/*'
+     Files needed for building `gawk' on an Amiga.  *Note Installing
+     `gawk' on an Amiga: Amiga Installation, for details.
+
+`atari/*'
+     Files needed for building `gawk' on an Atari ST.  *Note Installing
+     `gawk' on the Atari ST: Atari Installation, for details.
+
+`pc/*'
+     Files needed for building `gawk' under MS-DOS and OS/2.  *Note
+     MS-DOS and OS/2 Installation and Compilation: PC Installation, for
+     details.
+
+`vms/*'
+     Files needed for building `gawk' under VMS.  *Note How to Compile
+     and Install `gawk' on VMS: VMS Installation, for details.
+
+`test/*'
+     A test suite for `gawk'.  You can use `make check' from the top
+     level `gawk' directory to run your version of `gawk' against the
+     test suite.  If `gawk' successfully passes `make check' then you
+     can be confident of a successful port.
+
+
+File: gawk.info,  Node: Unix Installation,  Next: VMS Installation,  Prev: Gawk Distribution,  Up: Installation
+
+Compiling and Installing `gawk' on Unix
+=======================================
+
+   Usually, you can compile and install `gawk' by typing only two
+commands.  However, if you do use an unusual system, you may need to
+configure `gawk' for your system yourself.
+
+* Menu:
+
+* Quick Installation::          Compiling `gawk' under Unix.
+* Configuration Philosophy::    How it's all supposed to work.
+
+
+File: gawk.info,  Node: Quick Installation,  Next: Configuration Philosophy,  Prev: Unix Installation,  Up: Unix Installation
+
+Compiling `gawk' for Unix
+-------------------------
+
+   After you have extracted the `gawk' distribution, `cd' to
+`gawk-3.0.1'.  Like most GNU software, `gawk' is configured
+automatically for your Unix system by running the `configure' program.
+This program is a Bourne shell script that was generated automatically
+using GNU `autoconf'.  (The `autoconf' software is described fully
+starting with *Note Introduction: (autoconf)Top.)
+
+   To configure `gawk', simply run `configure':
+
+     sh ./configure
+
+   This produces a `Makefile' and `config.h' tailored to your system.
+The `config.h' file describes various facts about your system.  You may
+wish to edit the `Makefile' to change the `CFLAGS' variable, which
+controls the command line options that are passed to the C compiler
+(such as optimization levels, or compiling for debugging).
+
+   Alternatively, you can add your own values for most `make'
+variables, such as `CC' and `CFLAGS', on the command line when running
+`configure':
+
+     CC=cc CFLAGS=-g sh ./configure
+
+See the file `INSTALL' in the `gawk' distribution for all the details.
+
+   After you have run `configure', and possibly edited the `Makefile',
+type:
+
+     make
+
+and shortly thereafter, you should have an executable version of `gawk'.
+That's all there is to it!  (If these steps do not work, please send in
+a bug report; *note Reporting Problems and Bugs: Bugs..)
+
+
+File: gawk.info,  Node: Configuration Philosophy,  Prev: Quick Installation,  Up: Unix Installation
+
+The Configuration Process
+-------------------------
+
+   (This section is of interest only if you know something about using
+the C language and the Unix operating system.)
+
+   The source code for `gawk' generally attempts to adhere to formal
+standards wherever possible.  This means that `gawk' uses library
+routines that are specified by the ANSI C standard and by the POSIX
+operating system interface standard.  When using an ANSI C compiler,
+function prototypes are used to help improve the compile-time checking.
+
+   Many Unix systems do not support all of either the ANSI or the POSIX
+standards.  The `missing' subdirectory in the `gawk' distribution
+contains replacement versions of those subroutines that are most likely
+to be missing.
+
+   The `config.h' file that is created by the `configure' program
+contains definitions that describe features of the particular operating
+system where you are attempting to compile `gawk'.  The three things
+described by this file are what header files are available, so that
+they can be correctly included, what (supposedly) standard functions
+are actually available in your C libraries, and other miscellaneous
+facts about your variant of Unix.  For example, there may not be an
+`st_blksize' element in the `stat' structure.  In this case
+`HAVE_ST_BLKSIZE' would be undefined.
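+
+   One way to see what `configure' concluded about your system is to
+look for the relevant `#define' in the generated `config.h'.  The
+following is a minimal sketch, assuming a POSIX shell and `grep'; the
+function `has_define' is hypothetical and is not part of the `gawk'
+distribution.
+
```shell
# has_define FILE MACRO: succeed if FILE contains an active
# "#define MACRO" line.  A commented-out "/* #undef MACRO */"
# line does not count.  Hypothetical helper, for illustration only.
has_define() {
    grep -E -q "^[[:space:]]*#[[:space:]]*define[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Example use, from the top-level gawk directory after running configure:
#   if has_define config.h HAVE_ST_BLKSIZE; then
#       echo "st_blksize is available"
#   else
#       echo "st_blksize is missing"
#   fi
```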
+
+   It is possible for your C compiler to lie to `configure'. It may do
+so by not exiting with an error when a library function is not
+available.  To get around this, you can edit the file `custom.h'.  Use
+an `#ifdef' that is appropriate for your system, and either `#define'
+any constants that `configure' should have defined but didn't, or
+`#undef' any constants that `configure' defined and should not have.
+`custom.h' is automatically included by `config.h'.
+
+   It is also possible that the `configure' program generated by
+`autoconf' will not work on your system in some other fashion.  If you
+do have a problem, the file `configure.in' is the input for `autoconf'.
+You may be able to change this file, and generate a new version of
+`configure' that will work on your system.  *Note Reporting Problems
+and Bugs: Bugs, for information on how to report problems in
+configuring `gawk'.  The same mechanism may be used to send in updates
+to `configure.in' and/or `custom.h'.
+
+
+File: gawk.info,  Node: VMS Installation,  Next: PC Installation,  Prev: Unix Installation,  Up: Installation
+
+How to Compile and Install `gawk' on VMS
+========================================
+
+   This section describes how to compile and install `gawk' under VMS.
+
+* Menu:
+
+* VMS Compilation::             How to compile `gawk' under VMS.
+* VMS Installation Details::    How to install `gawk' under VMS.
+* VMS Running::                 How to run `gawk' under VMS.
+* VMS POSIX::                   Alternate instructions for VMS POSIX.
+
+
+File: gawk.info,  Node: VMS Compilation,  Next: VMS Installation Details,  Prev: VMS Installation,  Up: VMS Installation
+
+Compiling `gawk' on VMS
+-----------------------
+
+   To compile `gawk' under VMS, there is a `DCL' command procedure that
+will issue all the necessary `CC' and `LINK' commands, and there is
+also a `Makefile' for use with the `MMS' utility.  From the source
+directory, use either
+
+     $ @[.VMS]VMSBUILD.COM
+
+or
+
+     $ MMS/DESCRIPTION=[.VMS]DESCRIP.MMS GAWK
+
+   Depending upon which C compiler you are using, follow one of the sets
+of instructions in this table:
+
+VAX C V3.x
+     Use either `vmsbuild.com' or `descrip.mms' as is.  These use
+     `CC/OPTIMIZE=NOLINE', which is essential for Version 3.0.
+
+VAX C V2.x
+     You must have Version 2.3 or 2.4; older ones won't work.  Edit
+     either `vmsbuild.com' or `descrip.mms' according to the comments
+     in them.  For `vmsbuild.com', this just entails removing two `!'
+     delimiters.  Also edit `config.h' (which is a copy of file
+     `[.config]vms-conf.h') and comment out or delete the two lines
+     `#define __STDC__ 0' and `#define VAXC_BUILTINS' near the end.
+
+GNU C
+     Edit `vmsbuild.com' or `descrip.mms'; the changes are different
+     from those for VAX C V2.x, but equally straightforward.  No
+     changes to `config.h' should be needed.
+
+DEC C
+     Edit `vmsbuild.com' or `descrip.mms' according to their comments.
+     No changes to `config.h' should be needed.
+
+   `gawk' has been tested under VAX/VMS 5.5-1 using VAX C V3.2, GNU C
+1.40 and 2.3.  It should work without modifications for VMS V4.6 and up.
+
+
+File: gawk.info,  Node: VMS Installation Details,  Next: VMS Running,  Prev: VMS Compilation,  Up: VMS Installation
+
+Installing `gawk' on VMS
+------------------------
+
+   To install `gawk', all you need is a "foreign" command, which is a
+`DCL' symbol whose value begins with a dollar sign. For example:
+
+     $ GAWK :== $disk1:[gnubin]GAWK
+
+(Substitute the actual location of `gawk.exe' for `$disk1:[gnubin]'.)
+The symbol should be placed in the `login.com' of any user who wishes
+to run `gawk', so that it will be defined every time the user logs on.
+Alternatively, the symbol may be placed in the system-wide
+`sylogin.com' procedure, which will allow all users to run `gawk'.
+
+   Optionally, the help entry can be loaded into a VMS help library:
+
+     $ LIBRARY/HELP SYS$HELP:HELPLIB [.VMS]GAWK.HLP
+
+(You may want to substitute a site-specific help library rather than
+the standard VMS library `HELPLIB'.)  After loading the help text,
+
+     $ HELP GAWK
+
+will provide information about both the `gawk' implementation and the
+`awk' programming language.
+
+   The logical name `AWK_LIBRARY' can designate a default location for
+`awk' program files.  For the `-f' option, if the specified filename
+has no device or directory path information in it, `gawk' will look in
+the current directory first, then in the directory specified by the
+translation of `AWK_LIBRARY' if the file was not found.  If after
+searching in both directories, the file still is not found, then `gawk'
+appends the suffix `.awk' to the filename and the file search will be
+re-tried.  If `AWK_LIBRARY' is not defined, that portion of the file
+search will fail benignly.
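+
+   The search order just described can be sketched as follows.  This
+is only an emulation in a POSIX shell, with an ordinary directory
+standing in for the translation of `AWK_LIBRARY'; it is not how the
+VMS port is implemented, and the function `find_awk_prog' is
+hypothetical.
+
```shell
# find_awk_prog NAME LIBDIR: print the first existing candidate.
# NAME is tried as given in the current directory and then in LIBDIR;
# if neither exists, the search is repeated with ".awk" appended.
# An empty LIBDIR stands in for an undefined AWK_LIBRARY, in which
# case that part of the search is simply skipped.
# Hypothetical helper, for illustration only.
find_awk_prog() {
    for name in "$1" "$1.awk"; do
        for dir in . "$2"; do
            [ -n "$dir" ] || continue
            if [ -f "$dir/$name" ]; then
                echo "$dir/$name"
                return 0
            fi
        done
    done
    return 1
}
```
+
+   With a file `lib/report.awk' and nothing else present,
+`find_awk_prog report lib' would print `lib/report.awk'.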
+
+
+File: gawk.info,  Node: VMS Running,  Next: VMS POSIX,  Prev: VMS Installation Details,  Up: VMS Installation
+
+Running `gawk' on VMS
+---------------------
+
+   Command line parsing and quoting conventions are significantly
+different on VMS, so examples in this Info file or from other sources
+often need minor changes.  They _are_ minor though, and all `awk'
+programs should run correctly.
+
+   Here are a couple of trivial tests:
+
+     $ gawk -- "BEGIN {print ""Hello, World!""}"
+     $ gawk -"W" version
+     ! could also be -"W version" or "-W version"
+
+Note that upper-case and mixed-case text must be quoted.
+
+   The VMS port of `gawk' includes a `DCL'-style interface in addition
+to the original shell-style interface (see the help entry for details).
+One side-effect of dual command line parsing is that if there is only a
+single parameter (as in the quoted string program above), the command
+becomes ambiguous.  To work around this, the normally optional `--'
+flag is required to force Unix style rather than `DCL' parsing.  If any
+other dash-type options (or multiple parameters such as data files to be
+processed) are present, there is no ambiguity and `--' can be omitted.
+
+   The default search path when looking for `awk' program files
+specified by the `-f' option is `"SYS$DISK:[],AWK_LIBRARY:"'.  The
+logical name `AWKPATH' can be used to override this default.  The format
+of `AWKPATH' is a comma-separated list of directory specifications.
+When defining it, the value should be quoted so that it retains a single
+translation, and not a multi-translation `RMS' searchlist.
+
+
+File: gawk.info,  Node: VMS POSIX,  Prev: VMS Running,  Up: VMS Installation
+
+Building and Using `gawk' on VMS POSIX
+--------------------------------------
+
+   Ignore the instructions above, although `vms/gawk.hlp' should still
+be made available in a help library.  The source tree should be unpacked
+into a container file subsystem rather than into the ordinary VMS file
+system.  Make sure that the two scripts, `configure' and
+`vms/posix-cc.sh', are executable; use `chmod +x' on them if necessary.
+Then execute the following two commands:
+
+     psx> CC=vms/posix-cc.sh configure
+     psx> make CC=c89 gawk
+
+The first command will construct files `config.h' and `Makefile' out of
+templates, using a script to make the C compiler fit `configure''s
+expectations.  The second command will compile and link `gawk' using
+the C compiler directly; ignore any warnings from `make' about being
+unable to redefine `CC'.  `configure' will take a very long time to
+execute, but at least it provides incremental feedback as it runs.
+
+   This has been tested with VAX/VMS V6.2, VMS POSIX V2.0, and DEC C
+V5.2.
+
+   Once built, `gawk' will work like any other shell utility.  Unlike
+the normal VMS port of `gawk', no special command line manipulation is
+needed in the VMS POSIX environment.
+
+
+File: gawk.info,  Node: PC Installation,  Next: Atari Installation,  Prev: VMS Installation,  Up: Installation
+
+MS-DOS and OS/2 Installation and Compilation
+============================================
+
+   If you have received a binary distribution prepared by the DOS
+maintainers, then `gawk' and the necessary support files will appear
+under the `gnu' directory, with executables in `gnu/bin', libraries in
+`gnu/lib/awk', and manual pages under `gnu/man'.  This is designed for
+easy installation to a `/gnu' directory on your drive, but the files
+can be installed anywhere provided `AWKPATH' is set properly.
+Regardless of the installation directory, the first line of `igawk.cmd'
+and `igawk.bat' (in `gnu/bin') may need to be edited.
+
+   The binary distribution will contain a separate file describing the
+contents. In particular, it may include more than one version of the
+`gawk' executable. OS/2 binary distributions may have a different
+arrangement, but installation is similar.
+
+   The OS/2 and MS-DOS versions of `gawk' search for program files as
+described in *Note The `AWKPATH' Environment Variable: AWKPATH Variable.
+However, semicolons (rather than colons) separate elements in the
+`AWKPATH' variable. If `AWKPATH' is not set or is empty, then the
+default search path is `".;c:/lib/awk;c:/gnu/lib/awk"'.
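+
+   To see how such a path breaks down into directories, the following
+minimal sketch splits a path on a single separator character.  It
+assumes a POSIX shell with `printf' and `tr'; the function
+`split_path' is hypothetical and is not part of the `gawk'
+distribution.
+
```shell
# split_path PATH SEP: print each element of PATH on its own line.
# SEP must be a single character, such as ';' for the MS-DOS and
# OS/2 ports.  Hypothetical helper, for illustration only.
split_path() {
    printf '%s\n' "$1" | tr "$2" '\n'
}

split_path '.;c:/lib/awk;c:/gnu/lib/awk' ';'
```
+
+   Applied to the default search path, this prints `.', `c:/lib/awk',
+and `c:/gnu/lib/awk', one per line.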
+
+   An `sh'-like shell (as opposed to `command.com' under MS-DOS or
+`cmd.exe' under OS/2) may be useful for `awk' programming.  Ian
+Stewartson has written an excellent shell for MS-DOS and OS/2, and a
+`ksh' clone and GNU Bash are available for OS/2. The file
+`README_d/README.pc' in the `gawk' distribution contains information on
+these shells. Users of Stewartson's shell on DOS should examine its
+documentation on handling of command-lines. In particular, the setting
+for `gawk' in the shell configuration may need to be changed, and the
+`ignoretype' option may also be of interest.
+
+   `gawk' can be compiled for MS-DOS and OS/2 using the GNU development
+tools from DJ Delorie (DJGPP, MS-DOS-only) or Eberhard Mattes (EMX,
+MS-DOS and OS/2).  Microsoft C can be used to build 16-bit versions for
+MS-DOS and OS/2.  The file `README_d/README.pc' in the `gawk'
+distribution contains additional notes, and `pc/Makefile' contains
+important notes on compilation options.
+
+   To build `gawk', copy the files in the `pc' directory (_except_ for
+`ChangeLog') to the directory with the rest of the `gawk' sources. The
+`Makefile' contains a configuration section with comments, and may need
+to be edited in order to work with your `make' utility.
+
+   The `Makefile' contains a number of targets for building various
+MS-DOS and OS/2 versions. A list of targets will be printed if the
+`make' command is given without a target. As an example, to build `gawk'
+using the DJGPP tools, enter `make djgpp'.
+
+   Using `make' to run the standard tests and to install `gawk'
+requires additional Unix-like tools, including `sh', `sed', and `cp'.
+In order to run the tests, the `test/*.ok' files may need to be
+converted so that they have the usual DOS-style end-of-line markers.
+Most of the tests will work properly with Stewartson's shell along with
+the companion utilities or appropriate GNU utilities.  However, some
+editing of `test/Makefile' is required. It is recommended that the file
+`pc/Makefile.tst' be copied to `test/Makefile' as a replacement.
+Details can be found in `README_d/README.pc'.
+
+
+File: gawk.info,  Node: Atari Installation,  Next: Amiga Installation,  Prev: PC Installation,  Up: Installation
+
+Installing `gawk' on the Atari ST
+=================================
+
+   There are no substantial differences when installing `gawk' on
+various Atari models.  Compiled `gawk' executables do not require a
+large amount of memory with most `awk' programs and should run on all
+Motorola processor based models (referred to below as the ST, even if
+that is not exactly right).
+
+   In order to use `gawk', you need to have a shell, either text or
+graphics, that does not map all the characters of a command line to
+upper-case.  Maintaining case distinction in option flags is very
+important (*note Command Line Options: Options.).  These days this is
+the default, and it may only be a problem for some very old machines.
+If your system does not preserve the case of option flags, you will
+need to upgrade your tools.  Support for I/O redirection is necessary
+to make it easy to import `awk' programs from other environments.
+Pipes are nice to have, but not vital.
+
+* Menu:
+
+* Atari Compiling::           Compiling `gawk' on Atari
+* Atari Using::               Running `gawk' on Atari
+
+
+File: gawk.info,  Node: Atari Compiling,  Next: Atari Using,  Prev: Atari Installation,  Up: Atari Installation
+
+Compiling `gawk' on the Atari ST
+--------------------------------
+
+   A proper compilation of `gawk' sources when `sizeof(int)' differs
+from `sizeof(void *)' requires an ANSI C compiler. An initial port was
+done with `gcc'.  You may actually prefer executables where `int's are
+four bytes wide, but the other variant works as well.
+
+   You may need quite a bit of memory when trying to recompile the
+`gawk' sources, as some source files (`regex.c' in particular) are quite
+big.  If you run out of memory compiling such a file, try reducing the
+optimization level for this particular file; this may help.
+
+   With a reasonable shell (Bash will do), and in particular if you run
+Linux, MiNT or a similar operating system, you have a pretty good
+chance that the `configure' utility will succeed.  Otherwise sample
+versions of `config.h' and `Makefile.st' are given in the `atari'
+subdirectory and can be edited and copied to the corresponding files in
+the main source directory.  Even if `configure' produced something, it
+might be advisable to compare its results with the sample versions and
+possibly make adjustments.
+
+   Some `gawk' source code fragments depend on a preprocessor define
+`atarist'.  This basically assumes the TOS environment with `gcc'.
+Modify these sections as appropriate if they are not right for your
+environment.  Also see the remarks about `AWKPATH' and `envsep' in
+*Note Running `gawk' on the Atari ST: Atari Using.
+
+   As shipped, the sample `config.h' claims that the `system' function
+is missing from the libraries, which is not true, and an alternative
+implementation of this function is provided in `atari/system.c'.
+Depending upon your particular combination of shell and operating
+system, you may wish to change the file to indicate that `system' is
+available.
+
+
+File: gawk.info,  Node: Atari Using,  Prev: Atari Compiling,  Up: Atari Installation
+
+Running `gawk' on the Atari ST
+------------------------------
+
+   An executable version of `gawk' should be placed, as usual, anywhere
+in your `PATH' where your shell can find it.
+
+   While executing, `gawk' creates a number of temporary files.  When
+using `gcc' libraries for TOS, `gawk' looks for either of the
+environment variables `TEMP' or `TMPDIR', in that order.  If either one
+is found, its value is assumed to be a directory for temporary files.
+This directory must exist, and if you can spare the memory, it is a
+good idea to put it on a RAM drive.  If neither `TEMP' nor `TMPDIR' is
+found, then `gawk' uses the current directory for its temporary files.
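+
+   The lookup order can be sketched in a POSIX shell as shown below.
+This only mirrors the order described above; it is not the code used
+by the `gcc' libraries, the function `pick_tmpdir' is hypothetical,
+and treating an empty value the same as an unset variable is an
+assumption.
+
```shell
# pick_tmpdir: print the directory to use for temporary files.
# TEMP is consulted first, then TMPDIR; if neither is set (or both
# are empty, by assumption), fall back to the current directory.
# Hypothetical helper, for illustration only.
pick_tmpdir() {
    if [ -n "${TEMP:-}" ]; then
        printf '%s\n' "$TEMP"
    elif [ -n "${TMPDIR:-}" ]; then
        printf '%s\n' "$TMPDIR"
    else
        printf '%s\n' .
    fi
}
```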
+
+   The ST version of `gawk' searches for its program files as described
+in *Note The `AWKPATH' Environment Variable: AWKPATH Variable.  The
+default value for the `AWKPATH' variable is taken from `DEFPATH'
+defined in `Makefile'. The sample `gcc'/TOS `Makefile' for the ST in
+the distribution sets `DEFPATH' to `".,c:\lib\awk,c:\gnu\lib\awk"'.
+The search path can be modified by explicitly setting `AWKPATH' to
+whatever you wish.  Note that colons cannot be used on the ST to
+separate elements in the `AWKPATH' variable, since they have another,
+reserved, meaning.  Instead, you must use a comma to separate elements
+in the path.  When recompiling, the separating character can be
+modified by initializing the `envsep' variable in `atari/gawkmisc.atr'
+to another value.
+
+   Although `awk' allows great flexibility in doing I/O redirections
+from within a program, this facility should be used with care on the ST
+running under TOS.  In some circumstances the OS routines for file
+handle pool processing lose track of certain events, causing the
+computer to crash, and requiring a reboot.  Often a warm reboot is
+sufficient.  Fortunately, this happens infrequently, and in rather
+esoteric situations.  In particular, avoid having one part of an `awk'
+program using `print' statements explicitly redirected to
+`"/dev/stdout"', while other `print' statements use the default
+standard output, and a calling shell has redirected standard output to
+a file.
+
+   When `gawk' is compiled with the ST version of `gcc' and its usual
+libraries, it will accept both `/' and `\' as path separators.  While
+this is convenient, it should be remembered that this removes one,
+technically valid, character (`/') from your file names, and that it
+may create problems for external programs, called via the `system'
+function, which may not support this convention.  Whenever it is
+possible that a file created by `gawk' will be used by some other
+program, use only backslashes.  Also remember that in `awk',
+backslashes in strings have to be doubled in order to get literal
+backslashes (*note Escape Sequences::).
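+
+   A small example of the doubling rule, written for a Unix-style
+shell where single quotes pass the program text to `awk' unchanged.
+The path shown is only an illustration.
+
```shell
# Each "\\" inside the awk string constant yields one literal
# backslash in the output.
out=$(awk 'BEGIN { print "c:\\lib\\awk" }')
printf '%s\n' "$out"
```
+
+   This prints `c:\lib\awk'.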
+
+
+File: gawk.info,  Node: Amiga Installation,  Next: Bugs,  Prev: Atari Installation,  Up: Installation
+
+Installing `gawk' on an Amiga
+=============================
+
+   You can install `gawk' on an Amiga system using a Unix emulation
+environment available via anonymous `ftp' from `wuarchive.wustl.edu' in
+the directory `pub/aminet/dev/gcc'.  This includes a shell based on
+`pdksh'.  The primary component of this environment is a Unix emulation
+library, `ixemul.lib'.
+
+   A more complete distribution for the Amiga is available on the
+FreshFish CD-ROM from:
+
+     CRONUS
+     1840 E. Warner Road #105-265
+     Tempe, AZ 85284  USA
+     US Toll Free: (800) 804-0833
+     Phone: +1-602-491-0442
+     FAX: +1-602-491-0048
+     Email:  `info@ninemoons.com'
+     WWW: `http://www.ninemoons.com'
+     Anonymous `ftp' site: `ftp.ninemoons.com'
+
+   Once you have the distribution, you can configure `gawk' simply by
+running `configure':
+
+     configure -v m68k-cbm-amigados
+
+   Then run `make', and you should be all set!  (If these steps do not
+work, please send in a bug report; *note Reporting Problems and Bugs:
+Bugs..)
+
+
+File: gawk.info,  Node: Bugs,  Next: Other Versions,  Prev: Amiga Installation,  Up: Installation
+
+Reporting Problems and Bugs
+===========================
+
+   If you have problems with `gawk' or think that you have found a bug,
+please report it to the developers; we cannot promise to do anything
+but we might well want to fix it.
+
+   Before reporting a bug, make sure you have actually found a real bug.
+Carefully reread the documentation and see if it really says you can do
+what you're trying to do.  If it's not clear whether you should be able
+to do something or not, report that too; it's a bug in the
+documentation!
+
+   Before reporting a bug or trying to fix it yourself, try to isolate
+it to the smallest possible `awk' program and input data file that
+reproduces the problem.  Then send us the program and data file, some
+idea of what kind of Unix system you're using, and the exact results
+`gawk' gave you.  Also say what you expected to occur; this will help
+us decide whether the problem was really in the documentation.
+
+   Once you have a precise problem, there are two e-mail addresses you
+can send mail to.
+
+Internet:
+     `bug-gnu-utils@prep.ai.mit.edu'
+
+UUCP:
+     `uunet!prep.ai.mit.edu!bug-gnu-utils'
+
+   Please include the version number of `gawk' you are using.  You can
+get this information with the command `gawk --version'.  You should
+send a carbon copy of your mail to Arnold Robbins, who can be reached
+at `arnold@gnu.ai.mit.edu'.
+
+   *Important!* Do _not_ try to report bugs in `gawk' by posting to the
+Usenet/Internet newsgroup `comp.lang.awk'.  While the `gawk' developers
+do occasionally read this newsgroup, there is no guarantee that we will
+see your posting.  The steps described above are the official,
+recognized ways for reporting bugs.
+
+   Non-bug suggestions are always welcome as well.  If you have
+questions about things that are unclear in the documentation or are
+just obscure features, ask Arnold Robbins; he will try to help you out,
+although he may not have the time to fix the problem.  You can send him
+electronic mail at the Internet address above.
+
+   If you find bugs in one of the non-Unix ports of `gawk', please send
+an electronic mail message to the person who maintains that port.  They
+are listed below, and also in the `README' file in the `gawk'
+distribution.  Information in the `README' file should be considered
+authoritative if it conflicts with this Info file.
+
+   The people maintaining the non-Unix ports of `gawk' are:
+
+MS-DOS
+     Scott Deifik, `scottd@amgen.com', and Darrel Hankerson,
+     `hankedr@mail.auburn.edu'.
+
+OS/2
+     Kai Uwe Rommel, `rommel@ars.de'.
+
+VMS
+     Pat Rankin, `rankin@eql.caltech.edu'.
+
+Atari ST
+     Michal Jaegermann, `michal@gortel.phys.ualberta.ca'.
+
+Amiga
+     Fred Fish, `fnf@ninemoons.com'.
+
+   If your bug is also reproducible under Unix, please send copies of
+your report to the general GNU bug list, as well as to Arnold Robbins,
+at the addresses listed above.
+
+
+File: gawk.info,  Node: Other Versions,  Prev: Bugs,  Up: Installation
+
+Other Freely Available `awk' Implementations
+============================================
+
+     It's kind of fun to put comments like this in your awk code.
+           `// Do C++ comments work? answer: yes! of course'
+     Michael Brennan
+
+   There are two other freely available `awk' implementations.  This
+section briefly describes where to get them.
+
+Unix `awk'
+     Brian Kernighan has been able to make his implementation of `awk'
+     freely available.  You can get it via anonymous `ftp' to the host
+     `netlib.att.com'.  Change directory to `/netlib/research'. Use
+     "binary" or "image" mode, and retrieve `awk.bundle.Z'.
+
+     This is a shell archive that has been compressed with the
+     `compress' utility. It can be uncompressed with either
+     `uncompress' or the GNU `gunzip' utility.
+
+     This version requires an ANSI C compiler; GCC (the GNU C compiler)
+     works quite nicely.
+
+`mawk'
+     Michael Brennan has written an independent implementation of `awk',
+     called `mawk'.  It is available under the GPL (*note GNU GENERAL
+     PUBLIC LICENSE: Copying.), just as `gawk' is.
+
+     You can get it via anonymous `ftp' to the host `ftp.whidbey.net'.
+     Change directory to `/pub/brennan'.  Use "binary" or "image" mode,
+     and retrieve `mawk1.3.3.tar.gz' (or the latest version that is
+     there).
+
+     `gunzip' may be used to decompress this file. Installation is
+     similar to `gawk''s (*note Compiling and Installing `gawk' on
+     Unix: Unix Installation.).
+
+
+File: gawk.info,  Node: Notes,  Next: Glossary,  Prev: Installation,  Up: Top
+
+Implementation Notes
+********************
+
+   This appendix contains information mainly of interest to
+implementors and maintainers of `gawk'.  Everything in it applies
+specifically to `gawk', and not to other implementations.
+
+* Menu:
+
+* Compatibility Mode::          How to disable certain `gawk' extensions.
+* Additions::                   Making Additions To `gawk'.
+* Future Extensions::           New features that may be implemented one day.
+* Improvements::                Suggestions for improvements by volunteers.
+
+
+File: gawk.info,  Node: Compatibility Mode,  Next: Additions,  Prev: Notes,  Up: Notes
+
+Downward Compatibility and Debugging
+====================================
+
+   *Note Extensions in `gawk' Not in POSIX `awk': POSIX/GNU, for a
+summary of the GNU extensions to the `awk' language and program.  All
+of these features can be turned off by invoking `gawk' with the
+`--traditional' option, or with the `--posix' option.
+
+   If `gawk' is compiled for debugging with `-DDEBUG', then there is
+one more option available on the command line:
+
+`-W parsedebug'
+`--parsedebug'
+     Print out the parse stack information as the program is being
+     parsed.
+
+   This option is intended only for serious `gawk' developers, and not
+for the casual user.  It probably has not even been compiled into your
+version of `gawk', since it slows down execution.
+
+
+File: gawk.info,  Node: Additions,  Next: Future Extensions,  Prev: Compatibility Mode,  Up: Notes
+
+Making Additions to `gawk'
+==========================
+
+   If you should find that you wish to enhance `gawk' in a significant
+fashion, you are perfectly free to do so.  That is the point of having
+free software; the source code is available, and you are free to change
+it as you wish (*note GNU GENERAL PUBLIC LICENSE: Copying.).
+
+   This section discusses the ways you might wish to change `gawk', and
+any considerations you should bear in mind.
+
+* Menu:
+
+* Adding Code::             Adding code to the main body of `gawk'.
+* New Ports::               Porting `gawk' to a new operating system.
+
+
+File: gawk.info,  Node: Adding Code,  Next: New Ports,  Prev: Additions,  Up: Additions
+
+Adding New Features
+-------------------
+
+   You are free to add any new features you like to `gawk'.  However,
+if you want your changes to be incorporated into the `gawk'
+distribution, there are several steps that you need to take in order to
+make it possible for me to include your changes.
+
+  1. Get the latest version.  It is much easier for me to integrate
+     changes if they are relative to the most recent distributed
+     version of `gawk'.  If your version of `gawk' is very old, I may
+     not be able to integrate them at all.  *Note Getting the `gawk'
+     Distribution: Getting, for information on getting the latest
+     version of `gawk'.
+
+  2. Read the `GNU Coding Standards' (*note (standards)Top::).  This
+     document describes how GNU software should be written. If you
+     haven't read it, please do so, preferably _before_ starting to
+     modify `gawk'.  (The `GNU Coding Standards' are available as part
+     of the Autoconf distribution, from the FSF.)
+
+  3. Use the `gawk' coding style.  The C code for `gawk' follows the
+     instructions in the `GNU Coding Standards', with minor exceptions.
+     The code is formatted using the traditional "K&R" style,
+     particularly as regards the placement of braces and the use of
+     tabs.  In brief, the coding rules for `gawk' are:
+
+        * Use old style (non-prototype) function headers when defining
+          functions.
+
+        * Put the name of the function at the beginning of its own line.
+
+        * Put the return type of the function, even if it is `int', on
+          the line above the line with the name and arguments of the
+          function.
+
+        * The declarations for the function arguments should not be
+          indented.
+
+        * Put spaces around parentheses used in control structures
+          (`if', `while', `for', `do', `switch' and `return').
+
+        * Do not put spaces in front of parentheses used in function
+          calls.
+
+        * Put spaces around all C operators, and after commas in
+          function calls.
+
+        * Do not use the comma operator to produce multiple
+          side-effects, except in `for' loop initialization and
+          increment parts, and in macro bodies.
+
+        * Use real tabs for indenting, not spaces.
+
+        * Use the "K&R" brace layout style.
+
+        * Use comparisons against `NULL' and `'\0'' in the conditions of
+          `if', `while' and `for' statements, and in the `case's of
+          `switch' statements, instead of just the plain pointer or
+          character value.
+
+        * Use the `TRUE', `FALSE', and `NULL' symbolic constants, and
+          the character constant `'\0'' where appropriate, instead of
+          `1' and `0'.
+
+        * Provide one-line descriptive comments for each function.
+
+        * Do not use `#elif'. Many older Unix C compilers cannot handle
+          it.
+
+        * Do not use the `alloca' function for allocating memory off
+          the stack.  Its use causes more portability trouble than the
+          minor benefit of not having to free the storage. Instead, use
+          `malloc' and `free'.
+
+     If I have to reformat your code to follow the coding style used in
+     `gawk', I may not bother.
+
+  4. Be prepared to sign the appropriate paperwork.  In order for the
+     FSF to distribute your changes, you must either place those
+     changes in the public domain, and submit a signed statement to that
+     effect, or assign the copyright in your changes to the FSF.  Both
+     of these actions are easy to do, and _many_ people have done so
+     already. If you have questions, please contact me (*note Reporting
+     Problems and Bugs: Bugs.), or `gnu@prep.ai.mit.edu'.
+
+  5. Update the documentation.  Along with your new code, please supply
+     new sections and/or chapters for this Info file.  If at all
+     possible, please use real Texinfo, instead of just supplying
+     unformatted ASCII text (although even that is better than no
+     documentation at all).  Conventions to be followed in `The GNU Awk
+     User's Guide' are provided after the `@bye' at the end of the
+     Texinfo source file.  If possible, please update the man page as
+     well.
+
+     You will also have to sign paperwork for your documentation
+     changes.
+
+  6. Submit changes as context diffs or unified diffs.  Use `diff -c -r
+     -N' or `diff -u -r -N' to compare the original `gawk' source tree
+     with your version.  (I find context diffs to be more readable, but
+     unified diffs are more compact.)  I recommend using the GNU
+     version of `diff'.  Send the output produced by either run of
+     `diff' to me when you submit your changes.  *Note Reporting
+     Problems and Bugs: Bugs, for the electronic mail information.
+
+     Using this format makes it easy for me to apply your changes to the
+     master version of the `gawk' source code (using `patch').  If I
+     have to apply the changes manually, using a text editor, I may not
+     do so, particularly if there are lots of changes.
+
+   Although this sounds like a lot of work, please remember that while
+you may write the new code, I have to maintain it and support it, and
+if it isn't possible for me to do that with a minimum of extra work,
+then I probably will not.
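+
+   As a rough illustration of the coding rules in step 3 above, here is
+a small C fragment laid out in that style.  It is only a sketch: the
+functions `count_blanks' and `has_blank' are hypothetical and are not
+part of `gawk', `TRUE' and `FALSE' are defined locally for the example,
+and the headers use prototypes so that the fragment compiles with
+current compilers, which departs from the first rule in the list.
+
```c
#include <stddef.h>

/* Defined here only for the example; not taken from the gawk sources. */
#define TRUE	1
#define FALSE	0

/* count_blanks --- return the number of blanks in s, or -1 if s is NULL */

int
count_blanks(const char *s)
{
	int n;

	/* compare explicitly against NULL and '\0', not the bare value */
	if (s == NULL)
		return -1;

	for (n = 0; *s != '\0'; s++) {
		if (*s == ' ' || *s == '\t')
			n++;
	}
	return n;
}

/* has_blank --- return TRUE if s contains at least one blank */

int
has_blank(const char *s)
{
	/* no space before the parenthesis of a function call */
	if (count_blanks(s) > 0)
		return TRUE;
	return FALSE;
}
```
+
+   Note the return type on its own line, the function name at the start
+of the following line, the space after `if' and `for', and the tabs
+used for indentation.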
+
+
+File: gawk.info,  Node: New Ports,  Prev: Adding Code,  Up: Additions
+
+Porting `gawk' to a New Operating System
+----------------------------------------
+
+   If you wish to port `gawk' to a new operating system, there are
+several steps to follow.
+
+  1. Follow the guidelines in *Note Adding New Features: Adding Code,
+     concerning coding style, submission of diffs, and so on.
+
+  2. When doing a port, bear in mind that your code must co-exist
+     peacefully with the rest of `gawk', and the other ports. Avoid
+     gratuitous changes to the system-independent parts of the code. If
+     at all possible, avoid sprinkling `#ifdef's just for your port
+     throughout the code.
+
+     If the changes needed for a particular system affect too much of
+     the code, I probably will not accept them.  In such a case, you
+     will, of course, be able to distribute your changes on your own,
+     as long as you comply with the GPL (*note GNU GENERAL PUBLIC
+     LICENSE: Copying.).
+
+  3. A number of the files that come with `gawk' are maintained by other
+     people at the Free Software Foundation.  Thus, you should not
+     change them unless it is for a very good reason.  That is, changes
+     are not out of the question, but changes to these files will be
+     scrutinized extra carefully.  The files are `alloca.c',
+     `getopt.h', `getopt.c', `getopt1.c', `regex.h', `regex.c', `dfa.h',
+     `dfa.c', `install-sh', and `mkinstalldirs'.
+
+  4. Be willing to continue to maintain the port.  Non-Unix operating
+     systems are supported by volunteers who maintain the code needed
+     to compile and run `gawk' on their systems.  If no one volunteers
+     to maintain a port, that port becomes unsupported, and it may be
+     necessary to remove it from the distribution.
+
+  5. Supply an appropriate `gawkmisc.???' file.  Each port has its own
+     `gawkmisc.???' that implements certain operating system specific
+     functions. This is cleaner than a plethora of `#ifdef's scattered
+     throughout the code.  The `gawkmisc.c' in the main source
+     directory includes the appropriate `gawkmisc.???' file from each
+     subdirectory.  Be sure to update it as well.
+
+     Each port's `gawkmisc.???' file has a suffix reminiscent of the
+     machine or operating system for the port. For example,
+     `pc/gawkmisc.pc' and `vms/gawkmisc.vms'. The use of separate
+     suffixes, instead of plain `gawkmisc.c', makes it possible to move
+     files from a port's subdirectory into the main subdirectory,
+     without accidentally destroying the real `gawkmisc.c' file.
+     (Currently, this is only an issue for the MS-DOS and OS/2 ports.)
+
+  6. Supply a `Makefile' and any other C source and header files that
+     are necessary for your operating system.  All your code should be
+     in a separate subdirectory, with a name that is the same as, or
+     reminiscent of, either your operating system or the computer
+     system.  If possible, try to structure things so that it is not
+     necessary to move files out of the subdirectory into the main
+     source directory.  If that is not possible, then be sure to avoid
+     using names for your files that duplicate the names of files in
+     the main source directory.
+
+  7. Update the documentation.  Please write a section (or sections)
+     for this Info file describing the installation and compilation
+     steps needed to install and/or compile `gawk' for your system.
+
+  8. Be prepared to sign the appropriate paperwork.  In order for the
+     FSF to distribute your code, you must either place your code in
+     the public domain, and submit a signed statement to that effect,
+     or assign the copyright in your code to the FSF.  Both of these
+     actions are easy to do, and _many_ people have done so already. If
+     you have questions, please contact me, or `gnu@prep.ai.mit.edu'.
+
+   Following these steps will make it much easier to integrate your
+changes into `gawk', and have them co-exist happily with the code for
+other operating systems that is already there.
+
+   In the code that you supply, and that you maintain, feel free to use
+a coding style and brace layout that suits your taste.
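+
+   The idea in step 5, one system-specific file chosen at compile time
+without scattering `#ifdef's through the code, might be sketched as
+follows.  The macros `PORT_VMS' and `PORT_PC', the fallback string,
+and the function `port_misc_file' are all made up for this example,
+and the sketch merely names the chosen file instead of including it,
+so that it stands alone.  The conditionals are nested because the
+coding rules ask contributors to avoid `#elif'.
+
```c
/*
 * Pick one per-port file name at compile time.  PORT_VMS and PORT_PC
 * are hypothetical macros; a real port would test whatever symbol its
 * compiler predefines.
 */
#if defined(PORT_VMS)
#define PORT_MISC_FILE	"vms/gawkmisc.vms"
#else
#if defined(PORT_PC)
#define PORT_MISC_FILE	"pc/gawkmisc.pc"
#else
#define PORT_MISC_FILE	"(no port selected)"	/* placeholder for the sketch */
#endif
#endif

/* port_misc_file --- return the name of the file selected above */

const char *
port_misc_file(void)
{
	return PORT_MISC_FILE;
}
```
+
+   Compiling with `-DPORT_VMS' or `-DPORT_PC' changes the string that
+`port_misc_file' returns; nothing else in the file has to know which
+port is being built.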
+
+
+File: gawk.info,  Node: Future Extensions,  Next: Improvements,  Prev: Additions,  Up: Notes
+
+Probable Future Extensions
+==========================
+
+     AWK is a language similar to PERL, only considerably more elegant.
+     Arnold Robbins
+     
+     Hey!
+     Larry Wall
+
+   This section briefly lists extensions and possible improvements that
+indicate the directions we are currently considering for `gawk'.  The
+file `FUTURES' in the `gawk' distribution lists these extensions as
+well.
+
+   This is a list of probable future changes that will be usable by the
+`awk' language programmer.
+
+Localization
+     The GNU project is starting to support multiple languages.  It
+     will at least be possible to make `gawk' print its warnings and
+     error messages in languages other than English.  It may be
+     possible for `awk' programs to also use the multiple language
+     facilities, separate from `gawk' itself.
+
+Databases
+     It may be possible to map a GDBM/NDBM/SDBM file into an `awk'
+     array.
+
+A `PROCINFO' Array
+     The special files that provide process-related information (*note
+     Special File Names in `gawk': Special Files.)  may be superseded
+     by a `PROCINFO' array that would provide the same information, in
+     an easier to access fashion.
+
+More `lint' warnings
+     There are more things that could be checked for portability.
+
+Control of subprocess environment
+     Changes made in `gawk' to the array `ENVIRON' may be propagated to
+     subprocesses run by `gawk'.
+
+   This is a list of probable improvements that will make `gawk'
+perform better.
+
+An Improved Version of `dfa'
+     The `dfa' pattern matcher from GNU `grep' has some problems.
+     Either a new version or a fixed one will deal with some important
+     regexp matching issues.
+
+Use of GNU `malloc'
+     The GNU version of `malloc' could potentially speed up `gawk',
+     since it relies heavily on the use of dynamic memory allocation.
+
+Use of the `rx' regexp library
+     The `rx' regular expression library could potentially speed up all
+     regexp operations that require knowing the exact location of
+     matches.  This includes record termination, field and array
+     splitting, and the `sub', `gsub', `gensub' and `match' functions.
+
+
+File: gawk.info,  Node: Improvements,  Prev: Future Extensions,  Up: Notes
+
+Suggestions for Improvements
+============================
+
+   Here are some projects that would-be `gawk' hackers might like to
+take on.  They vary in size from a few days to a few weeks of
+programming, depending on which one you choose and how fast a
+programmer you are.  Please send any improvements you write to the
+maintainers at the GNU project.  *Note Adding New Features: Adding Code,
+for guidelines to follow when adding new features to `gawk'.  *Note
+Reporting Problems and Bugs: Bugs, for information on contacting the
+maintainers.
+
+  1. Compilation of `awk' programs: `gawk' uses a Bison (YACC-like)
+     parser to convert the script given it into a syntax tree; the
+     syntax tree is then executed by a simple recursive evaluator.
+     This method incurs a lot of overhead, since the recursive
+     evaluator performs many procedure calls to do even the simplest
+     things.
+
+     It should be possible for `gawk' to convert the script's parse tree
+     into a C program which the user would then compile, using the
+     normal C compiler and a special `gawk' library to provide all the
+     needed functions (regexps, fields, associative arrays, type
+     coercion, and so on).
+
+     An easier possibility might be for an intermediate phase of `gawk'
+     to convert the parse tree into a linear byte code form like the
+     one used in GNU Emacs Lisp.  The recursive evaluator would then be
+     replaced by a straight line byte code interpreter that would be
+     intermediate in speed between running a compiled program and doing
+     what `gawk' does now.
+
+  2. The programs in the test suite could use documenting in this
+     Info file.
+
+  3. See the `FUTURES' file for more ideas.  Contact us if you would
+     seriously like to tackle any of the items listed there.
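+
+   To make the first suggestion more concrete, the following sketch
+contrasts a recursive tree evaluator with a linear byte code
+interpreter for simple arithmetic.  Every name in it (`struct expr',
+`struct insn', `tree_eval', `flatten', `run_code') is invented for the
+example, and the byte code format is only an assumption; none of it
+reflects the actual data structures in `gawk'.
+
```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX	64

enum op { OP_NUM, OP_ADD, OP_MUL };

/* A parse tree node: either a number or a binary operator. */
struct expr {
	enum op op;
	double value;			/* used when op == OP_NUM */
	struct expr *left, *right;	/* used otherwise */
};

/* One instruction for a small stack machine. */
struct insn {
	enum op op;
	double value;			/* operand pushed by OP_NUM */
};

/* tree_eval --- evaluate a tree recursively, one call per node */

double
tree_eval(const struct expr *e)
{
	switch (e->op) {
	case OP_ADD:
		return tree_eval(e->left) + tree_eval(e->right);
	case OP_MUL:
		return tree_eval(e->left) * tree_eval(e->right);
	default:
		return e->value;
	}
}

/* flatten --- append post-order code for e at code[n], return new length */

size_t
flatten(const struct expr *e, struct insn *code, size_t n)
{
	/* the caller must supply an array with room for every node */
	if (e->op != OP_NUM) {
		n = flatten(e->left, code, n);
		n = flatten(e->right, code, n);
	}
	code[n].op = e->op;
	code[n].value = (e->op == OP_NUM) ? e->value : 0.0;
	return n + 1;
}

/* run_code --- execute the code in one loop, with no recursion */

double
run_code(const struct insn *code, size_t n)
{
	double stack[STACK_MAX];
	size_t sp = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (code[i].op == OP_NUM) {
			assert(sp < STACK_MAX);
			stack[sp++] = code[i].value;
		} else {
			double lhs, rhs;

			assert(sp >= 2);
			rhs = stack[--sp];
			lhs = stack[--sp];
			if (code[i].op == OP_ADD)
				stack[sp++] = lhs + rhs;
			else
				stack[sp++] = lhs * rhs;
		}
	}
	assert(sp == 1);
	return stack[0];
}
```
+
+   For a given expression, `tree_eval' and `run_code' produce the same
+value.  The second does its work in a single loop instead of making
+one function call per tree node, which is the overhead the suggestion
+above hopes to reduce.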
+
+
+File: gawk.info,  Node: Glossary,  Next: Copying,  Prev: Notes,  Up: Top
+
+Glossary
+********
+
+Action
+     A series of `awk' statements attached to a rule.  If the rule's
+     pattern matches an input record, `awk' executes the rule's action.
+     Actions are always enclosed in curly braces.  *Note Overview of
+     Actions: Action Overview.
+
+Amazing `awk' Assembler
+     Henry Spencer at the University of Toronto wrote a retargetable
+     assembler completely as `awk' scripts.  It is thousands of lines
+     long, including machine descriptions for several eight-bit
+     microcomputers.  It is a good example of a program that would have
+     been better written in another language.
+
+Amazingly Workable Formatter (`awf')
+     Henry Spencer at the University of Toronto wrote a formatter that
+     accepts a large subset of the `nroff -ms' and `nroff -man'
+     formatting commands, using `awk' and `sh'.
+
+ANSI
+     The American National Standards Institute.  This organization
+     produces many standards, among them the standards for the C and
+     C++ programming languages.
+
+Assignment
+     An `awk' expression that changes the value of some `awk' variable
+     or data object.  An object that you can assign to is called an
+     "lvalue".  The assigned values are called "rvalues".  *Note
+     Assignment Expressions: Assignment Ops.
+
+`awk' Language
+     The language in which `awk' programs are written.
+
+`awk' Program
+     An `awk' program consists of a series of "patterns" and "actions",
+     collectively known as "rules".  For each input record given to the
+     program, the program's rules are all processed in turn.  `awk'
+     programs may also contain function definitions.
+
+`awk' Script
+     Another name for an `awk' program.
+
+Bash
+     The GNU version of the standard shell (the Bourne-Again shell).
+     See "Bourne Shell."
+
+BBS
+     See "Bulletin Board System."
+
+Boolean Expression
+     Named after the English mathematician Boole. See "Logical
+     Expression."
+
+Bourne Shell
+     The standard shell (`/bin/sh') on Unix and Unix-like systems,
+     originally written by Stephen R. Bourne.  Many shells (Bash, `ksh',
+     `pdksh', `zsh') are generally upwardly compatible with the Bourne
+     shell.
+
+Built-in Function
+     The `awk' language provides built-in functions that perform various
+     numerical, time stamp related, and string computations.  Examples
+     are `sqrt' (for the square root of a number) and `substr' (for a
+     substring of a string).  *Note Built-in Functions: Built-in.
+
+Built-in Variable
+     `ARGC', `ARGIND', `ARGV', `CONVFMT', `ENVIRON', `ERRNO',
+     `FIELDWIDTHS', `FILENAME', `FNR', `FS', `IGNORECASE', `NF', `NR',
+     `OFMT', `OFS', `ORS', `RLENGTH', `RSTART', `RS', `RT', and
+     `SUBSEP', are the variables that have special meaning to `awk'.
+     Changing some of them affects `awk''s running environment.
+     Several of these variables are specific to `gawk'.  *Note Built-in
+     Variables::.
+
+Braces
+     See "Curly Braces."
+
+Bulletin Board System
+     A computer system allowing users to log in and read and/or leave
+     messages for other users of the system, much like leaving paper
+     notes on a bulletin board.
+
+C
+     The system programming language that most GNU software is written
+     in.  The `awk' programming language has C-like syntax, and this
+     Info file points out similarities between `awk' and C when
+     appropriate.
+
+Character Set
+     The set of numeric codes used by a computer system to represent the
+     characters (letters, numbers, punctuation, etc.) of a particular
+     country or place. The most common character set in use today is
+     ASCII (American Standard Code for Information Interchange).  Many
+     European countries use an extension of ASCII known as ISO-8859-1
+     (ISO Latin-1).
+
+CHEM
+     A preprocessor for `pic' that reads descriptions of molecules and
+     produces `pic' input for drawing them.  It was written in `awk' by
+     Brian Kernighan and Jon Bentley, and is available from
+     `netlib@research.att.com'.
+
+Compound Statement
+     A series of `awk' statements, enclosed in curly braces.  Compound
+     statements may be nested.  *Note Control Statements in Actions:
+     Statements.
+
+Concatenation
+     Concatenating two strings means sticking them together, one after
+     another, giving a new string.  For example, the string `foo'
+     concatenated with the string `bar' gives the string `foobar'.
+     *Note String Concatenation: Concatenation.
+
+Conditional Expression
+     An expression using the `?:' ternary operator, such as `EXPR1 ?
+     EXPR2 : EXPR3'.  The expression EXPR1 is evaluated; if the result
+     is true, the value of the whole expression is the value of EXPR2,
+     otherwise the value is EXPR3.  In either case, only one of EXPR2
+     and EXPR3 is evaluated.  *Note Conditional Expressions:
+     Conditional Exp.
+
+Comparison Expression
+     A relation that is either true or false, such as `(a < b)'.
+     Comparison expressions are used in `if', `while', `do', and `for'
+     statements, and in patterns to select which input records to
+     process.  *Note Variable Typing and Comparison Expressions: Typing
+     and Comparison.
+
+Curly Braces
+     The characters `{' and `}'.  Curly braces are used in `awk' for
+     delimiting actions, compound statements, and function bodies.
+
+Dark Corner
+     An area in the language where specifications often were (or still
+     are) not clear, leading to unexpected or undesirable behavior.
+     Such areas are marked in this Info file with "(d.c.)" in the text,
+     and are indexed under the heading "dark corner."
+
+Data Objects
+     These are numbers and strings of characters.  Numbers are
+     converted into strings and vice versa, as needed.  *Note
+     Conversion of Strings and Numbers: Conversion.
+
+Double Precision
+     An internal representation of numbers that can have fractional
+     parts.  Double precision numbers keep track of more digits than do
+     single precision numbers, but operations on them are more
+     expensive.  This is the way `awk' stores numeric values.  It is
+     the C type `double'.
+
+Dynamic Regular Expression
+     A dynamic regular expression is a regular expression written as an
+     ordinary expression.  It could be a string constant, such as
+     `"foo"', but it may also be an expression whose value can vary.
+     *Note Using Dynamic Regexps: Computed Regexps.
+
+Environment
+     A collection of strings, of the form NAME`='VAL, that each program
+     has available to it. Users generally place values into the
+     environment in order to provide information to various programs.
+     Typical examples are the environment variables `HOME' and `PATH'.
+
+Empty String
+     See "Null String."
+
+Escape Sequences
+     A special sequence of characters used for describing non-printing
+     characters, such as `\n' for newline, or `\033' for the ASCII ESC
+     (escape) character.  *Note Escape Sequences::.
+
+Field
+     When `awk' reads an input record, it splits the record into pieces
+     separated by whitespace (or by a separator regexp which you can
+     change by setting the built-in variable `FS').  Such pieces are
+     called fields.  If the pieces are of fixed length, you can use the
+     built-in variable `FIELDWIDTHS' to describe their lengths.  *Note
+     Specifying How Fields are Separated: Field Separators, and also see
+     *Note Reading Fixed-width Data: Constant Size.
+
+Floating Point Number
+     Often referred to in mathematical terms as a "rational" number,
+     this is just a number that can have a fractional part.  See
+     "Double Precision" and "Single Precision."
+
+Format
+     Format strings are used to control the appearance of output in the
+     `printf' statement.  Also, data conversions from numbers to strings
+     are controlled by the format string contained in the built-in
+     variable `CONVFMT'.  *Note Format-Control Letters: Control Letters.
+
+Function
+     A specialized group of statements used to encapsulate general or
+     program-specific tasks.  `awk' has a number of built-in functions,
+     and also allows you to define your own.  *Note Built-in Functions:
+     Built-in, and *Note User-defined Functions: User-defined.
+
+FSF
+     See "Free Software Foundation."
+
+Free Software Foundation
+     A non-profit organization dedicated to the production and
+     distribution of freely distributable software.  It was founded by
+     Richard M. Stallman, the author of the original Emacs editor.  GNU
+     Emacs is the most widely used version of Emacs today.
+
+`gawk'
+     The GNU implementation of `awk'.
+
+General Public License
+     This document describes the terms under which `gawk' and its source
+     code may be distributed. (*note GNU GENERAL PUBLIC LICENSE:
+     Copying.)
+
+GNU
+     "GNU's not Unix".  An on-going project of the Free Software
+     Foundation to create a complete, freely distributable,
+     POSIX-compliant computing environment.
+
+GPL
+     See "General Public License."
+
+Hexadecimal
+     Base 16 notation, where the digits are `0'-`9' and `A'-`F', with
+     `A' representing 10, `B' representing 11, and so on up to `F' for
+     15.  Hexadecimal numbers are written in C using a leading `0x', to
+     indicate their base.  Thus, `0x12' is 18 (one times 16 plus 2).
+
+I/O
+     Abbreviation for "Input/Output," the act of moving data into and/or
+     out of a running program.
+
+Input Record
+     A single chunk of data read in by `awk'.  Usually, an `awk' input
+     record consists of one line of text.  *Note How Input is Split
+     into Records: Records.
+
+Integer
+     A whole number, i.e. a number that does not have a fractional part.
+
+Keyword
+     In the `awk' language, a keyword is a word that has special
+     meaning.  Keywords are reserved and may not be used as variable
+     names.
+
+     `gawk''s keywords are: `BEGIN', `END', `if', `else', `while',
+     `do...while', `for', `for...in', `break', `continue', `delete',
+     `next', `nextfile', `function', `func', and `exit'.
+
+Logical Expression
+     An expression using the operators for logic, AND, OR, and NOT,
+     written `&&', `||', and `!' in `awk'. Often called Boolean
+     expressions, after the mathematician who pioneered this kind of
+     mathematical logic.
+
+Lvalue
+     An expression that can appear on the left side of an assignment
+     operator.  In most languages, lvalues can be variables or array
+     elements.  In `awk', a field designator can also be used as an
+     lvalue.
+
+Null String
+     A string with no characters in it.  It is represented explicitly in
+     `awk' programs by placing two double-quote characters next to each
+     other (`""').  It can appear in input data by having two successive
+     occurrences of the field separator appear next to each other.
+
+Number
+     A numeric valued data object.  The `gawk' implementation uses
+     double precision floating point to represent numbers.  Very old
+     `awk' implementations use single precision floating point.
+
+Octal
+     Base-eight notation, where the digits are `0'-`7'.  Octal numbers
+     are written in C using a leading `0', to indicate their base.
+     Thus, `013' is 11 (one times 8 plus 3).
+
+Pattern
+     Patterns tell `awk' which input records are interesting to which
+     rules.
+
+     A pattern is an arbitrary conditional expression against which
+     input is tested.  If the condition is satisfied, the pattern is
+     said to "match" the input record.  A typical pattern might compare
+     the input record against a regular expression.  *Note Pattern
+     Elements: Pattern Overview.
+
+POSIX
+     The name for a series of standards being developed by the IEEE
+     that specify a Portable Operating System interface.  The "IX"
+     denotes the Unix heritage of these standards.  The main standard
+     of interest for `awk' users is `IEEE Standard for Information
+     Technology, Standard 1003.2-1992, Portable Operating System
+     Interface (POSIX) Part 2: Shell and Utilities'.  Informally, this
+     standard is often referred to as simply "P1003.2."
+
+Private
+     Variables and/or functions that are meant for use exclusively by
+     library functions, and not for the main `awk' program.  Special
+     care must be taken when naming such variables and functions.
+     *Note Naming Library Function Global Variables: Library Names.
+
+Range (of input lines)
+     A sequence of consecutive lines from the input file.  A pattern
+     can specify ranges of input lines for `awk' to process, or it can
+     specify single lines.  *Note Pattern Elements: Pattern Overview.
+
+Recursion
+     When a function calls itself, either directly or indirectly.  If
+     this isn't clear, refer to the entry for "recursion."
+
+Redirection
+     Redirection means performing input from other than the standard
+     input stream, or output to other than the standard output stream.
+
+     You can redirect the output of the `print' and `printf' statements
+     to a file or a system command, using the `>', `>>', and `|'
+     operators.  You can redirect input to the `getline' statement using
+     the `<' and `|' operators.  *Note Redirecting Output of `print'
+     and `printf': Redirection, and *Note Explicit Input with
+     `getline': Getline.
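+
+     A short sketch of both directions of redirection follows.  It
+     assumes a scratch file created with the `mktemp' utility; the
+     names `tmp', `out', `line', and `n' are hypothetical.
+
```shell
tmp=$(mktemp)                     # scratch file (assumes mktemp exists)
result=$(awk -v out="$tmp" 'BEGIN {
    print "first line" > out      # output redirection opens the file
    print "second line" > out     # same open file, so this line follows the first
    close(out)                    # close before reading the file back
    while ((getline line < out) > 0)   # input redirection with getline
        n++
    print n " lines read back"
}')
rm -f "$tmp"
echo "$result"
```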
+
+Regexp
+     Short for "regular expression".  A regexp is a pattern that
+     denotes a set of strings, possibly an infinite set.  For example,
+     the regexp `R.*xp' matches any string starting with the letter `R'
+     and ending with the letters `xp'.  In `awk', regexps are used in
+     patterns and in conditional expressions.  Regexps may contain
+     escape sequences.  *Note Regular Expressions: Regexp.
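+
+     As a hedged illustration, the sketch below uses the regexp
+     `R.*xp' as a pattern.  Note that a regexp pattern succeeds when it
+     matches anywhere in the record, and that matching is
+     case-sensitive by default.  The four input lines are invented for
+     the example.
+
```shell
# Count the input lines that contain a match for R.*xp.
# "regexp" fails (lowercase r); "an Rxp here" succeeds because the
# match may occur anywhere within the record.
result=$(printf 'Regexp\nRxp\nregexp\nan Rxp here\n' |
    awk '/R.*xp/ { n++ } END { print n " of " NR " lines match" }')
echo "$result"
```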
+
+Regular Expression
+     See "regexp."
+
+Regular Expression Constant
+     A regular expression constant is a regular expression written
+     within slashes, such as `/foo/'.  This regular expression is chosen
+     when you write the `awk' program, and cannot be changed during its
+     execution.  *Note How to Use Regular Expressions: Regexp Usage.
+
+Rule
+     A segment of an `awk' program that specifies how to process single
+     input records.  A rule consists of a "pattern" and an "action".
+     `awk' reads an input record; then, for each rule, if the input
+     record satisfies the rule's pattern, `awk' executes the rule's
+     action.  Otherwise, the rule does nothing for that input record.
+
+Rvalue
+     A value that can appear on the right side of an assignment
+     operator.  In `awk', essentially every expression has a value.
+     These values are rvalues.
+
+`sed'
+     See "Stream Editor."
+
+Short-Circuit
+     The nature of the `awk' logical operators `&&' and `||'.  If the
+     value of the entire expression can be deduced from evaluating just
+     the left-hand side of these operators, the right-hand side will not
+     be evaluated (*note Boolean Expressions: Boolean Ops.).
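+
+     One way to observe short-circuit evaluation is to put a function
+     with a visible side effect on the right-hand side.  In this
+     sketch, the function `noisy' and the counter `calls' are
+     hypothetical names chosen for the example.
+
```shell
result=$(awk 'function noisy() { calls++; return 1 }
BEGIN {
    r1 = (0 && noisy())   # left side false: noisy() is skipped
    r2 = (1 || noisy())   # left side true: noisy() is skipped
    r3 = (1 && noisy())   # left side true: noisy() must run
    print r1, r2, r3, "calls=" calls
}')
echo "$result"
```
+
+     Only the third expression needs its right-hand side, so the
+     counter ends at one.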
+
+Side Effect
+     A side effect occurs when an expression has an effect aside from
+     merely producing a value.  Assignment expressions, increment and
+     decrement expressions and function calls have side effects.  *Note
+     Assignment Expressions: Assignment Ops.
+
+Single Precision
+     An internal representation of numbers that can have fractional
+     parts.  Single precision numbers keep track of fewer digits than
+     do double precision numbers, but operations on them are less
+     expensive in terms of CPU time.  This is the type used by some
+     very old versions of `awk' to store numeric values.  It is the C
+     type `float'.
+
+Space
+     The character generated by hitting the space bar on the keyboard.
+
+Special File
+     A file name interpreted internally by `gawk', instead of being
+     handed directly to the underlying operating system.  For example,
+     `/dev/stderr'.  *Note Special File Names in `gawk': Special Files.
+
+Stream Editor
+     A program that reads records from an input stream and processes
+     them one or more at a time.  This is in contrast with batch
+     programs, which may expect to read their input files in entirety
+     before starting to do anything, and with interactive programs,
+     which require input from the user.
+
+String
+     A datum consisting of a sequence of characters, such as `I am a
+     string'.  Constant strings are written with double-quotes in the
+     `awk' language, and may contain escape sequences.  *Note Escape
+     Sequences::.
+
+Tab
+     The character generated by hitting the `TAB' key on the keyboard.
+     It usually expands to up to eight spaces upon output.
+
+Unix
+     A computer operating system originally developed in the early
+     1970's at AT&T Bell Laboratories.  It initially became popular in
+     universities around the world, and later moved into commercial
+     environments as a software development system and network server
+     system.  There are many commercial versions of Unix, as well as
+     several work-alike systems whose source code is freely available
+     (such as Linux, NetBSD, and FreeBSD).
+
+Whitespace
+     A sequence of space, tab, or newline characters occurring inside
+     an input record or a string.
+
+
+File: gawk.info,  Node: Copying,  Next: Index,  Prev: Glossary,  Up: Top
+
+GNU GENERAL PUBLIC LICENSE
+**************************
+
+                         Version 2, June 1991
+
+     Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+     59 Temple Place --- Suite 330, Boston, MA 02111-1307, USA
+     
+     Everyone is permitted to copy and distribute verbatim copies
+     of this license document, but changing it is not allowed.
+
+Preamble
+========
+
+   The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.)  You can apply it to
+your programs, too.
+
+   When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it in
+new free programs; and that you know you can do these things.
+
+   To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+   For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+   We protect your rights with two steps: (1) copyright the software,
+and (2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+   Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+   Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+   The precise terms and conditions for copying, distribution and
+modification follow.
+
+    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains a
+     notice placed by the copyright holder saying it may be distributed
+     under the terms of this General Public License.  The "Program",
+     below, refers to any such program or work, and a "work based on
+     the Program" means either the Program or any derivative work under
+     copyright law: that is to say, a work containing the Program or a
+     portion of it, either verbatim or with modifications and/or
+     translated into another language.  (Hereinafter, translation is
+     included without limitation in the term "modification".)  Each
+     licensee is addressed as "you".
+
+     Activities other than copying, distribution and modification are
+     not covered by this License; they are outside its scope.  The act
+     of running the Program is not restricted, and the output from the
+     Program is covered only if its contents constitute a work based on
+     the Program (independent of having been made by running the
+     Program).  Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+     source code as you receive it, in any medium, provided that you
+     conspicuously and appropriately publish on each copy an appropriate
+     copyright notice and disclaimer of warranty; keep intact all the
+     notices that refer to this License and to the absence of any
+     warranty; and give any other recipients of the Program a copy of
+     this License along with the Program.
+
+     You may charge a fee for the physical act of transferring a copy,
+     and you may at your option offer warranty protection in exchange
+     for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+     of it, thus forming a work based on the Program, and copy and
+     distribute such modifications or work under the terms of Section 1
+     above, provided that you also meet all of these conditions:
+
+       a. You must cause the modified files to carry prominent notices
+          stating that you changed the files and the date of any change.
+
+       b. You must cause any work that you distribute or publish, that
+          in whole or in part contains or is derived from the Program
+          or any part thereof, to be licensed as a whole at no charge
+          to all third parties under the terms of this License.
+
+       c. If the modified program normally reads commands interactively
+          when run, you must cause it, when started running for such
+          interactive use in the most ordinary way, to print or display
+          an announcement including an appropriate copyright notice and
+          a notice that there is no warranty (or else, saying that you
+          provide a warranty) and that users may redistribute the
+          program under these conditions, and telling the user how to
+          view a copy of this License.  (Exception: if the Program
+          itself is interactive but does not normally print such an
+          announcement, your work based on the Program is not required
+          to print an announcement.)
+
+     These requirements apply to the modified work as a whole.  If
+     identifiable sections of that work are not derived from the
+     Program, and can be reasonably considered independent and separate
+     works in themselves, then this License, and its terms, do not
+     apply to those sections when you distribute them as separate
+     works.  But when you distribute the same sections as part of a
+     whole which is a work based on the Program, the distribution of
+     the whole must be on the terms of this License, whose permissions
+     for other licensees extend to the entire whole, and thus to each
+     and every part regardless of who wrote it.
+
+     Thus, it is not the intent of this section to claim rights or
+     contest your rights to work written entirely by you; rather, the
+     intent is to exercise the right to control the distribution of
+     derivative or collective works based on the Program.
+
+     In addition, mere aggregation of another work not based on the
+     Program with the Program (or with a work based on the Program) on
+     a volume of a storage or distribution medium does not bring the
+     other work under the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+     under Section 2) in object code or executable form under the terms
+     of Sections 1 and 2 above provided that you also do one of the
+     following:
+
+       a. Accompany it with the complete corresponding machine-readable
+          source code, which must be distributed under the terms of
+          Sections 1 and 2 above on a medium customarily used for
+          software interchange; or,
+
+       b. Accompany it with a written offer, valid for at least three
+          years, to give any third party, for a charge no more than your
+          cost of physically performing source distribution, a complete
+          machine-readable copy of the corresponding source code, to be
+          distributed under the terms of Sections 1 and 2 above on a
+          medium customarily used for software interchange; or,
+
+       c. Accompany it with the information you received as to the offer
+          to distribute corresponding source code.  (This alternative is
+          allowed only for non-commercial distribution and only if you
+          received the program in object code or executable form with
+          such an offer, in accord with Subsection b above.)
+
+     The source code for a work means the preferred form of the work for
+     making modifications to it.  For an executable work, complete
+     source code means all the source code for all modules it contains,
+     plus any associated interface definition files, plus the scripts
+     used to control compilation and installation of the executable.
+     However, as a special exception, the source code distributed need
+     not include anything that is normally distributed (in either
+     source or binary form) with the major components (compiler,
+     kernel, and so on) of the operating system on which the executable
+     runs, unless that component itself accompanies the executable.
+
+     If distribution of executable or object code is made by offering
+     access to copy from a designated place, then offering equivalent
+     access to copy the source code from the same place counts as
+     distribution of the source code, even though third parties are not
+     compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+     except as expressly provided under this License.  Any attempt
+     otherwise to copy, modify, sublicense or distribute the Program is
+     void, and will automatically terminate your rights under this
+     License.  However, parties who have received copies, or rights,
+     from you under this License will not have their licenses
+     terminated so long as such parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+     signed it.  However, nothing else grants you permission to modify
+     or distribute the Program or its derivative works.  These actions
+     are prohibited by law if you do not accept this License.
+     Therefore, by modifying or distributing the Program (or any work
+     based on the Program), you indicate your acceptance of this
+     License to do so, and all its terms and conditions for copying,
+     distributing or modifying the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+     Program), the recipient automatically receives a license from the
+     original licensor to copy, distribute or modify the Program
+     subject to these terms and conditions.  You may not impose any
+     further restrictions on the recipients' exercise of the rights
+     granted herein.  You are not responsible for enforcing compliance
+     by third parties to this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+     infringement or for any other reason (not limited to patent
+     issues), conditions are imposed on you (whether by court order,
+     agreement or otherwise) that contradict the conditions of this
+     License, they do not excuse you from the conditions of this
+     License.  If you cannot distribute so as to satisfy simultaneously
+     your obligations under this License and any other pertinent
+     obligations, then as a consequence you may not distribute the
+     Program at all.  For example, if a patent license would not permit
+     royalty-free redistribution of the Program by all those who
+     receive copies directly or indirectly through you, then the only
+     way you could satisfy both it and this License would be to refrain
+     entirely from distribution of the Program.
+
+     If any portion of this section is held invalid or unenforceable
+     under any particular circumstance, the balance of the section is
+     intended to apply and the section as a whole is intended to apply
+     in other circumstances.
+
+     It is not the purpose of this section to induce you to infringe any
+     patents or other property right claims or to contest validity of
+     any such claims; this section has the sole purpose of protecting
+     the integrity of the free software distribution system, which is
+     implemented by public license practices.  Many people have made
+     generous contributions to the wide range of software distributed
+     through that system in reliance on consistent application of that
+     system; it is up to the author/donor to decide if he or she is
+     willing to distribute software through any other system and a
+     licensee cannot impose that choice.
+
+     This section is intended to make thoroughly clear what is believed
+     to be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+     certain countries either by patents or by copyrighted interfaces,
+     the original copyright holder who places the Program under this
+     License may add an explicit geographical distribution limitation
+     excluding those countries, so that distribution is permitted only
+     in or among countries not thus excluded.  In such case, this
+     License incorporates the limitation as if written in the body of
+     this License.
+
+  9. The Free Software Foundation may publish revised and/or new
+     versions of the General Public License from time to time.  Such
+     new versions will be similar in spirit to the present version, but
+     may differ in detail to address new problems or concerns.
+
+     Each version is given a distinguishing version number.  If the
+     Program specifies a version number of this License which applies
+     to it and "any later version", you have the option of following
+     the terms and conditions either of that version or of any later
+     version published by the Free Software Foundation.  If the Program
+     does not specify a version number of this License, you may choose
+     any version ever published by the Free Software Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+     programs whose distribution conditions are different, write to the
+     author to ask for permission.  For software which is copyrighted
+     by the Free Software Foundation, write to the Free Software
+     Foundation; we sometimes make exceptions for this.  Our decision
+     will be guided by the two goals of preserving the free status of
+     all derivatives of our free software and of promoting the sharing
+     and reuse of software generally.
+
+                                NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO
+     WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE
+     LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+     HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT
+     WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT
+     NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
+     FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS TO THE
+     QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+     PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
+     SERVICING, REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
+     WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY
+     MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE
+     LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL,
+     INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR
+     INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+     DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU
+     OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY
+     OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
+     ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
+
+                      END OF TERMS AND CONDITIONS
+
+How to Apply These Terms to Your New Programs
+=============================================
+
+   If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these
+terms.
+
+   To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+     ONE LINE TO GIVE THE PROGRAM'S NAME AND AN IDEA OF WHAT IT DOES.
+     Copyright (C) 19YY  NAME OF AUTHOR
+     
+     This program is free software; you can redistribute it and/or
+     modify it under the terms of the GNU General Public License
+     as published by the Free Software Foundation; either version 2
+     of the License, or (at your option) any later version.
+     
+     This program is distributed in the hope that it will be useful,
+     but WITHOUT ANY WARRANTY; without even the implied warranty of
+     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+     GNU General Public License for more details.
+     
+     You should have received a copy of the GNU General Public License
+     along with this program; if not, write to the Free Software
+     Foundation, Inc., 59 Temple Place --- Suite 330, Boston, MA 02111-1307, USA.
+
+   Also add information on how to contact you by electronic and paper
+mail.
+
+   If the program is interactive, make it output a short notice like
+this when it starts in an interactive mode:
+
+     Gnomovision version 69, Copyright (C) 19YY NAME OF AUTHOR
+     Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
+     type `show w'.  This is free software, and you are welcome
+     to redistribute it under certain conditions; type `show c'
+     for details.
+
+   The hypothetical commands `show w' and `show c' should show the
+appropriate parts of the General Public License.  Of course, the
+commands you use may be called something other than `show w' and `show
+c'; they could even be mouse-clicks or menu items--whatever suits your
+program.
+
+   You should also get your employer (if you work as a programmer) or
+your school, if any, to sign a "copyright disclaimer" for the program,
+if necessary.  Here is a sample; alter the names:
+
+     Yoyodyne, Inc., hereby disclaims all copyright
+     interest in the program `Gnomovision'
+     (which makes passes at compilers) written
+     by James Hacker.
+     
+     SIGNATURE OF TY COON, 1 April 1989
+     Ty Coon, President of Vice
+
+   This General Public License does not permit incorporating your
+program into proprietary programs.  If your program is a subroutine
+library, you may consider it more useful to permit linking proprietary
+applications with the library.  If this is what you want to do, use the
+GNU Library General Public License instead of this License.
+
+
+File: gawk.info,  Node: Index,  Prev: Copying,  Up: Top
+
+Index
+*****
+
+* Menu:
+
+* ! operator:                            Boolean Ops.
+* != operator:                           Typing and Comparison.
+* !~ operator <1>:                       Typing and Comparison.
+* !~ operator <2>:                       Regexp Constants.
+* !~ operator <3>:                       Computed Regexps.
+* !~ operator <4>:                       Case-sensitivity.
+* !~ operator:                           Regexp Usage.
+* # (comment):                           Comments.
+* #! (executable scripts):               Executable Scripts.
+* $ (field operator):                    Fields.
+* && operator:                           Boolean Ops.
+* --assign option:                       Options.
+* --compat option:                       Options.
+* --copyleft option:                     Options.
+* --copyright option:                    Options.
+* --field-separator option:              Options.
+* --file option:                         Options.
+* --help option:                         Options.
+* --lint option:                         Options.
+* --lint-old option:                     Options.
+* --posix option:                        Options.
+* --source option:                       Options.
+* --traditional option:                  Options.
+* --usage option:                        Options.
+* --version option:                      Options.
+* -f option:                             Options.
+* -F option <1>:                         Options.
+* -F option:                             Command Line Field Separator.
+* -f option:                             Long.
+* -v option:                             Options.
+* -W option:                             Options.
+* /dev/fd:                               Special Files.
+* /dev/pgrpid:                           Special Files.
+* /dev/pid:                              Special Files.
+* /dev/ppid:                             Special Files.
+* /dev/stderr:                           Special Files.
+* /dev/stdin:                            Special Files.
+* /dev/stdout:                           Special Files.
+* /dev/user <1>:                         Passwd Functions.
+* /dev/user:                             Special Files.
+* < operator:                            Typing and Comparison.
+* <= operator:                           Typing and Comparison.
+* == operator:                           Typing and Comparison.
+* > operator:                            Typing and Comparison.
+* >= operator:                           Typing and Comparison.
+* \' regexp operator:                    GNU Regexp Operators.
+* \< regexp operator:                    GNU Regexp Operators.
+* \> regexp operator:                    GNU Regexp Operators.
+* \` regexp operator:                    GNU Regexp Operators.
+* \B regexp operator:                    GNU Regexp Operators.
+* \W regexp operator:                    GNU Regexp Operators.
+* \w regexp operator:                    GNU Regexp Operators.
+* \y regexp operator:                    GNU Regexp Operators.
+* _gr_init:                              Group Functions.
+* _pw_init:                              Passwd Functions.
+* _tm_addup:                             Mktime Function.
+* _tm_isleap:                            Mktime Function.
+* accessing fields:                      Fields.
+* account information <1>:               Group Functions.
+* account information:                   Passwd Functions.
+* acronym:                               History.
+* action, curly braces:                  Action Overview.
+* action, default:                       Very Simple.
+* action, definition of:                 Action Overview.
+* action, empty:                         Very Simple.
+* action, separating statements:         Action Overview.
+* adding new features:                   Adding Code.
+* addition:                              Arithmetic Ops.
+* Aho, Alfred:                           History.
+* AI programming, using gawk:            Distribution contents.
+* alarm.awk:                             Alarm Program.
+* amiga:                                 Amiga Installation.
+* anchors in regexps:                    Regexp Operators.
+* and operator:                          Boolean Ops.
+* anonymous ftp <1>:                     Other Versions.
+* anonymous ftp:                         Getting.
+* applications of awk:                   When.
+* ARGC:                                  Auto-set.
+* ARGIND <1>:                            Other Arguments.
+* ARGIND:                                Auto-set.
+* argument processing:                   Getopt Function.
+* arguments in function call:            Function Calls.
+* arguments, command line:               Invoking Gawk.
+* ARGV <1>:                              Other Arguments.
+* ARGV:                                  Auto-set.
+* arithmetic operators:                  Arithmetic Ops.
+* array assignment:                      Assigning Elements.
+* array reference:                       Reference to Elements.
+* array subscripts, uninitialized variables: Uninitialized Subscripts.
+* arrays:                                Array Intro.
+* arrays, associative:                   Array Intro.
+* arrays, definition of:                 Array Intro.
+* arrays, deleting an element:           Delete.
+* arrays, deleting entire contents:      Delete.
+* arrays, multi-dimensional subscripts:  Multi-dimensional.
+* arrays, presence of elements:          Reference to Elements.
+* arrays, sparse:                        Array Intro.
+* arrays, special for statement:         Scanning an Array.
+* arrays, the in operator:               Reference to Elements.
+* artificial intelligence, using gawk:   Distribution contents.
+* ASCII:                                 Ordinal Functions.
+* assert:                                Assert Function.
+* assert, C version:                     Assert Function.
+* assertions:                            Assert Function.
+* assignment operators:                  Assignment Ops.
+* assignment to fields:                  Changing Fields.
+* associative arrays:                    Array Intro.
+* atan2:                                 Numeric Functions.
+* atari:                                 Atari Installation.
+* automatic initialization:              More Complex.
+* awk language, POSIX version <1>:       Definition Syntax.
+* awk language, POSIX version <2>:       String Functions.
+* awk language, POSIX version <3>:       User-modified.
+* awk language, POSIX version <4>:       Next Statement.
+* awk language, POSIX version <5>:       Continue Statement.
+* awk language, POSIX version <6>:       Break Statement.
+* awk language, POSIX version <7>:       Precedence.
+* awk language, POSIX version <8>:       Assignment Ops.
+* awk language, POSIX version <9>:       Arithmetic Ops.
+* awk language, POSIX version <10>:      Conversion.
+* awk language, POSIX version <11>:      Format Modifiers.
+* awk language, POSIX version <12>:      OFMT.
+* awk language, POSIX version <13>:      Field Splitting Summary.
+* awk language, POSIX version <14>:      Regexp Operators.
+* awk language, POSIX version:           Escape Sequences.
+* awk language, V.4 version <1>:         SVR4.
+* awk language, V.4 version:             Escape Sequences.
+* AWKPATH environment variable:          AWKPATH Variable.
+* awksed:                                Simple Sed.
+* backslash continuation <1>:            Egrep Program.
+* backslash continuation:                Statements/Lines.
+* backslash continuation and comments:   Statements/Lines.
+* backslash continuation in csh <1>:     Statements/Lines.
+* backslash continuation in csh:         More Complex.
+* basic function of awk:                 Getting Started.
+* BBS-list file:                         Sample Data Files.
+* BEGIN special pattern:                 BEGIN/END.
+* beginfile:                             Filetrans Function.
+* body of a loop:                        While Statement.
+* book, using this:                      This Manual.
+* boolean expressions:                   Boolean Ops.
+* boolean operators:                     Boolean Ops.
+* break statement:                       Break Statement.
+* break, outside of loops:               Break Statement.
+* Brennan, Michael <1>:                  Other Versions.
+* Brennan, Michael <2>:                  Simple Sed.
+* Brennan, Michael:                      Delete.
+* buffer matching operators:             GNU Regexp Operators.
+* buffering output:                      I/O Functions.
+* buffering, interactive vs. non-interactive: I/O Functions.
+* buffering, non-interactive vs. interactive: I/O Functions.
+* buffers, flushing:                     I/O Functions.
+* bugs, known in gawk:                   Known Bugs.
+* built-in functions:                    Built-in.
+* built-in variables:                    Built-in Variables.
+* built-in variables, convey information: Auto-set.
+* built-in variables, user modifiable:   User-modified.
+* call by reference:                     Function Caveats.
+* call by value:                         Function Caveats.
+* calling a function <1>:                Function Caveats.
+* calling a function:                    Function Calls.
+* case conversion:                       String Functions.
+* case sensitivity:                      Case-sensitivity.
+* changing contents of a field:          Changing Fields.
+* changing the record separator:         Records.
+* character classes:                     Regexp Operators.
+* character encodings:                   Ordinal Functions.
+* character list:                        Regexp Operators.
+* character list, complemented:          Regexp Operators.
+* character sets:                        Ordinal Functions.
+* chr:                                   Ordinal Functions.
+* close <1>:                             I/O Functions.
+* close:                                 Close Files And Pipes.
+* closing input files and pipes:         Close Files And Pipes.
+* closing output files and pipes:        Close Files And Pipes.
+* coding style used in gawk:             Adding Code.
+* collating elements:                    Regexp Operators.
+* collating symbols:                     Regexp Operators.
+* command line:                          Invoking Gawk.
+* command line formats:                  Running gawk.
+* command line, setting FS on:           Command Line Field Separator.
+* comments:                              Comments.
+* comments and backslash continuation:   Statements/Lines.
+* common mistakes <1>:                   Typing and Comparison.
+* common mistakes <2>:                   Print Examples.
+* common mistakes <3>:                   Basic Field Splitting.
+* common mistakes:                       Computed Regexps.
+* comp.lang.awk:                         Bugs.
+* comparison expressions:                Typing and Comparison.
+* comparisons, string vs. regexp:        Typing and Comparison.
+* compatibility mode <1>:                POSIX/GNU.
+* compatibility mode:                    Options.
+* complemented character list:           Regexp Operators.
+* compound statement:                    Statements.
+* computed regular expressions:          Computed Regexps.
+* concatenation:                         Concatenation.
+* conditional expression:                Conditional Exp.
+* configuring gawk:                      Configuration Philosophy.
+* constants, types of:                   Constants.
+* continuation of lines:                 Statements/Lines.
+* continue statement:                    Continue Statement.
+* continue, outside of loops:            Continue Statement.
+* control statement:                     Statements.
+* conversion of case:                    String Functions.
+* conversion of strings and numbers:     Conversion.
+* conversions, during subscripting:      Numeric Array Subscripts.
+* converting dates to timestamps:        Mktime Function.
+* CONVFMT <1>:                           Numeric Array Subscripts.
+* CONVFMT <2>:                           User-modified.
+* CONVFMT:                               Conversion.
+* cos:                                   Numeric Functions.
+* csh, backslash continuation <1>:       Statements/Lines.
+* csh, backslash continuation:           More Complex.
+* curly braces:                          Action Overview.
+* custom.h configuration file:           Configuration Philosophy.
+* cut utility:                           Cut Program.
+* cut.awk:                               Cut Program.
+* d.c., see "dark corner":               This Manual.
+* dark corner <1>:                       Other Arguments.
+* dark corner <2>:                       Invoking Gawk.
+* dark corner <3>:                       String Functions.
+* dark corner <4>:                       Uninitialized Subscripts.
+* dark corner <5>:                       Auto-set.
+* dark corner <6>:                       Exit Statement.
+* dark corner <7>:                       Continue Statement.
+* dark corner <8>:                       Break Statement.
+* dark corner <9>:                       Using BEGIN/END.
+* dark corner <10>:                      Truth Values.
+* dark corner <11>:                      Conversion.
+* dark corner <12>:                      Assignment Options.
+* dark corner <13>:                      Using Constant Regexps.
+* dark corner <14>:                      Format Modifiers.
+* dark corner <15>:                      Control Letters.
+* dark corner <16>:                      OFMT.
+* dark corner <17>:                      Getline Summary.
+* dark corner <18>:                      Plain Getline.
+* dark corner <19>:                      Multiple Line.
+* dark corner <20>:                      Field Splitting Summary.
+* dark corner <21>:                      Single Character Fields.
+* dark corner <22>:                      Records.
+* dark corner <23>:                      Escape Sequences.
+* dark corner:                           This Manual.
+* data-driven languages:                 Getting Started.
+* dates, converting to timestamps:       Mktime Function.
+* decrement operators:                   Increment Ops.
+* default action:                        Very Simple.
+* default pattern:                       Very Simple.
+* defining functions:                    Definition Syntax.
+* Deifik, Scott <1>:                     Bugs.
+* Deifik, Scott:                         Acknowledgements.
+* delete statement:                      Delete.
+* deleting elements of arrays:           Delete.
+* deleting entire arrays:                Delete.
+* deprecated features:                   Obsolete.
+* deprecated options:                    Obsolete.
+* differences between gawk and awk <1>:  AWKPATH Variable.
+* differences between gawk and awk <2>:  String Functions.
+* differences between gawk and awk <3>:  Calling Built-in.
+* differences between gawk and awk <4>:  Delete.
+* differences between gawk and awk <5>:  Nextfile Statement.
+* differences between gawk and awk <6>:  I/O And BEGIN/END.
+* differences between gawk and awk <7>:  Conditional Exp.
+* differences between gawk and awk <8>:  Arithmetic Ops.
+* differences between gawk and awk <9>:  Using Constant Regexps.
+* differences between gawk and awk <10>: Scalar Constants.
+* differences between gawk and awk <11>: Close Files And Pipes.
+* differences between gawk and awk <12>: Special Files.
+* differences between gawk and awk <13>: Redirection.
+* differences between gawk and awk <14>: Getline Summary.
+* differences between gawk and awk <15>: Getline Intro.
+* differences between gawk and awk <16>: Single Character Fields.
+* differences between gawk and awk <17>: Records.
+* differences between gawk and awk:      Case-sensitivity.
+* directory search:                      AWKPATH Variable.
+* division:                              Arithmetic Ops.
+* documenting awk programs <1>:          Library Names.
+* documenting awk programs:              Comments.
+* dupword.awk:                           Dupword Program.
+* dynamic regular expressions:           Computed Regexps.
+* EBCDIC:                                Ordinal Functions.
+* egrep <1>:                             Regexp Operators.
+* egrep:                                 One-shot.
+* egrep utility:                         Egrep Program.
+* egrep.awk:                             Egrep Program.
+* element assignment:                    Assigning Elements.
+* element of array:                      Reference to Elements.
+* empty action:                          Very Simple.
+* empty pattern:                         Empty.
+* empty program:                         Invoking Gawk.
+* empty string <1>:                      Truth Values.
+* empty string <2>:                      Conversion.
+* empty string <3>:                      Regexp Field Splitting.
+* empty string:                          Records.
+* END special pattern:                   BEGIN/END.
+* endfile:                               Filetrans Function.
+* endgrent:                              Group Functions.
+* endpwent:                              Passwd Functions.
+* ENVIRON:                               Auto-set.
+* environment variable, AWKPATH:         AWKPATH Variable.
+* environment variable, POSIXLY_CORRECT: Options.
+* equivalence classes:                   Regexp Operators.
+* ERRNO <1>:                             Auto-set.
+* ERRNO <2>:                             Close Files And Pipes.
+* ERRNO:                                 Getline Intro.
+* errors, common <1>:                    Typing and Comparison.
+* errors, common <2>:                    Print Examples.
+* errors, common <3>:                    Basic Field Splitting.
+* errors, common:                        Computed Regexps.
+* escape processing, sub et al.:         String Functions.
+* escape sequence notation:              Escape Sequences.
+* evaluation, order of:                  Calling Built-in.
+* examining fields:                      Fields.
+* executable scripts:                    Executable Scripts.
+* exit statement:                        Exit Statement.
+* exp:                                   Numeric Functions.
+* explicit input:                        Getline.
+* exponentiation:                        Arithmetic Ops.
+* expression:                            Expressions.
+* expression, assignment:                Assignment Ops.
+* expression, boolean:                   Boolean Ops.
+* expression, comparison:                Typing and Comparison.
+* expression, conditional:               Conditional Exp.
+* expression, matching:                  Typing and Comparison.
+* extract.awk:                           Extract Program.
+* features, adding:                      Adding Code.
+* fflush:                                I/O Functions.
+* field operator $:                      Fields.
+* field separator, choice of:            Basic Field Splitting.
+* field separator, FS:                   Basic Field Splitting.
+* field separator, on command line:      Command Line Field Separator.
+* field, changing contents of:           Changing Fields.
+* fields:                                Fields.
+* fields, separating:                    Basic Field Splitting.
+* FIELDWIDTHS:                           User-modified.
+* file descriptors:                      Special Files.
+* file, awk program:                     Long.
+* FILENAME <1>:                          Auto-set.
+* FILENAME <2>:                          Getline Summary.
+* FILENAME:                              Reading Files.
+* FILENAME, being set by getline:        Getline Summary.
+* Fish, Fred:                            Bugs.
+* flushing buffers:                      I/O Functions.
+* FNR <1>:                               Auto-set.
+* FNR:                                   Records.
+* for (x in ...):                        Scanning an Array.
+* for statement:                         For Statement.
+* format specifier:                      Control Letters.
+* format string:                         Basic Printf.
+* format, numeric output:                OFMT.
+* formatted output:                      Printf.
+* formatted timestamps:                  Gettimeofday Function.
+* Free Software Foundation <1>:          Getting.
+* Free Software Foundation:              Manual History.
+* FreeBSD:                               Manual History.
+* Friedl, Jeffrey:                       Acknowledgements.
+* FS <1>:                                User-modified.
+* FS:                                    Basic Field Splitting.
+* ftp, anonymous <1>:                    Other Versions.
+* ftp, anonymous:                        Getting.
+* function call <1>:                     Function Caveats.
+* function call:                         Function Calls.
+* function definition:                   Definition Syntax.
+* function, recursive:                   Definition Syntax.
+* functions, undefined:                  Function Caveats.
+* functions, user-defined:               User-defined.
+* gawk coding style:                     Adding Code.
+* gensub:                                String Functions.
+* getgrent:                              Group Functions.
+* getgrent, C version:                   Group Functions.
+* getgrgid:                              Group Functions.
+* getgrnam:                              Group Functions.
+* getgruser:                             Group Functions.
+* getline:                               Getline.
+* getline, return values:                Getline Intro.
+* getline, setting FILENAME:             Getline Summary.
+* getopt:                                Getopt Function.
+* getopt, C version:                     Getopt Function.
+* getpwent:                              Passwd Functions.
+* getpwent, C version:                   Passwd Functions.
+* getpwnam:                              Passwd Functions.
+* getpwuid:                              Passwd Functions.
+* gettimeofday:                          Gettimeofday Function.
+* getting gawk:                          Getting.
+* GNU Project:                           Manual History.
+* grcat program:                         Group Functions.
+* grcat.c:                               Group Functions.
+* group file:                            Group Functions.
+* group information:                     Group Functions.
+* gsub:                                  String Functions.
+* gsub, third argument of:               String Functions.
+* Hankerson, Darrel <1>:                 Bugs.
+* Hankerson, Darrel:                     Acknowledgements.
+* historical features <1>:               Historical Features.
+* historical features <2>:               String Functions.
+* historical features <3>:               Continue Statement.
+* historical features <4>:               Break Statement.
+* historical features:                   Command Line Field Separator.
+* history of awk:                        History.
+* histsort.awk:                          History Sorting.
+* how awk works:                         Two Rules.
+* Hughes, Phil:                          Acknowledgements.
+* I/O from BEGIN and END:                I/O And BEGIN/END.
+* id utility:                            Id Program.
+* id.awk:                                Id Program.
+* if-else statement:                     If Statement.
+* igawk.sh:                              Igawk Program.
+* IGNORECASE <1>:                        User-modified.
+* IGNORECASE:                            Case-sensitivity.
+* ignoring case:                         Case-sensitivity.
+* implementation limits <1>:             Redirection.
+* implementation limits:                 Getline Summary.
+* in operator:                           Typing and Comparison.
+* increment operators:                   Increment Ops.
+* index:                                 String Functions.
+* initialization, automatic:             More Complex.
+* input:                                 Reading Files.
+* input file, sample:                    Sample Data Files.
+* input files, skipping:                 Nextfile Function.
+* input pipeline:                        Getline/Pipe.
+* input redirection:                     Getline/File.
+* input, explicit:                       Getline.
+* input, getline command:                Getline.
+* input, multiple line records:          Multiple Line.
+* input, standard:                       Read Terminal.
+* installation, amiga:                   Amiga Installation.
+* installation, atari:                   Atari Installation.
+* installation, MS-DOS and OS/2:         PC Installation.
+* installation, unix:                    Quick Installation.
+* installation, vms:                     VMS Installation.
+* int:                                   Numeric Functions.
+* interaction, awk and other programs:   I/O Functions.
+* interactive buffering vs. non-interactive: I/O Functions.
+* interval expressions:                  Regexp Operators.
+* inventory-shipped file:                Sample Data Files.
+* invocation of gawk:                    Invoking Gawk.
+* ISO 8601:                              Time Functions.
+* ISO 8859-1 <1>:                        Glossary.
+* ISO 8859-1:                            Case-sensitivity.
+* ISO Latin-1 <1>:                       Glossary.
+* ISO Latin-1:                           Case-sensitivity.
+* Jaegermann, Michal <1>:                Bugs.
+* Jaegermann, Michal:                    Acknowledgements.
+* join:                                  Join Function.
+* Kernighan, Brian <1>:                  Other Versions.
+* Kernighan, Brian <2>:                  BTL.
+* Kernighan, Brian <3>:                  Acknowledgements.
+* Kernighan, Brian:                      History.
+* known bugs:                            Known Bugs.
+* labels.awk:                            Labels Program.
+* language, awk:                         This Manual.
+* language, data-driven:                 Getting Started.
+* language, procedural:                  Getting Started.
+* leftmost longest match <1>:            Multiple Line.
+* leftmost longest match:                Leftmost Longest.
+* length:                                String Functions.
+* limitations <1>:                       Redirection.
+* limitations:                           Getline Summary.
+* line break:                            Statements/Lines.
+* line continuation <1>:                 Conditional Exp.
+* line continuation <2>:                 Boolean Ops.
+* line continuation <3>:                 Print Examples.
+* line continuation:                     Statements/Lines.
+* Linux <1>:                             Atari Compiling.
+* Linux:                                 Manual History.
+* locale, definition of:                 Time Functions.
+* log:                                   Numeric Functions.
+* logical false:                         Truth Values.
+* logical operations:                    Boolean Ops.
+* logical true:                          Truth Values.
+* login information:                     Passwd Functions.
+* long options:                          Invoking Gawk.
+* loop:                                  While Statement.
+* loops, exiting:                        Break Statement.
+* lvalue:                                Assignment Ops.
+* mark parity:                           Ordinal Functions.
+* match:                                 String Functions.
+* matching ranges of lines:              Ranges.
+* matching, leftmost longest <1>:        Multiple Line.
+* matching, leftmost longest:            Leftmost Longest.
+* mawk:                                  Other Versions.
+* merging strings:                       Join Function.
+* metacharacters:                        Regexp Operators.
+* mistakes, common <1>:                  Typing and Comparison.
+* mistakes, common <2>:                  Print Examples.
+* mistakes, common <3>:                  Basic Field Splitting.
+* mistakes, common:                      Computed Regexps.
+* mktime:                                Mktime Function.
+* modifiers (in format specifiers):      Format Modifiers.
+* multi-dimensional subscripts:          Multi-dimensional.
+* multiple line records:                 Multiple Line.
+* multiple passes over data:             Other Arguments.
+* multiple statements on one line:       Statements/Lines.
+* multiplication:                        Arithmetic Ops.
+* names, use of:                         Definition Syntax.
+* namespace issues in awk:               Library Names.
+* namespaces:                            Definition Syntax.
+* NetBSD:                                Manual History.
+* new awk:                               History.
+* new awk vs. old awk:                   Names.
+* newline:                               Statements/Lines.
+* next file statement:                   Nextfile Statement.
+* next statement:                        Next Statement.
+* next, inside a user-defined function:  Next Statement.
+* nextfile function:                     Nextfile Function.
+* nextfile statement:                    Nextfile Statement.
+* NF <1>:                                Auto-set.
+* NF:                                    Fields.
+* non-interactive buffering vs. interactive: I/O Functions.
+* not operator:                          Boolean Ops.
+* NR <1>:                                Auto-set.
+* NR:                                    Records.
+* null string <1>:                       Truth Values.
+* null string <2>:                       Conversion.
+* null string:                           Regexp Field Splitting.
+* null string, as array subscript:       Uninitialized Subscripts.
+* number of fields, NF:                  Fields.
+* number of records, NR, FNR:            Records.
+* numbers, used as subscripts:           Numeric Array Subscripts.
+* numeric character values:              Ordinal Functions.
+* numeric constant:                      Scalar Constants.
+* numeric output format:                 OFMT.
+* numeric string:                        Typing and Comparison.
+* numeric value:                         Scalar Constants.
+* obsolete features:                     Obsolete.
+* obsolete options:                      Obsolete.
+* OFMT <1>:                              User-modified.
+* OFMT <2>:                              Conversion.
+* OFMT:                                  OFMT.
+* OFS <1>:                               User-modified.
+* OFS:                                   Output Separators.
+* old awk:                               History.
+* old awk vs. new awk:                   Names.
+* one-liners:                            One-liners.
+* operations, logical:                   Boolean Ops.
+* operator precedence:                   Precedence.
+* operators, arithmetic:                 Arithmetic Ops.
+* operators, assignment:                 Assignment Ops.
+* operators, boolean:                    Boolean Ops.
+* operators, decrement:                  Increment Ops.
+* operators, increment:                  Increment Ops.
+* operators, regexp matching:            Regexp Usage.
+* operators, relational:                 Typing and Comparison.
+* operators, short-circuit:              Boolean Ops.
+* operators, string:                     Concatenation.
+* operators, string-matching:            Regexp Usage.
+* options, command line:                 Invoking Gawk.
+* options, long:                         Invoking Gawk.
+* or operator:                           Boolean Ops.
+* ord:                                   Ordinal Functions.
+* order of evaluation:                   Calling Built-in.
+* ORS <1>:                               User-modified.
+* ORS:                                   Output Separators.
+* output:                                Printing.
+* output field separator, OFS:           Output Separators.
+* output format specifier, OFMT:         OFMT.
+* output record separator, ORS:          Output Separators.
+* output redirection:                    Redirection.
+* output, buffering:                     I/O Functions.
+* output, formatted:                     Printf.
+* output, piping:                        Redirection.
+* passes, multiple:                      Other Arguments.
+* password file:                         Passwd Functions.
+* path, search:                          AWKPATH Variable.
+* pattern, BEGIN:                        BEGIN/END.
+* pattern, default:                      Very Simple.
+* pattern, definition of:                Patterns and Actions.
+* pattern, empty:                        Empty.
+* pattern, END:                          BEGIN/END.
+* pattern, range:                        Ranges.
+* pattern, regular expressions:          Regexp.
+* patterns, types of:                    Kinds of Patterns.
+* per file initialization and clean-up:  Filetrans Function.
+* PERL:                                  Future Extensions.
+* pipeline, input:                       Getline/Pipe.
+* pipes for output:                      Redirection.
+* portability issues <1>:                Portability Notes.
+* portability issues <2>:                Definition Syntax.
+* portability issues <3>:                I/O Functions.
+* portability issues <4>:                String Functions.
+* portability issues <5>:                Delete.
+* portability issues <6>:                Close Files And Pipes.
+* portability issues <7>:                Escape Sequences.
+* portability issues:                    Statements/Lines.
+* porting gawk:                          New Ports.
+* POSIX awk <1>:                         Definition Syntax.
+* POSIX awk <2>:                         String Functions.
+* POSIX awk <3>:                         User-modified.
+* POSIX awk <4>:                         Next Statement.
+* POSIX awk <5>:                         Continue Statement.
+* POSIX awk <6>:                         Break Statement.
+* POSIX awk <7>:                         Precedence.
+* POSIX awk <8>:                         Assignment Ops.
+* POSIX awk <9>:                         Arithmetic Ops.
+* POSIX awk <10>:                        Conversion.
+* POSIX awk <11>:                        Format Modifiers.
+* POSIX awk <12>:                        OFMT.
+* POSIX awk <13>:                        Field Splitting Summary.
+* POSIX awk <14>:                        Regexp Operators.
+* POSIX awk:                             Escape Sequences.
+* POSIX mode:                            Options.
+* POSIXLY_CORRECT environment variable:  Options.
+* precedence:                            Precedence.
+* precedence, regexp operators:          Regexp Operators.
+* print statement:                       Print.
+* printf statement, syntax of:           Basic Printf.
+* printf, format-control characters:     Control Letters.
+* printf, modifiers:                     Format Modifiers.
+* printing:                              Printing.
+* procedural languages:                  Getting Started.
+* process information:                   Special Files.
+* processing arguments:                  Getopt Function.
+* program file:                          Long.
+* program, awk:                          This Manual.
+* program, definition of:                Getting Started.
+* program, self-contained:               Executable Scripts.
+* programs, documenting <1>:             Library Names.
+* programs, documenting:                 Comments.
+* pwcat program:                         Passwd Functions.
+* pwcat.c:                               Passwd Functions.
+* quotient:                              Arithmetic Ops.
+* quoting, shell <1>:                    Long.
+* quoting, shell:                        Read Terminal.
+* Rakitzis, Byron:                       History Sorting.
+* rand:                                  Numeric Functions.
+* random numbers, seed of:               Numeric Functions.
+* range pattern:                         Ranges.
+* Rankin, Pat <1>:                       Bugs.
+* Rankin, Pat <2>:                       Assignment Ops.
+* Rankin, Pat:                           Acknowledgements.
+* reading files:                         Reading Files.
+* reading files, getline command:        Getline.
+* reading files, multiple line records:  Multiple Line.
+* record separator, RS:                  Records.
+* record terminator, RT:                 Records.
+* record, definition of:                 Records.
+* records, multiple line:                Multiple Line.
+* recursive function:                    Definition Syntax.
+* redirection of input:                  Getline/File.
+* redirection of output:                 Redirection.
+* reference to array:                    Reference to Elements.
+* regexp:                                Regexp.
+* regexp as expression:                  Typing and Comparison.
+* regexp comparison vs. string comparison: Typing and Comparison.
+* regexp constant:                       Regexp Usage.
+* regexp constants, difference between slashes and quotes: Computed Regexps.
+* regexp match/non-match operators <1>:  Typing and Comparison.
+* regexp match/non-match operators:      Regexp Usage.
+* regexp matching operators:             Regexp Usage.
+* regexp operators:                      Regexp Operators.
+* regexp operators, GNU specific:        GNU Regexp Operators.
+* regexp operators, precedence of:       Regexp Operators.
+* regexp, anchors:                       Regexp Operators.
+* regexp, dynamic:                       Computed Regexps.
+* regexp, effect of command line options: GNU Regexp Operators.
+* regular expression:                    Regexp.
+* regular expression metacharacters:     Regexp Operators.
+* regular expressions as field separators: Basic Field Splitting.
+* regular expressions as patterns:       Regexp.
+* regular expressions as record separators: Records.
+* regular expressions, computed:         Computed Regexps.
+* relational operators:                  Typing and Comparison.
+* remainder:                             Arithmetic Ops.
+* removing elements of arrays:           Delete.
+* return statement:                      Return Statement.
+* RFC-1036:                              Time Functions.
+* RFC-822:                               Time Functions.
+* RLENGTH <1>:                           String Functions.
+* RLENGTH:                               Auto-set.
+* Robbins, Miriam:                       Acknowledgements.
+* Rommel, Kai Uwe <1>:                   Bugs.
+* Rommel, Kai Uwe:                       Acknowledgements.
+* round:                                 Round Function.
+* rounding:                              Round Function.
+* RS <1>:                                User-modified.
+* RS:                                    Records.
+* RSTART <1>:                            String Functions.
+* RSTART:                                Auto-set.
+* RT <1>:                                Auto-set.
+* RT <2>:                                Multiple Line.
+* RT:                                    Records.
+* rule, definition of:                   Getting Started.
+* running awk programs:                  Running gawk.
+* running long programs:                 Long.
+* rvalue:                                Assignment Ops.
+* sample input file:                     Sample Data Files.
+* scanning an array:                     Scanning an Array.
+* script, definition of:                 Getting Started.
+* scripts, executable:                   Executable Scripts.
+* scripts, shell:                        Executable Scripts.
+* search path:                           AWKPATH Variable.
+* search path, for source files:         AWKPATH Variable.
+* sed utility <1>:                       Igawk Program.
+* sed utility <2>:                       Simple Sed.
+* sed utility:                           Field Splitting Summary.
+* seed for random numbers:               Numeric Functions.
+* self contained programs:               Executable Scripts.
+* shell quoting <1>:                     Long.
+* shell quoting:                         Read Terminal.
+* shell scripts:                         Executable Scripts.
+* short-circuit operators:               Boolean Ops.
+* side effect:                           Assignment Ops.
+* simple stream editor:                  Simple Sed.
+* sin:                                   Numeric Functions.
+* single character fields:               Single Character Fields.
+* single quotes, why needed:             One-shot.
+* skipping input files:                  Nextfile Function.
+* skipping lines between markers:        Ranges.
+* sparse arrays:                         Array Intro.
+* split:                                 String Functions.
+* split utility:                         Split Program.
+* split.awk:                             Split Program.
+* sprintf:                               String Functions.
+* sqrt:                                  Numeric Functions.
+* srand:                                 Numeric Functions.
+* Stallman, Richard <1>:                 Acknowledgements.
+* Stallman, Richard:                     Manual History.
+* standard error output:                 Special Files.
+* standard input <1>:                    Special Files.
+* standard input <2>:                    Reading Files.
+* standard input:                        Read Terminal.
+* standard output:                       Special Files.
+* statement, compound:                   Statements.
+* stream editor:                         Field Splitting Summary.
+* stream editor, simple:                 Simple Sed.
+* strftime:                              Time Functions.
+* string comparison vs. regexp comparison: Typing and Comparison.
+* string constants:                      Constants.
+* string operators:                      Concatenation.
+* string-matching operators:             Regexp Usage.
+* sub:                                   String Functions.
+* sub, third argument of:                String Functions.
+* subscripts in arrays:                  Multi-dimensional.
+* SUBSEP <1>:                            Multi-dimensional.
+* SUBSEP:                                User-modified.
+* substr:                                String Functions.
+* subtraction:                           Arithmetic Ops.
+* system:                                I/O Functions.
+* systime:                               Time Functions.
+* Tcl:                                   Library Names.
+* tee utility:                           Tee Program.
+* tee.awk:                               Tee Program.
+* terminator, record:                    Records.
+* time of day:                           Time Functions.
+* timestamps:                            Time Functions.
+* timestamps, converting from dates:     Mktime Function.
+* timestamps, formatted:                 Gettimeofday Function.
+* tolower:                               String Functions.
+* toupper:                               String Functions.
+* translate.awk:                         Translate Program.
+* Trueman, David:                        Acknowledgements.
+* truth values:                          Truth Values.
+* type conversion:                       Conversion.
+* types of variables <1>:                Typing and Comparison.
+* types of variables:                    Assignment Ops.
+* undefined functions:                   Function Caveats.
+* undocumented features:                 Undocumented.
+* uninitialized variables, as array subscripts: Uninitialized Subscripts.
+* uniq utility:                          Uniq Program.
+* uniq.awk:                              Uniq Program.
+* use of comments:                       Comments.
+* user information:                      Passwd Functions.
+* user-defined functions:                User-defined.
+* user-defined variables:                Using Variables.
+* uses of awk:                           What Is Awk.
+* using this book:                       This Manual.
+* values of characters as numbers:       Ordinal Functions.
+* variable shadowing:                    Definition Syntax.
+* variable typing:                       Typing and Comparison.
+* variables, user-defined:               Using Variables.
+* Wall, Larry:                           Future Extensions.
+* wc utility:                            Wc Program.
+* wc.awk:                                Wc Program.
+* Weinberger, Peter:                     History.
+* when to use awk:                       When.
+* while statement:                       While Statement.
+* word boundaries, matching:             GNU Regexp Operators.
+* word, regexp definition of:            GNU Regexp Operators.
+* wordfreq.sh:                           Word Sorting.
+* || operator:                           Boolean Ops.
+* ~ operator <1>:                        Typing and Comparison.
+* ~ operator <2>:                        Regexp Constants.
+* ~ operator <3>:                        Computed Regexps.
+* ~ operator <4>:                        Case-sensitivity.
+* ~ operator:                            Regexp Usage.
+
+
+
+Tag Table:
+Node: Top1197
+Node: Preface20700
+Ref: Preface-Footnote-121817
+Node: History22049
+Node: Manual History23407
+Node: Acknowledgements26997
+Node: What Is Awk30624
+Node: This Manual32278
+Node: Conventions34919
+Node: Sample Data Files36211
+Node: Getting Started39294
+Node: Names41602
+Ref: Names-Footnote-143099
+Node: Running gawk43171
+Node: One-shot44332
+Node: Read Terminal45719
+Node: Long47331
+Node: Executable Scripts48724
+Ref: Executable Scripts-Footnote-150374
+Ref: Executable Scripts-Footnote-250523
+Node: Comments50977
+Node: Very Simple52137
+Node: Two Rules54184
+Node: More Complex56363
+Node: Statements/Lines59479
+Node: Other Features63752
+Node: When64478
+Node: One-liners66412
+Node: Regexp69299
+Node: Regexp Usage70625
+Node: Escape Sequences72775
+Node: Regexp Operators78227
+Node: GNU Regexp Operators89260
+Node: Case-sensitivity92965
+Node: Leftmost Longest96080
+Node: Computed Regexps97615
+Node: Reading Files100272
+Node: Records102039
+Node: Fields108534
+Ref: Fields-Footnote-1111516
+Node: Non-Constant Fields111602
+Node: Changing Fields113888
+Node: Field Separators118295
+Node: Basic Field Splitting118997
+Node: Regexp Field Splitting122226
+Node: Single Character Fields124792
+Node: Command Line Field Separator125869
+Node: Field Splitting Summary129109
+Ref: Field Splitting Summary-Footnote-1131028
+Node: Constant Size131129
+Node: Multiple Line135166
+Node: Getline140574
+Node: Getline Intro141648
+Node: Plain Getline142611
+Node: Getline/Variable144875
+Node: Getline/File146017
+Node: Getline/Variable/File147327
+Node: Getline/Pipe149301
+Node: Getline/Variable/Pipe151391
+Node: Getline Summary152509
+Node: Printing154103
+Node: Print155171
+Node: Print Examples157271
+Node: Output Separators159882
+Node: OFMT161780
+Node: Printf163182
+Node: Basic Printf164086
+Node: Control Letters165620
+Node: Format Modifiers168308
+Node: Printf Examples172457
+Node: Redirection175236
+Node: Special Files179874
+Node: Close Files And Pipes185111
+Node: Expressions189172
+Node: Constants191378
+Node: Scalar Constants191857
+Ref: Scalar Constants-Footnote-1192717
+Node: Regexp Constants192861
+Node: Using Constant Regexps193323
+Node: Variables196524
+Node: Using Variables197178
+Node: Assignment Options198613
+Node: Conversion200557
+Node: Arithmetic Ops203738
+Node: Concatenation205872
+Node: Assignment Ops207227
+Node: Increment Ops212822
+Node: Truth Values215350
+Node: Typing and Comparison216398
+Node: Boolean Ops222298
+Node: Conditional Exp225991
+Node: Function Calls227668
+Node: Precedence230548
+Node: Patterns and Actions233936
+Node: Pattern Overview234362
+Node: Kinds of Patterns235137
+Node: Regexp Patterns236274
+Node: Expression Patterns236828
+Node: Ranges240480
+Node: BEGIN/END243199
+Node: Using BEGIN/END243668
+Node: I/O And BEGIN/END246630
+Node: Empty248646
+Node: Action Overview248945
+Node: Statements251516
+Node: If Statement253222
+Node: While Statement254725
+Node: Do Statement256756
+Node: For Statement257858
+Node: Break Statement261115
+Node: Continue Statement263386
+Node: Next Statement265382
+Node: Nextfile Statement267879
+Node: Exit Statement269793
+Node: Built-in Variables271803
+Node: User-modified272899
+Ref: User-modified-Footnote-1277688
+Node: Auto-set277750
+Ref: Auto-set-Footnote-1284073
+Node: ARGC and ARGV284279
+Node: Arrays286981
+Node: Array Intro288444
+Node: Reference to Elements292320
+Node: Assigning Elements294270
+Node: Array Example294772
+Node: Scanning an Array296491
+Node: Delete298821
+Node: Numeric Array Subscripts300881
+Node: Uninitialized Subscripts302787
+Node: Multi-dimensional304427
+Node: Multi-scanning307522
+Node: Built-in309165
+Node: Calling Built-in310154
+Node: Numeric Functions312125
+Ref: Numeric Functions-Footnote-1315673
+Node: String Functions315943
+Ref: String Functions-Footnote-1334729
+Ref: String Functions-Footnote-2334780
+Node: I/O Functions334873
+Ref: I/O Functions-Footnote-1340366
+Node: Time Functions340457
+Ref: Time Functions-Footnote-1348776
+Ref: Time Functions-Footnote-2348887
+Ref: Time Functions-Footnote-3349163
+Node: User-defined349307
+Node: Definition Syntax350019
+Node: Function Example354268
+Node: Function Caveats356598
+Node: Return Statement360469
+Node: Invoking Gawk363124
+Node: Options364359
+Ref: Options-Footnote-1373162
+Node: Other Arguments373187
+Node: AWKPATH Variable375833
+Ref: AWKPATH Variable-Footnote-1378281
+Node: Obsolete378581
+Node: Undocumented379247
+Node: Known Bugs379455
+Node: Library Functions380593
+Node: Portability Notes383012
+Node: Nextfile Function384296
+Ref: Nextfile Function-Footnote-1389001
+Node: Assert Function389171
+Node: Round Function392510
+Node: Ordinal Functions394155
+Ref: Ordinal Functions-Footnote-1397387
+Node: Join Function397606
+Node: Mktime Function399658
+Ref: Mktime Function-Footnote-1411149
+Node: Gettimeofday Function411232
+Node: Filetrans Function415244
+Node: Getopt Function418921
+Node: Passwd Functions430277
+Node: Group Functions438612
+Node: Library Names446509
+Node: Sample Programs450434
+Node: Clones450925
+Node: Cut Program452019
+Node: Egrep Program462048
+Node: Id Program469710
+Node: Split Program472981
+Node: Tee Program476359
+Node: Uniq Program479155
+Node: Wc Program486700
+Ref: Wc Program-Footnote-1490936
+Node: Miscellaneous Programs491117
+Node: Dupword Program492027
+Node: Alarm Program493698
+Node: Translate Program498243
+Ref: Translate Program-Footnote-1502730
+Ref: Translate Program-Footnote-2502873
+Node: Labels Program503053
+Ref: Labels Program-Footnote-1506512
+Node: Word Sorting506596
+Node: History Sorting510940
+Node: Extract Program512909
+Node: Simple Sed519866
+Node: Igawk Program523210
+Node: Language History536353
+Node: V7/SVR3.1537586
+Node: SVR4540239
+Node: POSIX541759
+Node: BTL543378
+Node: POSIX/GNU544142
+Node: Gawk Summary548573
+Node: Command Line Summary549397
+Node: Language Summary552373
+Ref: Language Summary-Footnote-1554630
+Node: Variables/Fields554753
+Node: Fields Summary555487
+Ref: Fields Summary-Footnote-1557215
+Node: Built-in Summary557273
+Node: Arrays Summary560918
+Node: Data Type Summary562211
+Node: Rules Summary564037
+Node: Pattern Summary565565
+Node: Regexp Summary567750
+Node: Actions Summary571132
+Node: Operator Summary572964
+Node: Control Flow Summary574191
+Node: I/O Summary574748
+Node: Printf Summary577737
+Node: Special File Summary581075
+Node: Built-in Functions Summary582753
+Node: Time Functions Summary586753
+Node: String Constants Summary587644
+Node: Functions Summary588964
+Node: Historical Features590025
+Node: Installation591523
+Node: Gawk Distribution592738
+Node: Getting593241
+Node: Extracting596226
+Node: Distribution contents597613
+Node: Unix Installation602527
+Node: Quick Installation603036
+Node: Configuration Philosophy604554
+Node: VMS Installation606956
+Node: VMS Compilation607495
+Node: VMS Installation Details609099
+Node: VMS Running610741
+Node: VMS POSIX612331
+Node: PC Installation613611
+Node: Atari Installation617014
+Node: Atari Compiling618198
+Node: Atari Using620107
+Node: Amiga Installation622953
+Node: Bugs624071
+Node: Other Versions627040
+Node: Notes628614
+Node: Compatibility Mode629221
+Node: Additions630064
+Node: Adding Code630762
+Node: New Ports636102
+Node: Future Extensions640270
+Node: Improvements642519
+Node: Glossary644387
+Node: Copying661452
+Node: Index680644
+
+End Tag Table