Diffstat (limited to 'doc/gawk.info')
-rw-r--r-- doc/gawk.info 7159
1 files changed, 4035 insertions, 3124 deletions
diff --git a/doc/gawk.info b/doc/gawk.info
index aad73f7a..e7854caf 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -20,16 +20,13 @@ implementation of AWK.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
-Invariant Sections being "GNU General Public License", the Front-Cover
-texts being (a) (see below), and with the Back-Cover Texts being (b)
-(see below). A copy of the license is included in the section entitled
-"GNU Free Documentation License".
+Invariant Sections being "GNU General Public License", with the
+Front-Cover Texts being "A GNU Manual", and with the Back-Cover Texts
+as in (a) below. A copy of the license is included in the section
+entitled "GNU Free Documentation License".
- a. "A GNU Manual"
-
- b. "You have the freedom to copy and modify this GNU manual. Buying
- copies from the FSF supports it in developing GNU and promoting
- software freedom."
+ a. The FSF's Back-Cover Text is: "You have the freedom to copy and
+ modify this GNU manual."

File: gawk.info, Node: Top, Next: Foreword, Up: (dir)
@@ -51,16 +48,13 @@ implementation of AWK.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
-Invariant Sections being "GNU General Public License", the Front-Cover
-texts being (a) (see below), and with the Back-Cover Texts being (b)
-(see below). A copy of the license is included in the section entitled
-"GNU Free Documentation License".
-
- a. "A GNU Manual"
+Invariant Sections being "GNU General Public License", with the
+Front-Cover Texts being "A GNU Manual", and with the Back-Cover Texts
+as in (a) below. A copy of the license is included in the section
+entitled "GNU Free Documentation License".
- b. "You have the freedom to copy and modify this GNU manual. Buying
- copies from the FSF supports it in developing GNU and promoting
- software freedom."
+ a. The FSF's Back-Cover Text is: "You have the freedom to copy and
+ modify this GNU manual."
* Menu:
@@ -126,8 +120,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
includes command-line syntax.
* One-shot:: Running a short throwaway
`awk' program.
-* Read Terminal:: Using no input files (input from
- terminal instead).
+* Read Terminal:: Using no input files (input from the
+ keyboard instead).
* Long:: Putting permanent `awk'
programs in files.
* Executable Scripts:: Making self-contained `awk'
@@ -149,6 +143,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Other Features:: Other Features of `awk'.
* When:: When to use `gawk' and when to
use other things.
+* Intro Summary:: Summary of the introduction.
* Command Line:: How to run `awk'.
* Options:: Command-line options and their
meanings.
@@ -170,6 +165,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
program.
* Obsolete:: Obsolete Options and/or features.
* Undocumented:: Undocumented Options and Features.
+* Invoking Summary:: Invocation summary.
* Regexp Usage:: How to Use Regular Expressions.
* Escape Sequences:: How to write nonprinting characters.
* Regexp Operators:: Regular Expression Operators.
@@ -178,8 +174,12 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Case-sensitivity:: How to do case-insensitive matching.
* Leftmost Longest:: How much text matches.
* Computed Regexps:: Using Dynamic Regexps.
+* Regexp Summary:: Regular expressions summary.
* Records:: Controlling how data is split into
records.
+* awk split records:: How standard `awk' splits
+ records.
+* gawk split records:: How `gawk' splits records.
* Fields:: An introduction to fields.
* Nonconstant Fields:: Nonconstant Field Numbers.
* Changing Fields:: Changing the Contents of a Field.
@@ -218,6 +218,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Read Timeout:: Reading input with a timeout.
* Command line directories:: What happens if you put a directory on
the command line.
+* Input Summary:: Input summary.
+* Input Exercises:: Exercises.
* Print:: The `print' statement.
* Print Examples:: Simple examples of `print'
statements.
@@ -241,6 +243,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Special Caveats:: Things to watch out for.
* Close Files And Pipes:: Closing Input and Output Files and
Pipes.
+* Output Summary:: Output summary.
+* Output exercises:: Exercises.
* Values:: Constants, Variables, and Regular
Expressions.
* Constants:: String, numeric and regexp constants.
@@ -256,6 +260,9 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
This is an advanced method of input.
* Conversion:: The conversion of strings to numbers
and vice versa.
+* Strings And Numbers:: How `awk' Converts Between
+ Strings And Numbers.
+* Locale influences conversions:: How the locale may affect conversions.
* All Operators:: `gawk''s operators.
* Arithmetic Ops:: Arithmetic operations (`+',
`-', etc.)
@@ -283,6 +290,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Function Calls:: A function call is an expression.
* Precedence:: How various operators nest.
* Locales:: How the locale affects things.
+* Expressions Summary:: Expressions summary.
* Pattern Overview:: What goes into a pattern.
* Regexp Patterns:: Using regexps as patterns.
* Expression Patterns:: Any expression can be used as a
@@ -329,6 +337,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
gives you information.
* ARGC and ARGV:: Ways to use `ARGC' and
`ARGV'.
+* Pattern Action Summary:: Patterns and Actions summary.
* Array Basics:: The basics of arrays.
* Array Intro:: Introduction to Arrays
* Reference to Elements:: How to examine one element of an
@@ -351,6 +360,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
`awk'.
* Multiscanning:: Scanning multidimensional arrays.
* Arrays of Arrays:: True multidimensional arrays.
+* Arrays Summary:: Summary of arrays.
* Built-in:: Summarizes the built-in functions.
* Calling Built-in:: How to call built-in functions.
* Numeric Functions:: Functions that work with numbers,
@@ -385,6 +395,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
runtime.
* Indirect Calls:: Choosing the function to call at
runtime.
+* Functions Summary:: Summary of functions.
* Library Names:: How to best name private global
variables in library functions.
* General Functions:: Functions that are of general use.
@@ -419,6 +430,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Group Functions:: Functions for getting group
information.
* Walking Arrays:: A function to walk arrays of arrays.
+* Library Functions Summary:: Summary of library functions.
+* Library exercises:: Exercises.
* Running Examples:: How to run these examples.
* Clones:: Clones of common utilities.
* Cut Program:: The `cut' utility.
@@ -448,6 +461,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Anagram Program:: Finding anagrams from a dictionary.
* Signature Program:: People do amazing things with too much
time on their hands.
+* Programs Summary:: Summary of programs.
+* Programs Exercises:: Exercises.
* Nondecimal Data:: Allowing nondecimal input data.
* Array Sorting:: Facilities for controlling array
traversal and sorting arrays.
@@ -459,6 +474,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* TCP/IP Networking:: Using `gawk' for network
programming.
* Profiling:: Profiling your `awk' programs.
+* Advanced Features Summary:: Summary of advanced features.
* I18N and L10N:: Internationalization and Localization.
* Explaining gettext:: How GNU `gettext' works.
* Programmer i18n:: Features for the programmer.
@@ -470,6 +486,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* I18N Example:: A simple i18n example.
* Gawk I18N:: `gawk' is also
internationalized.
+* I18N Summary:: Summary of I18N stuff.
* Debugging:: Introduction to `gawk'
debugger.
* Debugging Concepts:: Debugging in General.
@@ -488,31 +505,23 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Miscellaneous Debugger Commands:: Miscellaneous Commands.
* Readline Support:: Readline support.
* Limitations:: Limitations and future plans.
-* General Arithmetic:: An introduction to computer
- arithmetic.
-* Floating Point Issues:: Stuff to know about floating-point
- numbers.
-* String Conversion Precision:: The String Value Can Lie.
-* Unexpected Results:: Floating Point Numbers Are Not
- Abstract Numbers.
-* POSIX Floating Point Problems:: Standards Versus Existing Practice.
-* Integer Programming:: Effective integer programming.
-* Floating-point Programming:: Effective Floating-point Programming.
-* Floating-point Representation:: Binary floating-point representation.
-* Floating-point Context:: Floating-point context.
-* Rounding Mode:: Floating-point rounding mode.
-* Gawk and MPFR:: How `gawk' provides
- arbitrary-precision arithmetic.
-* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
- Arithmetic with `gawk'.
-* Setting Precision:: Setting the working precision.
-* Setting Rounding Mode:: Setting the rounding mode.
-* Floating-point Constants:: Representing floating-point constants.
-* Changing Precision:: Changing the precision of a number.
-* Exact Arithmetic:: Exact arithmetic with floating-point
- numbers.
+* Debugging Summary:: Debugging summary.
+* Computer Arithmetic:: A quick intro to computer math.
+* Math Definitions:: Defining terms used.
+* MPFR features:: The MPFR features in `gawk'.
+* FP Math Caution:: Things to know.
+* Inexactness of computations:: Floating point math is not exact.
+* Inexact representation:: Numbers are not exactly represented.
+* Comparing FP Values:: How to compare floating point values.
+* Errors accumulate:: Errors get bigger as they go.
+* Getting Accuracy:: Getting more accuracy takes some work.
+* Try To Round:: Add digits and round.
+* Setting precision:: How to set the precision.
+* Setting the rounding mode:: How to set the rounding mode.
* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic
with `gawk'.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* Floating point summary:: Summary of floating point discussion.
* Extension Intro:: What is an extension.
* Plugin License:: A note about licensing.
* Extension Mechanism Outline:: An outline of how it works.
@@ -574,6 +583,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Extension Sample Time:: An interface to `gettimeofday()'
and `sleep()'.
* gawkextlib:: The `gawkextlib' project.
+* Extension summary:: Extension summary.
+* Extension Exercises:: Exercises.
* V7/SVR3.1:: The major changes between V7 and
System V Release 3.1.
* SVR4:: Minor changes between System V
@@ -590,6 +601,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
ranges.
* Contributors:: The major contributors to
`gawk'.
+* History summary:: History summary.
* Gawk Distribution:: What is in the `gawk'
distribution.
* Getting:: How to get the distribution.
@@ -628,6 +640,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Bugs:: Reporting Problems and Bugs.
* Other Versions:: Other freely available `awk'
implementations.
+* Installation summary:: Summary of installation.
* Compatibility Mode:: How to disable certain `gawk'
extensions.
* Additions:: Making Additions To `gawk'.
@@ -636,8 +649,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
`gawk'.
* New Ports:: Porting `gawk' to a new
operating system.
-* Derived Files:: Why derived files are kept in the
- `git' repository.
+* Derived Files:: Why derived files are kept in the Git
+ repository.
* Future Extensions:: New features that may be implemented
one day.
* Implementation Limitations:: Some limitations of the
@@ -648,18 +661,19 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Extension Other Design Decisions:: Some other design decisions.
* Extension Future Growth:: Some room for future growth.
* Old Extension Mechanism:: Some compatibility for old extensions.
+* Notes summary:: Summary of implementation notes.
* Basic High Level:: The high level view.
* Basic Data Typing:: A very quick intro to data types.
- To Miriam, for making me complete.
+ To my parents, for their love, and for the wonderful example they
+set for me.
- To Chana, for the joy you bring us.
+ To my wife Miriam, for making me complete. Thank you for building
+your life together with me.
- To Rivka, for the exponential increase.
+ To our children Chana, Rivka, Nachum and Malka, for enrichening our
+lives in innumerable ways.
- To Nachum, for the added dimension.
-
- To Malka, for the new beginning.

File: gawk.info, Node: Foreword, Next: Preface, Prev: Top, Up: Top
@@ -791,6 +805,10 @@ and other `awk' implementations.
* Perform simple network communications
+ * Profile and debug `awk' programs.
+
+ * Extend the language with functions written in C or C++.
+
This Info file teaches you about the `awk' language and how you can
use it effectively. You should already be familiar with basic system
commands, such as `cat' and `ls',(2) as well as basic shell facilities,
@@ -799,13 +817,13 @@ such as input/output (I/O) redirection and pipes.
Implementations of the `awk' language are available for many
different computing environments. This Info file, while describing the
`awk' language in general, also describes the particular implementation
-of `awk' called `gawk' (which stands for "GNU awk"). `gawk' runs on a
-broad range of Unix systems, ranging from Intel(R)-architecture
-PC-based computers up through large-scale systems, such as Crays.
-`gawk' has also been ported to Mac OS X, Microsoft Windows (all
-versions) and OS/2 PCs, and VMS. (Some other, obsolete systems to
-which `gawk' was once ported are no longer supported and the code for
-those systems has been removed.)
+of `awk' called `gawk' (which stands for "GNU `awk'"). `gawk' runs on
+a broad range of Unix systems, ranging from Intel(R)-architecture
+PC-based computers up through large-scale systems. `gawk' has also
+been ported to Mac OS X, Microsoft Windows (all versions) and OS/2 PCs,
+and OpenVMS. (Some other, obsolete systems to which `gawk' was once
+ported are no longer supported and the code for those systems has been
+removed.)
* Menu:
@@ -822,7 +840,7 @@ those systems has been removed.)
---------- Footnotes ----------
- (1) The 2008 POSIX standard is online at
+ (1) The 2008 POSIX standard is accessible online at
`http://www.opengroup.org/onlinepubs/9699919799/'.
(2) These commands are available on POSIX-compliant systems, as well
@@ -892,20 +910,21 @@ The `awk' language has evolved over the years. Full details are
provided in *note Language History::. The language described in this
Info file is often referred to as "new `awk'" (`nawk').
- Because of this, there are systems with multiple versions of `awk'.
-Some systems have an `awk' utility that implements the original version
-of the `awk' language and a `nawk' utility for the new version. Others
-have an `oawk' version for the "old `awk'" language and plain `awk' for
-the new one. Still others only have one version, which is usually the
-new one.(1)
-
- All in all, this makes it difficult for you to know which version of
-`awk' you should run when writing your programs. The best advice we
-can give here is to check your local documentation. Look for `awk',
-`oawk', and `nawk', as well as for `gawk'. It is likely that you
-already have some version of new `awk' on your system, which is what
-you should use when running your programs. (Of course, if you're
-reading this Info file, chances are good that you have `gawk'!)
+ For some time after new `awk' was introduced, there were systems
+with multiple versions of `awk'. Some systems had an `awk' utility
+that implemented the original version of the `awk' language and a
+`nawk' utility for the new version. Others had an `oawk' version for
+the "old `awk'" language and plain `awk' for the new one. Still others
+only had one version, which was usually the new one.
+
+ Today, only Solaris systems still use an old `awk' for the default
+`awk' utility. (A more modern `awk' lives in `/usr/xpg6/bin' on these
+systems.) All other modern systems use some version of new `awk'.(1)
+
+ It is likely that you already have some version of new `awk' on your
+system, which is what you should use when running your programs. (Of
+course, if you're reading this Info file, chances are good that you
+have `gawk'!)
Throughout this Info file, whenever we refer to a language feature
that should be available in any complete implementation of POSIX `awk',
@@ -914,7 +933,7 @@ specific to the GNU implementation, we use the term `gawk'.
---------- Footnotes ----------
- (1) Often, these systems use `gawk' for their `awk' implementation!
+ (1) Many of these systems use `gawk' for their `awk' implementation!

File: gawk.info, Node: This Manual, Next: Conventions, Prev: Names, Up: Preface
@@ -1043,7 +1062,7 @@ material for those who are completely unfamiliar with computer
programming.
The *note Glossary::, defines most, if not all, the significant
-terms used throughout the book. If you find terms that you aren't
+terms used throughout the Info file. If you find terms that you aren't
familiar with, try looking them up here.
*note Copying::, and *note GNU Free Documentation License::, present
@@ -1121,7 +1140,7 @@ editor. GNU Emacs is the most widely used version of Emacs today.
Software Foundation to create a complete, freely distributable,
POSIX-compliant computing environment. The FSF uses the "GNU General
Public License" (GPL) to ensure that their software's source code is
-always available to the end user. A copy of the GPL is included for
+always available to the end user. A copy of the GPL is included for
your reference (*note Copying::). The GPL applies to the C language
source code for `gawk'. To find out more about the FSF and the GNU
Project online, see the GNU Project's home page (http://www.gnu.org).
@@ -1160,19 +1179,18 @@ published the first two editions under the title `The GNU Awk User's
Guide'.
This edition maintains the basic structure of the previous editions.
-For Edition 4.0, the content has been thoroughly reviewed and updated.
-All references to `gawk' versions prior to 4.0 have been removed. Of
-significant note for this edition was *note Debugger::.
+For FSF edition 4.0, the content has been thoroughly reviewed and
+updated. All references to `gawk' versions prior to 4.0 have been
+removed. Of significant note for this edition was *note Debugger::.
- For edition 4.1, the content has been reorganized into parts, and
-the major new additions are *note Arbitrary Precision Arithmetic::, and
-*note Dynamic Extensions::.
+ For FSF edition 4.1, the content has been reorganized into parts,
+and the major new additions are *note Arbitrary Precision Arithmetic::,
+and *note Dynamic Extensions::.
- `GAWK: Effective AWK Programming' will undoubtedly continue to
-evolve. An electronic version comes with the `gawk' distribution from
-the FSF. If you find an error in this Info file, please report it!
-*Note Bugs::, for information on submitting problem reports
-electronically.
+ This Info file will undoubtedly continue to evolve. An electronic
+version comes with the `gawk' distribution from the FSF. If you find
+an error in this Info file, please report it! *Note Bugs::, for
+information on submitting problem reports electronically.
---------- Footnotes ----------
@@ -1199,14 +1217,17 @@ something more broad, I acquired the `awk.info' domain.
contributed code: the archive did not grow and the domain went unused
for several years.
- Fortunately, late in 2008, a volunteer took on the task of setting up
-an `awk'-related web site--`http://awk.info'--and did a very nice job.
+ Late in 2008, a volunteer took on the task of setting up an
+`awk'-related web site--`http://awk.info'--and did a very nice job.
If you have written an interesting `awk' program, or have written a
`gawk' extension that you would like to share with the rest of the
world, please see `http://awk.info/?contribute' for how to contribute
it to the web site.
+ As of this writing, this website is in search of a maintainer; please
+contact me if you are interested.
+

File: gawk.info, Node: Acknowledgments, Prev: How To Contribute, Up: Preface
@@ -1279,6 +1300,10 @@ be a pleasure working with this team of fine people.
Notable code and documentation contributions were made by a number
of people. *Note Contributors::, for the full list.
+ Thanks to Patrice Dumas for the new `makeinfo' program. Thanks to
+Karl Berry who continues to work to keep the Texinfo markup language
+sane.
+
I would like to thank Brian Kernighan for invaluable assistance
during the testing and debugging of `gawk', and for ongoing help and
advice in clarifying numerous points about the language. We could not
@@ -1293,12 +1318,6 @@ also must acknowledge my gratitude to G-d, for the many opportunities
He has sent my way, as well as for the gifts He has given me with which
to take advantage of those opportunities.
-
-Arnold Robbins
-Nof Ayalon
-ISRAEL
-May, 2013
-

File: gawk.info, Node: Getting Started, Next: Invoking Gawk, Prev: Preface, Up: Top
@@ -1327,7 +1346,7 @@ for now. *Note User-defined::.) Each rule specifies one pattern to
search for and one action to perform upon finding the pattern.
Syntactically, a rule consists of a pattern followed by an action.
-The action is enclosed in curly braces to separate it from the pattern.
+The action is enclosed in braces to separate it from the pattern.
Newlines usually separate rules. Therefore, an `awk' program looks
like this:
@@ -1350,6 +1369,7 @@ like this:
* Other Features:: Other Features of `awk'.
* When:: When to use `gawk' and when to use
other things.
+* Intro Summary:: Summary of the introduction.

File: gawk.info, Node: Running gawk, Next: Sample Data Files, Up: Getting Started
@@ -1375,7 +1395,7 @@ variations of each.
* One-shot:: Running a short throwaway `awk'
program.
-* Read Terminal:: Using no input files (input from terminal
+* Read Terminal:: Using no input files (input from the keyboard
instead).
* Long:: Putting permanent `awk' programs in
files.
@@ -1425,7 +1445,7 @@ following command line:
awk 'PROGRAM'
`awk' applies the PROGRAM to the "standard input", which usually means
-whatever you type on the terminal. This continues until you indicate
+whatever you type on the keyboard. This continues until you indicate
end-of-file by typing `Ctrl-d'. (On other operating systems, the
end-of-file character may be different. For example, on OS/2, it is
`Ctrl-z'.)
@@ -1606,7 +1626,7 @@ at a later time.
will probably print strange messages about syntax errors. For
example, look at the following:
- $ awk '{ print "hello" } # let's be cute'
+ $ awk 'BEGIN { print "hello" } # let's be cute'
>
The shell sees that the first two quotes match, and that a new
@@ -1646,6 +1666,23 @@ knowledge of shell quoting rules. The following rules apply only to
POSIX-compliant, Bourne-style shells (such as Bash, the GNU Bourne-Again
Shell). If you use the C shell, you're on your own.
+ Before diving into the rules, we introduce a concept that appears
+throughout this Info file, which is that of the "null", or empty,
+string.
+
+ The null string is character data that has no value. In other
+words, it is empty. It is written in `awk' programs like this: `""'.
+In the shell, it can be written using single or double quotes: `""' or
+`'''. While the null string has no characters in it, it does exist.
+Consider this command:
+
+ $ echo ""
+
+Here, the `echo' utility receives a single argument, even though that
+argument has no characters in it. In the rest of this Info file, we use
+the terms "null string" and "empty string" interchangeably. Now, on to
+the quoting rules.
+
* Quoted items can be concatenated with nonquoted items as well as
with other quoted items. The shell turns everything into one
argument for the command.
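
The two points above (an empty string is still an argument, and adjacent quoted and nonquoted pieces join into one argument) can be checked with a small sketch. The `count_args' function is a hypothetical helper written only for this illustration; it is not part of `awk' or of any standard utility.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

count_args ""          # one argument, even though it has no characters
count_args ''          # the same null string, written with single quotes
count_args             # no arguments at all
count_args 'a'"b"c     # quoted and nonquoted pieces become a single argument
```

Comparing the first and third calls shows the difference between passing a null string and passing nothing.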
@@ -1859,10 +1896,10 @@ for _every_ input line. If the action is omitted, the default action
is to print all lines that match the pattern.
Thus, we could leave out the action (the `print' statement and the
-curly braces) in the previous example and the result would be the same:
-`awk' prints all lines matching the pattern `li'. By comparison,
-omitting the `print' statement but retaining the curly braces makes an
-empty action that does nothing (i.e., no lines are printed).
+braces) in the previous example and the result would be the same: `awk'
+prints all lines matching the pattern `li'. By comparison, omitting
+the `print' statement but retaining the braces makes an empty action
+that does nothing (i.e., no lines are printed).
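
As a rough illustration of the difference between an omitted action and an empty one, the following sketch runs the same pattern three ways. The three input lines and the `sample_lines' helper are made up for this example; they are not one of the sample data files used elsewhere.

```shell
# sample_lines is a hypothetical helper that supplies three made-up input lines.
sample_lines() {
    printf '%s\n' 'list' 'apple' 'olive'
}

# Pattern only: the default action prints each matching line.
sample_lines | awk '/li/'

# Pattern with an explicit print: same result as the line above.
sample_lines | awk '/li/ { print $0 }'

# Pattern with an empty action: lines still match, but nothing is printed.
sample_lines | awk '/li/ { }'
```

The first two commands each print `list' and `olive'; the third prints nothing.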
Many practical `awk' programs are just a line or two. Following is a
collection of useful, short programs to get you started. Some of these
@@ -1886,7 +1923,7 @@ different ways to do the same things shown here:
awk 'length($0) > 80' data
The sole rule has a relational expression as its pattern and it
- has no action--so the default action, printing the record, is used.
+ has no action--so it uses the default action, printing the record.
* Print the length of the longest line in `data':
@@ -1943,9 +1980,8 @@ File: gawk.info, Node: Two Rules, Next: More Complex, Prev: Very Simple, Up:
The `awk' utility reads the input files one line at a time. For each
line, `awk' tries the patterns of each of the rules. If several
-patterns match, then several actions are run in the order in which they
-appear in the `awk' program. If no patterns match, then no actions are
-run.
+patterns match, then several actions execute in the order in which they
+appear in the `awk' program. If no patterns match, then no actions run.
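
A minimal sketch of this behavior follows. The two input lines and the `run_two_rules' wrapper are invented for the illustration: the first line matches both patterns, so both actions run in program order, and the second line matches neither, so nothing runs for it.

```shell
# run_two_rules is a hypothetical wrapper around a two-rule awk program.
run_two_rules() {
    printf '%s\n' '12 apples' 'no match on this line' | awk '
    /12/     { print "rule 1: " $0 }
    /apples/ { print "rule 2: " $0 }
    '
}

run_two_rules
```

The output has two lines, `rule 1: 12 apples' followed by `rule 2: 12 apples'.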
After processing all the rules that match the line (and perhaps
there are none), `awk' reads the next line. (However, *note Next
@@ -2021,12 +2057,12 @@ contains the file name.(1)
The `$6 == "Nov"' in our `awk' program is an expression that tests
whether the sixth field of the output from `ls -l' matches the string
-`Nov'. Each time a line has the string `Nov' for its sixth field, the
-action `sum += $5' is performed. This adds the fifth field (the file's
-size) to the variable `sum'. As a result, when `awk' has finished
-reading all the input lines, `sum' is the total of the sizes of the
-files whose lines matched the pattern. (This works because `awk'
-variables are automatically initialized to zero.)
+`Nov'. Each time a line has the string `Nov' for its sixth field,
+`awk' performs the action `sum += $5'. This adds the fifth field (the
+file's size) to the variable `sum'. As a result, when `awk' has
+finished reading all the input lines, `sum' is the total of the sizes
+of the files whose lines matched the pattern. (This works because
+`awk' variables are automatically initialized to zero.)
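
To try the same idea without depending on the contents of a real directory, here is a sketch that feeds `awk' three made-up lines shaped like `ls -l' output. The file names, sizes, and the `listing' helper are assumptions for illustration only.

```shell
# listing is a hypothetical helper that imitates `ls -l' output:
# field 5 is the size in bytes and field 6 is the month.
listing() {
    printf '%s\n' \
        '-rw-r--r-- 1 user group 100 Nov  3 13:20 a.txt' \
        '-rw-r--r-- 1 user group 250 Oct 12 09:00 b.txt' \
        '-rw-r--r-- 1 user group  50 Nov 21 17:45 c.txt'
}

# sum is never assigned before its first use, so it starts out as zero.
listing | awk '$6 == "Nov" { sum += $5 }
               END { print sum }'
```

Only the two `Nov' lines contribute, so the program prints 150.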
After the last line of output from `ls' has been processed, the
`END' rule executes and prints the value of `sum'. In this example,
@@ -2078,15 +2114,15 @@ We have generally not used backslash continuation in our sample
programs. `gawk' places no limit on the length of a line, so backslash
continuation is never strictly necessary; it just makes programs more
readable. For this same reason, as well as for clarity, we have kept
-most statements short in the sample programs presented throughout the
-Info file. Backslash continuation is most useful when your `awk'
-program is in a separate source file instead of entered from the
-command line. You should also note that many `awk' implementations are
-more particular about where you may use backslash continuation. For
-example, they may not allow you to split a string constant using
-backslash continuation. Thus, for maximum portability of your `awk'
-programs, it is best not to split your lines in the middle of a regular
-expression or a string.
+most statements short in the programs presented throughout the Info
+file. Backslash continuation is most useful when your `awk' program is
+in a separate source file instead of entered from the command line.
+You should also note that many `awk' implementations are more
+particular about where you may use backslash continuation. For example,
+they may not allow you to split a string constant using backslash
+continuation. Thus, for maximum portability of your `awk' programs, it
+is best not to split your lines in the middle of a regular expression
+or a string.
CAUTION: _Backslash continuation does not work as described with
the C shell._ It works for `awk' programs in files and for
@@ -2128,7 +2164,7 @@ comment, it ignores _everything_ on the rest of the line. For example:
> BEGIN rule
> }'
error--> gawk: cmd. line:2: BEGIN rule
- error--> gawk: cmd. line:2: ^ parse error
+ error--> gawk: cmd. line:2: ^ syntax error
In this case, it looks like the backslash would continue the comment
onto the next line. However, the backslash-newline combination is never
@@ -2177,7 +2213,7 @@ most of the variables and many of the functions. They are described
systematically in *note Built-in Variables::, and *note Built-in::.

-File: gawk.info, Node: When, Prev: Other Features, Up: Getting Started
+File: gawk.info, Node: When, Next: Intro Summary, Prev: Other Features, Up: Getting Started
1.8 When to Use `awk'
=====================
@@ -2207,14 +2243,39 @@ those that it has are much larger than they used to be.
If you find yourself writing `awk' scripts of more than, say, a few
hundred lines, you might consider using a different programming
-language. Emacs Lisp is a good choice if you need sophisticated string
-or pattern matching capabilities. The shell is also good at string and
-pattern matching; in addition, it allows powerful use of the system
-utilities. More conventional languages, such as C, C++, and Java, offer
-better facilities for system programming and for managing the complexity
-of large programs. Programs in these languages may require more lines
-of source code than the equivalent `awk' programs, but they are easier
-to maintain and usually run more efficiently.
+language. The shell is good at string and pattern matching; in
+addition, it allows powerful use of the system utilities. More
+conventional languages, such as C, C++, and Java, offer better
+facilities for system programming and for managing the complexity of
+large programs. Python offers a nice balance between high-level ease
+of programming and access to system facilities. Programs in these
+languages may require more lines of source code than the equivalent
+`awk' programs, but they are easier to maintain and usually run more
+efficiently.
+
+
+File: gawk.info, Node: Intro Summary, Prev: When, Up: Getting Started
+
+1.9 Summary
+===========
+
+ * Programs in `awk' consist of PATTERN-ACTION pairs.
+
+ * Use either `awk 'PROGRAM' FILES' or `awk -f PROGRAM-FILE FILES' to
+ run `awk'.
+
+ * You may use the special `#!' header line to create `awk' programs
+ that are directly executable.
+
+ * Comments in `awk' programs start with `#' and continue to the end
+ of the same line.
+
+ * Be aware of quoting issues when writing `awk' programs as part of
+ a larger shell script (or MS-Windows batch file).
+
+ * You may use backslash continuation to continue a source line.
+ Lines are automatically continued after a comma, open brace,
+ question mark, colon, `||', `&&', `do' and `else'.
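
Several of these points can be seen together in one short sketch. The input lines and the `summary_demo' wrapper are made up for this illustration; the program uses a comment, automatic continuation after `&&' and after a comma, and one backslash continuation.

```shell
# summary_demo is a hypothetical wrapper around a small awk program.
summary_demo() {
    printf '%s\n' 'alpha 5' 'beta 15' 'gamma 25' | awk '
    # Comments run from "#" to the end of the line.
    $2 > 10 &&
    $2 < 20 {
        print $1,
              $2
    }
    END { print "records:", \
                NR }
    '
}

summary_demo
```

Only the `beta' line satisfies both halves of the pattern, so the program prints `beta 15' and then `records: 3'.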

File: gawk.info, Node: Invoking Gawk, Next: Regexp, Prev: Getting Started, Up: Top
@@ -2246,6 +2307,7 @@ this major node that don't interest you right now.
* Loading Shared Libraries:: Loading shared libraries into your program.
* Obsolete:: Obsolete Options and/or features.
* Undocumented:: Undocumented Options and Features.
+* Invoking Summary:: Invocation summary.

File: gawk.info, Node: Command Line, Next: Options, Up: Invoking Gawk
@@ -2257,8 +2319,8 @@ There are two ways to run `awk'--with an explicit program or with one
or more program files. Here are templates for both of them; items
enclosed in [...] in these templates are optional:
- awk [OPTIONS] -f progfile [`--'] FILE ...
- awk [OPTIONS] [`--'] 'PROGRAM' FILE ...
+ `awk' [OPTIONS] `-f' PROGFILE [`--'] FILE ...
+ `awk' [OPTIONS] [`--'] `'PROGRAM'' FILE ...
Besides traditional one-letter POSIX-style options, `gawk' also
supports GNU long options.
@@ -2344,11 +2406,11 @@ The following list describes options mandated by the POSIX standard:
treated as single-byte characters.
Normally, `gawk' follows the POSIX standard and attempts to process
- its input data according to the current locale. This can often
- involve converting multibyte characters into wide characters
- (internally), and can lead to problems or confusion if the input
- data does not contain valid multibyte characters. This option is
- an easy way to tell `gawk': "hands off my data!".
+ its input data according to the current locale (*note Locales::).
+ This can often involve converting multibyte characters into wide
+ characters (internally), and can lead to problems or confusion if
+ the input data does not contain valid multibyte characters. This
+ option is an easy way to tell `gawk': "hands off my data!".
`-c'
`--traditional'
@@ -2362,8 +2424,8 @@ The following list describes options mandated by the POSIX standard:
Print the short version of the General Public License and then
exit.
-`-d[FILE]'
-`--dump-variables[=FILE]'
+`-d'[FILE]
+`--dump-variables'[`='FILE]
Print a sorted list of global variables, their types, and final
values to FILE. If no FILE is provided, print this list to the
file named `awkvars.out' in the current directory. No space is
@@ -2377,25 +2439,25 @@ The following list describes options mandated by the POSIX standard:
particularly easy mistake to make with simple variable names like
`i', `j', etc.)
-`-D[FILE]'
-`--debug=[FILE]'
+`-D'[FILE]
+`--debug'[`='FILE]
Enable debugging of `awk' programs (*note Debugging::). By
default, the debugger reads commands interactively from the
- terminal. The optional FILE argument allows you to specify a file
+ keyboard. The optional FILE argument allows you to specify a file
with a list of commands for the debugger to execute
non-interactively. No space is allowed between the `-D' and FILE,
if FILE is supplied.
-`-e PROGRAM-TEXT'
-`--source PROGRAM-TEXT'
+`-e' PROGRAM-TEXT
+`--source' PROGRAM-TEXT
Provide program source code in the PROGRAM-TEXT. This option
allows you to mix source code in files with source code that you
enter on the command line. This is particularly useful when you
have library functions that you want to use from your command-line
programs (*note AWKPATH Variable::).
-`-E FILE'
-`--exec FILE'
+`-E' FILE
+`--exec' FILE
Similar to `-f', read `awk' program text from FILE. There are two
differences from `-f':
@@ -2428,37 +2490,41 @@ The following list describes options mandated by the POSIX standard:
Print a "usage" message summarizing the short and long style
options that `gawk' accepts and then exit.
-`-i SOURCE-FILE'
-`--include SOURCE-FILE'
+`-i' SOURCE-FILE
+`--include' SOURCE-FILE
Read `awk' source library from SOURCE-FILE. This option is
completely equivalent to using the `@include' directive inside
your program. This option is very similar to the `-f' option, but
there are two important differences. First, when `-i' is used,
- the program source will not be loaded if it has been previously
- loaded, whereas the `-f' will always load the file. Second,
- because this option is intended to be used with code libraries,
- `gawk' does not recognize such files as constituting main program
- input. Thus, after processing an `-i' argument, `gawk' still
- expects to find the main source code via the `-f' option or on the
+ the program source is not loaded if it has been previously loaded,
+ whereas with `-f', `gawk' always loads the file. Second, because
+ this option is intended to be used with code libraries, `gawk'
+ does not recognize such files as constituting main program input.
+ Thus, after processing an `-i' argument, `gawk' still expects to
+ find the main source code via the `-f' option or on the
command-line.
-`-l LIB'
-`--load LIB'
- Load a shared library LIB. This searches for the library using the
- `AWKLIBPATH' environment variable. The correct library suffix for
- your platform will be supplied by default, so it need not be
- specified in the library name. The library initialization routine
- should be named `dl_load()'. An alternative is to use the `@load'
- keyword inside the program to load a shared library.
-
-`-L [value]'
-`--lint[=value]'
+`-l' EXT
+`--load' EXT
+ Load a dynamic extension named EXT. Extensions are stored as
+ system shared libraries. This option searches for the library
+ using the `AWKLIBPATH' environment variable. The correct library
+ suffix for your platform will be supplied by default, so it need
+ not be specified in the extension name. The extension
+ initialization routine should be named `dl_load()'. An
+ alternative is to use the `@load' keyword inside the program to
+ load a shared library. This feature is described in detail in
+ *note Dynamic Extensions::.
+
+`-L'[VALUE]
+`--lint'[`='VALUE]
Warn about constructs that are dubious or nonportable to other
- `awk' implementations. Some warnings are issued when `gawk' first
- reads your program. Others are issued at runtime, as your program
- executes. With an optional argument of `fatal', lint warnings
- become fatal errors. This may be drastic, but its use will
- certainly encourage the development of cleaner `awk' programs.
+ `awk' implementations. No space is allowed between the `-L' and
+ VALUE, if VALUE is supplied. Some warnings are issued when `gawk'
+ first reads your program. Others are issued at runtime, as your
+ program executes. With an optional argument of `fatal', lint
+ warnings become fatal errors. This may be drastic, but its use
+ will certainly encourage the development of cleaner `awk' programs.
With an optional argument of `invalid', only warnings about things
that are actually invalid are issued. (This is not fully
implemented yet.)
@@ -2474,7 +2540,7 @@ The following list describes options mandated by the POSIX standard:
`--bignum'
Force arbitrary precision arithmetic on numbers. This option has
no effect if `gawk' is not compiled to use the GNU MPFR and MP
- libraries (*note Gawk and MPFR::).
+ libraries (*note Arbitrary Precision Arithmetic::).
`-n'
`--non-decimal-data'
@@ -2489,23 +2555,24 @@ The following list describes options mandated by the POSIX standard:
Force the use of the locale's decimal point character when parsing
numeric input data (*note Locales::).
-`-o[FILE]'
-`--pretty-print[=FILE]'
+`-o'[FILE]
+`--pretty-print'[`='FILE]
 Enable pretty-printing of `awk' programs. By default, the output
- program is created in a file named `awkprof.out'. The optional
- FILE argument allows you to specify a different file name for the
- output. No space is allowed between the `-o' and FILE, if FILE is
- supplied.
+ program is created in a file named `awkprof.out' (*note
+ Profiling::). The optional FILE argument allows you to specify a
+ different file name for the output. No space is allowed between
+ the `-o' and FILE, if FILE is supplied.
+
+ NOTE: In the past, this option would also execute your
+ program. This is no longer the case.
`-O'
`--optimize'
Enable some optimizations on the internal representation of the
- program. At the moment this includes just simple constant
- folding. The `gawk' maintainer hopes to add more optimizations
- over time.
+ program. At the moment this includes just simple constant folding.
-`-p[FILE]'
-`--profile[=FILE]'
+`-p'[FILE]
+`--profile'[`='FILE]
Enable profiling of `awk' programs (*note Profiling::). By
default, profiles are created in a file named `awkprof.out'. The
optional FILE argument allows you to specify a different file name
@@ -2538,15 +2605,15 @@ The following list describes options mandated by the POSIX standard:
data (*note Locales::).
If you supply both `--traditional' and `--posix' on the command
- line, `--posix' takes precedence. `gawk' also issues a warning if
- both options are supplied.
+ line, `--posix' takes precedence. `gawk' issues a warning if both
+ options are supplied.
`-r'
`--re-interval'
Allow interval expressions (*note Regexp Operators::) in regexps.
This is now `gawk''s default behavior. Nevertheless, this option
remains both for backward compatibility, and for use in
- combination with the `--traditional' option.
+ combination with `--traditional'.
`-S'
`--sandbox'
@@ -2586,7 +2653,7 @@ having to be included into each individual program. (As mentioned in
*note Definition Syntax::, function names must be unique.)
With standard `awk', library functions can still be used, even if
-the program is entered at the terminal, by specifying `-f /dev/tty'.
+the program is entered at the keyboard, by specifying `-f /dev/tty'.
After typing your program, type `Ctrl-d' (the end-of-file character) to
terminate it. (You may also use `-f -' to read program source from the
standard input but then you will not be able to also use the standard
@@ -2596,25 +2663,25 @@ input as a source of data.)
source file and command-line `awk' programs, `gawk' provides the
`--source' option. This does not require you to pre-empt the standard
input for your source code; it allows you to easily mix command-line
-and library source code (*note AWKPATH Variable::). The `--source'
-option may also be used multiple times on the command line.
+and library source code (*note AWKPATH Variable::). As with `-f', the
+`--source' and `--include' options may also be used multiple times on
+the command line.
If no `-f' or `--source' option is specified, then `gawk' uses the
first non-option command-line argument as the text of the program
source code.
If the environment variable `POSIXLY_CORRECT' exists, then `gawk'
-behaves in strict POSIX mode, exactly as if you had supplied the
-`--posix' command-line option. Many GNU programs look for this
-environment variable to suppress extensions that conflict with POSIX,
-but `gawk' behaves differently: it suppresses all extensions, even
-those that do not conflict with POSIX, and behaves in strict POSIX
-mode. If `--lint' is supplied on the command line and `gawk' turns on
-POSIX mode because of `POSIXLY_CORRECT', then it issues a warning
-message indicating that POSIX mode is in effect. You would typically
-set this variable in your shell's startup file. For a
-Bourne-compatible shell (such as Bash), you would add these lines to
-the `.profile' file in your home directory:
+behaves in strict POSIX mode, exactly as if you had supplied `--posix'.
+Many GNU programs look for this environment variable to suppress
+extensions that conflict with POSIX, but `gawk' behaves differently: it
+suppresses all extensions, even those that do not conflict with POSIX,
+and behaves in strict POSIX mode. If `--lint' is supplied on the
+command line and `gawk' turns on POSIX mode because of
+`POSIXLY_CORRECT', then it issues a warning message indicating that
+POSIX mode is in effect. You would typically set this variable in your
+shell's startup file. For a Bourne-compatible shell (such as Bash),
+you would add these lines to the `.profile' file in your home directory:
POSIXLY_CORRECT=true
export POSIXLY_CORRECT
@@ -2666,18 +2733,18 @@ begins scanning the argument list.
The variable values given on the command line are processed for
escape sequences (*note Escape Sequences::). (d.c.)
- In some earlier implementations of `awk', when a variable assignment
-occurred before any file names, the assignment would happen _before_
-the `BEGIN' rule was executed. `awk''s behavior was thus inconsistent;
-some command-line assignments were available inside the `BEGIN' rule,
-while others were not. Unfortunately, some applications came to depend
-upon this "feature." When `awk' was changed to be more consistent, the
-`-v' option was added to accommodate applications that depended upon
-the old behavior.
+ In some very early implementations of `awk', when a variable
+assignment occurred before any file names, the assignment would happen
+_before_ the `BEGIN' rule was executed. `awk''s behavior was thus
+inconsistent; some command-line assignments were available inside the
+`BEGIN' rule, while others were not. Unfortunately, some applications
+came to depend upon this "feature." When `awk' was changed to be more
+consistent, the `-v' option was added to accommodate applications that
+depended upon the old behavior.
The variable assignment feature is most useful for assigning to
variables such as `RS', `OFS', and `ORS', which control input and
-output formats before scanning the data files. It is also useful for
+output formats, before scanning the data files. It is also useful for
controlling state if multiple passes are needed over a data file. For
example:
@@ -2712,7 +2779,7 @@ with `getline' (*note Getline/File::).
In addition, `gawk' allows you to specify the special file name
`/dev/stdin', both on the command line and with `getline'. Some other
versions of `awk' also support this, but it is not standard. (Some
-operating systems provide a `/dev/stdin' file in the file system,
+operating systems provide a `/dev/stdin' file in the file system;
however, `gawk' always processes this file name itself.)

@@ -2751,9 +2818,9 @@ colons(1). `gawk' gets its search path from the `AWKPATH' environment
variable. If that variable does not exist, `gawk' uses a default path,
`.:/usr/local/share/awk'.(2)
- The search path feature is particularly useful for building libraries
-of useful `awk' functions. The library files can be placed in a
-standard directory in the default path and then specified on the
+ The search path feature is particularly helpful for building
+libraries of useful `awk' functions. The library files can be placed
+in a standard directory in the default path and then specified on the
command line with a short file name. Otherwise, the full file name
would have to be typed for each file.
@@ -2764,14 +2831,16 @@ in compatibility mode. This is true for both `--traditional' and
`--posix'. *Note Options::.
If the source code is not found after the initial search, the path
-is searched again after adding the default `.awk' suffix to the
-filename.
+is searched again after adding the default `.awk' suffix to the file
+name.
NOTE: To include the current directory in the path, either place
`.' explicitly in the path or write a null entry in the path. (A
null entry is indicated by starting or ending the path with a
- colon or by placing two colons next to each other (`::').) This
- path search mechanism is similar to the shell's.
+ colon or by placing two colons next to each other [`::'].) This
+ path search mechanism is similar to the shell's. (See `The
+ Bourne-Again SHell manual'
+ (http://www.gnu.org/software/bash/manual/).)
However, `gawk' always looks in the current directory _before_
searching `AWKPATH', so there is no real reason to include the
@@ -2779,8 +2848,8 @@ filename.
If `AWKPATH' is not defined in the environment, `gawk' places its
default search path into `ENVIRON["AWKPATH"]'. This makes it easy to
-determine the actual search path that `gawk' will use from within an
-`awk' program.
+determine the actual search path that `gawk' used from within an `awk'
+program.
While you can change `ENVIRON["AWKPATH"]' within your `awk' program,
this has no effect on the running program's behavior. This makes
@@ -2804,13 +2873,13 @@ File: gawk.info, Node: AWKLIBPATH Variable, Next: Other Environment Variables,
-------------------------------------------
The `AWKLIBPATH' environment variable is similar to the `AWKPATH'
-variable, but it is used to search for shared libraries specified with
-the `-l' option rather than for source files. If the library is not
-found, the path is searched again after adding the appropriate shared
-library suffix for the platform. For example, on GNU/Linux systems,
-the suffix `.so' is used. The search path specified is also used for
-libraries loaded via the `@load' keyword (*note Loading Shared
-Libraries::).
+variable, but it is used to search for loadable extensions (stored as
+system shared libraries) specified with the `-l' option rather than for
+source files. If the extension is not found, the path is searched
+again after adding the appropriate shared library suffix for the
+platform. For example, on GNU/Linux systems, the suffix `.so' is used.
+The search path specified is also used for extensions loaded via the
+`@load' keyword (*note Loading Shared Libraries::).

File: gawk.info, Node: Other Environment Variables, Prev: AWKLIBPATH Variable, Up: Environment Variables
@@ -2827,7 +2896,7 @@ used by regular users.
traditional and GNU extensions. *Note Options::.
`GAWK_SOCK_RETRIES'
- Controls the number of time `gawk' will attempt to retry a two-way
+ Controls the number of times `gawk' attempts to retry a two-way
TCP/IP (socket) connection before giving up. *Note TCP/IP
Networking::.
@@ -2844,9 +2913,18 @@ used by regular users.
the `gawk' developers for testing and tuning. They are subject to
change. The variables are:
+`AWKBUFSIZE'
+ This variable only affects `gawk' on POSIX-compliant systems.
+ With a value of `exact', `gawk' uses the size of each input file
+ as the size of the memory buffer to allocate for I/O. Otherwise,
+ the value should be a number, and `gawk' uses that number as the
+ size of the buffer to allocate. (When this variable is not set,
+ `gawk' uses the smaller of the file's size and the "default"
+ blocksize, which is usually the file system's I/O blocksize.)
+
`AWK_HASH'
- If this variable exists with a value of `gst', `gawk' will switch
- to using the hash function from GNU Smalltalk for managing arrays.
+ If this variable exists with a value of `gst', `gawk' switches to
+ using the hash function from GNU Smalltalk for managing arrays.
This function may be marginally faster than the standard function.
`AWKREADFUNC'
@@ -2949,7 +3027,7 @@ enclosed in double quotes.
NOTE: Keep in mind that this is a language construct and the file
name cannot be a string variable, but rather just a literal string
- in double quotes.
+ constant in double quotes.
The files to be included may be nested; e.g., given a third script,
namely `test3':
@@ -3004,22 +3082,22 @@ and this also applies to files named with `@include'.

File: gawk.info, Node: Loading Shared Libraries, Next: Obsolete, Prev: Include Files, Up: Invoking Gawk
-2.8 Loading Shared Libraries Into Your Program
-==============================================
+2.8 Loading Dynamic Extensions Into Your Program
+================================================
This minor node describes a feature that is specific to `gawk'.
- The `@load' keyword can be used to read external `awk' shared
-libraries. This allows you to link in compiled code that may offer
-superior performance and/or give you access to extended capabilities
-not supported by the `awk' language. The `AWKLIBPATH' variable is used
-to search for the shared library. Using `@load' is completely
-equivalent to using the `-l' command-line option.
+ The `@load' keyword can be used to read external `awk' extensions
+(stored as system shared libraries). This allows you to link in
+compiled code that may offer superior performance and/or give you
+access to extended capabilities not supported by the `awk' language.
+The `AWKLIBPATH' variable is used to search for the extension. Using
+`@load' is completely equivalent to using the `-l' command-line option.
- If the shared library is not initially found in `AWKLIBPATH', another
+ If the extension is not initially found in `AWKLIBPATH', another
search is conducted after appending the platform's default shared
-library suffix to the filename. For example, on GNU/Linux systems, the
-suffix `.so' is used.
+library suffix to the file name. For example, on GNU/Linux systems,
+the suffix `.so' is used.
$ gawk '@load "ordchr"; BEGIN {print chr(65)}'
-| A
@@ -3031,7 +3109,7 @@ This is equivalent to the following example:
For command-line usage, the `-l' option is more convenient, but `@load'
is useful for embedding inside an `awk' source file that requires
-access to a shared library.
+access to an extension.
*note Dynamic Extensions::, describes how to write extensions (in C
or C++) that can be loaded with either `@load' or the `-l' option.
@@ -3053,7 +3131,7 @@ worked. As of version 4.0, they are no longer interpreted specially by
`gawk'. (Use `PROCINFO' instead; see *note Auto-set::.)

-File: gawk.info, Node: Undocumented, Prev: Obsolete, Up: Invoking Gawk
+File: gawk.info, Node: Undocumented, Next: Invoking Summary, Prev: Obsolete, Up: Invoking Gawk
2.10 Undocumented Options and Features
======================================
@@ -3063,6 +3141,48 @@ File: gawk.info, Node: Undocumented, Prev: Obsolete, Up: Invoking Gawk
This minor node intentionally left blank.

+File: gawk.info, Node: Invoking Summary, Prev: Undocumented, Up: Invoking Gawk
+
+2.11 Summary
+============
+
+ * Use either `awk 'PROGRAM' FILES' or `awk -f PROGRAM-FILE FILES' to
+ run `awk'.
+
+ * The three standard `awk' options are `-f', `-F' and `-v'. `gawk'
+ supplies these and many others, as well as corresponding GNU-style
+ long options.
+
+ * Non-option command-line arguments are usually treated as file
+ names, unless they have the form `VAR=VALUE', in which case they
+ are taken as variable assignments to be performed at that point in
+ processing the input.
+
+ * All non-option command-line arguments, excluding the program text,
+ are placed in the `ARGV' array. Adjusting `ARGC' and `ARGV'
+ affects how `awk' processes input.
+
+ * You can use a single minus sign (`-') to refer to standard input
+ on the command line.
+
+ * `gawk' pays attention to a number of environment variables.
+ `AWKPATH', `AWKLIBPATH', and `POSIXLY_CORRECT' are the most
+ important ones.
+
+ * `gawk''s exit status conveys information to the program that
+ invoked it. Use the `exit' statement from within an `awk' program
+ to set the exit status.
+
+ * `gawk' allows you to include other `awk' source files into your
+ program using the `@include' statement and/or the `-i' and `-f'
+ command-line options.
+
+ * `gawk' allows you to load additional functions written in C or C++
+ using the `@load' statement and/or the `-l' option. (This
+ advanced feature is described later on in *note Dynamic
+ Extensions::.)
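
The points about `VAR=VALUE' arguments, `-v', and `-' can be checked with a small experiment. This is a sketch under stated assumptions: the file names `one.txt' and `two.txt' and the variable names `pass', `early', and `late' are invented for the example, and only standard `awk' features are used.

```shell
tmpdir=$(mktemp -d)
printf 'a\n' > "$tmpdir/one.txt"
printf 'b\n' > "$tmpdir/two.txt"

# pass=1 is assigned just before one.txt is read, pass=2 just before two.txt.
assign_out=$(awk '{ print pass ": " $0 }' pass=1 "$tmpdir/one.txt" pass=2 "$tmpdir/two.txt")

# -v assigns before BEGIN runs; a plain VAR=VALUE argument does not.
begin_out=$(awk -v early=yes 'BEGIN { print "early=" early ", late=" late }' late=yes)

# A lone "-" names standard input among the file arguments.
stdin_out=$(printf 'from stdin\n' | awk '{ print NR ": " $0 }' "$tmpdir/one.txt" -)

echo "$assign_out"
echo "$begin_out"
echo "$stdin_out"
rm -rf "$tmpdir"
```

The second command shows why `-v' exists: `late' is still unset inside `BEGIN', because command-line assignments take effect only when `awk' reaches them while working through its arguments.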
+
+
File: gawk.info, Node: Regexp, Next: Reading Files, Prev: Invoking Gawk, Up: Top
3 Regular Expressions
@@ -3091,6 +3211,7 @@ you specify more complicated classes of strings.
* Case-sensitivity:: How to do case-insensitive matching.
* Leftmost Longest:: How much text matches.
* Computed Regexps:: Using Dynamic Regexps.
+* Regexp Summary:: Regular expressions summary.

File: gawk.info, Node: Regexp Usage, Next: Escape Sequences, Up: Regexp
@@ -3102,8 +3223,8 @@ A regular expression can be used as a pattern by enclosing it in
slashes. Then the regular expression is tested against the entire text
of each record. (Normally, it only needs to match some part of the
text in order to succeed.) For example, the following prints the
-second field of each record that contains the string `li' anywhere in
-it:
+second field of each record where the string `li' appears anywhere in
+the record:
$ awk '/li/ { print $2 }' mail-list
-| 555-5553
@@ -3188,8 +3309,8 @@ apply to both string constants and regexp constants:
A literal backslash, `\'.
`\a'
- The "alert" character, `Ctrl-g', ASCII code 7 (BEL). (This
- usually makes some sort of audible noise.)
+ The "alert" character, `Ctrl-g', ASCII code 7 (BEL). (This often
+ makes some sort of audible noise.)
`\b'
Backspace, `Ctrl-h', ASCII code 8 (BS).
@@ -3324,20 +3445,21 @@ sequences and that are not listed in the table stand for themselves:
at the beginning of the string.
It is important to realize that `^' does not match the beginning of
- a line embedded in a string. The condition is not true in the
- following example:
+ a line (the point right after a `\n' newline character) embedded
+ in a string. The condition is not true in the following example:
if ("line1\nLINE 2" ~ /^L/) ...
`$'
This is similar to `^', but it matches only at the end of a string.
For example, `p$' matches a record that ends with a `p'. The `$'
- is an anchor and does not match the end of a line embedded in a
- string. The condition in the following example is not true:
+ is an anchor and does not match the end of a line (the point right
+ before a `\n' newline character) embedded in a string. The
+ condition in the following example is not true:
if ("line1\nLINE 2" ~ /1$/) ...
-`. (period)'
+`.' (period)
This matches any single character, _including_ the newline
character. For example, `.P' matches any single character
followed by a `P' in a string. Using concatenation, we can make a
@@ -3349,7 +3471,7 @@ sequences and that are not listed in the table stand for themselves:
Otherwise, NUL is just another character. Other versions of `awk'
may not be able to match the NUL character.
-`[...]'
+`['...`]'
This is called a "bracket expression".(1) It matches any _one_ of
the characters that are enclosed in the square brackets. For
example, `[MVX]' matches any one of the characters `M', `V', or
@@ -3357,7 +3479,7 @@ sequences and that are not listed in the table stand for themselves:
square brackets of a bracket expression is given in *note Bracket
Expressions::.
-`[^ ...]'
+`[^'...`]'
This is a "complemented bracket expression". The first character
after the `[' _must_ be a `^'. It matches any characters _except_
those in the square brackets. For example, `[^awk]' matches any
@@ -3373,7 +3495,7 @@ sequences and that are not listed in the table stand for themselves:
The alternation applies to the largest possible regexps on either
side.
-`(...)'
+`('...`)'
Parentheses are used for grouping in regular expressions, as in
arithmetic. They can be used to concatenate regular expressions
containing the alternation operator, `|'. For example,
@@ -3400,8 +3522,8 @@ sequences and that are not listed in the table stand for themselves:
This symbol is similar to `*', except that the preceding
expression must be matched at least once. This means that `wh+y'
would match `why' and `whhy', but not `wy', whereas `wh*y' would
- match all three of these strings. The following is a simpler way
- of writing the last `*' example:
+ match all three. The following is a simpler way of writing the
+ last `*' example:
awk '/\(c[ad]+r x\)/ { print }' sample
@@ -3410,9 +3532,9 @@ sequences and that are not listed in the table stand for themselves:
expression can be matched either once or not at all. For example,
`fe?d' matches `fed' and `fd', but nothing else.
-`{N}'
-`{N,}'
-`{N,M}'
+`{'N`}'
+`{'N`,}'
+`{'N`,'M`}'
One or two numbers inside braces denote an "interval expression".
If there is one number in the braces, the preceding regexp is
repeated N times. If there are two numbers separated by a comma,
@@ -3541,6 +3663,14 @@ set had other alphabetic characters in it, this would not match them.
With the POSIX character classes, you can write `/[[:alnum:]]/' to
match the alphabetic and numeric characters in your character set.
+ Some utilities that match regular expressions provide a non-standard
+`[:ascii:]' character class; `awk' does not. However, you can simulate
+such a construct using `[\x00-\x7F]'. This matches all values
+numerically between zero and 127, which is the defined range of the
+ASCII character set. Use a complemented bracket expression
+(`[^\x00-\x7F]') to match any single-byte characters that are not in
+the ASCII range.
+
Two additional special sequences can appear in bracket expressions.
These apply to non-ASCII character sets, which can have single symbols
(called "collating elements") that are represented with more than one
@@ -3692,7 +3822,9 @@ works in any POSIX-compliant `awk'.
Another method, specific to `gawk', is to set the variable
`IGNORECASE' to a nonzero value (*note Built-in Variables::). When
`IGNORECASE' is not zero, _all_ regexp and string operations ignore
-case. Changing the value of `IGNORECASE' dynamically controls the
+case.
+
+ Changing the value of `IGNORECASE' dynamically controls the
case-sensitivity of the program as it runs. Case is significant by
default because `IGNORECASE' (like most variables) is initialized to
zero:
@@ -3715,9 +3847,6 @@ dynamically turn case-sensitivity on or off for all the rules at once.
`IGNORECASE' from the command line is a way to make a program
case-insensitive without having to edit it.
- Both regexp and string comparison operations are affected by
-`IGNORECASE'.
-
In multibyte locales, the equivalences between upper- and lowercase
characters are tested based on the wide-character values of the
locale's character set. Otherwise, the characters are tested based on
@@ -3770,7 +3899,7 @@ this principle is also important for regexp-based record and field
splitting (*note Records::, and also *note Field Separators::).

-File: gawk.info, Node: Computed Regexps, Prev: Leftmost Longest, Up: Regexp
+File: gawk.info, Node: Computed Regexps, Next: Regexp Summary, Prev: Leftmost Longest, Up: Regexp
3.8 Using Dynamic Regexps
=========================
@@ -3779,7 +3908,8 @@ The righthand side of a `~' or `!~' operator need not be a regexp
constant (i.e., a string of characters between slashes). It may be any
expression. The expression is evaluated and converted to a string if
necessary; the contents of the string are then used as the regexp. A
-regexp computed in this way is called a "dynamic regexp":
+regexp computed in this way is called a "dynamic regexp" or a "computed
+regexp":
BEGIN { digits_regexp = "[[:digit:]]+" }
$0 ~ digits_regexp { print }
@@ -3827,8 +3957,8 @@ constants," for several reasons:
Using `\n' in Bracket Expressions of Dynamic Regexps
- Some commercial versions of `awk' do not allow the newline character
-to be used inside a bracket expression for a dynamic regexp:
+ Some versions of `awk' do not allow the newline character to be used
+inside a bracket expression for a dynamic regexp:
$ awk '$0 ~ "[ \t\n]"'
error--> awk: newline in character class [
@@ -3848,6 +3978,44 @@ to be used inside a bracket expression for a dynamic regexp:
often in practice, but it's worth noting for future reference.

+File: gawk.info, Node: Regexp Summary, Prev: Computed Regexps, Up: Regexp
+
+3.9 Summary
+===========
+
+ * Regular expressions describe sets of strings to be matched. In
+ `awk', regular expression constants are written enclosed between
+ slashes: `/'...`/'.
+
+ * Regexp constants may be used standalone in patterns and in
+ conditional expressions, or as part of matching expressions using
+ the `~' and `!~' operators.
+
+ * Escape sequences let you represent non-printable characters and
+ also let you represent regexp metacharacters as literal characters
+ to be matched.
+
+ * Regexp operators provide grouping, alternation and repetition.
+
+ * Bracket expressions give you a shorthand for specifying sets of
+ characters that can match at a particular point in a regexp.
+ Within bracket expressions, POSIX character classes let you specify
+ certain groups of characters in a locale-independent fashion.
+
+ * `gawk''s `IGNORECASE' variable lets you control the case
+ sensitivity of regexp matching. In other `awk' versions, use
+ `tolower()' or `toupper()'.
+
+ * Regular expressions match the leftmost longest text in the string
+ being matched. This matters for cases where you need to know the
+ extent of the match, such as for text substitution and when the
+ record separator is a regexp.
+
+ * Matching expressions may use dynamic regexps, that is, string
+ values treated as regular expressions.
+
+
+
File: gawk.info, Node: Reading Files, Next: Printing, Prev: Regexp, Up: Top
4 Reading Input Files
@@ -3887,6 +4055,8 @@ have to be named on the `awk' command line (*note Getline::).
* Read Timeout:: Reading input with a timeout.
* Command line directories:: What happens if you put a directory on the
command line.
+* Input Summary:: Input summary.
+* Input Exercises:: Exercises.

File: gawk.info, Node: Records, Next: Fields, Up: Reading Files
@@ -3902,8 +4072,19 @@ started. Another built-in variable, `NR', records the total number of
input records read so far from all data files. It starts at zero, but
is never automatically reset to zero.
- Records are separated by a character called the "record separator".
-By default, the record separator is the newline character. This is why
+* Menu:
+
+* awk split records:: How standard `awk' splits records.
+* gawk split records:: How `gawk' splits records.
+
+
+File: gawk.info, Node: awk split records, Next: gawk split records, Up: Records
+
+4.1.1 Record Splitting With Standard `awk'
+------------------------------------------
+
+Records are separated by a character called the "record separator". By
+default, the record separator is the newline character. This is why
records are, by default, single lines. A different character can be
used for the record separator by assigning the character to the
built-in variable `RS'.
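A minimal sketch of such an assignment follows (the semicolon separator and the sample input are assumptions chosen for illustration). The input deliberately has no trailing newline; otherwise the newline would become part of the last record:

```shell
# Split the input on semicolons instead of newlines.
printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }'
```

This prints three records, `1: a', `2: b' and `3: c', one per line.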
@@ -4023,16 +4204,22 @@ affected.
After the end of the record has been determined, `gawk' sets the
variable `RT' to the text in the input that matched `RS'.
- When using `gawk', the value of `RS' is not limited to a
-one-character string. It can be any regular expression (*note
-Regexp::). (c.e.) In general, each record ends at the next string that
-matches the regular expression; the next record starts at the end of
-the matching string. This general rule is actually at work in the
-usual case, where `RS' contains just a newline: a record ends at the
-beginning of the next matching string (the next newline in the input),
-and the following record starts just after the end of this string (at
-the first character of the following line). The newline, because it
-matches `RS', is not part of either record.
+
+File: gawk.info, Node: gawk split records, Prev: awk split records, Up: Records
+
+4.1.2 Record Splitting With `gawk'
+----------------------------------
+
+When using `gawk', the value of `RS' is not limited to a one-character
+string. It can be any regular expression (*note Regexp::). (c.e.) In
+general, each record ends at the next string that matches the regular
+expression; the next record starts at the end of the matching string.
+This general rule is actually at work in the usual case, where `RS'
+contains just a newline: a record ends at the beginning of the next
+matching string (the next newline in the input), and the following
+record starts just after the end of this string (at the first character
+of the following line). The newline, because it matches `RS', is not
+part of either record.
When `RS' is a single character, `RT' contains the same single
character. However, when `RS' is a regular expression, `RT' contains
@@ -4095,8 +4282,10 @@ use for `RS' in this case:
BEGIN { RS = "\0" } # whole file becomes one record?
`gawk' in fact accepts this, and uses the NUL character for the
-record separator. However, this usage is _not_ portable to most other
-`awk' implementations.
+record separator. This works for certain special files, such as
+`/proc/self/environ' on GNU/Linux systems, where the NUL character is in
+fact the record separator. However, this usage is _not_ portable to
+most other `awk' implementations.
Almost all other `awk' implementations(1) store strings internally
as C-style strings. C strings use the NUL character as the string
@@ -4107,10 +4296,9 @@ terminator. In effect, this means that `RS = "\0"' is the same as `RS
as a record separator. However, this is a special case: `mawk' does not
allow embedded NUL characters in strings.
- The best way to treat a whole file as a single record is to simply
-read the file in, one record at a time, concatenating each record onto
-the end of the previous ones.
-
+ *Note Readfile Function::, for an interesting, portable way to read
+whole files. If you are using `gawk', see *note Extension Sample
+Readfile::, for another option.
---------- Footnotes ----------
@@ -4135,7 +4323,7 @@ to these pieces of the record. You don't have to use them--you can
operate on the whole record if you want--but fields are what make
simple `awk' programs so powerful.
- A dollar-sign (`$') is used to refer to a field in an `awk' program,
+ You use a dollar-sign (`$') to refer to a field in an `awk' program,
followed by the number of the field you want. Thus, `$1' refers to the
first field, `$2' to the second, and so on. (Unlike the Unix shells,
the field numbers are not limited to single digits. `$127' is the one
@@ -4158,8 +4346,9 @@ the last one (such as `$8' when the record has only seven fields), you
get the empty string. (If used in a numeric operation, you get zero.)
The use of `$0', which looks like a reference to the "zero-th"
-field, is a special case: it represents the whole input record when you
-are not interested in specific fields. Here are some more examples:
+field, is a special case: it represents the whole input record. Use it
+when you are not interested in specific fields. Here are some more
+examples:
$ awk '$1 ~ /li/ { print $0 }' mail-list
-| Amelia 555-5553 amelia.zodiacusque@gmail.com F
@@ -4191,11 +4380,11 @@ File: gawk.info, Node: Nonconstant Fields, Next: Changing Fields, Prev: Field
4.3 Nonconstant Field Numbers
=============================
-The number of a field does not need to be a constant. Any expression in
-the `awk' language can be used after a `$' to refer to a field. The
-value of the expression specifies the field number. If the value is a
-string, rather than a number, it is converted to a number. Consider
-this example:
+A field number need not be a constant. Any expression in the `awk'
+language can be used after a `$' to refer to a field. The value of the
+expression specifies the field number. If the value is a string,
+rather than a number, it is converted to a number. Consider this
+example:
awk '{ print $NR }'
@@ -4212,7 +4401,7 @@ another example of using expressions as field numbers:
number of the field to print. The `*' sign represents multiplication,
so the expression `2*2' evaluates to four. The parentheses are used so
that the multiplication is done before the `$' operation; they are
-necessary whenever there is a binary operator in the field-number
+necessary whenever there is a binary operator(1) in the field-number
expression. This example, then, prints the type of relationship (the
fourth field) for every line of the file `mail-list'. (All of the
`awk' operators are listed, in order of decreasing precedence, in *note
@@ -4231,6 +4420,12 @@ Variables::). The expression `$NF' is not a special feature--it is the
direct consequence of evaluating `NF' and using its value as a field
number.
+ ---------- Footnotes ----------
+
+ (1) A "binary operator", such as `*' for multiplication, is one that
+takes two operands. The distinction is required, since `awk' also has
+unary (one-operand) and ternary (three-operand) operators.
+
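The following sketch (the four-field input line is made up for illustration) puts the points above together: a parenthesized expression, `NF' itself, and an expression involving `NF' are each used as a field number:

```shell
# $(2*2) is field four, $NF is the last field, $(NF-1) the one before it.
echo 'a b c d' | awk '{ print $(2*2), $NF, $(NF-1) }'
```

With four fields in the record, this prints `d d c'.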

File: gawk.info, Node: Changing Fields, Next: Field Separators, Prev: Nonconstant Fields, Up: Reading Files
@@ -4256,11 +4451,11 @@ three minus ten: `$3 - 10'. (*Note Arithmetic Ops::.) Then it prints
the original and new values for field three. (Someone in the warehouse
made a consistent mistake while inventorying the red boxes.)
- For this to work, the text in field `$3' must make sense as a
-number; the string of characters must be converted to a number for the
-computer to do arithmetic on it. The number resulting from the
-subtraction is converted back to a string of characters that then
-becomes field three. *Note Conversion::.
+ For this to work, the text in `$3' must make sense as a number; the
+string of characters must be converted to a number for the computer to
+do arithmetic on it. The number resulting from the subtraction is
+converted back to a string of characters that then becomes field three.
+*Note Conversion::.
When the value of a field is changed (as perceived by `awk'), the
text of the input record is recalculated to contain the new field where
@@ -4325,7 +4520,7 @@ even when you assign the empty string to a field. For example:
-| a::c:d
-| 4
-The field is still there; it just has an empty value, denoted by the
+The field is still there; it just has an empty value, delimited by the
two colons between `a' and `c'. This example shows what happens if you
create a new field:
@@ -4511,7 +4706,7 @@ letter):
> { print $2 }'
-| a
-In this case, the first field is "null" or empty.
+In this case, the first field is null, or empty.
The stripping of leading and trailing whitespace also comes into
play whenever `$0' is recomputed. For instance, study this pipeline:
@@ -4950,7 +5145,7 @@ affects field splitting with `FPAT'.
deal with this. Since there is no formal specification for CSV
data, there isn't much more to be done; the `FPAT' mechanism
provides an elegant solution for the majority of cases, and the
- `gawk' maintainer is satisfied with that.
+ `gawk' developers are satisfied with that.
As written, the regexp used for `FPAT' requires that each field have
at least one character. A straightforward modification (changing
@@ -5000,8 +5195,8 @@ doesn't start until the first nonblank line that follows--no matter how
many blank lines appear in a row, they are considered one record
separator.
- There is an important difference between `RS = ""' and `RS =
-"\n\n+"'. In the first case, leading newlines in the input data file
+ However, there is an important difference between `RS = ""' and `RS
+= "\n\n+"'. In the first case, leading newlines in the input data file
are ignored, and if a file ends without extra blank lines after the
last record, the final newline is removed from the record. In the
second case, this special processing is not done. (d.c.)
@@ -5111,7 +5306,7 @@ File: gawk.info, Node: Getline, Next: Read Timeout, Prev: Multiple Line, Up:
=================================
So far we have been getting our input data from `awk''s main input
-stream--either the standard input (usually your terminal, sometimes the
+stream--either the standard input (usually your keyboard, sometimes the
output from another program) or from the files specified on the command
line. The `awk' language has a special built-in command called
`getline' that can be used to read input under your explicit control.
@@ -5273,9 +5468,9 @@ are changed, resulting in a new value of `NF'. `RT' is also set.
According to POSIX, `getline < EXPRESSION' is ambiguous if
EXPRESSION contains unparenthesized operators other than `$'; for
example, `getline < dir "/" file' is ambiguous because the
-concatenation operator is not parenthesized. You should write it as
-`getline < (dir "/" file)' if you want your program to be portable to
-all `awk' implementations.
+concatenation operator (not discussed yet; *note Concatenation::) is
+not parenthesized. You should write it as `getline < (dir "/" file)' if
+you want your program to be portable to all `awk' implementations.
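Here is a small sketch of the parenthesized form. The scratch directory, the file name `data.txt', and the variable names `dir' and `file' are hypothetical, created only for this demonstration:

```shell
# Hypothetical scratch directory and data file, used only for this demo.
dir=$(mktemp -d)
printf 'first\nsecond\n' > "$dir/data.txt"

awk -v dir="$dir" -v file="data.txt" 'BEGIN {
    # Parenthesize the concatenation so the redirection is unambiguous.
    while ((getline < (dir "/" file)) > 0)
        n++
    close(dir "/" file)
    print n " records read"
}'

rm -r "$dir"
```

The loop reads both lines of the file, so the program reports `2 records read'.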

File: gawk.info, Node: Getline/Variable/File, Next: Getline/Pipe, Prev: Getline/File, Up: Getline
@@ -5480,10 +5675,10 @@ in mind:
testing the new record against every pattern. However, the new
record is tested against any subsequent rules.
- * Many `awk' implementations limit the number of pipelines that an
- `awk' program may have open to just one. In `gawk', there is no
- such limit. You can open as many pipelines (and coprocesses) as
- the underlying operating system permits.
+ * Some very old `awk' implementations limit the number of pipelines
+ that an `awk' program may have open to just one. In `gawk', there
+ is no such limit. You can open as many pipelines (and
+ coprocesses) as the underlying operating system permits.
* An interesting side effect occurs if you use `getline' without a
redirection inside a `BEGIN' rule. Because an unredirected
@@ -5522,9 +5717,9 @@ in mind:
file is encountered, before the element in `a' is assigned?
`gawk' treats `getline' like a function call, and evaluates the
- expression `a[++c]' before attempting to read from `f'. Other
- versions of `awk' only evaluate the expression once they know that
- there is a string value to be assigned. Caveat Emptor.
+ expression `a[++c]' before attempting to read from `f'. However,
+ some versions of `awk' only evaluate the expression once they know
+ that there is a string value to be assigned. Caveat Emptor.

File: gawk.info, Node: Getline Summary, Prev: Getline Notes, Up: Getline
@@ -5560,10 +5755,12 @@ File: gawk.info, Node: Read Timeout, Next: Command line directories, Prev: Ge
4.10 Reading Input With A Timeout
=================================
-You may specify a timeout in milliseconds for reading input from a
-terminal, pipe or two-way communication including, TCP/IP sockets. This
-can be done on a per input, command or connection basis, by setting a
-special element in the `PROCINFO' array:
+This minor node describes a feature that is specific to `gawk'.
+
+ You may specify a timeout in milliseconds for reading input from the
+keyboard, a pipe, or two-way communication, including TCP/IP sockets.
+This can be done on a per input, command or connection basis, by
+setting a special element in the `PROCINFO' (*note Auto-set::) array:
PROCINFO["input_name", "READ_TIMEOUT"] = TIMEOUT IN MILLISECONDS
@@ -5579,19 +5776,19 @@ from the server after a certain amount of time:
else if (ERRNO != "")
print ERRNO
- Here is how to read interactively from the terminal(1) without
-waiting for more than five seconds:
+ Here is how to read interactively from the user(1) without waiting
+for more than five seconds:
PROCINFO["/dev/stdin", "READ_TIMEOUT"] = 5000
while ((getline < "/dev/stdin") > 0)
print $0
- `gawk' will terminate the read operation if input does not arrive
-after waiting for the timeout period, return failure and set the
-`ERRNO' variable to an appropriate string value. A negative or zero
-value for the timeout is the same as specifying no timeout at all.
+ `gawk' terminates the read operation if input does not arrive after
+waiting for the timeout period, returns failure and sets the `ERRNO'
+variable to an appropriate string value. A negative or zero value for
+the timeout is the same as specifying no timeout at all.
- A timeout can also be set for reading from the terminal in the
+ A timeout can also be set for reading from the keyboard in the
implicit loop that reads input records and matches them against
patterns, like so:
@@ -5644,23 +5841,118 @@ writing.
---------- Footnotes ----------
- (1) This assumes that standard input is the keyboard
+ (1) This assumes that standard input is the keyboard.

-File: gawk.info, Node: Command line directories, Prev: Read Timeout, Up: Reading Files
+File: gawk.info, Node: Command line directories, Next: Input Summary, Prev: Read Timeout, Up: Reading Files
4.11 Directories On The Command Line
====================================
According to the POSIX standard, files named on the `awk' command line
-must be text files. It is a fatal error if they are not. Most
-versions of `awk' treat a directory on the command line as a fatal
-error.
+must be text files; it is a fatal error if they are not. Most versions
+of `awk' treat a directory on the command line as a fatal error.
By default, `gawk' produces a warning for a directory on the command
-line, but otherwise ignores it. If either of the `--posix' or
-`--traditional' options is given, then `gawk' reverts to treating a
-directory on the command line as a fatal error.
+line, but otherwise ignores it. This makes it easier to use shell
+wildcards with your `awk' program:
+
+ $ gawk -f whizprog.awk * Directories could kill this program
+
+ If either of the `--posix' or `--traditional' options is given, then
+`gawk' reverts to treating a directory on the command line as a fatal
+error.
+
+ *Note Extension Sample Readdir::, for a way to treat directories as
+usable data from an `awk' program.
+
+
+File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command line directories, Up: Reading Files
+
+4.12 Summary
+============
+
+ * Input is split into records based on the value of `RS'. The
+ possibilities are as follows:
+
+     Value of `RS'      Records are split on        `awk' / `gawk'
+     ----------------------------------------------------------------------
+     Any single         That character              `awk'
+     character
+     The empty string   Runs of two or more         `awk'
+     (`""')             newlines
+     A regexp           Text that matches the       `gawk'
+                        regexp
+
+ * `gawk' sets `RT' to the text matched by `RS'.
+
+ * After splitting the input into records, `awk' further splits the
+ record into individual fields, named `$1', `$2' and so on. `$0' is
+ the whole record, and `NF' indicates how many fields there are.
+ The default way to split fields is between whitespace characters.
+
+ * Fields may be referenced using a variable, as in `$NF'. Fields
+ may also be assigned values, which causes the value of `$0' to be
+ recomputed when it is later referenced. Assigning to a field with
+ a number greater than `NF' creates the field and rebuilds the
+ record, using `OFS' to separate the fields. Incrementing `NF'
+ does the same thing. Decrementing `NF' throws away fields and
+ rebuilds the record.
+
+ * Field splitting is more complicated than record splitting.
+
+     Field separator value      Fields are split ...         `awk' /
+                                                             `gawk'
+     ----------------------------------------------------------------------
+     `FS == " "'                On runs of whitespace        `awk'
+     `FS == ANY SINGLE          On that character            `awk'
+     CHARACTER'
+     `FS == REGEXP'             On text matching the         `awk'
+                                regexp
+     `FS == ""'                 Each individual character    `gawk'
+                                is a separate field
+     `FIELDWIDTHS == LIST OF    Based on character           `gawk'
+     COLUMNS'                   position
+     `FPAT == REGEXP'           On text around text          `gawk'
+                                matching the regexp
+
+ Using `FS = "\n"' causes the entire record to be a single field
+ (assuming that newlines separate records).
+
+ * `FS' may be set from the command line using the `-F' option. This
+ can also be done using command-line variable assignment.
+
+ * `PROCINFO["FS"]' can be used to see how fields are being split.
+
+ * Use `getline' in its various forms to read additional records,
+ from the default input stream, from a file, or from a pipe or
+ co-process.
+
+ * Use `PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out for
+ FILE.
+
+ * Directories on the command line are fatal for standard `awk';
+ `gawk' ignores them if not in POSIX mode.
+
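As a brief sketch of the summary item on assigning to fields (the input line and the choice of `:' for `OFS' are assumptions made for illustration), assigning to a field beyond `NF' creates the intervening fields and rebuilds the record using `OFS':

```shell
# Assigning to $5 in a three-field record creates an empty $4 as well.
echo 'a b c' | awk 'BEGIN { OFS = ":" } { $5 = "e"; print; print NF }'
```

The rebuilt record is `a:b:c::e', and `NF' is now 5.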
+
+
+File: gawk.info, Node: Input Exercises, Prev: Input Summary, Up: Reading Files
+
+4.13 Exercises
+==============
+
+ 1. Using the `FIELDWIDTHS' variable (*note Constant Size::), write a
+ program to read election data, where each record represents one
+ voter's votes. Come up with a way to define which columns are
+ associated with each ballot item, and print the total votes,
+ including abstentions, for each item.
+
+ 2. *note Plain Getline::, presented a program to remove C-style
+ comments (`/* ... */') from the input. That program does not work
+ if one comment ends on one line and another one starts later on
+ the same line. Write a program that does handle multiple comments
+ on the line.
+

File: gawk.info, Node: Printing, Next: Expressions, Prev: Reading Files, Up: Top
@@ -5696,6 +5988,8 @@ function.
`gawk' allows access to inherited file
descriptors.
* Close Files And Pipes:: Closing Input and Output Files and Pipes.
+* Output Summary:: Output summary.
+* Output exercises:: Exercises.

File: gawk.info, Node: Print, Next: Print Examples, Up: Printing
@@ -5704,9 +5998,9 @@ File: gawk.info, Node: Print, Next: Print Examples, Up: Printing
=========================
The `print' statement is used for producing output with simple,
-standardized formatting. Specify only the strings or numbers to print,
-in a list separated by commas. They are output, separated by single
-spaces, followed by a newline. The statement looks like this:
+standardized formatting. You specify only the strings or numbers to
+print, in a list separated by commas. They are output, separated by
+single spaces, followed by a newline. The statement looks like this:
print ITEM1, ITEM2, ...
@@ -5773,8 +6067,8 @@ Here is the same program, without the comma:
To someone unfamiliar with the `inventory-shipped' file, neither
example's output makes much sense. A heading line at the beginning
would make it clearer. Let's add some headings to our table of months
-(`$1') and green crates shipped (`$2'). We do this using the `BEGIN'
-pattern (*note BEGIN/END::) so that the headings are only printed once:
+(`$1') and green crates shipped (`$2'). We do this using a `BEGIN'
+rule (*note BEGIN/END::) so that the headings are only printed once:
awk 'BEGIN { print "Month Crates"
print "----- ------" }
@@ -5987,11 +6281,11 @@ width. Here is a list of the format-control letters:
the first byte of a string or to numeric values within the
range of a single byte (0-255).
-`%d, %i'
+`%d', `%i'
Print a decimal integer. The two control letters are equivalent.
(The `%i' specification is for compatibility with ISO C.)
-`%e, %E'
+`%e', `%E'
Print a number in scientific (exponential) notation; for example:
printf "%4.3e\n", 1950
@@ -6013,7 +6307,8 @@ width. Here is a list of the format-control letters:
On systems supporting IEEE 754 floating point format, values
representing negative infinity are formatted as `-inf' or
`-infinity', and positive infinity as `inf' and `infinity'. The
- special "not a number" value formats as `-nan' or `nan'.
+ special "not a number" value formats as `-nan' or `nan' (*note
+ Math Definitions::).
`%F'
Like `%f' but the infinity and "not a number" values are spelled
@@ -6022,7 +6317,7 @@ width. Here is a list of the format-control letters:
The `%F' format is a POSIX extension to ISO C; not all systems
support it. On those that don't, `gawk' uses `%f' instead.
-`%g, %G'
+`%g', `%G'
Print a number in either scientific notation or in floating-point
notation, whichever uses fewer characters; if the result is
printed in scientific notation, `%G' uses `E' instead of `e'.
@@ -6038,7 +6333,7 @@ width. Here is a list of the format-control letters:
use, because all numbers in `awk' are floating-point; it is
provided primarily for compatibility with C.)
-`%x, %X'
+`%x', `%X'
Print an unsigned hexadecimal integer; `%X' uses the letters `A'
through `F' instead of `a' through `f' (*note
Nondecimal-numbers::).
@@ -6213,11 +6508,12 @@ string, like so:
This is not particularly easy to read but it does work.
- C programmers may be used to supplying additional `l', `L', and `h'
-modifiers in `printf' format strings. These are not valid in `awk'.
-Most `awk' implementations silently ignore them. If `--lint' is
-provided on the command line (*note Options::), `gawk' warns about
-their use. If `--posix' is supplied, their use is a fatal error.
+ C programmers may be used to supplying additional modifiers (`h',
+`j', `l', `L', `t', and `z') in `printf' format strings. These are not
+valid in `awk'. Most `awk' implementations silently ignore them. If
+`--lint' is provided on the command line (*note Options::), `gawk'
+warns about their use. If `--posix' is supplied, their use is a fatal
+error.

File: gawk.info, Node: Printf Examples, Prev: Format Modifiers, Up: Printf
@@ -6258,7 +6554,7 @@ they are last on their lines. They don't need to have spaces after
them.
The table could be made to look even nicer by adding headings to the
-tops of the columns. This is done using the `BEGIN' pattern (*note
+tops of the columns. This is done using a `BEGIN' rule (*note
BEGIN/END::) so that the headers are only printed once, at the
beginning of the `awk' program:
@@ -6285,11 +6581,6 @@ be emphasized by storing it in a variable, like this:
printf format, "----", "------" }
{ printf format, $1, $2 }' mail-list
- At this point, it would be a worthwhile exercise to use the `printf'
-statement to line up the headings and table data for the
-`inventory-shipped' example that was covered earlier in the minor node
-on the `print' statement (*note Print::).
-

File: gawk.info, Node: Redirection, Next: Special Files, Prev: Printf, Up: Printing
@@ -6309,7 +6600,7 @@ commands, except that they are written inside the `awk' program.
There are four forms of output redirection: output to a file, output
appended to a file, output through a pipe to another command, and output
-to a coprocess. They are all shown for the `print' statement, but they
+to a coprocess. We show them all for the `print' statement, but they
work identically for `printf':
`print ITEMS > OUTPUT-FILE'
@@ -6390,7 +6681,7 @@ work identically for `printf':
FILE or COMMAND--it is not necessary to always use a string
constant. Using a variable is generally a good idea, because (if
you mean to refer to that same file or command) `awk' requires
- that the string value be spelled identically every time.
+ that the string value be written identically every time.
`print ITEMS |& COMMAND'
This redirection prints the items to the input of COMMAND. The
@@ -6450,8 +6741,8 @@ to rename the files. It then sends the list to the shell for execution.

File: gawk.info, Node: Special Files, Next: Close Files And Pipes, Prev: Redirection, Up: Printing
-5.7 Special File Names in `gawk'
-================================
+5.7 Special File Name in `gawk'
+===============================
`gawk' provides a number of special file names that it interprets
internally. These file names provide access to standard file
@@ -6502,7 +6793,7 @@ run from a background job, it may not have a terminal at all. Then
opening `/dev/tty' fails.
`gawk' provides special file names for accessing the three standard
-streams. (c.e.). It also provides syntax for accessing any other
+streams. (c.e.) It also provides syntax for accessing any other
inherited open files. If the file name matches one of these special
names when `gawk' redirects input or output, then it directly uses the
stream that the file name stands for. These special file names work
@@ -6587,7 +6878,7 @@ names that `gawk' provides:
behavior.

-File: gawk.info, Node: Close Files And Pipes, Prev: Special Files, Up: Printing
+File: gawk.info, Node: Close Files And Pipes, Next: Output Summary, Prev: Special Files, Up: Printing
5.8 Closing Input and Output Redirections
=========================================
@@ -6695,14 +6986,15 @@ end-of-file return status from `getline'), the child process is not
terminated;(1) more importantly, the file descriptor for the pipe is
not closed and released until `close()' is called or `awk' exits.
- `close()' will silently do nothing if given an argument that does
-not represent a file, pipe or coprocess that was opened with a
-redirection.
+ `close()' silently does nothing if given an argument that does not
+represent a file, pipe or coprocess that was opened with a redirection.
+In such a case, it returns a negative value, indicating an error. In
+addition, `gawk' sets `ERRNO' to a string indicating the error.
Note also that `close(FILENAME)' has no "magic" effects on the
implicit loop that reads through the files named on the command line.
-It is, more likely, a close of a file that was never opened, so `awk'
-silently does nothing.
+It is, more likely, a close of a file that was never opened with a
+redirection, so `awk' silently does nothing.
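The return value can be checked with a sketch along these lines. The name passed to `close()' is made up and was never opened; the example assumes an `awk' in which `close()' is a function rather than a statement, as it is in `gawk':

```shell
awk 'BEGIN {
    # Nothing was ever opened under this made-up name.
    ret = close("no-such-redirection")
    print (ret < 0 ? "negative" : "zero or positive")
}'
```

With `gawk', this should print `negative'.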
When using the `|&' operator to communicate with a coprocess, it is
occasionally useful to be able to close one end of the two-way pipe
@@ -6716,9 +7008,9 @@ I/O::, which discusses it in more detail and gives an example.
Using `close()''s Return Value
- In many versions of Unix `awk', the `close()' function is actually a
-statement. It is a syntax error to try and use the return value from
-`close()': (d.c.)
+ In many older versions of Unix `awk', the `close()' function is
+actually a statement. It is a syntax error to try and use the return
+value from `close()': (d.c.)
command = "..."
command | getline info
@@ -6753,6 +7045,56 @@ call. See the system manual pages for information on how to decode this
value.

+File: gawk.info, Node: Output Summary, Next: Output exercises, Prev: Close Files And Pipes, Up: Printing
+
+5.9 Summary
+===========
+
+ * The `print' statement prints comma-separated expressions. Each
+ expression is separated by the value of `OFS' and terminated by
+ the value of `ORS'. `OFMT' provides the conversion format for
+ numeric values for the `print' statement.
+
+ * The `printf' statement provides finer-grained control over output,
+ with format control letters for different data types and various
+ flags that modify the behavior of the format control letters.
+
+ * Output from both `print' and `printf' may be redirected to files,
+ pipes, and co-processes.
+
+ * `gawk' provides special file names for access to standard input,
+ output and error, and for network communications.
+
+ * Use `close()' to close open file, pipe and co-process redirections.
+ For co-processes, it is possible to close only one direction of the
+ communications.
+
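The first summary item can be seen in a single statement. This is only a sketch; the particular values chosen for `OFS', `ORS' and `OFMT' are arbitrary:

```shell
# OFS separates the items, ORS ends the line, and OFMT formats
# the non-integral number; the string and the integer are unaffected.
awk 'BEGIN { OFS = "-"; ORS = "!\n"; OFMT = "%.2f"; print 3.14159, "x", 17 }'
```

The output is the single line `3.14-x-17!'.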
+
+
+File: gawk.info, Node: Output exercises, Prev: Output Summary, Up: Printing
+
+5.10 Exercises
+==============
+
+ 1. Rewrite the program:
+
+ awk 'BEGIN { print "Month Crates"
+ print "----- ------" }
+ { print $1, " ", $2 }' inventory-shipped
+
+ from *note Output Separators::, by using a new value of `OFS'.
+
+ 2. Use the `printf' statement to line up the headings and table data
+ for the `inventory-shipped' example that was covered in *note
+ Print::.
+
+ 3. What happens if you forget the double quotes when redirecting
+ output, as follows:
+
+ BEGIN { print "Serious error detected!" > /dev/stderr }
+
+
+
File: gawk.info, Node: Expressions, Next: Patterns and Actions, Prev: Printing, Up: Top
6 Expressions
@@ -6778,6 +7120,7 @@ operators.
* Function Calls:: A function call is an expression.
* Precedence:: How various operators nest.
* Locales:: How the locale affects things.
+* Expressions Summary:: Expressions summary.

File: gawk.info, Node: Values, Next: All Operators, Up: Expressions
@@ -6847,7 +7190,8 @@ codes.
(1) The internal representation of all numbers, including integers,
uses double precision floating-point numbers. On most modern systems,
-these are in IEEE 754 standard format.
+these are in IEEE 754 standard format. *Note Arbitrary Precision
+Arithmetic::, for much more information.

File: gawk.info, Node: Nondecimal-numbers, Next: Regexp Constants, Prev: Scalar Constants, Up: Constants
@@ -6980,8 +7324,8 @@ the contents of the current input record.
Constant regular expressions are also used as the first argument for
the `gensub()', `sub()', and `gsub()' functions, as the second argument
-of the `match()' function, and as the third argument of the
-`patsplit()' function (*note String Functions::). Modern
+of the `match()' function, and as the third argument of the `split()'
+and `patsplit()' functions (*note String Functions::). Modern
implementations of `awk', including `gawk', allow the third argument of
`split()' to be a regexp constant, but some older implementations do
not. (d.c.) This can lead to confusion when attempting to use regexp
@@ -7119,6 +7463,22 @@ File: gawk.info, Node: Conversion, Prev: Variables, Up: Values
6.1.4 Conversion of Strings and Numbers
---------------------------------------
+Number to string and string to number conversion are generally
+straightforward. There can be subtleties to be aware of; this minor
+node discusses this important facet of `awk'.
+
+* Menu:
+
+* Strings And Numbers:: How `awk' Converts Between Strings And
+ Numbers.
+* Locale influences conversions:: How the locale may affect conversions.
+
+
+File: gawk.info, Node: Strings And Numbers, Next: Locale influences conversions, Up: Conversion
+
+6.1.4.1 How `awk' Converts Between Strings And Numbers
+......................................................
+
Strings are converted to numbers and numbers are converted to strings,
if the context of the `awk' program demands it. For example, if the
value of either `foo' or `bar' in the expression `foo + bar' happens to
@@ -7168,35 +7528,47 @@ value of `CONVFMT' may be. Given the following code fragment:
`b' has the value `"12"', not `"12.00"'. (d.c.)
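
A minimal sketch of the rule above, assuming a POSIX-compatible `awk' on the search path. The shell variable `out' and the sample values are ours, chosen for illustration only. An integral value ignores `CONVFMT' when converted to a string, while a non-integral value is formatted with it:

```shell
out=$(awk 'BEGIN {
    CONVFMT = "%.2f"
    a = 12          # integral value: converts as an integer
    b = 12.3456     # non-integral value: converts using CONVFMT
    # Concatenating with "" forces number-to-string conversion.
    print (a "") " " (b "")
}')
printf '%s\n' "$out"
```

Concatenation with the empty string is used here only because it is a short way to force the conversion; any string context would do.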
+ Pre-POSIX `awk' Used `OFMT' For String Conversion
+
Prior to the POSIX standard, `awk' used the value of `OFMT' for
converting numbers to strings. `OFMT' specifies the output format to
use when printing numbers with `print'. `CONVFMT' was introduced in
order to separate the semantics of conversion from the semantics of
printing. Both `CONVFMT' and `OFMT' have the same default value:
`"%.6g"'. In the vast majority of cases, old `awk' programs do not
-change their behavior. However, these semantics for `OFMT' are
-something to keep in mind if you must port your new-style program to
-older implementations of `awk'. We recommend that instead of changing
-your programs, just port `gawk' itself. *Note Print::, for more
-information on the `print' statement.
-
- And, once again, where you are can matter when it comes to converting
-between numbers and strings. In *note Locales::, we mentioned that the
-local character set and language (the locale) can affect how `gawk'
-matches characters. The locale also affects numeric formats. In
-particular, for `awk' programs, it affects the decimal point character.
-The `"C"' locale, and most English-language locales, use the period
-character (`.') as the decimal point. However, many (if not most)
-European and non-English locales use the comma (`,') as the decimal
-point character.
+change their behavior. *Note Print::, for more information on the
+`print' statement.
+
+ ---------- Footnotes ----------
+
+ (1) Pathological cases can require up to 752 digits (!), but we
+doubt that you need to worry about this.
+
+
+File: gawk.info, Node: Locale influences conversions, Prev: Strings And Numbers, Up: Conversion
+
+6.1.4.2 Locales Can Influence Conversion
+........................................
+
+Where you are can matter when it comes to converting between numbers and
+strings. The local character set and language--the "locale"--can
+affect numeric formats. In particular, for `awk' programs, it affects
+the decimal point character and the thousands-separator character. The
+`"C"' locale, and most English-language locales, use the period
+character (`.') as the decimal point and don't have a thousands
+separator. However, many (if not most) European and non-English
+locales use the comma (`,') as the decimal point character. European
+locales often use either a space or a period as the thousands
+separator, if they have one.
The POSIX standard says that `awk' always uses the period as the
decimal point when reading the `awk' program source code, and for
command-line variable assignments (*note Other Arguments::). However,
when interpreting input data, for `print' and `printf' output, and for
number to string conversion, the local decimal point character is used.
-(d.c.) Here are some examples indicating the difference in behavior,
-on a GNU/Linux system:
+(d.c.) In all cases, numbers in source code and in input data cannot
+have a thousands separator. Here are some examples indicating the
+difference in behavior, on a GNU/Linux system:
$ export POSIXLY_CORRECT=1 Force POSIX behavior
$ gawk 'BEGIN { printf "%g\n", 3.1415927 }'
@@ -7241,11 +7613,6 @@ representation can have an unusual but important effect on the way
`gawk' converts some special string values to numbers. The details are
presented in *note POSIX Floating Point Problems::.
- ---------- Footnotes ----------
-
- (1) Pathological cases can require up to 752 digits (!), but we
-doubt that you need to worry about this.
-

File: gawk.info, Node: All Operators, Next: Truth Values and Conditions, Prev: Values, Up: Expressions
@@ -7401,9 +7768,9 @@ example:
print (a " " (a = "panic"))
}
-It is not defined whether the assignment to `a' happens before or after
-the value of `a' is retrieved for producing the concatenated value.
-The result could be either `don't panic', or `panic panic'.
+It is not defined whether the second assignment to `a' happens before
+or after the value of `a' is retrieved for producing the concatenated
+value. The result could be either `don't panic', or `panic panic'.
The precedence of concatenation, when mixed with other operators, is
often counter-intuitive. Consider this example:
@@ -7476,9 +7843,9 @@ that the assignment stores in the specified variable, field, or array
element. (Such values are called "rvalues".)
It is important to note that variables do _not_ have permanent types.
-A variable's type is simply the type of whatever value it happens to
-hold at the moment. In the following program fragment, the variable
-`foo' has a numeric value at first, and a string value later on:
+A variable's type is simply the type of whatever value was last assigned
+to it. In the following program fragment, the variable `foo' has a
+numeric value at first, and a string value later on:
foo = 1
print foo
@@ -7551,9 +7918,10 @@ The indices of `bar' are practically guaranteed to be different, because
the `rand()' function haven't been covered yet. *Note Arrays::, and
see *note Numeric Functions::, for more information). This example
illustrates an important fact about assignment operators: the lefthand
-expression is only evaluated _once_. It is up to the implementation as
-to which expression is evaluated first, the lefthand or the righthand.
-Consider this example:
+expression is only evaluated _once_.
+
+ It is up to the implementation as to which expression is evaluated
+first, the lefthand or the righthand. Consider this example:
i = 1
a[i += 2] = i + 1
@@ -7566,14 +7934,14 @@ converted to a number.
Operator Effect
--------------------------------------------------------------------------
-LVALUE `+=' INCREMENT Adds INCREMENT to the value of LVALUE.
-LVALUE `-=' DECREMENT Subtracts DECREMENT from the value of LVALUE.
-LVALUE `*=' Multiplies the value of LVALUE by COEFFICIENT.
+LVALUE `+=' INCREMENT Add INCREMENT to the value of LVALUE.
+LVALUE `-=' DECREMENT Subtract DECREMENT from the value of LVALUE.
+LVALUE `*=' Multiply the value of LVALUE by COEFFICIENT.
COEFFICIENT
-LVALUE `/=' DIVISOR Divides the value of LVALUE by DIVISOR.
-LVALUE `%=' MODULUS Sets LVALUE to its remainder by MODULUS.
+LVALUE `/=' DIVISOR Divide the value of LVALUE by DIVISOR.
+LVALUE `%=' MODULUS Set LVALUE to its remainder by MODULUS.
LVALUE `^=' POWER
-LVALUE `**=' POWER Raises LVALUE to the power POWER. (c.e.)
+LVALUE `**=' POWER Raise LVALUE to the power POWER. (c.e.)
Table 6.2: Arithmetic Assignment Operators
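
As an illustration of the table, the following sketch applies each operator in turn to one variable. It assumes a POSIX-compatible `awk'; the variable names `x', `s', and the shell variable `out' are hypothetical. The `**=' form is left out because it is not portable:

```shell
out=$(awk 'BEGIN {
    x = 7
    x += 3;  s = x          # 7 + 3  gives 10
    x -= 4;  s = s " " x    # 10 - 4 gives 6
    x *= 5;  s = s " " x    # 6 * 5  gives 30
    x /= 3;  s = s " " x    # 30 / 3 gives 10
    x %= 4;  s = s " " x    # 10 % 4 gives 2
    x ^= 3;  s = s " " x    # 2 ^ 3  gives 8
    print s
}')
printf '%s\n' "$out"
```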
@@ -7596,8 +7964,8 @@ A workaround is:
awk '/[=]=/' /dev/null
- `gawk' does not have this problem, nor do the other freely available
-versions described in *note Other Versions::.
+ `gawk' does not have this problem; Brian Kernighan's `awk' and
+`mawk' also do not (*note Other Versions::).

File: gawk.info, Node: Increment Ops, Prev: Assignment Ops, Up: All Operators
@@ -7612,13 +7980,13 @@ they are convenient abbreviations for very common operations.
The operator used for adding one is written `++'. It can be used to
increment a variable either before or after taking its value. To
-pre-increment a variable `v', write `++v'. This adds one to the value
-of `v'--that new value is also the value of the expression. (The
+"pre-increment" a variable `v', write `++v'. This adds one to the
+value of `v'--that new value is also the value of the expression. (The
assignment expression `v += 1' is completely equivalent.) Writing the
-`++' after the variable specifies post-increment. This increments the
-variable value just the same; the difference is that the value of the
-increment expression itself is the variable's _old_ value. Thus, if
-`foo' has the value four, then the expression `foo++' has the value
+`++' after the variable specifies "post-increment". This increments
+the variable value just the same; the difference is that the value of
+the increment expression itself is the variable's _old_ value. Thus,
+if `foo' has the value four, then the expression `foo++' has the value
four, but it changes the value of `foo' to five. In other words, the
operator returns the old value of the variable, but with the side
effect of incrementing it.
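
The difference between the two forms can be sketched as follows, assuming a POSIX-compatible `awk'. The names `post', `pre', and `out' are ours and appear nowhere else in this document:

```shell
out=$(awk 'BEGIN {
    foo = 4
    post = foo++    # post receives the old value, 4; foo becomes 5
    pre  = ++foo    # foo becomes 6 first; pre receives the new value, 6
    print post, pre, foo
}')
printf '%s\n' "$out"
```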
@@ -7765,10 +8133,12 @@ The 1992 POSIX standard introduced the concept of a "numeric string",
which is simply a string that looks like a number--for example,
`" +2"'. This concept is used for determining the type of a variable.
The type of the variable is important because the types of two variables
-determine how they are compared. The various versions of the POSIX
-standard did not get the rules quite right for several editions.
-Fortunately, as of at least the 2008 standard (and possibly earlier),
-the standard has been fixed, and variable typing follows these rules:(1)
+determine how they are compared.
+
+ The various versions of the POSIX standard did not get the rules
+quite right for several editions. Fortunately, as of at least the 2008
+standard (and possibly earlier), the standard has been fixed, and
+variable typing follows these rules:(1)
* A numeric constant or the result of a numeric operation has the
NUMERIC attribute.
@@ -7825,10 +8195,9 @@ comparison is performed.
characters, and so is first and foremost of STRING type; input strings
that look numeric are additionally given the STRNUM attribute. Thus,
the six-character input string ` +3.14' receives the STRNUM attribute.
-In contrast, the eight-character literal `" +3.14"' appearing in
-program text is a string constant. The following examples print `1'
-when the comparison between the two different constants is true, `0'
-otherwise:
+In contrast, the eight characters `" +3.14"' appearing in program text
+comprise a string constant. The following examples print `1' when the
+comparison between the two different constants is true, `0' otherwise:
$ echo ' +3.14' | gawk '{ print $0 == " +3.14" }' True
-| 1
@@ -7950,7 +8319,7 @@ has the value one if `x' contains `foo', such as `"Oh, what a fool am
I!"'.
The righthand operand of the `~' and `!~' operators may be either a
-regexp constant (`/.../') or an ordinary expression. In the latter
+regexp constant (`/'...`/') or an ordinary expression. In the latter
case, the value of the expression as a string is used as a dynamic
regexp (*note Regexp Usage::; also *note Computed Regexps::).
@@ -7971,9 +8340,10 @@ File: gawk.info, Node: POSIX String Comparison, Prev: Comparison Operators, U
..........................................
The POSIX standard says that string comparison is performed based on
-the locale's collating order. This is usually very different from the
-results obtained when doing straight character-by-character
-comparison.(1)
+the locale's "collating order". This is the order in which characters
+sort, as defined by the locale (for more discussion, *note Ranges and
+Locales::). This order is usually very different from the results
+obtained when doing straight character-by-character comparison.(1)
Because this behavior differs considerably from existing practice,
`gawk' only implements it when in POSIX mode (*note Options::). Here
@@ -8125,7 +8495,7 @@ not. *Note Arrays::, for more information about arrays.
continued simply by putting a newline after either character. However,
putting a newline in front of either character does not work without
using backslash continuation (*note Statements/Lines::). If `--posix'
-is specified (*note Options::), then this extension is disabled.
+is specified (*note Options::), this extension is disabled.

File: gawk.info, Node: Function Calls, Next: Precedence, Prev: Truth Values and Conditions, Up: Expressions
@@ -8142,6 +8512,8 @@ available in every `awk' program. The `sqrt()' function is one of
these. *Note Built-in::, for a list of built-in functions and their
descriptions. In addition, you can define functions for use in your
program. *Note User-defined::, for instructions on how to do this.
+Finally, `gawk' lets you write functions in C or C++ that may be called
+from your program: see *note Dynamic Extensions::.
The way to use a function is with a "function call" expression,
which consists of the function name followed immediately by a list of
@@ -8173,19 +8545,21 @@ the number of which to take the square root:
If those arguments are not supplied, the functions use a reasonable
default value. *Note Built-in::, for full details. If arguments are
omitted in calls to user-defined functions, then those arguments are
-treated as local variables and initialized to the empty string (*note
-User-defined::).
+treated as local variables. Such local variables act like the empty
+string if referenced where a string value is required, and like zero if
+referenced where a numeric value is required (*note User-defined::).
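
A small sketch of this behavior, assuming a POSIX-compatible `awk'. The function `show' and its parameter names are hypothetical. The caller omits the second argument, so `missing' acts as the empty string in a string context and as zero in a numeric context:

```shell
out=$(awk '
function show(given, missing) {
    # The caller passes only one argument, so "missing" is a local variable.
    return "[" missing "] " (missing + 0)
}
BEGIN { print show("x") }')
printf '%s\n' "$out"
```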
As an advanced feature, `gawk' provides indirect function calls,
which is a way to choose the function to call at runtime, instead of
when you write the source code to your program. We defer discussion of
this feature until later; see *note Indirect Calls::.
- Like every other expression, the function call has a value, which is
-computed by the function based on the arguments you give it. In this
-example, the value of `sqrt(ARGUMENT)' is the square root of ARGUMENT.
-The following program reads numbers, one number per line, and prints the
-square root of each one:
+ Like every other expression, the function call has a value, often
+called the "return value", which is computed by the function based on
+the arguments you give it. In this example, the return value of
+`sqrt(ARGUMENT)' is the square root of ARGUMENT. The following program
+reads numbers, one number per line, and prints the square root of each
+one:
$ awk '{ print "The square root of", $1, "is", sqrt($1) }'
1
@@ -8258,7 +8632,7 @@ to avoid the problem the expression can be rewritten as `$($0++)--'.
This table presents `awk''s operators, in order of highest to lowest
precedence:
-`(...)'
+`('...`)'
Grouping.
`$'
@@ -8279,7 +8653,7 @@ precedence:
`+ -'
Addition, subtraction.
-`String Concatenation'
+String Concatenation
There is no special symbol for concatenation. The operands are
simply written side by side (*note Concatenation::).
@@ -8320,13 +8694,15 @@ precedence:
POSIX. For maximum portability, do not use them.

-File: gawk.info, Node: Locales, Prev: Precedence, Up: Expressions
+File: gawk.info, Node: Locales, Next: Expressions Summary, Prev: Precedence, Up: Expressions
6.6 Where You Are Makes A Difference
====================================
Modern systems support the notion of "locales": a way to tell the
-system about the local character set and language.
+system about the local character set and language. The ISO C standard
+defines a default `"C"' locale, which is an environment that is typical
+of what many C programmers are used to.
Once upon a time, the locale setting used to affect regexp matching
(*note Ranges and Locales::), but this is no longer true.
@@ -8338,6 +8714,13 @@ much better performance when reading records. Otherwise, `gawk' has to
make several function calls, _per input character_, to find the record
terminator.
+ Locales can affect how dates and times are formatted (*note Time
+Functions::). For example, a common way to abbreviate the date
+September 4, 2015 in the United States is "9/4/15." In many countries
+in Europe, however, it is abbreviated "4.9.15." Thus, the `%x'
+specification in a `"US"' locale might produce `9/4/15', while in a
+`"EUROPE"' locale, it might produce `4.9.15'.
+
According to POSIX, string comparison is also affected by locales
(similar to regular expressions). The details are presented in *note
POSIX String Comparison::.
@@ -8347,6 +8730,63 @@ used when `gawk' parses input data. This is discussed in detail in
*note Conversion::.

+File: gawk.info, Node: Expressions Summary, Prev: Locales, Up: Expressions
+
+6.7 Summary
+===========
+
+ * Expressions are the basic elements of computation in programs.
+ They are built from constants, variables, function calls and
+ combinations of the various kinds of values with operators.
+
+ * `awk' supplies three kinds of constants: numeric, string, and
+ regexp. `gawk' lets you specify numeric constants in octal and
+ hexadecimal (bases 8 and 16) in addition to decimal (base 10). In
+ certain contexts, a standalone regexp constant such as `/foo/' has
+ the same meaning as `$0 ~ /foo/'.
+
+ * Variables hold values between uses in computations. A number of
+ built-in variables provide information to your `awk' program, and
+ a number of others let you control how `awk' behaves.
+
+ * Numbers are automatically converted to strings, and strings to
+ numbers, as needed by `awk'. Numeric values are converted as if
+ they were formatted with `sprintf()' using the format in `CONVFMT'.
+ Locales can influence the conversions.
+
+ * `awk' provides the usual arithmetic operators (addition,
+ subtraction, multiplication, division, modulus), and unary plus
+ and minus. It also provides comparison operators, boolean
+ operators, and regexp matching operators. String concatenation is
+ accomplished by placing two expressions next to each other; there
+ is no explicit operator. The three-operand `?:' operator provides
+ an "if-else" test within expressions.
+
+ * Assignment operators provide convenient shorthands for common
+ arithmetic operations.
+
+ * In `awk', a value is considered to be true if it is non-zero _or_
+ non-null. Otherwise, the value is false.
+
+ * A value's type is set upon each assignment and may change over its
+ lifetime. The type determines how it behaves in comparisons
+ (string or numeric).
+
+ * Function calls return a value which may be used as part of a larger
+ expression. Expressions used to pass parameter values are fully
+ evaluated before the function is called. `awk' provides built-in
+ and user-defined functions; this is described later on in this
+ Info file.
+
+ * Operator precedence specifies the order in which operations are
+ performed, unless explicitly overridden by parentheses. `awk''s
+ operator precedence is compatible with that of C.
+
+ * Locales can affect the format of data as output by an `awk'
+ program, and occasionally the format for data read as input.
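
The truth-value rule in the summary above can be checked with a short sketch, assuming a POSIX-compatible `awk'. The helper function `truth' is ours. Note that the string constant `"0"' is non-null and therefore true, while the number zero and the null string are false:

```shell
out=$(awk '
function truth(v) { return v ? "true" : "false" }
BEGIN {
    # number zero, null string, string "0", number one, ordinary string
    print truth(0), truth(""), truth("0"), truth(1), truth("a")
}')
printf '%s\n' "$out"
```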
+
+
+
File: gawk.info, Node: Patterns and Actions, Next: Arrays, Prev: Expressions, Up: Top
7 Patterns, Actions, and Variables
@@ -8370,6 +8810,7 @@ top of. Now it's time to start building something useful.
* Statements:: Describes the various control statements in
detail.
* Built-in Variables:: Summarizes the built-in variables.
+* Pattern Action Summary:: Patterns and Actions summary.

File: gawk.info, Node: Pattern Overview, Next: Using Shell Variables, Up: Patterns and Actions
@@ -8398,10 +8839,10 @@ summary of the types of `awk' patterns:
A single expression. It matches when its value is nonzero (if a
number) or non-null (if a string). (*Note Expression Patterns::.)
-`PAT1, PAT2'
+`BEGPAT, ENDPAT'
A pair of patterns separated by a comma, specifying a range of
records. The range includes both the initial record that matches
- PAT1 and the final record that matches PAT2. (*Note Ranges::.)
+ BEGPAT and the final record that matches ENDPAT. (*Note Ranges::.)
`BEGIN'
`END'
@@ -8411,7 +8852,7 @@ summary of the types of `awk' patterns:
`BEGINFILE'
`ENDFILE'
Special patterns for you to supply startup or cleanup actions to be
- done on a per file basis. (*Note BEGINFILE/ENDFILE::.)
+ done on a per-file basis. (*Note BEGINFILE/ENDFILE::.)
`EMPTY'
The empty pattern matches every input record. (*Note Empty::.)
@@ -8531,7 +8972,7 @@ record. When a record matches BEGPAT, the range pattern is "turned on"
and the range pattern matches this record as well. As long as the
range pattern stays turned on, it automatically matches every input
record read. The range pattern also matches ENDPAT against every input
-record; when this succeeds, the range pattern is turned off again for
+record; when this succeeds, the range pattern is "turned off" again for
the following record. Then the range pattern goes back to checking
BEGPAT against each record.
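
The on/off behavior described above can be sketched with a tiny input stream, assuming a POSIX-compatible `awk' and a shell with `printf'. The record contents and the names `s', `sep', and `out' are made up for illustration. Records outside a range are skipped, and the pattern turns on again at the second `start':

```shell
out=$(printf '%s\n' a start b stop c start d stop e |
    awk '/start/,/stop/ { s = s sep $0; sep = "," }   # collect matched records
         END { print s }')
printf '%s\n' "$out"
```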
@@ -8663,10 +9104,10 @@ File: gawk.info, Node: I/O And BEGIN/END, Prev: Using BEGIN/END, Up: BEGIN/EN
7.1.4.2 Input/Output from `BEGIN' and `END' Rules
.................................................
-There are several (sometimes subtle) points to remember when doing I/O
-from a `BEGIN' or `END' rule. The first has to do with the value of
-`$0' in a `BEGIN' rule. Because `BEGIN' rules are executed before any
-input is read, there simply is no input record, and therefore no
+There are several (sometimes subtle) points to be aware of when doing
+I/O from a `BEGIN' or `END' rule. The first has to do with the value
+of `$0' in a `BEGIN' rule. Because `BEGIN' rules are executed before
+any input is read, there simply is no input record, and therefore no
fields, when executing `BEGIN' rules. References to `$0' and the fields
yield a null string or zero, depending upon the context. One way to
give `$0' a real value is to execute a `getline' command without a
@@ -8734,10 +9175,10 @@ tasks that would otherwise be difficult or impossible to perform:
entirely. Otherwise, `gawk' exits with the usual fatal error.
* If you have written extensions that modify the record handling (by
- inserting an "input parser"), you can invoke them at this point,
- before `gawk' has started processing the file. (This is a _very_
- advanced feature, currently used only by the `gawkextlib' project
- (http://gawkextlib.sourceforge.net).)
+ inserting an "input parser," *note Input Parsers::), you can invoke
+ them at this point, before `gawk' has started processing the file.
+ (This is a _very_ advanced feature, currently used only by the
+ `gawkextlib' project (http://gawkextlib.sourceforge.net).)
The `ENDFILE' rule is called when `gawk' has finished processing the
last record in an input file. For the last input file, it will be
@@ -8756,7 +9197,7 @@ either a `BEGINFILE' or and `ENDFILE' rule. The `nextfile' statement
but not inside an `ENDFILE' rule.
The `getline' statement (*note Getline::) is restricted inside both
-`BEGINFILE' and `ENDFILE'. Only the `getline VARIABLE < FILE' form is
+`BEGINFILE' and `ENDFILE': only redirected forms of `getline' are
allowed.
`BEGINFILE' and `ENDFILE' are `gawk' extensions. In most other
@@ -8789,15 +9230,15 @@ to get the value of the shell variable into the body of the `awk'
program.
The most common method is to use shell quoting to substitute the
-variable's value into the program inside the script. For example, in
-the following program:
+variable's value into the program inside the script. For example,
+consider the following program:
printf "Enter search pattern: "
read pattern
awk "/$pattern/ "'{ nmatches++ }
END { print nmatches, "found" }' /path/to/data
-the `awk' program consists of two pieces of quoted text that are
+The `awk' program consists of two pieces of quoted text that are
concatenated together to form the program. The first part is
double-quoted, which allows substitution of the `pattern' shell
variable inside the quotes. The second part is single-quoted.
@@ -8809,7 +9250,7 @@ quotes when reading the program.
A better method is to use `awk''s variable assignment feature (*note
Assignment Options::) to assign the shell variable's value to an `awk'
-variable's value. Then use dynamic regexps to match the pattern (*note
+variable. Then use dynamic regexps to match the pattern (*note
Computed Regexps::). The following shows how to redo the previous
example using this technique:
@@ -8840,19 +9281,19 @@ which (but not both) may be omitted. The purpose of the "action" is to
tell `awk' what to do once a match for the pattern is found. Thus, in
outline, an `awk' program generally looks like this:
- [PATTERN] { ACTION }
- PATTERN [{ ACTION }]
+ [PATTERN] `{ ACTION }'
+ PATTERN [`{ ACTION }']
...
- function NAME(ARGS) { ... }
+ `function NAME(ARGS) { ... }'
...
An action consists of one or more `awk' "statements", enclosed in
-curly braces (`{...}'). Each statement specifies one thing to do. The
-statements are separated by newlines or semicolons. The curly braces
-around an action must be used even if the action contains only one
-statement, or if it contains no statements at all. However, if you
-omit the action entirely, omit the curly braces as well. An omitted
-action is equivalent to `{ print $0 }':
+braces (`{...}'). Each statement specifies one thing to do. The
+statements are separated by newlines or semicolons. The braces around
+an action must be used even if the action contains only one statement,
+or if it contains no statements at all. However, if you omit the
+action entirely, omit the braces as well. An omitted action is
+equivalent to `{ print $0 }':
/foo/ { } match `foo', do nothing -- empty action
/foo/ match `foo', print the record -- omitted action
@@ -8871,9 +9312,9 @@ Control statements
well as a few special ones (*note Statements::).
Compound statements
- Consist of one or more statements enclosed in curly braces. A
- compound statement is used in order to put several statements
- together in the body of an `if', `while', `do', or `for' statement.
+ Enclose one or more statements in braces. A compound statement is
+ used in order to put several statements together in the body of an
+ `if', `while', `do', or `for' statement.
Input statements
Use the `getline' command (*note Getline::). Also supplied in
@@ -8902,7 +9343,7 @@ statements contain other statements. For example, the `if' statement
contains another statement that may or may not be executed. The
contained statement is called the "body". To include more than one
statement in the body, group them into a single "compound statement"
-with curly braces, separating them with newlines or semicolons.
+with braces, separating them with newlines or semicolons.
* Menu:
@@ -8931,7 +9372,7 @@ File: gawk.info, Node: If Statement, Next: While Statement, Up: Statements
The `if'-`else' statement is `awk''s decision-making statement. It
looks like this:
- if (CONDITION) THEN-BODY [else ELSE-BODY]
+ `if (CONDITION) THEN-BODY' [`else ELSE-BODY']
The CONDITION is an expression that controls what the rest of the
statement does. If the CONDITION is true, THEN-BODY is executed;
@@ -8950,8 +9391,8 @@ the value of `x' is evenly divisible by two), then the first `print'
statement is executed; otherwise, the second `print' statement is
executed. If the `else' keyword appears on the same line as THEN-BODY
and THEN-BODY is not a compound statement (i.e., not surrounded by
-curly braces), then a semicolon must separate THEN-BODY from the `else'.
-To illustrate this, the previous example can be rewritten as:
+braces), then a semicolon must separate THEN-BODY from the `else'. To
+illustrate this, the previous example can be rewritten as:
if (x % 2 == 0) print "x is even"; else
print "x is odd"
@@ -9136,7 +9577,8 @@ File: gawk.info, Node: Switch Statement, Next: Break Statement, Prev: For Sta
7.4.5 The `switch' Statement
----------------------------
-This minor node describes a `gawk'-specific feature.
+This minor node describes a `gawk'-specific feature. If `gawk' is in
+compatibility mode (*note Options::), it is not available.
The `switch' statement allows the evaluation of an expression and
the execution of statements based on a `case' match. Case statements
@@ -9187,9 +9629,6 @@ is executed and then falls through into the `default' section,
executing its `print' statement. In turn, the -1 case will also be
executed since the `default' does not halt execution.
- This `switch' statement is a `gawk' extension. If `gawk' is in
-compatibility mode (*note Options::), it is not available.
-

File: gawk.info, Node: Break Statement, Next: Continue Statement, Prev: Switch Statement, Up: Statements
@@ -9202,15 +9641,15 @@ divisor of any integer, and also identifies prime numbers:
# find smallest divisor of num
{
- num = $1
- for (div = 2; div * div <= num; div++) {
- if (num % div == 0)
- break
- }
- if (num % div == 0)
- printf "Smallest divisor of %d is %d\n", num, div
- else
- printf "%d is prime\n", num
+ num = $1
+ for (div = 2; div * div <= num; div++) {
+ if (num % div == 0)
+ break
+ }
+ if (num % div == 0)
+ printf "Smallest divisor of %d is %d\n", num, div
+ else
+ printf "%d is prime\n", num
}
When the remainder is zero in the first `if' statement, `awk'
@@ -9225,17 +9664,17 @@ Statement::.)
# find smallest divisor of num
{
- num = $1
- for (div = 2; ; div++) {
- if (num % div == 0) {
- printf "Smallest divisor of %d is %d\n", num, div
- break
- }
- if (div * div > num) {
- printf "%d is prime\n", num
- break
+ num = $1
+ for (div = 2; ; div++) {
+ if (num % div == 0) {
+ printf "Smallest divisor of %d is %d\n", num, div
+ break
+ }
+ if (div * div > num) {
+ printf "%d is prime\n", num
+ break
+ }
}
- }
}
The `break' statement is also used to break out of the `switch'
@@ -9346,7 +9785,7 @@ rules. *Note BEGINFILE/ENDFILE::.
According to the POSIX standard, the behavior is undefined if the
`next' statement is used in a `BEGIN' or `END' rule. `gawk' treats it
-as a syntax error. Although POSIX permits it, some other `awk'
+as a syntax error. Although POSIX permits it, most other `awk'
implementations don't allow the `next' statement inside function bodies
(*note User-defined::). Just as with any other `next' statement, a
`next' statement inside a function body reads the next record and
@@ -9416,7 +9855,7 @@ The `exit' statement causes `awk' to immediately stop executing the
current rule and to stop processing input; any remaining input is
ignored. The `exit' statement is written as follows:
- exit [RETURN CODE]
+ `exit' [RETURN CODE]
When an `exit' statement is executed from a `BEGIN' rule, the
program stops processing everything immediately. No input records are
@@ -9450,12 +9889,12 @@ with a nonzero status. An `awk' program can do this using an `exit'
statement with a nonzero argument, as shown in the following example:
BEGIN {
- if (("date" | getline date_now) <= 0) {
- print "Can't get system date" > "/dev/stderr"
- exit 1
- }
- print "current date is", date_now
- close("date")
+ if (("date" | getline date_now) <= 0) {
+ print "Can't get system date" > "/dev/stderr"
+ exit 1
+ }
+ print "current date is", date_now
+ close("date")
}
NOTE: For full portability, exit values should be between zero and
@@ -9464,7 +9903,7 @@ statement with a nonzero argument, as shown in the following example:
systems.

-File: gawk.info, Node: Built-in Variables, Prev: Statements, Up: Patterns and Actions
+File: gawk.info, Node: Built-in Variables, Next: Pattern Action Summary, Prev: Statements, Up: Patterns and Actions
7.5 Built-in Variables
======================
@@ -9477,9 +9916,9 @@ of these automatically, so that they enable you to tell `awk' how to do
certain things. Others are set automatically by `awk', so that they
carry information from the internal workings of `awk' to your program.
- This minor node documents all the built-in variables of `gawk', most
-of which are also documented in the chapters describing their areas of
-activity.
+ This minor node documents all of `gawk''s built-in variables, most
+of which are also documented in the major nodes describing their areas
+of activity.
* Menu:
@@ -9496,8 +9935,13 @@ File: gawk.info, Node: User-modified, Next: Auto-set, Up: Built-in Variables
-------------------------------------------
The following is an alphabetical list of variables that you can change
-to control how `awk' does certain things. The variables that are
-specific to `gawk' are marked with a pound sign (`#').
+to control how `awk' does certain things.
+
+ The variables that are specific to `gawk' are marked with a pound
+sign (`#'). These variables are `gawk' extensions. In other `awk'
+implementations or if `gawk' is in compatibility mode (*note
+Options::), they are not special. (Any exceptions are noted in the
+description of each variable.)
`BINMODE #'
On non-POSIX systems, this variable specifies use of binary mode
@@ -9510,14 +9954,11 @@ specific to `gawk' are marked with a pound sign (`#').
string value of `"rw"' or `"wr"' indicates that all files should
use binary I/O. Any other string value is treated the same as
`"rw"', but causes `gawk' to generate a warning message.
- `BINMODE' is described in more detail in *note PC Using::.
-
- This variable is a `gawk' extension. In other `awk'
- implementations (except `mawk', *note Other Versions::), or if
- `gawk' is in compatibility mode (*note Options::), it is not
- special.
+ `BINMODE' is described in more detail in *note PC Using::. `mawk'
+ (*note Other Versions::) also supports this variable, but only
+ using numeric values.
-`CONVFMT'
+``CONVFMT''
This string controls conversion of numbers to strings (*note
Conversion::). It works by being passed, in effect, as the first
argument to the `sprintf()' function (*note String Functions::).
@@ -9525,29 +9966,21 @@ specific to `gawk' are marked with a pound sign (`#').
POSIX standard.
`FIELDWIDTHS #'
- This is a space-separated list of columns that tells `gawk' how to
- split input with fixed columnar boundaries. Assigning a value to
+ A space-separated list of columns that tells `gawk' how to split
+ input with fixed columnar boundaries. Assigning a value to
`FIELDWIDTHS' overrides the use of `FS' and `FPAT' for field
splitting. *Note Constant Size::, for more information.
- If `gawk' is in compatibility mode (*note Options::), then
- `FIELDWIDTHS' has no special meaning, and field-splitting
- operations occur based exclusively on the value of `FS'.
-
`FPAT #'
- This is a regular expression (as a string) that tells `gawk' to
- create the fields based on text that matches the regular
- expression. Assigning a value to `FPAT' overrides the use of `FS'
- and `FIELDWIDTHS' for field splitting. *Note Splitting By
- Content::, for more information.
-
- If `gawk' is in compatibility mode (*note Options::), then `FPAT'
- has no special meaning, and field-splitting operations occur based
- exclusively on the value of `FS'.
+ A regular expression (as a string) that tells `gawk' to create the
+ fields based on text that matches the regular expression.
+ Assigning a value to `FPAT' overrides the use of `FS' and
+ `FIELDWIDTHS' for field splitting. *Note Splitting By Content::,
+ for more information.
`FS'
- This is the input field separator (*note Field Separators::). The
- value is a single-character string or a multicharacter regular
+ The input field separator (*note Field Separators::). The value
+ is a single-character string or a multicharacter regular
expression that matches the separations between fields in an input
record. If the value is the null string (`""'), then each
character in the record becomes a separate field. (This behavior
@@ -9583,13 +10016,9 @@ specific to `gawk' are marked with a pound sign (`#').
splitting when using a single-character field separator. *Note
Case-sensitivity::.
- If `gawk' is in compatibility mode (*note Options::), then
- `IGNORECASE' has no special meaning. Thus, string and regexp
- operations are always case-sensitive.
-
`LINT #'
When this variable is true (nonzero or non-null), `gawk' behaves
- as if the `--lint' command-line option is in effect. (*note
+ as if the `--lint' command-line option is in effect (*note
Options::). With a value of `"fatal"', lint warnings become fatal
errors. With a value of `"invalid"', only warnings about things
that are actually invalid are issued. (This is not fully
@@ -9605,13 +10034,13 @@ specific to `gawk' are marked with a pound sign (`#').
execution is independent of the flavor of `awk' being executed.
`OFMT'
- This string controls conversion of numbers to strings (*note
- Conversion::) for printing with the `print' statement. It works
- by being passed as the first argument to the `sprintf()' function
- (*note String Functions::). Its default value is `"%.6g"'.
- Earlier versions of `awk' also used `OFMT' to specify the format
- for converting numbers to strings in general expressions; this is
- now done by `CONVFMT'.
+ Controls conversion of numbers to strings (*note Conversion::) for
+ printing with the `print' statement. It works by being passed as
+ the first argument to the `sprintf()' function (*note String
+ Functions::). Its default value is `"%.6g"'. Earlier versions of
+ `awk' also used `OFMT' to specify the format for converting
+ numbers to strings in general expressions; this is now done by
+ `CONVFMT'.
`OFS'
This is the output field separator (*note Output Separators::).
@@ -9619,49 +10048,45 @@ specific to `gawk' are marked with a pound sign (`#').
Its default value is `" "', a string consisting of a single space.
`ORS'
- This is the output record separator. It is output at the end of
- every `print' statement. Its default value is `"\n"', the newline
+ The output record separator. It is output at the end of every
+ `print' statement. Its default value is `"\n"', the newline
character. (*Note Output Separators::.)
`PREC #'
The working precision of arbitrary precision floating-point
- numbers, 53 bits by default (*note Setting Precision::).
+ numbers, 53 bits by default (*note Setting precision::).
`ROUNDMODE #'
The rounding mode to use for arbitrary precision arithmetic on
- numbers, by default `"N"' (`roundTiesToEven' in the IEEE-754
- standard) (*note Setting Rounding Mode::).
+ numbers, by default `"N"' (`roundTiesToEven' in the IEEE 754
+ standard; *note Setting the rounding mode::).
-`RS'
- This is `awk''s input record separator. Its default value is a
- string containing a single newline character, which means that an
- input record consists of a single line of text. It can also be
- the null string, in which case records are separated by runs of
- blank lines. If it is a regexp, records are separated by matches
- of the regexp in the input text. (*Note Records::.)
+``RS''
+ The input record separator. Its default value is a string
+ containing a single newline character, which means that an input
+ record consists of a single line of text. It can also be the null
+ string, in which case records are separated by runs of blank lines.
+ If it is a regexp, records are separated by matches of the regexp
+ in the input text. (*Note Records::.)
The ability for `RS' to be a regular expression is a `gawk'
extension. In most other `awk' implementations, or if `gawk' is
in compatibility mode (*note Options::), just the first character
of `RS''s value is used.
-`SUBSEP'
- This is the subscript separator. It has the default value of
- `"\034"' and is used to separate the parts of the indices of a
- multidimensional array. Thus, the expression `foo["A", "B"]'
- really accesses `foo["A\034B"]' (*note Multidimensional::).
+``SUBSEP''
+ The subscript separator. It has the default value of `"\034"' and
+ is used to separate the parts of the indices of a multidimensional
+ array. Thus, the expression `foo["A", "B"]' really accesses
+ `foo["A\034B"]' (*note Multidimensional::).
`TEXTDOMAIN #'
- This variable is used for internationalization of programs at the
- `awk' level. It sets the default text domain for specially marked
- string constants in the source text, as well as for the
- `dcgettext()', `dcngettext()' and `bindtextdomain()' functions
- (*note Internationalization::). The default value of `TEXTDOMAIN'
- is `"messages"'.
-
- This variable is a `gawk' extension. In other `awk'
- implementations, or if `gawk' is in compatibility mode (*note
- Options::), it is not special.
+ Used for internationalization of programs at the `awk' level. It
+ sets the default text domain for specially marked string constants
+ in the source text, as well as for the `dcgettext()',
+ `dcngettext()' and `bindtextdomain()' functions (*note
+ Internationalization::). The default value of `TEXTDOMAIN' is
+ `"messages"'.
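
The `SUBSEP' entry above can be sketched with a short program that
uses only standard `awk' features. The array name `foo' follows the
text; the names `parts', `k', `n', and `out' are assumptions made for
this illustration:

```shell
# Show that foo["A", "B"] is stored under the single index "A" SUBSEP "B".
out=$(awk 'BEGIN {
    foo["A", "B"] = "value"
    # The comma-separated subscript is one string joined by SUBSEP.
    if (("A" SUBSEP "B") in foo)
        print "found"
    # Split each combined index back into its parts.
    for (k in foo) {
        n = split(k, parts, SUBSEP)
        print n, parts[1], parts[2]
    }
}')
echo "$out"
```

Splitting works here only because neither subscript contains the
`SUBSEP' character itself.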
---------- Footnotes ----------
@@ -9675,10 +10100,14 @@ File: gawk.info, Node: Auto-set, Next: ARGC and ARGV, Prev: User-modified, U
The following is an alphabetical list of variables that `awk' sets
automatically on certain occasions in order to provide information to
-your program. The variables that are specific to `gawk' are marked
-with a pound sign (`#').
+your program.
+
+ The variables that are specific to `gawk' are marked with a pound
+sign (`#'). These variables are `gawk' extensions. In other `awk'
+implementations or if `gawk' is in compatibility mode (*note
+Options::), they are not special.
-`ARGC, ARGV'
+`ARGC', `ARGV'
The command-line arguments available to `awk' programs are stored
in an array called `ARGV'. `ARGC' is the number of command-line
arguments present. *Note Other Arguments::. Unlike most `awk'
@@ -9723,10 +10152,6 @@ with a pound sign (`#').
program, `gawk' automatically sets it to a new value when the next
file is opened.
- This variable is a `gawk' extension. In other `awk'
- implementations, or if `gawk' is in compatibility mode (*note
- Options::), it is not special.
-
`ENVIRON'
An associative array containing the values of the environment.
The array indices are the environment variable names; the elements
@@ -9746,12 +10171,12 @@ with a pound sign (`#').
Some operating systems may not have environment variables. On
such systems, the `ENVIRON' array is empty (except for
- `ENVIRON["AWKPATH"]', *note AWKPATH Variable:: and
- `ENVIRON["AWKLIBPATH"]', *note AWKLIBPATH Variable::).
+ `ENVIRON["AWKPATH"]' and `ENVIRON["AWKLIBPATH"]'; *note AWKPATH
+ Variable::, and *note AWKLIBPATH Variable::).
`ERRNO #'
- If a system error occurs during a redirection for `getline',
- during a read for `getline', or during a `close()' operation, then
+ If a system error occurs during a redirection for `getline', during
+ a read for `getline', or during a `close()' operation, then
`ERRNO' contains a string describing the error.
In addition, `gawk' clears `ERRNO' before opening each
@@ -9765,19 +10190,14 @@ with a pound sign (`#').
`getline' returning -1. You are, of course, free to clear it
yourself before doing an I/O operation.
- This variable is a `gawk' extension. In other `awk'
- implementations, or if `gawk' is in compatibility mode (*note
- Options::), it is not special.
-
`FILENAME'
- The name of the file that `awk' is currently reading. When no
- data files are listed on the command line, `awk' reads from the
- standard input and `FILENAME' is set to `"-"'. `FILENAME' is
- changed each time a new file is read (*note Reading Files::).
- Inside a `BEGIN' rule, the value of `FILENAME' is `""', since
- there are no input files being processed yet.(1) (d.c.) Note,
- though, that using `getline' (*note Getline::) inside a `BEGIN'
- rule can give `FILENAME' a value.
+ The name of the current input file. When no data files are listed
+ on the command line, `awk' reads from the standard input and
+ `FILENAME' is set to `"-"'. `FILENAME' changes each time a new
+ file is read (*note Reading Files::). Inside a `BEGIN' rule, the
+ value of `FILENAME' is `""', since there are no input files being
+ processed yet.(1) (d.c.) Note, though, that using `getline' (*note
+ Getline::) inside a `BEGIN' rule can give `FILENAME' a value.
`FNR'
The current record number in the current file. `FNR' is
@@ -9800,9 +10220,8 @@ with a pound sign (`#').
all the user-defined or extension functions in the program.
NOTE: Attempting to use the `delete' statement with the
- `FUNCTAB' array will cause a fatal error. Any attempt to
- assign to an element of the `FUNCTAB' array will also cause a
- fatal error.
+ `FUNCTAB' array causes a fatal error. Any attempt to assign
+ to an element of `FUNCTAB' also causes a fatal error.
`NR'
The number of input records `awk' has processed since the
@@ -9866,8 +10285,8 @@ with a pound sign (`#').
`PROCINFO["sorted_in"]'
If this element exists in `PROCINFO', its value controls the
- order in which array indices will be processed by `for (index
- in array) ...' loops. Since this is an advanced feature, we
+ order in which array indices will be processed by `for (INDEX
+ in ARRAY)' loops. Since this is an advanced feature, we
defer the full description until later; see *note Scanning an
Array::.
@@ -9884,8 +10303,8 @@ with a pound sign (`#').
The following additional elements in the array are available to
provide information about the MPFR and GMP libraries if your
- version of `gawk' supports arbitrary precision numbers (*note Gawk
- and MPFR::):
+ version of `gawk' supports arbitrary precision numbers (*note
+ Arbitrary Precision Arithmetic::):
`PROCINFO["mpfr_version"]'
The version of the GNU MPFR library.
@@ -9925,10 +10344,6 @@ with a pound sign (`#').
open input file, pipe, or coprocess. *Note Read Timeout::,
for more information.
- This array is a `gawk' extension. In other `awk' implementations,
- or if `gawk' is in compatibility mode (*note Options::), it is not
- special.
-
`RLENGTH'
The length of the substring matched by the `match()' function
(*note String Functions::). `RLENGTH' is set by invoking the
@@ -9943,12 +10358,8 @@ with a pound sign (`#').
match was found.
`RT #'
- This is set each time a record is read. It contains the input text
- that matched the text denoted by `RS', the record separator.
-
- This variable is a `gawk' extension. In other `awk'
- implementations, or if `gawk' is in compatibility mode (*note
- Options::), it is not special.
+ The input text that matched the text denoted by `RS', the record
+ separator. It is set every time a record is read.
`SYMTAB #'
An array whose indices are the names of all currently defined
@@ -9984,7 +10395,7 @@ with a pound sign (`#').
return SYMTAB[variable] *= amount
}
- NOTE: In order to avoid severe time-travel paradoxes(2),
+ NOTE: In order to avoid severe time-travel paradoxes,(2)
neither `FUNCTAB' nor `SYMTAB' is available as an element
within the `SYMTAB' array.
@@ -10119,6 +10530,55 @@ are passed on to the `awk' program. (*Note Getopt Function::, for an
`awk' library function that parses command-line options.)

+File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: Patterns and Actions
+
+7.6 Summary
+===========
+
+ * Pattern-action pairs make up the basic elements of an `awk'
+ program. Patterns are either normal expressions, range
+ expressions, regexp constants, one of the special keywords
+ `BEGIN', `END', `BEGINFILE', `ENDFILE', or empty. The action
+ executes if the current record matches the pattern. Empty
+ (missing) patterns match all records.
+
+ * I/O from `BEGIN' and `END' rules has certain constraints. This
+ is also true, only more so, for `BEGINFILE' and `ENDFILE' rules.
+ The latter two give you "hooks" into `gawk''s file processing,
+ allowing you to recover from a file that otherwise would cause a
+ fatal error (such as a file that cannot be opened).
+
+ * Shell variables can be used in `awk' programs by careful use of
+ shell quoting. It is easier to pass a shell variable into `awk'
+ by using the `-v' option and an `awk' variable.
+
+ * Actions consist of statements enclosed in curly braces. Statements
+ are built up from expressions, control statements, compound
+ statements, input and output statements, and deletion statements.
+
+ * The control statements in `awk' are `if'-`else', `while', `for',
+ and `do'-`while'. `gawk' adds the `switch' statement. There are
+ two flavors of `for' statement: one for performing general
+ looping, and the other for iterating through an array.
+
+ * `break' and `continue' let you exit early or start the next
+ iteration of a loop (or get out of a `switch').
+
+ * `next' and `nextfile' let you read the next record and start over
+ at the top of your program, or skip to the next input file and
+ start over, respectively.
+
+ * The `exit' statement terminates your program. When executed from
+ an action (or function body) it transfers control to the `END'
+ rules. From an `END' rule body, it exits immediately.
+ You may pass an optional numeric value to be used as `awk''s exit
+ status.
+
+ * Some built-in variables provide control over `awk', mainly for I/O.
+ Other variables convey information from `awk' to your program.
+
+
+
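
The summary point about passing shell variables can be sketched as
follows. The shell variable `greeting' and the `awk' variable `msg'
are hypothetical names chosen for illustration:

```shell
# Pass a shell variable into awk with -v instead of splicing it into
# the program text through shell quoting.
greeting="hello world"
out=$(awk -v msg="$greeting" 'BEGIN { print "msg is: " msg }')
echo "$out"
```

Because the program text stays inside single quotes, the shell never
rewrites it; only the value after `-v' comes from the shell. Note that
`awk' processes escape sequences in a `-v' value, so values containing
backslashes may need extra care.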
File: gawk.info, Node: Arrays, Next: Functions, Prev: Patterns and Actions, Up: Top
8 Arrays in `awk'
@@ -10134,7 +10594,7 @@ remove array elements. It also describes how `awk' simulates
multidimensional arrays, as well as some of the less obvious points
about array usage. The major node moves on to discuss `gawk''s facility
for sorting arrays, and ends with a brief description of `gawk''s
-ability to support true multidimensional arrays.
+ability to support true arrays of arrays.
`awk' maintains a single set of names that may be used for naming
variables, arrays, and functions (*note User-defined::). Thus, you
@@ -10152,6 +10612,7 @@ cannot have a variable and an array with the same name in the same
* Multidimensional:: Emulating multidimensional arrays in
`awk'.
* Arrays of Arrays:: True multidimensional arrays.
+* Arrays Summary:: Summary of arrays.

File: gawk.info, Node: Array Basics, Next: Delete, Up: Arrays
@@ -10212,12 +10673,13 @@ declared.)
A contiguous array of four elements might look like the following
example, conceptually, if the element values are 8, `"foo"', `""', and
-30:
+30, as shown in *note figure-array-elements:::
- +---------+---------+--------+---------+
- | 8 | "foo" | "" | 30 | Value
- +---------+---------+--------+---------+
- 0 1 2 3 Index
++---------+---------+--------+---------+
+| 8 | "foo" | "" | 30 | Value
++---------+---------+--------+---------+
+ 0 1 2 3 Index
+Figure 8.1: A Contiguous Array
Only the values are stored; the indices are implicit from the order of
the values. Here, 8 is the value at index zero, because 8 appears in the
@@ -10260,9 +10722,9 @@ from English to French:
Here we decided to translate the number one in both spelled-out and
numeric form--thus illustrating that a single array can have both
-numbers and strings as indices. In fact, array subscripts are always
+numbers and strings as indices. (In fact, array subscripts are always
strings; this is discussed in more detail in *note Numeric Array
-Subscripts::. Here, the number `1' isn't double-quoted, since `awk'
+Subscripts::.) Here, the number `1' isn't double-quoted, since `awk'
automatically converts it to a string.
The value of `IGNORECASE' has no effect upon array subscripting.
@@ -10314,11 +10776,11 @@ been assigned any value as well as elements that have been deleted
To determine whether an element exists in an array at a certain
index, use the following expression:
- IND in ARRAY
+ INDX in ARRAY
-This expression tests whether the particular index IND exists, without
+This expression tests whether the particular index INDX exists, without
the side effect of creating that element if it is not present. The
-expression has the value one (true) if `ARRAY[IND]' exists and zero
+expression has the value one (true) if `ARRAY[INDX]' exists and zero
(false) if it does not exist. For example, this statement tests
whether the array `frequencies' contains the index `2':
@@ -10453,19 +10915,54 @@ built-in function `length()'.
The order in which elements of the array are accessed by this
statement is determined by the internal arrangement of the array
-elements within `awk' and normally cannot be controlled or changed.
-This can lead to problems if new elements are added to ARRAY by
-statements in the loop body; it is not predictable whether the `for'
+elements within `awk' and in standard `awk' cannot be controlled or
+changed. This can lead to problems if new elements are added to ARRAY
+by statements in the loop body; it is not predictable whether the `for'
loop will reach them. Similarly, changing VAR inside the loop may
produce strange results. It is best to avoid such things.
+ As a point of information, `gawk' sets up the list of elements to be
+iterated over before the loop starts, and does not change it. But not
+all `awk' versions do so. Consider this program, named `loopcheck.awk':
+
+ BEGIN {
+ a["here"] = "here"
+ a["is"] = "is"
+ a["a"] = "a"
+ a["loop"] = "loop"
+ for (i in a) {
+ j++
+ a[j] = j
+ print i
+ }
+ }
+
+ Here is what happens when run with `gawk':
+
+ $ gawk -f loopcheck.awk
+ -| here
+ -| loop
+ -| a
+ -| is
+
+ Contrast this to Brian Kernighan's `awk':
+
+ $ nawk -f loopcheck.awk
+ -| loop
+ -| here
+ -| is
+ -| a
+ -| 1
+

File: gawk.info, Node: Controlling Scanning, Prev: Scanning an Array, Up: Array Basics
-8.1.6 Using Predefined Array Scanning Orders
---------------------------------------------
+8.1.6 Using Predefined Array Scanning Orders With `gawk'
+--------------------------------------------------------
-By default, when a `for' loop traverses an array, the order is
+This node describes a feature that is specific to `gawk'.
+
+ By default, when a `for' loop traverses an array, the order is
undefined, meaning that the `awk' implementation determines the order
in which the array is traversed. This order is usually based on the
internal implementation of arrays and will vary from one version of
@@ -10751,7 +11248,7 @@ might look like this:
-| line 3
-| line 2
- Unfortunately, the very first line of input data did not come out in
+ Unfortunately, the very first line of input data did not appear in
the output!
Upon first glance, we would think that this program should have
@@ -10906,7 +11403,7 @@ The result is to set `separate[1]' to `"1"' and `separate[2]' to
recovered.

-File: gawk.info, Node: Arrays of Arrays, Prev: Multidimensional, Up: Arrays
+File: gawk.info, Node: Arrays of Arrays, Next: Arrays Summary, Prev: Multidimensional, Up: Arrays
8.6 Arrays of Arrays
====================
@@ -11028,6 +11525,54 @@ by creating an arbitrary index:
-| a

+File: gawk.info, Node: Arrays Summary, Prev: Arrays of Arrays, Up: Arrays
+
+8.7 Summary
+===========
+
+ * Standard `awk' provides one-dimensional associative arrays (arrays
+ indexed by string values). All arrays are associative; numeric
+ indices are converted automatically to strings.
+
+ * Array elements are referenced as `ARRAY[INDX]'. Referencing an
+ element creates it if it did not exist previously.
+
+ * The proper way to see if an array has an element with a given index
+ is to use the `in' operator: `INDX in ARRAY'.
+
+ * Use `for (INDX in ARRAY) ...' to scan through all the individual
+ elements of an array. In the body of the loop, INDX takes on the
+ value of each element's index in turn.
+
+ * The order in which a `for (INDX in ARRAY)' loop traverses an array
+ is undefined in POSIX `awk' and varies among implementations.
+ `gawk' lets you control the order by assigning special predefined
+ values to `PROCINFO["sorted_in"]'.
+
+ * Use `delete ARRAY[INDX]' to delete an individual element. You may
+ also use `delete ARRAY' to delete all of the elements in the
+ array. This latter feature has been a common extension for many
+ years and is now standard, but may not be supported by all
+ commercial versions of `awk'.
+
+ * Standard `awk' simulates multidimensional arrays by separating
+ subscript values with a comma. The values are concatenated into a
+ single string, separated by the value of `SUBSEP'. The fact that
+ such a subscript was created in this way is not retained; thus
+ changing `SUBSEP' may have unexpected consequences. You can use
+ `(SUB1, SUB2, ...) in ARRAY' to see if such a multidimensional
+ subscript exists in ARRAY.
+
+ * `gawk' provides true arrays of arrays. You use a separate set of
+ square brackets for each dimension in such an array:
+ `data[row][col]', for example. Array elements may thus be either
+ scalar values (number or string) or another array.
+
+ * Use the `isarray()' built-in function to determine if an array
+ element is itself a subarray.
+
+
+
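
Several of the summary points above can be checked with a short
program that uses only standard `awk' features. The array name `a' and
the indices `"x"' and `"y"' are arbitrary choices for this sketch:

```shell
# Contrast the "in" operator with a plain element reference.
out=$(awk 'BEGIN {
    a["x"] = 1
    # Testing with "in" does not create the element.
    if (!("y" in a)) print "y absent"
    # Merely referencing a["y"] creates it, with a null value.
    if (a["y"] == "") print "y referenced"
    if ("y" in a) print "y now present"
    # delete removes the individual element again.
    delete a["y"]
    if (!("y" in a)) print "y deleted"
}')
echo "$out"
```

This is why `INDX in ARRAY' is the safer existence test: comparing
`ARRAY[INDX]' against the null string has the side effect of creating
the element.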
File: gawk.info, Node: Functions, Next: Library Functions, Prev: Arrays, Up: Top
9 Functions
@@ -11047,6 +11592,7 @@ major node describes these "user-defined" functions.
* Built-in:: Summarizes the built-in functions.
* User-defined:: Describes User-defined functions in detail.
* Indirect Calls:: Choosing the function to call at runtime.
+* Functions Summary:: Summary of functions.

File: gawk.info, Node: Built-in, Next: User-defined, Up: Functions
@@ -11134,6 +11680,20 @@ brackets ([ ]):
`cos(X)'
Return the cosine of X, with X in radians.
+`div(NUMERATOR, DENOMINATOR, RESULT)'
+ Perform integer division, similar to the standard C function of the
+ same name. First, truncate `numerator' and `denominator' to
+ integers. Clear the `result' array, and then set
+ `result["quotient"]' to the result of `numerator / denominator',
+ truncated to an integer, and set `result["remainder"]' to the
+ result of `numerator % denominator', truncated to an integer.
+ This function is primarily intended for use with arbitrary length
+ integers; it avoids creating MPFR arbitrary precision
+ floating-point values (*note Arbitrary Precision Integers::).
+
+ This function is a `gawk' extension. It is not available in
+ compatibility mode (*note Options::).
+
`exp(X)'
Return the exponential of X (`e ^ X') or report an error if X is
out of range. The range of values X can have depends on your
@@ -11198,7 +11758,7 @@ brackets ([ ]):
Return the positive square root of X. `gawk' prints a warning
message if X is negative. Thus, `sqrt(4)' is 2.
-`srand([X])'
+`srand('[X]`)'
Set the starting point, or seed, for generating random numbers to
the value X.
@@ -11252,12 +11812,22 @@ returns the number of characters in a string, and not the number of
bytes used to represent those characters. Similarly, `index()' works
with character indices, and not byte indices.
+ CAUTION: A number of functions deal with indices into strings.
+ For these functions, the first character of a string is at
+ position (index) one. This is different from C and the languages
+ descended from it, where the first character is at position zero.
+ You need to remember this when doing index calculations,
+ particularly if you are used to C.
+
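
A small sketch of the one-based indexing described in the caution
above, using the `index()' and `substr()' functions documented in this
node. The sample strings are the ones used elsewhere in this node; the
shell variable `out' is an assumption made for illustration:

```shell
# String positions in awk start at one, not zero.
out=$(awk 'BEGIN {
    # "an" starts at the third character of "peanut".
    print index("peanut", "an")
    # Three characters starting at character number five.
    print substr("washington", 5, 3)
    # Combine the two: everything from the match onward.
    print substr("peanut", index("peanut", "an"))
}')
echo "$out"
```

A C programmer would expect the first call to report position two; in
`awk' it reports three.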
In the following list, optional parameters are enclosed in square
brackets ([ ]). Several functions perform string substitution; the
full discussion is provided in the description of the `sub()' function,
which comes towards the end since the list is presented in alphabetic
-order. Those functions that are specific to `gawk' are marked with a
-pound sign (`#'):
+order.
+
+ Those functions that are specific to `gawk' are marked with a pound
+sign (`#'). They are not available in compatibility mode (*note
+Options::):
* Menu:
@@ -11265,8 +11835,8 @@ pound sign (`#'):
`&' with `sub()', `gsub()', and
`gensub()'.
-`asort(SOURCE [, DEST [, HOW ] ]) #'
-`asorti(SOURCE [, DEST [, HOW ] ]) #'
+`asort('SOURCE [`,' DEST [`,' HOW ] ]`) #'
+`asorti('SOURCE [`,' DEST [`,' HOW ] ]`) #'
These two functions are similar in behavior, so they are described
together.
@@ -11313,10 +11883,7 @@ pound sign (`#'):
a[2] = "last"
a[3] = "middle"
- `asort()' and `asorti()' are `gawk' extensions; they are not
- available in compatibility mode (*note Options::).
-
-`gensub(REGEXP, REPLACEMENT, HOW [, TARGET]) #'
+`gensub(REGEXP, REPLACEMENT, HOW' [`, TARGET']`) #'
Search the target string TARGET for matches of the regular
expression REGEXP. If HOW is a string beginning with `g' or `G'
(short for "global"), then replace all matches of REGEXP with
@@ -11366,10 +11933,7 @@ pound sign (`#'):
If REGEXP does not match TARGET, `gensub()''s return value is the
original unchanged value of TARGET.
- `gensub()' is a `gawk' extension; it is not available in
- compatibility mode (*note Options::).
-
-`gsub(REGEXP, REPLACEMENT [, TARGET])'
+`gsub(REGEXP, REPLACEMENT' [`, TARGET']`)'
Search TARGET for _all_ of the longest, leftmost, _nonoverlapping_
matching substrings it can find and replace them with REPLACEMENT.
The `g' in `gsub()' stands for "global," which means replace
@@ -11393,12 +11957,11 @@ pound sign (`#'):
$ awk 'BEGIN { print index("peanut", "an") }'
-| 3
- If FIND is not found, `index()' returns zero. (Remember that
- string indices in `awk' start at one.)
+ If FIND is not found, `index()' returns zero.
It is a fatal error to use a regexp constant for FIND.
-`length([STRING])'
+`length('[STRING]`)'
Return the number of characters in STRING. If STRING is a number,
the length of the digit string representing that number is
returned. For example, `length("abcde")' is five. By contrast,
@@ -11438,14 +12001,14 @@ pound sign (`#'):
array argument is not portable. If `--posix' is supplied, using
an array argument is a fatal error (*note Arrays::).
-`match(STRING, REGEXP [, ARRAY])'
+`match(STRING, REGEXP' [`, ARRAY']`)'
Search STRING for the longest, leftmost substring matched by the
- regular expression, REGEXP and return the character position, or
- "index", at which that substring begins (one, if it starts at the
+ regular expression REGEXP, and return the character position
+ (index) at which that substring begins (one, if it starts at the
beginning of STRING). If no match is found, return zero.
- The REGEXP argument may be either a regexp constant (`/.../') or a
- string constant (`"..."'). In the latter case, the string is
+ The REGEXP argument may be either a regexp constant (`/'...`/') or
+ a string constant (`"'...`"'). In the latter case, the string is
treated as a regexp to be matched. *Note Computed Regexps::, for a
discussion of the difference between the two forms, and the
implications for writing your program correctly.
@@ -11525,7 +12088,7 @@ pound sign (`#'):
compatibility mode (*note Options::), using a third argument is a
fatal error.
-`patsplit(STRING, ARRAY [, FIELDPAT [, SEPS ] ]) #'
+`patsplit(STRING, ARRAY' [`, FIELDPAT' [`, SEPS' ] ]`) #'
Divide STRING into pieces defined by FIELDPAT and store the pieces
in ARRAY and the separator strings in the SEPS array. The first
piece is stored in `ARRAY[1]', the second piece in `ARRAY[2]', and
@@ -11544,10 +12107,7 @@ pound sign (`#'):
Before splitting the string, `patsplit()' deletes any previously
existing elements in the arrays ARRAY and SEPS.
- The `patsplit()' function is a `gawk' extension. In compatibility
- mode (*note Options::), it is not available.
-
-`split(STRING, ARRAY [, FIELDSEP [, SEPS ] ])'
+`split(STRING, ARRAY' [`, FIELDSEP' [`, SEPS' ] ]`)'
Divide STRING into pieces separated by FIELDSEP and store the
pieces in ARRAY and the separator strings in the SEPS array. The
first piece is stored in `ARRAY[1]', the second piece in
@@ -11612,6 +12172,9 @@ pound sign (`#'):
has one element only. The value of that element is the original
STRING.
+ In POSIX mode (*note Options::), the fourth argument is not
+ allowed.
+
`sprintf(FORMAT, EXPRESSION1, ...)'
Return (without printing) the string that `printf' would have
printed out with the same arguments (*note Printf::). For example:
@@ -11637,18 +12200,15 @@ pound sign (`#'):
Note also that `strtonum()' uses the current locale's decimal point
for recognizing numbers (*note Locales::).
- `strtonum()' is a `gawk' extension; it is not available in
- compatibility mode (*note Options::).
-
-`sub(REGEXP, REPLACEMENT [, TARGET])'
+`sub(REGEXP, REPLACEMENT' [`, TARGET']`)'
Search TARGET, which is treated as a string, for the leftmost,
longest substring matched by the regular expression REGEXP.
Modify the entire string by replacing the matched text with
REPLACEMENT. The modified string becomes the new value of TARGET.
Return the number of substitutions made (zero or one).
- The REGEXP argument may be either a regexp constant (`/.../') or a
- string constant (`"..."'). In the latter case, the string is
+ The REGEXP argument may be either a regexp constant (`/'...`/') or
+ a string constant (`"'...`"'). In the latter case, the string is
treated as a regexp to be matched. *Note Computed Regexps::, for a
discussion of the difference between the two forms, and the
implications for writing your program correctly.
@@ -11713,7 +12273,7 @@ pound sign (`#'):
into a string, and then the value of that string is treated as the
regexp to match.
-`substr(STRING, START [, LENGTH])'
+`substr(STRING, START' [`, LENGTH' ]`)'
Return a LENGTH-character-long substring of STRING, starting at
character number START. The first character of a string is
character number one.(3) For example, `substr("washington", 5, 3)'
@@ -11791,9 +12351,9 @@ backslashes and ampersands into the replacement text, you need to
remember that there are several levels of "escape processing" going on.
First, there is the "lexical" level, which is when `awk' reads your
-program and builds an internal copy of it that can be executed. Then
-there is the runtime level, which is when `awk' actually scans the
-replacement string to determine what to generate.
+program and builds an internal copy of it to execute. Then there is
+the runtime level, which is when `awk' actually scans the replacement
+string to determine what to generate.
At both levels, `awk' looks for a defined set of characters that can
come after a backslash. At the lexical level, it looks for the escape
@@ -11970,7 +12530,7 @@ File: gawk.info, Node: I/O Functions, Next: Time Functions, Prev: String Func
The following functions relate to input/output (I/O). Optional
parameters are enclosed in square brackets ([ ]):
-`close(FILENAME [, HOW])'
+`close('FILENAME [`,' HOW]`)'
Close the file FILENAME for input or output. Alternatively, the
argument may be a shell command that was used for creating a
coprocess, or for redirecting to or from a pipe; then the
@@ -11985,7 +12545,10 @@ parameters are enclosed in square brackets ([ ]):
not matter. *Note Two-way I/O::, which discusses this feature in
more detail and gives an example.
-`fflush([FILENAME])'
+ Note that the second argument to `close()' is a `gawk' extension;
+ it is not available in compatibility mode (*note Options::).
+
+`fflush('[FILENAME]`)'
Flush any buffered output associated with FILENAME, which is
either a file opened for writing or a shell command for
redirecting output to a pipe or coprocess.
@@ -12001,10 +12564,10 @@ parameters are enclosed in square brackets ([ ]):
function--`gawk' also buffers its output and the `fflush()'
function forces `gawk' to flush its buffers.
- `fflush()' was added to Brian Kernighan's version of `awk' in
- April of 1992. For two decades, it was not part of the POSIX
- standard. As of December, 2012, it was accepted for inclusion
- into the POSIX standard. See the Austin Group website
+ `fflush()' was added to Brian Kernighan's `awk' in April of 1992.
+ For two decades, it was not part of the POSIX standard. As of
+ December, 2012, it was accepted for inclusion into the POSIX
+ standard. See the Austin Group website
(http://austingroupbugs.net/view.php?id=634).
POSIX standardizes `fflush()' as follows: If there is no argument,
@@ -12022,7 +12585,7 @@ parameters are enclosed in square brackets ([ ]):
to flush only the standard output.
`fflush()' returns zero if the buffer is successfully flushed;
- otherwise, it returns non-zero (`gawk' returns -1). In the case
+ otherwise, it returns non-zero. (`gawk' returns -1.) In the case
where all buffers are flushed, the return value is zero only if
all buffers were flushed successfully. Otherwise, it is -1, and
`gawk' warns about the problem FILENAME.
@@ -12192,7 +12755,7 @@ enclosed in square brackets ([ ]):
If DATESPEC does not contain enough elements or if the resulting
time is out of range, `mktime()' returns -1.
-`strftime([FORMAT [, TIMESTAMP [, UTC-FLAG]]])'
+`strftime(' [FORMAT [`,' TIMESTAMP [`,' UTC-FLAG] ] ]`)'
Format the time specified by TIMESTAMP based on the contents of
the FORMAT string and return the result. It is similar to the
function of the same name in ISO C. If UTC-FLAG is present and is
@@ -12272,11 +12835,11 @@ the following date format specifications:
`%g'
The year modulo 100 of the ISO 8601 week number, as a decimal
- number (00-99). For example, January 1, 1993 is in week 53 of
- 1992. Thus, the year of its ISO 8601 week number is 1992, even
- though its year is 1993. Similarly, December 31, 1973 is in week
- 1 of 1974. Thus, the year of its ISO week number is 1974, even
- though its year is 1973.
+ number (00-99). For example, January 1, 2012 is in week 52 of
+ 2011. Thus, the year of its ISO 8601 week number is 2011, even
+ though its year is 2012. Similarly, December 31, 2012 is in week
+ 1 of 2013. Thus, the year of its ISO week number is 2013, even
+ though its year is 2012.
`%G'
The full year of the ISO week number, as a decimal number.
@@ -12356,7 +12919,7 @@ the following date format specifications:
The year modulo 100 as a decimal number (00-99).
`%Y'
- The full year as a decimal number (e.g., 2011).
+ The full year as a decimal number (e.g., 2015).
`%z'
The timezone offset in a +HHMM format (e.g., the format necessary
@@ -12378,15 +12941,6 @@ the following date format specifications:
If a conversion specifier is not one of the above, the behavior is
undefined.(6)
- Informally, a "locale" is the geographic place in which a program is
-meant to run. For example, a common way to abbreviate the date
-September 4, 2012 in the United States is "9/4/12." In many countries
-in Europe, however, it is abbreviated "4.9.12." Thus, the `%x'
-specification in a `"US"' locale might produce `9/4/12', while in a
-`"EUROPE"' locale, it might produce `4.9.12'. The ISO C standard
-defines a default `"C"' locale, which is an environment that is typical
-of what many C programmers are used to.
-
For systems that are not yet fully standards-compliant, `gawk'
supplies a copy of `strftime()' from the GNU C Library. It supports
all of the just-listed format specifications. If that version is used
@@ -12416,7 +12970,7 @@ to the standard output and interprets the current time according to the
format specifiers in the string. For example:
$ date '+Today is %A, %B %d, %Y.'
- -| Today is Wednesday, March 30, 2011.
+ -| Today is Monday, May 05, 2014.
Here is the `gawk' version of the `date' utility. It has a shell
"wrapper" to handle the `-u' option, which requires that `date' run as
@@ -12433,7 +12987,7 @@ if the time zone is set to UTC:
esac
gawk 'BEGIN {
- format = "%a %b %e %H:%M:%S %Z %Y"
+ format = PROCINFO["strftime"]
exitval = 0
if (ARGC > 2)
@@ -12510,23 +13064,23 @@ again with `10111001' and shift it left by three bits, you end up with
`11001000'. `gawk' provides built-in functions that implement the
bitwise operations just described. They are:
-`and(V1, V2 [, ...])'
+``and(V1, V2' [`,' ...]`)''
Return the bitwise AND of the arguments. There must be at least
two.
-`compl(VAL)'
+``compl(VAL)''
Return the bitwise complement of VAL.
-`lshift(VAL, COUNT)'
+``lshift(VAL, COUNT)''
Return the value of VAL, shifted left by COUNT bits.
-`or(V1, V2 [, ...])'
+``or(V1, V2' [`,' ...]`)''
Return the bitwise OR of the arguments. There must be at least two.
-`rshift(VAL, COUNT)'
+``rshift(VAL, COUNT)''
Return the value of VAL, shifted right by COUNT bits.
-`xor(V1, V2 [, ...])'
+``xor(V1, V2' [`,' ...]`)''
Return the bitwise XOR of the arguments. There must be at least
two.
@@ -12611,8 +13165,8 @@ File: gawk.info, Node: Type Functions, Next: I18N Functions, Prev: Bitwise Fu
`gawk' provides a single function that lets you distinguish an array
from a scalar variable. This is necessary for writing code that
-traverses every element of a true multidimensional array (*note Arrays
-of Arrays::).
+traverses every element of an array of arrays (*note Arrays of
+Arrays::).
`isarray(X)'
Return a true value if X is an array. Otherwise return false.
@@ -12642,7 +13196,7 @@ descriptions here are purposely brief. *Note Internationalization::,
for the full story. Optional parameters are enclosed in square
brackets ([ ]):
-`bindtextdomain(DIRECTORY [, DOMAIN])'
+`bindtextdomain(DIRECTORY' [`,' DOMAIN]`)'
Set the directory in which `gawk' will look for message
translation files, in case they will not or cannot be placed in
the "standard" locations (e.g., during testing). It returns the
@@ -12652,13 +13206,13 @@ brackets ([ ]):
the null string (`""'), then `bindtextdomain()' returns the
current binding for the given DOMAIN.
-`dcgettext(STRING [, DOMAIN [, CATEGORY]])'
+`dcgettext(STRING' [`,' DOMAIN [`,' CATEGORY] ]`)'
Return the translation of STRING in text domain DOMAIN for locale
category CATEGORY. The default value for DOMAIN is the current
value of `TEXTDOMAIN'. The default value for CATEGORY is
`"LC_MESSAGES"'.
-`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])'
+`dcngettext(STRING1, STRING2, NUMBER' [`,' DOMAIN [`,' CATEGORY] ]`)'
Return the plural form used for NUMBER of the translation of
STRING1 and STRING2 in text domain DOMAIN for locale category
CATEGORY. STRING1 is the English singular variant of a message,
@@ -12701,10 +13255,10 @@ starting to execute any of it.
The definition of a function named NAME looks like this:
- function NAME([PARAMETER-LIST])
- {
+ `function' NAME`('[PARAMETER-LIST]`)'
+ `{'
BODY-OF-FUNCTION
- }
+ `}'
Here, NAME is the name of the function to define. A valid function
name is like a valid variable name: a sequence of letters, digits, and
@@ -12715,15 +13269,21 @@ function.
PARAMETER-LIST is an optional list of the function's arguments and
local variable names, separated by commas. When the function is called,
the argument names are used to hold the argument values given in the
-call. The local variables are initialized to the empty string. A
-function cannot have two parameters with the same name, nor may it have
-a parameter with the same name as the function itself.
+call.
- In addition, according to the POSIX standard, function parameters
-cannot have the same name as one of the special built-in variables
-(*note Built-in Variables::. Not all versions of `awk' enforce this
+ A function cannot have two parameters with the same name, nor may it
+have a parameter with the same name as the function itself. In
+addition, according to the POSIX standard, function parameters cannot
+have the same name as one of the special built-in variables (*note
+Built-in Variables::). (Not all versions of `awk' enforce this
restriction.)
+ Local variables act like the empty string if referenced where a
+string value is required, and like zero if referenced where a numeric
+value is required. This is the same as regular variables that have
+never been assigned a value. (There is more to understand about local
+variables; *note Dynamic Typing::.)
+
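
The behavior of unassigned local variables can be checked with a minimal sketch. The function name `show' and the parameter name `loc' are hypothetical, chosen only for illustration. `loc' is an extra parameter that is never passed in the call, so it acts as a local variable:

```shell
# Sketch only: show() and loc are illustrative names, not from the manual.
result=$(awk '
function show(arg,    loc)
{
    # loc was never assigned: it acts like "" as a string and 0 as a number
    return "string=[" loc "] number=" (loc + 0)
}
BEGIN { print show("x") }')
echo "$result"    # prints: string=[] number=0
```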
The BODY-OF-FUNCTION consists of `awk' statements. It is the most
important part of the definition, because it says what the function
should actually _do_. The argument names exist to give the body a way
@@ -12871,7 +13431,7 @@ an `awk' version of `ctime()':
function ctime(ts, format)
{
- format = "%a %b %e %H:%M:%S %Z %Y"
+ format = PROCINFO["strftime"]
if (ts == 0)
ts = systime() # use current time as default
return strftime(format, ts)
@@ -12925,9 +13485,10 @@ File: gawk.info, Node: Variable Scope, Next: Pass By Value/Reference, Prev: C
9.2.3.2 Controlling Variable Scope
..................................
-There is no way to make a variable local to a `{ ... }' block in `awk',
-but you can make a variable local to a function. It is good practice to
-do so whenever a variable is needed only in that function.
+Unlike many other languages, `awk' provides no way to make a variable
+local to a `{' ... `}' block, but you can make a variable local to a
+function. It is good practice to do so whenever a variable is needed
+only in that function.
To make a variable local to a function, simply declare the variable
as an argument after the actual function arguments (*note Definition
@@ -13146,11 +13707,11 @@ control to the calling part of the `awk' program. It can also be used
to return a value for use in the rest of the `awk' program. It looks
like this:
- return [EXPRESSION]
+ `return' [EXPRESSION]
The EXPRESSION part is optional. Due most likely to an oversight,
POSIX does not define what the return value is if you omit the
-EXPRESSION. Technically speaking, this make the returned value
+EXPRESSION. Technically speaking, this makes the returned value
undefined, and therefore, unpredictable. In practice, though, all
versions of `awk' simply return the null string, which acts like zero
if used in a numeric context.
@@ -13244,14 +13805,14 @@ Here is an annotated sample program:
}
In this example, the first call to `foo()' generates a fatal error,
-so `gawk' will not report the second error. If you comment out that
-call, though, then `gawk' will report the second error.
+so `awk' does not report the second error. If you comment out that
+call, though, then `awk' does report the second error.
Usually, such things aren't a big issue, but it's worth being aware
of them.

-File: gawk.info, Node: Indirect Calls, Prev: User-defined, Up: Functions
+File: gawk.info, Node: Indirect Calls, Next: Functions Summary, Prev: User-defined, Up: Functions
9.3 Indirect Function Calls
===========================
@@ -13539,6 +14100,63 @@ example, in the following case:
`gawk' will look up the actual function to call only once.

+File: gawk.info, Node: Functions Summary, Prev: Indirect Calls, Up: Functions
+
+9.4 Summary
+===========
+
+ * `awk' provides built-in functions and lets you define your own
+ functions.
+
+ * POSIX `awk' provides three kinds of built-in functions: numeric,
+ string, and I/O. `gawk' provides functions that work with values
+ representing time, do bit manipulation, sort arrays, and
+ internationalize and localize programs. `gawk' also provides
+ several extensions to some of the standard functions, typically in the
+ form of additional arguments.
+
+ * Functions accept zero or more arguments and return a value. The
+ expressions that provide the argument values are completely
+ evaluated before the function is called. Order of evaluation is
+ not defined. The return value can be ignored.
+
+ * The handling of backslash in `sub()' and `gsub()' is not simple.
+ It is more straightforward in `gawk''s `gensub()' function, but
+ that function still requires care in its use.
+
+ * User-defined functions provide important capabilities but come with
+ some syntactic inelegancies. In a function call, there cannot be
+ any space between the function name and the opening parenthesis
+ of the argument list. Also, there is no provision for
+ local variables, so the convention is to add extra parameters, and
+ to separate them visually from the real parameters by extra
+ whitespace.
+
+ * User-defined functions may call other user-defined (and built-in)
+ functions and may call themselves recursively. Function parameters
+ "hide" any global variables of the same names.
+
+ * Scalar values are passed to user-defined functions by value. Array
+ parameters are passed by reference; any changes made by the
+ function to array parameters are thus visible after the function
+ has returned.
+
+ * Use the `return' statement to return from a user-defined function.
+ An optional expression becomes the function's return value. Only
+ scalar values may be returned by a function.
+
+ * If a variable that has never been used is passed to a user-defined
+ function, how that function treats the variable can set its nature:
+ either scalar or array.
+
+ * `gawk' provides indirect function calls using a special syntax.
+ By setting a variable to the name of a user-defined function, you
+ can determine at runtime what function will be called at that
+ point in the program. This is equivalent to function pointers in C
+ and C++.
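
The difference between passing scalars and passing arrays, as summarized in the list above, can be sketched as follows. The names `modify', `data', and `text' are hypothetical and serve only as an illustration:

```shell
# Sketch only: modify(), data, and text are illustrative names.
result=$(awk '
function modify(arr, str)
{
    arr["key"] = "changed"    # arrays are passed by reference: caller sees this
    str = "changed"           # scalars are passed by value: caller does not
}
BEGIN {
    data["key"] = "original"
    text = "original"
    modify(data, text)
    print data["key"], text
}')
echo "$result"    # prints: changed original
```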
+
+
+
File: gawk.info, Node: Library Functions, Next: Sample Programs, Prev: Functions, Up: Top
10 A Library of `awk' Functions
@@ -13614,6 +14232,8 @@ for different implementations of `awk' is pretty straightforward.
* Passwd Functions:: Functions for getting user information.
* Group Functions:: Functions for getting group information.
* Walking Arrays:: A function to walk arrays of arrays.
+* Library Functions Summary:: Summary of library functions.
+* Library exercises:: Exercises.
---------- Footnotes ----------
@@ -13740,7 +14360,7 @@ versions of `awk':
# mystrtonum --- convert string to number
- function mystrtonum(str, ret, chars, n, i, k, c)
+ function mystrtonum(str, ret, n, i, k, c)
{
if (str ~ /^0[0-7]*$/) {
# octal
@@ -13753,7 +14373,7 @@ versions of `awk':
ret = ret * 8 + k
}
- } else if (str ~ /^0[xX][[:xdigit:]]+/) {
+ } else if (str ~ /^0[xX][[:xdigit:]]+$/) {
# hexadecimal
str = substr(str, 3) # lop off leading 0x
n = length(str)
@@ -13761,10 +14381,7 @@ versions of `awk':
for (i = 1; i <= n; i++) {
c = substr(str, i, 1)
c = tolower(c)
- if ((k = index("0123456789", c)) > 0)
- k-- # adjust for 1-basing in awk
- else if ((k = index("abcdef", c)) > 0)
- k += 9
+ k = index("123456789abcdef", c)
ret = ret * 16 + k
}
@@ -14164,7 +14781,7 @@ current time formatted in the same way as the `date' utility:
now = systime()
# return date(1)-style output
- ret = strftime("%a %b %e %H:%M:%S %Z %Y", now)
+ ret = strftime(PROCINFO["strftime"], now)
# clear out target array
delete time
@@ -14424,8 +15041,8 @@ File: gawk.info, Node: File Checking, Next: Empty Files, Prev: Rewind Functio
Normally, if you give `awk' a data file that isn't readable, it stops
with a fatal error. There are times when you might want to just ignore
-such files and keep going. You can do this by prepending the following
-program to your `awk' program:
+such files and keep going.(1) You can do this by prepending the
+following program to your `awk' program:
# readable.awk --- library file to skip over unreadable files
@@ -14445,10 +15062,16 @@ program to your `awk' program:
element from `ARGV' with `delete' skips the file (since it's no longer
in the list). See also *note ARGC and ARGV::.
+ ---------- Footnotes ----------
+
+ (1) The `BEGINFILE' special pattern (*note BEGINFILE/ENDFILE::)
+provides an alternative mechanism for dealing with files that can't be
+opened. However, the code here provides a portable solution.
+

File: gawk.info, Node: Empty Files, Next: Ignoring Assigns, Prev: File Checking, Up: Data File Management
-10.3.4 Checking For Zero-length Files
+10.3.4 Checking for Zero-length Files
-------------------------------------
All known `awk' implementations silently skip over zero-length files.
@@ -14496,12 +15119,6 @@ normal case.
end of the command-line arguments. Note that the test in the condition
of the `for' loop uses the `<=' operator, not `<'.
- As an exercise, you might consider whether this same problem can be
-solved without relying on `gawk''s `ARGIND' variable.
-
- As a second exercise, revise this code to handle the case where an
-intervening value in `ARGV' is a variable assignment.
-

File: gawk.info, Node: Ignoring Assigns, Prev: Empty Files, Up: Data File Management
@@ -14793,7 +15410,7 @@ is in `ARGV[0]':
# test program
if (_getopt_test) {
while ((_go_c = getopt(ARGC, ARGV, "ab:cd")) != -1)
- printf("c = <%c>, optarg = <%s>\n",
+ printf("c = <%c>, Optarg = <%s>\n",
_go_c, Optarg)
printf("non-option arguments:\n")
for (; Optind < ARGC; Optind++)
@@ -14806,17 +15423,17 @@ is in `ARGV[0]':
result of two sample runs of the test program:
$ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x
- -| c = <a>, optarg = <>
- -| c = <c>, optarg = <>
- -| c = <b>, optarg = <ARG>
+ -| c = <a>, Optarg = <>
+ -| c = <c>, Optarg = <>
+ -| c = <b>, Optarg = <ARG>
-| non-option arguments:
-| ARGV[3] = <bax>
-| ARGV[4] = <-x>
$ awk -f getopt.awk -v _getopt_test=1 -- -a -x -- xyz abc
- -| c = <a>, optarg = <>
+ -| c = <a>, Optarg = <>
error--> x -- invalid option
- -| c = <?>, optarg = <>
+ -| c = <?>, Optarg = <>
-| non-option arguments:
-| ARGV[4] = <xyz>
-| ARGV[5] = <abc>
@@ -14875,7 +15492,7 @@ that "cats" the password database:
/*
* pwcat.c
*
- * Generate a printable version of the password database
+ * Generate a printable version of the password database.
*/
#include <stdio.h>
#include <pwd.h>
@@ -15100,7 +15717,7 @@ group database, is as follows:
/*
* grcat.c
*
- * Generate a printable version of the group database
+ * Generate a printable version of the group database.
*/
#include <stdio.h>
#include <grp.h>
@@ -15136,9 +15753,10 @@ Group Password
used; it is usually empty or set to `*'.
Group ID Number
- The group's numeric group ID number; this number must be unique
- within the file. (On some systems it's a C `long', and not an
- `int'. Thus we cast it to `long' for all cases.)
+ The group's numeric group ID number; the association of name to
+ number must be unique within the file. (On some systems it's a C
+ `long', and not an `int'. Thus we cast it to `long' for all
+ cases.)
Group Member List
A comma-separated list of user names. These users are members of
@@ -15242,15 +15860,12 @@ the database for the same group. This is common when a group has a
large number of members. A pair of such entries might look like the
following:
- tvpeople:*:101:johnny,jay,arsenio
+ tvpeople:*:101:johny,jay,arsenio
tvpeople:*:101:david,conan,tom,joan
For this reason, `_gr_init()' looks to see if a group name or group
ID number is already seen. If it is, then the user names are simply
-concatenated onto the previous list of users. (There is actually a
-subtle problem with the code just presented. Suppose that the first
-time there were no names. This code adds the names with a leading
-comma. It also doesn't check that there is a `$4'.)
+concatenated onto the previous list of users.(1)
Finally, `_gr_init()' closes the pipeline to `grcat', restores `FS'
(and `FIELDWIDTHS' or `FPAT' if necessary), `RS', and `$0', initializes
@@ -15315,8 +15930,14 @@ very simple, relying on `awk''s associative arrays to do work.
The `id' program in *note Id Program::, uses these functions.
+ ---------- Footnotes ----------
+
+ (1) There is actually a subtle problem with the code just presented.
+Suppose that the first time there were no names. This code adds the
+names with a leading comma. It also doesn't check that there is a `$4'.
+

-File: gawk.info, Node: Walking Arrays, Prev: Group Functions, Up: Library Functions
+File: gawk.info, Node: Walking Arrays, Next: Library Functions Summary, Prev: Group Functions, Up: Library Functions
10.7 Traversing Arrays of Arrays
================================
@@ -15366,17 +15987,73 @@ value. Here is a main program to demonstrate:
-| a[2][2] = 22
-| a[3] = 3
- Walking an array and processing each element is a general-purpose
-operation. You might want to consider generalizing the `walk_array()'
-function by adding an additional parameter named `process'.
+
+File: gawk.info, Node: Library Functions Summary, Next: Library exercises, Prev: Walking Arrays, Up: Library Functions
+
+10.8 Summary
+============
+
+ * Reading programs is an excellent way to learn Good Programming.
+ The functions provided in this major node and the next are intended
+ to serve that purpose.
+
+ * When writing general-purpose library functions, put some thought
+ into how to name any global variables so that they won't conflict
+ with variables from a user's program.
+
+ * The functions presented here fit into the following categories:
+
+ General problems
+ String to number conversion, assertions, rounding, random
+ number generation, converting characters to numbers, joining
+ strings, getting easily usable time-of-day information, and
+ reading a whole file in one shot.
+
+ Managing data files
+ Noting data file boundaries, rereading the current file,
+ checking for readable files, checking for zero-length files,
+ and treating assignments as file names.
+
+ Processing command-line options
+ An `awk' version of the standard C `getopt()' function.
- Then, inside the loop, instead of simply printing the array element's
-index and value, use the indirect function call syntax (*note Indirect
-Calls::) on `process', passing it the index and the value.
+ Reading the user and group databases
+ Two sets of routines that parallel the C library versions.
+
+ Traversing arrays of arrays
+ A simple function to traverse an array of arrays to any depth.
+
+
+
+File: gawk.info, Node: Library exercises, Prev: Library Functions Summary, Up: Library Functions
+
+10.9 Exercises
+==============
+
+ 1. In *note Empty Files::, we presented the `zerofile.awk' program,
+ which made use of `gawk''s `ARGIND' variable. Can this problem be
+ solved without relying on `ARGIND'? If so, how?
+
+ 2. As a related challenge, revise that code to handle the case where
+ an intervening value in `ARGV' is a variable assignment.
+
+ 3. *note Walking Arrays::, presented a function that walked a
+ multidimensional array to print it out. However, walking an array
+ and processing each element is a general-purpose operation.
+ Generalize the `walk_array()' function by adding an additional
+ parameter named `process'.
+
+ Then, inside the loop, instead of printing the array element's
+ index and value, use the indirect function call syntax (*note
+ Indirect Calls::) on `process', passing it the index and the value.
+
+ When calling `walk_array()', you would pass the name of a
+ user-defined function that expects to receive an index and a value,
+ and then processes the element.
+
+ Test your new version by printing the array; you should end up with
+ output identical to that of the original version.
- When calling `walk_array()', you would pass the name of a
-user-defined function that expects to receive an index and a value, and
-then processes the element.

File: gawk.info, Node: Sample Programs, Next: Advanced Features, Prev: Library Functions, Up: Top
@@ -15397,6 +16074,8 @@ Library Functions::.
* Running Examples:: How to run these examples.
* Clones:: Clones of common utilities.
* Miscellaneous Programs:: Some interesting `awk' programs.
+* Programs Summary:: Summary of programs.
+* Programs Exercises:: Exercises.

File: gawk.info, Node: Running Examples, Next: Clones, Up: Sample Programs
@@ -15542,7 +16221,7 @@ by characters, the output field separator is set to the null string:
OFS = ""
} else if (c == "d") {
if (length(Optarg) > 1) {
- printf("Using first character of %s" \
+ printf("cut: using first character of %s" \
" for delimiter\n", Optarg) > "/dev/stderr"
Optarg = substr(Optarg, 1, 1)
}
@@ -15551,7 +16230,7 @@ by characters, the output field separator is set to the null string:
if (FS == " ") # defeat awk semantics
FS = "[ ]"
} else if (c == "s")
- suppress++
+ suppress = 1
else
usage()
}
@@ -15609,7 +16288,7 @@ splitting:
if (index(f[i], "-") != 0) { # a range
m = split(f[i], g, "-")
if (m != 2 || g[1] >= g[2]) {
- printf("bad field list: %s\n",
+ printf("cut: bad field list: %s\n",
f[i]) > "/dev/stderr"
exit 1
}
@@ -15647,7 +16326,7 @@ filler fields:
if (index(f[i], "-") != 0) { # range
m = split(f[i], g, "-")
if (m != 2 || g[1] >= g[2]) {
- printf("bad character list: %s\n",
+ printf("cut: bad character list: %s\n",
f[i]) > "/dev/stderr"
exit 1
}
@@ -15720,7 +16399,7 @@ The `egrep' utility searches files for patterns. It uses regular
expressions that are almost identical to those available in `awk'
(*note Regexp::). You invoke it as follows:
- egrep [ OPTIONS ] 'PATTERN' FILES ...
+ `egrep' [OPTIONS] `'PATTERN'' FILES ...
The PATTERN is a regular expression. In typical usage, the regular
expression is quoted to prevent the shell from expanding any of the
@@ -15864,6 +16543,11 @@ know the total number of lines that matched the pattern:
total += fcount
}
+ The `BEGINFILE' and `ENDFILE' special patterns (*note
+BEGINFILE/ENDFILE::) could be used, but then the program would be
+`gawk'-specific. Additionally, this example was written before `gawk'
+acquired `BEGINFILE' and `ENDFILE'.
+
The following rule does most of the work of matching lines. The
variable `matches' is true if the line matched the pattern. If the user
wants lines that did not match, the sense of `matches' is inverted
@@ -15911,9 +16595,7 @@ there are no matches, the exit status is one; otherwise it is zero:
END \
{
- if (total == 0)
- exit 1
- exit 0
+ exit (total == 0)
}
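
The exit status logic can be tried in isolation. This is a minimal sketch and not part of the `egrep' program; `total' is set by hand here instead of being counted from input. The comparison `total == 0' evaluates to one when nothing matched and to zero otherwise, which is exactly the exit status wanted:

```shell
# Sketch only: total is assigned directly instead of being accumulated.
no_match=0
awk 'BEGIN { total = 0; exit (total == 0) }' || no_match=$?

some_match=0
awk 'BEGIN { total = 3; exit (total == 0) }' || some_match=$?

echo "$no_match $some_match"    # prints: 1 0
```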
The `usage()' function prints a usage message in case of invalid
@@ -15955,7 +16637,7 @@ different from the real ones. If possible, `id' also supplies the
corresponding user and group names. The output might look like this:
$ id
- -| uid=500(arnold) gid=500(arnold) groups=6(disk),7(lp),19(floppy)
+ -| uid=1000(arnold) gid=1000(arnold) groups=1000(arnold),4(adm),7(lp),27(sudo)
This information is part of what is provided by `gawk''s `PROCINFO'
array (*note Built-in Variables::). However, the `id' utility provides
@@ -15988,34 +16670,26 @@ and the group numbers:
printf("uid=%d", uid)
pw = getpwuid(uid)
- if (pw != "") {
- split(pw, a, ":")
- printf("(%s)", a[1])
- }
+ if (pw != "")
+ pr_first_field(pw)
if (euid != uid) {
printf(" euid=%d", euid)
pw = getpwuid(euid)
- if (pw != "") {
- split(pw, a, ":")
- printf("(%s)", a[1])
- }
+ if (pw != "")
+ pr_first_field(pw)
}
printf(" gid=%d", gid)
pw = getgrgid(gid)
- if (pw != "") {
- split(pw, a, ":")
- printf("(%s)", a[1])
- }
+ if (pw != "")
+ pr_first_field(pw)
if (egid != gid) {
printf(" egid=%d", egid)
pw = getgrgid(egid)
- if (pw != "") {
- split(pw, a, ":")
- printf("(%s)", a[1])
- }
+ if (pw != "")
+ pr_first_field(pw)
}
for (i = 1; ("group" i) in PROCINFO; i++) {
@@ -16024,10 +16698,8 @@ and the group numbers:
group = PROCINFO["group" i]
printf("%d", group)
pw = getgrgid(group)
- if (pw != "") {
- split(pw, a, ":")
- printf("(%s)", a[1])
- }
+ if (pw != "")
+ pr_first_field(pw)
if (("group" (i+1)) in PROCINFO)
printf(",")
}
@@ -16035,6 +16707,12 @@ and the group numbers:
print ""
}
+ function pr_first_field(str, a)
+ {
+ split(str, a, ":")
+ printf("(%s)", a[1])
+ }
+
The test in the `for' loop is worth noting. Any supplementary
groups in the `PROCINFO' array have the indices `"group1"' through
`"groupN"' for some N, i.e., the total number of supplementary groups.
@@ -16049,6 +16727,10 @@ the last group in the array and the loop exits.
then the condition is false the first time it's tested, and the loop
body never executes.
+ The `pr_first_field()' function simply factors out code that is
+used repeatedly, making the whole program slightly shorter and
+cleaner.
+

File: gawk.info, Node: Split Program, Next: Tee Program, Prev: Id Program, Up: Clones
@@ -16058,7 +16740,7 @@ File: gawk.info, Node: Split Program, Next: Tee Program, Prev: Id Program, U
The `split' program splits large text files into smaller pieces. Usage
is as follows:(1)
- split [-COUNT] file [ PREFIX ]
+ `split' [`-COUNT'] [FILE] [PREFIX]
By default, the output files are named `xaa', `xab', and so on. Each
file has 1000 lines in it, with the likely exception of the last file.
@@ -16082,7 +16764,7 @@ output file names:
# split.awk --- do split in awk
#
# Requires ord() and chr() library functions
- # usage: split [-num] [file] [outname]
+ # usage: split [-count] [file] [outname]
BEGIN {
outfile = "x" # default
@@ -16091,7 +16773,7 @@ output file names:
usage()
i = 1
- if (ARGV[i] ~ /^-[[:digit:]]+$/) {
+ if (i in ARGV && ARGV[i] ~ /^-[[:digit:]]+$/) {
count = -ARGV[i]
ARGV[i] = ""
i++
@@ -16167,7 +16849,7 @@ The `tee' program is known as a "pipe fitting." `tee' copies its
standard input to its standard output and also duplicates it to the
files named on the command line. Its usage is as follows:
- tee [-a] file ...
+ `tee' [`-a'] FILE ...
The `-a' option tells `tee' to append to the named files, instead of
truncating them and starting over.
@@ -16256,7 +16938,7 @@ and by default removes duplicate lines. In other words, it only prints
unique lines--hence the name. `uniq' has a number of options. The
usage is as follows:
- uniq [-udc [-N]] [+N] [ INPUT FILE [ OUTPUT FILE ]]
+ `uniq' [`-udc' [`-N']] [`+N'] [INPUTFILE [OUTPUTFILE]]
The options for `uniq' are:
@@ -16279,11 +16961,11 @@ usage is as follows:
Skip N characters before comparing lines. Any fields specified
with `-N' are skipped first.
-`INPUT FILE'
+`INPUTFILE'
Data is read from the input file named on the command line,
instead of from the standard input.
-`OUTPUT FILE'
+`OUTPUTFILE'
The generated output is sent to the named output file, instead of
to the standard output.
@@ -16473,7 +17155,7 @@ File: gawk.info, Node: Wc Program, Prev: Uniq Program, Up: Clones
The `wc' (word count) utility counts lines, words, and characters in
one or more input files. Its usage is as follows:
- wc [-lwc] [ FILES ... ]
+ `wc' [`-lwc'] [FILES ...]
If no files are specified on the command line, `wc' reads its
standard input. If there are multiple files, it also prints total
@@ -16553,7 +17235,7 @@ lines, words, and characters to zero, and saves the current file name in
}
The `endfile()' function adds the current file's numbers to the
-running totals of lines, words, and characters.(1) It then prints out
+running totals of lines, words, and characters. It then prints out
those numbers for the file that was just read. It relies on
`beginfile()' to reset the numbers for the following data file:
@@ -16572,7 +17254,7 @@ those numbers for the file that was just read. It relies on
}
There is one rule that is executed for each line. It adds the length
-of the record, plus one, to `chars'.(2) Adding one plus the record
+of the record, plus one, to `chars'.(1) Adding one plus the record
length is needed because the newline character separating records (the
value of `RS') is not part of the record itself, and thus not included
in its length. Next, `lines' is incremented for each line read, and
@@ -16602,15 +17284,11 @@ in its length. Next, `lines' is incremented for each line read, and
---------- Footnotes ----------
- (1) `wc' can't just use the value of `FNR' in `endfile()'. If you
-examine the code in *note Filetrans Function::, you will see that `FNR'
-has already been reset by the time `endfile()' is called.
-
- (2) Since `gawk' understands multibyte locales, this code counts
+ (1) Since `gawk' understands multibyte locales, this code counts
characters, not bytes.

-File: gawk.info, Node: Miscellaneous Programs, Prev: Clones, Up: Sample Programs
+File: gawk.info, Node: Miscellaneous Programs, Next: Programs Summary, Prev: Clones, Up: Sample Programs
11.3 A Grab Bag of `awk' Programs
=================================
@@ -16791,7 +17469,7 @@ alarm:
# how long to sleep for
naptime = target - current
if (naptime <= 0) {
- print "time is in the past!" > "/dev/stderr"
+ print "alarm: time is in the past!" > "/dev/stderr"
exit 1
}
@@ -16839,11 +17517,11 @@ there are more characters in the "from" list than in the "to" list, the
last character of the "to" list is used for the remaining characters in
the "from" list.
- Some time ago, a user proposed that a transliteration function should
-be added to `gawk'. The following program was written to prove that
-character transliteration could be done with a user-level function.
-This program is not as complete as the system `tr' utility but it does
-most of the job.
+ Once upon a time, a user proposed that a transliteration function
+should be added to `gawk'. The following program was written to prove
+that character transliteration could be done with a user-level
+function. This program is not as complete as the system `tr' utility
+but it does most of the job.
The `translate' program demonstrates one of the few weaknesses of
standard `awk': dealing with individual characters is very painful,
@@ -16924,8 +17602,8 @@ record:
While it is possible to do character transliteration in a user-level
function, it is not necessarily efficient, and we (the `gawk' authors)
started to consider adding a built-in function. However, shortly after
-writing this program, we learned that the System V Release 4 `awk' had
-added the `toupper()' and `tolower()' functions (*note String
+writing this program, we learned that Brian Kernighan had added the
+`toupper()' and `tolower()' functions to his `awk' (*note String
Functions::). These functions handle the vast majority of the cases
where character transliteration is necessary, and so we chose to simply
add those functions to `gawk' as well and then leave well enough alone.
@@ -16937,10 +17615,10 @@ program.
---------- Footnotes ----------
- (1) On some older systems, including Solaris, `tr' may require that
-the lists be written as range expressions enclosed in square brackets
-(`[a-z]') and quoted, to prevent the shell from attempting a file name
-expansion. This is not a feature.
+ (1) On some older systems, including Solaris, the system version of
+`tr' may require that the lists be written as range expressions
+enclosed in square brackets (`[a-z]') and quoted, to prevent the shell
+from attempting a file name expansion. This is not a feature.
(2) This program was written before `gawk' acquired the ability to
split each character in a string into separate array elements.
@@ -17060,7 +17738,7 @@ File: gawk.info, Node: Word Sorting, Next: History Sorting, Prev: Labels Prog
When working with large amounts of text, it can be interesting to know
how often different words appear. For example, an author may overuse
-certain words, in which case she might wish to find synonyms to
+certain words, in which case he or she might wish to find synonyms to
substitute for words that appear too often. This node develops a
program for counting words and presenting the frequency information in
a useful format.
@@ -17123,6 +17801,10 @@ script. Here is the new version of the program:
printf "%s\t%d\n", word, freq[word]
}
+ The regexp `/[^[:alnum:]_[:blank:]]/' might have been written
+`/[[:punct:]]/', but then underscores would also be removed, and we
+want to keep them.
+
Assuming we have saved this program in a file named `wordfreq.awk',
and that the data is in `file1', the following pipeline:
@@ -17200,8 +17882,7 @@ information. For example, using the following `print' statement in the
print data[lines[i]], lines[i]
- This works because `data[$0]' is incremented each time a line is
-seen.
+This works because `data[$0]' is incremented each time a line is seen.

File: gawk.info, Node: Extract Program, Next: Simple Sed, Prev: History Sorting, Up: Miscellaneous Programs
@@ -17290,7 +17971,7 @@ with a zero exit status, signifying OK:
/^@c(omment)?[ \t]+system/ \
{
if (NF < 3) {
- e = (FILENAME ":" FNR)
+ e = ("extract: " FILENAME ":" FNR)
e = (e ": badly formed `system' line")
print e > "/dev/stderr"
next
@@ -17299,7 +17980,7 @@ with a zero exit status, signifying OK:
$2 = ""
stat = system($0)
if (stat != 0) {
- e = (FILENAME ":" FNR)
+ e = ("extract: " FILENAME ":" FNR)
e = (e ": warning: system returned " stat)
print e > "/dev/stderr"
}
@@ -17329,16 +18010,17 @@ function (*note String Functions::). The `@' symbol is used as the
separator character. Each element of `a' that is empty indicates two
successive `@' symbols in the original line. For each two empty
elements (`@@' in the original file), we have to add a single `@'
-symbol back in.(1)
+symbol back in.
When the processing of the array is finished, `join()' is called
-with the value of `SUBSEP', to rejoin the pieces back into a single
-line. That line is then printed to the output file:
+with the value of `SUBSEP' (*note Multidimensional::), to rejoin the
+pieces back into a single line. That line is then printed to the
+output file:
/^@c(omment)?[ \t]+file/ \
{
if (NF != 3) {
- e = (FILENAME ":" FNR ": badly formed `file' line")
+ e = ("extract: " FILENAME ":" FNR ": badly formed `file' line")
print e > "/dev/stderr"
next
}
@@ -17389,7 +18071,7 @@ closing the open file:
function unexpected_eof()
{
- printf("%s:%d: unexpected EOF or error\n",
+ printf("extract: %s:%d: unexpected EOF or error\n",
FILENAME, FNR) > "/dev/stderr"
exit 1
}
@@ -17399,11 +18081,6 @@ closing the open file:
close(curfile)
}
- ---------- Footnotes ----------
-
- (1) This program was written before `gawk' had the `gensub()'
-function. Consider how you might use it to simplify the code.
-

File: gawk.info, Node: Simple Sed, Next: Igawk Program, Prev: Extract Program, Up: Miscellaneous Programs
@@ -17580,10 +18257,10 @@ are several cases of interest:
programming trick. Don't worry about it if you are not familiar
with `sh'.)
-`-v, -F'
+`-v', `-F'
These are saved and passed on to `gawk'.
-`-f, --file, --file=, -Wfile='
+`-f', `--file', `--file=', `-Wfile='
The file name is appended to the shell variable `program' with an
`@include' statement. The `expr' utility is used to remove the
leading option part of the argument (e.g., `--file='). (Typical
@@ -17592,10 +18269,10 @@ are several cases of interest:
sequences in their arguments, possibly mangling the program text.
Using `expr' avoids this problem.)
-`--source, --source=, -Wsource='
+`--source', `--source=', `-Wsource='
The source text is appended to `program'.
-`--version, -Wversion'
+`--version', `-Wversion'
`igawk' prints its version number, runs `gawk --version' to get
the `gawk' version information, and then exits.
@@ -17741,12 +18418,12 @@ which represents the current directory:
pathlist[i] = "."
}
- The stack is initialized with `ARGV[1]', which will be `/dev/stdin'.
-The main loop comes next. Input lines are read in succession. Lines
-that do not start with `@include' are printed verbatim. If the line
-does start with `@include', the file name is in `$2'. `pathto()' is
-called to generate the full path. If it cannot, then the program
-prints an error message and continues.
+ The stack is initialized with `ARGV[1]', which will be
+`"/dev/stdin"'. The main loop comes next. Input lines are read in
+succession. Lines that do not start with `@include' are printed
+verbatim. If the line does start with `@include', the file name is in
+`$2'. `pathto()' is called to generate the full path. If it cannot,
+then the program prints an error message and continues.
The next thing to check is if the file is included already. The
`processed' array is indexed by the full file name of each included
@@ -17769,7 +18446,7 @@ zero, the program is done:
}
fpath = pathto($2)
if (fpath == "") {
- printf("igawk:%s:%d: cannot find %s\n",
+ printf("igawk: %s:%d: cannot find %s\n",
input[stackptr], FNR, $2) > "/dev/stderr"
continue
}
@@ -17823,7 +18500,7 @@ supplied.
The `eval' command is a shell construct that reruns the shell's
parsing process. This keeps things properly quoted.
- This version of `igawk' represents my fifth version of this program.
+ This version of `igawk' represents the fifth version of this program.
There are four key simplifications that make the program work better:
* Using `@include' even for the files named with `-f' makes building
@@ -17853,26 +18530,6 @@ manipulation using the shell than it is in `awk'.
Finally, `igawk' shows that it is not always necessary to add new
features to a program; they can often be layered on top.
- As an additional example of this, consider the idea of having two
-files in a directory in the search path:
-
-`default.awk'
- This file contains a set of default library functions, such as
- `getopt()' and `assert()'.
-
-`site.awk'
- This file contains library functions that are specific to a site or
- installation; i.e., locally developed functions. Having a
- separate file allows `default.awk' to change with new `gawk'
- releases, without requiring the system administrator to update it
- each time by adding the local functions.
-
- One user suggested that `gawk' be modified to automatically read
-these files upon startup. Instead, it would be very simple to modify
-`igawk' to do this. Since `igawk' can process nested `@include'
-directives, `default.awk' could simply contain `@include' statements
-for the desired library functions.
-
---------- Footnotes ----------
(1) Fully explaining the `sh' language is beyond the scope of this
@@ -17997,7 +18654,129 @@ supplies the following copyright terms:
X*(X-x)-o*o,(x+X)*o*o+o,x*(X-x)-O-O,x-O+(O+o+X+x)*(o+O),X*X-X*(x-O)-x+O,
O+X*(o*(o+O)+O),+x+O+X*o,x*(x-o),(o+X+x)*o*o-(x-O-O),O+(X-x)*(X+O),x-O}'
- We leave it to you to determine what the program does.
+ We leave it to you to determine what the program does. (If you are
+truly desperate to understand it, see Chris Johansen's explanation,
+which is embedded in the Texinfo source file for this Info file.)
+
+
+File: gawk.info, Node: Programs Summary, Next: Programs Exercises, Prev: Miscellaneous Programs, Up: Sample Programs
+
+11.4 Summary
+============
+
+ * The functions provided in this major node and the previous one
+ continue on the theme that reading programs is an excellent way to
+ learn Good Programming.
+
+ * Using `#!' to make `awk' programs directly runnable makes them
+ easier to use. Otherwise, invoke the program using `awk -f ...'.
+
+ * Reimplementing standard POSIX programs in `awk' is a pleasant
+ exercise; `awk''s expressive power lets you write such programs in
+ relatively few lines of code, yet they are functionally complete
+ and usable.
+
+ * One of standard `awk''s weaknesses is working with individual
+ characters. The ability to use `split()' with the empty string as
+ the separator can considerably simplify such tasks.
+
+ * The library functions from *note Library Functions::, proved their
+ usefulness for a number of real (if small) programs.
+
+ * Besides reinventing POSIX wheels, other programs solved a
+ selection of interesting problems, such as finding duplicate
+ words in text, printing mailing labels, and finding anagrams.
+
+
+
+File: gawk.info, Node: Programs Exercises, Prev: Programs Summary, Up: Sample Programs
+
+11.5 Exercises
+==============
+
+ 1. Rewrite `cut.awk' (*note Cut Program::) using `split()' with `""'
+ as the separator.
+
+ 2. In *note Egrep Program::, we mentioned that `egrep -i' could be
+ simulated in versions of `awk' without `IGNORECASE' by using
+ `tolower()' on the line and the pattern. In a footnote there, we
+ also mentioned that this solution has a bug: the translated line is
+ output, and not the original one. Fix this problem.
+
+ 3. The POSIX version of `id' takes options that control which
+ information is printed. Modify the `awk' version (*note Id
+ Program::) to accept the same arguments and perform in the same
+ way.
+
+ 4. The `split.awk' program (*note Split Program::) uses the `chr()'
+ and `ord()' functions to move through the letters of the alphabet.
+ Modify the program to instead use only the `awk' built-in
+ functions, such as `index()' and `substr()'.
+
+ 5. The `split.awk' program (*note Split Program::) assumes that
+ letters are contiguous in the character set, which isn't true for
+ EBCDIC systems. Fix this problem.
+
+ 6. Why can't the `wc.awk' program (*note Wc Program::) just use the
+ value of `FNR' in `endfile()'? Hint: Examine the code in *note
+ Filetrans Function::.
+
+ 7. Manipulation of individual characters in the `translate' program
+ (*note Translate Program::) is painful using standard `awk'
+ functions. Given that `gawk' can split strings into individual
+ characters using `""' as the separator, how might you use this
+ feature to simplify the program?
+
+ 8. The `extract.awk' program (*note Extract Program::) was written
+ before `gawk' had the `gensub()' function. Use it to simplify the
+ code.
+
+ 9. Compare the performance of the `awksed.awk' program (*note Simple
+ Sed::) with the more straightforward:
+
+ BEGIN {
+ pat = ARGV[1]
+ repl = ARGV[2]
+ ARGV[1] = ARGV[2] = ""
+ }
+
+ { gsub(pat, repl); print }
+
+ 10. What are the advantages and disadvantages of `awksed.awk' versus
+ the real `sed' utility?
+
+ 11. In *note Igawk Program::, we mentioned that not trying to save the
+ line read with `getline' in the `pathto()' function when testing
+ for the file's accessibility for use with the main program
+ simplifies things considerably. What problem does this engender
+ though?
+
+ 12. As an additional example of the idea that it is not always
+ necessary to add new features to a program, consider the idea of
+ having two files in a directory in the search path:
+
+ `default.awk'
+ This file contains a set of default library functions, such
+ as `getopt()' and `assert()'.
+
+ `site.awk'
+ This file contains library functions that are specific to a
+ site or installation; i.e., locally developed functions.
+ Having a separate file allows `default.awk' to change with
+ new `gawk' releases, without requiring the system
+ administrator to update it each time by adding the local
+ functions.
+
+ One user suggested that `gawk' be modified to automatically read
+ these files upon startup. Instead, it would be very simple to
+ modify `igawk' to do this. Since `igawk' can process nested
+ `@include' directives, `default.awk' could simply contain
+ `@include' statements for the desired library functions. Make
+ this change.
+
+ 13. Modify `anagram.awk' (*note Anagram Program::), to avoid the use
+ of the external `sort' utility.
+

File: gawk.info, Node: Advanced Features, Next: Internationalization, Prev: Sample Programs, Up: Top
@@ -18043,6 +18822,7 @@ own:
* Two-way I/O:: Two-way communications with another process.
* TCP/IP Networking:: Using `gawk' for network programming.
* Profiling:: Profiling your `awk' programs.
+* Advanced Features Summary:: Summary of advanced features.

File: gawk.info, Node: Nondecimal Data, Next: Array Sorting, Up: Advanced Features
@@ -18066,8 +18846,8 @@ your data as numeric:
The `print' statement treats its expressions as strings. Although the
fields can act as numbers when necessary, they are still strings, so
-`print' does not try to treat them numerically. You may need to add
-zero to a field to force it to be treated as a number. For example:
+`print' does not try to treat them numerically. You need to add zero
+to a field to force it to be treated as a number. For example:
$ echo 0123 123 0x123 | gawk --non-decimal-data '
> { print $1, $2, $3
@@ -18082,7 +18862,7 @@ request it.
CAUTION: _Use of this option is not recommended._ It can break old
programs very badly. Instead, use the `strtonum()' function to
- convert your data (*note Nondecimal-numbers::). This makes your
+ convert your data (*note String Functions::). This makes your
programs easier to write and easier to read, and leads to less
surprising results.
@@ -18121,7 +18901,7 @@ you do this.
*note Controlling Scanning::, describes how you can assign special,
pre-defined values to `PROCINFO["sorted_in"]' in order to control the
-order in which `gawk' will traverse an array during a `for' loop.
+order in which `gawk' traverses an array during a `for' loop.
In addition, the value of `PROCINFO["sorted_in"]' can be a function
name. This lets you traverse an array based on any custom criterion.
@@ -18394,9 +19174,9 @@ become the values of the result array:
So far, so good. Now it starts to get interesting. Both `asort()'
and `asorti()' accept a third string argument to control comparison of
-array elements. In *note String Functions::, we ignored this third
-argument; however, the time has now come to describe how this argument
-affects these two functions.
+array elements. When we introduced `asort()' and `asorti()' in *note
+String Functions::, we ignored this third argument; however, now is the
+time to describe how this argument affects these two functions.
Basically, the third argument specifies how the array is to be
sorted. There are two possibilities. As with `PROCINFO["sorted_in"]',
@@ -18544,7 +19324,8 @@ the `gawk' program. Once all of the data has been read, `gawk'
terminates the coprocess and exits.
As a side note, the assignment `LC_ALL=C' in the `sort' command
-ensures traditional Unix (ASCII) sorting from `sort'.
+ensures traditional Unix (ASCII) sorting from `sort'. This is not
+strictly necessary here, but it's good to know how to do this.
You may also use pseudo-ttys (ptys) for two-way communication
instead of pipes, if your system supports them. This is done on a
@@ -18556,10 +19337,10 @@ per-command basis, by setting a special element in the `PROCINFO' array
print ... |& command # start two-way pipe
...
-Using ptys avoids the buffer deadlock issues described earlier, at some
-loss in performance. If your system does not have ptys, or if all the
-system's ptys are in use, `gawk' automatically falls back to using
-regular pipes.
+Using ptys usually avoids the buffer deadlock issues described earlier,
+at some loss in performance. If your system does not have ptys, or if
+all the system's ptys are in use, `gawk' automatically falls back to
+using regular pipes.
---------- Footnotes ----------
@@ -18644,7 +19425,7 @@ much more complete introduction and discussion, as well as extensive
examples.

-File: gawk.info, Node: Profiling, Prev: TCP/IP Networking, Up: Advanced Features
+File: gawk.info, Node: Profiling, Next: Advanced Features Summary, Prev: TCP/IP Networking, Up: Advanced Features
12.5 Profiling Your `awk' Programs
==================================
@@ -18840,7 +19621,7 @@ As usual, the profiled version of the program is written to
`awkprof.out', or to a different file if one is specified with the
`--profile' option.
- Along with the regular profile, as shown earlier, the profile
+ Along with the regular profile, as shown earlier, the profile file
includes a trace of any active functions:
# Function Call Stack:
@@ -18869,8 +19650,50 @@ by the `Ctrl-<\>' key.
called this way, `gawk' "pretty prints" the program into `awkprof.out',
without any execution counts.
- NOTE: The `--pretty-print' option still runs your program. This
- will change in the next major release.
+ NOTE: Once upon a time, the `--pretty-print' option would also run
+ your program. This is no longer the case.
+
+
+File: gawk.info, Node: Advanced Features Summary, Prev: Profiling, Up: Advanced Features
+
+12.6 Summary
+============
+
+ * The `--non-decimal-data' option causes `gawk' to treat octal- and
+ hexadecimal-looking input data as octal and hexadecimal. This
+ option should be used with caution or not at all; use of
+ `strtonum()' is preferable.
+
+ * You can take over complete control of sorting in `for (INDX in
+ ARRAY)' array traversal by setting `PROCINFO["sorted_in"]' to the
+ name of a user-defined function that does the comparison of array
+ elements based on index and value.
+
+ * Similarly, you can supply the name of a user-defined comparison
+ function as the third argument to either `asort()' or `asorti()'
+ to control how those functions sort arrays. Or you may provide one
+ of the predefined control strings that work for
+ `PROCINFO["sorted_in"]'.
+
+ * You can use the `|&' operator to create a two-way pipe to a
+ co-process. You read from the co-process with `getline' and write
+ to it with `print' or `printf'. Use `close()' to close off the
+ co-process completely, or optionally, close off one side of the
+ two-way communications.
+
+ * By using special "file names" with the `|&' operator, you can open
+ a TCP/IP (or UDP/IP) connection to remote hosts in the Internet.
+ `gawk' supports both IPv4 and IPv6.
+
+ * You can generate statement count profiles of your program. This
+ can help you determine which parts of your program may be taking
+ the most time and let you tune them more easily. Sending the
+ `USR1' signal while profiling causes `gawk' to dump the profile
+ and keep going, including a function call stack.
+
+ * You can also just "pretty print" the program. This no longer
+ runs the program, as it once did.
+

File: gawk.info, Node: Internationalization, Next: Debugger, Prev: Advanced Features, Up: Top
@@ -18902,6 +19725,7 @@ requirement.
* Translator i18n:: Features for the translator.
* I18N Example:: A simple i18n example.
* Gawk I18N:: `gawk' is also internationalized.
+* I18N Summary:: Summary of I18N stuff.

File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internationalization
@@ -18924,6 +19748,7 @@ File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N
13.2 GNU `gettext'
==================
+`gawk' uses GNU `gettext' to provide its internationalization features.
The facilities in GNU `gettext' focus on messages; strings printed by a
program, either directly or via formatting with `printf' or
`sprintf()'.(1)
@@ -19070,7 +19895,7 @@ internationalization:
for translation at runtime. String constants without a leading
underscore are not translated.
-`dcgettext(STRING [, DOMAIN [, CATEGORY]])'
+``dcgettext(STRING' [`,' DOMAIN [`,' CATEGORY]]`)''
Return the translation of STRING in text domain DOMAIN for locale
category CATEGORY. The default value for DOMAIN is the current
value of `TEXTDOMAIN'. The default value for CATEGORY is
@@ -19087,7 +19912,7 @@ internationalization:
be simple and to allow for reasonable `awk'-style default
arguments.
-`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])'
+``dcngettext(STRING1, STRING2, NUMBER' [`,' DOMAIN [`,' CATEGORY]]`)''
Return the plural form used for NUMBER of the translation of
STRING1 and STRING2 in text domain DOMAIN for locale category
CATEGORY. STRING1 is the English singular variant of a message,
@@ -19098,7 +19923,7 @@ internationalization:
The same remarks about argument order as for the `dcgettext()'
function apply.
-`bindtextdomain(DIRECTORY [, DOMAIN])'
+``bindtextdomain(DIRECTORY' [`,' DOMAIN ]`)''
Change the directory in which `gettext' looks for `.gmo' files, in
case they will not or cannot be placed in the standard locations
(e.g., during testing). Return the directory in which DOMAIN is
@@ -19403,19 +20228,20 @@ Following are the translations:
msgstr "Like, the scoop is"
The next step is to make the directory to hold the binary message
-object file and then to create the `guide.gmo' file. The directory
+object file and then to create the `guide.mo' file. We pretend that
+our file is to be used in the `en_US.UTF-8' locale. The directory
layout shown here is standard for GNU `gettext' on GNU/Linux systems.
Other versions of `gettext' may use a different layout:
- $ mkdir en_US en_US/LC_MESSAGES
+ $ mkdir en_US.UTF-8 en_US.UTF-8/LC_MESSAGES
The `msgfmt' utility does the conversion from human-readable `.po'
-file to machine-readable `.gmo' file. By default, `msgfmt' creates a
+file to machine-readable `.mo' file. By default, `msgfmt' creates a
file named `messages'. This file must be renamed and placed in the
proper directory so that `gawk' can find it:
$ msgfmt guide-mellow.po
- $ mv messages en_US/LC_MESSAGES/guide.gmo
+ $ mv messages en_US.UTF-8/LC_MESSAGES/guide.mo
Finally, we run the program to test it:
@@ -19438,7 +20264,7 @@ and `bindtextdomain()' (*note I18N Portability::) are in a file named
(1) Perhaps it would be better if it were called "Hippy." Ah, well.

-File: gawk.info, Node: Gawk I18N, Prev: I18N Example, Up: Internationalization
+File: gawk.info, Node: Gawk I18N, Next: I18N Summary, Prev: I18N Example, Up: Internationalization
13.6 `gawk' Can Speak Your Language
===================================
@@ -19446,13 +20272,46 @@ File: gawk.info, Node: Gawk I18N, Prev: I18N Example, Up: Internationalizatio
`gawk' itself has been internationalized using the GNU `gettext'
package. (GNU `gettext' is described in complete detail in *note (GNU
`gettext' utilities)Top:: gettext, GNU gettext tools.) As of this
-writing, the latest version of GNU `gettext' is version 0.18.2.1
-(ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.2.1.tar.gz).
+writing, the latest version of GNU `gettext' is version 0.19.1
+(ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.1.tar.gz).
If a translation of `gawk''s messages exists, then `gawk' produces
usage messages, warnings, and fatal errors in the local language.

+File: gawk.info, Node: I18N Summary, Prev: Gawk I18N, Up: Internationalization
+
+13.7 Summary
+============
+
+ * Internationalization means writing a program such that it can use
+ multiple languages without requiring source-code changes.
+ Localization means providing the data necessary for an
+ internationalized program to work in a particular language.
+
+ * `gawk' uses GNU `gettext' to let you internationalize and localize
+ `awk' programs. A program's text domain identifies the program
+ for grouping all messages and other data together.
+
+ * You mark a program's strings for translation by preceding them with
+ an underscore. Once that is done, the strings are extracted into a
+ `.pot' file. This file is copied for each language into a `.po'
+ file, and the `.po' files are compiled into `.gmo' files for use
+ at runtime.
+
+ * You can use position specifications with `sprintf()' and `printf'
+ to rearrange the placement of argument values in formatted strings
+ and output. This is useful for the translations of format control
+ strings.
+
+ * The internationalization features have been designed so that they
+ can be easily worked around in a standard `awk'.
+
+ * `gawk' itself has been internationalized and ships with a number
+ of translations for its messages.
+
+
+
File: gawk.info, Node: Debugger, Next: Arbitrary Precision Arithmetic, Prev: Internationalization, Up: Top
14 Debugging `awk' Programs
@@ -19475,12 +20334,13 @@ program is easy.
* List of Debugger Commands:: Main debugger commands.
* Readline Support:: Readline support.
* Limitations:: Limitations and future plans.
+* Debugging Summary:: Debugging summary.

File: gawk.info, Node: Debugging, Next: Sample Debugging Session, Up: Debugger
-14.1 Introduction to `gawk' Debugger
-====================================
+14.1 Introduction to the `gawk' Debugger
+========================================
This minor node introduces debugging in general and begins the
discussion of debugging in `gawk'.
@@ -19766,11 +20626,7 @@ typing `n' (for "next"):
decides whether to give the lines the special "field skipping" treatment
indicated by the `-f' command-line option. (Notice that we skipped
from where we were before at line 64 to here, since the condition in
-line 64
-
- if (fcount == 0 && charcount == 0)
-
-was false.)
+line 64 `if (fcount == 0 && charcount == 0)' was false.)
Continuing to step, we now get to the splitting of the current and
last records:
@@ -19876,7 +20732,7 @@ following descriptions, commands which may be abbreviated show the
abbreviation on a second description line. A debugger command name may
also be truncated if that partial name is unambiguous. The debugger has
the built-in capability to automatically repeat the previous command
-when just hitting <Enter>. This works for the commands `list', `next',
+just by hitting <Enter>. This works for the commands `list', `next',
`nexti', `step', `stepi' and `continue' executed without any argument.
* Menu:
@@ -20136,7 +20992,7 @@ AWK STATEMENTS
`set' VAR`='VALUE
Assign a constant (number or string) value to an `awk' variable or
field. String values must be enclosed between double quotes
- (`"..."').
+ (`"'...`"').
You can also set special `awk' variables, such as `FS', `NF',
`NR', etc.
@@ -20190,11 +21046,12 @@ are:
`frame' [N]
`f' [N]
- Select and print (frame number, function and argument names,
- source file, and the source line) stack frame N. Frame 0 is the
- currently executing, or "innermost", frame (function call), frame
- 1 is the frame that called the innermost one. The highest numbered
- frame is the one for the main program.
+ Select and print stack frame N. Frame 0 is the currently
+ executing, or "innermost", frame (function call), frame 1 is the
+ frame that called the innermost one. The highest numbered frame is
+ the one for the main program. The printed information consists of
+ the frame number, function and argument names, source file, and
+ the source line.
`up' [COUNT]
Move COUNT (default 1) frames up the stack toward the outermost
@@ -20279,16 +21136,16 @@ from a file. The commands are:
`prompt'
The debugger prompt. The default is `gawk> '.
- `save_history [on | off]'
+ `save_history' [`on' | `off']
Save command history to file `./.gawk_history'. The default
is `on'.
- `save_options [on | off]'
+ `save_options' [`on' | `off']
Save current options to file `./.gawkrc' upon exit. The
default is `on'. Options are read back in to the next
session upon startup.
- `trace [on | off]'
+ `trace' [`on' | `off']
Turn instruction tracing on or off. The default is `off'.
`save' FILENAME
@@ -20327,7 +21184,7 @@ categories, as follows:
Program::) demonstrates:
gawk> dump
- -| # BEGIN
+ -| # BEGIN
-|
-| [ 1:0xfcd340] Op_rule : [in_rule = BEGIN] [source_file = brini.awk]
-| [ 1:0xfcc240] Op_push_i : "~" [MALLOC|STRING|STRCUR]
@@ -20417,7 +21274,7 @@ categories, as follows:
accidentally type `q' or `quit', to make sure you really want to
quit.
-`trace' `on' | `off'
+`trace' [`on' | `off']
Turn on or off a continuous printing of instructions which are
about to be executed, along with printing the `awk' line which they
implement. The default is `off'.
@@ -20433,9 +21290,10 @@ File: gawk.info, Node: Readline Support, Next: Limitations, Prev: List of Deb
14.4 Readline Support
=====================
-If `gawk' is compiled with the `readline' library, you can take
-advantage of that library's command completion and history expansion
-features. The following types of completion are available:
+If `gawk' is compiled with the `readline' library
+(http://cnswww.cns.cwru.edu/php/chet/readline/readline.html), you can
+take advantage of that library's command completion and history
+expansion features. The following types of completion are available:
Command completion
Command names.
@@ -20455,7 +21313,7 @@ Variable name completion

-File: gawk.info, Node: Limitations, Prev: Readline Support, Up: Debugger
+File: gawk.info, Node: Limitations, Next: Debugging Summary, Prev: Readline Support, Up: Debugger
14.5 Limitations and Future Plans
=================================
@@ -20474,15 +21332,14 @@ some limitations. A few which are worth being aware of are:
you will realize that much of the internal manipulation of data in
`gawk', as in many interpreters, is done on a stack. `Op_push',
`Op_pop', etc., are the "bread and butter" of most `gawk' code.
- Unfortunately, as of now, the `gawk' debugger does not allow you
- to examine the stack's contents.
- That is, the intermediate results of expression evaluation are on
- the stack, but cannot be printed. Rather, only variables which
- are defined in the program can be printed. Of course, a
- workaround for this is to use more explicit variables at the
- debugging stage and then change back to obscure, perhaps more
- optimal code later.
+ Unfortunately, as of now, the `gawk' debugger does not allow you
+ to examine the stack's contents. That is, the intermediate
+ results of expression evaluation are on the stack, but cannot be
+ printed. Rather, only variables which are defined in the program
+ can be printed. Of course, a workaround for this is to use more
+ explicit variables at the debugging stage and then change back to
+ obscure, perhaps more optimal code later.
* There is no way to look "inside" the process of compiling regular
expressions to see if you got it right. As an `awk' programmer,
@@ -20503,362 +21360,319 @@ features may be added, and of course feel free to try to add them
yourself!

-File: gawk.info, Node: Arbitrary Precision Arithmetic, Next: Dynamic Extensions, Prev: Debugger, Up: Top
+File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger
-15 Arithmetic and Arbitrary Precision Arithmetic with `gawk'
-************************************************************
+14.6 Summary
+============
- There's a credibility gap: We don't know how much of the
- computer's answers to believe. Novice computer users solve this
- problem by implicitly trusting in the computer as an infallible
- authority; they tend to believe that all digits of a printed
- answer are significant. Disillusioned computer users have just the
- opposite approach; they are constantly afraid that their answers
- are almost meaningless.(1) -- Donald Knuth
+ * Programs rarely work correctly the first time. Finding bugs is
+ "debugging" and a program that helps you find bugs is a
+ "debugger". `gawk' has a built-in debugger that works very
+ similarly to the GNU Debugger, GDB.
- This major node discusses issues that you may encounter when
-performing arithmetic. It begins by discussing some of the general
-attributes of computer arithmetic, along with how this can influence
-what you see when running `awk' programs. This discussion applies to
-all versions of `awk'.
+ * Debuggers let you step through your program one statement at a
+ time, examine and change variable and array values, and do a
+ number of other things that let you understand what your program is
+ actually doing (as opposed to what it is supposed to do).
- The major node then moves on to describe "arbitrary precision
-arithmetic", a feature which is specific to `gawk'.
-
-* Menu:
+ * Like most debuggers, the `gawk' debugger works in terms of stack
+ frames, and lets you set both breakpoints (stop at a point in the
+ code) and watchpoints (stop when a data value changes).
-* General Arithmetic:: An introduction to computer arithmetic.
-* Floating-point Programming:: Effective Floating-point Programming.
-* Gawk and MPFR:: How `gawk' provides
- arbitrary-precision arithmetic.
-* Arbitrary Precision Floats:: Arbitrary Precision Floating-point Arithmetic
- with `gawk'.
-* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
- `gawk'.
+ * The debugger command set is fairly complete, providing control over
+ breakpoints, execution, viewing and changing data, working with
+ the stack, getting information, and other tasks.
- ---------- Footnotes ----------
+ * If the `readline' library is available when `gawk' is compiled, it
+ is used by the debugger to provide command-line history and
+ editing.
- (1) Donald E. Knuth. `The Art of Computer Programming'. Volume 2,
-`Seminumerical Algorithms', third edition, 1998, ISBN 0-201-89683-4, p.
-229.

-File: gawk.info, Node: General Arithmetic, Next: Floating-point Programming, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Arbitrary Precision Arithmetic, Next: Dynamic Extensions, Prev: Debugger, Up: Top
-15.1 A General Description of Computer Arithmetic
-=================================================
+15 Arithmetic and Arbitrary Precision Arithmetic with `gawk'
+************************************************************
-Within computers, there are two kinds of numeric values: "integers" and
-"floating-point". In school, integer values were referred to as
-"whole" numbers--that is, numbers without any fractional part, such as
-1, 42, or -17. The advantage to integer numbers is that they represent
-values exactly. The disadvantage is that their range is limited. On
-most systems, this range is -2,147,483,648 to 2,147,483,647. However,
-many systems now support a range from -9,223,372,036,854,775,808 to
-9,223,372,036,854,775,807.
-
- Integer values come in two flavors: "signed" and "unsigned". Signed
-values may be negative or positive, with the range of values just
-described. Unsigned values are always positive. On most systems, the
-range is from 0 to 4,294,967,295. However, many systems now support a
-range from 0 to 18,446,744,073,709,551,615.
-
- Floating-point numbers represent what are called "real" numbers;
-i.e., those that do have a fractional part, such as 3.1415927. The
-advantage to floating-point numbers is that they can represent a much
-larger range of values. The disadvantage is that there are numbers
-that they cannot represent exactly. `awk' uses "double precision"
-floating-point numbers, which can hold more digits than "single
-precision" floating-point numbers.
-
- There a several important issues to be aware of, described next.
+This major node introduces some basic concepts relating to how
+computers do arithmetic and briefly lists the features in `gawk' for
+performing arbitrary precision floating point computations. It then
+proceeds to describe floating-point arithmetic, which is what `awk'
+uses for all its computations, including a discussion of arbitrary
+precision floating point arithmetic, which is a feature available only
+in `gawk'. It continues on to present arbitrary precision integers, and
+concludes with a description of some points where `gawk' and the POSIX
+standard are not quite in agreement.
* Menu:
-* Floating Point Issues:: Stuff to know about floating-point numbers.
-* Integer Programming:: Effective integer programming.
+* Computer Arithmetic:: A quick intro to computer math.
+* Math Definitions:: Defining terms used.
+* MPFR features:: The MPFR features in `gawk'.
+* FP Math Caution:: Things to know.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
+ `gawk'.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* Floating point summary:: Summary of floating point discussion.

-File: gawk.info, Node: Floating Point Issues, Next: Integer Programming, Up: General Arithmetic
+File: gawk.info, Node: Computer Arithmetic, Next: Math Definitions, Up: Arbitrary Precision Arithmetic
-15.1.1 Floating-Point Number Caveats
-------------------------------------
-
-This minor node describes some of the issues involved in using
-floating-point numbers.
+15.1 A General Description of Computer Arithmetic
+=================================================
- There is a very nice paper on floating-point arithmetic
-(http://www.validlab.com/goldberg/paper.pdf) by David Goldberg, "What
-Every Computer Scientist Should Know About Floating-point Arithmetic,"
-`ACM Computing Surveys' *23*, 1 (1991-03), 5-48. This is worth reading
-if you are interested in the details, but it does require a background
-in computer science.
+Until now, we have worked with data as either numbers or strings.
+Ultimately, however, computers represent everything in terms of "binary
+digits", or "bits". A decimal digit can take on any of 10 values: zero
+through nine. A binary digit can take on any of two values, zero or
+one. Using binary, computers (and computer software) can represent and
+manipulate numerical and character data. In general, the more bits you
+can use to represent a particular thing, the greater the range of
+possible values it can take on.
+
+ Modern computers support at least two, and often more, ways to do
+arithmetic. Each kind of arithmetic uses a different representation
+(organization of the bits) for the numbers. The kinds of arithmetic
+that interest us are:
+
+Decimal arithmetic
+ This is the kind of arithmetic you learned in elementary school,
+ using paper and pencil (and/or a calculator). In theory, numbers
+ can have an arbitrary number of digits on either side (or both
+ sides) of the decimal point, and the results of a computation are
+ always exact.
+
+ Some modern systems can do decimal arithmetic in hardware, but
+ usually you need a special software library to provide access to
+ these instructions. There are also libraries that do decimal
+ arithmetic entirely in software.
+
+ Despite the fact that some users expect `gawk' to be performing
+ decimal arithmetic,(1) it does not do so.
+
+Integer arithmetic
+ In school, integer values were referred to as "whole" numbers--that
+ is, numbers without any fractional part, such as 1, 42, or -17.
+ The advantage to integer numbers is that they represent values
+ exactly. The disadvantage is that their range is limited.
+
+ In computers, integer values come in two flavors: "signed" and
+ "unsigned". Signed values may be negative or positive, whereas
+ unsigned values are always positive (that is, greater than or equal
+ to zero).
+
+ In computer systems, integer arithmetic is exact, but the possible
+ range of values is limited. Integer arithmetic is generally
+ faster than floating point arithmetic.
+
+Floating point arithmetic
+ Floating-point numbers represent what were called in school "real"
+ numbers; i.e., those that have a fractional part, such as
+ 3.1415927. The advantage to floating-point numbers is that they
+ can represent a much larger range of values than can integers.
+ The disadvantage is that there are numbers that they cannot
+ represent exactly.
+
+ Modern systems support floating point arithmetic in hardware, with
+ a limited range of values. There are software libraries that allow
+ the use of arbitrary precision floating point calculations.
+
+ POSIX `awk' uses "double precision" floating-point numbers, which
+ can hold more digits than "single precision" floating-point
+ numbers. `gawk' has facilities for performing arbitrary precision
+ floating point arithmetic, which we describe in more detail
+ shortly.
+
+ Computers work with integer and floating point values of different
+ranges. Integer values are usually either 32 or 64 bits in size. Single
+precision floating point values occupy 32 bits, whereas double precision
+floating point values occupy 64 bits. Floating point values are always
+signed. The possible ranges of values are shown in the following table.
+
+Numeric representation      Minimum value                Maximum value
+---------------------------------------------------------------------------
+32-bit signed integer       -2,147,483,648               2,147,483,647
+32-bit unsigned integer     0                            4,294,967,295
+64-bit signed integer       -9,223,372,036,854,775,808   9,223,372,036,854,775,807
+64-bit unsigned integer     0                            18,446,744,073,709,551,615
+Single precision            `1.175494e-38'               `3.402823e+38'
+floating point
+(approximate)
+Double precision            `2.225074e-308'              `1.797693e+308'
+floating point
+(approximate)
-* Menu:
+ ---------- Footnotes ----------
-* String Conversion Precision:: The String Value Can Lie.
-* Unexpected Results:: Floating Point Numbers Are Not Abstract
- Numbers.
-* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+ (1) We don't know why they expect this, but they do.

-File: gawk.info, Node: String Conversion Precision, Next: Unexpected Results, Up: Floating Point Issues
+File: gawk.info, Node: Math Definitions, Next: MPFR features, Prev: Computer Arithmetic, Up: Arbitrary Precision Arithmetic
-15.1.1.1 The String Value Can Lie
-.................................
+15.2 Other Stuff To Know
+========================
-Internally, `awk' keeps both the numeric value (double precision
-floating-point) and the string value for a variable. Separately, `awk'
-keeps track of what type the variable has (*note Typing and
-Comparison::), which plays a role in how variables are used in
-comparisons.
+The rest of this major node uses a number of terms. Here are some
+informal definitions that should help you work your way through the
+material here.
- It is important to note that the string value for a number may not
-reflect the full value (all the digits) that the numeric value actually
-contains. The following program, `values.awk', illustrates this:
+"Accuracy"
+ A floating-point calculation's accuracy is how close it comes to
+ the real (paper and pencil) value.
- {
- sum = $1 + $2
- # see it for what it is
- printf("sum = %.12g\n", sum)
- # use CONVFMT
- a = "<" sum ">"
- print "a =", a
- # use OFMT
- print "sum =", sum
- }
+"Error"
+ The difference between what the result of a computation "should be"
+ and what it actually is. It is best to minimize error as much as
+ possible.
-This program shows the full value of the sum of `$1' and `$2' using
-`printf', and then prints the string values obtained from both
-automatic conversion (via `CONVFMT') and from printing (via `OFMT').
+"Exponent"
+ The order of magnitude of a value; some number of bits in a
+ floating-point value store the exponent.
- Here is what happens when the program is run:
+"Inf"
+ A special value representing infinity. Operations involving another
+ number and infinity produce infinity.
- $ echo 3.654321 1.2345678 | awk -f values.awk
- -| sum = 4.8888888
- -| a = <4.88889>
- -| sum = 4.88889
+"NaN"
+ "Not A Number." A special value indicating a result that can't
+ happen in real math, but that can happen in floating-point
+ computations.
- This makes it clear that the full numeric value is different from
-what the default string representations show.
+"Normalized"
+ How the significand (see later in this list) is usually stored. The
+ value is adjusted so that the first bit is one, and then that
+ leading one is assumed instead of physically stored. This
+ provides one extra bit of precision.
- `CONVFMT''s default value is `"%.6g"', which yields a value with at
-most six significant digits. For some applications, you might want to
-change it to specify more precision. On most modern machines, most of
-the time, 17 digits is enough to capture a floating-point number's
-value exactly.(1)
+"Precision"
+ The number of bits used to represent a floating-point number. The
+ more bits, the more digits you can represent. Binary and decimal
+ precisions are related approximately, according to the formula:
- ---------- Footnotes ----------
+ PREC = 3.322 * DPS
- (1) Pathological cases can require up to 752 digits (!), but we
-doubt that you need to worry about this.
-
-
-File: gawk.info, Node: Unexpected Results, Next: POSIX Floating Point Problems, Prev: String Conversion Precision, Up: Floating Point Issues
+ Here, PREC denotes the binary precision (measured in bits) and DPS
+ (short for decimal places) is the number of decimal digits.
-15.1.1.2 Floating Point Numbers Are Not Abstract Numbers
-........................................................
+"Rounding mode"
+ How numbers are rounded up or down when necessary. More details
+ are provided later.
-Unlike numbers in the abstract sense (such as what you studied in high
-school or college arithmetic), numbers stored in computers are limited
-in certain ways. They cannot represent an infinite number of digits,
-nor can they always represent things exactly. In particular,
-floating-point numbers cannot always represent values exactly. Here is
-an example:
+"Significand"
+ A floating point value consists of the significand multiplied by 10
+ to the power of the exponent. For example, in `1.2345e67', the
+ significand is `1.2345'.
- $ awk '{ printf("%010d\n", $1 * 100) }'
- 515.79
- -| 0000051579
- 515.80
- -| 0000051579
- 515.81
- -| 0000051580
- 515.82
- -| 0000051582
- Ctrl-d
+"Stability"
+ From the Wikipedia article on numerical stability
+ (http://en.wikipedia.org/wiki/Numerical_stability): "Calculations
+ that can be proven not to magnify approximation errors are called
+ "numerically stable"."
-This shows that some values can be represented exactly, whereas others
-are only approximated. This is not a "bug" in `awk', but simply an
-artifact of how computers represent numbers.
+ See the Wikipedia article on accuracy and precision
+(http://en.wikipedia.org/wiki/Accuracy_and_precision) for more
+information on some of those terms.
- NOTE: It cannot be emphasized enough that the behavior just
- described is fundamental to modern computers. You will see this
- kind of thing happen in _any_ programming language using hardware
- floating-point numbers. It is _not_ a bug in `gawk', nor is it
- something that can be "just fixed."
+ On modern systems, floating-point hardware uses the representation
+and operations defined by the IEEE 754 standard. Three of the standard
+IEEE 754 types are 32-bit single precision, 64-bit double precision and
+128-bit quadruple precision. The standard also specifies extended
+precision formats to allow greater precisions and larger exponent
+ranges. (`awk' uses only the 64-bit double precision format.)
- Another peculiarity of floating-point numbers on modern systems is
-that they often have more than one representation for the number zero!
-In particular, it is possible to represent "minus zero" as well as
-regular, or "positive" zero.
+ *note table-ieee-formats:: lists the precision and exponent field
+values for the basic IEEE 754 binary formats:
- This example shows that negative and positive zero are distinct
-values when stored internally, but that they are in fact equal to each
-other, as well as to "regular" zero:
+Name Total bits Precision emin emax
+---------------------------------------------------------------------------
+Single 32 24 -126 +127
+Double 64 53 -1022 +1023
+Quadruple 128 113 -16382 +16383
- $ gawk 'BEGIN { mz = -0 ; pz = 0
- > printf "-0 = %g, +0 = %g, (-0 == +0) -> %d\n", mz, pz, mz == pz
- > printf "mz == 0 -> %d, pz == 0 -> %d\n", mz == 0, pz == 0
- > }'
- -| -0 = -0, +0 = 0, (-0 == +0) -> 1
- -| mz == 0 -> 1, pz == 0 -> 1
+Table 15.1: Basic IEEE Format Context Values
- It helps to keep this in mind should you process numeric data that
-contains negative zero values; the fact that the zero is negative is
-noted and can affect comparisons.
+ NOTE: The precision numbers include the implied leading one that
+ gives them one extra bit of significand.
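
As a rough cross-check of the table against the formula PREC = 3.322 * DPS
given earlier, the relation can be inverted to estimate how many decimal
digits each basic format carries. This is only a sketch: `bits_to_digits'
is a hypothetical helper name, and it uses nothing beyond plain POSIX `awk'.

```shell
# Sketch only: bits_to_digits is an illustrative helper, not part of gawk.
# It inverts PREC = 3.322 * DPS to estimate decimal digits from bits.
bits_to_digits() {
    awk -v prec="$1" 'BEGIN { printf "%.2f\n", prec / 3.322 }'
}

# The three precisions listed in the IEEE 754 table above.
for prec in 24 53 113; do
    printf '%3d bits ~ %s decimal digits\n' "$prec" "$(bits_to_digits "$prec")"
done
```

The results come out to roughly 7, 16, and 34 decimal digits for single,
double, and quadruple precision respectively.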

-File: gawk.info, Node: POSIX Floating Point Problems, Prev: Unexpected Results, Up: Floating Point Issues
-
-15.1.1.3 Standards Versus Existing Practice
-...........................................
+File: gawk.info, Node: MPFR features, Next: FP Math Caution, Prev: Math Definitions, Up: Arbitrary Precision Arithmetic
-Historically, `awk' has converted any non-numeric looking string to the
-numeric value zero, when required. Furthermore, the original
-definition of the language and the original POSIX standards specified
-that `awk' only understands decimal numbers (base 10), and not octal
-(base 8) or hexadecimal numbers (base 16).
-
- Changes in the language of the 2001 and 2004 POSIX standards can be
-interpreted to imply that `awk' should support additional features.
-These features are:
+15.3 Arbitrary Precision Arithmetic Features In `gawk'
+======================================================
- * Interpretation of floating point data values specified in
- hexadecimal notation (`0xDEADBEEF'). (Note: data values, _not_
- source code constants.)
+By default, `gawk' uses the double precision floating point values
+supplied by the hardware of the system it runs on. However, if it was
+compiled to do so, `gawk' uses the GNU MPFR (http://www.mpfr.org) and GNU
+MP (http://gmplib.org) (GMP) libraries for arbitrary precision
+arithmetic on numbers. You can see if MPFR support is available like
+so:
- * Support for the special IEEE 754 floating point values "Not A
- Number" (NaN), positive Infinity ("inf") and negative Infinity
- ("-inf"). In particular, the format for these values is as
- specified by the ISO 1999 C standard, which ignores case and can
- allow machine-dependent additional characters after the `nan' and
- allow either `inf' or `infinity'.
+ $ gawk --version
+ -| GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.0-p3, GNU MP 5.0.2)
+ -| Copyright (C) 1989, 1991-2014 Free Software Foundation.
+ ...
- The first problem is that both of these are clear changes to
-historical practice:
+(You may see different version numbers than what's shown here. That's
+OK; what's important is to see that GNU MPFR and GNU MP are listed in
+the output.)
- * The `gawk' maintainer feels that supporting hexadecimal floating
- point values, in particular, is ugly, and was never intended by the
- original designers to be part of the language.
+ Additionally, there are a few elements available in the `PROCINFO'
+array to provide information about the MPFR and GMP libraries (*note
+Auto-set::).
- * Allowing completely alphabetic strings to have valid numeric
- values is also a very severe departure from historical practice.
+ The MPFR library provides precise control over precisions and
+rounding modes, and gives correctly rounded, reproducible,
+platform-independent results. With either of the command-line options
+`--bignum' or `-M', all floating-point arithmetic operators and numeric
+functions can yield results to any desired precision level supported by
+MPFR.
- The second problem is that the `gawk' maintainer feels that this
-interpretation of the standard, which requires a certain amount of
-"language lawyering" to arrive at in the first place, was not even
-intended by the standard developers. In other words, "we see how you
-got where you are, but we don't think that that's where you want to be."
+ Two built-in variables, `PREC' and `ROUNDMODE', provide control over
+the working precision and the rounding mode. The precision and the
+rounding mode are set globally for every operation to follow. *Note
+Auto-set::, for more information.
- Recognizing the above issues, but attempting to provide compatibility
-with the earlier versions of the standard, the 2008 POSIX standard
-added explicit wording to allow, but not require, that `awk' support
-hexadecimal floating point values and special values for "Not A Number"
-and infinity.
+
+File: gawk.info, Node: FP Math Caution, Next: Arbitrary Precision Integers, Prev: MPFR features, Up: Arbitrary Precision Arithmetic
- Although the `gawk' maintainer continues to feel that providing
-those features is inadvisable, nevertheless, on systems that support
-IEEE floating point, it seems reasonable to provide _some_ way to
-support NaN and Infinity values. The solution implemented in `gawk' is
-as follows:
+15.4 Floating Point Arithmetic: Caveat Emptor!
+==============================================
- * With the `--posix' command-line option, `gawk' becomes "hands
- off." String values are passed directly to the system library's
- `strtod()' function, and if it successfully returns a numeric
- value, that is what's used.(1) By definition, the results are not
- portable across different systems. They are also a little
- surprising:
+ Math class is tough! -- Late 1980's Barbie
- $ echo nanny | gawk --posix '{ print $1 + 0 }'
- -| nan
- $ echo 0xDeadBeef | gawk --posix '{ print $1 + 0 }'
- -| 3735928559
+ This minor node provides a high level overview of the issues
+involved when doing lots of floating-point arithmetic.(1) The
+discussion applies to both hardware and arbitrary-precision
+floating-point arithmetic.
- * Without `--posix', `gawk' interprets the four strings `+inf',
- `-inf', `+nan', and `-nan' specially, producing the corresponding
- special numeric values. The leading sign acts a signal to `gawk'
- (and the user) that the value is really numeric. Hexadecimal
- floating point is not supported (unless you also use
- `--non-decimal-data', which is _not_ recommended). For example:
+ CAUTION: The material here is purposely general. If you need to do
+ serious computer arithmetic, you should do some research first,
+ and not rely just on what we tell you.
- $ echo nanny | gawk '{ print $1 + 0 }'
- -| 0
- $ echo +nan | gawk '{ print $1 + 0 }'
- -| nan
- $ echo 0xDeadBeef | gawk '{ print $1 + 0 }'
- -| 0
+* Menu:
- `gawk' does ignore case in the four special values. Thus `+nan'
- and `+NaN' are the same.
+* Inexactness of computations:: Floating point math is not exact.
+* Getting Accuracy:: Getting more accuracy takes some work.
+* Try To Round:: Add digits and round.
+* Setting precision:: How to set the precision.
+* Setting the rounding mode:: How to set the rounding mode.
---------- Footnotes ----------
- (1) You asked for it, you got it.
+ (1) There is a very nice paper on floating-point arithmetic
+(http://www.validlab.com/goldberg/paper.pdf) by David Goldberg, "What
+Every Computer Scientist Should Know About Floating-point Arithmetic,"
+`ACM Computing Surveys' *23*, 1 (1991-03), 5-48. This is worth reading
+if you are interested in the details, but it does require a background
+in computer science.

-File: gawk.info, Node: Integer Programming, Prev: Floating Point Issues, Up: General Arithmetic
+File: gawk.info, Node: Inexactness of computations, Next: Getting Accuracy, Up: FP Math Caution
-15.1.2 Mixing Integers And Floating-point
------------------------------------------
-
-As has been mentioned already, `awk' uses hardware double precision
-with 64-bit IEEE binary floating-point representation for numbers on
-most systems. A large integer like 9,007,199,254,740,997 has a binary
-representation that, although finite, is more than 53 bits long; it
-must also be rounded to 53 bits. The biggest integer that can be
-stored in a C `double' is usually the same as the largest possible
-value of a `double'. If your system `double' is an IEEE 64-bit
-`double', this largest possible value is an integer and can be
-represented precisely. What more should one know about integers?
-
- If you want to know what is the largest integer, such that it and
-all smaller integers can be stored in 64-bit doubles without losing
-precision, then the answer is 2^53. The next representable number is
-the even number 2^53 + 2, meaning it is unlikely that you will be able
-to make `gawk' print 2^53 + 1 in integer format. The range of integers
-exactly representable by a 64-bit double is [-2^53, 2^53]. If you ever
-see an integer outside this range in `awk' using 64-bit doubles, you
-have reason to be very suspicious about the accuracy of the output.
-Here is a simple program with erroneous output:
-
- $ gawk 'BEGIN { i = 2^53 - 1; for (j = 0; j < 4; j++) print i + j }'
- -| 9007199254740991
- -| 9007199254740992
- -| 9007199254740992
- -| 9007199254740994
-
- The lesson is to not assume that any large integer printed by `awk'
-represents an exact result from your computation, especially if it wraps
-around on your screen.
-
-
-File: gawk.info, Node: Floating-point Programming, Next: Gawk and MPFR, Prev: General Arithmetic, Up: Arbitrary Precision Arithmetic
-
-15.2 Understanding Floating-point Programming
-=============================================
+15.4.1 Floating Point Arithmetic Is Not Exact
+---------------------------------------------
-Numerical programming is an extensive area; if you need to develop
-sophisticated numerical algorithms then `gawk' may not be the ideal
-tool, and this documentation may not be sufficient. It might require
-digesting a book or two(1) to really internalize how to compute with
-ideal accuracy and precision, and the result often depends on the
-particular application.
-
- NOTE: A floating-point calculation's "accuracy" is how close it
- comes to the real value. This is as opposed to the "precision",
- which usually refers to the number of bits used to represent the
- number (see the Wikipedia article
- (http://en.wikipedia.org/wiki/Accuracy_and_precision) for more
- information).
-
- There are two options for doing floating-point calculations:
-hardware floating-point (as used by standard `awk' and the default for
-`gawk'), and "arbitrary-precision" floating-point, which is software
-based. From this point forward, this major node aims to provide enough
-information to understand both, and then will focus on `gawk''s
-facilities for the latter.(2)
-
- Binary floating-point representations and arithmetic are inexact.
+Binary floating-point representations and arithmetic are inexact.
Simple values like 0.1 cannot be precisely represented using binary
floating-point numbers, and the limited precision of floating-point
numbers means that slight changes in the order of operations or the
@@ -20867,9 +21681,21 @@ matters worse, with arbitrary precision floating-point, you can set the
precision before starting a computation, but then you cannot be sure of
the number of significant decimal places in the final result.
- Sometimes, before you start to write any code, you should think more
-about what you really want and what's really happening. Consider the
-two numbers in the following example:
+* Menu:
+
+* Inexact representation:: Numbers are not exactly represented.
+* Comparing FP Values:: How to compare floating point values.
+* Errors accumulate:: Errors get bigger as they go.
+
+
+File: gawk.info, Node: Inexact representation, Next: Comparing FP Values, Up: Inexactness of computations
+
+15.4.1.1 Many Numbers Cannot Be Represented Exactly
+...................................................
+
+So, before you start to write any code, you should think about what you
+really want and what's really happening. Consider the two numbers in
+the following example:
x = 0.875 # 1/2 + 1/4 + 1/8
y = 0.425
@@ -20892,20 +21718,44 @@ you can always specify how much precision you would like in your output.
Usually this is a format string like `"%.15g"', which when used in the
previous example, produces an output identical to the input.
- Because the underlying representation can be a little bit off from
-the exact value, comparing floating-point values to see if they are
-equal is generally not a good idea. Here is an example where it does
-not work like you expect:
+
+File: gawk.info, Node: Comparing FP Values, Next: Errors accumulate, Prev: Inexact representation, Up: Inexactness of computations
+
+15.4.1.2 Be Careful Comparing Values
+....................................
+
+Because the underlying representation can be a little bit off from the
+exact value, comparing floating-point values to see if they are exactly
+equal is generally a bad idea. Here is an example where it does not
+work like you would expect:
$ gawk 'BEGIN { print (0.1 + 12.2 == 12.3) }'
-| 0
- The loss of accuracy during a single computation with floating-point
+ The general wisdom when comparing floating-point values is to see if
+they are within some small range of each other (called a "delta", or
+"tolerance"). You have to decide how small a delta is important to
+you. Code to do this looks something like this:
+
+ delta = 0.00001 # for example
+ difference = abs(a - b) # magnitude of the difference; abs() is user-defined
+ if (difference < delta)
+ # all ok
+ else
+ # not ok
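
The tolerance test can be tried out with a short sketch. It assumes any POSIX `awk' (no `-M' is needed); the helper names `abs', `approx_equal' and the shell wrapper `compare_demo' are hypothetical and are not part of `gawk'. The delta value is only an illustration; choose one that suits your data.

```shell
compare_demo() {
    awk '
    # abs --- hypothetical helper; awk has no built-in absolute value
    function abs(v) { return v < 0 ? -v : v }

    # approx_equal --- hypothetical helper: true if a and b differ by less than delta
    function approx_equal(a, b, delta) { return abs(a - b) < delta }

    BEGIN {
        delta = 0.00001                              # tolerance chosen for illustration
        print (0.1 + 12.2 == 12.3)                   # exact test for equality
        print approx_equal(0.1 + 12.2, 12.3, delta)  # test within the tolerance
    }'
}
compare_demo
```

With hardware double precision, the exact test should print 0 and the tolerance test should print 1.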
+
+
+File: gawk.info, Node: Errors accumulate, Prev: Comparing FP Values, Up: Inexactness of computations
+
+15.4.1.3 Errors Accumulate
+..........................
+
+The loss of accuracy during a single computation with floating-point
numbers usually isn't enough to worry about. However, if you compute a
value which is the result of a sequence of floating point operations,
the error can accumulate and greatly affect the computation itself.
-Here is an attempt to compute the value of the constant pi using one of
-its many series representations:
+Here is an attempt to compute the value of pi using one of its many
+series representations:
BEGIN {
x = 1.0 / sqrt(3.0)
@@ -20917,9 +21767,9 @@ its many series representations:
}
}
- When run, the early errors propagating through later computations
-cause the loop to terminate prematurely after an attempt to divide by
-zero.
+ When run, the early errors propagate through later computations,
+causing the loop to terminate prematurely after attempting to divide by
+zero:
$ gawk -f pi.awk
-| 3.215390309173475
@@ -20942,166 +21792,176 @@ representations yield an unexpected result:
> }'
-| 4
- Can computation using arbitrary precision help with the previous
-examples? If you are impatient to know, see *note Exact Arithmetic::.
+
+File: gawk.info, Node: Getting Accuracy, Next: Try To Round, Prev: Inexactness of computations, Up: FP Math Caution
- Instead of arbitrary precision floating-point arithmetic, often all
-you need is an adjustment of your logic or a different order for the
-operations in your calculation. The stability and the accuracy of the
-computation of the constant pi in the earlier example can be enhanced
-by using the following simple algebraic transformation:
+15.4.2 Getting The Accuracy You Need
+------------------------------------
- (sqrt(x * x + 1) - 1) / x = x / (sqrt(x * x + 1) + 1)
+Can arbitrary precision arithmetic give exact results? There are no
+easy answers. The standard rules of algebra often do not apply when
+using floating-point arithmetic. Among other things, the distributive
+and associative laws do not hold completely, and order of operation may
+be important for your computation. Rounding error, cumulative precision
+loss and underflow are often troublesome.
-After making this, change the program does converge to pi in under 30
-iterations:
+ When `gawk' tests the expressions `0.1 + 12.2' and `12.3' for
+equality using the machine double precision arithmetic, it decides that
+they are not equal! (*Note Comparing FP Values::.) You can get the
+result you want by increasing the precision; 56 bits in this case does
+the job:
- $ gawk -f pi2.awk
- -| 3.215390309173473
- -| 3.159659942097501
- -| 3.146086215131436
- -| 3.142714599645370
- -| 3.141873049979825
- ...
- -| 3.141592653589797
- -| 3.141592653589797
+ $ gawk -M -v PREC=56 'BEGIN { print (0.1 + 12.2 == 12.3) }'
+ -| 1
- There is no need to be unduly suspicious about the results from
-floating-point arithmetic. The lesson to remember is that
-floating-point arithmetic is always more complex than arithmetic using
-pencil and paper. In order to take advantage of the power of computer
-floating-point, you need to know its limitations and work within them.
-For most casual use of floating-point arithmetic, you will often get
-the expected result in the end if you simply round the display of your
-final results to the correct number of significant decimal digits.
+ If adding more bits is good, perhaps adding even more bits of
+precision is better? Here is what happens if we use an even larger
+value of `PREC':
- As general advice, avoid presenting numerical data in a manner that
-implies better precision than is actually the case.
+ $ gawk -M -v PREC=201 'BEGIN { print (0.1 + 12.2 == 12.3) }'
+ -| 0
-* Menu:
+ This is not a bug in `gawk' or in the MPFR library. It is easy to
+forget that the finite number of bits used to store the value is often
+just an approximation after proper rounding. The test for equality
+succeeds if and only if _all_ bits in the two operands are exactly the
+same. Since this is not necessarily true after floating-point
+computations with a particular precision and effective rounding rule, a
+straight test for equality may not work. Instead, compare the two
+numbers to see if they are within the desirable delta of each other.
-* Floating-point Representation:: Binary floating-point representation.
-* Floating-point Context:: Floating-point context.
-* Rounding Mode:: Floating-point rounding mode.
+ In applications where 15 or fewer decimal places suffice, hardware
+double precision arithmetic can be adequate, and is usually much faster.
+But you need to keep in mind that every floating-point operation can
+suffer a new rounding error with catastrophic consequences as
+illustrated by our earlier attempt to compute the value of pi. Extra
+precision can greatly enhance the stability and the accuracy of your
+computation in such cases.
- ---------- Footnotes ----------
+ Repeated addition is not necessarily equivalent to multiplication in
+floating-point arithmetic. In the example in *note Errors accumulate:::
- (1) One recommended title is `Numerical Computing with IEEE Floating
-Point Arithmetic', Michael L. Overton, Society for Industrial and
-Applied Mathematics, 2004. ISBN: 0-89871-482-6, ISBN-13:
-978-0-89871-482-1. See `http://www.cs.nyu.edu/cs/faculty/overton/book'.
+ $ gawk 'BEGIN {
+ > for (d = 1.1; d <= 1.5; d += 0.1) # loop five times (?)
+ > i++
+ > print i
+ > }'
+ -| 4
- (2) If you are interested in other tools that perform arbitrary
-precision arithmetic, you may want to investigate the POSIX `bc' tool.
-See the POSIX specification for it
-(http://pubs.opengroup.org/onlinepubs/009695399/utilities/bc.html), for
-more information.
+you may or may not succeed in getting the correct result by choosing an
+arbitrarily large value for `PREC'. Reformulation of the problem at
+hand is often the correct approach in such situations.
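
One possible reformulation of the loop, offered as a sketch rather than as the only fix, is to drive it with an integer counter and derive the floating-point value from that counter, so that no error accumulates in the loop test. The wrapper name `count_steps' and the variable `n' are hypothetical; any POSIX `awk' should do.

```shell
count_steps() {
    awk 'BEGIN {
        # Count with an integer; compute d from the counter instead of
        # adding 0.1 repeatedly and comparing the running sum to 1.5.
        for (i = 0; i < 5; i++) {
            d = 1.1 + i * 0.1    # d takes values near 1.1, 1.2, ..., 1.5
            n++
        }
        print n
    }'
}
count_steps
```

The loop body runs exactly five times because the termination test involves only small integers, which are represented exactly.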

-File: gawk.info, Node: Floating-point Representation, Next: Floating-point Context, Up: Floating-point Programming
-
-15.2.1 Binary Floating-point Representation
--------------------------------------------
-
-Although floating-point representations vary from machine to machine,
-the most commonly encountered representation is that defined by the
-IEEE 754 Standard. An IEEE-754 format value has three components:
+File: gawk.info, Node: Try To Round, Next: Setting precision, Prev: Getting Accuracy, Up: FP Math Caution
- * A sign bit telling whether the number is positive or negative.
+15.4.3 Try A Few Extra Bits of Precision and Rounding
+-----------------------------------------------------
- * An "exponent", E, giving its order of magnitude.
+Instead of arbitrary precision floating-point arithmetic, often all you
+need is an adjustment of your logic or a different order for the
+operations in your calculation. The stability and the accuracy of the
+computation of pi in the earlier example can be enhanced by using the
+following simple algebraic transformation:
- * A "significand", S, specifying the actual digits of the number.
+ (sqrt(x * x + 1) - 1) / x == x / (sqrt(x * x + 1) + 1)
- The value of the number is then S * 2^E. The first bit of a
-non-zero binary significand is always one, so the significand in an
-IEEE-754 format only includes the fractional part, leaving the leading
-one implicit. The significand is stored in "normalized" format, which
-means that the first bit is always a one.
+After making this change, the program converges to pi in under 30
+iterations:
- Three of the standard IEEE-754 types are 32-bit single precision,
-64-bit double precision and 128-bit quadruple precision. The standard
-also specifies extended precision formats to allow greater precisions
-and larger exponent ranges.
+ $ gawk -f pi2.awk
+ -| 3.215390309173473
+ -| 3.159659942097501
+ -| 3.146086215131436
+ -| 3.142714599645370
+ -| 3.141873049979825
+ ...
+ -| 3.141592653589797
+ -| 3.141592653589797
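
The following sketch suggests why the transformed expression behaves better. It is not the pi program itself: it evaluates both sides of the identity for one arbitrarily chosen small `x', assuming hardware IEEE 754 double precision and any POSIX `awk'. The names `compare_forms', `naive' and `stable' are hypothetical.

```shell
compare_forms() {
    awk 'BEGIN {
        x = 1.0e-9                            # arbitrary small value for illustration
        naive  = (sqrt(x * x + 1) - 1) / x    # subtracts two nearly equal values
        stable = x / (sqrt(x * x + 1) + 1)    # same quantity, no subtraction
        printf("%.6g\n%.6g\n", naive, stable)
    }'
}
compare_forms
```

In double precision, `x * x + 1' rounds to exactly 1 here, so the left-hand form loses all of its significant digits in the subtraction, while the right-hand form still yields a value close to `x / 2'.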

-File: gawk.info, Node: Floating-point Context, Next: Rounding Mode, Prev: Floating-point Representation, Up: Floating-point Programming
+File: gawk.info, Node: Setting precision, Next: Setting the rounding mode, Prev: Try To Round, Up: FP Math Caution
-15.2.2 Floating-point Context
------------------------------
-
-A floating-point "context" defines the environment for arithmetic
-operations. It governs precision, sets rules for rounding, and limits
-the range for exponents. The context has the following primary
-components:
-
-"Precision"
- Precision of the floating-point format in bits.
-
-"emax"
- Maximum exponent allowed for the format.
-
-"emin"
- Minimum exponent allowed for the format.
-
-"Underflow behavior"
- The format may or may not support gradual underflow.
-
-"Rounding"
- The rounding mode of the context.
+15.4.4 Setting The Precision
+----------------------------
- *note table-ieee-formats:: lists the precision and exponent field
-values for the basic IEEE-754 binary formats:
+`gawk' uses a global working precision; it does not keep track of the
+precision or accuracy of individual numbers. Performing an arithmetic
+operation or calling a built-in function rounds the result to the
+current working precision. The default working precision is 53 bits,
+which you can modify using the built-in variable `PREC'. You can also
+set the value to one of the predefined case-insensitive strings shown
+in *note table-predefined-precision-strings::, to emulate an IEEE 754
+binary format.
-Name Total bits Precision emin emax
----------------------------------------------------------------------------
-Single 32 24 -126 +127
-Double 64 53 -1022 +1023
-Quadruple 128 113 -16382 +16383
+`PREC' IEEE 754 Binary Format
+---------------------------------------------------
+`"half"' 16-bit half-precision.
+`"single"' Basic 32-bit single precision.
+`"double"' Basic 64-bit double precision.
+`"quad"' Basic 128-bit quadruple precision.
+`"oct"' 256-bit octuple precision.
-Table 15.1: Basic IEEE Format Context Values
+Table 15.2: Predefined Precision Strings For `PREC'
- NOTE: The precision numbers include the implied leading one that
- gives them one extra bit of significand.
+ The following example illustrates the effects of changing precision
+on arithmetic operations:
- A floating-point context can also determine which signals are treated
-as exceptions, and can set rules for arithmetic with special values.
-Please consult the IEEE-754 standard or other resources for details.
+ $ gawk -M -v PREC=100 'BEGIN { x = 1.0e-400; print x + 0
+ > PREC = "double"; print x + 0 }'
+ -| 1e-400
+ -| 0
- `gawk' ordinarily uses the hardware double precision representation
-for numbers. On most systems, this is IEEE-754 floating-point format,
-corresponding to 64-bit binary with 53 bits of precision.
+ CAUTION: Be wary of floating-point constants! When reading a
+ floating-point constant from program source code, `gawk' uses the
+ default precision (that of a C `double'), unless overridden by an
+ assignment to the special variable `PREC' on the command line, to
+ store it internally as an MPFR number. Changing the precision
+ using `PREC' in the program text does _not_ change the precision
+ of a constant.
+
+ If you need to represent a floating-point constant at a higher
+ precision than the default and cannot use a command line
+ assignment to `PREC', you should either specify the constant as a
+ string, or as a rational number, whenever possible. The following
+ example illustrates the differences among various ways to print a
+ floating-point constant:
- NOTE: In case an underflow occurs, the standard allows, but does
- not require, the result from an arithmetic operation to be a
- number smaller than the smallest nonzero normalized number. Such
- numbers do not have as many significant digits as normal numbers,
- and are called "denormals" or "subnormals". The alternative,
- simply returning a zero, is called "flush to zero". The basic
- IEEE-754 binary formats support subnormal numbers.
+ $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 0.1) }'
+ -| 0.1000000000000000055511151
+ $ gawk -M -v PREC=113 'BEGIN { printf("%0.25f\n", 0.1) }'
+ -| 0.1000000000000000000000000
+ $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", "0.1") }'
+ -| 0.1000000000000000000000000
+ $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 1/10) }'
+ -| 0.1000000000000000000000000

-File: gawk.info, Node: Rounding Mode, Prev: Floating-point Context, Up: Floating-point Programming
+File: gawk.info, Node: Setting the rounding mode, Prev: Setting precision, Up: FP Math Caution
-15.2.3 Floating-point Rounding Mode
------------------------------------
+15.4.5 Setting The Rounding Mode
+--------------------------------
-The "rounding mode" specifies the behavior for the results of numerical
-operations when discarding extra precision. Each rounding mode indicates
-how the least significant returned digit of a rounded result is to be
-calculated. *note table-rounding-modes:: lists the IEEE-754 defined
-rounding modes:
+The `ROUNDMODE' variable provides program level control over the
+rounding mode. The correspondence between `ROUNDMODE' and the IEEE
+rounding modes is shown in *note table-gawk-rounding-modes::.
-Rounding Mode IEEE Name
---------------------------------------------------------------------------
-Round to nearest, ties to even `roundTiesToEven'
-Round toward plus Infinity `roundTowardPositive'
-Round toward negative Infinity `roundTowardNegative'
-Round toward zero `roundTowardZero'
-Round to nearest, ties away `roundTiesToAway'
-from zero
+Rounding Mode IEEE Name `ROUNDMODE'
+---------------------------------------------------------------------------
+Round to nearest, ties to even `roundTiesToEven' `"N"' or `"n"'
+Round toward plus Infinity `roundTowardPositive' `"U"' or `"u"'
+Round toward negative Infinity `roundTowardNegative' `"D"' or `"d"'
+Round toward zero `roundTowardZero' `"Z"' or `"z"'
+Round to nearest, ties away `roundTiesToAway' `"A"' or `"a"'
+from zero
+
+Table 15.3: `gawk' Rounding Modes
-Table 15.2: IEEE 754 Rounding Modes
+ `ROUNDMODE' has the default value `"N"', which selects the IEEE 754
+rounding mode `roundTiesToEven'. In *note Table 15.3:
+table-gawk-rounding-modes, the value `"A"' selects `roundTiesToAway'.
+This is only available if your version of the MPFR library supports it;
+otherwise setting `ROUNDMODE' to `"A"' has no effect.
The default mode `roundTiesToEven' is the most preferred, but the
least intuitive. This method does the obvious thing for most values, by
@@ -21136,20 +21996,19 @@ produces the following output when run on the author's system:(1)
3.5 => 4
4.5 => 4
- The theory behind the rounding mode `roundTiesToEven' is that it
-more or less evenly distributes upward and downward rounds of exact
-halves, which might cause any round-off error to cancel itself out.
-This is the default rounding mode used in IEEE-754 computing functions
-and operators.
+ The theory behind `roundTiesToEven' is that it more or less evenly
+distributes upward and downward rounds of exact halves, which might
+cause any accumulating round-off error to cancel itself out. This is the
+default rounding mode for IEEE 754 computing functions and operators.
The other rounding modes are rarely used. Round toward positive
infinity (`roundTowardPositive') and round toward negative infinity
-(`roundTowardNegative') are often used to implement interval arithmetic,
-where you adjust the rounding mode to calculate upper and lower bounds
-for the range of output. The `roundTowardZero' mode can be used for
-converting floating-point numbers to integers. The rounding mode
-`roundTiesToAway' rounds the result to the nearest number and selects
-the number with the larger magnitude if a tie occurs.
+(`roundTowardNegative') are often used to implement interval
+arithmetic, where you adjust the rounding mode to calculate upper and
+lower bounds for the range of output. The `roundTowardZero' mode can be
+used for converting floating-point numbers to integers. The rounding
+mode `roundTiesToAway' rounds the result to the nearest number and
+selects the number with the larger magnitude if a tie occurs.
Some numerical analysts will tell you that your choice of rounding
style has tremendous impact on the final outcome, and advise you to
@@ -21158,418 +22017,255 @@ round-off error problems by setting the precision initially to some
value sufficiently larger than the final desired precision, so that the
accumulation of round-off error does not influence the outcome. If you
suspect that results from your computation are sensitive to
-accumulation of round-off error, one way to be sure is to look for a
-significant difference in output when you change the rounding mode.
+accumulation of round-off error, you can check by changing the rounding
+mode and looking for a significant difference in the output.
---------- Footnotes ----------
(1) It is possible for the output to be completely different if the
-C library in your system does not use the IEEE-754 even-rounding rule
+C library in your system does not use the IEEE 754 even-rounding rule
to round halfway cases for `printf'.

-File: gawk.info, Node: Gawk and MPFR, Next: Arbitrary Precision Floats, Prev: Floating-point Programming, Up: Arbitrary Precision Arithmetic
-
-15.3 `gawk' + MPFR = Powerful Arithmetic
-========================================
-
-The rest of this major node describes how to use the arbitrary precision
-(also known as "multiple precision" or "infinite precision") numeric
-capabilities in `gawk' to produce maximally accurate results when you
-need it.
-
- But first you should check if your version of `gawk' supports
-arbitrary precision arithmetic. The easiest way to find out is to look
-at the output of the following command:
+File: gawk.info, Node: Arbitrary Precision Integers, Next: POSIX Floating Point Problems, Prev: FP Math Caution, Up: Arbitrary Precision Arithmetic
- $ ./gawk --version
- -| GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.0-p3, GNU MP 5.0.2)
- -| Copyright (C) 1989, 1991-2014 Free Software Foundation.
- ...
+15.5 Arbitrary Precision Integer Arithmetic with `gawk'
+=======================================================
-(You may see different version numbers than what's shown here. That's
-OK; what's important is to see that GNU MPFR and GNU MP are listed in
-the output.)
+When given one of the options `--bignum' or `-M', `gawk' performs all
+integer arithmetic using GMP arbitrary precision integers. Any number
+that looks like an integer in a source or data file is stored as an
+arbitrary precision integer. The size of the integer is limited only
+by the available memory. For example, the following computes 5^4^3^2,
+the result of which is beyond the limits of ordinary hardware
+double-precision floating point values:
- `gawk' uses the GNU MPFR (http://www.mpfr.org) and GNU MP
-(http://gmplib.org) (GMP) libraries for arbitrary precision arithmetic
-on numbers. So if you do not see the names of these libraries in the
-output, then your version of `gawk' does not support arbitrary
-precision arithmetic.
+ $ gawk -M 'BEGIN {
+ > x = 5^4^3^2
+ > print "# of digits =", length(x)
+ > print substr(x, 1, 20), "...", substr(x, length(x) - 19, 20)
+ > }'
+ -| # of digits = 183231
+ -| 62060698786608744707 ... 92256259918212890625
- Additionally, there are a few elements available in the `PROCINFO'
-array to provide information about the MPFR and GMP libraries. *Note
-Auto-set::, for more information.
+ If instead you were to compute the same value using arbitrary
+precision floating-point values, the precision needed for correct
+output (using the formula `prec = 3.322 * dps') would be 3.322 x
+183231, or 608693.
-
-File: gawk.info, Node: Arbitrary Precision Floats, Next: Arbitrary Precision Integers, Prev: Gawk and MPFR, Up: Arbitrary Precision Arithmetic
+ The result from an arithmetic operation with an integer and a
+floating-point value is a floating-point value with a precision equal
+to the working precision. The following program calculates the eighth
+term in Sylvester's sequence(1) using a recurrence:
-15.4 Arbitrary Precision Floating-point Arithmetic with `gawk'
-==============================================================
+ $ gawk -M 'BEGIN {
+ > s = 2.0
+ > for (i = 1; i <= 7; i++)
+ > s = s * (s - 1) + 1
+ > print s
+ > }'
+ -| 113423713055421845118910464
-`gawk' uses the GNU MPFR library for arbitrary precision floating-point
-arithmetic. The MPFR library provides precise control over precisions
-and rounding modes, and gives correctly rounded, reproducible,
-platform-independent results. With one of the command-line options
-`--bignum' or `-M', all floating-point arithmetic operators and numeric
-functions can yield results to any desired precision level supported by
-MPFR. Two built-in variables, `PREC' and `ROUNDMODE', provide control
-over the working precision and the rounding mode (*note Setting
-Precision::, and *note Setting Rounding Mode::). The precision and the
-rounding mode are set globally for every operation to follow.
-
- The default working precision for arbitrary precision floating-point
-values is 53 bits, and the default value for `ROUNDMODE' is `"N"',
-which selects the IEEE-754 `roundTiesToEven' rounding mode (*note
-Rounding Mode::).(1) `gawk' uses the default exponent range in MPFR
-(EMAX = 2^30 - 1, EMIN = -EMAX) for all floating-point contexts. There
-is no explicit mechanism to adjust the exponent range. MPFR does not
-implement subnormal numbers by default, and this behavior cannot be
-changed in `gawk'.
-
- NOTE: When emulating an IEEE-754 format (*note Setting
- Precision::), `gawk' internally adjusts the exponent range to the
- value defined for the format and also performs computations needed
- for gradual underflow (subnormal numbers).
-
- NOTE: MPFR numbers are variable-size entities, consuming only as
- much space as needed to store the significant digits. Since the
- performance using MPFR numbers pales in comparison to doing
- arithmetic using the underlying machine types, you should consider
- using only as much precision as needed by your program.
+ The output differs from the actual number,
+113,423,713,055,421,844,361,000,443, because the default precision of
+53 bits is not enough to represent the floating-point results exactly.
+You can either increase the precision (100 bits is enough in this
+case), or replace the floating-point constant `2.0' with an integer, to
+perform all computations using integer arithmetic to get the correct
+output.
-* Menu:
+ Sometimes `gawk' must implicitly convert an arbitrary precision
+integer into an arbitrary precision floating-point value. This is
+primarily because the MPFR library does not always provide the relevant
+interface to process arbitrary precision integers or mixed-mode numbers
+as needed by an operation or function. In such a case, the precision is
+set to the minimum value necessary for exact conversion, and the working
+precision is not used for this purpose. If this is not what you need or
+want, you can employ a subterfuge, and convert the integer to floating
+point first, like this:
-* Setting Precision:: Setting the working precision.
-* Setting Rounding Mode:: Setting the rounding mode.
-* Floating-point Constants:: Representing floating-point constants.
-* Changing Precision:: Changing the precision of a number.
-* Exact Arithmetic:: Exact arithmetic with floating-point numbers.
+ gawk -M 'BEGIN { n = 13; print (n + 0.0) % 2.0 }'
- ---------- Footnotes ----------
+ You can avoid this issue altogether by specifying the number as a
+floating-point value to begin with:
- (1) The default precision is 53 bits, since according to the MPFR
-documentation, the library should be able to exactly reproduce all
-computations with double-precision machine floating-point numbers
-(`double' type in C), except the default exponent range is much wider
-and subnormal numbers are not implemented.
+ gawk -M 'BEGIN { n = 13.0; print n % 2.0 }'
-
-File: gawk.info, Node: Setting Precision, Next: Setting Rounding Mode, Up: Arbitrary Precision Floats
+ Note that for the particular example above, it is likely best to
+just use the following:
-15.4.1 Setting the Working Precision
-------------------------------------
+ gawk -M 'BEGIN { n = 13; print n % 2 }'
-`gawk' uses a global working precision; it does not keep track of the
-precision or accuracy of individual numbers. Performing an arithmetic
-operation or calling a built-in function rounds the result to the
-current working precision. The default working precision is 53 bits,
-which can be modified using the built-in variable `PREC'. You can also
-set the value to one of the pre-defined case-insensitive strings shown
-in *note table-predefined-precision-strings::, to emulate an IEEE-754
-binary format.
+ When dividing two arbitrary precision integers with either `/' or
+`%', the result is typically an arbitrary precision floating point
+value (unless the denominator evenly divides into the numerator). In
+order to do integer division or remainder with arbitrary precision
+integers, use the built-in `div()' function (*note Numeric Functions::).
-`PREC' IEEE-754 Binary Format
----------------------------------------------------
-`"half"' 16-bit half-precision.
-`"single"' Basic 32-bit single precision.
-`"double"' Basic 64-bit double precision.
-`"quad"' Basic 128-bit quadruple precision.
-`"oct"' 256-bit octuple precision.
+ You can simulate the `div()' function in standard `awk' using this
+user-defined function:
-Table 15.3: Predefined precision strings for `PREC'
+ # div --- do integer division
- The following example illustrates the effects of changing precision
-on arithmetic operations:
+ function div(numerator, denominator, result, i)
+ {
+ split("", result)
- $ gawk -M -v PREC=100 'BEGIN { x = 1.0e-400; print x + 0
- > PREC = "double"; print x + 0 }'
- -| 1e-400
- -| 0
+ numerator = int(numerator)
+ denominator = int(denominator)
+ result["quotient"] = int(numerator / denominator)
+ result["remainder"] = int(numerator % denominator)
- Binary and decimal precisions are related approximately, according
-to the formula:
-
- PREC = 3.322 * DPS
-
-Here, PREC denotes the binary precision (measured in bits) and DPS
-(short for decimal places) is the decimal digits. We can easily
-calculate how many decimal digits the 53-bit significand of an IEEE
-double is equivalent to: 53 / 3.322 which is equal to about 15.95. But
-what does 15.95 digits actually mean? It depends whether you are
-concerned about how many digits you can rely on, or how many digits you
-need.
-
- It is important to know how many bits it takes to uniquely identify
-a double-precision value (the C type `double'). If you want to convert
-from `double' to decimal and back to `double' (e.g., saving a `double'
-representing an intermediate result to a file, and later reading it
-back to restart the computation), then a few more decimal digits are
-required. 17 digits is generally enough for a `double'.
-
- It can also be important to know what decimal numbers can be uniquely
-represented with a `double'. If you want to convert from decimal to
-`double' and back again, 15 digits is the most that you can get. Stated
-differently, you should not present the numbers from your
-floating-point computations with more than 15 significant digits in
-them.
+ return 0.0
+ }
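
A short usage sketch for the function above follows. To avoid clashing with a built-in `div()' in versions of `gawk' that provide one, the copy here is renamed `my_div'; that name, the wrapper `div_demo' and the array `r' are hypothetical, and the unused local parameter is dropped. Any POSIX `awk' is assumed.

```shell
div_demo() {
    awk '
    # my_div --- integer division; fills result["quotient"] and result["remainder"]
    function my_div(numerator, denominator, result)
    {
        split("", result)                 # clear the result array

        numerator = int(numerator)        # truncate both operands to integers
        denominator = int(denominator)
        result["quotient"] = int(numerator / denominator)
        result["remainder"] = int(numerator % denominator)

        return 0.0
    }

    BEGIN {
        my_div(17, 5, r)
        print r["quotient"], r["remainder"]
    }'
}
div_demo
```

For the operands 17 and 5 this prints a quotient of 3 and a remainder of 2. Since standard `awk' works in double precision, the simulation is only exact for integers that a double can represent exactly.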
- Conversely, it takes a precision of 332 bits to hold an approximation
-of the constant pi that is accurate to 100 decimal places.
+ ---------- Footnotes ----------
- You should always add some extra bits in order to avoid the
-confusing round-off issues that occur because numbers are stored
-internally in binary.
+ (1) Weisstein, Eric W. `Sylvester's Sequence'. From MathWorld--A
+Wolfram Web Resource
+(`http://mathworld.wolfram.com/SylvestersSequence.html').

-File: gawk.info, Node: Setting Rounding Mode, Next: Floating-point Constants, Prev: Setting Precision, Up: Arbitrary Precision Floats
-
-15.4.2 Setting the Rounding Mode
---------------------------------
+File: gawk.info, Node: POSIX Floating Point Problems, Next: Floating point summary, Prev: Arbitrary Precision Integers, Up: Arbitrary Precision Arithmetic
-The `ROUNDMODE' variable provides program level control over the
-rounding mode. The correspondence between `ROUNDMODE' and the IEEE
-rounding modes is shown in *note table-gawk-rounding-modes::.
+15.6 Standards Versus Existing Practice
+=======================================
-Rounding Mode IEEE Name `ROUNDMODE'
----------------------------------------------------------------------------
-Round to nearest, ties to even `roundTiesToEven' `"N"' or `"n"'
-Round toward plus Infinity `roundTowardPositive' `"U"' or `"u"'
-Round toward negative Infinity `roundTowardNegative' `"D"' or `"d"'
-Round toward zero `roundTowardZero' `"Z"' or `"z"'
-Round to nearest, ties away `roundTiesToAway' `"A"' or `"a"'
-from zero
+Historically, `awk' has converted any non-numeric looking string to the
+numeric value zero, when required. Furthermore, the original
+definition of the language and the original POSIX standards specified
+that `awk' only understands decimal numbers (base 10), and not octal
+(base 8) or hexadecimal numbers (base 16).
-Table 15.4: `gawk' Rounding Modes
+ Changes in the language of the 2001 and 2004 POSIX standards can be
+interpreted to imply that `awk' should support additional features.
+These features are:
- `ROUNDMODE' has the default value `"N"', which selects the IEEE-754
-rounding mode `roundTiesToEven'. In *note Table 15.4:
-table-gawk-rounding-modes, `"A"' is listed to select the IEEE-754 mode
-`roundTiesToAway'. This is only available if your version of the MPFR
-library supports it; otherwise setting `ROUNDMODE' to this value has no
-effect. *Note Rounding Mode::, for the meanings of the various rounding
-modes.
+ * Interpretation of floating point data values specified in
+ hexadecimal notation (e.g., `0xDEADBEEF'). (Note: data values,
+ _not_ source code constants.)
- Here is an example of how to change the default rounding behavior of
-`printf''s output:
+ * Support for the special IEEE 754 floating point values "Not A
+ Number" (NaN), positive Infinity ("inf") and negative Infinity
+ ("-inf"). In particular, the format for these values is as
+ specified by the ISO 1999 C standard, which ignores case, can
+ allow implementation-dependent additional characters after
+ `nan', and allows either `inf' or `infinity'.
- $ gawk -M -v ROUNDMODE="Z" 'BEGIN { printf("%.2f\n", 1.378) }'
- -| 1.37
+ The first problem is that both of these are clear changes to
+historical practice:
-
-File: gawk.info, Node: Floating-point Constants, Next: Changing Precision, Prev: Setting Rounding Mode, Up: Arbitrary Precision Floats
+ * The `gawk' maintainer feels that supporting hexadecimal floating
+ point values, in particular, is ugly, and was never intended by the
+ original designers to be part of the language.
-15.4.3 Representing Floating-point Constants
---------------------------------------------
+ * Allowing completely alphabetic strings to have valid numeric
+ values is also a very severe departure from historical practice.
-Be wary of floating-point constants! When reading a floating-point
-constant from program source code, `gawk' uses the default precision,
-unless overridden by an assignment to the special variable `PREC' on
-the command line, to store it internally as a MPFR number. Changing
-the precision using `PREC' in the program text does _not_ change the
-precision of a constant. If you need to represent a floating-point
-constant at a higher precision than the default and cannot use a
-command line assignment to `PREC', you should either specify the
-constant as a string, or as a rational number, whenever possible. The
-following example illustrates the differences among various ways to
-print a floating-point constant:
+ The second problem is that the `gawk' maintainer feels that this
+interpretation of the standard, which requires a certain amount of
+"language lawyering" to arrive at in the first place, was not even
+intended by the standard developers. In other words, "we see how you
+got where you are, but we don't think that that's where you want to be."
- $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 0.1) }'
- -| 0.1000000000000000055511151
- $ gawk -M -v PREC=113 'BEGIN { printf("%0.25f\n", 0.1) }'
- -| 0.1000000000000000000000000
- $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", "0.1") }'
- -| 0.1000000000000000000000000
- $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 1/10) }'
- -| 0.1000000000000000000000000
+ Recognizing the above issues, but attempting to provide compatibility
+with the earlier versions of the standard, the 2008 POSIX standard
+added explicit wording to allow, but not require, that `awk' support
+hexadecimal floating point values and special values for "Not A Number"
+and infinity.
- In the first case, the number is stored with the default precision
-of 53 bits.
+ Although the `gawk' maintainer continues to feel that providing
+those features is inadvisable, nevertheless, on systems that support
+IEEE floating point, it seems reasonable to provide _some_ way to
+support NaN and Infinity values. The solution implemented in `gawk' is
+as follows:
-
-File: gawk.info, Node: Changing Precision, Next: Exact Arithmetic, Prev: Floating-point Constants, Up: Arbitrary Precision Floats
+ * With the `--posix' command-line option, `gawk' becomes "hands
+ off." String values are passed directly to the system library's
+ `strtod()' function, and if it successfully returns a numeric
+ value, that is what's used.(1) By definition, the results are not
+ portable across different systems. They are also a little
+ surprising:
-15.4.4 Changing the Precision of a Number
------------------------------------------
+ $ echo nanny | gawk --posix '{ print $1 + 0 }'
+ -| nan
+ $ echo 0xDeadBeef | gawk --posix '{ print $1 + 0 }'
+ -| 3735928559
- The point is that in any variable-precision package, a decision is
- made on how to treat numbers given as data, or arising in
- intermediate results, which are represented in floating-point
- format to a precision lower than working precision. Do we promote
- them to full membership of the high-precision club, or do we treat
- them and all their associates as second-class citizens? Sometimes
- the first course is proper, sometimes the second, and it takes
- careful analysis to tell which.(1) -- Dirk Laurie
-
- `gawk' does not implicitly modify the precision of any previously
-computed results when the working precision is changed with an
-assignment to `PREC'. The precision of a number is always the one that
-was used at the time of its creation, and there is no way for the user
-to explicitly change it afterwards. However, since the result of a
-floating-point arithmetic operation is always an arbitrary precision
-floating-point value--with a precision set by the value of `PREC'--one
-of the following workarounds effectively accomplishes the desired
-behavior:
-
- x = x + 0.0
+ * Without `--posix', `gawk' interprets the four strings `+inf',
+ `-inf', `+nan', and `-nan' specially, producing the corresponding
+ special numeric values. The leading sign acts as a signal to `gawk'
+ (and the user) that the value is really numeric. Hexadecimal
+ floating point is not supported (unless you also use
+ `--non-decimal-data', which is _not_ recommended). For example:
-or:
+ $ echo nanny | gawk '{ print $1 + 0 }'
+ -| 0
+ $ echo +nan | gawk '{ print $1 + 0 }'
+ -| nan
+ $ echo 0xDeadBeef | gawk '{ print $1 + 0 }'
+ -| 0
- x += 0.0
+ `gawk' ignores case in the four special values. Thus `+nan' and
+ `+NaN' are the same.
---------- Footnotes ----------
- (1) Dirk Laurie. `Variable-precision Arithmetic Considered Perilous
--- A Detective Story'. Electronic Transactions on Numerical Analysis.
-Volume 28, pp. 168-173, 2008.
+ (1) You asked for it, you got it.

-File: gawk.info, Node: Exact Arithmetic, Prev: Changing Precision, Up: Arbitrary Precision Floats
+File: gawk.info, Node: Floating point summary, Prev: POSIX Floating Point Problems, Up: Arbitrary Precision Arithmetic
-15.4.5 Exact Arithmetic with Floating-point Numbers
----------------------------------------------------
+15.7 Summary
+============
- CAUTION: Never depend on the exactness of floating-point
- arithmetic, even for apparently simple expressions!
+ * Most computer arithmetic is done using either integers or
+ floating-point values. The default for `awk' is to use
+ double-precision floating-point values.
- Can arbitrary precision arithmetic give exact results? There are no
-easy answers. The standard rules of algebra often do not apply when
-using floating-point arithmetic. Among other things, the distributive
-and associative laws do not hold completely, and order of operation may
-be important for your computation. Rounding error, cumulative precision
-loss and underflow are often troublesome.
-
- When `gawk' tests the expressions `0.1 + 12.2' and `12.3' for
-equality using the machine double precision arithmetic, it decides that
-they are not equal! (*Note Floating-point Programming::.) You can get
-the result you want by increasing the precision; 56 bits in this case
-will get the job done:
-
- $ gawk -M -v PREC=56 'BEGIN { print (0.1 + 12.2 == 12.3) }'
- -| 1
-
- If adding more bits is good, perhaps adding even more bits of
-precision is better? Here is what happens if we use an even larger
-value of `PREC':
-
- $ gawk -M -v PREC=201 'BEGIN { print (0.1 + 12.2 == 12.3) }'
- -| 0
-
- This is not a bug in `gawk' or in the MPFR library. It is easy to
-forget that the finite number of bits used to store the value is often
-just an approximation after proper rounding. The test for equality
-succeeds if and only if _all_ bits in the two operands are exactly the
-same. Since this is not necessarily true after floating-point
-computations with a particular precision and effective rounding rule, a
-straight test for equality may not work.
+ * In the early 1990s, Barbie mistakenly said "Math class is tough!"
+ While math isn't tough, floating-point arithmetic isn't the same
+ as pencil-and-paper math, and care must be taken:
- So, don't assume that floating-point values can be compared for
-equality. You should also exercise caution when using other forms of
-comparisons. The standard way to compare between floating-point
-numbers is to determine how much error (or "tolerance") you will allow
-in a comparison and check to see if one value is within this error
-range of the other.
+ - Not all numbers can be represented exactly.
- In applications where 15 or fewer decimal places suffice, hardware
-double precision arithmetic can be adequate, and is usually much faster.
-But you do need to keep in mind that every floating-point operation can
-suffer a new rounding error with catastrophic consequences as
-illustrated by our earlier attempt to compute the value of the constant
-pi (*note Floating-point Programming::). Extra precision can greatly
-enhance the stability and the accuracy of your computation in such
-cases.
+ - Comparing values should use a delta, instead of being done
+ directly with `==' and `!='.
- Repeated addition is not necessarily equivalent to multiplication in
-floating-point arithmetic. In the example in *note Floating-point
-Programming:::
+ - Errors accumulate.
- $ gawk 'BEGIN {
- > for (d = 1.1; d <= 1.5; d += 0.1) # loop five times (?)
- > i++
- > print i
- > }'
- -| 4
+ - Operations are not always truly associative or distributive.
-you may or may not succeed in getting the correct result by choosing an
-arbitrarily large value for `PREC'. Reformulation of the problem at
-hand is often the correct approach in such situations.
+ * Increasing the accuracy can help, but it is not a panacea.
-
-File: gawk.info, Node: Arbitrary Precision Integers, Prev: Arbitrary Precision Floats, Up: Arbitrary Precision Arithmetic
+ * Often, increasing the accuracy and then rounding to the desired
+ number of digits produces reasonable results.
-15.5 Arbitrary Precision Integer Arithmetic with `gawk'
-=======================================================
+ * Use either `-M' or `--bignum' to enable MPFR arithmetic. Use
+ `PREC' to set the precision in bits, and `ROUNDMODE' to set the
+ IEEE 754 rounding mode.
-If one of the options `--bignum' or `-M' is specified, `gawk' performs
-all integer arithmetic using GMP arbitrary precision integers. Any
-number that looks like an integer in a program source or data file is
-stored as an arbitrary precision integer. The size of the integer is
-limited only by your computer's memory. The current floating-point
-context has no effect on operations involving integers. For example,
-the following computes 5^4^3^2, the result of which is beyond the
-limits of ordinary `gawk' numbers:
+ * With `-M' or `--bignum', `gawk' performs arbitrary precision
+ integer arithmetic using the GMP library. This is faster and more
+ space efficient than using MPFR for the same calculations.
- $ gawk -M 'BEGIN {
- > x = 5^4^3^2
- > print "# of digits =", length(x)
- > print substr(x, 1, 20), "...", substr(x, length(x) - 19, 20)
- > }'
- -| # of digits = 183231
- -| 62060698786608744707 ... 92256259918212890625
+ * There are several "dark corners" with respect to floating-point
+ numbers where `gawk' disagrees with the POSIX standard. It pays
+ to be aware of them.
- If you were to compute the same value using arbitrary precision
-floating-point values instead, the precision needed for correct output
-(using the formula `prec = 3.322 * dps'), would be 3.322 x 183231, or
-608693.
+ * Overall, there is no need to be unduly suspicious about the
+ results from floating-point arithmetic. The lesson to remember is
+ that floating-point arithmetic is always more complex than
+ arithmetic using pencil and paper. In order to take advantage of
+ the power of computer floating-point, you need to know its
+ limitations and work within them. For most casual use of
+ floating-point arithmetic, you will often get the expected result
+ if you simply round the display of your final results to the
+ correct number of significant decimal digits.
- The result from an arithmetic operation with an integer and a
-floating-point value is a floating-point value with a precision equal
-to the working precision. The following program calculates the eighth
-term in Sylvester's sequence(1) using a recurrence:
+ * As general advice, avoid presenting numerical data in a manner that
+ implies better precision than is actually the case.
- $ gawk -M 'BEGIN {
- > s = 2.0
- > for (i = 1; i <= 7; i++)
- > s = s * (s - 1) + 1
- > print s
- > }'
- -| 113423713055421845118910464
-
- The output differs from the actual number,
-113,423,713,055,421,844,361,000,443, because the default precision of
-53 bits is not enough to represent the floating-point results exactly.
-You can either increase the precision (100 bits is enough in this
-case), or replace the floating-point constant `2.0' with an integer, to
-perform all computations using integer arithmetic to get the correct
-output.
-
- It will sometimes be necessary for `gawk' to implicitly convert an
-arbitrary precision integer into an arbitrary precision floating-point
-value. This is primarily because the MPFR library does not always
-provide the relevant interface to process arbitrary precision integers
-or mixed-mode numbers as needed by an operation or function. In such a
-case, the precision is set to the minimum value necessary for exact
-conversion, and the working precision is not used for this purpose. If
-this is not what you need or want, you can employ a subterfuge like
-this:
-
- gawk -M 'BEGIN { n = 13; print (n + 0.0) % 2.0 }'
-
- You can avoid this issue altogether by specifying the number as a
-floating-point value to begin with:
-
- gawk -M 'BEGIN { n = 13.0; print n % 2.0 }'
-
- Note that for the particular example above, it is likely best to
-just use the following:
-
- gawk -M 'BEGIN { n = 13; print n % 2 }'
-
- ---------- Footnotes ----------
-
- (1) Weisstein, Eric W. `Sylvester's Sequence'. From MathWorld--A
-Wolfram Web Resource.
-`http://mathworld.wolfram.com/SylvestersSequence.html'

File: gawk.info, Node: Dynamic Extensions, Next: Language History, Prev: Arbitrary Precision Arithmetic, Up: Top
@@ -21602,6 +22298,8 @@ sample extensions are automatically built and installed when `gawk' is.
* Extension Samples:: The sample extensions that ship with
`gawk'.
* gawkextlib:: The `gawkextlib' project.
+* Extension summary:: Extension summary.
+* Extension Exercises:: Exercises.

File: gawk.info, Node: Extension Intro, Next: Plugin License, Up: Dynamic Extensions
@@ -21655,7 +22353,8 @@ File: gawk.info, Node: Extension Mechanism Outline, Next: Extension API Descri
Communication between `gawk' and an extension is two-way. First, when
an extension is loaded, it is passed a pointer to a `struct' whose
-fields are function pointers. This is shown in *note load-extension::.
+fields are function pointers. This is shown in *note
+figure-load-extension::.
API
Struct
@@ -21687,7 +22386,7 @@ Figure 16.1: Loading The Extension
function pointers, at runtime, without needing (link-time) access to
`gawk''s symbols. One of these function pointers is to a function for
"registering" new built-in functions. This is shown in *note
-load-new-function::.
+figure-load-new-function::.
register_ext_func({ "chdir", do_chdir, 1 });
@@ -21707,7 +22406,7 @@ Figure 16.2: Loading The New Function
with `gawk' by passing function pointers to the functions that provide
the new feature (`do_chdir()', for example). `gawk' associates the
function pointer with a name and can then call it, using a defined
-calling convention. This is shown in *note call-new-function::.
+calling convention. This is shown in *note figure-call-new-function::.
BEGIN {
chdir("/path") (*fnptr)(1);
@@ -21728,9 +22427,9 @@ Figure 16.3: Calling The New Function
the API `struct' to do its work, such as updating variables or arrays,
printing messages, setting `ERRNO', and so on.
- Convenience macros in the `gawkapi.h' header file make calling
-through the function pointers look like regular function calls so that
-extension code is quite readable and understandable.
+ Convenience macros make calling through the function pointers look
+like regular function calls so that extension code is quite readable
+and understandable.
Although all of this sounds somewhat complicated, the result is that
extension code is quite straightforward to write and to read. You can
@@ -21757,7 +22456,10 @@ File: gawk.info, Node: Extension API Description, Next: Finding Extensions, P
16.4 API Description
====================
-This (rather large) minor node describes the API in detail.
+C or C++ code for an extension must include the header file
+`gawkapi.h', which declares the functions and defines the data types
+used to communicate with `gawk'. This (rather large) minor node
+describes the API in detail.
* Menu:
@@ -21789,7 +22491,7 @@ through function pointers passed into your extension.
API function pointers are provided for the following kinds of
operations:
- * Registrations functions. You may register:
+ * Registration functions. You may register:
- extension functions,
- exit callbacks,
@@ -21841,6 +22543,7 @@ operations:
C Entity Header File
-------------------------------------------
`EOF' `<stdio.h>'
+ Values for `errno' `<errno.h>'
`FILE' `<stdio.h>'
`NULL' `<stddef.h>'
`memcpy()' `<string.h>'
@@ -21855,9 +22558,6 @@ operations:
a portability hodge-podge as can be seen in some parts of the
`gawk' source code.
- To pass reasonable integer values for `ERRNO', you will also need
- to include `<errno.h>'.
-
* The `gawkapi.h' file may be included more than once without ill
effect. Doing so, however, is poor coding practice.
@@ -21877,7 +22577,7 @@ operations:
* The API defines several simple `struct's that map values as seen
from `awk'. A value can be a `double', a string, or an array (as
in multidimensional arrays, or when creating a new array). String
- values maintain both pointer and length since embedded `NUL'
+ values maintain both pointer and length since embedded NUL
characters are allowed.
NOTE: By intent, strings are maintained using the current
@@ -22001,7 +22701,7 @@ that use them.
indicates what is in the `union'.
Representing numbers is easy--the API uses a C `double'. Strings
-require more work. Since `gawk' allows embedded `NUL' bytes in string
+require more work. Since `gawk' allows embedded NUL bytes in string
values, a string must be represented as a pair containing a
data-pointer and length. This is the `awk_string_t' type.
@@ -22081,7 +22781,7 @@ Requested: Scalar Scalar Scalar false false
Value false false false false
Cookie
-Table 16.1: Value Types Returned
+Table 16.1: API Value Types Returned

File: gawk.info, Node: Memory Allocation Functions, Next: Constructor Functions, Prev: Requesting Values, Up: Extension API Description
@@ -22124,6 +22824,7 @@ not return a value.
`#define emalloc(pointer, type, size, message) ...'
The arguments to this macro are as follows:
+
`pointer'
The pointer variable to point at the allocated storage.
@@ -22275,7 +22976,7 @@ File: gawk.info, Node: Exit Callback Functions, Next: Extension Version String
..............................................
An "exit callback" function is a function that `gawk' calls before it
-exits. Such functions are useful if you have general "clean up" tasks
+exits. Such functions are useful if you have general "cleanup" tasks
that should be performed in your extension (such as closing data base
connections or other resource deallocations). You can register such a
function with `gawk' using the following function.
@@ -22283,6 +22984,7 @@ function with `gawk' using the following function.
`void awk_atexit(void (*funcp)(void *data, int exit_status),'
` void *arg0);'
The parameters are:
+
`funcp'
A pointer to the function to be called before `gawk' exits.
The `data' parameter will be the original value of `arg0'.
@@ -22371,7 +23073,8 @@ used for `RT', if any.
A pointer to your `XXX_take_control_of()' function.
`awk_const struct input_parser *awk_const next;'
- This pointer is used by `gawk'. The extension cannot modify it.
+ This is for use by `gawk'; therefore it is marked `awk_const' so
+ that the extension cannot modify it.
The steps are as follows:
@@ -22410,8 +23113,8 @@ as follows:
Otherwise, it will.
`struct stat sbuf;'
- If file descriptor is valid, then `gawk' will have filled in this
- structure via a call to the `fstat()' system call.
+ If the file descriptor is valid, then `gawk' will have filled in
+ this structure via a call to the `fstat()' system call.
The `XXX_can_take_file()' function should examine these fields and
decide if the input parser should be used for the file. The decision
@@ -22574,8 +23277,8 @@ an extension to take over the output to a file opened with the `>' or
false otherwise.
`awk_const struct output_wrapper *awk_const next;'
- This is for use by `gawk'; therefore they are marked `awk_const'
- so that the extension cannot modify them.
+ This is for use by `gawk'; therefore it is marked `awk_const' so
+ that the extension cannot modify it.
The `awk_output_buf_t' structure looks like this:
@@ -22632,9 +23335,9 @@ in the `awk_output_buf_t'. The data members are as follows:
the `name' and `mode' fields, and any additional state (such as `awk'
variable values) that is appropriate.
- When `gawk' calls `XXX_take_control_of()', it should fill in the
-other fields, as appropriate, except for `fp', which it should just use
-normally.
+ When `gawk' calls `XXX_take_control_of()', that function should fill
+in the other fields, as appropriate, except for `fp', which it should
+just use normally.
You register your output wrapper with the following function:
@@ -22671,7 +23374,7 @@ structures as described earlier.
`awk_bool_t (*can_take_two_way)(const char *name);'
This function returns true if it wants to take over two-way I/O
- for this filename. It should not change any state (variable
+ for this file name. It should not change any state (variable
values, etc.) within `gawk'.
`awk_bool_t (*take_control_of)(const char *name,'
@@ -22682,8 +23385,8 @@ structures as described earlier.
respectively. These structures were described earlier.
`awk_const struct two_way_processor *awk_const next;'
- This is for use by `gawk'; therefore they are marked `awk_const'
- so that the extension cannot modify them.
+ This is for use by `gawk'; therefore it is marked `awk_const' so
+ that the extension cannot modify it.
As with the input parser and output processor, you provide "yes I
can take this" and "take over for this" functions,
@@ -22852,7 +23555,7 @@ was discussed earlier, in *note General Data Types::.
`awk_bool_t sym_update_scalar(awk_scalar_t cookie, awk_value_t *value);'
Update the value associated with a scalar cookie. Return false if
- the new value is not one of `AWK_STRING' or `AWK_NUMBER'. Here
+ the new value is not of type `AWK_STRING' or `AWK_NUMBER'. Here
too, the built-in variables may not be updated.
It is not obvious at first glance how to work with scalar cookies or
@@ -22967,9 +23670,10 @@ follows:
`awk_bool_t create_value(awk_value_t *value, awk_value_cookie_t *result);'
Create a cached string or numeric value from `value' for efficient
- later assignment. Only `AWK_NUMBER' and `AWK_STRING' values are
- allowed. Any other type is rejected. While `AWK_UNDEFINED' could
- be allowed, doing so would result in inferior performance.
+ later assignment. Only values of type `AWK_NUMBER' and
+ `AWK_STRING' are allowed. Any other type is rejected. While
+ `AWK_UNDEFINED' could be allowed, doing so would result in
+ inferior performance.
`awk_bool_t release_value(awk_value_cookie_t vc);'
Release the memory associated with a value cookie obtained from
@@ -23023,13 +23727,13 @@ if `awk' code assigns a new value to `VAR1', are all the others be
changed too?"
That's a great question. The answer is that no, it's not a problem.
-Internally, `gawk' uses reference-counted strings. This means that many
-variables can share the same string value, and `gawk' keeps track of
-the usage. When a variable's value changes, `gawk' simply decrements
-the reference count on the old value and updates the variable to use
-the new value.
+Internally, `gawk' uses "reference-counted strings". This means that
+many variables can share the same string value, and `gawk' keeps track
+of the usage. When a variable's value changes, `gawk' simply
+decrements the reference count on the old value and updates the
+variable to use the new value.
- Finally, as part of your clean up action (*note Exit Callback
+ Finally, as part of your cleanup action (*note Exit Callback
Functions::) you should release any cached values that you created,
using `release_value()'.
@@ -23170,7 +23874,7 @@ The following functions relate to individual array elements.
` const awk_value_t *const value);'
In the array represented by `a_cookie', create or modify the
element whose index is given by `index'. The `ARGV' and `ENVIRON'
- arrays may not be changed.
+ arrays may not be changed, although the `PROCINFO' array can be.
`awk_bool_t set_array_element_by_elem(awk_array_t a_cookie,'
` awk_element_t element);'
@@ -23408,8 +24112,8 @@ code:
Thus, the correct way to build an array is to work "top down."
Create the array, and immediately install it in `gawk''s symbol
table using `sym_update()', or install it as an element in a
- previously existing array using `set_element()'. We show example
- code shortly.
+ previously existing array using `set_array_element()'. We show
+ example code shortly.
2. Due to gawk internals, after using `sym_update()' to install an
array into `gawk', you have to retrieve the array cookie from the
@@ -23599,13 +24303,15 @@ The API provides access to several variables that describe whether the
corresponding command-line options were enabled when `gawk' was
invoked. The variables are:
+`do_debug'
+ This variable is true if `gawk' was invoked with `--debug' option.
+
`do_lint'
This variable is true if `gawk' was invoked with `--lint' option
(*note Options::).
-`do_traditional'
- This variable is true if `gawk' was invoked with `--traditional'
- option.
+`do_mpfr'
+ This variable is true if `gawk' was invoked with `--bignum' option.
`do_profile'
This variable is true if `gawk' was invoked with `--profile'
@@ -23615,11 +24321,9 @@ invoked. The variables are:
This variable is true if `gawk' was invoked with `--sandbox'
option.
-`do_debug'
- This variable is true if `gawk' was invoked with `--debug' option.
-
-`do_mpfr'
- This variable is true if `gawk' was invoked with `--bignum' option.
+`do_traditional'
+ This variable is true if `gawk' was invoked with `--traditional'
+ option.
The value of `do_lint' can change if `awk' code modifies the `LINT'
built-in variable (*note Built-in Variables::). The others should not
@@ -24219,7 +24923,9 @@ for loading each function into `gawk':
static awk_ext_func_t func_table[] = {
{ "chdir", do_chdir, 1 },
{ "stat", do_stat, 2 },
+ #ifndef __MINGW32__
{ "fts", do_fts, 3 },
+ #endif
};
Each extension must have a routine named `dl_load()' to load
@@ -24230,8 +24936,7 @@ everything that needs to be loaded. It is simplest to use the
dl_load_func(func_table, filefuncs, "")
- And that's it! As an exercise, consider adding functions to
-implement system calls such as `chown()', `chmod()', and `umask()'.
+ And that's it!
---------- Footnotes ----------
@@ -24284,8 +24989,8 @@ create a GNU/Linux shared library:
}
The `AWKLIBPATH' environment variable tells `gawk' where to find
-shared libraries (*note Finding Extensions::). We set it to the
-current directory and run the program:
+extensions (*note Finding Extensions::). We set it to the current
+directory and run the program:
$ AWKLIBPATH=$PWD gawk -f testff.awk
-| /tmp
@@ -24315,7 +25020,7 @@ current directory and run the program:
---------- Footnotes ----------
(1) In practice, you would probably want to use the GNU
-Autotools--Automake, Autoconf, Libtool, and Gettext--to configure and
+Autotools--Automake, Autoconf, Libtool, and `gettext'--to configure and
build your libraries. Instructions for doing so are beyond the scope of
this Info file. *Note gawkextlib::, for WWW links to the tools.
@@ -24358,7 +25063,7 @@ File: gawk.info, Node: Extension Sample File Functions, Next: Extension Sample
The `filefuncs' extension provides three different functions, as
follows: The usage is:
-`@load "filefuncs"'
+@load "filefuncs"
This is how you load the extension.
`result = chdir("/some/directory")'
@@ -24367,7 +25072,7 @@ follows: The usage is:
success or less than zero upon error. In the latter case it
updates `ERRNO'.
-`result = stat("/some/path", statdata [, follow])'
+`result = stat("/some/path", statdata' [`, follow']`)'
The `stat()' function provides a hook into the `stat()' system
call. It returns zero upon success or less than zero upon error.
In the latter case it updates `ERRNO'.
@@ -24379,52 +25084,36 @@ follows: The usage is:
successful, `stat()' fills the `statdata' array with information
retrieved from the filesystem, as follows:
- `statdata["name"]' The name of the file.
- `statdata["dev"]' Corresponds to the `st_dev' field in
- the `struct stat'.
- `statdata["ino"]' Corresponds to the `st_ino' field in
- the `struct stat'.
- `statdata["mode"]' Corresponds to the `st_mode' field in
- the `struct stat'.
- `statdata["nlink"]' Corresponds to the `st_nlink' field in
- the `struct stat'.
- `statdata["uid"]' Corresponds to the `st_uid' field in
- the `struct stat'.
- `statdata["gid"]' Corresponds to the `st_gid' field in
- the `struct stat'.
- `statdata["size"]' Corresponds to the `st_size' field in
- the `struct stat'.
- `statdata["atime"]' Corresponds to the `st_atime' field in
- the `struct stat'.
- `statdata["mtime"]' Corresponds to the `st_mtime' field in
- the `struct stat'.
- `statdata["ctime"]' Corresponds to the `st_ctime' field in
- the `struct stat'.
- `statdata["rdev"]' Corresponds to the `st_rdev' field in
- the `struct stat'. This element is
- only present for device files.
- `statdata["major"]' Corresponds to the `st_major' field in
- the `struct stat'. This element is
- only present for device files.
- `statdata["minor"]' Corresponds to the `st_minor' field in
- the `struct stat'. This element is
- only present for device files.
- `statdata["blksize"]' Corresponds to the `st_blksize' field
- in the `struct stat', if this field is
- present on your system. (It is present
- on all modern systems that we know of.)
- `statdata["pmode"]' A human-readable version of the mode
- value, such as printed by `ls'. For
- example, `"-rwxr-xr-x"'.
- `statdata["linkval"]' If the named file is a symbolic link,
- this element will exist and its value
- is the value of the symbolic link
- (where the symbolic link points to).
- `statdata["type"]' The type of the file as a string. One
- of `"file"', `"blockdev"', `"chardev"',
- `"directory"', `"socket"', `"fifo"',
- `"symlink"', `"door"', or `"unknown"'.
- Not all systems support all file types.
+     Subscript    Field in `struct stat'               File type
+     --------------------------------------------------------------
+     `"name"'     The file name                        All
+     `"dev"'      `st_dev'                             All
+     `"ino"'      `st_ino'                             All
+     `"mode"'     `st_mode'                            All
+     `"nlink"'    `st_nlink'                           All
+     `"uid"'      `st_uid'                             All
+     `"gid"'      `st_gid'                             All
+     `"size"'     `st_size'                            All
+     `"atime"'    `st_atime'                           All
+     `"mtime"'    `st_mtime'                           All
+     `"ctime"'    `st_ctime'                           All
+     `"rdev"'     `st_rdev'                            Device files
+     `"major"'    `st_major'                           Device files
+     `"minor"'    `st_minor'                           Device files
+     `"blksize"'  `st_blksize'                         All
+     `"pmode"'    A human-readable version of the      All
+                  mode value, such as printed by
+                  `ls'. For example,
+                  `"-rwxr-xr-x"'
+     `"linkval"'  The value of the symbolic link       Symbolic
+                                                       links
+     `"type"'     The type of the file as a string.    All
+                  One of `"file"', `"blockdev"',
+                  `"chardev"', `"directory"',
+                  `"socket"', `"fifo"', `"symlink"',
+                  `"door"', or `"unknown"'. Not
+                  all systems support all file
+                  types.
`flags = or(FTS_PHYSICAL, ...)'
`result = fts(pathlist, flags, filedata)'
@@ -24442,7 +25131,7 @@ requested hierarchies.
The arguments are as follows:
`pathlist'
- An array of filenames. The element values are used; the index
+ An array of file names. The element values are used; the index
values are ignored.
`flags'
@@ -24558,10 +25247,10 @@ constant (`FNM_NOMATCH'), and an array of flag values named `FNM'.
The arguments to `fnmatch()' are:
`pattern'
- The filename wildcard to match.
+ The file name wildcard to match.
`string'
- The filename string.
+ The file name string.
`flag'
Either zero, or the bitwise OR of one or more of the flags in the
@@ -24569,18 +25258,14 @@ constant (`FNM_NOMATCH'), and an array of flag values named `FNM'.
The flags are as follows:
-`FNM["CASEFOLD"]' Corresponds to the `FNM_CASEFOLD' flag as defined in
- `fnmatch()'.
-`FNM["FILE_NAME"]' Corresponds to the `FNM_FILE_NAME' flag as defined
- in `fnmatch()'.
-`FNM["LEADING_DIR"]' Corresponds to the `FNM_LEADING_DIR' flag as defined
- in `fnmatch()'.
-`FNM["NOESCAPE"]' Corresponds to the `FNM_NOESCAPE' flag as defined in
- `fnmatch()'.
-`FNM["PATHNAME"]' Corresponds to the `FNM_PATHNAME' flag as defined in
- `fnmatch()'.
-`FNM["PERIOD"]' Corresponds to the `FNM_PERIOD' flag as defined in
- `fnmatch()'.
+Array element Corresponding flag defined by `fnmatch()'
+--------------------------------------------------------------------------
+`FNM["CASEFOLD"]' `FNM_CASEFOLD'
+`FNM["FILE_NAME"]' `FNM_FILE_NAME'
+`FNM["LEADING_DIR"]' `FNM_LEADING_DIR'
+`FNM["NOESCAPE"]' `FNM_NOESCAPE'
+`FNM["PATHNAME"]' `FNM_PATHNAME'
+`FNM["PERIOD"]' `FNM_PERIOD'
Here is an example:
@@ -24657,8 +25342,8 @@ standard output to a temporary file configured to have the same owner
and permissions as the original. After the file has been processed,
the extension restores standard output to its original destination. If
`INPLACE_SUFFIX' is not an empty string, the original file is linked to
-a backup filename created by appending that suffix. Finally, the
-temporary file is renamed to the original filename.
+a backup file name created by appending that suffix. Finally, the
+temporary file is renamed to the original file name.
If any error occurs, the extension issues a fatal error to terminate
processing immediately without damaging the original file.
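The sequence described above (write to a temporary file, link the original to a backup name, rename the temporary file) can be sketched in plain shell. This is only an illustration of the idea, not the extension's implementation: the scratch directory, the file contents, and the `.bak' suffix are assumptions made for the example, and the sketch does not copy the original's owner and permissions as the extension does.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
printf 'foo one\nfoo two\n' > "$workdir/file1"

# 1. Send the rewritten text to a temporary file in the same directory.
tmp=$(mktemp "$workdir/tmp.XXXXXX")
awk '{ gsub(/foo/, "bar"); print }' "$workdir/file1" > "$tmp"

# 2. Link the original to a backup name (hypothetical suffix ".bak").
ln "$workdir/file1" "$workdir/file1.bak"

# 3. Rename the temporary file over the original.
mv -f "$tmp" "$workdir/file1"

# Capture both versions for inspection, then clean up.
result=$(cat "$workdir/file1")
backup=$(cat "$workdir/file1.bak")
rm -rf "$workdir"
printf '%s\n' "$result"
```

Because the backup is a hard link made before the rename, it keeps pointing at the original data after the new file takes over the name.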
@@ -24672,9 +25357,6 @@ processing immediately without damaging the original file.
$ gawk -i inplace -v INPLACE_SUFFIX=.bak '{ gsub(/foo/, "bar") }
> { print }' file1 file2 file3
- We leave it as an exercise to write a wrapper script that presents an
-interface similar to `sed -i'.
-

File: gawk.info, Node: Extension Sample Ord, Next: Extension Sample Readdir, Prev: Extension Sample Inplace, Up: Extension Samples
@@ -24718,10 +25400,11 @@ on the command line (or with `getline'), they are read, with each entry
returned as a record.
The record consists of three fields. The first two are the inode
-number and the filename, separated by a forward slash character. On
+number and the file name, separated by a forward slash character. On
systems where the directory entry contains the file type, the record
has a third field (also separated by a slash) which is a single letter
-indicating the type of the file:
+indicating the type of the file. The letters and file types are shown
+in *note table-readdir-file-types::.
Letter File Type
--------------------------------------------------------------------------
@@ -24734,6 +25417,8 @@ Letter File Type
`s' Socket
`u' Anything else (unknown)
+Table 16.2: File Types Returned By `readdir()'
+
On systems without the file type information, the third field is
always `u'.
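As a minimal sketch of how records in this format might be consumed, the following splits slash-separated records into their fields. It works on canned input rather than loading the `readdir' extension; the inode numbers and file names are invented for illustration, and defaulting a missing third field to `u' is a defensive assumption of the sketch.

```shell
# Canned records in inode/name/type form; the values are made up.
records='1234/notes.txt/f
5678/src/d
9012/mystery'

# Split on "/" and fall back to "u" when no type field is present.
parsed=$(printf '%s\n' "$records" | awk -F/ '{
    kind = (NF >= 3) ? $3 : "u"
    printf "inode=%s name=%s type=%s\n", $1, $2, kind
}')
printf '%s\n' "$parsed"
```

With the real extension loaded, the same field-splitting approach should apply to the records it returns.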
@@ -24765,10 +25450,10 @@ unwary. Here is an example:
BEGIN {
REVOUT = 1
- print "hello, world" > "/dev/stdout"
+ print "don't panic" > "/dev/stdout"
}
- The output from this program is: `dlrow ,olleh'.
+ The output from this program is: `cinap t'nod'.

File: gawk.info, Node: Extension Sample Rev2way, Next: Extension Sample Read write array, Prev: Extension Sample Revout, Up: Extension Samples
@@ -24786,12 +25471,14 @@ example shows how to use it:
BEGIN {
cmd = "/magic/mirror"
- print "hello, world" |& cmd
+ print "don't panic" |& cmd
cmd |& getline result
print result
close(cmd)
}
+ The output from this program is: `cinap t'nod'.
+
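The expected output can be checked without either extension. The following sketch reverses a line with an ordinary `awk' loop; it is a stand-in for illustration, not how the `revoutput' or `revtwoway' extensions are implemented, and it reverses byte by byte in `awk' versions that are not multibyte-aware.

```shell
# Reverse each input line character by character in plain awk.
reversed=$(printf '%s\n' "don't panic" | awk '{
    out = ""
    for (i = length($0); i >= 1; i--)
        out = out substr($0, i, 1)
    print out
}')
printf '%s\n' "$reversed"
```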

File: gawk.info, Node: Extension Sample Read write array, Next: Extension Sample Readfile, Prev: Extension Sample Rev2way, Up: Extension Samples
@@ -24803,8 +25490,8 @@ The `rwarray' extension adds two functions, named `writea()' and
`ret = writea(file, array)'
This function takes a string argument, which is the name of the
- file to which dump the array, and the array itself as the second
- argument. `writea()' understands multidimensional arrays. It
+ file to which to dump the array, and the array itself as the
+ second argument. `writea()' understands arrays of arrays. It
returns one on success, or zero upon failure.
`ret = reada(file, array)'
@@ -24887,9 +25574,8 @@ File: gawk.info, Node: Extension Sample Time, Prev: Extension Sample API Tests
16.7.12 Extension Time Functions
--------------------------------
-These functions can be used either by invoking `gawk' with a
-command-line argument of `-l time' or by inserting `@load "time"' in
-your script.
+The `time' extension adds two functions, named `gettimeofday()' and
+`sleep()', as follows:
`@load "time"'
This is how you load the extension.
@@ -24901,7 +25587,7 @@ your script.
have sub-second precision, but the actual precision may vary based
on the platform. If the standard C `gettimeofday()' system call
is available on this platform, then it simply returns the value.
- Otherwise, if on Windows, it tries to use
+ Otherwise, if on MS-Windows, it tries to use
`GetSystemTimeAsFileTime()'.
`result = sleep(SECONDS)'
@@ -24914,7 +25600,7 @@ your script.
delay.

-File: gawk.info, Node: gawkextlib, Prev: Extension Samples, Up: Dynamic Extensions
+File: gawk.info, Node: gawkextlib, Next: Extension summary, Prev: Extension Samples, Up: Dynamic Extensions
16.8 The `gawkextlib' Project
=============================
@@ -24943,7 +25629,7 @@ Time::) was originally from this project but has been moved in to the
main `gawk' distribution.
You can check out the code for the `gawkextlib' project using the
-GIT (http://git-scm.com) distributed source code control system. The
+Git (http://git-scm.com) distributed source code control system. The
command is as follows:
git clone git://git.code.sf.net/p/gawkextlib/code gawkextlib-code
@@ -24954,7 +25640,7 @@ parser library installed in order to build and use the XML extension.
In addition, you must have the GNU Autotools installed (Autoconf
(http://www.gnu.org/software/autoconf), Automake
(http://www.gnu.org/software/automake), Libtool
-(http://www.gnu.org/software/libtool), and Gettext
+(http://www.gnu.org/software/libtool), and GNU `gettext'
(http://www.gnu.org/software/gettext)).
The simple recipe for building and testing `gawkextlib' is as
@@ -24984,6 +25670,115 @@ users, please consider doing so through the `gawkextlib' project. See
the project's web site for more information.

+File: gawk.info, Node: Extension summary, Next: Extension Exercises, Prev: gawkextlib, Up: Dynamic Extensions
+
+16.9 Summary
+============
+
+ * You can write extensions (sometimes called plug-ins) for `gawk' in
+ C or C++ using the Application Programming Interface (API) defined
+ by the `gawk' developers.
+
+ * Extensions must have a license compatible with the GNU General
+ Public License (GPL), and they must assert that fact by declaring
+ a variable named `plugin_is_GPL_compatible'.
+
+ * Communication between `gawk' and an extension is two-way. `gawk'
+ passes a `struct' to the extension which contains various data
+ fields and function pointers. The extension can then call into
+ `gawk' via the supplied function pointers to accomplish certain
+ tasks.
+
+ * One of these tasks is to "register" the name and implementation of
+ a new `awk'-level function with `gawk'. The implementation takes
+ the form of a C function pointer with a defined signature. By
+ convention, implementation functions are named `do_XXXX()' for
+ some `awk'-level function `XXXX()'.
+
+ * The API is defined in a header file named `gawkapi.h'. You must
+ include a number of standard header files _before_ including it in
+ your source file.
+
+ * API function pointers are provided for the following kinds of
+ operations:
+
+ * Registration functions. You may register extension functions,
+ exit callbacks, a version string, input parsers, output
+ wrappers, and two-way processors.
+
+ * Printing fatal, warning, and "lint" warning messages.
+
+ * Updating `ERRNO', or unsetting it.
+
+ * Accessing parameters, including converting an undefined
+ parameter into an array.
+
+ * Symbol table access: retrieving a global variable, creating
+ one, or changing one.
+
+ * Allocating, reallocating, and releasing memory.
+
+ * Creating and releasing cached values; this provides an
+ efficient way to use values for multiple variables and can be
+ a big performance win.
+
+ * Manipulating arrays: retrieving, adding, deleting, and
+ modifying elements; getting the count of elements in an array;
+ creating a new array; clearing an array; and flattening an
+ array for easy C style looping over all its indices and
+ elements.
+
+ * The API defines a number of standard data types for representing
+ `awk' values, array elements, and arrays.
+
+ * The API provides convenience functions for constructing values. It
+ also provides memory management functions to ensure compatibility
+ between memory allocated by `gawk' and memory allocated by an
+ extension.
+
+ * _All_ memory passed from `gawk' to an extension must be treated as
+ read-only by the extension.
+
+ * _All_ memory passed from an extension to `gawk' must come from the
+ API's memory allocation functions. `gawk' takes responsibility for
+ the memory and will release it when appropriate.
+
+ * The API provides information about the running version of `gawk' so
+ that an extension can make sure it is compatible with the `gawk'
+ that loaded it.
+
+ * It is easiest to start a new extension by copying the boilerplate
+ code described in this major node. Macros in the `gawkapi.h'
+ header file make this easier to do.
+
+ * The `gawk' distribution includes a number of small but useful
+ sample extensions. The `gawkextlib' project includes several more,
+ larger, extensions. If you wish to write an extension and
+ contribute it to the community of `gawk' users, the `gawkextlib'
+ project should be the place to do so.
+
+
+
+File: gawk.info, Node: Extension Exercises, Prev: Extension summary, Up: Dynamic Extensions
+
+16.10 Exercises
+===============
+
+ 1. Add functions to implement system calls such as `chown()',
+ `chmod()', and `umask()' to the file operations extension
+ presented in *note Internal File Ops::.
+
+ 2. (Hard.) How would you provide namespaces in `gawk', so that the
+ names of functions in different extensions don't conflict with
+ each other? If you come up with a really good scheme, contact the
+ `gawk' maintainer to tell him about it.
+
+ 3. Write a wrapper script that provides an interface similar to `sed
+ -i' for the "inplace" extension presented in *note Extension
+ Sample Inplace::.
+
+
+
File: gawk.info, Node: Language History, Next: Installation, Prev: Dynamic Extensions, Up: Top
Appendix A The Evolution of the `awk' Language
@@ -24994,7 +25789,7 @@ the POSIX specification. Many long-time `awk' users learned `awk'
programming with the original `awk' implementation in Version 7 Unix.
(This implementation was the basis for `awk' in Berkeley Unix, through
4.3-Reno. Subsequent versions of Berkeley Unix, and some systems
-derived from 4.4BSD-Lite, use various versions of `gawk' for their
+derived from 4.4BSD-Lite, used various versions of `gawk' for their
`awk'.) This major node briefly describes the evolution of the `awk'
language, with cross-references to other parts of the Info file where
you can find more information.
@@ -25014,6 +25809,7 @@ you can find more information.
* Common Extensions:: Common Extensions Summary.
* Ranges and Locales:: How locales used to affect regexp ranges.
* Contributors:: The major contributors to `gawk'.
+* History summary:: History summary.

File: gawk.info, Node: V7/SVR3.1, Next: SVR4, Up: Language History
@@ -25436,8 +26232,8 @@ in POSIX `awk', in the order they were added to `gawk'.
* The `next file' statement became `nextfile' (*note Nextfile
Statement::).
- * The `fflush()' function from the Bell Laboratories research
- version of `awk' (*note I/O Functions::).
+ * The `fflush()' function from Brian Kernighan's `awk' (then at Bell
+ Laboratories; *note I/O Functions::).
* New command line options:
@@ -25445,8 +26241,9 @@ in POSIX `awk', in the order they were added to `gawk'.
available in the original Version 7 Unix version of `awk'
(*note V7/SVR3.1::).
- - The `-m' option from the Bell Laboratories research version
- of `awk' This was later removed.
+ - The `-m' option from Brian Kernighan's `awk'. (He was still
+ at Bell Laboratories at the time.) This was later removed
+ from both his `awk' and from `gawk'.
- The `--re-interval' option to provide interval expressions in
regexps (*note Regexp Operators::).
@@ -25457,7 +26254,7 @@ in POSIX `awk', in the order they were added to `gawk'.
* The use of GNU Autoconf to control the configuration process
(*note Quick Installation::).
- * Amiga support.
+ * Amiga support. This has since been removed.
Version 3.1 of `gawk' introduced the following features:
@@ -25516,7 +26313,7 @@ in POSIX `awk', in the order they were added to `gawk'.
* The support for `next file' as two words was removed completely
(*note Nextfile Statement::).
- * Additional commnd line options (*note Options::):
+ * Additional command-line options (*note Options::):
- The `--dump-variables' option to print a list of all global
variables.
@@ -25548,7 +26345,8 @@ in POSIX `awk', in the order they were added to `gawk'.
* Tandem support. This was later removed.
- * The Atari port became officially unsupported.
+ * The Atari port became officially unsupported and was later removed
+ entirely.
* The source code changed to use ISO C standard-style function
definitions.
@@ -25610,8 +26408,8 @@ in POSIX `awk', in the order they were added to `gawk'.
output redirections (*note I/O Functions::).
* The `isarray()' function which distinguishes if an item is an array
- or not, to make it possible to traverse multidimensional arrays
- (*note Type Functions::).
+ or not, to make it possible to traverse arrays of arrays (*note
+ Type Functions::).
* The `patsplit()' function which gives the same capability as
`FPAT', for splitting (*note String Functions::).
@@ -25717,8 +26515,8 @@ in POSIX `awk', in the order they were added to `gawk'.
- The `-R' option was removed.
- * Support for high precision arithmetic with MPFR. (*note Gawk and
- MPFR::).
+ * Support for high precision arithmetic with MPFR (*note Arbitrary
+ Precision Arithmetic::).
* The `and()', `or()' and `xor()' functions changed to allow any
number of arguments, with a minimum of two (*note Bitwise
@@ -25866,7 +26664,7 @@ and its rationale
(http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html#tag_21_09_03_05).

-File: gawk.info, Node: Contributors, Prev: Ranges and Locales, Up: Language History
+File: gawk.info, Node: Contributors, Next: History summary, Prev: Ranges and Locales, Up: Language History
A.9 Major Contributors to `gawk'
================================
@@ -25912,8 +26710,8 @@ Info file, in approximate chronological order:
* Michal Jaegermann provided the port to Atari systems and its
documentation. (This port is no longer supported.) He continues
- to provide portability checking with DEC Alpha systems, and has
- done a lot of work to make sure `gawk' works on non-32-bit systems.
+ to provide portability checking, and has done a lot of work to
+ make sure `gawk' works on non-32-bit systems.
* Fred Fish provided the port to Amiga systems and its documentation.
(With Fred's sad passing, this is no longer supported.)
@@ -25978,8 +26776,7 @@ Info file, in approximate chronological order:
- The modifications to convert `gawk' into a byte-code
interpreter, including the debugger.
- - The addition of true multidimensional arrays. *note Arrays
- of Arrays::.
+ - The addition of true arrays of arrays.
- The additional modifications for support of arbitrary
precision arithmetic.
@@ -25994,6 +26791,9 @@ Info file, in approximate chronological order:
- The improved array sorting features were driven by John
together with Pat Rankin.
+ * Panos Papadopoulos contributed the original text for *note Include
+ Files::.
+
* Efraim Yawitz contributed the original text for *note Debugger::.
* The development of the extension API first released with `gawk'
@@ -26009,6 +26809,38 @@ Info file, in approximate chronological order:
1994.

+File: gawk.info, Node: History summary, Prev: Contributors, Up: Language History
+
+A.10 Summary
+============
+
+ * The `awk' language has evolved over time. The first release was
+ with V7 Unix circa 1978. In 1987, for System V Release 3.1, major
+ additions, including user-defined functions, were made to the
+ language. Additional changes were made for System V Release 4, in
+ 1989. Since then, further minor changes have happened under the
+ auspices of the POSIX standard.
+
+ * Brian Kernighan's `awk' provides a small number of extensions that
+ are implemented in common with other versions of `awk'.
+
+ * `gawk' provides a large number of extensions over POSIX `awk'.
+ They can be disabled with either the `--traditional' or `--posix'
+ option.
+
+ * The interaction of POSIX locales and regexp matching in `gawk' has
+ been confusing over the years. Today, `gawk' implements Rational
+ Range Interpretation, where ranges of the form `[a-z]' match
+ _only_ the characters numerically between `a' and `z' in the
+ machine's native character set. Usually this is ASCII, but it can
+ be EBCDIC on IBM S/390 systems.
+
+ * Many people have contributed to `gawk' development over the years.
+ We hope that the list provided in this major node is complete and
+ gives the appropriate credit where credit is due.
+
+
+
File: gawk.info, Node: Installation, Next: Notes, Prev: Language History, Up: Top
Appendix B Installing `gawk'
@@ -26029,6 +26861,7 @@ people who did the respective ports.
* Bugs:: Reporting Problems and Bugs.
* Other Versions:: Other freely available `awk'
implementations.
+* Installation summary:: Summary of installation.

File: gawk.info, Node: Gawk Distribution, Next: Unix Installation, Up: Installation
@@ -26051,7 +26884,7 @@ File: gawk.info, Node: Getting, Next: Extracting, Up: Gawk Distribution
B.1.1 Getting the `gawk' Distribution
-------------------------------------
-There are three ways to get GNU software:
+There are two ways to get GNU software:
* Copy it from someone else who already has it.
@@ -26083,7 +26916,6 @@ the GNU Zip program, `gzip'.
use `gzip' to expand the file and then use `tar' to extract it. You
can use the following pipeline to produce the `gawk' distribution:
- # Under System V, add 'o' to the tar options
gzip -d -c gawk-4.1.1.tar.gz | tar -xvpf -
On a system with GNU `tar', you can let `tar' do the decompression
@@ -26211,7 +27043,9 @@ Various `.c', `.y', and `.h' files
`doc/igawk.1'
The `troff' source for a manual page describing the `igawk'
- program presented in *note Igawk Program::.
+ program presented in *note Igawk Program::. (Since `gawk' can do
+ its own `@include' processing, neither `igawk' nor `igawk.1' is
+ installed.)
`doc/Makefile.in'
The input file used during the configuration process to generate
@@ -26219,8 +27053,8 @@ Various `.c', `.y', and `.h' files
`Makefile.am'
`*/Makefile.am'
- Files used by the GNU `automake' software for generating the
- `Makefile.in' files used by `autoconf' and `configure'.
+ Files used by the GNU Automake software for generating the
+ `Makefile.in' files used by Autoconf and `configure'.
`Makefile.in'
`aclocal.m4'
@@ -26253,11 +27087,10 @@ Various `.c', `.y', and `.h' files
contains a `Makefile.in' file, which `configure' uses to generate
a `Makefile'. `Makefile.am' is used by GNU Automake to create
`Makefile.in'. The library functions from *note Library
- Functions::, and the `igawk' program from *note Igawk Program::,
- are included as ready-to-use files in the `gawk' distribution.
- They are installed as part of the installation process. The rest
- of the programs in this Info file are available in appropriate
- subdirectories of `awklib/eg'.
+ Functions::, are included as ready-to-use files in the `gawk'
+ distribution. They are installed as part of the installation
+ process. The rest of the programs in this Info file are available
+ in appropriate subdirectories of `awklib/eg'.
`extension/*'
The source code, manual pages, and infrastructure files for the
@@ -26272,8 +27105,8 @@ Various `.c', `.y', and `.h' files
PC Installation::, for details).
`vms/*'
- Files needed for building `gawk' under VMS (*note VMS
- Installation::, for details).
+ Files needed for building `gawk' under Vax/VMS and OpenVMS (*note
+ VMS Installation::, for details).
`test/*'
A test suite for `gawk'. You can use `make check' from the
@@ -26311,8 +27144,8 @@ environment for MS-Windows.
`gawk-4.1.1'. Like most GNU software, `gawk' is configured
automatically for your system by running the `configure' program. This
program is a Bourne shell script that is generated automatically using
-GNU `autoconf'. (The `autoconf' software is described fully starting
-with *note (Autoconf)Top:: autoconf,Autoconf--Generating Automatic
+GNU Autoconf. (The Autoconf software is described fully starting with
+*note (Autoconf)Top:: autoconf,Autoconf--Generating Automatic
Configuration Scripts.)
To configure `gawk', simply run `configure':
@@ -26390,8 +27223,8 @@ command line when compiling `gawk' from scratch, including:
improvement.
`--with-whiny-user-strftime'
- Force use of the included version of the `strftime()' function for
- deficient systems.
+ Force use of the included version of the C `strftime()' function
+ for deficient systems.
Use the command `./configure --help' to see the full list of options
that `configure' supplies.
@@ -26435,9 +27268,9 @@ any constants that `configure' defined and should not have. `custom.h'
is automatically included by `config.h'.
It is also possible that the `configure' program generated by
-`autoconf' will not work on your system in some other fashion. If you
-do have a problem, the file `configure.ac' is the input for `autoconf'.
-You may be able to change this file and generate a new version of
+Autoconf will not work on your system in some other fashion. If you do
+have a problem, the file `configure.ac' is the input for Autoconf. You
+may be able to change this file and generate a new version of
`configure' that works on your system (*note Bugs::, for information on
how to report problems in configuring `gawk'). The same mechanism may
be used to send in updates to `configure.ac' and/or `custom.h'.
@@ -26466,14 +27299,14 @@ B.3.1 Installation on PC Operating Systems
This minor node covers installation and usage of `gawk' on x86 machines
running MS-DOS, any version of MS-Windows, or OS/2. In this minor
node, the term "Windows32" refers to any of Microsoft
-Windows-95/98/ME/NT/2000/XP/Vista/7.
+Windows-95/98/ME/NT/2000/XP/Vista/7/8.
- The limitations of MS-DOS (and MS-DOS shells under Windows32 or
-OS/2) has meant that various "DOS extenders" are often used with
-programs such as `gawk'. The varying capabilities of Microsoft Windows
-3.1 and Windows32 can add to the confusion. For an overview of the
-considerations, please refer to `README_d/README.pc' in the
-distribution.
+ The limitations of MS-DOS (and MS-DOS shells under the other
+operating systems) have meant that various "DOS extenders" are often
+used with programs such as `gawk'. The varying capabilities of
+Microsoft Windows 3.1 and Windows32 can add to the confusion. For an
+overview of the considerations, please refer to `README_d/README.pc' in
+the distribution.
* Menu:
@@ -26532,13 +27365,13 @@ B.3.1.2 Compiling `gawk' for PC Operating Systems
.................................................
`gawk' can be compiled for MS-DOS, Windows32, and OS/2 using the GNU
-development tools from DJ Delorie (DJGPP: MS-DOS only) or Eberhard
-Mattes (EMX: MS-DOS, Windows32 and OS/2). The file
-`README_d/README.pc' in the `gawk' distribution contains additional
-notes, and `pc/Makefile' contains important information on compilation
-options.
+development tools from DJ Delorie (DJGPP: MS-DOS only), MinGW
+(Windows32) or Eberhard Mattes (EMX: MS-DOS, Windows32 and OS/2). The
+file `README_d/README.pc' in the `gawk' distribution contains
+additional notes, and `pc/Makefile' contains important information on
+compilation options.
- To build `gawk' for MS-DOS and Windows32, copy the files in the `pc'
+To build `gawk' for MS-DOS and Windows32, copy the files in the `pc'
directory (_except_ for `ChangeLog') to the directory with the rest of
the `gawk' sources, then invoke `make' with the appropriate target name
as an argument to build `gawk'. The `Makefile' copied from the `pc'
@@ -26598,7 +27431,12 @@ other set of (self-consistent) environment variables and compiler flags.
NOTE: Ancient OS/2 ports of GNU `make' are not able to handle the
Makefiles of this package. If you encounter any problems with
`make', try GNU Make 3.79.1 or later versions. You should find
- the latest version on `ftp://hobbes.nmsu.edu/pub/os2/'.
+ the latest version on `ftp://hobbes.nmsu.edu/pub/os2/'.(1)
+
+ ---------- Footnotes ----------
+
+ (1) As of May 2014, this site is still there, but the author could
+not find a package for GNU Make.

File: gawk.info, Node: PC Testing, Next: PC Using, Prev: PC Compiling, Up: PC Installation
@@ -26639,11 +27477,11 @@ Networking::). EMX (OS/2 only) supports at least the `|&' operator.
files as described in *note AWKPATH Variable::. However, semicolons
(rather than colons) separate elements in the `AWKPATH' variable. If
`AWKPATH' is not set or is empty, then the default search path for
-MS-Windows and MS-DOS versions is `".;c:/lib/awk;c:/gnu/lib/awk"'.
+MS-Windows and MS-DOS versions is `.;c:/lib/awk;c:/gnu/lib/awk'.
The search path for OS/2 (32 bit, EMX) is determined by the prefix
directory (most likely `/usr' or `c:/usr') that has been specified as
-an option of the `configure' script like it is the case for the Unix
+an option of the `configure' script as is the case for the Unix
versions. If `c:/usr' is the prefix directory then the default search
path contains `.' and `c:/usr/share/awk'. Additionally, to support
binary distributions of `gawk' for OS/2 systems whose drive `c:' might
@@ -26651,7 +27489,7 @@ not support long file names or might not exist at all, there is a
special environment variable. If `UNIXROOT' specifies a drive then
this specific drive is also searched for program files. E.g., if
`UNIXROOT' is set to `e:' the complete default search path is
-`".;c:/usr/share/awk;e:/usr/share/awk"'.
+`.;c:/usr/share/awk;e:/usr/share/awk'.
An `sh'-like shell (as opposed to `command.com' under MS-DOS or
`cmd.exe' under MS-Windows or OS/2) may be useful for `awk' programming.
@@ -26659,10 +27497,9 @@ The DJGPP collection of tools includes an MS-DOS port of Bash, and
several shells are available for OS/2, including `ksh'.
Under MS-Windows, OS/2 and MS-DOS, `gawk' (and many other text
-programs) silently translate end-of-line `"\r\n"' to `"\n"' on input
-and `"\n"' to `"\r\n"' on output. A special `BINMODE' variable
-(c.e.) allows control over these translations and is interpreted as
-follows:
+programs) silently translates end-of-line `\r\n' to `\n' on input and
+`\n' to `\r\n' on output. A special `BINMODE' variable (c.e.) allows
+control over these translations and is interpreted as follows:
* If `BINMODE' is `"r"', or one, then binary mode is set on read
(i.e., no translations on reads).
@@ -26688,11 +27525,11 @@ and cannot be changed mid-stream.
Versions::). `mawk' and `gawk' handle `BINMODE' similarly; however,
`mawk' adds a `-W BINMODE=N' option and an environment variable that
can set `BINMODE', `RS', and `ORS'. The files `binmode[1-3].awk'
-(under `gnu/lib/awk' in some of the prepared distributions) have been
-chosen to match `mawk''s `-W BINMODE=N' option. These can be changed
-or discarded; in particular, the setting of `RS' giving the fewest
-"surprises" is open to debate. `mawk' uses `RS = "\r\n"' if binary
-mode is set on read, which is appropriate for files with the
+(under `gnu/lib/awk' in some of the prepared binary distributions) have
+been chosen to match `mawk''s `-W BINMODE=N' option. These can be
+changed or discarded; in particular, the setting of `RS' giving the
+fewest "surprises" is open to debate. `mawk' uses `RS = "\r\n"' if
+binary mode is set on read, which is appropriate for files with the
MS-DOS-style end-of-line.
To illustrate, the following examples set binary mode on writes for
@@ -26757,8 +27594,8 @@ translation of `"\r\n"', since it won't. Caveat Emptor!

File: gawk.info, Node: VMS Installation, Prev: PC Installation, Up: Non-Unix Installation
-B.3.2 How to Compile and Install `gawk' on VMS
-----------------------------------------------
+B.3.2 How to Compile and Install `gawk' on Vax/VMS and OpenVMS
+--------------------------------------------------------------
This node describes how to compile and install `gawk' under VMS. The
older designation "VMS" is used throughout to refer to OpenVMS.
@@ -26795,10 +27632,10 @@ or:
$ MMK/DESCRIPTION=[.vms]descrip.mms gawk
`MMK' is an open source, free, near-clone of `MMS' and can better
-handle `ODS-5' volumes with upper- and lowercase filenames. `MMK' is
+handle ODS-5 volumes with upper- and lowercase file names. `MMK' is
available from `https://github.com/endlesssoftware/mmk'.
- With `ODS-5' volumes and extended parsing enabled, the case of the
+ With ODS-5 volumes and extended parsing enabled, the case of the
target parameter may need to be exact.
`gawk' has been tested under VAX/VMS 7.3 and Alpha/VMS 7.3-1 using
@@ -26806,8 +27643,8 @@ Compaq C V6.4, and Alpha/VMS 7.3, Alpha/VMS 7.3-2, and IA64/VMS 8.3.
The most recent builds used HP C V7.3 on Alpha VMS 8.3 and on both Alpha
and IA64 VMS 8.4.(1)
- The `[.vms]gawk_build_steps.txt' provides information on how to build
-`gawk' into a PCSI kit that is compatible with the GNV product.
+ *Note VMS GNV::, for information on building `gawk' as a PCSI kit
+that is compatible with the GNV product.
---------- Footnotes ----------
@@ -26916,7 +27753,7 @@ has no device or directory path information in it, `gawk' looks in the
current directory first, then in the directory specified by the
translation of `AWK_LIBRARY' if the file is not found. If, after
searching in both directories, the file still is not found, `gawk'
-appends the suffix `.awk' to the filename and retries the file search.
+appends the suffix `.awk' to the file name and retries the file search.
If `AWK_LIBRARY' has no definition, a default value of `SYS$LIBRARY:'
is used for it.
@@ -27049,11 +27886,12 @@ get this information with the command `gawk --version'.
Once you have a precise problem, send email to <bug-gawk@gnu.org>.
- Using this address automatically sends a copy of your mail to me.
-If necessary, I can be reached directly at <arnold@skeeve.com>. The
-bug reporting address is preferred since the email list is archived at
-the GNU Project. _All email should be in English, since that is my
-native language._
+ The `gawk' maintainers subscribe to this address and thus they will
+receive your bug report. If necessary, the primary maintainer can be
+reached directly at <arnold@skeeve.com>. The bug reporting address is
+preferred since the email list is archived at the GNU Project. _All
+email should be in English. This is the only language understood in
+common by all the maintainers._
CAUTION: Do _not_ try to report bugs in `gawk' by posting to the
Usenet/Internet newsgroup `comp.lang.awk'. While the `gawk'
@@ -27088,7 +27926,7 @@ considered authoritative if it conflicts with this Info file.
The people maintaining the non-Unix ports of `gawk' are as follows:
MS-DOS with DJGPP Scott Deifik, <scottd.mail@sbcglobal.net>.
-MS-Windows with MINGW Eli Zaretskii, <eliz@gnu.org>.
+MS-Windows with MinGW Eli Zaretskii, <eliz@gnu.org>.
OS/2 Andreas Buening, <andreas.buening@nexgo.de>.
VMS Pat Rankin, <r.pat.rankin@gmail.com>, and John
Malmberg, <wb8tyw@qsl.net>.
@@ -27098,7 +27936,7 @@ z/OS (OS/390) Dave Pitts, <dpitts@cozx.com>.
your report to the <bug-gawk@gnu.org> email list as well.

-File: gawk.info, Node: Other Versions, Prev: Bugs, Up: Installation
+File: gawk.info, Node: Other Versions, Next: Installation summary, Prev: Bugs, Up: Installation
B.5 Other Freely Available `awk' Implementations
================================================
@@ -27194,12 +28032,13 @@ Busybox Awk
(http://busybox.net).
The OpenSolaris POSIX `awk'
- The version of `awk' in `/usr/xpg4/bin' on Solaris is more-or-less
- POSIX-compliant. It is based on the `awk' from Mortice Kern
- Systems for PCs. This author was able to make it compile and work
- under GNU/Linux with 1-2 hours of work. Making it more generally
- portable (using GNU Autoconf and/or Automake) would take more
- work, and this has not been done, at least to our knowledge.
+ The versions of `awk' in `/usr/xpg4/bin' and `/usr/xpg6/bin' on
+ Solaris are more-or-less POSIX-compliant. They are based on the
+ `awk' from Mortice Kern Systems for PCs. This author was able to
+ make this code compile and work under GNU/Linux with 1-2 hours of
+ work. Making it more generally portable (using GNU Autoconf
+ and/or Automake) would take more work, and this has not been done,
+ at least to our knowledge.
The source code used to be available from the OpenSolaris web site.
However, that project was ended and the web site shut down.
@@ -27237,6 +28076,9 @@ QSE Awk
`http://www.quiktrim.org/QTawk.html' for more information,
including the manual and a download link.
+ The project may also be frozen; no new code changes have been made
+ since approximately 2008.
+
Other Versions
See also the Wikipedia article
(http://en.wikipedia.org/wiki/Awk_language#Versions_and_implementations),
@@ -27244,6 +28086,34 @@ Other Versions

+File: gawk.info, Node: Installation summary, Prev: Other Versions, Up: Installation
+
+B.6 Summary
+===========
+
+ * The `gawk' distribution is available from the GNU project's main
+ distribution site, `ftp.gnu.org'. The canonical build recipe is:
+
+ wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.1.tar.gz
+ tar -xvpzf gawk-4.1.1.tar.gz
+ cd gawk-4.1.1
+ ./configure && make && make check
+
+ * `gawk' may be built on non-POSIX systems as well. The currently
+ supported systems are MS-Windows using DJGPP, MSYS, MinGW and
+ Cygwin, OS/2 using EMX, and both Vax/VMS and OpenVMS.
+ Instructions for each system are included in this major node.
+
+ * Bug reports should be sent via email to <bug-gawk@gnu.org>. Bug
+ reports should be in English, and should include the version of
+ `gawk', how it was compiled, and a short program and data file
+ which demonstrate the problem.
+
+ * There are a number of other freely available `awk'
+ implementations. Many are POSIX compliant; others are less so.
+
+
+
File: gawk.info, Node: Notes, Next: Basic Concepts, Prev: Installation, Up: Top
Appendix C Implementation Notes
@@ -27262,6 +28132,7 @@ and maintainers of `gawk'. Everything in it applies specifically to
* Implementation Limitations:: Some limitations of the implementation.
* Extension Design:: Design notes about the extension API.
* Old Extension Mechanism:: Some compatibility for old extensions.
+* Notes summary:: Summary of implementation notes.

File: gawk.info, Node: Compatibility Mode, Next: Additions, Up: Notes
@@ -27279,7 +28150,7 @@ one more option available on the command line:
`-Y'
`--parsedebug'
- Prints out the parse stack information as the program is being
+ Print out the parse stack information as the program is being
parsed.
This option is intended only for serious `gawk' developers and not
@@ -27307,8 +28178,8 @@ as well as any considerations you should bear in mind.
`gawk'.
* New Ports:: Porting `gawk' to a new operating
system.
-* Derived Files:: Why derived files are kept in the
- `git' repository.
+* Derived Files:: Why derived files are kept in the Git
+ repository.

File: gawk.info, Node: Accessing The Source, Next: Adding Code, Up: Additions
@@ -27329,9 +28200,9 @@ doesn't have it. Once you have done so, use the command:
git clone git://git.savannah.gnu.org/gawk.git
-This will clone the `gawk' repository. If you are behind a firewall
-that will not allow you to use the Git native protocol, you can still
-access the repository using:
+This clones the `gawk' repository. If you are behind a firewall that
+does not allow you to use the Git native protocol, you can still access
+the repository using:
git clone http://git.savannah.gnu.org/r/gawk.git
@@ -27353,7 +28224,7 @@ C.2.2 Adding New Features
You are free to add any new features you like to `gawk'. However, if
you want your changes to be incorporated into the `gawk' distribution,
there are several steps that you need to take in order to make it
-possible to include your changes:
+possible to include them:
1. Before building the new feature into `gawk' itself, consider
writing it as an extension module (*note Dynamic Extensions::).
@@ -27370,9 +28241,10 @@ possible to include your changes:
3. Get the latest version. It is much easier for me to integrate
changes if they are relative to the most recent distributed
- version of `gawk'. If your version of `gawk' is very old, I may
- not be able to integrate them at all. (*Note Getting::, for
- information on getting the latest version of `gawk'.)
+ version of `gawk', or better yet, relative to the latest code in
+ the Git repository. If your version of `gawk' is very old, I may
+ not be able to integrate your changes at all. (*Note Getting::,
+ for information on getting the latest version of `gawk'.)
4. See *note (Version)Top:: standards, GNU Coding Standards. This
document describes how GNU software should be written. If you
@@ -27469,7 +28341,8 @@ possible to include your changes:
8. Include an entry for the `ChangeLog' file with your submission.
This helps further minimize the amount of work I have to do,
- making it easier for me to accept patches.
+ making it easier for me to accept patches. It is simplest if you
+ just make this part of your diff.
Although this sounds like a lot of work, please remember that while
you may write the new code, I have to maintain it and support it. If it
@@ -27510,18 +28383,24 @@ steps:
people. Thus, you should not change them unless it is for a very
good reason; i.e., changes are not out of the question, but
changes to these files are scrutinized extra carefully. The files
- are `dfa.c', `dfa.h', `getopt1.c', `getopt.c', `getopt.h',
- `install-sh', `mkinstalldirs', `regcomp.c', `regex.c',
- `regexec.c', `regexex.c', `regex.h', `regex_internal.c', and
- `regex_internal.h'.
-
- 5. Be willing to continue to maintain the port. Non-Unix operating
+ are `dfa.c', `dfa.h', `getopt.c', `getopt.h', `getopt1.c',
+ `getopt_int.h', `gettext.h', `regcomp.c', `regex.c', `regex.h',
+ `regex_internal.c', `regex_internal.h', and `regexec.c'.
+
+ 5. A number of other files are provided by the GNU Autotools
+ (Autoconf, Automake, and GNU `gettext'). You should not change
+ them either, unless it is for a very good reason. The files are
+ `ABOUT-NLS', `config.guess', `config.rpath', `config.sub',
+ `depcomp', `INSTALL', `install-sh', `missing', `mkinstalldirs',
+ `xalloc.h', and `ylwrap'.
+
+ 6. Be willing to continue to maintain the port. Non-Unix operating
systems are supported by volunteers who maintain the code needed
to compile and run `gawk' on their systems. If no one volunteers to
maintain a port, it becomes unsupported and it may be necessary to
remove it from the distribution.
- 6. Supply an appropriate `gawkmisc.???' file. Each port has its own
+ 7. Supply an appropriate `gawkmisc.???' file. Each port has its own
`gawkmisc.???' that implements certain operating system specific
functions. This is cleaner than a plethora of `#ifdef's scattered
throughout the code. The `gawkmisc.c' in the main source
@@ -27537,7 +28416,7 @@ steps:
(Currently, this is only an issue for the PC operating system
ports.)
- 7. Supply a `Makefile' as well as any other C source and header files
+ 8. Supply a `Makefile' as well as any other C source and header files
that are necessary for your operating system. All your code
should be in a separate subdirectory, with a name that is the same
as, or reminiscent of, either your operating system or the
@@ -27547,7 +28426,7 @@ steps:
avoid using names for your files that duplicate the names of files
in the main source directory.
- 8. Update the documentation. Please write a section (or sections)
+ 9. Update the documentation. Please write a section (or sections)
for this Info file describing the installation and compilation
steps needed to compile and/or install `gawk' for your system.
@@ -27561,13 +28440,13 @@ style and brace layout that suits your taste.

File: gawk.info, Node: Derived Files, Prev: New Ports, Up: Additions
-C.2.4 Why Generated Files Are Kept In `git'
--------------------------------------------
+C.2.4 Why Generated Files Are Kept In Git
+-----------------------------------------
-If you look at the `gawk' source in the `git' repository, you will
-notice that it includes files that are automatically generated by GNU
-infrastructure tools, such as `Makefile.in' from `automake' and even
-`configure' from `autoconf'.
+If you look at the `gawk' source in the Git repository, you will notice
+that it includes files that are automatically generated by GNU
+infrastructure tools, such as `Makefile.in' from Automake and even
+`configure' from Autoconf.
This is different from many Free Software projects that do not store
the derived files, because that keeps the repository less cluttered,
@@ -27594,10 +28473,10 @@ build?)
If the repository has all the generated files, then it's easy to
just check them out and build. (Or _easier_, depending upon how far
-back we go. `:-)')
+back we go.)
And that brings us to the second (and stronger) reason why all the
-files really need to be in `git'. It boils down to who do you cater
+files really need to be in Git. It boils down to who do you cater
to--the `gawk' developer(s), or the user who just wants to check out a
version and try it out?
@@ -27623,7 +28502,7 @@ idea how to create it, and that was not the only problem.)
He felt _extremely_ frustrated. With respect to that branch, the
maintainer is no different than Jane User who wants to try to build
-`gawk-4.0-stable' or `master' from the repository.
+`gawk-4.1-stable' or `master' from the repository.
Thus, the maintainer thinks that it's not just important, but
critical, that for any given branch, the above incantation _just works_.
@@ -27638,32 +28517,26 @@ critical, that for any given branch, the above incantation _just works_.
B. He is really good at `git diff x y > /tmp/diff1 ; gvim
/tmp/diff1' to remove the diffs that aren't of interest in
- order to review code. `:-)'
+ order to review code.
2. It would certainly help if everyone used the same versions of the
GNU tools as he does, which in general are the latest released
- versions of `automake', `autoconf', `bison', and `gettext'.
-
- A. Installing from source is quite easy. It's how the maintainer
- worked for years under Fedora. He had `/usr/local/bin' at
- the front of his `PATH' and just did:
+ versions of Automake, Autoconf, `bison', and GNU `gettext'.
- wget http://ftp.gnu.org/gnu/PACKAGE/PACKAGE-X.Y.Z.tar.gz
- tar -xpzvf PACKAGE-X.Y.Z.tar.gz
- cd PACKAGE-X.Y.Z
- ./configure && make && make check
- make install # as root
-
- B. These days the maintainer uses Ubuntu 12.04 which is medium
- current, but he is already doing the above for `autoconf',
- `automake' and `bison'.
+ Installing from source is quite easy. It's how the maintainer
+ worked for years (and still works). He had `/usr/local/bin' at
+ the front of his `PATH' and just did:
+ wget http://ftp.gnu.org/gnu/PACKAGE/PACKAGE-X.Y.Z.tar.gz
+ tar -xpzvf PACKAGE-X.Y.Z.tar.gz
+ cd PACKAGE-X.Y.Z
+ ./configure && make && make check
+ make install # as root
Most of the above was originally written by the maintainer to other
`gawk' developers. It raised the objection from one of the developers
-"... that anybody pulling down the source from `git' is not an end
-user."
+"... that anybody pulling down the source from Git is not an end user."
However, this is not true. There are "power `awk' users" who can
build `gawk' (using the magic incantation shown previously) but who
@@ -27672,12 +28545,12 @@ all the time.
It was then suggested that there be a `cron' job to create nightly
tarballs of "the source." Here, the problem is that there are source
-trees, corresponding to the various branches! So, nightly tar balls
+trees, corresponding to the various branches! So, nightly tarballs
aren't the answer, especially as the repository can go for weeks
without significant change being introduced.
- Fortunately, the `git' server can meet this need. For any given
-branch named BRANCHNAME, use:
+ Fortunately, the Git server can meet this need. For any given branch
+named BRANCHNAME, use:
wget http://git.savannah.gnu.org/cgit/gawk.git/snapshot/gawk-BRANCHNAME.tar.gz
@@ -27688,9 +28561,9 @@ to retrieve a snapshot of the given branch.
(1) We tried. It was painful.
(2) There is one GNU program that is (in our opinion) severely
-difficult to bootstrap from the `git' repository. For example, on the
-author's old (but still working) PowerPC macintosh with Mac OS X 10.5,
-it was necessary to bootstrap a ton of software, starting with `git'
+difficult to bootstrap from the Git repository. For example, on the
+author's old (but still working) PowerPC Macintosh with Mac OS X 10.5,
+it was necessary to bootstrap a ton of software, starting with Git
itself, in order to try to work with the latest code. It's not
pleasant, and especially on older systems, it's a big waste of time.
@@ -27698,8 +28571,8 @@ pleasant, and especially on older systems, it's a big waste of time.
maintainers had dropped `.gz' and `.bz2' files and only distribute
`.tar.xz' files. It was necessary to bootstrap `xz' first!
- (3) A branch created by one of the other developers that did not
-include the generated files.
+ (3) A branch (since removed) created by one of the other developers
+that did not include the generated files.

File: gawk.info, Node: Future Extensions, Next: Implementation Limitations, Prev: Additions, Up: Notes
@@ -27712,11 +28585,11 @@ C.3 Probable Future Extensions
Hey! -- Larry Wall
- The `TODO' file in the `gawk' Git repository lists possible future
-enhancements. Some of these relate to the source code, and others to
-possible new features. Please see that file for the list. *Note
-Additions::, if you are interested in tackling any of the projects
-listed there.
+ The `TODO' file in the `master' branch of the `gawk' Git repository
+lists possible future enhancements. Some of these relate to the source
+code, and others to possible new features. Please see that file for
+the list. *Note Additions::, if you are interested in tackling any of
+the projects listed there.

File: gawk.info, Node: Implementation Limitations, Next: Extension Design, Prev: Future Extensions, Up: Notes
@@ -27732,7 +28605,7 @@ Item Limit
--------------------------------------------------------------------------
Characters in a character 2^(number of bits per byte)
class
-Length of input record `MAX_INT '
+Length of input record `MAX_INT'
Length of output record Unlimited
Length of source line Unlimited
Number of fields in a record `MAX_LONG'
@@ -27745,9 +28618,9 @@ Number of pipe redirections min(number of processes per user, number
of open files)
Numeric values Double-precision floating point (if not
using MPFR)
-Size of a field `MAX_INT '
-Size of a literal string `MAX_INT '
-Size of a printf string `MAX_INT '
+Size of a field `MAX_INT'
+Size of a literal string `MAX_INT'
+Size of a printf string `MAX_INT'

File: gawk.info, Node: Extension Design, Next: Old Extension Mechanism, Prev: Implementation Limitations, Up: Notes
@@ -27793,9 +28666,9 @@ The old extension mechanism had several problems:
* Being able to call into `gawk' from an extension required linker
facilities that are common on Unix-derived systems but that did
- not work on Windows systems; users wanting extensions on Windows
- had to statically link them into `gawk', even though Windows
- supports dynamic loading of shared objects.
+ not work on MS-Windows systems; users wanting extensions on
+ MS-Windows had to statically link them into `gawk', even though
+ MS-Windows supports dynamic loading of shared objects.
* The API would change occasionally as `gawk' changed; no
compatibility between versions was ever offered or planned for.
@@ -27843,8 +28716,8 @@ Some goals for the new API were:
flattening") in order to loop over all the elements in an easy
fashion for C code.
- - The ability to create arrays (including `gawk''s true
- multidimensional arrays).
+ - The ability to create arrays (including `gawk''s true arrays
+ of arrays).
Some additional important goals were:
@@ -27858,7 +28731,7 @@ Some goals for the new API were:
* The API mechanism should not require access to `gawk''s symbols(1)
by the compile-time or dynamic linker, in order to enable creation
- of extensions that also work on Windows.
+ of extensions that also work on MS-Windows.
During development, it became clear that there were other features
that should be available to extensions, which were also subsequently
@@ -27871,7 +28744,7 @@ provided:
hook into input processing, output processing, and two-way I/O.
* An extension should be able to provide a "call back" function to
- perform clean up actions when `gawk' exits.
+ perform cleanup actions when `gawk' exits.
* An extension should be able to provide a version string so that
`gawk''s `--version' option can provide information about
@@ -27896,7 +28769,7 @@ Mechanism Outline::, for the details.
(1) The "symbols" are the variables and functions defined inside
`gawk'. Access to these symbols by code external to `gawk' loaded
-dynamically at runtime is problematic on Windows.
+dynamically at runtime is problematic on MS-Windows.

File: gawk.info, Node: Extension Other Design Decisions, Next: Extension Future Growth, Prev: Extension New Mechanism Goals, Up: Extension Design
@@ -27964,7 +28837,7 @@ The API can later be expanded, in two ways:
respect to any of the above.

-File: gawk.info, Node: Old Extension Mechanism, Prev: Extension Design, Up: Notes
+File: gawk.info, Node: Old Extension Mechanism, Next: Notes summary, Prev: Extension Design, Up: Notes
C.6 Compatibility For Old Extensions
====================================
@@ -28002,6 +28875,37 @@ old extensions that you may have to use the new API described in *note
Dynamic Extensions::.

+File: gawk.info, Node: Notes summary, Prev: Old Extension Mechanism, Up: Notes
+
+C.7 Summary
+===========
+
+ * `gawk''s extensions can be disabled with either the
+ `--traditional' option or with the `--posix' option. The
+ `--parsedebug' option is available if `gawk' is compiled with
+ `-DDEBUG'.
+
+ * The source code for `gawk' is maintained in a publicly accessible
+ Git repository. Anyone may check it out and view the source.
+
+ * Contributions to `gawk' are welcome. Following the steps outlined
+ in this major node will make it easier to integrate your
+ contributions into the code base. This applies both to new
+ feature contributions and to ports to additional operating systems.
+
+ * `gawk' has some limits--generally those that are imposed by the
+ machine architecture.
+
+ * The extension API design was intended to solve a number of problems
+ with the previous extension mechanism, enable features needed by
+ the `xgawk' project, and provide binary compatibility going
+ forward.
+
+ * The previous extension mechanism is still supported in version 4.1
+ of `gawk', but it _will_ be removed in the next major release.
+
+
+
File: gawk.info, Node: Basic Concepts, Next: Glossary, Prev: Notes, Up: Top
Appendix D Basic Programming Concepts
@@ -28138,7 +29042,7 @@ characters that comprise them. Individual variables, as well as
numeric and string variables, are referred to as "scalar" values.
Groups of values, such as arrays, are not scalars.
- *note General Arithmetic::, provided a basic introduction to numeric
+ *note Computer Arithmetic::, provided a basic introduction to numeric
types (integer and floating-point) and how they are used in a computer.
Please review that information, including a number of caveats that were
presented.
@@ -28152,15 +29056,14 @@ like this: `""'.
Humans are used to working in decimal; i.e., base 10. In base 10,
numbers go from 0 to 9, and then "roll over" into the next column.
-(Remember grade school? 42 is 4 times 10 plus 2.)
+(Remember grade school? 42 = 4 x 10 + 2.)
There are other number bases though. Computers commonly use base 2
or "binary", base 8 or "octal", and base 16 or "hexadecimal". In
binary, each column represents two times the value in the column to its
right. Each column may contain either a 0 or a 1. Thus, binary 1010
-represents 1 times 8, plus 0 times 4, plus 1 times 2, plus 0 times 1,
-or decimal 10. Octal and hexadecimal are discussed more in *note
-Nondecimal-numbers::.
+represents (1 x 8) + (0 x 4) + (1 x 2) + (0 x 1), or decimal 10. Octal
+and hexadecimal are discussed more in *note Nondecimal-numbers::.
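As a rough illustration, the column-by-column expansions described here can be checked with POSIX shell arithmetic. This is only a sketch: the variable names are invented for the example, and it assumes a `printf' utility that accepts C-style octal (leading `0') and hexadecimal (leading `0x') constants, as POSIX specifies.

```shell
# Expand each number column by column, as in the text.
decimal_42=$(( 4 * 10 + 2 ))                      # decimal 42
binary_1010=$(( 1 * 8 + 0 * 4 + 1 * 2 + 0 * 1 ))  # binary 1010

# printf '%d' converts C-style octal and hexadecimal constants
# to their decimal values.
octal_013=$(printf '%d' 013)
hex_0x12=$(printf '%d' 0x12)

echo "$decimal_42 $binary_1010 $octal_013 $hex_0x12"
```

The octal and hexadecimal values use the same notation that `gawk' accepts in program source; see *note Nondecimal-numbers::.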
At the very lowest level, computers store values as groups of binary
digits, or "bits". Modern computers group bits into groups of eight,
@@ -28192,8 +29095,7 @@ Glossary
Action
A series of `awk' statements attached to a rule. If the rule's
pattern matches an input record, `awk' executes the rule's action.
- Actions are always enclosed in curly braces. (*Note Action
- Overview::.)
+ Actions are always enclosed in braces. (*Note Action Overview::.)
Amazing `awk' Assembler
Henry Spencer at the University of Toronto wrote a retargetable
@@ -28279,9 +29181,9 @@ Boolean Expression
Bourne Shell
The standard shell (`/bin/sh') on Unix and Unix-like systems,
- originally written by Steven R. Bourne. Many shells (Bash, `ksh',
- `pdksh', `zsh') are generally upwardly compatible with the Bourne
- shell.
+ originally written by Steven R. Bourne at Bell Laboratories. Many
+ shells (Bash, `ksh', `pdksh', `zsh') are generally upwardly
+ compatible with the Bourne shell.
Built-in Function
The `awk' language provides built-in functions that perform various
@@ -28302,7 +29204,8 @@ Built-in Variable
Variables::.)
Braces
- See "Curly Braces."
+ The characters `{' and `}'. Braces are used in `awk' for
+ delimiting actions, compound statements, and function bodies.
C
The system programming language that most GNU software is written
@@ -28323,8 +29226,8 @@ Character Set
ASCII (American Standard Code for Information Interchange). Many
European countries use an extension of ASCII known as ISO-8859-1
(ISO Latin-1). The Unicode character set (http://www.unicode.org)
- is becoming increasingly popular and standard, and is particularly
- widely used on GNU/Linux systems.
+ is increasingly popular and standard, and is particularly widely
+ used on GNU/Linux systems.
CHEM
A preprocessor for `pic' that reads descriptions of molecules and
@@ -28334,7 +29237,7 @@ CHEM
Cookie
A peculiar goodie, token, saying or remembrance produced by or
- presented to a program. (With thanks to Doug McIlroy.)
+ presented to a program. (With thanks to Professor Doug McIlroy.)
Coprocess
A subordinate program with which two-way communications is
@@ -28369,8 +29272,7 @@ Comparison Expression
process. (*Note Typing and Comparison::.)
Curly Braces
- The characters `{' and `}'. Curly braces are used in `awk' for
- delimiting actions, compound statements, and function bodies.
+ See "Braces."
Dark Corner
An area in the language where specifications often were (or still
@@ -28410,8 +29312,8 @@ Dynamic Regular Expression
(*Note Computed Regexps::.)
Environment
- A collection of strings, of the form NAME`='`val', that each
- program has available to it. Users generally place values into the
+ A collection of strings, of the form `NAME=VAL', that each program
+ has available to it. Users generally place values into the
environment in order to provide information to various programs.
Typical examples are the environment variables `HOME' and `PATH'.
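A minimal sketch of how a `NAME=VAL' string reaches a program, using a POSIX shell. `GREETING' is a hypothetical variable name chosen for this illustration only; it is not used by `gawk'.

```shell
# Place GREETING=hello into the environment of one child command.
# The child shell reads the value back from its own environment.
greeting=$(GREETING=hello sh -c 'printf "%s" "$GREETING"')
echo "$greeting"
```

Within an `awk' program, the same strings are available through the `ENVIRON' array.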
@@ -28461,11 +29363,11 @@ Floating-Point Number
See also "Double Precision" and "Single Precision."
Format
- Format strings are used to control the appearance of output in the
- `strftime()' and `sprintf()' functions, and are used in the
- `printf' statement as well. Also, data conversions from numbers
- to strings are controlled by the format strings contained in the
- built-in variables `CONVFMT' and `OFMT'. (*Note Control Letters::.)
+ Format strings control the appearance of output in the
+ `strftime()' and `sprintf()' functions, and in the `printf'
+ statement as well. Also, data conversions from numbers to strings
+ are controlled by the format strings contained in the built-in
+ variables `CONVFMT' and `OFMT'. (*Note Control Letters::.)
Free Documentation License
This document describes the terms under which this Info file is
@@ -28520,8 +29422,8 @@ Hexadecimal
Base 16 notation, where the digits are `0'-`9' and `A'-`F', with
`A' representing 10, `B' representing 11, and so on, up to `F' for
15. Hexadecimal numbers are written in C using a leading `0x', to
- indicate their base. Thus, `0x12' is 18 (1 times 16 plus 2).
- *Note Nondecimal-numbers::.
+ indicate their base. Thus, `0x12' is 18 ((1 x 16) + 2). *Note
+ Nondecimal-numbers::.
I/O
Abbreviation for "Input/Output," the act of moving data into and/or
@@ -28578,7 +29480,7 @@ Keyword
`gawk''s keywords are: `BEGIN', `BEGINFILE', `END', `ENDFILE',
`break', `case', `continue', `default', `delete', `do...while',
`else', `exit', `for...in', `for', `function', `func', `if',
- `nextfile', `next', `switch', and `while'.
+ `next', `nextfile', `switch', and `while'.
Lesser General Public License
This document describes the terms under which binary library
@@ -28634,11 +29536,7 @@ Number
Octal
Base-eight notation, where the digits are `0'-`7'. Octal numbers
are written in C using a leading `0', to indicate their base.
- Thus, `013' is 11 (one times 8 plus 3). *Note
- Nondecimal-numbers::.
-
-P1003.1
- See "POSIX."
+ Thus, `013' is 11 ((1 x 8) + 3). *Note Nondecimal-numbers::.
Pattern
Patterns tell `awk' which input records are interesting to which
@@ -28679,9 +29577,9 @@ Range (of input lines)
specify single lines. (*Note Pattern Overview::.)
Recursion
- When a function calls itself, either directly or indirectly. As
- long as this is not clear, refer to the entry for "recursion." If
- this is clear, stop, and proceed to the next entry.
+ When a function calls itself, either directly or indirectly. If
+ this is clear, stop, and proceed to the next entry. Otherwise,
+ refer to the entry for "recursion."
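For readers who would rather see direct recursion than look it up again, here is a small sketch as a POSIX shell function. The name `factorial' is an assumption made for this example; it is not part of `gawk'.

```shell
# factorial N: print N! by calling itself with N - 1 until N <= 1.
factorial() {
    if [ "$1" -le 1 ]; then
        echo 1
    else
        # The recursive call runs in a command substitution and its
        # result is multiplied by the current argument.
        echo $(( $1 * $(factorial $(( $1 - 1 ))) ))
    fi
}

factorial 5
```

Each call waits for the call below it to finish, so the chain stops only because the argument eventually reaches 1.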
Redirection
Redirection means performing input from something other than the
@@ -28762,8 +29660,8 @@ Single Precision
parts. Single precision numbers keep track of fewer digits than
do double precision numbers, but operations on them are sometimes
less expensive in terms of CPU time. This is the type used by
- some very old versions of `awk' to store numeric values. It is
- the C type `float'.
+ some ancient versions of `awk' to store numeric values. It is the
+ C type `float'.
Space
The character generated by hitting the space bar on the keyboard.
@@ -28797,7 +29695,7 @@ Text Domain
Timestamp
A value in the "seconds since the epoch" format used by Unix and
POSIX systems. Used for the `gawk' functions `mktime()',
- `strftime()', and `systime()'. See also "Epoch" and "UTC."
+ `strftime()', and `systime()'. See also "Epoch," "GMT," and "UTC."
Unix
A computer operating system originally developed in the early
@@ -30036,7 +30934,7 @@ Index
* Menu:
* ! (exclamation point), ! operator: Boolean Ops. (line 67)
-* ! (exclamation point), ! operator <1>: Egrep Program. (line 170)
+* ! (exclamation point), ! operator <1>: Egrep Program. (line 175)
* ! (exclamation point), ! operator <2>: Ranges. (line 48)
* ! (exclamation point), ! operator: Precedence. (line 52)
* ! (exclamation point), != operator <1>: Precedence. (line 65)
@@ -30052,8 +30950,8 @@ Index
* ! (exclamation point), !~ operator <6>: Case-sensitivity. (line 26)
* ! (exclamation point), !~ operator: Regexp Usage. (line 19)
* " (double quote) in shell commands: Read Terminal. (line 25)
-* " (double quote), in regexp constants: Computed Regexps. (line 28)
-* " (double quote), in shell commands: Quoting. (line 37)
+* " (double quote), in regexp constants: Computed Regexps. (line 29)
+* " (double quote), in shell commands: Quoting. (line 54)
* # (number sign), #! (executable scripts): Executable Scripts.
(line 6)
* # (number sign), commenting: Comments. (line 6)
@@ -30064,46 +30962,46 @@ Index
* $ (dollar sign), regexp operator: Regexp Operators. (line 35)
* % (percent sign), % operator: Precedence. (line 55)
* % (percent sign), %= operator <1>: Precedence. (line 95)
-* % (percent sign), %= operator: Assignment Ops. (line 129)
+* % (percent sign), %= operator: Assignment Ops. (line 130)
* & (ampersand), && operator <1>: Precedence. (line 86)
* & (ampersand), && operator: Boolean Ops. (line 57)
* & (ampersand), gsub()/gensub()/sub() functions and: Gory Details.
(line 6)
* ' (single quote): One-shot. (line 15)
* ' (single quote) in gawk command lines: Long. (line 33)
-* ' (single quote), in shell commands: Quoting. (line 31)
+* ' (single quote), in shell commands: Quoting. (line 48)
* ' (single quote), vs. apostrophe: Comments. (line 27)
-* ' (single quote), with double quotes: Quoting. (line 53)
+* ' (single quote), with double quotes: Quoting. (line 70)
* () (parentheses), in a profile: Profiling. (line 146)
-* () (parentheses), regexp operator: Regexp Operators. (line 79)
+* () (parentheses), regexp operator: Regexp Operators. (line 80)
* * (asterisk), * operator, as multiplication operator: Precedence.
(line 55)
* * (asterisk), * operator, as regexp operator: Regexp Operators.
- (line 87)
+ (line 88)
* * (asterisk), * operator, null strings, matching: Gory Details.
(line 164)
* * (asterisk), ** operator <1>: Precedence. (line 49)
* * (asterisk), ** operator: Arithmetic Ops. (line 81)
* * (asterisk), **= operator <1>: Precedence. (line 95)
-* * (asterisk), **= operator: Assignment Ops. (line 129)
+* * (asterisk), **= operator: Assignment Ops. (line 130)
* * (asterisk), *= operator <1>: Precedence. (line 95)
-* * (asterisk), *= operator: Assignment Ops. (line 129)
+* * (asterisk), *= operator: Assignment Ops. (line 130)
* + (plus sign), + operator: Precedence. (line 52)
* + (plus sign), ++ operator <1>: Precedence. (line 46)
* + (plus sign), ++ operator: Increment Ops. (line 11)
* + (plus sign), += operator <1>: Precedence. (line 95)
* + (plus sign), += operator: Assignment Ops. (line 82)
-* + (plus sign), regexp operator: Regexp Operators. (line 102)
+* + (plus sign), regexp operator: Regexp Operators. (line 103)
* , (comma), in range patterns: Ranges. (line 6)
* - (hyphen), - operator: Precedence. (line 52)
* - (hyphen), -- operator <1>: Precedence. (line 46)
* - (hyphen), -- operator: Increment Ops. (line 48)
* - (hyphen), -= operator <1>: Precedence. (line 95)
-* - (hyphen), -= operator: Assignment Ops. (line 129)
+* - (hyphen), -= operator: Assignment Ops. (line 130)
* - (hyphen), filenames beginning with: Options. (line 59)
* - (hyphen), in bracket expressions: Bracket Expressions. (line 17)
* --assign option: Options. (line 32)
-* --bignum option: Options. (line 201)
+* --bignum option: Options. (line 205)
* --characters-as-bytes option: Options. (line 68)
* --copyright option: Options. (line 88)
* --debug option: Options. (line 108)
@@ -30123,32 +31021,32 @@ Index
* --gen-pot option: Options. (line 147)
* --help option: Options. (line 154)
* --include option: Options. (line 159)
-* --lint option <1>: Options. (line 182)
+* --lint option <1>: Options. (line 185)
* --lint option: Command Line. (line 20)
-* --lint-old option: Options. (line 288)
+* --lint-old option: Options. (line 293)
* --load option: Options. (line 173)
* --non-decimal-data option <1>: Nondecimal Data. (line 6)
-* --non-decimal-data option: Options. (line 207)
+* --non-decimal-data option: Options. (line 211)
* --non-decimal-data option, strtonum() function and: Nondecimal Data.
(line 36)
-* --optimize option: Options. (line 228)
-* --posix option: Options. (line 247)
-* --posix option, --traditional option and: Options. (line 266)
-* --pretty-print option: Options. (line 220)
+* --optimize option: Options. (line 235)
+* --posix option: Options. (line 252)
+* --posix option, --traditional option and: Options. (line 271)
+* --pretty-print option: Options. (line 224)
* --profile option <1>: Profiling. (line 12)
-* --profile option: Options. (line 235)
-* --re-interval option: Options. (line 272)
-* --sandbox option: Options. (line 279)
+* --profile option: Options. (line 240)
+* --re-interval option: Options. (line 277)
+* --sandbox option: Options. (line 284)
* --sandbox option, disabling system() function: I/O Functions.
- (line 94)
+ (line 97)
* --sandbox option, input redirection with getline: Getline. (line 19)
* --sandbox option, output redirection with print, printf: Redirection.
(line 6)
* --source option: Options. (line 117)
* --traditional option: Options. (line 81)
-* --traditional option, --posix option and: Options. (line 266)
-* --use-lc-numeric option: Options. (line 215)
-* --version option: Options. (line 293)
+* --traditional option, --posix option and: Options. (line 271)
+* --use-lc-numeric option: Options. (line 219)
+* --version option: Options. (line 298)
* --with-whiny-user-strftime configuration option: Additional Configuration Options.
(line 35)
* -b option: Options. (line 68)
@@ -30161,45 +31059,45 @@ Index
* -f option: Options. (line 25)
* -F option: Options. (line 21)
* -f option: Long. (line 12)
-* -F option, -Ft sets FS to TAB: Options. (line 301)
+* -F option, -Ft sets FS to TAB: Options. (line 306)
* -F option, command line: Command Line Field Separator.
(line 6)
-* -f option, multiple uses: Options. (line 306)
+* -f option, multiple uses: Options. (line 311)
* -g option: Options. (line 147)
* -h option: Options. (line 154)
* -i option: Options. (line 159)
-* -L option: Options. (line 288)
+* -L option: Options. (line 293)
* -l option: Options. (line 173)
-* -M option: Options. (line 201)
-* -N option: Options. (line 215)
-* -n option: Options. (line 207)
-* -O option: Options. (line 228)
-* -o option: Options. (line 220)
-* -P option: Options. (line 247)
-* -p option: Options. (line 235)
-* -r option: Options. (line 272)
-* -S option: Options. (line 279)
+* -M option: Options. (line 205)
+* -N option: Options. (line 219)
+* -n option: Options. (line 211)
+* -O option: Options. (line 235)
+* -o option: Options. (line 224)
+* -P option: Options. (line 252)
+* -p option: Options. (line 240)
+* -r option: Options. (line 277)
+* -S option: Options. (line 284)
* -v option: Assignment Options. (line 12)
-* -V option: Options. (line 293)
+* -V option: Options. (line 298)
* -v option: Options. (line 32)
* -W option: Options. (line 46)
-* . (period), regexp operator: Regexp Operators. (line 43)
-* .gmo files: Explaining gettext. (line 41)
-* .gmo files, converting from .po: I18N Example. (line 62)
+* . (period), regexp operator: Regexp Operators. (line 44)
+* .gmo files: Explaining gettext. (line 42)
* .gmo files, specifying directory of <1>: Programmer i18n. (line 47)
-* .gmo files, specifying directory of: Explaining gettext. (line 53)
+* .gmo files, specifying directory of: Explaining gettext. (line 54)
+* .mo files, converting from .po: I18N Example. (line 63)
* .po files <1>: Translator i18n. (line 6)
-* .po files: Explaining gettext. (line 36)
-* .po files, converting to .gmo: I18N Example. (line 62)
-* .pot files: Explaining gettext. (line 30)
+* .po files: Explaining gettext. (line 37)
+* .po files, converting to .mo: I18N Example. (line 63)
+* .pot files: Explaining gettext. (line 31)
* / (forward slash) to enclose regular expressions: Regexp. (line 10)
* / (forward slash), / operator: Precedence. (line 55)
* / (forward slash), /= operator <1>: Precedence. (line 95)
-* / (forward slash), /= operator: Assignment Ops. (line 129)
+* / (forward slash), /= operator: Assignment Ops. (line 130)
* / (forward slash), /= operator, vs. /=.../ regexp constant: Assignment Ops.
- (line 147)
+ (line 148)
* / (forward slash), patterns and: Expression Patterns. (line 24)
-* /= operator vs. /=.../ regexp constant: Assignment Ops. (line 147)
+* /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148)
* /dev/... special files: Special FD. (line 46)
* /dev/fd/N special files (gawk): Special FD. (line 46)
* /inet/... special files (gawk): TCP/IP Networking. (line 6)
@@ -30235,8 +31133,8 @@ Index
* ? (question mark), ?: operator: Precedence. (line 92)
* ? (question mark), regexp operator <1>: GNU Regexp Operators.
(line 59)
-* ? (question mark), regexp operator: Regexp Operators. (line 111)
-* [] (square brackets), regexp operator: Regexp Operators. (line 55)
+* ? (question mark), regexp operator: Regexp Operators. (line 112)
+* [] (square brackets), regexp operator: Regexp Operators. (line 56)
* \ (backslash): Comments. (line 50)
* \ (backslash) in shell commands: Read Terminal. (line 25)
* \ (backslash), \" escape sequence: Escape Sequences. (line 76)
@@ -30272,7 +31170,7 @@ Index
(line 38)
* \ (backslash), as field separator: Command Line Field Separator.
(line 27)
-* \ (backslash), continuing lines and <1>: Egrep Program. (line 220)
+* \ (backslash), continuing lines and <1>: Egrep Program. (line 223)
* \ (backslash), continuing lines and: Statements/Lines. (line 19)
* \ (backslash), continuing lines and, comments and: Statements/Lines.
(line 76)
@@ -30284,23 +31182,23 @@ Index
* \ (backslash), in escape sequences: Escape Sequences. (line 6)
* \ (backslash), in escape sequences, POSIX and: Escape Sequences.
(line 112)
-* \ (backslash), in regexp constants: Computed Regexps. (line 28)
-* \ (backslash), in shell commands: Quoting. (line 31)
+* \ (backslash), in regexp constants: Computed Regexps. (line 29)
+* \ (backslash), in shell commands: Quoting. (line 48)
* \ (backslash), regexp operator: Regexp Operators. (line 18)
* ^ (caret), ^ operator: Precedence. (line 49)
* ^ (caret), ^= operator <1>: Precedence. (line 95)
-* ^ (caret), ^= operator: Assignment Ops. (line 129)
+* ^ (caret), ^= operator: Assignment Ops. (line 130)
* ^ (caret), in bracket expressions: Bracket Expressions. (line 17)
* ^ (caret), in FS: Regexp Field Splitting.
(line 59)
* ^ (caret), regexp operator <1>: GNU Regexp Operators.
(line 59)
* ^ (caret), regexp operator: Regexp Operators. (line 22)
-* _ (underscore), C macro: Explaining gettext. (line 70)
+* _ (underscore), C macro: Explaining gettext. (line 71)
* _ (underscore), in names of private variables: Library Names.
(line 29)
* _ (underscore), translatable string: Programmer i18n. (line 69)
-* _gr_init() user-defined function: Group Functions. (line 82)
+* _gr_init() user-defined function: Group Functions. (line 83)
* _ord_init() user-defined function: Ordinal Functions. (line 16)
* _pw_init() user-defined function: Passwd Functions. (line 105)
* accessing fields: Fields. (line 6)
@@ -30312,7 +31210,7 @@ Index
* actions, control statements in: Statements. (line 6)
* actions, default: Very Simple. (line 34)
* actions, empty: Very Simple. (line 39)
-* Ada programming language: Glossary. (line 20)
+* Ada programming language: Glossary. (line 19)
* adding, features to gawk: Adding Code. (line 6)
* adding, fields: Changing Fields. (line 53)
* advanced features, fixed-width data: Constant Size. (line 10)
@@ -30331,10 +31229,10 @@ Index
* allocating memory for extensions: Memory Allocation Functions.
(line 6)
* Alpha (DEC): Manual History. (line 28)
-* amazing awk assembler (aaa): Glossary. (line 12)
-* amazingly workable formatter (awf): Glossary. (line 25)
+* amazing awk assembler (aaa): Glossary. (line 11)
+* amazingly workable formatter (awf): Glossary. (line 24)
* ambiguity, syntactic: /= operator vs. /=.../ regexp constant: Assignment Ops.
- (line 147)
+ (line 148)
* ampersand (&), && operator <1>: Precedence. (line 86)
* ampersand (&), && operator: Boolean Ops. (line 57)
* ampersand (&), gsub()/gensub()/sub() functions and: Gory Details.
@@ -30344,7 +31242,7 @@ Index
* and: Bitwise Functions. (line 39)
* AND bitwise operation: Bitwise Functions. (line 6)
* and Boolean-logic operator: Boolean Ops. (line 6)
-* ANSI: Glossary. (line 35)
+* ANSI: Glossary. (line 34)
* API informational variables: Extension API Informational Variables.
(line 6)
* API version: Extension Versioning.
@@ -30355,18 +31253,18 @@ Index
(line 6)
* archeologists: Bugs. (line 6)
* arctangent: Numeric Functions. (line 11)
-* ARGC/ARGV variables: Auto-set. (line 11)
+* ARGC/ARGV variables: Auto-set. (line 15)
* ARGC/ARGV variables, command-line arguments: Other Arguments.
(line 12)
* ARGC/ARGV variables, how to use: ARGC and ARGV. (line 6)
* ARGC/ARGV variables, portability and: Executable Scripts. (line 42)
-* ARGIND variable: Auto-set. (line 40)
+* ARGIND variable: Auto-set. (line 44)
* ARGIND variable, command-line arguments: Other Arguments. (line 12)
* arguments, command-line <1>: ARGC and ARGV. (line 6)
-* arguments, command-line <2>: Auto-set. (line 11)
+* arguments, command-line <2>: Auto-set. (line 15)
* arguments, command-line: Other Arguments. (line 6)
* arguments, command-line, invoking awk: Command Line. (line 6)
-* arguments, in function calls: Function Calls. (line 16)
+* arguments, in function calls: Function Calls. (line 18)
* arguments, processing: Getopt Function. (line 6)
* ARGV array, indexing into: Other Arguments. (line 12)
* arithmetic operators: Arithmetic Ops. (line 6)
@@ -30374,15 +31272,15 @@ Index
* array members: Reference to Elements.
(line 6)
* array scanning order, controlling: Controlling Scanning.
- (line 12)
-* array, number of elements: String Functions. (line 194)
+ (line 14)
+* array, number of elements: String Functions. (line 197)
* arrays: Arrays. (line 6)
* arrays of arrays: Arrays of Arrays. (line 6)
* arrays, an example of using: Array Example. (line 6)
-* arrays, and IGNORECASE variable: Array Intro. (line 91)
+* arrays, and IGNORECASE variable: Array Intro. (line 92)
* arrays, as parameters to functions: Pass By Value/Reference.
(line 47)
-* arrays, associative: Array Intro. (line 49)
+* arrays, associative: Array Intro. (line 50)
* arrays, associative, library functions and: Library Names. (line 57)
* arrays, deleting entire contents: Delete. (line 39)
* arrays, elements that don't exist: Reference to Elements.
@@ -30391,9 +31289,9 @@ Index
* arrays, elements, deleting: Delete. (line 6)
* arrays, elements, order of access by in operator: Scanning an Array.
(line 48)
-* arrays, elements, retrieving number of: String Functions. (line 32)
+* arrays, elements, retrieving number of: String Functions. (line 42)
* arrays, for statement and: Scanning an Array. (line 20)
-* arrays, indexing: Array Intro. (line 49)
+* arrays, indexing: Array Intro. (line 50)
* arrays, merging into strings: Join Function. (line 6)
* arrays, multidimensional: Multidimensional. (line 10)
* arrays, multidimensional, scanning: Multiscanning. (line 11)
@@ -30407,7 +31305,7 @@ Index
(line 6)
* arrays, sorting, and IGNORECASE variable: Array Sorting Functions.
(line 83)
-* arrays, sparse: Array Intro. (line 70)
+* arrays, sparse: Array Intro. (line 71)
* arrays, subscripts, uninitialized variables as: Uninitialized Subscripts.
(line 6)
* arrays, unassigned elements: Reference to Elements.
@@ -30418,12 +31316,12 @@ Index
* ASCII: Ordinal Functions. (line 45)
* asort <1>: Array Sorting Functions.
(line 6)
-* asort: String Functions. (line 32)
+* asort: String Functions. (line 42)
* asort() function (gawk), arrays, sorting: Array Sorting Functions.
(line 6)
* asorti <1>: Array Sorting Functions.
(line 6)
-* asorti: String Functions. (line 32)
+* asorti: String Functions. (line 42)
* asorti() function (gawk), arrays, sorting: Array Sorting Functions.
(line 6)
* assert() function (C library): Assert Function. (line 6)
@@ -30435,25 +31333,25 @@ Index
* assignment operators, evaluation order: Assignment Ops. (line 111)
* assignment operators, lvalues/rvalues: Assignment Ops. (line 32)
* assignments as filenames: Ignoring Assigns. (line 6)
-* associative arrays: Array Intro. (line 49)
+* associative arrays: Array Intro. (line 50)
* asterisk (*), * operator, as multiplication operator: Precedence.
(line 55)
* asterisk (*), * operator, as regexp operator: Regexp Operators.
- (line 87)
+ (line 88)
* asterisk (*), * operator, null strings, matching: Gory Details.
(line 164)
* asterisk (*), ** operator <1>: Precedence. (line 49)
* asterisk (*), ** operator: Arithmetic Ops. (line 81)
* asterisk (*), **= operator <1>: Precedence. (line 95)
-* asterisk (*), **= operator: Assignment Ops. (line 129)
+* asterisk (*), **= operator: Assignment Ops. (line 130)
* asterisk (*), *= operator <1>: Precedence. (line 95)
-* asterisk (*), *= operator: Assignment Ops. (line 129)
+* asterisk (*), *= operator: Assignment Ops. (line 130)
* atan2: Numeric Functions. (line 11)
* automatic displays, in debugger: Debugger Info. (line 24)
-* awf (amazingly workable formatter) program: Glossary. (line 25)
+* awf (amazingly workable formatter) program: Glossary. (line 24)
* awk debugging, enabling: Options. (line 108)
-* awk language, POSIX version: Assignment Ops. (line 136)
-* awk profiling, enabling: Options. (line 235)
+* awk language, POSIX version: Assignment Ops. (line 137)
+* awk profiling, enabling: Options. (line 240)
* awk programs <1>: Two Rules. (line 6)
* awk programs <2>: Executable Scripts. (line 6)
* awk programs: Getting Started. (line 12)
@@ -30484,7 +31382,7 @@ Index
* awk, implementations, limits: Getline Notes. (line 14)
* awk, invoking: Command Line. (line 6)
* awk, new vs. old: Names. (line 6)
-* awk, new vs. old, OFMT variable: Conversion. (line 55)
+* awk, new vs. old, OFMT variable: Strings And Numbers. (line 57)
* awk, POSIX and: Preface. (line 23)
* awk, POSIX and, See Also POSIX awk: Preface. (line 23)
* awk, regexp constants and: Comparison Operators.
@@ -30546,7 +31444,7 @@ Index
(line 38)
* backslash (\), as field separator: Command Line Field Separator.
(line 27)
-* backslash (\), continuing lines and <1>: Egrep Program. (line 220)
+* backslash (\), continuing lines and <1>: Egrep Program. (line 223)
* backslash (\), continuing lines and: Statements/Lines. (line 19)
* backslash (\), continuing lines and, comments and: Statements/Lines.
(line 76)
@@ -30558,8 +31456,8 @@ Index
* backslash (\), in escape sequences: Escape Sequences. (line 6)
* backslash (\), in escape sequences, POSIX and: Escape Sequences.
(line 112)
-* backslash (\), in regexp constants: Computed Regexps. (line 28)
-* backslash (\), in shell commands: Quoting. (line 31)
+* backslash (\), in regexp constants: Computed Regexps. (line 29)
+* backslash (\), in shell commands: Quoting. (line 48)
* backslash (\), regexp operator: Regexp Operators. (line 18)
* backtrace debugger command: Execution Stack. (line 13)
* Beebe, Nelson H.F. <1>: Other Versions. (line 78)
@@ -30593,14 +31491,14 @@ Index
* Benzinger, Michael: Contributors. (line 97)
* Berry, Karl <1>: Ranges and Locales. (line 74)
* Berry, Karl: Acknowledgments. (line 33)
-* binary input/output: User-modified. (line 10)
+* binary input/output: User-modified. (line 15)
* bindtextdomain <1>: Programmer i18n. (line 47)
* bindtextdomain: I18N Functions. (line 12)
-* bindtextdomain() function (C library): Explaining gettext. (line 49)
+* bindtextdomain() function (C library): Explaining gettext. (line 50)
* bindtextdomain() function (gawk), portability and: I18N Portability.
(line 33)
* BINMODE variable <1>: PC Using. (line 33)
-* BINMODE variable: User-modified. (line 10)
+* BINMODE variable: User-modified. (line 15)
* bit-manipulation functions: Bitwise Functions. (line 6)
* bits2str() user-defined function: Bitwise Functions. (line 70)
* bitwise AND: Bitwise Functions. (line 39)
@@ -30620,17 +31518,17 @@ Index
* braces ({}), actions and: Action Overview. (line 19)
* braces ({}), statements, grouping: Statements. (line 10)
* bracket expressions <1>: Bracket Expressions. (line 6)
-* bracket expressions: Regexp Operators. (line 55)
+* bracket expressions: Regexp Operators. (line 56)
* bracket expressions, character classes: Bracket Expressions.
(line 30)
* bracket expressions, collating elements: Bracket Expressions.
- (line 69)
+ (line 77)
* bracket expressions, collating symbols: Bracket Expressions.
- (line 76)
-* bracket expressions, complemented: Regexp Operators. (line 63)
+ (line 84)
+* bracket expressions, complemented: Regexp Operators. (line 64)
* bracket expressions, equivalence classes: Bracket Expressions.
- (line 82)
-* bracket expressions, non-ASCII: Bracket Expressions. (line 69)
+ (line 90)
+* bracket expressions, non-ASCII: Bracket Expressions. (line 77)
* bracket expressions, range expressions: Bracket Expressions.
(line 6)
* break debugger command: Breakpoint Control. (line 11)
@@ -30648,7 +31546,7 @@ Index
* Brennan, Michael <3>: Simple Sed. (line 25)
* Brennan, Michael <4>: Delete. (line 56)
* Brennan, Michael: Foreword. (line 83)
-* Brian Kernighan's awk <1>: I/O Functions. (line 40)
+* Brian Kernighan's awk <1>: I/O Functions. (line 43)
* Brian Kernighan's awk <2>: Gory Details. (line 15)
* Brian Kernighan's awk <3>: String Functions. (line 490)
* Brian Kernighan's awk <4>: Delete. (line 48)
@@ -30668,17 +31566,18 @@ Index
* Brian Kernighan's awk, extensions: BTL. (line 6)
* Brian Kernighan's awk, source code: Other Versions. (line 13)
* Brini, Davide: Signature Program. (line 6)
+* Brink, Jeroen: DOS Quoting. (line 10)
* Broder, Alan J.: Contributors. (line 88)
* Brown, Martin: Contributors. (line 82)
-* BSD-based operating systems: Glossary. (line 616)
+* BSD-based operating systems: Glossary. (line 611)
* bt debugger command (alias for backtrace): Execution Stack. (line 13)
-* Buening, Andreas <1>: Bugs. (line 70)
+* Buening, Andreas <1>: Bugs. (line 71)
* Buening, Andreas <2>: Contributors. (line 92)
* Buening, Andreas: Acknowledgments. (line 60)
* buffering, input/output <1>: Two-way I/O. (line 70)
-* buffering, input/output: I/O Functions. (line 137)
-* buffering, interactive vs. noninteractive: I/O Functions. (line 106)
-* buffers, flushing: I/O Functions. (line 29)
+* buffering, input/output: I/O Functions. (line 140)
+* buffering, interactive vs. noninteractive: I/O Functions. (line 109)
+* buffers, flushing: I/O Functions. (line 32)
* buffers, operators for: GNU Regexp Operators.
(line 48)
* bug reports, email address, bug-gawk@gnu.org: Bugs. (line 30)
@@ -30697,30 +31596,29 @@ Index
* call stack, display in debugger: Execution Stack. (line 13)
* caret (^), ^ operator: Precedence. (line 49)
* caret (^), ^= operator <1>: Precedence. (line 95)
-* caret (^), ^= operator: Assignment Ops. (line 129)
+* caret (^), ^= operator: Assignment Ops. (line 130)
* caret (^), in bracket expressions: Bracket Expressions. (line 17)
* caret (^), regexp operator <1>: GNU Regexp Operators.
(line 59)
* caret (^), regexp operator: Regexp Operators. (line 22)
* case keyword: Switch Statement. (line 6)
-* case sensitivity, and regexps: User-modified. (line 82)
-* case sensitivity, and string comparisons: User-modified. (line 82)
-* case sensitivity, array indices and: Array Intro. (line 91)
+* case sensitivity, and regexps: User-modified. (line 76)
+* case sensitivity, and string comparisons: User-modified. (line 76)
+* case sensitivity, array indices and: Array Intro. (line 92)
* case sensitivity, converting case: String Functions. (line 520)
* case sensitivity, example programs: Library Functions. (line 53)
* case sensitivity, gawk: Case-sensitivity. (line 26)
* case sensitivity, regexps and: Case-sensitivity. (line 6)
* CGI, awk scripts for: Options. (line 125)
-* changing precision of a number: Changing Precision. (line 6)
* character classes, See bracket expressions: Regexp Operators.
- (line 55)
+ (line 56)
* character lists in regular expression: Bracket Expressions. (line 6)
-* character lists, See bracket expressions: Regexp Operators. (line 55)
+* character lists, See bracket expressions: Regexp Operators. (line 56)
* character sets (machine character encodings) <1>: Glossary. (line 133)
* character sets (machine character encodings): Ordinal Functions.
(line 45)
* character sets, See Also bracket expressions: Regexp Operators.
- (line 55)
+ (line 56)
* characters, counting: Wc Program. (line 6)
* characters, transliterating: Translate Program. (line 6)
* characters, values of as numbers: Ordinal Functions. (line 6)
@@ -30743,21 +31641,21 @@ Index
* close() function, portability: Close Files And Pipes.
(line 81)
* close() function, return value: Close Files And Pipes.
- (line 130)
+ (line 131)
* close() function, two-way pipes and: Two-way I/O. (line 77)
* Close, Diane <1>: Contributors. (line 20)
* Close, Diane: Manual History. (line 41)
* Collado, Manuel: Acknowledgments. (line 60)
-* collating elements: Bracket Expressions. (line 69)
-* collating symbols: Bracket Expressions. (line 76)
-* Colombo, Antonio <1>: Contributors. (line 135)
+* collating elements: Bracket Expressions. (line 77)
+* collating symbols: Bracket Expressions. (line 84)
+* Colombo, Antonio <1>: Contributors. (line 137)
* Colombo, Antonio: Acknowledgments. (line 60)
* columns, aligning: Print Examples. (line 70)
* columns, cutting: Cut Program. (line 6)
* comma (,), in range patterns: Ranges. (line 6)
* command completion, in debugger: Readline Support. (line 6)
* command line, arguments <1>: ARGC and ARGV. (line 6)
-* command line, arguments <2>: Auto-set. (line 11)
+* command line, arguments <2>: Auto-set. (line 15)
* command line, arguments: Other Arguments. (line 6)
* command line, directories on: Command line directories.
(line 6)
@@ -30778,20 +31676,20 @@ Index
* commenting: Comments. (line 6)
* commenting, backslash continuation and: Statements/Lines. (line 76)
* common extensions, ** operator: Arithmetic Ops. (line 30)
-* common extensions, **= operator: Assignment Ops. (line 136)
+* common extensions, **= operator: Assignment Ops. (line 137)
* common extensions, /dev/stderr special file: Special FD. (line 46)
* common extensions, /dev/stdin special file: Special FD. (line 46)
* common extensions, /dev/stdout special file: Special FD. (line 46)
* common extensions, \x escape sequence: Escape Sequences. (line 61)
* common extensions, BINMODE variable: PC Using. (line 33)
* common extensions, delete to delete entire arrays: Delete. (line 39)
-* common extensions, func keyword: Definition Syntax. (line 83)
+* common extensions, func keyword: Definition Syntax. (line 89)
* common extensions, length() applied to an array: String Functions.
- (line 194)
-* common extensions, RS as a regexp: Records. (line 135)
+ (line 197)
+* common extensions, RS as a regexp: gawk split records. (line 6)
* common extensions, single character fields: Single Character Fields.
(line 6)
-* comp.lang.awk newsgroup: Bugs. (line 38)
+* comp.lang.awk newsgroup: Bugs. (line 39)
* comparison expressions: Typing and Comparison.
(line 9)
* comparison expressions, as patterns: Expression Patterns. (line 14)
@@ -30827,31 +31725,27 @@ Index
* configuration options, gawk: Additional Configuration Options.
(line 6)
* constant regexps: Regexp Usage. (line 57)
-* constants, floating-point: Floating-point Constants.
- (line 6)
* constants, nondecimal: Nondecimal Data. (line 6)
* constants, numeric: Scalar Constants. (line 6)
* constants, types of: Constants. (line 6)
-* context, floating-point: Floating-point Context.
- (line 6)
* continue program, in debugger: Debugger Execution Control.
(line 33)
* continue statement: Continue Statement. (line 6)
* control statements: Statements. (line 6)
* controlling array scanning order: Controlling Scanning.
- (line 12)
+ (line 14)
* convert string to lower case: String Functions. (line 521)
-* convert string to number: String Functions. (line 385)
+* convert string to number: String Functions. (line 388)
* convert string to upper case: String Functions. (line 527)
* converting integer array subscripts: Numeric Array Subscripts.
(line 31)
* converting, dates to timestamps: Time Functions. (line 76)
* converting, numbers to strings <1>: Bitwise Functions. (line 109)
-* converting, numbers to strings: Conversion. (line 6)
+* converting, numbers to strings: Strings And Numbers. (line 6)
* converting, strings to numbers <1>: Bitwise Functions. (line 109)
-* converting, strings to numbers: Conversion. (line 6)
-* CONVFMT variable <1>: User-modified. (line 28)
-* CONVFMT variable: Conversion. (line 29)
+* converting, strings to numbers: Strings And Numbers. (line 6)
+* CONVFMT variable <1>: User-modified. (line 30)
+* CONVFMT variable: Strings And Numbers. (line 29)
* CONVFMT variable, and array subscripts: Numeric Array Subscripts.
(line 6)
* cookie: Glossary. (line 149)
@@ -30864,10 +31758,10 @@ Index
* cosine: Numeric Functions. (line 15)
* counting: Wc Program. (line 6)
* csh utility: Statements/Lines. (line 44)
-* csh utility, POSIXLY_CORRECT environment variable: Options. (line 348)
+* csh utility, POSIXLY_CORRECT environment variable: Options. (line 353)
* csh utility, |& operator, comparison with: Two-way I/O. (line 44)
* ctime() user-defined function: Function Example. (line 73)
-* currency symbols, localization: Explaining gettext. (line 103)
+* currency symbols, localization: Explaining gettext. (line 104)
* current system time: Time Functions. (line 66)
* custom.h file: Configuration Philosophy.
(line 30)
@@ -30878,58 +31772,59 @@ Index
* cut.awk program: Cut Program. (line 45)
* d debugger command (alias for delete): Breakpoint Control. (line 64)
* d.c., See dark corner: Conventions. (line 38)
-* dark corner <1>: Glossary. (line 189)
+* dark corner <1>: Glossary. (line 188)
* dark corner: Conventions. (line 38)
* dark corner, "0" is actually true: Truth Values. (line 24)
* dark corner, /= operator vs. /=.../ regexp constant: Assignment Ops.
- (line 147)
+ (line 148)
* dark corner, ^, in FS: Regexp Field Splitting.
(line 59)
* dark corner, array subscripts: Uninitialized Subscripts.
(line 43)
* dark corner, break statement: Break Statement. (line 51)
* dark corner, close() function: Close Files And Pipes.
- (line 130)
+ (line 131)
* dark corner, command-line arguments: Assignment Options. (line 43)
* dark corner, continue statement: Continue Statement. (line 43)
-* dark corner, CONVFMT variable: Conversion. (line 40)
+* dark corner, CONVFMT variable: Strings And Numbers. (line 40)
* dark corner, escape sequences: Other Arguments. (line 31)
* dark corner, escape sequences, for metacharacters: Escape Sequences.
(line 134)
* dark corner, exit statement: Exit Statement. (line 30)
* dark corner, field separators: Field Splitting Summary.
(line 46)
-* dark corner, FILENAME variable <1>: Auto-set. (line 102)
+* dark corner, FILENAME variable <1>: Auto-set. (line 98)
* dark corner, FILENAME variable: Getline Notes. (line 19)
-* dark corner, FNR/NR variables: Auto-set. (line 323)
+* dark corner, FNR/NR variables: Auto-set. (line 309)
* dark corner, format-control characters: Control Letters. (line 18)
* dark corner, FS as null string: Single Character Fields.
(line 20)
-* dark corner, input files: Records. (line 118)
+* dark corner, input files: awk split records. (line 110)
* dark corner, invoking awk: Command Line. (line 16)
-* dark corner, length() function: String Functions. (line 180)
-* dark corner, locale's decimal point character: Conversion. (line 77)
+* dark corner, length() function: String Functions. (line 183)
+* dark corner, locale's decimal point character: Locale influences conversions.
+ (line 17)
* dark corner, multiline records: Multiple Line. (line 35)
* dark corner, NF variable, decrementing: Changing Fields. (line 107)
* dark corner, OFMT variable: OFMT. (line 27)
* dark corner, regexp constants: Using Constant Regexps.
(line 6)
* dark corner, regexp constants, /= operator and: Assignment Ops.
- (line 147)
+ (line 148)
* dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps.
(line 43)
* dark corner, split() function: String Functions. (line 359)
-* dark corner, strings, storing: Records. (line 210)
-* dark corner, value of ARGV[0]: Auto-set. (line 35)
+* dark corner, strings, storing: gawk split records. (line 83)
+* dark corner, value of ARGV[0]: Auto-set. (line 39)
* data, fixed-width: Constant Size. (line 10)
* data-driven languages: Basic High Level. (line 85)
* database, group, reading: Group Functions. (line 6)
* database, users, reading: Passwd Functions. (line 6)
* date utility, GNU: Time Functions. (line 17)
-* date utility, POSIX: Time Functions. (line 263)
+* date utility, POSIX: Time Functions. (line 254)
* dates, converting to timestamps: Time Functions. (line 76)
* dates, information related to, localization: Explaining gettext.
- (line 115)
+ (line 116)
* Davies, Stephen <1>: Contributors. (line 74)
* Davies, Stephen: Acknowledgments. (line 60)
* dcgettext <1>: Programmer i18n. (line 19)
@@ -31032,7 +31927,7 @@ Index
(line 83)
* debugger commands, unwatch: Viewing And Changing Data.
(line 84)
-* debugger commands, up: Execution Stack. (line 33)
+* debugger commands, up: Execution Stack. (line 34)
* debugger commands, w (watch): Viewing And Changing Data.
(line 67)
* debugger commands, watch: Viewing And Changing Data.
@@ -31046,10 +31941,10 @@ Index
* debugger, read commands from a file: Debugger Info. (line 96)
* debugging awk programs: Debugger. (line 6)
* debugging gawk, bug reports: Bugs. (line 9)
-* decimal point character, locale specific: Options. (line 263)
+* decimal point character, locale specific: Options. (line 268)
* decrement operators: Increment Ops. (line 35)
* default keyword: Switch Statement. (line 6)
-* Deifik, Scott <1>: Bugs. (line 70)
+* Deifik, Scott <1>: Bugs. (line 71)
* Deifik, Scott <2>: Contributors. (line 53)
* Deifik, Scott: Acknowledgments. (line 60)
* delete ARRAY: Delete. (line 39)
@@ -31063,10 +31958,10 @@ Index
* deleting entire arrays: Delete. (line 39)
* Demaille, Akim: Acknowledgments. (line 60)
* describe call stack frame, in debugger: Debugger Info. (line 27)
-* differences between gawk and awk: String Functions. (line 194)
+* differences between gawk and awk: String Functions. (line 197)
* differences in awk and gawk, ARGC/ARGV variables: ARGC and ARGV.
(line 88)
-* differences in awk and gawk, ARGIND variable: Auto-set. (line 40)
+* differences in awk and gawk, ARGIND variable: Auto-set. (line 44)
* differences in awk and gawk, array elements, deleting: Delete.
(line 39)
* differences in awk and gawk, AWKLIBPATH environment variable: AWKLIBPATH Variable.
@@ -31080,7 +31975,7 @@ Index
* differences in awk and gawk, BINMODE variable <1>: PC Using.
(line 33)
* differences in awk and gawk, BINMODE variable: User-modified.
- (line 23)
+ (line 15)
* differences in awk and gawk, close() function: Close Files And Pipes.
(line 81)
* differences in awk and gawk, command line directories: Command line directories.
@@ -31088,14 +31983,14 @@ Index
* differences in awk and gawk, ERRNO variable: Auto-set. (line 82)
* differences in awk and gawk, error messages: Special FD. (line 16)
* differences in awk and gawk, FIELDWIDTHS variable: User-modified.
- (line 35)
-* differences in awk and gawk, FPAT variable: User-modified. (line 45)
-* differences in awk and gawk, FUNCTAB variable: Auto-set. (line 128)
+ (line 37)
+* differences in awk and gawk, FPAT variable: User-modified. (line 43)
+* differences in awk and gawk, FUNCTAB variable: Auto-set. (line 123)
* differences in awk and gawk, function arguments (gawk): Calling Built-in.
(line 16)
* differences in awk and gawk, getline command: Getline. (line 19)
* differences in awk and gawk, IGNORECASE variable: User-modified.
- (line 82)
+ (line 76)
* differences in awk and gawk, implementation limitations <1>: Redirection.
(line 135)
* differences in awk and gawk, implementation limitations: Getline Notes.
@@ -31108,34 +32003,38 @@ Index
(line 6)
* differences in awk and gawk, line continuations: Conditional Exp.
(line 34)
-* differences in awk and gawk, LINT variable: User-modified. (line 98)
+* differences in awk and gawk, LINT variable: User-modified. (line 88)
* differences in awk and gawk, match() function: String Functions.
- (line 257)
+ (line 260)
* differences in awk and gawk, print/printf statements: Format Modifiers.
(line 13)
-* differences in awk and gawk, PROCINFO array: Auto-set. (line 142)
-* differences in awk and gawk, record separators: Records. (line 132)
+* differences in awk and gawk, PROCINFO array: Auto-set. (line 136)
+* differences in awk and gawk, read timeouts: Read Timeout. (line 6)
+* differences in awk and gawk, record separators: awk split records.
+ (line 124)
* differences in awk and gawk, regexp constants: Using Constant Regexps.
(line 43)
* differences in awk and gawk, regular expressions: Case-sensitivity.
(line 26)
-* differences in awk and gawk, RS/RT variables: Records. (line 187)
-* differences in awk and gawk, RT variable: Auto-set. (line 275)
+* differences in awk and gawk, RS/RT variables: gawk split records.
+ (line 58)
+* differences in awk and gawk, RT variable: Auto-set. (line 265)
* differences in awk and gawk, single-character fields: Single Character Fields.
(line 6)
* differences in awk and gawk, split() function: String Functions.
(line 347)
* differences in awk and gawk, strings: Scalar Constants. (line 20)
-* differences in awk and gawk, strings, storing: Records. (line 206)
-* differences in awk and gawk, SYMTAB variable: Auto-set. (line 283)
+* differences in awk and gawk, strings, storing: gawk split records.
+ (line 77)
+* differences in awk and gawk, SYMTAB variable: Auto-set. (line 269)
* differences in awk and gawk, TEXTDOMAIN variable: User-modified.
- (line 162)
+ (line 152)
* differences in awk and gawk, trunc-mod operation: Arithmetic Ops.
(line 66)
* directories, command line: Command line directories.
(line 6)
-* directories, searching: Igawk Program. (line 368)
-* directories, searching for shared libraries: AWKLIBPATH Variable.
+* directories, searching: Programs Exercises. (line 63)
+* directories, searching for loadable extensions: AWKLIBPATH Variable.
(line 6)
* directories, searching for source files: AWKPATH Variable. (line 6)
* disable breakpoint: Breakpoint Control. (line 69)
@@ -31143,6 +32042,7 @@ Index
* display debugger command: Viewing And Changing Data.
(line 8)
* display debugger options: Debugger Info. (line 57)
+* div: Numeric Functions. (line 18)
* division: Arithmetic Ops. (line 44)
* do-while statement: Do Statement. (line 6)
* do-while statement, use of regexps in: Regexp Usage. (line 19)
@@ -31154,10 +32054,9 @@ Index
* dollar sign ($), incrementing fields and arrays: Increment Ops.
(line 30)
* dollar sign ($), regexp operator: Regexp Operators. (line 35)
-* double precision floating-point: General Arithmetic. (line 21)
* double quote (") in shell commands: Read Terminal. (line 25)
-* double quote ("), in regexp constants: Computed Regexps. (line 28)
-* double quote ("), in shell commands: Quoting. (line 37)
+* double quote ("), in regexp constants: Computed Regexps. (line 29)
+* double quote ("), in shell commands: Quoting. (line 54)
* down debugger command: Execution Stack. (line 21)
* Drepper, Ulrich: Acknowledgments. (line 52)
* dump all variables of a program: Options. (line 93)
@@ -31168,8 +32067,8 @@ Index
* dynamically loaded extensions: Dynamic Extensions. (line 6)
* e debugger command (alias for enable): Breakpoint Control. (line 73)
* EBCDIC: Ordinal Functions. (line 45)
-* effective group ID of gawk user: Auto-set. (line 147)
-* effective user ID of gawk user: Auto-set. (line 151)
+* effective group ID of gawk user: Auto-set. (line 141)
+* effective user ID of gawk user: Auto-set. (line 145)
* egrep utility <1>: Egrep Program. (line 6)
* egrep utility: Bracket Expressions. (line 24)
* egrep.awk program: Egrep Program. (line 54)
@@ -31185,7 +32084,7 @@ Index
* empty array elements: Reference to Elements.
(line 18)
* empty pattern: Empty. (line 6)
-* empty strings: Records. (line 122)
+* empty strings: awk split records. (line 114)
* empty strings, See null strings: Regexp Field Splitting.
(line 43)
* enable breakpoint: Breakpoint Control. (line 73)
@@ -31197,7 +32096,7 @@ Index
* END pattern, and profiling: Profiling. (line 62)
* END pattern, assert() user-defined function and: Assert Function.
(line 75)
-* END pattern, backslash continuation and: Egrep Program. (line 220)
+* END pattern, backslash continuation and: Egrep Program. (line 223)
* END pattern, Boolean patterns and: Expression Patterns. (line 70)
* END pattern, exit statement and: Exit Statement. (line 12)
* END pattern, next/nextfile statements and <1>: Next Statement.
@@ -31209,15 +32108,16 @@ Index
* ENDFILE pattern: BEGINFILE/ENDFILE. (line 6)
* ENDFILE pattern, Boolean patterns and: Expression Patterns. (line 70)
* endfile() user-defined function: Filetrans Function. (line 62)
-* endgrent() function (C library): Group Functions. (line 215)
-* endgrent() user-defined function: Group Functions. (line 218)
+* endgrent() function (C library): Group Functions. (line 213)
+* endgrent() user-defined function: Group Functions. (line 216)
* endpwent() function (C library): Passwd Functions. (line 210)
* endpwent() user-defined function: Passwd Functions. (line 213)
+* English, Steve: Advanced Features. (line 6)
* ENVIRON array: Auto-set. (line 60)
* environment variables used by gawk: Environment Variables.
(line 6)
* environment variables, in ENVIRON array: Auto-set. (line 60)
-* epoch, definition of: Glossary. (line 235)
+* epoch, definition of: Glossary. (line 234)
* equals sign (=), = operator: Assignment Ops. (line 6)
* equals sign (=), == operator <1>: Precedence. (line 65)
* equals sign (=), == operator: Comparison Operators.
@@ -31227,7 +32127,7 @@ Index
* ERRNO variable: Auto-set. (line 82)
* ERRNO variable, with BEGINFILE pattern: BEGINFILE/ENDFILE. (line 26)
* ERRNO variable, with close() function: Close Files And Pipes.
- (line 138)
+ (line 139)
* ERRNO variable, with getline command: Getline. (line 19)
* error handling: Special FD. (line 16)
* error handling, ERRNO variable and: Auto-set. (line 82)
@@ -31243,7 +32143,7 @@ Index
* evaluation order, concatenation: Concatenation. (line 41)
* evaluation order, functions: Calling Built-in. (line 30)
* examining fields: Fields. (line 6)
-* exclamation point (!), ! operator <1>: Egrep Program. (line 170)
+* exclamation point (!), ! operator <1>: Egrep Program. (line 175)
* exclamation point (!), ! operator <2>: Precedence. (line 52)
* exclamation point (!), ! operator: Boolean Ops. (line 67)
* exclamation point (!), != operator <1>: Precedence. (line 65)
@@ -31263,10 +32163,10 @@ Index
* exit status, of VMS: VMS Running. (line 29)
* exit the debugger: Miscellaneous Debugger Commands.
(line 99)
-* exp: Numeric Functions. (line 18)
+* exp: Numeric Functions. (line 32)
* expand utility: Very Simple. (line 69)
* Expat XML parser library: gawkextlib. (line 35)
-* exponent: Numeric Functions. (line 18)
+* exponent: Numeric Functions. (line 32)
* expressions: Expressions. (line 6)
* expressions, as patterns: Expression Patterns. (line 6)
* expressions, assignment: Assignment Ops. (line 6)
@@ -31284,7 +32184,7 @@ Index
(line 6)
* extension API version: Extension Versioning.
(line 6)
-* extension API, version number: Auto-set. (line 238)
+* extension API, version number: Auto-set. (line 232)
* extension example: Extension Example. (line 6)
* extension registration: Registration Functions.
(line 6)
@@ -31295,18 +32195,18 @@ Index
* extensions, Brian Kernighan's awk <1>: Common Extensions. (line 6)
* extensions, Brian Kernighan's awk: BTL. (line 6)
* extensions, common, ** operator: Arithmetic Ops. (line 30)
-* extensions, common, **= operator: Assignment Ops. (line 136)
+* extensions, common, **= operator: Assignment Ops. (line 137)
* extensions, common, /dev/stderr special file: Special FD. (line 46)
* extensions, common, /dev/stdin special file: Special FD. (line 46)
* extensions, common, /dev/stdout special file: Special FD. (line 46)
* extensions, common, \x escape sequence: Escape Sequences. (line 61)
* extensions, common, BINMODE variable: PC Using. (line 33)
* extensions, common, delete to delete entire arrays: Delete. (line 39)
-* extensions, common, fflush() function: I/O Functions. (line 40)
-* extensions, common, func keyword: Definition Syntax. (line 83)
+* extensions, common, fflush() function: I/O Functions. (line 43)
+* extensions, common, func keyword: Definition Syntax. (line 89)
* extensions, common, length() applied to an array: String Functions.
- (line 194)
-* extensions, common, RS as a regexp: Records. (line 135)
+ (line 197)
+* extensions, common, RS as a regexp: gawk split records. (line 6)
* extensions, common, single character fields: Single Character Fields.
(line 6)
* extensions, in gawk, not in POSIX awk: POSIX/GNU. (line 6)
@@ -31320,12 +32220,11 @@ Index
* FDL (Free Documentation License): GNU Free Documentation License.
(line 7)
* features, adding to gawk: Adding Code. (line 6)
-* features, advanced, See advanced features: Obsolete. (line 6)
* features, deprecated: Obsolete. (line 6)
* features, undocumented: Undocumented. (line 6)
* Fenlason, Jay <1>: Contributors. (line 18)
* Fenlason, Jay: History. (line 30)
-* fflush: I/O Functions. (line 25)
+* fflush: I/O Functions. (line 28)
* field numbers: Nonconstant Fields. (line 6)
* field operator $: Fields. (line 19)
* field operators, dollar sign as: Fields. (line 19)
@@ -31334,11 +32233,11 @@ Index
(line 6)
* field separator, POSIX and: Field Splitting Summary.
(line 40)
-* field separators <1>: User-modified. (line 56)
+* field separators <1>: User-modified. (line 50)
* field separators: Field Separators. (line 15)
* field separators, choice of: Field Separators. (line 51)
-* field separators, FIELDWIDTHS variable and: User-modified. (line 35)
-* field separators, FPAT variable and: User-modified. (line 45)
+* field separators, FIELDWIDTHS variable and: User-modified. (line 37)
+* field separators, FPAT variable and: User-modified. (line 43)
* field separators, POSIX and: Fields. (line 6)
* field separators, regular expressions as <1>: Regexp Field Splitting.
(line 6)
@@ -31358,24 +32257,24 @@ Index
* fields, separating: Field Separators. (line 15)
* fields, single-character: Single Character Fields.
(line 6)
-* FIELDWIDTHS variable <1>: User-modified. (line 35)
+* FIELDWIDTHS variable <1>: User-modified. (line 37)
* FIELDWIDTHS variable: Constant Size. (line 23)
* file descriptors: Special FD. (line 6)
-* file names, distinguishing: Auto-set. (line 52)
+* file names, distinguishing: Auto-set. (line 56)
* file names, in compatibility mode: Special Caveats. (line 9)
* file names, standard streams in gawk: Special FD. (line 46)
-* FILENAME variable <1>: Auto-set. (line 102)
+* FILENAME variable <1>: Auto-set. (line 98)
* FILENAME variable: Reading Files. (line 6)
* FILENAME variable, getline, setting with: Getline Notes. (line 19)
* filenames, assignments as: Ignoring Assigns. (line 6)
-* files, .gmo: Explaining gettext. (line 41)
-* files, .gmo, converting from .po: I18N Example. (line 62)
+* files, .gmo: Explaining gettext. (line 42)
* files, .gmo, specifying directory of <1>: Programmer i18n. (line 47)
-* files, .gmo, specifying directory of: Explaining gettext. (line 53)
+* files, .gmo, specifying directory of: Explaining gettext. (line 54)
+* files, .mo, converting from .po: I18N Example. (line 63)
* files, .po <1>: Translator i18n. (line 6)
-* files, .po: Explaining gettext. (line 36)
-* files, .po, converting to .gmo: I18N Example. (line 62)
-* files, .pot: Explaining gettext. (line 30)
+* files, .po: Explaining gettext. (line 37)
+* files, .po, converting to .mo: I18N Example. (line 63)
+* files, .pot: Explaining gettext. (line 31)
* files, /dev/... special files: Special FD. (line 46)
* files, /inet/... (gawk): TCP/IP Networking. (line 6)
* files, /inet4/... (gawk): TCP/IP Networking. (line 6)
@@ -31392,33 +32291,33 @@ Index
* files, managing: Data File Management.
(line 6)
* files, managing, data file boundaries: Filetrans Function. (line 6)
-* files, message object: Explaining gettext. (line 41)
+* files, message object: Explaining gettext. (line 42)
* files, message object, converting from portable object files: I18N Example.
- (line 62)
+ (line 63)
* files, message object, specifying directory of <1>: Programmer i18n.
(line 47)
* files, message object, specifying directory of: Explaining gettext.
- (line 53)
+ (line 54)
* files, multiple passes over: Other Arguments. (line 49)
* files, multiple, duplicating output into: Tee Program. (line 6)
* files, output, See output files: Close Files And Pipes.
(line 6)
* files, password: Passwd Functions. (line 16)
* files, portable object <1>: Translator i18n. (line 6)
-* files, portable object: Explaining gettext. (line 36)
-* files, portable object template: Explaining gettext. (line 30)
+* files, portable object: Explaining gettext. (line 37)
+* files, portable object template: Explaining gettext. (line 31)
* files, portable object, converting to message object files: I18N Example.
- (line 62)
+ (line 63)
* files, portable object, generating: Options. (line 147)
-* files, processing, ARGIND variable and: Auto-set. (line 47)
+* files, processing, ARGIND variable and: Auto-set. (line 51)
* files, reading: Rewind Function. (line 6)
* files, reading, multiline records: Multiple Line. (line 6)
* files, searching for regular expressions: Egrep Program. (line 6)
* files, skipping: File Checking. (line 6)
-* files, source, search path for: Igawk Program. (line 368)
+* files, source, search path for: Programs Exercises. (line 63)
* files, splitting: Split Program. (line 6)
* files, Texinfo, extracting programs from: Extract Program. (line 6)
-* find substring in string: String Functions. (line 151)
+* find substring in string: String Functions. (line 155)
* finding extensions: Finding Extensions. (line 6)
* finish debugger command: Debugger Execution Control.
(line 39)
@@ -31426,17 +32325,15 @@ Index
* fixed-width data: Constant Size. (line 10)
* flag variables <1>: Tee Program. (line 20)
* flag variables: Boolean Ops. (line 67)
-* floating-point, numbers <1>: Unexpected Results. (line 6)
-* floating-point, numbers: General Arithmetic. (line 6)
* floating-point, numbers, arbitrary precision: Arbitrary Precision Arithmetic.
(line 6)
* floating-point, VAX/VMS: VMS Running. (line 51)
-* flush buffered output: I/O Functions. (line 25)
+* flush buffered output: I/O Functions. (line 28)
* fnmatch() extension function: Extension Sample Fnmatch.
(line 12)
-* FNR variable <1>: Auto-set. (line 112)
+* FNR variable <1>: Auto-set. (line 107)
* FNR variable: Records. (line 6)
-* FNR variable, changing: Auto-set. (line 323)
+* FNR variable, changing: Auto-set. (line 309)
* for statement: For Statement. (line 6)
* for statement, looping over arrays: Scanning an Array. (line 20)
* fork() extension function: Extension Sample Fork.
@@ -31450,30 +32347,30 @@ Index
* format time string: Time Functions. (line 48)
* formats, numeric output: OFMT. (line 6)
* formatting output: Printf. (line 6)
-* formatting strings: String Functions. (line 378)
+* formatting strings: String Functions. (line 381)
* forward slash (/) to enclose regular expressions: Regexp. (line 10)
* forward slash (/), / operator: Precedence. (line 55)
* forward slash (/), /= operator <1>: Precedence. (line 95)
-* forward slash (/), /= operator: Assignment Ops. (line 129)
+* forward slash (/), /= operator: Assignment Ops. (line 130)
* forward slash (/), /= operator, vs. /=.../ regexp constant: Assignment Ops.
- (line 147)
+ (line 148)
* forward slash (/), patterns and: Expression Patterns. (line 24)
-* FPAT variable <1>: User-modified. (line 45)
+* FPAT variable <1>: User-modified. (line 43)
* FPAT variable: Splitting By Content.
(line 27)
* frame debugger command: Execution Stack. (line 25)
* Free Documentation License (FDL): GNU Free Documentation License.
(line 7)
-* Free Software Foundation (FSF) <1>: Glossary. (line 297)
+* Free Software Foundation (FSF) <1>: Glossary. (line 296)
* Free Software Foundation (FSF) <2>: Getting. (line 10)
* Free Software Foundation (FSF): Manual History. (line 6)
-* FreeBSD: Glossary. (line 616)
-* FS variable <1>: User-modified. (line 56)
+* FreeBSD: Glossary. (line 611)
+* FS variable <1>: User-modified. (line 50)
* FS variable: Field Separators. (line 15)
* FS variable, --field-separator option and: Options. (line 21)
* FS variable, as null string: Single Character Fields.
(line 20)
-* FS variable, as TAB character: Options. (line 259)
+* FS variable, as TAB character: Options. (line 264)
* FS variable, changing value of: Field Separators. (line 35)
* FS variable, running awk programs and: Cut Program. (line 68)
* FS variable, setting from command line: Command Line Field Separator.
@@ -31481,12 +32378,12 @@ Index
* FS, containing ^: Regexp Field Splitting.
(line 59)
* FS, in multiline records: Multiple Line. (line 41)
-* FSF (Free Software Foundation) <1>: Glossary. (line 297)
+* FSF (Free Software Foundation) <1>: Glossary. (line 296)
* FSF (Free Software Foundation) <2>: Getting. (line 10)
* FSF (Free Software Foundation): Manual History. (line 6)
* fts() extension function: Extension Sample File Functions.
- (line 77)
-* FUNCTAB array: Auto-set. (line 128)
+ (line 61)
+* FUNCTAB array: Auto-set. (line 123)
* function calls: Function Calls. (line 6)
* function calls, indirect: Indirect Calls. (line 6)
* function definition example: Function Example. (line 6)
@@ -31521,7 +32418,7 @@ Index
(line 6)
* functions, names of <1>: Definition Syntax. (line 20)
* functions, names of: Arrays. (line 18)
-* functions, recursive: Definition Syntax. (line 73)
+* functions, recursive: Definition Syntax. (line 79)
* functions, string-translation: I18N Functions. (line 6)
* functions, undefined: Pass By Value/Reference.
(line 71)
@@ -31533,18 +32430,18 @@ Index
(line 47)
* functions, user-defined, next/nextfile statements and: Next Statement.
(line 45)
-* G-d: Acknowledgments. (line 78)
+* G-d: Acknowledgments. (line 82)
* Garfinkle, Scott: Contributors. (line 34)
* gawk program, dynamic profiling: Profiling. (line 179)
-* gawk version: Auto-set. (line 213)
+* gawk version: Auto-set. (line 207)
* gawk, ARGIND variable in: Other Arguments. (line 12)
* gawk, awk and <1>: This Manual. (line 14)
* gawk, awk and: Preface. (line 23)
* gawk, bitwise operations in: Bitwise Functions. (line 39)
* gawk, break statement in: Break Statement. (line 51)
* gawk, built-in variables and: Built-in Variables. (line 14)
-* gawk, character classes and: Bracket Expressions. (line 90)
-* gawk, coding style in: Adding Code. (line 38)
+* gawk, character classes and: Bracket Expressions. (line 98)
+* gawk, coding style in: Adding Code. (line 39)
* gawk, command-line options, and regular expressions: GNU Regexp Operators.
(line 70)
* gawk, comparison operators and: Comparison Operators.
@@ -31560,28 +32457,28 @@ Index
* gawk, ERRNO variable in <2>: Auto-set. (line 82)
* gawk, ERRNO variable in <3>: BEGINFILE/ENDFILE. (line 26)
* gawk, ERRNO variable in <4>: Close Files And Pipes.
- (line 138)
+ (line 139)
* gawk, ERRNO variable in: Getline. (line 19)
* gawk, escape sequences: Escape Sequences. (line 124)
-* gawk, extensions, disabling: Options. (line 247)
+* gawk, extensions, disabling: Options. (line 252)
* gawk, features, adding: Adding Code. (line 6)
* gawk, features, advanced: Advanced Features. (line 6)
-* gawk, field separators and: User-modified. (line 77)
-* gawk, FIELDWIDTHS variable in <1>: User-modified. (line 35)
+* gawk, field separators and: User-modified. (line 71)
+* gawk, FIELDWIDTHS variable in <1>: User-modified. (line 37)
* gawk, FIELDWIDTHS variable in: Constant Size. (line 23)
* gawk, file names in: Special Files. (line 6)
* gawk, format-control characters: Control Letters. (line 18)
-* gawk, FPAT variable in <1>: User-modified. (line 45)
+* gawk, FPAT variable in <1>: User-modified. (line 43)
* gawk, FPAT variable in: Splitting By Content.
(line 27)
-* gawk, FUNCTAB array in: Auto-set. (line 128)
+* gawk, FUNCTAB array in: Auto-set. (line 123)
* gawk, function arguments and: Calling Built-in. (line 16)
* gawk, hexadecimal numbers and: Nondecimal-numbers. (line 42)
* gawk, IGNORECASE variable in <1>: Array Sorting Functions.
(line 83)
-* gawk, IGNORECASE variable in <2>: String Functions. (line 48)
-* gawk, IGNORECASE variable in <3>: Array Intro. (line 91)
-* gawk, IGNORECASE variable in <4>: User-modified. (line 82)
+* gawk, IGNORECASE variable in <2>: String Functions. (line 58)
+* gawk, IGNORECASE variable in <3>: Array Intro. (line 92)
+* gawk, IGNORECASE variable in <4>: User-modified. (line 76)
* gawk, IGNORECASE variable in: Case-sensitivity. (line 26)
* gawk, implementation issues: Notes. (line 6)
* gawk, implementation issues, debugging: Compatibility Mode. (line 6)
@@ -31594,61 +32491,61 @@ Index
(line 13)
* gawk, interpreter, adding code to: Using Internal File Ops.
(line 6)
-* gawk, interval expressions and: Regexp Operators. (line 139)
+* gawk, interval expressions and: Regexp Operators. (line 140)
* gawk, line continuation in: Conditional Exp. (line 34)
-* gawk, LINT variable in: User-modified. (line 98)
+* gawk, LINT variable in: User-modified. (line 88)
* gawk, list of contributors to: Contributors. (line 6)
* gawk, MS-DOS version of: PC Using. (line 10)
* gawk, MS-Windows version of: PC Using. (line 10)
* gawk, newlines in: Statements/Lines. (line 12)
* gawk, octal numbers and: Nondecimal-numbers. (line 42)
-* gawk, OS/2 version of: PC Using. (line 10)
-* gawk, PROCINFO array in <1>: Two-way I/O. (line 116)
+* gawk, OS/2 version of: PC Using. (line 16)
+* gawk, PROCINFO array in <1>: Two-way I/O. (line 117)
* gawk, PROCINFO array in <2>: Time Functions. (line 47)
-* gawk, PROCINFO array in: Auto-set. (line 142)
+* gawk, PROCINFO array in: Auto-set. (line 136)
* gawk, regexp constants and: Using Constant Regexps.
(line 28)
* gawk, regular expressions, case sensitivity: Case-sensitivity.
(line 26)
* gawk, regular expressions, operators: GNU Regexp Operators.
(line 6)
-* gawk, regular expressions, precedence: Regexp Operators. (line 161)
-* gawk, RT variable in <1>: Auto-set. (line 275)
+* gawk, regular expressions, precedence: Regexp Operators. (line 162)
+* gawk, RT variable in <1>: Auto-set. (line 265)
* gawk, RT variable in <2>: Multiple Line. (line 129)
-* gawk, RT variable in: Records. (line 132)
+* gawk, RT variable in: awk split records. (line 124)
* gawk, See Also awk: Preface. (line 36)
* gawk, source code, obtaining: Getting. (line 6)
* gawk, splitting fields and: Constant Size. (line 88)
* gawk, string-translation functions: I18N Functions. (line 6)
-* gawk, SYMTAB array in: Auto-set. (line 283)
-* gawk, TEXTDOMAIN variable in: User-modified. (line 162)
+* gawk, SYMTAB array in: Auto-set. (line 269)
+* gawk, TEXTDOMAIN variable in: User-modified. (line 152)
* gawk, timestamps: Time Functions. (line 6)
* gawk, uses for: Preface. (line 36)
-* gawk, versions of, information about, printing: Options. (line 293)
+* gawk, versions of, information about, printing: Options. (line 298)
* gawk, VMS version of: VMS Installation. (line 6)
* gawk, word-boundary operator: GNU Regexp Operators.
(line 63)
* gawkextlib: gawkextlib. (line 6)
* gawkextlib project: gawkextlib. (line 6)
-* General Public License (GPL): Glossary. (line 306)
+* General Public License (GPL): Glossary. (line 305)
* General Public License, See GPL: Manual History. (line 11)
* generate time values: Time Functions. (line 25)
-* gensub <1>: String Functions. (line 82)
+* gensub <1>: String Functions. (line 89)
* gensub: Using Constant Regexps.
(line 43)
* gensub() function (gawk), escape processing: Gory Details. (line 6)
* getaddrinfo() function (C library): TCP/IP Networking. (line 38)
* getgrent() function (C library): Group Functions. (line 6)
* getgrent() user-defined function: Group Functions. (line 6)
-* getgrgid() function (C library): Group Functions. (line 186)
-* getgrgid() user-defined function: Group Functions. (line 189)
-* getgrnam() function (C library): Group Functions. (line 175)
-* getgrnam() user-defined function: Group Functions. (line 180)
-* getgruser() function (C library): Group Functions. (line 195)
-* getgruser() function, user-defined: Group Functions. (line 198)
+* getgrgid() function (C library): Group Functions. (line 184)
+* getgrgid() user-defined function: Group Functions. (line 187)
+* getgrnam() function (C library): Group Functions. (line 173)
+* getgrnam() user-defined function: Group Functions. (line 178)
+* getgruser() function (C library): Group Functions. (line 193)
+* getgruser() function, user-defined: Group Functions. (line 196)
* getline command: Reading Files. (line 20)
* getline command, _gr_init() user-defined function: Group Functions.
- (line 82)
+ (line 83)
* getline command, _pw_init() function: Passwd Functions. (line 154)
* getline command, coprocesses, using from <1>: Close Files And Pipes.
(line 6)
@@ -31674,42 +32571,41 @@ Index
* getpwuid() function (C library): Passwd Functions. (line 188)
* getpwuid() user-defined function: Passwd Functions. (line 192)
* gettext library: Explaining gettext. (line 6)
-* gettext library, locale categories: Explaining gettext. (line 80)
-* gettext() function (C library): Explaining gettext. (line 62)
+* gettext library, locale categories: Explaining gettext. (line 81)
+* gettext() function (C library): Explaining gettext. (line 63)
* gettimeofday() extension function: Extension Sample Time.
- (line 13)
-* git utility <1>: Adding Code. (line 111)
+ (line 12)
+* git utility <1>: Adding Code. (line 112)
* git utility <2>: Accessing The Source.
(line 10)
* git utility <3>: Other Versions. (line 29)
* git utility: gawkextlib. (line 29)
-* git, use of for gawk source code: Derived Files. (line 6)
-* GMP: Gawk and MPFR. (line 6)
+* Git, use of for gawk source code: Derived Files. (line 6)
* GNITS mailing list: Acknowledgments. (line 52)
-* GNU awk, See gawk: Preface. (line 49)
+* GNU awk, See gawk: Preface. (line 53)
* GNU Free Documentation License: GNU Free Documentation License.
(line 7)
-* GNU General Public License: Glossary. (line 306)
-* GNU Lesser General Public License: Glossary. (line 397)
+* GNU General Public License: Glossary. (line 305)
+* GNU Lesser General Public License: Glossary. (line 396)
* GNU long options <1>: Options. (line 6)
* GNU long options: Command Line. (line 13)
* GNU long options, printing list of: Options. (line 154)
-* GNU Project <1>: Glossary. (line 315)
+* GNU Project <1>: Glossary. (line 314)
* GNU Project: Manual History. (line 11)
-* GNU/Linux <1>: Glossary. (line 616)
+* GNU/Linux <1>: Glossary. (line 611)
* GNU/Linux <2>: I18N Example. (line 55)
* GNU/Linux: Manual History. (line 28)
* Gordon, Assaf: Contributors. (line 105)
-* GPL (General Public License) <1>: Glossary. (line 306)
+* GPL (General Public License) <1>: Glossary. (line 305)
* GPL (General Public License): Manual History. (line 11)
* GPL (General Public License), printing: Options. (line 88)
* grcat program: Group Functions. (line 16)
* Grigera, Juan: Contributors. (line 57)
* group database, reading: Group Functions. (line 6)
* group file: Group Functions. (line 6)
-* group ID of gawk user: Auto-set. (line 186)
+* group ID of gawk user: Auto-set. (line 180)
* groups, information about: Group Functions. (line 6)
-* gsub <1>: String Functions. (line 135)
+* gsub <1>: String Functions. (line 139)
* gsub: Using Constant Regexps.
(line 43)
* gsub() function, arguments of: String Functions. (line 460)
@@ -31725,7 +32621,7 @@ Index
* help debugger command: Miscellaneous Debugger Commands.
(line 66)
* hexadecimal numbers: Nondecimal-numbers. (line 6)
-* hexadecimal values, enabling interpretation of: Options. (line 207)
+* hexadecimal values, enabling interpretation of: Options. (line 211)
* history expansion, in debugger: Readline Support. (line 6)
* histsort.awk program: History Sorting. (line 25)
* Hughes, Phil: Acknowledgments. (line 43)
@@ -31734,30 +32630,28 @@ Index
* hyphen (-), -- operator <1>: Precedence. (line 46)
* hyphen (-), -- operator: Increment Ops. (line 48)
* hyphen (-), -= operator <1>: Precedence. (line 95)
-* hyphen (-), -= operator: Assignment Ops. (line 129)
+* hyphen (-), -= operator: Assignment Ops. (line 130)
* hyphen (-), filenames beginning with: Options. (line 59)
* hyphen (-), in bracket expressions: Bracket Expressions. (line 17)
* i debugger command (alias for info): Debugger Info. (line 13)
* id utility: Id Program. (line 6)
* id.awk program: Id Program. (line 30)
-* IEEE-754 format: Floating-point Representation.
- (line 6)
* if statement: If Statement. (line 6)
* if statement, actions, changing: Ranges. (line 25)
* if statement, use of regexps in: Regexp Usage. (line 19)
* igawk.sh program: Igawk Program. (line 124)
* ignore breakpoint: Breakpoint Control. (line 87)
* ignore debugger command: Breakpoint Control. (line 87)
-* IGNORECASE variable: User-modified. (line 82)
-* IGNORECASE variable, and array indices: Array Intro. (line 91)
+* IGNORECASE variable: User-modified. (line 76)
+* IGNORECASE variable, and array indices: Array Intro. (line 92)
* IGNORECASE variable, and array sorting functions: Array Sorting Functions.
(line 83)
* IGNORECASE variable, in example programs: Library Functions.
(line 53)
* IGNORECASE variable, with ~ and !~ operators: Case-sensitivity.
(line 26)
-* Illumos: Other Versions. (line 104)
-* Illumos, POSIX-compliant awk: Other Versions. (line 104)
+* Illumos: Other Versions. (line 105)
+* Illumos, POSIX-compliant awk: Other Versions. (line 105)
* implementation issues, gawk: Notes. (line 6)
* implementation issues, gawk, debugging: Compatibility Mode. (line 6)
* implementation issues, gawk, limits <1>: Redirection. (line 135)
@@ -31773,8 +32667,8 @@ Index
(line 37)
* in operator, use in loops: Scanning an Array. (line 17)
* increment operators: Increment Ops. (line 6)
-* index: String Functions. (line 151)
-* indexing arrays: Array Intro. (line 49)
+* index: String Functions. (line 155)
+* indexing arrays: Array Intro. (line 50)
* indirect function calls: Indirect Calls. (line 6)
* infinite precision: Arbitrary Precision Arithmetic.
(line 6)
@@ -31791,7 +32685,7 @@ Index
* input files, running awk without: Read Terminal. (line 6)
* input files, variable assignments and: Other Arguments. (line 19)
* input pipeline: Getline/Pipe. (line 9)
-* input record, length of: String Functions. (line 171)
+* input record, length of: String Functions. (line 174)
* input redirection: Getline/File. (line 6)
* input, data, nondecimal: Nondecimal Data. (line 6)
* input, explicit: Getline. (line 6)
@@ -31801,90 +32695,88 @@ Index
* input, standard <1>: Special FD. (line 6)
* input, standard: Read Terminal. (line 6)
* input/output functions: I/O Functions. (line 6)
-* input/output, binary: User-modified. (line 10)
+* input/output, binary: User-modified. (line 15)
* input/output, from BEGIN and END: I/O And BEGIN/END. (line 6)
* input/output, two-way: Two-way I/O. (line 44)
* insomnia, cure for: Alarm Program. (line 6)
* installation, VMS: VMS Installation. (line 6)
* installing gawk: Installation. (line 6)
* instruction tracing, in debugger: Debugger Info. (line 89)
-* int: Numeric Functions. (line 23)
+* int: Numeric Functions. (line 37)
* INT signal (MS-Windows): Profiling. (line 214)
* integer array indices: Numeric Array Subscripts.
(line 31)
-* integers: General Arithmetic. (line 6)
* integers, arbitrary precision: Arbitrary Precision Integers.
(line 6)
-* integers, unsigned: General Arithmetic. (line 15)
-* interacting with other programs: I/O Functions. (line 72)
+* integers, unsigned: Computer Arithmetic. (line 41)
+* interacting with other programs: I/O Functions. (line 75)
* internationalization <1>: I18N and L10N. (line 6)
* internationalization: I18N Functions. (line 6)
* internationalization, localization <1>: Internationalization.
(line 13)
-* internationalization, localization: User-modified. (line 162)
+* internationalization, localization: User-modified. (line 152)
* internationalization, localization, character classes: Bracket Expressions.
- (line 90)
+ (line 98)
* internationalization, localization, gawk and: Internationalization.
(line 13)
* internationalization, localization, locale categories: Explaining gettext.
- (line 80)
+ (line 81)
* internationalization, localization, marked strings: Programmer i18n.
(line 14)
* internationalization, localization, portability and: I18N Portability.
(line 6)
* internationalizing a program: Explaining gettext. (line 6)
-* interpreted programs <1>: Glossary. (line 357)
+* interpreted programs <1>: Glossary. (line 356)
* interpreted programs: Basic High Level. (line 15)
-* interval expressions, regexp operator: Regexp Operators. (line 116)
+* interval expressions, regexp operator: Regexp Operators. (line 117)
* inventory-shipped file: Sample Data Files. (line 32)
-* invoke shell command: I/O Functions. (line 72)
+* invoke shell command: I/O Functions. (line 75)
* isarray: Type Functions. (line 11)
-* ISO: Glossary. (line 368)
+* ISO: Glossary. (line 367)
* ISO 8859-1: Glossary. (line 133)
* ISO Latin-1: Glossary. (line 133)
* Jacobs, Andrew: Passwd Functions. (line 90)
* Jaegermann, Michal <1>: Contributors. (line 45)
* Jaegermann, Michal: Acknowledgments. (line 60)
-* Java implementation of awk: Other Versions. (line 112)
-* Java programming language: Glossary. (line 380)
-* jawk: Other Versions. (line 112)
+* Java implementation of awk: Other Versions. (line 113)
+* Java programming language: Glossary. (line 379)
+* jawk: Other Versions. (line 113)
* Jedi knights: Undocumented. (line 6)
+* Johansen, Chris: Signature Program. (line 25)
* join() user-defined function: Join Function. (line 18)
* Kahrs, Ju"rgen <1>: Contributors. (line 70)
* Kahrs, Ju"rgen: Acknowledgments. (line 60)
* Kasal, Stepan: Acknowledgments. (line 60)
* Kenobi, Obi-Wan: Undocumented. (line 6)
* Kernighan, Brian <1>: Glossary. (line 143)
-* Kernighan, Brian <2>: Basic Data Typing. (line 55)
+* Kernighan, Brian <2>: Basic Data Typing. (line 54)
* Kernighan, Brian <3>: Other Versions. (line 13)
* Kernighan, Brian <4>: Contributors. (line 11)
* Kernighan, Brian <5>: BTL. (line 6)
* Kernighan, Brian <6>: Library Functions. (line 12)
* Kernighan, Brian <7>: Concatenation. (line 6)
* Kernighan, Brian <8>: Getline/Pipe. (line 6)
-* Kernighan, Brian <9>: Acknowledgments. (line 72)
+* Kernighan, Brian <9>: Acknowledgments. (line 76)
* Kernighan, Brian <10>: Conventions. (line 34)
* Kernighan, Brian: History. (line 17)
* kill command, dynamic profiling: Profiling. (line 188)
* Knights, jedi: Undocumented. (line 6)
-* Knuth, Donald: Arbitrary Precision Arithmetic.
- (line 6)
* Kwok, Conrad: Contributors. (line 34)
* l debugger command (alias for list): Miscellaneous Debugger Commands.
(line 72)
* labels.awk program: Labels Program. (line 51)
+* Langston, Peter: Advanced Features. (line 6)
* languages, data-driven: Basic High Level. (line 85)
-* Laurie, Dirk: Changing Precision. (line 6)
-* LC_ALL locale category: Explaining gettext. (line 120)
-* LC_COLLATE locale category: Explaining gettext. (line 93)
-* LC_CTYPE locale category: Explaining gettext. (line 97)
-* LC_MESSAGES locale category: Explaining gettext. (line 87)
+* LC_ALL locale category: Explaining gettext. (line 121)
+* LC_COLLATE locale category: Explaining gettext. (line 94)
+* LC_CTYPE locale category: Explaining gettext. (line 98)
+* LC_MESSAGES locale category: Explaining gettext. (line 88)
* LC_MESSAGES locale category, bindtextdomain() function (gawk): Programmer i18n.
(line 88)
-* LC_MONETARY locale category: Explaining gettext. (line 103)
-* LC_NUMERIC locale category: Explaining gettext. (line 107)
-* LC_RESPONSE locale category: Explaining gettext. (line 111)
-* LC_TIME locale category: Explaining gettext. (line 115)
+* LC_MONETARY locale category: Explaining gettext. (line 104)
+* LC_NUMERIC locale category: Explaining gettext. (line 108)
+* LC_RESPONSE locale category: Explaining gettext. (line 112)
+* LC_TIME locale category: Explaining gettext. (line 116)
* left angle bracket (<), < operator <1>: Precedence. (line 65)
* left angle bracket (<), < operator: Comparison Operators.
(line 11)
@@ -31895,12 +32787,12 @@ Index
* left shift: Bitwise Functions. (line 46)
* left shift, bitwise: Bitwise Functions. (line 32)
* leftmost longest match: Multiple Line. (line 26)
-* length: String Functions. (line 164)
-* length of input record: String Functions. (line 171)
-* length of string: String Functions. (line 164)
-* Lesser General Public License (LGPL): Glossary. (line 397)
-* LGPL (Lesser General Public License): Glossary. (line 397)
-* libmawk: Other Versions. (line 120)
+* length: String Functions. (line 167)
+* length of input record: String Functions. (line 174)
+* length of string: String Functions. (line 167)
+* Lesser General Public License (LGPL): Glossary. (line 396)
+* LGPL (Lesser General Public License): Glossary. (line 396)
+* libmawk: Other Versions. (line 121)
* libraries of awk functions: Library Functions. (line 6)
* libraries of awk functions, assertions: Assert Function. (line 6)
* libraries of awk functions, associative arrays and: Library Names.
@@ -31933,35 +32825,35 @@ Index
* lines, duplicate, removing: History Sorting. (line 6)
* lines, matching ranges of: Ranges. (line 6)
* lines, skipping between markers: Ranges. (line 43)
-* lint checking: User-modified. (line 98)
+* lint checking: User-modified. (line 88)
* lint checking, array elements: Delete. (line 34)
* lint checking, array subscripts: Uninitialized Subscripts.
(line 43)
* lint checking, empty programs: Command Line. (line 16)
-* lint checking, issuing warnings: Options. (line 182)
+* lint checking, issuing warnings: Options. (line 185)
* lint checking, POSIXLY_CORRECT environment variable: Options.
- (line 332)
+ (line 338)
* lint checking, undefined functions: Pass By Value/Reference.
(line 88)
-* LINT variable: User-modified. (line 98)
-* Linux <1>: Glossary. (line 616)
+* LINT variable: User-modified. (line 88)
+* Linux <1>: Glossary. (line 611)
* Linux <2>: I18N Example. (line 55)
* Linux: Manual History. (line 28)
* list all global variables, in debugger: Debugger Info. (line 48)
* list debugger command: Miscellaneous Debugger Commands.
(line 72)
* list function definitions, in debugger: Debugger Info. (line 30)
-* loading, library: Options. (line 173)
+* loading, extensions: Options. (line 173)
* local variables, in a function: Variable Scope. (line 6)
-* locale categories: Explaining gettext. (line 80)
-* locale decimal point character: Options. (line 263)
+* locale categories: Explaining gettext. (line 81)
+* locale decimal point character: Options. (line 268)
* locale, definition of: Locales. (line 6)
* localization: I18N and L10N. (line 6)
* localization, See internationalization, localization: I18N and L10N.
(line 6)
-* log: Numeric Functions. (line 30)
+* log: Numeric Functions. (line 44)
* log files, timestamps in: Time Functions. (line 6)
-* logarithm: Numeric Functions. (line 30)
+* logarithm: Numeric Functions. (line 44)
* logical false/true: Truth Values. (line 6)
* logical operators, See Boolean expressions: Boolean Ops. (line 6)
* login information: Passwd Functions. (line 16)
@@ -31982,17 +32874,17 @@ Index
* mail-list file: Sample Data Files. (line 6)
* mailing labels, printing: Labels Program. (line 6)
* mailing list, GNITS: Acknowledgments. (line 52)
-* Malmberg, John <1>: Bugs. (line 70)
+* Malmberg, John <1>: Bugs. (line 71)
* Malmberg, John: Acknowledgments. (line 60)
* mark parity: Ordinal Functions. (line 45)
* marked string extraction (internationalization): String Extraction.
(line 6)
* marked strings, extracting: String Extraction. (line 6)
* Marx, Groucho: Increment Ops. (line 60)
-* match: String Functions. (line 204)
-* match regexp in string: String Functions. (line 204)
+* match: String Functions. (line 207)
+* match regexp in string: String Functions. (line 207)
* match() function, RSTART/RLENGTH variables: String Functions.
- (line 221)
+ (line 224)
* matching, expressions, See comparison expressions: Typing and Comparison.
(line 9)
* matching, leftmost longest: Multiple Line. (line 26)
@@ -32002,24 +32894,25 @@ Index
* mawk utility <3>: Concatenation. (line 36)
* mawk utility <4>: Getline/Pipe. (line 62)
* mawk utility: Escape Sequences. (line 124)
-* maximum precision supported by MPFR library: Auto-set. (line 227)
+* maximum precision supported by MPFR library: Auto-set. (line 221)
+* McIlroy, Doug: Glossary. (line 149)
* McPhee, Patrick: Contributors. (line 100)
-* message object files: Explaining gettext. (line 41)
+* message object files: Explaining gettext. (line 42)
* message object files, converting from portable object files: I18N Example.
- (line 62)
+ (line 63)
* message object files, specifying directory of <1>: Programmer i18n.
(line 47)
* message object files, specifying directory of: Explaining gettext.
- (line 53)
+ (line 54)
* messages from extensions: Printing Messages. (line 6)
* metacharacters in regular expressions: Regexp Operators. (line 6)
* metacharacters, escape sequences for: Escape Sequences. (line 130)
-* minimum precision supported by MPFR library: Auto-set. (line 230)
+* minimum precision supported by MPFR library: Auto-set. (line 224)
* mktime: Time Functions. (line 25)
* modifiers, in format specifiers: Format Modifiers. (line 6)
-* monetary information, localization: Explaining gettext. (line 103)
-* MPFR: Gawk and MPFR. (line 6)
-* msgfmt utility: I18N Example. (line 62)
+* monetary information, localization: Explaining gettext. (line 104)
+* Moore, Duncan: Getline Notes. (line 40)
+* msgfmt utility: I18N Example. (line 63)
* multiple precision: Arbitrary Precision Arithmetic.
(line 6)
* multiple-line records: Multiple Line. (line 6)
@@ -32032,26 +32925,25 @@ Index
* namespace issues <1>: Library Names. (line 6)
* namespace issues: Arrays. (line 18)
* namespace issues, functions: Definition Syntax. (line 20)
-* nawk utility: Names. (line 17)
-* negative zero: Unexpected Results. (line 34)
-* NetBSD: Glossary. (line 616)
+* nawk utility: Names. (line 10)
+* NetBSD: Glossary. (line 611)
* networks, programming: TCP/IP Networking. (line 6)
* networks, support for: Special Network. (line 6)
* newlines <1>: Boolean Ops. (line 67)
-* newlines <2>: Options. (line 253)
+* newlines <2>: Options. (line 258)
* newlines: Statements/Lines. (line 6)
* newlines, as field separators: Default Field Splitting.
(line 6)
-* newlines, as record separators: Records. (line 20)
-* newlines, in dynamic regexps: Computed Regexps. (line 58)
-* newlines, in regexp constants: Computed Regexps. (line 68)
+* newlines, as record separators: awk split records. (line 12)
+* newlines, in dynamic regexps: Computed Regexps. (line 59)
+* newlines, in regexp constants: Computed Regexps. (line 69)
* newlines, printing: Print Examples. (line 12)
* newlines, separating statements in actions <1>: Statements. (line 10)
* newlines, separating statements in actions: Action Overview.
(line 19)
* next debugger command: Debugger Execution Control.
(line 43)
-* next file statement: Feature History. (line 168)
+* next file statement: Feature History. (line 169)
* next statement <1>: Next Statement. (line 6)
* next statement: Boolean Ops. (line 85)
* next statement, BEGIN/END patterns and: I/O And BEGIN/END. (line 37)
@@ -32067,7 +32959,7 @@ Index
(line 47)
* nexti debugger command: Debugger Execution Control.
(line 49)
-* NF variable <1>: Auto-set. (line 117)
+* NF variable <1>: Auto-set. (line 112)
* NF variable: Fields. (line 33)
* NF variable, decrementing: Changing Fields. (line 107)
* ni debugger command (alias for nexti): Debugger Execution Control.
@@ -32076,22 +32968,23 @@ Index
* non-existent array elements: Reference to Elements.
(line 23)
* not Boolean-logic operator: Boolean Ops. (line 6)
-* NR variable <1>: Auto-set. (line 137)
+* NR variable <1>: Auto-set. (line 131)
* NR variable: Records. (line 6)
-* NR variable, changing: Auto-set. (line 323)
+* NR variable, changing: Auto-set. (line 309)
* null strings <1>: Basic Data Typing. (line 26)
* null strings <2>: Truth Values. (line 6)
* null strings <3>: Regexp Field Splitting.
(line 43)
-* null strings: Records. (line 122)
-* null strings in gawk arguments, quoting and: Quoting. (line 62)
+* null strings: awk split records. (line 114)
+* null strings in gawk arguments, quoting and: Quoting. (line 79)
* null strings, and deleting array elements: Delete. (line 27)
* null strings, as array subscripts: Uninitialized Subscripts.
(line 43)
-* null strings, converting numbers to strings: Conversion. (line 21)
+* null strings, converting numbers to strings: Strings And Numbers.
+ (line 21)
* null strings, matching: Gory Details. (line 164)
* number as string of bits: Bitwise Functions. (line 109)
-* number of array elements: String Functions. (line 194)
+* number of array elements: String Functions. (line 197)
* number sign (#), #! (executable scripts): Executable Scripts.
(line 6)
* number sign (#), commenting: Comments. (line 6)
@@ -32101,9 +32994,8 @@ Index
* numbers, Cliff random: Cliff Random Function.
(line 6)
* numbers, converting <1>: Bitwise Functions. (line 109)
-* numbers, converting: Conversion. (line 6)
-* numbers, converting, to strings: User-modified. (line 28)
-* numbers, floating-point: General Arithmetic. (line 6)
+* numbers, converting: Strings And Numbers. (line 6)
+* numbers, converting, to strings: User-modified. (line 30)
* numbers, hexadecimal: Nondecimal-numbers. (line 6)
* numbers, octal: Nondecimal-numbers. (line 6)
* numbers, rounding: Round Function. (line 6)
@@ -32112,18 +33004,18 @@ Index
* numeric, output format: OFMT. (line 6)
* numeric, strings: Variable Typing. (line 6)
* o debugger command (alias for option): Debugger Info. (line 57)
-* oawk utility: Names. (line 17)
+* oawk utility: Names. (line 10)
* obsolete features: Obsolete. (line 6)
* octal numbers: Nondecimal-numbers. (line 6)
-* octal values, enabling interpretation of: Options. (line 207)
-* OFMT variable <1>: User-modified. (line 115)
-* OFMT variable <2>: Conversion. (line 55)
+* octal values, enabling interpretation of: Options. (line 211)
+* OFMT variable <1>: User-modified. (line 105)
+* OFMT variable <2>: Strings And Numbers. (line 57)
* OFMT variable: OFMT. (line 15)
* OFMT variable, POSIX awk and: OFMT. (line 27)
-* OFS variable <1>: User-modified. (line 124)
+* OFS variable <1>: User-modified. (line 114)
* OFS variable <2>: Output Separators. (line 6)
* OFS variable: Changing Fields. (line 64)
-* OpenBSD: Glossary. (line 616)
+* OpenBSD: Glossary. (line 611)
* OpenSolaris: Other Versions. (line 96)
* operating systems, BSD-based: Manual History. (line 28)
* operating systems, PC, gawk on: PC Using. (line 6)
@@ -32173,14 +33065,14 @@ Index
(line 12)
* ord() user-defined function: Ordinal Functions. (line 16)
* order of evaluation, concatenation: Concatenation. (line 41)
-* ORS variable <1>: User-modified. (line 129)
+* ORS variable <1>: User-modified. (line 119)
* ORS variable: Output Separators. (line 20)
* output field separator, See OFS variable: Changing Fields. (line 64)
* output record separator, See ORS variable: Output Separators.
(line 20)
* output redirection: Redirection. (line 6)
* output wrapper: Output Wrappers. (line 6)
-* output, buffering: I/O Functions. (line 29)
+* output, buffering: I/O Functions. (line 32)
* output, duplicating into files: Tee Program. (line 6)
* output, files, closing: Close Files And Pipes.
(line 6)
@@ -32192,12 +33084,12 @@ Index
* output, standard: Special FD. (line 6)
* p debugger command (alias for print): Viewing And Changing Data.
(line 36)
-* P1003.1 POSIX standard: Glossary. (line 454)
-* parent process ID of gawk process: Auto-set. (line 195)
+* Papadopoulos, Panos: Contributors. (line 128)
+* parent process ID of gawk process: Auto-set. (line 189)
* parentheses (), in a profile: Profiling. (line 146)
-* parentheses (), regexp operator: Regexp Operators. (line 79)
+* parentheses (), regexp operator: Regexp Operators. (line 80)
* password file: Passwd Functions. (line 16)
-* patsplit: String Functions. (line 291)
+* patsplit: String Functions. (line 294)
* patterns: Patterns and Actions.
(line 6)
* patterns, comparison expressions as: Expression Patterns. (line 14)
@@ -32210,13 +33102,13 @@ Index
* patterns, types of: Pattern Overview. (line 15)
* pawk (profiling version of Brian Kernighan's awk): Other Versions.
(line 78)
-* pawk, awk-like facilities for Python: Other Versions. (line 124)
+* pawk, awk-like facilities for Python: Other Versions. (line 125)
* PC operating systems, gawk on: PC Using. (line 6)
* PC operating systems, gawk on, installing: PC Installation. (line 6)
* percent sign (%), % operator: Precedence. (line 55)
* percent sign (%), %= operator <1>: Precedence. (line 95)
-* percent sign (%), %= operator: Assignment Ops. (line 129)
-* period (.), regexp operator: Regexp Operators. (line 43)
+* percent sign (%), %= operator: Assignment Ops. (line 130)
+* period (.), regexp operator: Regexp Operators. (line 44)
* Perl: Future Extensions. (line 6)
* Peters, Arno: Contributors. (line 85)
* Peterson, Hal: Contributors. (line 39)
@@ -32224,7 +33116,7 @@ Index
(line 6)
* pipe, input: Getline/Pipe. (line 9)
* pipe, output: Redirection. (line 57)
-* Pitts, Dave <1>: Bugs. (line 70)
+* Pitts, Dave <1>: Bugs. (line 71)
* Pitts, Dave: Acknowledgments. (line 60)
* Plauger, P.J.: Library Functions. (line 12)
* plug-in: Extension Intro. (line 6)
@@ -32233,51 +33125,51 @@ Index
* plus sign (+), ++ operator: Increment Ops. (line 11)
* plus sign (+), += operator <1>: Precedence. (line 95)
* plus sign (+), += operator: Assignment Ops. (line 82)
-* plus sign (+), regexp operator: Regexp Operators. (line 102)
+* plus sign (+), regexp operator: Regexp Operators. (line 103)
* pointers to functions: Indirect Calls. (line 6)
* portability: Escape Sequences. (line 94)
* portability, #! (executable scripts): Executable Scripts. (line 33)
* portability, ** operator and: Arithmetic Ops. (line 81)
-* portability, **= operator and: Assignment Ops. (line 142)
+* portability, **= operator and: Assignment Ops. (line 143)
* portability, ARGV variable: Executable Scripts. (line 42)
* portability, backslash continuation and: Statements/Lines. (line 30)
* portability, backslash in escape sequences: Escape Sequences.
(line 112)
* portability, close() function and: Close Files And Pipes.
(line 81)
-* portability, data files as single record: Records. (line 194)
+* portability, data files as single record: gawk split records.
+ (line 65)
* portability, deleting array elements: Delete. (line 56)
* portability, example programs: Library Functions. (line 42)
-* portability, functions, defining: Definition Syntax. (line 99)
+* portability, functions, defining: Definition Syntax. (line 105)
* portability, gawk: New Ports. (line 6)
-* portability, gettext library and: Explaining gettext. (line 10)
+* portability, gettext library and: Explaining gettext. (line 11)
* portability, internationalization and: I18N Portability. (line 6)
-* portability, length() function: String Functions. (line 173)
-* portability, new awk vs. old awk: Conversion. (line 55)
+* portability, length() function: String Functions. (line 176)
+* portability, new awk vs. old awk: Strings And Numbers. (line 57)
* portability, next statement in user-defined functions: Pass By Value/Reference.
(line 91)
* portability, NF variable, decrementing: Changing Fields. (line 115)
* portability, operators: Increment Ops. (line 60)
* portability, operators, not in POSIX awk: Precedence. (line 98)
-* portability, POSIXLY_CORRECT environment variable: Options. (line 353)
+* portability, POSIXLY_CORRECT environment variable: Options. (line 358)
* portability, substr() function: String Functions. (line 510)
* portable object files <1>: Translator i18n. (line 6)
-* portable object files: Explaining gettext. (line 36)
+* portable object files: Explaining gettext. (line 37)
* portable object files, converting to message object files: I18N Example.
- (line 62)
+ (line 63)
* portable object files, generating: Options. (line 147)
-* portable object template files: Explaining gettext. (line 30)
+* portable object template files: Explaining gettext. (line 31)
* porting gawk: New Ports. (line 6)
* positional specifiers, printf statement <1>: Printf Ordering.
(line 6)
* positional specifiers, printf statement: Format Modifiers. (line 13)
* positional specifiers, printf statement, mixing with regular formats: Printf Ordering.
(line 57)
-* positive zero: Unexpected Results. (line 34)
-* POSIX awk <1>: Assignment Ops. (line 136)
+* POSIX awk <1>: Assignment Ops. (line 137)
* POSIX awk: This Manual. (line 14)
* POSIX awk, ** operator and: Precedence. (line 98)
-* POSIX awk, **= operator and: Assignment Ops. (line 142)
+* POSIX awk, **= operator and: Assignment Ops. (line 143)
* POSIX awk, < operator and: Getline/File. (line 26)
* POSIX awk, arithmetic operators and: Arithmetic Ops. (line 30)
* POSIX awk, backslashes in string constants: Escape Sequences.
@@ -32289,36 +33181,35 @@ Index
* POSIX awk, break statement and: Break Statement. (line 51)
* POSIX awk, changes in awk versions: POSIX. (line 6)
* POSIX awk, continue statement and: Continue Statement. (line 43)
-* POSIX awk, CONVFMT variable and: User-modified. (line 28)
-* POSIX awk, date utility and: Time Functions. (line 263)
+* POSIX awk, CONVFMT variable and: User-modified. (line 30)
+* POSIX awk, date utility and: Time Functions. (line 254)
* POSIX awk, field separators and <1>: Field Splitting Summary.
(line 40)
* POSIX awk, field separators and: Fields. (line 6)
-* POSIX awk, FS variable and: User-modified. (line 66)
-* POSIX awk, function keyword in: Definition Syntax. (line 83)
+* POSIX awk, FS variable and: User-modified. (line 60)
+* POSIX awk, function keyword in: Definition Syntax. (line 89)
* POSIX awk, functions and, gsub()/sub(): Gory Details. (line 54)
-* POSIX awk, functions and, length(): String Functions. (line 173)
+* POSIX awk, functions and, length(): String Functions. (line 176)
* POSIX awk, GNU long options and: Options. (line 15)
-* POSIX awk, interval expressions in: Regexp Operators. (line 135)
+* POSIX awk, interval expressions in: Regexp Operators. (line 136)
* POSIX awk, next/nextfile statements and: Next Statement. (line 45)
* POSIX awk, numeric strings and: Variable Typing. (line 6)
-* POSIX awk, OFMT variable and <1>: Conversion. (line 55)
+* POSIX awk, OFMT variable and <1>: Strings And Numbers. (line 57)
* POSIX awk, OFMT variable and: OFMT. (line 27)
-* POSIX awk, period (.), using: Regexp Operators. (line 50)
+* POSIX awk, period (.), using: Regexp Operators. (line 51)
* POSIX awk, printf format strings and: Format Modifiers. (line 159)
-* POSIX awk, regular expressions and: Regexp Operators. (line 161)
+* POSIX awk, regular expressions and: Regexp Operators. (line 162)
* POSIX awk, timestamps and: Time Functions. (line 6)
* POSIX awk, | I/O operator and: Getline/Pipe. (line 55)
-* POSIX mode: Options. (line 247)
+* POSIX mode: Options. (line 252)
* POSIX, awk and: Preface. (line 23)
* POSIX, gawk extensions not included in: POSIX/GNU. (line 6)
* POSIX, programs, implementing in awk: Clones. (line 6)
-* POSIXLY_CORRECT environment variable: Options. (line 332)
-* PREC variable <1>: Setting Precision. (line 6)
-* PREC variable: User-modified. (line 134)
+* POSIXLY_CORRECT environment variable: Options. (line 338)
+* PREC variable: User-modified. (line 124)
* precedence <1>: Precedence. (line 6)
* precedence: Increment Ops. (line 60)
-* precedence, regexp operators: Regexp Operators. (line 156)
+* precedence, regexp operators: Regexp Operators. (line 157)
* print debugger command: Viewing And Changing Data.
(line 36)
* print statement: Printing. (line 16)
@@ -32326,7 +33217,7 @@ Index
* print statement, commas, omitting: Print Examples. (line 31)
* print statement, I/O operators in: Precedence. (line 71)
* print statement, line continuations and: Print Examples. (line 76)
-* print statement, OFMT variable and: User-modified. (line 124)
+* print statement, OFMT variable and: User-modified. (line 114)
* print statement, See Also redirection, of output: Redirection.
(line 17)
* print statement, sprintf() function and: Round Function. (line 6)
@@ -32357,78 +33248,78 @@ Index
* printing, unduplicated lines of text: Uniq Program. (line 6)
* printing, user information: Id Program. (line 6)
* private variables: Library Names. (line 11)
-* process group ID of gawk process: Auto-set. (line 189)
-* process ID of gawk process: Auto-set. (line 192)
+* process group ID of gawk process: Auto-set. (line 183)
+* process ID of gawk process: Auto-set. (line 186)
* processes, two-way communications with: Two-way I/O. (line 23)
* processing data: Basic High Level. (line 6)
* PROCINFO array <1>: Passwd Functions. (line 6)
* PROCINFO array <2>: Time Functions. (line 47)
-* PROCINFO array: Auto-set. (line 142)
-* PROCINFO array, and communications via ptys: Two-way I/O. (line 116)
+* PROCINFO array: Auto-set. (line 136)
+* PROCINFO array, and communications via ptys: Two-way I/O. (line 117)
* PROCINFO array, and group membership: Group Functions. (line 6)
* PROCINFO array, and user and group ID numbers: Id Program. (line 15)
* PROCINFO array, testing the field splitting: Passwd Functions.
(line 161)
-* PROCINFO array, uses: Auto-set. (line 248)
+* PROCINFO array, uses: Auto-set. (line 242)
* PROCINFO, values of sorted_in: Controlling Scanning.
- (line 24)
+ (line 26)
* profiling awk programs: Profiling. (line 6)
* profiling awk programs, dynamically: Profiling. (line 179)
-* program identifiers: Auto-set. (line 160)
+* program identifiers: Auto-set. (line 154)
* program, definition of: Getting Started. (line 21)
* programmers, attractiveness of: Two-way I/O. (line 6)
* programming conventions, --non-decimal-data option: Nondecimal Data.
(line 36)
-* programming conventions, ARGC/ARGV variables: Auto-set. (line 31)
+* programming conventions, ARGC/ARGV variables: Auto-set. (line 35)
* programming conventions, exit statement: Exit Statement. (line 38)
* programming conventions, function parameters: Return Statement.
(line 45)
* programming conventions, functions, calling: Calling Built-in.
(line 10)
* programming conventions, functions, writing: Definition Syntax.
- (line 55)
+ (line 61)
* programming conventions, gawk extensions: Internal File Ops.
(line 45)
* programming conventions, private variable names: Library Names.
(line 23)
* programming language, recipe for: History. (line 6)
-* programming languages, Ada: Glossary. (line 20)
+* programming languages, Ada: Glossary. (line 19)
* programming languages, data-driven vs. procedural: Getting Started.
(line 12)
-* programming languages, Java: Glossary. (line 380)
+* programming languages, Java: Glossary. (line 379)
* programming, basic steps: Basic High Level. (line 20)
* programming, concepts: Basic Concepts. (line 6)
* pwcat program: Passwd Functions. (line 23)
* q debugger command (alias for quit): Miscellaneous Debugger Commands.
(line 99)
-* QSE Awk: Other Versions. (line 130)
+* QSE Awk: Other Versions. (line 131)
* Quanstrom, Erik: Alarm Program. (line 8)
* question mark (?), ?: operator: Precedence. (line 92)
* question mark (?), regexp operator <1>: GNU Regexp Operators.
(line 59)
-* question mark (?), regexp operator: Regexp Operators. (line 111)
-* QuikTrim Awk: Other Versions. (line 134)
+* question mark (?), regexp operator: Regexp Operators. (line 112)
+* QuikTrim Awk: Other Versions. (line 135)
* quit debugger command: Miscellaneous Debugger Commands.
(line 99)
* QUIT signal (MS-Windows): Profiling. (line 214)
* quoting in gawk command lines: Long. (line 26)
-* quoting in gawk command lines, tricks for: Quoting. (line 71)
+* quoting in gawk command lines, tricks for: Quoting. (line 88)
* quoting, for small awk programs: Comments. (line 27)
* r debugger command (alias for run): Debugger Execution Control.
(line 62)
* Rakitzis, Byron: History Sorting. (line 25)
* Ramey, Chet <1>: General Data Types. (line 6)
* Ramey, Chet: Acknowledgments. (line 60)
-* rand: Numeric Functions. (line 34)
+* rand: Numeric Functions. (line 48)
* random numbers, Cliff: Cliff Random Function.
(line 6)
* random numbers, rand()/srand() functions: Numeric Functions.
- (line 34)
-* random numbers, seed of: Numeric Functions. (line 64)
+ (line 48)
+* random numbers, seed of: Numeric Functions. (line 78)
* range expressions (regexps): Bracket Expressions. (line 6)
* range patterns: Ranges. (line 6)
* range patterns, line continuation and: Ranges. (line 65)
-* Rankin, Pat <1>: Bugs. (line 70)
+* Rankin, Pat <1>: Bugs. (line 71)
* Rankin, Pat <2>: Contributors. (line 37)
* Rankin, Pat <3>: Assignment Ops. (line 100)
* Rankin, Pat: Acknowledgments. (line 60)
@@ -32443,19 +33334,20 @@ Index
* readfile() user-defined function: Readfile Function. (line 30)
* reading input files: Reading Files. (line 6)
* recipe for a programming language: History. (line 6)
-* record separators <1>: User-modified. (line 143)
-* record separators: Records. (line 14)
-* record separators, changing: Records. (line 93)
-* record separators, regular expressions as: Records. (line 132)
+* record separators <1>: User-modified. (line 133)
+* record separators: awk split records. (line 6)
+* record separators, changing: awk split records. (line 85)
+* record separators, regular expressions as: awk split records.
+ (line 124)
* record separators, with multiline records: Multiple Line. (line 10)
* records <1>: Basic High Level. (line 73)
* records: Reading Files. (line 14)
* records, multiline: Multiple Line. (line 6)
* records, printing: Print. (line 22)
* records, splitting input into: Records. (line 6)
-* records, terminating: Records. (line 132)
-* records, treating files as: Records. (line 219)
-* recursive functions: Definition Syntax. (line 73)
+* records, terminating: awk split records. (line 124)
+* records, treating files as: gawk split records. (line 92)
+* recursive functions: Definition Syntax. (line 79)
* redirect gawk output, in debugger: Debugger Info. (line 72)
* redirection of input: Getline/File. (line 6)
* redirection of output: Redirection. (line 6)
@@ -32466,12 +33358,12 @@ Index
(line 102)
* regexp constants <2>: Regexp Constants. (line 6)
* regexp constants: Regexp Usage. (line 57)
-* regexp constants, /=.../, /= operator and: Assignment Ops. (line 147)
+* regexp constants, /=.../, /= operator and: Assignment Ops. (line 148)
* regexp constants, as patterns: Expression Patterns. (line 34)
* regexp constants, in gawk: Using Constant Regexps.
(line 28)
-* regexp constants, slashes vs. quotes: Computed Regexps. (line 28)
-* regexp constants, vs. string constants: Computed Regexps. (line 38)
+* regexp constants, slashes vs. quotes: Computed Regexps. (line 29)
+* regexp constants, vs. string constants: Computed Regexps. (line 39)
* register extension: Registration Functions.
(line 6)
* regular expressions: Regexp. (line 6)
@@ -32481,18 +33373,19 @@ Index
(line 6)
* regular expressions, as patterns <1>: Regexp Patterns. (line 6)
* regular expressions, as patterns: Regexp Usage. (line 6)
-* regular expressions, as record separators: Records. (line 132)
-* regular expressions, case sensitivity <1>: User-modified. (line 82)
+* regular expressions, as record separators: awk split records.
+ (line 124)
+* regular expressions, case sensitivity <1>: User-modified. (line 76)
* regular expressions, case sensitivity: Case-sensitivity. (line 6)
* regular expressions, computed: Computed Regexps. (line 6)
* regular expressions, constants, See regexp constants: Regexp Usage.
(line 57)
* regular expressions, dynamic: Computed Regexps. (line 6)
* regular expressions, dynamic, with embedded newlines: Computed Regexps.
- (line 58)
+ (line 59)
* regular expressions, gawk, command-line options: GNU Regexp Operators.
(line 70)
-* regular expressions, interval expressions and: Options. (line 272)
+* regular expressions, interval expressions and: Options. (line 277)
* regular expressions, leftmost longest match: Leftmost Longest.
(line 6)
* regular expressions, operators <1>: Regexp Operators. (line 6)
@@ -32504,7 +33397,7 @@ Index
* regular expressions, operators, gawk: GNU Regexp Operators.
(line 6)
* regular expressions, operators, precedence of: Regexp Operators.
- (line 156)
+ (line 157)
* regular expressions, searching for: Egrep Program. (line 6)
* relational operators, See comparison operators: Typing and Comparison.
(line 9)
@@ -32513,7 +33406,7 @@ Index
(line 54)
* return statement, user-defined functions: Return Statement. (line 6)
* return value, close() function: Close Files And Pipes.
- (line 130)
+ (line 131)
* rev() user-defined function: Function Example. (line 53)
* revoutput extension: Extension Sample Revout.
(line 11)
@@ -32531,12 +33424,12 @@ Index
* right angle bracket (>), >> operator (I/O): Redirection. (line 50)
* right shift: Bitwise Functions. (line 52)
* right shift, bitwise: Bitwise Functions. (line 32)
-* Ritchie, Dennis: Basic Data Typing. (line 55)
-* RLENGTH variable: Auto-set. (line 262)
-* RLENGTH variable, match() function and: String Functions. (line 221)
+* Ritchie, Dennis: Basic Data Typing. (line 54)
+* RLENGTH variable: Auto-set. (line 252)
+* RLENGTH variable, match() function and: String Functions. (line 224)
* Robbins, Arnold <1>: Future Extensions. (line 6)
* Robbins, Arnold <2>: Bugs. (line 32)
-* Robbins, Arnold <3>: Contributors. (line 139)
+* Robbins, Arnold <3>: Contributors. (line 141)
* Robbins, Arnold <4>: General Data Types. (line 6)
* Robbins, Arnold <5>: Alarm Program. (line 6)
* Robbins, Arnold <6>: Passwd Functions. (line 90)
@@ -32544,28 +33437,25 @@ Index
* Robbins, Arnold: Command Line Field Separator.
(line 73)
* Robbins, Bill: Getline/Pipe. (line 39)
-* Robbins, Harry: Acknowledgments. (line 78)
-* Robbins, Jean: Acknowledgments. (line 78)
+* Robbins, Harry: Acknowledgments. (line 82)
+* Robbins, Jean: Acknowledgments. (line 82)
* Robbins, Miriam <1>: Passwd Functions. (line 90)
* Robbins, Miriam <2>: Getline/Pipe. (line 39)
-* Robbins, Miriam: Acknowledgments. (line 78)
+* Robbins, Miriam: Acknowledgments. (line 82)
* Rommel, Kai Uwe: Contributors. (line 42)
-* round to nearest integer: Numeric Functions. (line 23)
+* round to nearest integer: Numeric Functions. (line 37)
* round() user-defined function: Round Function. (line 16)
-* rounding mode, floating-point: Rounding Mode. (line 6)
* rounding numbers: Round Function. (line 6)
-* ROUNDMODE variable <1>: Setting Rounding Mode.
- (line 6)
-* ROUNDMODE variable: User-modified. (line 138)
-* RS variable <1>: User-modified. (line 143)
-* RS variable: Records. (line 20)
+* ROUNDMODE variable: User-modified. (line 128)
+* RS variable <1>: User-modified. (line 133)
+* RS variable: awk split records. (line 12)
* RS variable, multiline records and: Multiple Line. (line 17)
* rshift: Bitwise Functions. (line 52)
-* RSTART variable: Auto-set. (line 268)
-* RSTART variable, match() function and: String Functions. (line 221)
-* RT variable <1>: Auto-set. (line 275)
+* RSTART variable: Auto-set. (line 258)
+* RSTART variable, match() function and: String Functions. (line 224)
+* RT variable <1>: Auto-set. (line 265)
* RT variable <2>: Multiple Line. (line 129)
-* RT variable: Records. (line 132)
+* RT variable: awk split records. (line 124)
* Rubin, Paul <1>: Contributors. (line 15)
* Rubin, Paul: History. (line 30)
* rule, definition of: Getting Started. (line 21)
@@ -32576,33 +33466,34 @@ Index
(line 68)
* sample debugging session: Sample Debugging Session.
(line 6)
-* sandbox mode: Options. (line 279)
+* sandbox mode: Options. (line 284)
* save debugger options: Debugger Info. (line 84)
* scalar or array: Type Functions. (line 11)
* scalar values: Basic Data Typing. (line 13)
* scanning arrays: Scanning an Array. (line 6)
* scanning multidimensional arrays: Multiscanning. (line 11)
-* Schorr, Andrew <1>: Contributors. (line 131)
+* Schorr, Andrew <1>: Contributors. (line 133)
+* Schorr, Andrew <2>: Auto-set. (line 292)
* Schorr, Andrew: Acknowledgments. (line 60)
* Schreiber, Bert: Acknowledgments. (line 38)
* Schreiber, Rita: Acknowledgments. (line 38)
-* search and replace in strings: String Functions. (line 82)
-* search in string: String Functions. (line 151)
+* search and replace in strings: String Functions. (line 89)
+* search in string: String Functions. (line 155)
* search paths <1>: VMS Running. (line 58)
* search paths <2>: PC Using. (line 10)
-* search paths: Igawk Program. (line 368)
-* search paths, for shared libraries: AWKLIBPATH Variable. (line 6)
+* search paths: Programs Exercises. (line 63)
+* search paths, for loadable extensions: AWKLIBPATH Variable. (line 6)
* search paths, for source files <1>: VMS Running. (line 58)
* search paths, for source files <2>: PC Using. (line 10)
-* search paths, for source files <3>: Igawk Program. (line 368)
+* search paths, for source files <3>: Programs Exercises. (line 63)
* search paths, for source files: AWKPATH Variable. (line 6)
* searching, files for regular expressions: Egrep Program. (line 6)
* searching, for words: Dupword Program. (line 6)
-* sed utility <1>: Glossary. (line 12)
+* sed utility <1>: Glossary. (line 11)
* sed utility <2>: Simple Sed. (line 6)
* sed utility: Field Splitting Summary.
(line 46)
-* seeding random number generator: Numeric Functions. (line 64)
+* seeding random number generator: Numeric Functions. (line 78)
* semicolon (;), AWKPATH variable and: PC Using. (line 10)
* semicolon (;), separating statements in actions <1>: Statements.
(line 10)
@@ -32610,25 +33501,23 @@ Index
(line 19)
* semicolon (;), separating statements in actions: Statements/Lines.
(line 91)
-* separators, field: User-modified. (line 56)
-* separators, field, FIELDWIDTHS variable and: User-modified. (line 35)
-* separators, field, FPAT variable and: User-modified. (line 45)
+* separators, field: User-modified. (line 50)
+* separators, field, FIELDWIDTHS variable and: User-modified. (line 37)
+* separators, field, FPAT variable and: User-modified. (line 43)
* separators, field, POSIX and: Fields. (line 6)
-* separators, for records <1>: User-modified. (line 143)
-* separators, for records: Records. (line 14)
-* separators, for records, regular expressions as: Records. (line 132)
+* separators, for records <1>: User-modified. (line 133)
+* separators, for records: awk split records. (line 6)
+* separators, for records, regular expressions as: awk split records.
+ (line 124)
* separators, for statements in actions: Action Overview. (line 19)
-* separators, subscript: User-modified. (line 156)
+* separators, subscript: User-modified. (line 146)
* set breakpoint: Breakpoint Control. (line 11)
* set debugger command: Viewing And Changing Data.
(line 59)
* set directory of message catalogs: I18N Functions. (line 12)
* set watchpoint: Viewing And Changing Data.
(line 67)
-* setting rounding mode: Setting Rounding Mode.
- (line 6)
-* setting working precision: Setting Precision. (line 6)
-* shadowing of variable values: Definition Syntax. (line 61)
+* shadowing of variable values: Definition Syntax. (line 67)
* shell quoting, double quote: Read Terminal. (line 25)
* shell quoting, rules for: Quoting. (line 6)
* shells, piping commands into: Redirection. (line 142)
@@ -32661,7 +33550,7 @@ Index
* side effects, conditional expressions: Conditional Exp. (line 22)
* side effects, decrement/increment operators: Increment Ops. (line 11)
* side effects, FILENAME variable: Getline Notes. (line 19)
-* side effects, function calls: Function Calls. (line 54)
+* side effects, function calls: Function Calls. (line 57)
* side effects, statements: Action Overview. (line 32)
* sidebar, A Constant's Base Does Not Affect Its Value: Nondecimal-numbers.
(line 64)
@@ -32669,30 +33558,32 @@ Index
(line 110)
* sidebar, Changing FS Does Not Affect the Fields: Field Splitting Summary.
(line 38)
-* sidebar, Changing NR and FNR: Auto-set. (line 321)
+* sidebar, Changing NR and FNR: Auto-set. (line 307)
* sidebar, Controlling Output Buffering with system(): I/O Functions.
- (line 135)
+ (line 138)
* sidebar, Escape Sequences for Metacharacters: Escape Sequences.
(line 128)
* sidebar, FS and IGNORECASE: Field Splitting Summary.
(line 64)
* sidebar, Interactive Versus Noninteractive Buffering: I/O Functions.
- (line 104)
+ (line 107)
* sidebar, Matching the Null String: Gory Details. (line 162)
* sidebar, Operator Evaluation Order: Increment Ops. (line 58)
* sidebar, Piping into sh: Redirection. (line 140)
* sidebar, Portability Issues with #!: Executable Scripts. (line 31)
+* sidebar, Pre-POSIX awk Used OFMT For String Conversion: Strings And Numbers.
+ (line 55)
* sidebar, Recipe For A Programming Language: History. (line 6)
-* sidebar, RS = "\0" Is Not Portable: Records. (line 192)
+* sidebar, RS = "\0" Is Not Portable: gawk split records. (line 63)
* sidebar, So Why Does gawk have BEGINFILE and ENDFILE?: Filetrans Function.
(line 83)
* sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops.
- (line 145)
+ (line 146)
* sidebar, Understanding $0: Changing Fields. (line 134)
* sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps.
- (line 56)
+ (line 57)
* sidebar, Using close()'s Return Value: Close Files And Pipes.
- (line 128)
+ (line 129)
* SIGHUP signal, for dynamic profiling: Profiling. (line 211)
* SIGINT signal (MS-Windows): Profiling. (line 214)
* signals, HUP/SIGHUP, for profiling: Profiling. (line 211)
@@ -32704,14 +33595,13 @@ Index
* SIGUSR1 signal, for dynamic profiling: Profiling. (line 188)
* silent debugger command: Debugger Execution Control.
(line 10)
-* sin: Numeric Functions. (line 75)
-* sine: Numeric Functions. (line 75)
-* single precision floating-point: General Arithmetic. (line 21)
+* sin: Numeric Functions. (line 89)
+* sine: Numeric Functions. (line 89)
* single quote ('): One-shot. (line 15)
* single quote (') in gawk command lines: Long. (line 33)
-* single quote ('), in shell commands: Quoting. (line 31)
+* single quote ('), in shell commands: Quoting. (line 48)
* single quote ('), vs. apostrophe: Comments. (line 27)
-* single quote ('), with double quotes: Quoting. (line 53)
+* single quote ('), with double quotes: Quoting. (line 70)
* single-character fields: Single Character Fields.
(line 6)
* single-step execution, in the debugger: Debugger Execution Control.
@@ -32719,49 +33609,49 @@ Index
* Skywalker, Luke: Undocumented. (line 6)
* sleep utility: Alarm Program. (line 111)
* sleep() extension function: Extension Sample Time.
- (line 23)
+ (line 22)
* Solaris, POSIX-compliant awk: Other Versions. (line 96)
-* sort array: String Functions. (line 32)
-* sort array indices: String Functions. (line 32)
+* sort array: String Functions. (line 42)
+* sort array indices: String Functions. (line 42)
* sort function, arrays, sorting: Array Sorting Functions.
(line 6)
* sort utility: Word Sorting. (line 50)
* sort utility, coprocesses and: Two-way I/O. (line 83)
* sorting characters in different languages: Explaining gettext.
- (line 93)
+ (line 94)
* source code, awka: Other Versions. (line 64)
* source code, Brian Kernighan's awk: Other Versions. (line 13)
* source code, Busybox Awk: Other Versions. (line 88)
* source code, gawk: Gawk Distribution. (line 6)
-* source code, Illumos awk: Other Versions. (line 104)
-* source code, jawk: Other Versions. (line 112)
-* source code, libmawk: Other Versions. (line 120)
+* source code, Illumos awk: Other Versions. (line 105)
+* source code, jawk: Other Versions. (line 113)
+* source code, libmawk: Other Versions. (line 121)
* source code, mawk: Other Versions. (line 44)
* source code, mixing: Options. (line 117)
* source code, pawk: Other Versions. (line 78)
-* source code, pawk (Python version): Other Versions. (line 124)
-* source code, QSE Awk: Other Versions. (line 130)
-* source code, QuikTrim Awk: Other Versions. (line 134)
+* source code, pawk (Python version): Other Versions. (line 125)
+* source code, QSE Awk: Other Versions. (line 131)
+* source code, QuikTrim Awk: Other Versions. (line 135)
* source code, Solaris awk: Other Versions. (line 96)
-* source files, search path for: Igawk Program. (line 368)
-* sparse arrays: Array Intro. (line 70)
-* Spencer, Henry: Glossary. (line 12)
+* source files, search path for: Programs Exercises. (line 63)
+* sparse arrays: Array Intro. (line 71)
+* Spencer, Henry: Glossary. (line 11)
* split: String Functions. (line 313)
-* split string into array: String Functions. (line 291)
+* split string into array: String Functions. (line 294)
* split utility: Split Program. (line 6)
* split() function, array elements, deleting: Delete. (line 61)
* split.awk program: Split Program. (line 30)
-* sprintf <1>: String Functions. (line 378)
+* sprintf <1>: String Functions. (line 381)
* sprintf: OFMT. (line 15)
-* sprintf() function, OFMT variable and: User-modified. (line 124)
+* sprintf() function, OFMT variable and: User-modified. (line 114)
* sprintf() function, print/printf statements and: Round Function.
(line 6)
-* sqrt: Numeric Functions. (line 78)
-* square brackets ([]), regexp operator: Regexp Operators. (line 55)
-* square root: Numeric Functions. (line 78)
-* srand: Numeric Functions. (line 82)
+* sqrt: Numeric Functions. (line 92)
+* square brackets ([]), regexp operator: Regexp Operators. (line 56)
+* square root: Numeric Functions. (line 92)
+* srand: Numeric Functions. (line 96)
* stack frame: Debugging Terms. (line 10)
-* Stallman, Richard <1>: Glossary. (line 297)
+* Stallman, Richard <1>: Glossary. (line 296)
* Stallman, Richard <2>: Contributors. (line 23)
* Stallman, Richard <3>: Acknowledgments. (line 18)
* Stallman, Richard: Manual History. (line 6)
@@ -32786,21 +33676,21 @@ Index
(line 46)
* strftime: Time Functions. (line 48)
* string constants: Scalar Constants. (line 15)
-* string constants, vs. regexp constants: Computed Regexps. (line 38)
+* string constants, vs. regexp constants: Computed Regexps. (line 39)
* string extraction (internationalization): String Extraction.
(line 6)
-* string length: String Functions. (line 164)
+* string length: String Functions. (line 167)
* string operators: Concatenation. (line 8)
-* string, regular expression match: String Functions. (line 204)
+* string, regular expression match: String Functions. (line 207)
* string-manipulation functions: String Functions. (line 6)
* string-matching operators: Regexp Usage. (line 19)
* string-translation functions: I18N Functions. (line 6)
* strings splitting, example: String Functions. (line 333)
* strings, converting <1>: Bitwise Functions. (line 109)
-* strings, converting: Conversion. (line 6)
+* strings, converting: Strings And Numbers. (line 6)
* strings, converting letter case: String Functions. (line 520)
-* strings, converting, numbers to: User-modified. (line 28)
-* strings, empty, See null strings: Records. (line 122)
+* strings, converting, numbers to: User-modified. (line 30)
+* strings, empty, See null strings: awk split records. (line 114)
* strings, extracting: String Extraction. (line 6)
* strings, for localization: Programmer i18n. (line 14)
* strings, length limitations: Scalar Constants. (line 20)
@@ -32808,7 +33698,7 @@ Index
* strings, null: Regexp Field Splitting.
(line 43)
* strings, numeric: Variable Typing. (line 6)
-* strtonum: String Functions. (line 385)
+* strtonum: String Functions. (line 388)
* strtonum() function (gawk), --non-decimal-data option and: Nondecimal Data.
(line 36)
* sub <1>: String Functions. (line 406)
@@ -32816,7 +33706,7 @@ Index
(line 43)
* sub() function, arguments of: String Functions. (line 460)
* sub() function, escape processing: Gory Details. (line 6)
-* subscript separators: User-modified. (line 156)
+* subscript separators: User-modified. (line 146)
* subscripts in arrays, multidimensional: Multidimensional. (line 10)
* subscripts in arrays, multidimensional, scanning: Multiscanning.
(line 11)
@@ -32824,19 +33714,19 @@ Index
(line 6)
* subscripts in arrays, uninitialized variables as: Uninitialized Subscripts.
(line 6)
-* SUBSEP variable: User-modified. (line 156)
+* SUBSEP variable: User-modified. (line 146)
* SUBSEP variable, and multidimensional arrays: Multidimensional.
(line 16)
-* substitute in string: String Functions. (line 82)
+* substitute in string: String Functions. (line 89)
* substr: String Functions. (line 479)
* substring: String Functions. (line 479)
* Sumner, Andrew: Other Versions. (line 64)
-* supplementary groups of gawk process: Auto-set. (line 243)
+* supplementary groups of gawk process: Auto-set. (line 237)
* switch statement: Switch Statement. (line 6)
-* SYMTAB array: Auto-set. (line 283)
+* SYMTAB array: Auto-set. (line 269)
* syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops.
- (line 147)
-* system: I/O Functions. (line 72)
+ (line 148)
+* system: I/O Functions. (line 75)
* systime: Time Functions. (line 66)
* t debugger command (alias for tbreak): Breakpoint Control. (line 90)
* tbreak debugger command: Breakpoint Control. (line 90)
@@ -32846,11 +33736,11 @@ Index
* tee utility: Tee Program. (line 6)
* tee.awk program: Tee Program. (line 26)
* temporary breakpoint: Breakpoint Control. (line 90)
-* terminating records: Records. (line 132)
+* terminating records: awk split records. (line 124)
* testbits.awk program: Bitwise Functions. (line 70)
* testext extension: Extension Sample API Tests.
(line 6)
-* Texinfo <1>: Adding Code. (line 99)
+* Texinfo <1>: Adding Code. (line 100)
* Texinfo <2>: Distribution contents.
(line 77)
* Texinfo <3>: Extract Program. (line 12)
@@ -32863,10 +33753,10 @@ Index
* text, printing: Print. (line 22)
* text, printing, unduplicated lines of: Uniq Program. (line 6)
* TEXTDOMAIN variable <1>: Programmer i18n. (line 9)
-* TEXTDOMAIN variable: User-modified. (line 162)
+* TEXTDOMAIN variable: User-modified. (line 152)
* TEXTDOMAIN variable, BEGIN pattern and: Programmer i18n. (line 60)
* TEXTDOMAIN variable, portability and: I18N Portability. (line 20)
-* textdomain() function (C library): Explaining gettext. (line 27)
+* textdomain() function (C library): Explaining gettext. (line 28)
* tilde (~), ~ operator <1>: Expression Patterns. (line 24)
* tilde (~), ~ operator <2>: Precedence. (line 80)
* tilde (~), ~ operator <3>: Comparison Operators.
@@ -32877,7 +33767,7 @@ Index
* tilde (~), ~ operator: Regexp Usage. (line 19)
* time functions: Time Functions. (line 6)
* time, alarm clock example program: Alarm Program. (line 11)
-* time, localization and: Explaining gettext. (line 115)
+* time, localization and: Explaining gettext. (line 116)
* time, managing: Getlocaltime Function.
(line 6)
* time, retrieving: Time Functions. (line 17)
@@ -32894,8 +33784,8 @@ Index
* traceback, display in debugger: Execution Stack. (line 13)
* translate string: I18N Functions. (line 22)
* translate.awk program: Translate Program. (line 55)
-* treating files, as single records: Records. (line 219)
-* troubleshooting, --non-decimal-data option: Options. (line 207)
+* treating files, as single records: gawk split records. (line 92)
+* troubleshooting, --non-decimal-data option: Options. (line 211)
* troubleshooting, == operator: Comparison Operators.
(line 37)
* troubleshooting, awk uses FS not IFS: Field Separators. (line 30)
@@ -32906,26 +33796,25 @@ Index
(line 23)
* troubleshooting, fatal errors, printf format strings: Format Modifiers.
(line 159)
-* troubleshooting, fflush() function: I/O Functions. (line 60)
-* troubleshooting, function call syntax: Function Calls. (line 28)
+* troubleshooting, fflush() function: I/O Functions. (line 63)
+* troubleshooting, function call syntax: Function Calls. (line 30)
* troubleshooting, gawk: Compatibility Mode. (line 6)
* troubleshooting, gawk, bug reports: Bugs. (line 9)
* troubleshooting, gawk, fatal errors, function arguments: Calling Built-in.
(line 16)
* troubleshooting, getline function: File Checking. (line 25)
* troubleshooting, gsub()/sub() functions: String Functions. (line 470)
-* troubleshooting, match() function: String Functions. (line 286)
-* troubleshooting, patsplit() function: String Functions. (line 309)
+* troubleshooting, match() function: String Functions. (line 289)
* troubleshooting, print statement, omitting commas: Print Examples.
(line 31)
* troubleshooting, printing: Redirection. (line 118)
* troubleshooting, quotes with file names: Special FD. (line 68)
* troubleshooting, readable data files: File Checking. (line 6)
* troubleshooting, regexp constants vs. string constants: Computed Regexps.
- (line 38)
+ (line 39)
* troubleshooting, string concatenation: Concatenation. (line 26)
* troubleshooting, substr() function: String Functions. (line 497)
-* troubleshooting, system() function: I/O Functions. (line 94)
+* troubleshooting, system() function: I/O Functions. (line 97)
* troubleshooting, typographical errors, global variables: Options.
(line 98)
* true, logical: Truth Values. (line 6)
@@ -32934,14 +33823,14 @@ Index
* Trueman, David: History. (line 30)
* trunc-mod operation: Arithmetic Ops. (line 66)
* truth values: Truth Values. (line 6)
-* type conversion: Conversion. (line 21)
+* type conversion: Strings And Numbers. (line 21)
* u debugger command (alias for until): Debugger Execution Control.
(line 83)
* unassigned array elements: Reference to Elements.
(line 18)
* undefined functions: Pass By Value/Reference.
(line 71)
-* underscore (_), C macro: Explaining gettext. (line 70)
+* underscore (_), C macro: Explaining gettext. (line 71)
* underscore (_), in names of private variables: Library Names.
(line 29)
* underscore (_), translatable string: Programmer i18n. (line 69)
@@ -32955,21 +33844,21 @@ Index
(line 6)
* uniq utility: Uniq Program. (line 6)
* uniq.awk program: Uniq Program. (line 65)
-* Unix: Glossary. (line 616)
+* Unix: Glossary. (line 611)
* Unix awk, backslashes in escape sequences: Escape Sequences.
(line 124)
* Unix awk, close() function and: Close Files And Pipes.
- (line 130)
+ (line 131)
* Unix awk, password files, field separators and: Command Line Field Separator.
(line 64)
* Unix, awk scripts and: Executable Scripts. (line 6)
* UNIXROOT variable, on OS/2 systems: PC Using. (line 16)
-* unsigned integers: General Arithmetic. (line 15)
+* unsigned integers: Computer Arithmetic. (line 41)
* until debugger command: Debugger Execution Control.
(line 83)
* unwatch debugger command: Viewing And Changing Data.
(line 84)
-* up debugger command: Execution Stack. (line 33)
+* up debugger command: Execution Stack. (line 34)
* user database, reading: Passwd Functions. (line 6)
* user-defined functions: User-defined. (line 6)
* user-defined, functions, counts, in a profile: Profiling. (line 137)
@@ -33005,18 +33894,18 @@ Index
* variables, names of: Arrays. (line 18)
* variables, private: Library Names. (line 11)
* variables, setting: Options. (line 32)
-* variables, shadowing: Definition Syntax. (line 61)
+* variables, shadowing: Definition Syntax. (line 67)
* variables, types of: Assignment Ops. (line 40)
* variables, types of, comparison expressions and: Typing and Comparison.
(line 9)
* variables, uninitialized, as array subscripts: Uninitialized Subscripts.
(line 6)
* variables, user-defined: Variables. (line 6)
-* version of gawk: Auto-set. (line 213)
-* version of gawk extension API: Auto-set. (line 238)
-* version of GNU MP library: Auto-set. (line 224)
-* version of GNU MPFR library: Auto-set. (line 220)
-* vertical bar (|): Regexp Operators. (line 69)
+* version of gawk: Auto-set. (line 207)
+* version of gawk extension API: Auto-set. (line 232)
+* version of GNU MP library: Auto-set. (line 218)
+* version of GNU MPFR library: Auto-set. (line 214)
+* vertical bar (|): Regexp Operators. (line 70)
* vertical bar (|), | operator (I/O) <1>: Precedence. (line 65)
* vertical bar (|), | operator (I/O): Getline/Pipe. (line 9)
* vertical bar (|), |& operator (I/O) <1>: Two-way I/O. (line 44)
@@ -33036,7 +33925,7 @@ Index
* Wall, Larry <1>: Future Extensions. (line 6)
* Wall, Larry: Array Intro. (line 6)
* Wallin, Anders: Contributors. (line 103)
-* warnings, issuing: Options. (line 182)
+* warnings, issuing: Options. (line 185)
* watch debugger command: Viewing And Changing Data.
(line 67)
* watchpoint: Debugging Terms. (line 42)
@@ -33049,7 +33938,7 @@ Index
* whitespace, as field separators: Default Field Splitting.
(line 6)
* whitespace, functions, calling: Calling Built-in. (line 10)
-* whitespace, newlines as: Options. (line 253)
+* whitespace, newlines as: Options. (line 258)
* Williams, Kent: Contributors. (line 34)
* Woehlke, Matthew: Contributors. (line 79)
* Woods, John: Contributors. (line 27)
@@ -33068,17 +33957,16 @@ Index
* xgettext utility: String Extraction. (line 13)
* xor: Bitwise Functions. (line 55)
* XOR bitwise operation: Bitwise Functions. (line 6)
-* Yawitz, Efraim: Contributors. (line 129)
-* Zaretskii, Eli <1>: Bugs. (line 70)
+* Yawitz, Efraim: Contributors. (line 131)
+* Zaretskii, Eli <1>: Bugs. (line 71)
* Zaretskii, Eli <2>: Contributors. (line 55)
* Zaretskii, Eli: Acknowledgments. (line 60)
-* zero, negative vs. positive: Unexpected Results. (line 34)
* zerofile.awk program: Empty Files. (line 21)
* Zoulas, Christos: Contributors. (line 66)
* {} (braces): Profiling. (line 142)
* {} (braces), actions and: Action Overview. (line 19)
* {} (braces), statements, grouping: Statements. (line 10)
-* | (vertical bar): Regexp Operators. (line 69)
+* | (vertical bar): Regexp Operators. (line 70)
* | (vertical bar), | operator (I/O) <1>: Precedence. (line 65)
* | (vertical bar), | operator (I/O) <2>: Redirection. (line 57)
* | (vertical bar), | operator (I/O): Getline/Pipe. (line 9)
@@ -33087,7 +33975,7 @@ Index
* | (vertical bar), |& operator (I/O) <3>: Redirection. (line 102)
* | (vertical bar), |& operator (I/O): Getline/Coprocess. (line 6)
* | (vertical bar), |& operator (I/O), pipes, closing: Close Files And Pipes.
- (line 118)
+ (line 119)
* | (vertical bar), || operator <1>: Precedence. (line 89)
* | (vertical bar), || operator: Boolean Ops. (line 57)
* ~ (tilde), ~ operator <1>: Expression Patterns. (line 24)
@@ -33102,530 +33990,553 @@ Index

Tag Table:
-Node: Top1292
-Node: Foreword40821
-Node: Preface45166
-Ref: Preface-Footnote-148219
-Ref: Preface-Footnote-248315
-Node: History48547
-Node: Names50921
-Ref: Names-Footnote-152398
-Node: This Manual52470
-Ref: This Manual-Footnote-158244
-Node: Conventions58344
-Node: Manual History60500
-Ref: Manual History-Footnote-163948
-Ref: Manual History-Footnote-263989
-Node: How To Contribute64063
-Node: Acknowledgments65207
-Node: Getting Started69401
-Node: Running gawk71780
-Node: One-shot72966
-Node: Read Terminal74191
-Ref: Read Terminal-Footnote-175841
-Ref: Read Terminal-Footnote-276117
-Node: Long76288
-Node: Executable Scripts77664
-Ref: Executable Scripts-Footnote-179497
-Ref: Executable Scripts-Footnote-279599
-Node: Comments80146
-Node: Quoting82613
-Node: DOS Quoting87236
-Node: Sample Data Files87911
-Node: Very Simple90426
-Node: Two Rules95077
-Node: More Complex96975
-Ref: More Complex-Footnote-199905
-Node: Statements/Lines99990
-Ref: Statements/Lines-Footnote-1104453
-Node: Other Features104718
-Node: When105646
-Node: Invoking Gawk107793
-Node: Command Line109256
-Node: Options110039
-Ref: Options-Footnote-1125417
-Node: Other Arguments125442
-Node: Naming Standard Input128100
-Node: Environment Variables129194
-Node: AWKPATH Variable129752
-Ref: AWKPATH Variable-Footnote-1132533
-Ref: AWKPATH Variable-Footnote-2132578
-Node: AWKLIBPATH Variable132838
-Node: Other Environment Variables133556
-Node: Exit Status136519
-Node: Include Files137194
-Node: Loading Shared Libraries140763
-Node: Obsolete142127
-Node: Undocumented142824
-Node: Regexp143066
-Node: Regexp Usage144455
-Node: Escape Sequences146480
-Node: Regexp Operators152149
-Ref: Regexp Operators-Footnote-1159529
-Ref: Regexp Operators-Footnote-2159676
-Node: Bracket Expressions159774
-Ref: table-char-classes161664
-Node: GNU Regexp Operators164187
-Node: Case-sensitivity167910
-Ref: Case-sensitivity-Footnote-1170878
-Ref: Case-sensitivity-Footnote-2171113
-Node: Leftmost Longest171221
-Node: Computed Regexps172422
-Node: Reading Files175759
-Node: Records177761
-Ref: Records-Footnote-1187284
-Node: Fields187321
-Ref: Fields-Footnote-1190277
-Node: Nonconstant Fields190363
-Node: Changing Fields192569
-Node: Field Separators198528
-Node: Default Field Splitting201230
-Node: Regexp Field Splitting202347
-Node: Single Character Fields205689
-Node: Command Line Field Separator206748
-Node: Full Line Fields210090
-Ref: Full Line Fields-Footnote-1210598
-Node: Field Splitting Summary210644
-Ref: Field Splitting Summary-Footnote-1213743
-Node: Constant Size213844
-Node: Splitting By Content218451
-Ref: Splitting By Content-Footnote-1222200
-Node: Multiple Line222240
-Ref: Multiple Line-Footnote-1228087
-Node: Getline228266
-Node: Plain Getline230482
-Node: Getline/Variable232577
-Node: Getline/File233724
-Node: Getline/Variable/File235065
-Ref: Getline/Variable/File-Footnote-1236664
-Node: Getline/Pipe236751
-Node: Getline/Variable/Pipe239450
-Node: Getline/Coprocess240557
-Node: Getline/Variable/Coprocess241809
-Node: Getline Notes242546
-Node: Getline Summary245333
-Ref: table-getline-variants245741
-Node: Read Timeout246653
-Ref: Read Timeout-Footnote-1250394
-Node: Command line directories250451
-Node: Printing251081
-Node: Print252712
-Node: Print Examples254049
-Node: Output Separators256833
-Node: OFMT258849
-Node: Printf260207
-Node: Basic Printf261113
-Node: Control Letters262652
-Node: Format Modifiers266464
-Node: Printf Examples272473
-Node: Redirection275185
-Node: Special Files282159
-Node: Special FD282692
-Ref: Special FD-Footnote-1286317
-Node: Special Network286391
-Node: Special Caveats287241
-Node: Close Files And Pipes288037
-Ref: Close Files And Pipes-Footnote-1295020
-Ref: Close Files And Pipes-Footnote-2295168
-Node: Expressions295318
-Node: Values296450
-Node: Constants297126
-Node: Scalar Constants297806
-Ref: Scalar Constants-Footnote-1298665
-Node: Nondecimal-numbers298847
-Node: Regexp Constants301847
-Node: Using Constant Regexps302322
-Node: Variables305377
-Node: Using Variables306032
-Node: Assignment Options307756
-Node: Conversion309631
-Ref: table-locale-affects315131
-Ref: Conversion-Footnote-1315755
-Node: All Operators315864
-Node: Arithmetic Ops316494
-Node: Concatenation318999
-Ref: Concatenation-Footnote-1321787
-Node: Assignment Ops321907
-Ref: table-assign-ops326895
-Node: Increment Ops328226
-Node: Truth Values and Conditions331660
-Node: Truth Values332743
-Node: Typing and Comparison333792
-Node: Variable Typing334585
-Ref: Variable Typing-Footnote-1338482
-Node: Comparison Operators338604
-Ref: table-relational-ops339014
-Node: POSIX String Comparison342562
-Ref: POSIX String Comparison-Footnote-1343518
-Node: Boolean Ops343656
-Ref: Boolean Ops-Footnote-1347726
-Node: Conditional Exp347817
-Node: Function Calls349549
-Node: Precedence353143
-Node: Locales356812
-Node: Patterns and Actions357901
-Node: Pattern Overview358955
-Node: Regexp Patterns360624
-Node: Expression Patterns361167
-Node: Ranges364948
-Node: BEGIN/END368052
-Node: Using BEGIN/END368814
-Ref: Using BEGIN/END-Footnote-1371550
-Node: I/O And BEGIN/END371656
-Node: BEGINFILE/ENDFILE373938
-Node: Empty376852
-Node: Using Shell Variables377169
-Node: Action Overview379454
-Node: Statements381811
-Node: If Statement383665
-Node: While Statement385164
-Node: Do Statement387208
-Node: For Statement388364
-Node: Switch Statement391516
-Node: Break Statement393670
-Node: Continue Statement395660
-Node: Next Statement397453
-Node: Nextfile Statement399843
-Node: Exit Statement402498
-Node: Built-in Variables404914
-Node: User-modified406009
-Ref: User-modified-Footnote-1414367
-Node: Auto-set414429
-Ref: Auto-set-Footnote-1427886
-Ref: Auto-set-Footnote-2428091
-Node: ARGC and ARGV428147
-Node: Arrays432001
-Node: Array Basics433506
-Node: Array Intro434332
-Node: Reference to Elements438649
-Node: Assigning Elements440919
-Node: Array Example441410
-Node: Scanning an Array443142
-Node: Controlling Scanning445456
-Ref: Controlling Scanning-Footnote-1450543
-Node: Delete450859
-Ref: Delete-Footnote-1453624
-Node: Numeric Array Subscripts453681
-Node: Uninitialized Subscripts455864
-Node: Multidimensional457491
-Node: Multiscanning460584
-Node: Arrays of Arrays462173
-Node: Functions466813
-Node: Built-in467632
-Node: Calling Built-in468710
-Node: Numeric Functions470698
-Ref: Numeric Functions-Footnote-1474530
-Ref: Numeric Functions-Footnote-2474887
-Ref: Numeric Functions-Footnote-3474935
-Node: String Functions475204
-Ref: String Functions-Footnote-1498162
-Ref: String Functions-Footnote-2498291
-Ref: String Functions-Footnote-3498539
-Node: Gory Details498626
-Ref: table-sub-escapes500305
-Ref: table-sub-posix-92501659
-Ref: table-sub-proposed503010
-Ref: table-posix-sub504364
-Ref: table-gensub-escapes505909
-Ref: Gory Details-Footnote-1507085
-Ref: Gory Details-Footnote-2507136
-Node: I/O Functions507287
-Ref: I/O Functions-Footnote-1514277
-Node: Time Functions514424
-Ref: Time Functions-Footnote-1525407
-Ref: Time Functions-Footnote-2525475
-Ref: Time Functions-Footnote-3525633
-Ref: Time Functions-Footnote-4525744
-Ref: Time Functions-Footnote-5525856
-Ref: Time Functions-Footnote-6526083
-Node: Bitwise Functions526349
-Ref: table-bitwise-ops526911
-Ref: Bitwise Functions-Footnote-1531132
-Node: Type Functions531316
-Node: I18N Functions532467
-Node: User-defined534094
-Node: Definition Syntax534898
-Ref: Definition Syntax-Footnote-1539812
-Node: Function Example539881
-Ref: Function Example-Footnote-1542530
-Node: Function Caveats542552
-Node: Calling A Function543070
-Node: Variable Scope544025
-Node: Pass By Value/Reference546988
-Node: Return Statement550496
-Node: Dynamic Typing553477
-Node: Indirect Calls554408
-Node: Library Functions564095
-Ref: Library Functions-Footnote-1567608
-Ref: Library Functions-Footnote-2567751
-Node: Library Names567922
-Ref: Library Names-Footnote-1571395
-Ref: Library Names-Footnote-2571615
-Node: General Functions571701
-Node: Strtonum Function572729
-Node: Assert Function575659
-Node: Round Function578985
-Node: Cliff Random Function580526
-Node: Ordinal Functions581542
-Ref: Ordinal Functions-Footnote-1584619
-Ref: Ordinal Functions-Footnote-2584871
-Node: Join Function585082
-Ref: Join Function-Footnote-1586853
-Node: Getlocaltime Function587053
-Node: Readfile Function590794
-Node: Data File Management592633
-Node: Filetrans Function593265
-Node: Rewind Function597334
-Node: File Checking598721
-Node: Empty Files599815
-Node: Ignoring Assigns602045
-Node: Getopt Function603599
-Ref: Getopt Function-Footnote-1614902
-Node: Passwd Functions615105
-Ref: Passwd Functions-Footnote-1624083
-Node: Group Functions624171
-Node: Walking Arrays632255
-Node: Sample Programs634391
-Node: Running Examples635065
-Node: Clones635793
-Node: Cut Program637017
-Node: Egrep Program646868
-Ref: Egrep Program-Footnote-1654641
-Node: Id Program654751
-Node: Split Program658400
-Ref: Split Program-Footnote-1661919
-Node: Tee Program662047
-Node: Uniq Program664850
-Node: Wc Program672279
-Ref: Wc Program-Footnote-1676545
-Ref: Wc Program-Footnote-2676745
-Node: Miscellaneous Programs676837
-Node: Dupword Program678025
-Node: Alarm Program680056
-Node: Translate Program684863
-Ref: Translate Program-Footnote-1689250
-Ref: Translate Program-Footnote-2689498
-Node: Labels Program689632
-Ref: Labels Program-Footnote-1693003
-Node: Word Sorting693087
-Node: History Sorting696971
-Node: Extract Program698810
-Ref: Extract Program-Footnote-1706313
-Node: Simple Sed706441
-Node: Igawk Program709503
-Ref: Igawk Program-Footnote-1724660
-Ref: Igawk Program-Footnote-2724861
-Node: Anagram Program724999
-Node: Signature Program728067
-Node: Advanced Features729167
-Node: Nondecimal Data731053
-Node: Array Sorting732636
-Node: Controlling Array Traversal733333
-Node: Array Sorting Functions741617
-Ref: Array Sorting Functions-Footnote-1745486
-Node: Two-way I/O745680
-Ref: Two-way I/O-Footnote-1751112
-Node: TCP/IP Networking751194
-Node: Profiling754038
-Node: Internationalization761541
-Node: I18N and L10N762966
-Node: Explaining gettext763652
-Ref: Explaining gettext-Footnote-1768720
-Ref: Explaining gettext-Footnote-2768904
-Node: Programmer i18n769069
-Node: Translator i18n773271
-Node: String Extraction774065
-Ref: String Extraction-Footnote-1775026
-Node: Printf Ordering775112
-Ref: Printf Ordering-Footnote-1777894
-Node: I18N Portability777958
-Ref: I18N Portability-Footnote-1780407
-Node: I18N Example780470
-Ref: I18N Example-Footnote-1783108
-Node: Gawk I18N783180
-Node: Debugger783801
-Node: Debugging784772
-Node: Debugging Concepts785205
-Node: Debugging Terms787061
-Node: Awk Debugging789658
-Node: Sample Debugging Session790550
-Node: Debugger Invocation791070
-Node: Finding The Bug792403
-Node: List of Debugger Commands798890
-Node: Breakpoint Control800224
-Node: Debugger Execution Control803888
-Node: Viewing And Changing Data807248
-Node: Execution Stack810604
-Node: Debugger Info812071
-Node: Miscellaneous Debugger Commands816053
-Node: Readline Support821229
-Node: Limitations822060
-Node: Arbitrary Precision Arithmetic824312
-Ref: Arbitrary Precision Arithmetic-Footnote-1825961
-Node: General Arithmetic826109
-Node: Floating Point Issues827829
-Node: String Conversion Precision828710
-Ref: String Conversion Precision-Footnote-1830415
-Node: Unexpected Results830524
-Node: POSIX Floating Point Problems832677
-Ref: POSIX Floating Point Problems-Footnote-1836502
-Node: Integer Programming836540
-Node: Floating-point Programming838279
-Ref: Floating-point Programming-Footnote-1844610
-Ref: Floating-point Programming-Footnote-2844880
-Node: Floating-point Representation845144
-Node: Floating-point Context846309
-Ref: table-ieee-formats847148
-Node: Rounding Mode848532
-Ref: table-rounding-modes849011
-Ref: Rounding Mode-Footnote-1852026
-Node: Gawk and MPFR852205
-Node: Arbitrary Precision Floats853616
-Ref: Arbitrary Precision Floats-Footnote-1856059
-Node: Setting Precision856375
-Ref: table-predefined-precision-strings857061
-Node: Setting Rounding Mode859206
-Ref: table-gawk-rounding-modes859610
-Node: Floating-point Constants860797
-Node: Changing Precision862226
-Ref: Changing Precision-Footnote-1863623
-Node: Exact Arithmetic863797
-Node: Arbitrary Precision Integers866935
-Ref: Arbitrary Precision Integers-Footnote-1869950
-Node: Dynamic Extensions870097
-Node: Extension Intro871555
-Node: Plugin License872820
-Node: Extension Mechanism Outline873505
-Ref: load-extension873922
-Ref: load-new-function875400
-Ref: call-new-function876395
-Node: Extension API Description878410
-Node: Extension API Functions Introduction879697
-Node: General Data Types884624
-Ref: General Data Types-Footnote-1890319
-Node: Requesting Values890618
-Ref: table-value-types-returned891355
-Node: Memory Allocation Functions892309
-Ref: Memory Allocation Functions-Footnote-1895055
-Node: Constructor Functions895151
-Node: Registration Functions896909
-Node: Extension Functions897594
-Node: Exit Callback Functions899896
-Node: Extension Version String901145
-Node: Input Parsers901795
-Node: Output Wrappers911552
-Node: Two-way processors916062
-Node: Printing Messages918270
-Ref: Printing Messages-Footnote-1919347
-Node: Updating `ERRNO'919499
-Node: Accessing Parameters920238
-Node: Symbol Table Access921468
-Node: Symbol table by name921982
-Node: Symbol table by cookie923958
-Ref: Symbol table by cookie-Footnote-1928090
-Node: Cached values928153
-Ref: Cached values-Footnote-1931643
-Node: Array Manipulation931734
-Ref: Array Manipulation-Footnote-1932832
-Node: Array Data Types932871
-Ref: Array Data Types-Footnote-1935574
-Node: Array Functions935666
-Node: Flattening Arrays939502
-Node: Creating Arrays946354
-Node: Extension API Variables951079
-Node: Extension Versioning951715
-Node: Extension API Informational Variables953616
-Node: Extension API Boilerplate954702
-Node: Finding Extensions958506
-Node: Extension Example959066
-Node: Internal File Description959796
-Node: Internal File Ops963887
-Ref: Internal File Ops-Footnote-1975396
-Node: Using Internal File Ops975536
-Ref: Using Internal File Ops-Footnote-1977889
-Node: Extension Samples978155
-Node: Extension Sample File Functions979679
-Node: Extension Sample Fnmatch988164
-Node: Extension Sample Fork989933
-Node: Extension Sample Inplace991146
-Node: Extension Sample Ord992924
-Node: Extension Sample Readdir993760
-Node: Extension Sample Revout995292
-Node: Extension Sample Rev2way995885
-Node: Extension Sample Read write array996575
-Node: Extension Sample Readfile998458
-Node: Extension Sample API Tests999558
-Node: Extension Sample Time1000083
-Node: gawkextlib1001447
-Node: Language History1004228
-Node: V7/SVR3.11005821
-Node: SVR41008141
-Node: POSIX1009583
-Node: BTL1010969
-Node: POSIX/GNU1011703
-Node: Feature History1017302
-Node: Common Extensions1030278
-Node: Ranges and Locales1031590
-Ref: Ranges and Locales-Footnote-11036207
-Ref: Ranges and Locales-Footnote-21036234
-Ref: Ranges and Locales-Footnote-31036468
-Node: Contributors1036689
-Node: Installation1042070
-Node: Gawk Distribution1042964
-Node: Getting1043448
-Node: Extracting1044274
-Node: Distribution contents1045966
-Node: Unix Installation1051671
-Node: Quick Installation1052288
-Node: Additional Configuration Options1054734
-Node: Configuration Philosophy1056470
-Node: Non-Unix Installation1058824
-Node: PC Installation1059282
-Node: PC Binary Installation1060581
-Node: PC Compiling1062429
-Node: PC Testing1065373
-Node: PC Using1066549
-Node: Cygwin1070717
-Node: MSYS1071526
-Node: VMS Installation1072040
-Node: VMS Compilation1072804
-Ref: VMS Compilation-Footnote-11074056
-Node: VMS Dynamic Extensions1074114
-Node: VMS Installation Details1075487
-Node: VMS Running1077738
-Node: VMS GNV1080572
-Node: VMS Old Gawk1081295
-Node: Bugs1081765
-Node: Other Versions1085683
-Node: Notes1091767
-Node: Compatibility Mode1092567
-Node: Additions1093350
-Node: Accessing The Source1094277
-Node: Adding Code1095717
-Node: New Ports1101762
-Node: Derived Files1105897
-Ref: Derived Files-Footnote-11111218
-Ref: Derived Files-Footnote-21111252
-Ref: Derived Files-Footnote-31111852
-Node: Future Extensions1111950
-Node: Implementation Limitations1112533
-Node: Extension Design1113785
-Node: Old Extension Problems1114939
-Ref: Old Extension Problems-Footnote-11116447
-Node: Extension New Mechanism Goals1116504
-Ref: Extension New Mechanism Goals-Footnote-11119869
-Node: Extension Other Design Decisions1120055
-Node: Extension Future Growth1122161
-Node: Old Extension Mechanism1122997
-Node: Basic Concepts1124737
-Node: Basic High Level1125418
-Ref: figure-general-flow1125690
-Ref: figure-process-flow1126289
-Ref: Basic High Level-Footnote-11129518
-Node: Basic Data Typing1129703
-Node: Glossary1133058
-Node: Copying1158289
-Node: GNU Free Documentation License1195845
-Node: Index1220981
+Node: Top1204
+Node: Foreword41858
+Node: Preface46203
+Ref: Preface-Footnote-149350
+Ref: Preface-Footnote-249457
+Node: History49689
+Node: Names52063
+Ref: Names-Footnote-153527
+Node: This Manual53600
+Ref: This Manual-Footnote-159379
+Node: Conventions59479
+Node: Manual History61635
+Ref: Manual History-Footnote-165074
+Ref: Manual History-Footnote-265115
+Node: How To Contribute65189
+Node: Acknowledgments66428
+Node: Getting Started70724
+Node: Running gawk73158
+Node: One-shot74348
+Node: Read Terminal75573
+Ref: Read Terminal-Footnote-177223
+Ref: Read Terminal-Footnote-277499
+Node: Long77670
+Node: Executable Scripts79046
+Ref: Executable Scripts-Footnote-180879
+Ref: Executable Scripts-Footnote-280981
+Node: Comments81528
+Node: Quoting84001
+Node: DOS Quoting89317
+Node: Sample Data Files89992
+Node: Very Simple92507
+Node: Two Rules97145
+Node: More Complex99039
+Ref: More Complex-Footnote-1101971
+Node: Statements/Lines102056
+Ref: Statements/Lines-Footnote-1106512
+Node: Other Features106777
+Node: When107705
+Node: Intro Summary109875
+Node: Invoking Gawk110641
+Node: Command Line112156
+Node: Options112947
+Ref: Options-Footnote-1128647
+Node: Other Arguments128672
+Node: Naming Standard Input131334
+Node: Environment Variables132428
+Node: AWKPATH Variable132986
+Ref: AWKPATH Variable-Footnote-1135858
+Ref: AWKPATH Variable-Footnote-2135903
+Node: AWKLIBPATH Variable136163
+Node: Other Environment Variables136922
+Node: Exit Status140372
+Node: Include Files141047
+Node: Loading Shared Libraries144625
+Node: Obsolete146009
+Node: Undocumented146706
+Node: Invoking Summary146973
+Node: Regexp148553
+Node: Regexp Usage150003
+Node: Escape Sequences152036
+Node: Regexp Operators157703
+Ref: Regexp Operators-Footnote-1165183
+Ref: Regexp Operators-Footnote-2165330
+Node: Bracket Expressions165428
+Ref: table-char-classes167318
+Node: GNU Regexp Operators170258
+Node: Case-sensitivity173981
+Ref: Case-sensitivity-Footnote-1176873
+Ref: Case-sensitivity-Footnote-2177108
+Node: Leftmost Longest177216
+Node: Computed Regexps178417
+Node: Regexp Summary181789
+Node: Reading Files183260
+Node: Records185352
+Node: awk split records186095
+Node: gawk split records190953
+Ref: gawk split records-Footnote-1195474
+Node: Fields195511
+Ref: Fields-Footnote-1198475
+Node: Nonconstant Fields198561
+Ref: Nonconstant Fields-Footnote-1200791
+Node: Changing Fields200993
+Node: Field Separators206947
+Node: Default Field Splitting209649
+Node: Regexp Field Splitting210766
+Node: Single Character Fields214107
+Node: Command Line Field Separator215166
+Node: Full Line Fields218508
+Ref: Full Line Fields-Footnote-1219016
+Node: Field Splitting Summary219062
+Ref: Field Splitting Summary-Footnote-1222161
+Node: Constant Size222262
+Node: Splitting By Content226869
+Ref: Splitting By Content-Footnote-1230619
+Node: Multiple Line230659
+Ref: Multiple Line-Footnote-1236515
+Node: Getline236694
+Node: Plain Getline238910
+Node: Getline/Variable241005
+Node: Getline/File242152
+Node: Getline/Variable/File243536
+Ref: Getline/Variable/File-Footnote-1245135
+Node: Getline/Pipe245222
+Node: Getline/Variable/Pipe247921
+Node: Getline/Coprocess249028
+Node: Getline/Variable/Coprocess250280
+Node: Getline Notes251017
+Node: Getline Summary253821
+Ref: table-getline-variants254229
+Node: Read Timeout255141
+Ref: Read Timeout-Footnote-1258968
+Node: Command line directories259026
+Node: Input Summary259930
+Node: Input Exercises263067
+Node: Printing263800
+Node: Print265522
+Node: Print Examples266863
+Node: Output Separators269642
+Node: OFMT271658
+Node: Printf273016
+Node: Basic Printf273922
+Node: Control Letters275461
+Node: Format Modifiers279313
+Node: Printf Examples285340
+Node: Redirection287804
+Node: Special Files294776
+Node: Special FD295307
+Ref: Special FD-Footnote-1298931
+Node: Special Network299005
+Node: Special Caveats299855
+Node: Close Files And Pipes300651
+Ref: Close Files And Pipes-Footnote-1307812
+Ref: Close Files And Pipes-Footnote-2307960
+Node: Output Summary308110
+Node: Output exercises309107
+Node: Expressions309787
+Node: Values310972
+Node: Constants311648
+Node: Scalar Constants312328
+Ref: Scalar Constants-Footnote-1313187
+Node: Nondecimal-numbers313437
+Node: Regexp Constants316437
+Node: Using Constant Regexps316912
+Node: Variables319982
+Node: Using Variables320637
+Node: Assignment Options322361
+Node: Conversion324236
+Node: Strings And Numbers324760
+Ref: Strings And Numbers-Footnote-1327822
+Node: Locale influences conversions327931
+Ref: table-locale-affects330648
+Node: All Operators331236
+Node: Arithmetic Ops331866
+Node: Concatenation334371
+Ref: Concatenation-Footnote-1337167
+Node: Assignment Ops337287
+Ref: table-assign-ops342270
+Node: Increment Ops343587
+Node: Truth Values and Conditions347025
+Node: Truth Values348108
+Node: Typing and Comparison349157
+Node: Variable Typing349950
+Ref: Variable Typing-Footnote-1353850
+Node: Comparison Operators353972
+Ref: table-relational-ops354382
+Node: POSIX String Comparison357932
+Ref: POSIX String Comparison-Footnote-1359016
+Node: Boolean Ops359154
+Ref: Boolean Ops-Footnote-1363224
+Node: Conditional Exp363315
+Node: Function Calls365042
+Node: Precedence368922
+Node: Locales372591
+Node: Expressions Summary374222
+Node: Patterns and Actions376763
+Node: Pattern Overview377879
+Node: Regexp Patterns379556
+Node: Expression Patterns380099
+Node: Ranges383880
+Node: BEGIN/END386986
+Node: Using BEGIN/END387748
+Ref: Using BEGIN/END-Footnote-1390484
+Node: I/O And BEGIN/END390590
+Node: BEGINFILE/ENDFILE392875
+Node: Empty395806
+Node: Using Shell Variables396123
+Node: Action Overview398406
+Node: Statements400733
+Node: If Statement402581
+Node: While Statement404079
+Node: Do Statement406123
+Node: For Statement407279
+Node: Switch Statement410431
+Node: Break Statement412534
+Node: Continue Statement414589
+Node: Next Statement416382
+Node: Nextfile Statement418772
+Node: Exit Statement421427
+Node: Built-in Variables423831
+Node: User-modified424958
+Ref: User-modified-Footnote-1432647
+Node: Auto-set432709
+Ref: Auto-set-Footnote-1445628
+Ref: Auto-set-Footnote-2445833
+Node: ARGC and ARGV445889
+Node: Pattern Action Summary449743
+Node: Arrays451966
+Node: Array Basics453515
+Node: Array Intro454341
+Ref: figure-array-elements456314
+Node: Reference to Elements458721
+Node: Assigning Elements460994
+Node: Array Example461485
+Node: Scanning an Array463217
+Node: Controlling Scanning466232
+Ref: Controlling Scanning-Footnote-1471405
+Node: Delete471721
+Ref: Delete-Footnote-1474486
+Node: Numeric Array Subscripts474543
+Node: Uninitialized Subscripts476726
+Node: Multidimensional478351
+Node: Multiscanning481444
+Node: Arrays of Arrays483033
+Node: Arrays Summary487696
+Node: Functions489801
+Node: Built-in490674
+Node: Calling Built-in491752
+Node: Numeric Functions493740
+Ref: Numeric Functions-Footnote-1498318
+Ref: Numeric Functions-Footnote-2498675
+Ref: Numeric Functions-Footnote-3498723
+Node: String Functions498992
+Ref: String Functions-Footnote-1522003
+Ref: String Functions-Footnote-2522132
+Ref: String Functions-Footnote-3522380
+Node: Gory Details522467
+Ref: table-sub-escapes524136
+Ref: table-sub-posix-92525490
+Ref: table-sub-proposed526841
+Ref: table-posix-sub528195
+Ref: table-gensub-escapes529740
+Ref: Gory Details-Footnote-1530916
+Ref: Gory Details-Footnote-2530967
+Node: I/O Functions531118
+Ref: I/O Functions-Footnote-1538241
+Node: Time Functions538388
+Ref: Time Functions-Footnote-1548852
+Ref: Time Functions-Footnote-2548920
+Ref: Time Functions-Footnote-3549078
+Ref: Time Functions-Footnote-4549189
+Ref: Time Functions-Footnote-5549301
+Ref: Time Functions-Footnote-6549528
+Node: Bitwise Functions549794
+Ref: table-bitwise-ops550356
+Ref: Bitwise Functions-Footnote-1554601
+Node: Type Functions554785
+Node: I18N Functions555927
+Node: User-defined557572
+Node: Definition Syntax558376
+Ref: Definition Syntax-Footnote-1563555
+Node: Function Example563624
+Ref: Function Example-Footnote-1566268
+Node: Function Caveats566290
+Node: Calling A Function566808
+Node: Variable Scope567763
+Node: Pass By Value/Reference570751
+Node: Return Statement574259
+Node: Dynamic Typing577243
+Node: Indirect Calls578172
+Node: Functions Summary587885
+Node: Library Functions590424
+Ref: Library Functions-Footnote-1594042
+Ref: Library Functions-Footnote-2594185
+Node: Library Names594356
+Ref: Library Names-Footnote-1597829
+Ref: Library Names-Footnote-2598049
+Node: General Functions598135
+Node: Strtonum Function599163
+Node: Assert Function601943
+Node: Round Function605269
+Node: Cliff Random Function606810
+Node: Ordinal Functions607826
+Ref: Ordinal Functions-Footnote-1610903
+Ref: Ordinal Functions-Footnote-2611155
+Node: Join Function611366
+Ref: Join Function-Footnote-1613137
+Node: Getlocaltime Function613337
+Node: Readfile Function617073
+Node: Data File Management618912
+Node: Filetrans Function619544
+Node: Rewind Function623613
+Node: File Checking625000
+Ref: File Checking-Footnote-1626132
+Node: Empty Files626333
+Node: Ignoring Assigns628312
+Node: Getopt Function629866
+Ref: Getopt Function-Footnote-1641169
+Node: Passwd Functions641372
+Ref: Passwd Functions-Footnote-1650351
+Node: Group Functions650439
+Ref: Group Functions-Footnote-1658380
+Node: Walking Arrays658593
+Node: Library Functions Summary660196
+Node: Library exercises661584
+Node: Sample Programs662864
+Node: Running Examples663634
+Node: Clones664362
+Node: Cut Program665586
+Node: Egrep Program675454
+Ref: Egrep Program-Footnote-1683425
+Node: Id Program683535
+Node: Split Program687199
+Ref: Split Program-Footnote-1690737
+Node: Tee Program690865
+Node: Uniq Program693672
+Node: Wc Program701102
+Ref: Wc Program-Footnote-1705367
+Node: Miscellaneous Programs705459
+Node: Dupword Program706672
+Node: Alarm Program708703
+Node: Translate Program713517
+Ref: Translate Program-Footnote-1717908
+Ref: Translate Program-Footnote-2718178
+Node: Labels Program718312
+Ref: Labels Program-Footnote-1721683
+Node: Word Sorting721767
+Node: History Sorting725810
+Node: Extract Program727646
+Node: Simple Sed735182
+Node: Igawk Program738244
+Ref: Igawk Program-Footnote-1752555
+Ref: Igawk Program-Footnote-2752756
+Node: Anagram Program752894
+Node: Signature Program755962
+Node: Programs Summary757209
+Node: Programs Exercises758424
+Node: Advanced Features762075
+Node: Nondecimal Data764023
+Node: Array Sorting765600
+Node: Controlling Array Traversal766297
+Node: Array Sorting Functions774577
+Ref: Array Sorting Functions-Footnote-1778484
+Node: Two-way I/O778678
+Ref: Two-way I/O-Footnote-1784194
+Node: TCP/IP Networking784276
+Node: Profiling787120
+Node: Advanced Features Summary794671
+Node: Internationalization796535
+Node: I18N and L10N798015
+Node: Explaining gettext798701
+Ref: Explaining gettext-Footnote-1803841
+Ref: Explaining gettext-Footnote-2804025
+Node: Programmer i18n804190
+Node: Translator i18n808415
+Node: String Extraction809209
+Ref: String Extraction-Footnote-1810170
+Node: Printf Ordering810256
+Ref: Printf Ordering-Footnote-1813038
+Node: I18N Portability813102
+Ref: I18N Portability-Footnote-1815551
+Node: I18N Example815614
+Ref: I18N Example-Footnote-1818336
+Node: Gawk I18N818408
+Node: I18N Summary819046
+Node: Debugger820385
+Node: Debugging821407
+Node: Debugging Concepts821848
+Node: Debugging Terms823704
+Node: Awk Debugging826301
+Node: Sample Debugging Session827193
+Node: Debugger Invocation827713
+Node: Finding The Bug829046
+Node: List of Debugger Commands835528
+Node: Breakpoint Control836860
+Node: Debugger Execution Control840524
+Node: Viewing And Changing Data843884
+Node: Execution Stack847242
+Node: Debugger Info848755
+Node: Miscellaneous Debugger Commands852749
+Node: Readline Support857933
+Node: Limitations858825
+Node: Debugging Summary861099
+Node: Arbitrary Precision Arithmetic862263
+Node: Computer Arithmetic863592
+Ref: Computer Arithmetic-Footnote-1867979
+Node: Math Definitions868036
+Ref: table-ieee-formats870920
+Node: MPFR features871424
+Node: FP Math Caution873066
+Ref: FP Math Caution-Footnote-1874107
+Node: Inexactness of computations874476
+Node: Inexact representation875424
+Node: Comparing FP Values876779
+Node: Errors accumulate877743
+Node: Getting Accuracy879176
+Node: Try To Round881835
+Node: Setting precision882734
+Ref: table-predefined-precision-strings883416
+Node: Setting the rounding mode885209
+Ref: table-gawk-rounding-modes885573
+Ref: Setting the rounding mode-Footnote-1889027
+Node: Arbitrary Precision Integers889206
+Ref: Arbitrary Precision Integers-Footnote-1893001
+Node: POSIX Floating Point Problems893150
+Ref: POSIX Floating Point Problems-Footnote-1897026
+Node: Floating point summary897064
+Node: Dynamic Extensions899281
+Node: Extension Intro900833
+Node: Plugin License902098
+Node: Extension Mechanism Outline902783
+Ref: figure-load-extension903207
+Ref: figure-load-new-function904692
+Ref: figure-call-new-function905694
+Node: Extension API Description907678
+Node: Extension API Functions Introduction909128
+Node: General Data Types913993
+Ref: General Data Types-Footnote-1919686
+Node: Requesting Values919985
+Ref: table-value-types-returned920722
+Node: Memory Allocation Functions921680
+Ref: Memory Allocation Functions-Footnote-1924427
+Node: Constructor Functions924523
+Node: Registration Functions926281
+Node: Extension Functions926966
+Node: Exit Callback Functions929268
+Node: Extension Version String930517
+Node: Input Parsers931167
+Node: Output Wrappers940970
+Node: Two-way processors945486
+Node: Printing Messages947690
+Ref: Printing Messages-Footnote-1948767
+Node: Updating `ERRNO'948919
+Node: Accessing Parameters949658
+Node: Symbol Table Access950888
+Node: Symbol table by name951402
+Node: Symbol table by cookie953378
+Ref: Symbol table by cookie-Footnote-1957511
+Node: Cached values957574
+Ref: Cached values-Footnote-1961078
+Node: Array Manipulation961169
+Ref: Array Manipulation-Footnote-1962267
+Node: Array Data Types962306
+Ref: Array Data Types-Footnote-1965009
+Node: Array Functions965101
+Node: Flattening Arrays968975
+Node: Creating Arrays975827
+Node: Extension API Variables980558
+Node: Extension Versioning981194
+Node: Extension API Informational Variables983095
+Node: Extension API Boilerplate984181
+Node: Finding Extensions987985
+Node: Extension Example988545
+Node: Internal File Description989275
+Node: Internal File Ops993366
+Ref: Internal File Ops-Footnote-11004798
+Node: Using Internal File Ops1004938
+Ref: Using Internal File Ops-Footnote-11007285
+Node: Extension Samples1007553
+Node: Extension Sample File Functions1009077
+Node: Extension Sample Fnmatch1016645
+Node: Extension Sample Fork1018127
+Node: Extension Sample Inplace1019340
+Node: Extension Sample Ord1021015
+Node: Extension Sample Readdir1021851
+Ref: table-readdir-file-types1022707
+Node: Extension Sample Revout1023506
+Node: Extension Sample Rev2way1024097
+Node: Extension Sample Read write array1024838
+Node: Extension Sample Readfile1026717
+Node: Extension Sample API Tests1027817
+Node: Extension Sample Time1028342
+Node: gawkextlib1029657
+Node: Extension summary1032470
+Node: Extension Exercises1036163
+Node: Language History1036885
+Node: V7/SVR3.11038528
+Node: SVR41040848
+Node: POSIX1042290
+Node: BTL1043676
+Node: POSIX/GNU1044410
+Node: Feature History1050009
+Node: Common Extensions1063139
+Node: Ranges and Locales1064451
+Ref: Ranges and Locales-Footnote-11069068
+Ref: Ranges and Locales-Footnote-21069095
+Ref: Ranges and Locales-Footnote-31069329
+Node: Contributors1069550
+Node: History summary1074975
+Node: Installation1076344
+Node: Gawk Distribution1077295
+Node: Getting1077779
+Node: Extracting1078603
+Node: Distribution contents1080245
+Node: Unix Installation1086015
+Node: Quick Installation1086632
+Node: Additional Configuration Options1089074
+Node: Configuration Philosophy1090812
+Node: Non-Unix Installation1093163
+Node: PC Installation1093621
+Node: PC Binary Installation1094932
+Node: PC Compiling1096780
+Ref: PC Compiling-Footnote-11099779
+Node: PC Testing1099884
+Node: PC Using1101060
+Node: Cygwin1105218
+Node: MSYS1106027
+Node: VMS Installation1106541
+Node: VMS Compilation1107337
+Ref: VMS Compilation-Footnote-11108559
+Node: VMS Dynamic Extensions1108617
+Node: VMS Installation Details1109990
+Node: VMS Running1112242
+Node: VMS GNV1115076
+Node: VMS Old Gawk1115799
+Node: Bugs1116269
+Node: Other Versions1120273
+Node: Installation summary1126528
+Node: Notes1127584
+Node: Compatibility Mode1128449
+Node: Additions1129231
+Node: Accessing The Source1130156
+Node: Adding Code1131592
+Node: New Ports1137770
+Node: Derived Files1142251
+Ref: Derived Files-Footnote-11147332
+Ref: Derived Files-Footnote-21147366
+Ref: Derived Files-Footnote-31147962
+Node: Future Extensions1148076
+Node: Implementation Limitations1148682
+Node: Extension Design1149930
+Node: Old Extension Problems1151084
+Ref: Old Extension Problems-Footnote-11152601
+Node: Extension New Mechanism Goals1152658
+Ref: Extension New Mechanism Goals-Footnote-11156018
+Node: Extension Other Design Decisions1156207
+Node: Extension Future Growth1158313
+Node: Old Extension Mechanism1159149
+Node: Notes summary1160911
+Node: Basic Concepts1162097
+Node: Basic High Level1162778
+Ref: figure-general-flow1163050
+Ref: figure-process-flow1163649
+Ref: Basic High Level-Footnote-11166878
+Node: Basic Data Typing1167063
+Node: Glossary1170391
+Node: Copying1195543
+Node: GNU Free Documentation License1233099
+Node: Index1258235

End Tag Table