aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--doc/ChangeLog5
-rw-r--r--doc/gawk.info4047
-rw-r--r--doc/gawk.texi3961
3 files changed, 4038 insertions, 3975 deletions
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 61cce0f9..4616137b 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,8 @@
+2012-11-06 Arnold D. Robbins <arnold@skeeve.com>
+
+ * gawk.texi: Rearrange chapter order and separate into parts
+ using @part for TeX.
+
2012-11-05 Arnold D. Robbins <arnold@skeeve.com>
* gawk.texi: Semi-rationalize invocations of @image.
diff --git a/doc/gawk.info b/doc/gawk.info
index f093d4fe..cc0c2598 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -87,13 +87,13 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Arrays:: The description and use of arrays. Also
includes array-oriented control statements.
* Functions:: Built-in and user-defined functions.
+* Library Functions:: A Library of `awk' Functions.
+* Sample Programs:: Many `awk' programs with complete
+ explanations.
* Internationalization:: Getting `gawk' to speak your
language.
* Advanced Features:: Stuff for advanced users, specific to
`gawk'.
-* Library Functions:: A Library of `awk' Functions.
-* Sample Programs:: Many `awk' programs with complete
- explanations.
* Debugger:: The `gawk' debugger.
* Arbitrary Precision Arithmetic:: Arbitrary precision arithmetic with
`gawk'.
@@ -385,28 +385,6 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
runtime.
* Indirect Calls:: Choosing the function to call at
runtime.
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU `gettext' works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging `printf' arguments.
-* I18N Portability:: `awk'-level portability
- issues.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: `gawk' is also
- internationalized.
-* Nondecimal Data:: Allowing nondecimal input data.
-* Array Sorting:: Facilities for controlling array
- traversal and sorting arrays.
-* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
-* Array Sorting Functions:: How to use `asort()' and
- `asorti()'.
-* Two-way I/O:: Two-way communications with another
- process.
-* TCP/IP Networking:: Using `gawk' for network
- programming.
-* Profiling:: Profiling your `awk' programs.
* Library Names:: How to best name private global
variables in library functions.
* General Functions:: Functions that are of general use.
@@ -468,6 +446,28 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Anagram Program:: Finding anagrams from a dictionary.
* Signature Program:: People do amazing things with too much
time on their hands.
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU `gettext' works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging `printf' arguments.
+* I18N Portability:: `awk'-level portability
+ issues.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: `gawk' is also
+ internationalized.
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array
+ traversal and sorting arrays.
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use `asort()' and
+ `asorti()'.
+* Two-way I/O:: Two-way communications with another
+ process.
+* TCP/IP Networking:: Using `gawk' for network
+ programming.
+* Profiling:: Profiling your `awk' programs.
* Debugging:: Introduction to `gawk'
debugger.
* Debugging Concepts:: Debugging in General.
@@ -941,6 +941,12 @@ expert should find useful. In particular, the description of POSIX
`awk' and the example programs in *note Library Functions::, and in
*note Sample Programs::, should be of interest.
+ This Info file is split into several parts, as follows:
+
+ Part I describes the `awk' language and `gawk' program in detail.
+It starts with the basics, and continues through all of the features of
+`awk'. It contains the following chapters:
+
*note Getting Started::, provides the essentials you need to know to
begin using `awk'.
@@ -973,6 +979,21 @@ described, as well as sorting arrays in `gawk'. It also describes how
*note Functions::, describes the built-in functions `awk' and `gawk'
provide, as well as how to define your own functions.
+ Part II shows how to use `awk' and `gawk' for problem solving.
+There is lots of code here for you to read and learn from. It contains
+the following chapters:
+
+ *note Library Functions::, which provides a number of functions
+meant to be used from main `awk' programs.
+
+ *note Sample Programs::, which provides many sample `awk' programs.
+
+ Reading these two chapters allows you to see `awk' solving real
+problems.
+
+ Part III focuses on features specific to `gawk'. It contains the
+following chapters:
+
*note Internationalization::, describes special features in `gawk'
for translating program messages into different languages at runtime.
@@ -981,10 +1002,6 @@ advanced features. Of particular note are the abilities to have
two-way communications with another process, perform TCP/IP networking,
and profile your `awk' programs.
- *note Library Functions::, and *note Sample Programs::, provide many
-sample `awk' programs. Reading them allows you to see `awk' solving
-real problems.
-
*note Debugger::, describes the `awk' debugger.
*note Arbitrary Precision Arithmetic::, describes advanced
@@ -993,6 +1010,10 @@ arithmetic facilities provided by `gawk'.
*note Dynamic Extensions::, describes how to add new variables and
functions to `gawk' by writing extensions in C.
+ Part IV provides the appendices, the Glossary, and two licenses that
+cover the `gawk' source code and this Info file, respectively. It
+contains the following appendices:
+
*note Language History::, describes how the `awk' language has
evolved since its first release to present. It also describes how
`gawk' has acquired features over time.
@@ -10928,7 +10949,7 @@ by creating an arbitrary index:
-| a

-File: gawk.info, Node: Functions, Next: Internationalization, Prev: Arrays, Up: Top
+File: gawk.info, Node: Functions, Next: Library Functions, Prev: Arrays, Up: Top
9 Functions
***********
@@ -13374,1439 +13395,9 @@ example, in the following case:
`gawk' will look up the actual function to call only once.

-File: gawk.info, Node: Internationalization, Next: Advanced Features, Prev: Functions, Up: Top
-
-10 Internationalization with `gawk'
-***********************************
-
-Once upon a time, computer makers wrote software that worked only in
-English. Eventually, hardware and software vendors noticed that if
-their systems worked in the native languages of non-English-speaking
-countries, they were able to sell more systems. As a result,
-internationalization and localization of programs and software systems
-became a common practice.
-
- For many years, the ability to provide internationalization was
-largely restricted to programs written in C and C++. This major node
-describes the underlying library `gawk' uses for internationalization,
-as well as how `gawk' makes internationalization features available at
-the `awk' program level. Having internationalization available at the
-`awk' level gives software developers additional flexibility--they are
-no longer forced to write in C or C++ when internationalization is a
-requirement.
-
-* Menu:
-
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU `gettext' works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: `gawk' is also internationalized.
-
-
-File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internationalization
-
-10.1 Internationalization and Localization
-==========================================
-
-"Internationalization" means writing (or modifying) a program once, in
-such a way that it can use multiple languages without requiring further
-source-code changes. "Localization" means providing the data necessary
-for an internationalized program to work in a particular language.
-Most typically, these terms refer to features such as the language used
-for printing error messages, the language used to read responses, and
-information related to how numerical and monetary values are printed
-and read.
-
-
-File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N and L10N, Up: Internationalization
-
-10.2 GNU `gettext'
-==================
-
-The facilities in GNU `gettext' focus on messages; strings printed by a
-program, either directly or via formatting with `printf' or
-`sprintf()'.(1)
-
- When using GNU `gettext', each application has its own "text
-domain". This is a unique name, such as `kpilot' or `gawk', that
-identifies the application. A complete application may have multiple
-components--programs written in C or C++, as well as scripts written in
-`sh' or `awk'. All of the components use the same text domain.
-
- To make the discussion concrete, assume we're writing an application
-named `guide'. Internationalization consists of the following steps,
-in this order:
-
- 1. The programmer goes through the source for all of `guide''s
- components and marks each string that is a candidate for
- translation. For example, `"`-F': option required"' is a good
- candidate for translation. A table with strings of option names
- is not (e.g., `gawk''s `--profile' option should remain the same,
- no matter what the local language).
-
- 2. The programmer indicates the application's text domain (`"guide"')
- to the `gettext' library, by calling the `textdomain()' function.
-
- 3. Messages from the application are extracted from the source code
- and collected into a portable object template file (`guide.pot'),
- which lists the strings and their translations. The translations
- are initially empty. The original (usually English) messages
- serve as the key for lookup of the translations.
-
- 4. For each language with a translator, `guide.pot' is copied to a
- portable object file (`.po') and translations are created and
- shipped with the application. For example, there might be a
- `fr.po' for a French translation.
-
- 5. Each language's `.po' file is converted into a binary message
- object (`.mo') file. A message object file contains the original
- messages and their translations in a binary format that allows
- fast lookup of translations at runtime.
-
- 6. When `guide' is built and installed, the binary translation files
- are installed in a standard place.
-
- 7. For testing and development, it is possible to tell `gettext' to
- use `.mo' files in a different directory than the standard one by
- using the `bindtextdomain()' function.
-
- 8. At runtime, `guide' looks up each string via a call to
- `gettext()'. The returned string is the translated string if
- available, or the original string if not.
-
- 9. If necessary, it is possible to access messages from a different
- text domain than the one belonging to the application, without
- having to switch the application's default text domain back and
- forth.
-
- In C (or C++), the string marking and dynamic translation lookup are
-accomplished by wrapping each string in a call to `gettext()':
-
- printf("%s", gettext("Don't Panic!\n"));
-
- The tools that extract messages from source code pull out all
-strings enclosed in calls to `gettext()'.
-
- The GNU `gettext' developers, recognizing that typing `gettext(...)'
-over and over again is both painful and ugly to look at, use the macro
-`_' (an underscore) to make things easier:
-
- /* In the standard header file: */
- #define _(str) gettext(str)
-
- /* In the program text: */
- printf("%s", _("Don't Panic!\n"));
-
-This reduces the typing overhead to just three extra characters per
-string and is considerably easier to read as well.
-
- There are locale "categories" for different types of locale-related
-information. The defined locale categories that `gettext' knows about
-are:
-
-`LC_MESSAGES'
- Text messages. This is the default category for `gettext'
- operations, but it is possible to supply a different one
- explicitly, if necessary. (It is almost never necessary to supply
- a different category.)
-
-`LC_COLLATE'
- Text-collation information; i.e., how different characters and/or
- groups of characters sort in a given language.
-
-`LC_CTYPE'
- Character-type information (alphabetic, digit, upper- or
- lowercase, and so on). This information is accessed via the POSIX
- character classes in regular expressions, such as `/[[:alnum:]]/'
- (*note Regexp Operators::).
-
-`LC_MONETARY'
- Monetary information, such as the currency symbol, and whether the
- symbol goes before or after a number.
-
-`LC_NUMERIC'
- Numeric information, such as which characters to use for the
- decimal point and the thousands separator.(2)
-
-`LC_RESPONSE'
- Response information, such as how "yes" and "no" appear in the
- local language, and possibly other information as well.
-
-`LC_TIME'
- Time- and date-related information, such as 12- or 24-hour clock,
- month printed before or after the day in a date, local month
- abbreviations, and so on.
-
-`LC_ALL'
- All of the above. (Not too useful in the context of `gettext'.)
-
- ---------- Footnotes ----------
-
- (1) For some operating systems, the `gawk' port doesn't support GNU
-`gettext'. Therefore, these features are not available if you are
-using one of those operating systems. Sorry.
-
- (2) Americans use a comma every three decimal places and a period
-for the decimal point, while many Europeans do exactly the opposite:
-1,234.56 versus 1.234,56.
-
-
-File: gawk.info, Node: Programmer i18n, Next: Translator i18n, Prev: Explaining gettext, Up: Internationalization
-
-10.3 Internationalizing `awk' Programs
-======================================
-
-`gawk' provides the following variables and functions for
-internationalization:
-
-`TEXTDOMAIN'
- This variable indicates the application's text domain. For
- compatibility with GNU `gettext', the default value is
- `"messages"'.
-
-`_"your message here"'
- String constants marked with a leading underscore are candidates
- for translation at runtime. String constants without a leading
- underscore are not translated.
-
-`dcgettext(STRING [, DOMAIN [, CATEGORY]])'
- Return the translation of STRING in text domain DOMAIN for locale
- category CATEGORY. The default value for DOMAIN is the current
- value of `TEXTDOMAIN'. The default value for CATEGORY is
- `"LC_MESSAGES"'.
-
- If you supply a value for CATEGORY, it must be a string equal to
- one of the known locale categories described in *note Explaining
- gettext::. You must also supply a text domain. Use `TEXTDOMAIN'
- if you want to use the current domain.
-
- CAUTION: The order of arguments to the `awk' version of the
- `dcgettext()' function is purposely different from the order
- for the C version. The `awk' version's order was chosen to
- be simple and to allow for reasonable `awk'-style default
- arguments.
-
-`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])'
- Return the plural form used for NUMBER of the translation of
- STRING1 and STRING2 in text domain DOMAIN for locale category
- CATEGORY. STRING1 is the English singular variant of a message,
- and STRING2 the English plural variant of the same message. The
- default value for DOMAIN is the current value of `TEXTDOMAIN'.
- The default value for CATEGORY is `"LC_MESSAGES"'.
-
- The same remarks about argument order as for the `dcgettext()'
- function apply.
-
-`bindtextdomain(DIRECTORY [, DOMAIN])'
- Change the directory in which `gettext' looks for `.mo' files, in
- case they will not or cannot be placed in the standard locations
- (e.g., during testing). Return the directory in which DOMAIN is
- "bound."
-
- The default DOMAIN is the value of `TEXTDOMAIN'. If DIRECTORY is
- the null string (`""'), then `bindtextdomain()' returns the
- current binding for the given DOMAIN.
-
- To use these facilities in your `awk' program, follow the steps
-outlined in *note Explaining gettext::, like so:
-
- 1. Set the variable `TEXTDOMAIN' to the text domain of your program.
- This is best done in a `BEGIN' rule (*note BEGIN/END::), or it can
- also be done via the `-v' command-line option (*note Options::):
-
- BEGIN {
- TEXTDOMAIN = "guide"
- ...
- }
-
- 2. Mark all translatable strings with a leading underscore (`_')
- character. It _must_ be adjacent to the opening quote of the
- string. For example:
-
- print _"hello, world"
- x = _"you goofed"
- printf(_"Number of users is %d\n", nusers)
-
- 3. If you are creating strings dynamically, you can still translate
- them, using the `dcgettext()' built-in function:
-
- message = nusers " users logged in"
- message = dcgettext(message, "adminprog")
- print message
-
- Here, the call to `dcgettext()' supplies a different text domain
- (`"adminprog"') in which to find the message, but it uses the
- default `"LC_MESSAGES"' category.
-
- 4. During development, you might want to put the `.mo' file in a
- private directory for testing. This is done with the
- `bindtextdomain()' built-in function:
-
- BEGIN {
- TEXTDOMAIN = "guide" # our text domain
- if (Testing) {
- # where to find our files
- bindtextdomain("testdir")
- # joe is in charge of adminprog
- bindtextdomain("../joe/testdir", "adminprog")
- }
- ...
- }
-
-
- *Note I18N Example::, for an example program showing the steps to
-create and use translations from `awk'.
-
-
-File: gawk.info, Node: Translator i18n, Next: I18N Example, Prev: Programmer i18n, Up: Internationalization
-
-10.4 Translating `awk' Programs
-===============================
-
-Once a program's translatable strings have been marked, they must be
-extracted to create the initial `.po' file. As part of translation, it
-is often helpful to rearrange the order in which arguments to `printf'
-are output.
-
- `gawk''s `--gen-pot' command-line option extracts the messages and
-is discussed next. After that, `printf''s ability to rearrange the
-order for `printf' arguments at runtime is covered.
-
-* Menu:
-
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging `printf' arguments.
-* I18N Portability:: `awk'-level portability issues.
-
-
-File: gawk.info, Node: String Extraction, Next: Printf Ordering, Up: Translator i18n
-
-10.4.1 Extracting Marked Strings
---------------------------------
-
-Once your `awk' program is working, and all the strings have been
-marked and you've set (and perhaps bound) the text domain, it is time
-to produce translations. First, use the `--gen-pot' command-line
-option to create the initial `.pot' file:
-
- $ gawk --gen-pot -f guide.awk > guide.pot
-
- When run with `--gen-pot', `gawk' does not execute your program.
-Instead, it parses it as usual and prints all marked strings to
-standard output in the format of a GNU `gettext' Portable Object file.
-Also included in the output are any constant strings that appear as the
-first argument to `dcgettext()' or as the first and second argument to
-`dcngettext()'.(1) *Note I18N Example::, for the full list of steps to
-go through to create and test translations for `guide'.
-
- ---------- Footnotes ----------
-
- (1) The `xgettext' utility that comes with GNU `gettext' can handle
-`.awk' files.
-
-
-File: gawk.info, Node: Printf Ordering, Next: I18N Portability, Prev: String Extraction, Up: Translator i18n
-
-10.4.2 Rearranging `printf' Arguments
--------------------------------------
-
-Format strings for `printf' and `sprintf()' (*note Printf::) present a
-special problem for translation. Consider the following:(1)
-
- printf(_"String `%s' has %d characters\n",
- string, length(string)))
-
- A possible German translation for this might be:
-
- "%d Zeichen lang ist die Zeichenkette `%s'\n"
-
- The problem should be obvious: the order of the format
-specifications is different from the original! Even though `gettext()'
-can return the translated string at runtime, it cannot change the
-argument order in the call to `printf'.
-
- To solve this problem, `printf' format specifiers may have an
-additional optional element, which we call a "positional specifier".
-For example:
-
- "%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
-
- Here, the positional specifier consists of an integer count, which
-indicates which argument to use, and a `$'. Counts are one-based, and
-the format string itself is _not_ included. Thus, in the following
-example, `string' is the first argument and `length(string)' is the
-second:
-
- $ gawk 'BEGIN {
- > string = "Dont Panic"
- > printf _"%2$d characters live in \"%1$s\"\n",
- > string, length(string)
- > }'
- -| 10 characters live in "Dont Panic"
-
- If present, positional specifiers come first in the format
-specification, before the flags, the field width, and/or the precision.
-
- Positional specifiers can be used with the dynamic field width and
-precision capability:
-
- $ gawk 'BEGIN {
- > printf("%*.*s\n", 10, 20, "hello")
- > printf("%3$*2$.*1$s\n", 20, 10, "hello")
- > }'
- -| hello
- -| hello
-
- NOTE: When using `*' with a positional specifier, the `*' comes
- first, then the integer position, and then the `$'. This is
- somewhat counterintuitive.
-
- `gawk' does not allow you to mix regular format specifiers and those
-with positional specifiers in the same string:
-
- $ gawk 'BEGIN { printf _"%d %3$s\n", 1, 2, "hi" }'
- error--> gawk: cmd. line:1: fatal: must use `count$' on all formats or none
-
- NOTE: There are some pathological cases that `gawk' may fail to
- diagnose. In such cases, the output may not be what you expect.
- It's still a bad idea to try mixing them, even if `gawk' doesn't
- detect it.
-
- Although positional specifiers can be used directly in `awk'
-programs, their primary purpose is to help in producing correct
-translations of format strings into languages different from the one in
-which the program is first written.
-
- ---------- Footnotes ----------
-
- (1) This example is borrowed from the GNU `gettext' manual.
-
-
-File: gawk.info, Node: I18N Portability, Prev: Printf Ordering, Up: Translator i18n
-
-10.4.3 `awk' Portability Issues
--------------------------------
-
-`gawk''s internationalization features were purposely chosen to have as
-little impact as possible on the portability of `awk' programs that use
-them to other versions of `awk'. Consider this program:
-
- BEGIN {
- TEXTDOMAIN = "guide"
- if (Test_Guide) # set with -v
- bindtextdomain("/test/guide/messages")
- print _"don't panic!"
- }
-
-As written, it won't work on other versions of `awk'. However, it is
-actually almost portable, requiring very little change:
-
- * Assignments to `TEXTDOMAIN' won't have any effect, since
- `TEXTDOMAIN' is not special in other `awk' implementations.
-
- * Non-GNU versions of `awk' treat marked strings as the
- concatenation of a variable named `_' with the string following
- it.(1) Typically, the variable `_' has the null string (`""') as
- its value, leaving the original string constant as the result.
-
- * By defining "dummy" functions to replace `dcgettext()',
- `dcngettext()' and `bindtextdomain()', the `awk' program can be
- made to run, but all the messages are output in the original
- language. For example:
-
- function bindtextdomain(dir, domain)
- {
- return dir
- }
-
- function dcgettext(string, domain, category)
- {
- return string
- }
-
- function dcngettext(string1, string2, number, domain, category)
- {
- return (number == 1 ? string1 : string2)
- }
-
- * The use of positional specifications in `printf' or `sprintf()' is
- _not_ portable. To support `gettext()' at the C level, many
- systems' C versions of `sprintf()' do support positional
- specifiers. But it works only if enough arguments are supplied in
- the function call. Many versions of `awk' pass `printf' formats
- and arguments unchanged to the underlying C library version of
- `sprintf()', but only one format and argument at a time. What
- happens if a positional specification is used is anybody's guess.
- However, since the positional specifications are primarily for use
- in _translated_ format strings, and since non-GNU `awk's never
- retrieve the translated string, this should not be a problem in
- practice.
-
- ---------- Footnotes ----------
-
- (1) This is good fodder for an "Obfuscated `awk'" contest.
-
-
-File: gawk.info, Node: I18N Example, Next: Gawk I18N, Prev: Translator i18n, Up: Internationalization
-
-10.5 A Simple Internationalization Example
-==========================================
-
-Now let's look at a step-by-step example of how to internationalize and
-localize a simple `awk' program, using `guide.awk' as our original
-source:
-
- BEGIN {
- TEXTDOMAIN = "guide"
- bindtextdomain(".") # for testing
- print _"Don't Panic"
- print _"The Answer Is", 42
- print "Pardon me, Zaphod who?"
- }
-
-Run `gawk --gen-pot' to create the `.pot' file:
-
- $ gawk --gen-pot -f guide.awk > guide.pot
-
-This produces:
-
- #: guide.awk:4
- msgid "Don't Panic"
- msgstr ""
-
- #: guide.awk:5
- msgid "The Answer Is"
- msgstr ""
-
- This original portable object template file is saved and reused for
-each language into which the application is translated. The `msgid' is
-the original string and the `msgstr' is the translation.
-
- NOTE: Strings not marked with a leading underscore do not appear
- in the `guide.pot' file.
-
- Next, the messages must be translated. Here is a translation to a
-hypothetical dialect of English, called "Mellow":(1)
-
- $ cp guide.pot guide-mellow.po
- ADD TRANSLATIONS TO guide-mellow.po ...
-
-Following are the translations:
-
- #: guide.awk:4
- msgid "Don't Panic"
- msgstr "Hey man, relax!"
-
- #: guide.awk:5
- msgid "The Answer Is"
- msgstr "Like, the scoop is"
-
- The next step is to make the directory to hold the binary message
-object file and then to create the `guide.mo' file. The directory
-layout shown here is standard for GNU `gettext' on GNU/Linux systems.
-Other versions of `gettext' may use a different layout:
-
- $ mkdir en_US en_US/LC_MESSAGES
-
- The `msgfmt' utility does the conversion from human-readable `.po'
-file to machine-readable `.mo' file. By default, `msgfmt' creates a
-file named `messages'. This file must be renamed and placed in the
-proper directory so that `gawk' can find it:
-
- $ msgfmt guide-mellow.po
- $ mv messages en_US/LC_MESSAGES/guide.mo
-
- Finally, we run the program to test it:
-
- $ gawk -f guide.awk
- -| Hey man, relax!
- -| Like, the scoop is 42
- -| Pardon me, Zaphod who?
-
- If the three replacement functions for `dcgettext()', `dcngettext()'
-and `bindtextdomain()' (*note I18N Portability::) are in a file named
-`libintl.awk', then we can run `guide.awk' unchanged as follows:
-
- $ gawk --posix -f guide.awk -f libintl.awk
- -| Don't Panic
- -| The Answer Is 42
- -| Pardon me, Zaphod who?
-
- ---------- Footnotes ----------
-
- (1) Perhaps it would be better if it were called "Hippy." Ah, well.
-
-
-File: gawk.info, Node: Gawk I18N, Prev: I18N Example, Up: Internationalization
-
-10.6 `gawk' Can Speak Your Language
-===================================
-
-`gawk' itself has been internationalized using the GNU `gettext'
-package. (GNU `gettext' is described in complete detail in *note (GNU
-`gettext' utilities)Top:: gettext, GNU gettext tools.) As of this
-writing, the latest version of GNU `gettext' is version 0.18.1
-(ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.1.tar.gz).
-
- If a translation of `gawk''s messages exists, then `gawk' produces
-usage messages, warnings, and fatal errors in the local language.
-
-
-File: gawk.info, Node: Advanced Features, Next: Library Functions, Prev: Internationalization, Up: Top
-
-11 Advanced Features of `gawk'
-******************************
-
- Write documentation as if whoever reads it is a violent psychopath
- who knows where you live.
- Steve English, as quoted by Peter Langston
-
- This major node discusses advanced features in `gawk'. It's a bit
-of a "grab bag" of items that are otherwise unrelated to each other.
-First, a command-line option allows `gawk' to recognize nondecimal
-numbers in input data, not just in `awk' programs. Then, `gawk''s
-special features for sorting arrays are presented. Next, two-way I/O,
-discussed briefly in earlier parts of this Info file, is described in
-full detail, along with the basics of TCP/IP networking. Finally,
-`gawk' can "profile" an `awk' program, making it possible to tune it
-for performance.
-
- *note Dynamic Extensions::, discusses the ability to dynamically add
-new built-in functions to `gawk'. As this feature is still immature
-and likely to change, its description is relegated to an appendix.
-
-* Menu:
-
-* Nondecimal Data:: Allowing nondecimal input data.
-* Array Sorting:: Facilities for controlling array traversal and
- sorting arrays.
-* Two-way I/O:: Two-way communications with another process.
-* TCP/IP Networking:: Using `gawk' for network programming.
-* Profiling:: Profiling your `awk' programs.
-
-
-File: gawk.info, Node: Nondecimal Data, Next: Array Sorting, Up: Advanced Features
-
-11.1 Allowing Nondecimal Input Data
-===================================
-
-If you run `gawk' with the `--non-decimal-data' option, you can have
-nondecimal constants in your input data:
-
- $ echo 0123 123 0x123 |
- > gawk --non-decimal-data '{ printf "%d, %d, %d\n",
- > $1, $2, $3 }'
- -| 83, 123, 291
-
- For this feature to work, write your program so that `gawk' treats
-your data as numeric:
-
- $ echo 0123 123 0x123 | gawk '{ print $1, $2, $3 }'
- -| 0123 123 0x123
-
-The `print' statement treats its expressions as strings. Although the
-fields can act as numbers when necessary, they are still strings, so
-`print' does not try to treat them numerically. You may need to add
-zero to a field to force it to be treated as a number. For example:
-
- $ echo 0123 123 0x123 | gawk --non-decimal-data '
- > { print $1, $2, $3
- > print $1 + 0, $2 + 0, $3 + 0 }'
- -| 0123 123 0x123
- -| 83 123 291
-
- Because it is common to have decimal data with leading zeros, and
-because using this facility could lead to surprising results, the
-default is to leave it disabled. If you want it, you must explicitly
-request it.
-
- CAUTION: _Use of this option is not recommended._ It can break old
- programs very badly. Instead, use the `strtonum()' function to
- convert your data (*note Nondecimal-numbers::). This makes your
- programs easier to write and easier to read, and leads to less
- surprising results.
-
-
-File: gawk.info, Node: Array Sorting, Next: Two-way I/O, Prev: Nondecimal Data, Up: Advanced Features
-
-11.2 Controlling Array Traversal and Array Sorting
-==================================================
-
-`gawk' lets you control the order in which a `for (i in array)' loop
-traverses an array.
-
- In addition, two built-in functions, `asort()' and `asorti()', let
-you sort arrays based on the array values and indices, respectively.
-These two functions also provide control over the sorting criteria used
-to order the elements during sorting.
-
-* Menu:
-
-* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
-* Array Sorting Functions:: How to use `asort()' and `asorti()'.
-
-
-File: gawk.info, Node: Controlling Array Traversal, Next: Array Sorting Functions, Up: Array Sorting
-
-11.2.1 Controlling Array Traversal
-----------------------------------
-
-By default, the order in which a `for (i in array)' loop scans an array
-is not defined; it is generally based upon the internal implementation
-of arrays inside `awk'.
-
- Often, though, it is desirable to be able to loop over the elements
-in a particular order that you, the programmer, choose. `gawk' lets
-you do this.
-
- *note Controlling Scanning::, describes how you can assign special,
-pre-defined values to `PROCINFO["sorted_in"]' in order to control the
-order in which `gawk' will traverse an array during a `for' loop.
-
- In addition, the value of `PROCINFO["sorted_in"]' can be a function
-name. This lets you traverse an array based on any custom criterion.
-The array elements are ordered according to the return value of this
-function. The comparison function should be defined with at least four
-arguments:
-
- function comp_func(i1, v1, i2, v2)
- {
- COMPARE ELEMENTS 1 AND 2 IN SOME FASHION
- RETURN < 0; 0; OR > 0
- }
-
- Here, I1 and I2 are the indices, and V1 and V2 are the corresponding
-values of the two elements being compared. Either V1 or V2, or both,
-can be arrays if the array being traversed contains subarrays as values.
-(*Note Arrays of Arrays::, for more information about subarrays.) The
-three possible return values are interpreted as follows:
-
-`comp_func(i1, v1, i2, v2) < 0'
- Index I1 comes before index I2 during loop traversal.
-
-`comp_func(i1, v1, i2, v2) == 0'
- Indices I1 and I2 come together but the relative order with
- respect to each other is undefined.
-
-`comp_func(i1, v1, i2, v2) > 0'
- Index I1 comes after index I2 during loop traversal.
-
- Our first comparison function can be used to scan an array in
-numerical order of the indices:
-
- function cmp_num_idx(i1, v1, i2, v2)
- {
- # numerical index comparison, ascending order
- return (i1 - i2)
- }
-
- Our second function traverses an array based on the string order of
-the element values rather than by indices:
-
- function cmp_str_val(i1, v1, i2, v2)
- {
- # string value comparison, ascending order
- v1 = v1 ""
- v2 = v2 ""
- if (v1 < v2)
- return -1
- return (v1 != v2)
- }
-
- The third comparison function makes all numbers, and numeric strings
-without any leading or trailing spaces, come out first during loop
-traversal:
-
- function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
- {
- # numbers before string value comparison, ascending order
- n1 = v1 + 0
- n2 = v2 + 0
- if (n1 == v1)
- return (n2 == v2) ? (n1 - n2) : -1
- else if (n2 == v2)
- return 1
- return (v1 < v2) ? -1 : (v1 != v2)
- }
-
- Here is a main program to demonstrate how `gawk' behaves using each
-of the previous functions:
-
- BEGIN {
- data["one"] = 10
- data["two"] = 20
- data[10] = "one"
- data[100] = 100
- data[20] = "two"
-
- f[1] = "cmp_num_idx"
- f[2] = "cmp_str_val"
- f[3] = "cmp_num_str_val"
- for (i = 1; i <= 3; i++) {
- printf("Sort function: %s\n", f[i])
- PROCINFO["sorted_in"] = f[i]
- for (j in data)
- printf("\tdata[%s] = %s\n", j, data[j])
- print ""
- }
- }
-
- Here are the results when the program is run:
-
- $ gawk -f compdemo.awk
- -| Sort function: cmp_num_idx Sort by numeric index
- -| data[two] = 20
- -| data[one] = 10 Both strings are numerically zero
- -| data[10] = one
- -| data[20] = two
- -| data[100] = 100
- -|
- -| Sort function: cmp_str_val Sort by element values as strings
- -| data[one] = 10
- -| data[100] = 100 String 100 is less than string 20
- -| data[two] = 20
- -| data[10] = one
- -| data[20] = two
- -|
- -| Sort function: cmp_num_str_val Sort all numeric values before all strings
- -| data[one] = 10
- -| data[two] = 20
- -| data[100] = 100
- -| data[10] = one
- -| data[20] = two
-
- Consider sorting the entries of a GNU/Linux system password file
-according to login name. The following program sorts records by a
-specific field position and can be used for this purpose:
-
- # sort.awk --- simple program to sort by field position
- # field position is specified by the global variable POS
-
- function cmp_field(i1, v1, i2, v2)
- {
- # comparison by value, as string, and ascending order
- return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
- }
-
- {
- for (i = 1; i <= NF; i++)
- a[NR][i] = $i
- }
-
- END {
- PROCINFO["sorted_in"] = "cmp_field"
- if (POS < 1 || POS > NF)
- POS = 1
- for (i in a) {
- for (j = 1; j <= NF; j++)
- printf("%s%c", a[i][j], j < NF ? ":" : "")
- print ""
- }
- }
-
- The first field in each entry of the password file is the user's
-login name, and the fields are separated by colons. Each record
-defines a subarray, with each field as an element in the subarray.
-Running the program produces the following output:
-
- $ gawk -v POS=1 -F: -f sort.awk /etc/passwd
- -| adm:x:3:4:adm:/var/adm:/sbin/nologin
- -| apache:x:48:48:Apache:/var/www:/sbin/nologin
- -| avahi:x:70:70:Avahi daemon:/:/sbin/nologin
- ...
-
- The comparison should normally always return the same value when
-given a specific pair of array elements as its arguments. If
-inconsistent results are returned then the order is undefined. This
-behavior can be exploited to introduce random order into otherwise
-seemingly ordered data:
-
- function cmp_randomize(i1, v1, i2, v2)
- {
- # random order
- return (2 - 4 * rand())
- }
-
- As mentioned above, the order of the indices is arbitrary if two
-elements compare equal. This is usually not a problem, but letting the
-tied elements come out in arbitrary order can be an issue, especially
-when comparing item values. The partial ordering of the equal elements
-may change during the next loop traversal, if other elements are added
-or removed from the array. One way to resolve ties when comparing
-elements with otherwise equal values is to include the indices in the
-comparison rules. Note that doing this may make the loop traversal
-less efficient, so consider it only if necessary. The following
-comparison functions force a deterministic order, and are based on the
-fact that the indices of two elements are never equal:
-
- function cmp_numeric(i1, v1, i2, v2)
- {
- # numerical value (and index) comparison, descending order
- return (v1 != v2) ? (v2 - v1) : (i2 - i1)
- }
-
- function cmp_string(i1, v1, i2, v2)
- {
- # string value (and index) comparison, descending order
- v1 = v1 i1
- v2 = v2 i2
- return (v1 > v2) ? -1 : (v1 != v2)
- }
-
- A custom comparison function can often simplify ordered loop
-traversal, and the sky is really the limit when it comes to designing
-such a function.
-
- When string comparisons are made during a sort, either for element
-values where one or both aren't numbers, or for element indices handled
-as strings, the value of `IGNORECASE' (*note Built-in Variables::)
-controls whether the comparisons treat corresponding uppercase and
-lowercase letters as equivalent or distinct.
-
- Another point to keep in mind is that in the case of subarrays the
-element values can themselves be arrays; a production comparison
-function should use the `isarray()' function (*note Type Functions::),
-to check for this, and choose a defined sorting order for subarrays.
-
- All sorting based on `PROCINFO["sorted_in"]' is disabled in POSIX
-mode, since the `PROCINFO' array is not special in that case.
-
- As a side note, sorting the array indices before traversing the
-array has been reported to add 15% to 20% overhead to the execution
-time of `awk' programs. For this reason, sorted array traversal is not
-the default.
-
-
-File: gawk.info, Node: Array Sorting Functions, Prev: Controlling Array Traversal, Up: Array Sorting
-
-11.2.2 Sorting Array Values and Indices with `gawk'
----------------------------------------------------
-
-In most `awk' implementations, sorting an array requires writing a
-`sort()' function. While this can be educational for exploring
-different sorting algorithms, usually that's not the point of the
-program. `gawk' provides the built-in `asort()' and `asorti()'
-functions (*note String Functions::) for sorting arrays. For example:
-
- POPULATE THE ARRAY data
- n = asort(data)
- for (i = 1; i <= n; i++)
- DO SOMETHING WITH data[i]
-
- After the call to `asort()', the array `data' is indexed from 1 to
-some number N, the total number of elements in `data'. (This count is
-`asort()''s return value.) `data[1]' <= `data[2]' <= `data[3]', and so
-on. The comparison is based on the type of the elements (*note Typing
-and Comparison::). All numeric values come before all string values,
-which in turn come before all subarrays.
-
- An important side effect of calling `asort()' is that _the array's
-original indices are irrevocably lost_. As this isn't always
-desirable, `asort()' accepts a second argument:
-
- POPULATE THE ARRAY source
- n = asort(source, dest)
- for (i = 1; i <= n; i++)
- DO SOMETHING WITH dest[i]
-
- In this case, `gawk' copies the `source' array into the `dest' array
-and then sorts `dest', destroying its indices. However, the `source'
-array is not affected.
-
- `asort()' accepts a third string argument to control comparison of
-array elements. As with `PROCINFO["sorted_in"]', this argument may be
-one of the predefined names that `gawk' provides (*note Controlling
-Scanning::), or the name of a user-defined function (*note Controlling
-Array Traversal::).
-
- NOTE: In all cases, the sorted element values consist of the
- original array's element values. The ability to control
- comparison merely affects the way in which they are sorted.
-
- Often, what's needed is to sort on the values of the _indices_
-instead of the values of the elements. To do that, use the `asorti()'
-function. The interface is identical to that of `asort()', except that
-the index values are used for sorting, and become the values of the
-result array:
-
- { source[$0] = some_func($0) }
-
- END {
- n = asorti(source, dest)
- for (i = 1; i <= n; i++) {
- Work with sorted indices directly:
- DO SOMETHING WITH dest[i]
- ...
- Access original array via sorted indices:
- DO SOMETHING WITH source[dest[i]]
- }
- }
-
- Similar to `asort()', in all cases, the sorted element values
-consist of the original array's indices. The ability to control
-comparison merely affects the way in which they are sorted.
-
- Sorting the array by replacing the indices provides maximal
-flexibility. To traverse the elements in decreasing order, use a loop
-that goes from N down to 1, either over the elements or over the
-indices.(1)
-
- Copying array indices and elements isn't expensive in terms of
-memory. Internally, `gawk' maintains "reference counts" to data. For
-example, when `asort()' copies the first array to the second one, there
-is only one copy of the original array elements' data, even though both
-arrays use the values.
-
- Because `IGNORECASE' affects string comparisons, the value of
-`IGNORECASE' also affects sorting for both `asort()' and `asorti()'.
-Note also that the locale's sorting order does _not_ come into play;
-comparisons are based on character values only.(2) Caveat Emptor.
-
- ---------- Footnotes ----------
-
- (1) You may also use one of the predefined sorting names that sorts
-in decreasing order.
-
- (2) This is true because locale-based comparison occurs only when in
-POSIX compatibility mode, and since `asort()' and `asorti()' are `gawk'
-extensions, they are not available in that case.
-
-
-File: gawk.info, Node: Two-way I/O, Next: TCP/IP Networking, Prev: Array Sorting, Up: Advanced Features
-
-11.3 Two-Way Communications with Another Process
-================================================
-
- From: brennan@whidbey.com (Mike Brennan)
- Newsgroups: comp.lang.awk
- Subject: Re: Learn the SECRET to Attract Women Easily
- Date: 4 Aug 1997 17:34:46 GMT
- Message-ID: <5s53rm$eca@news.whidbey.com>
-
- On 3 Aug 1997 13:17:43 GMT, Want More Dates???
- <tracy78@kilgrona.com> wrote:
- >Learn the SECRET to Attract Women Easily
- >
- >The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
-
- The scent of awk programmers is a lot more attractive to women than
- the scent of perl programmers.
- --
- Mike Brennan
-
- It is often useful to be able to send data to a separate program for
-processing and then read the result. This can always be done with
-temporary files:
-
- # Write the data for processing
- tempfile = ("mydata." PROCINFO["pid"])
- while (NOT DONE WITH DATA)
- print DATA | ("subprogram > " tempfile)
- close("subprogram > " tempfile)
-
- # Read the results, remove tempfile when done
- while ((getline newdata < tempfile) > 0)
- PROCESS newdata APPROPRIATELY
- close(tempfile)
- system("rm " tempfile)
-
-This works, but not elegantly. Among other things, it requires that
-the program be run in a directory that cannot be shared among users;
-for example, `/tmp' will not do, as another user might happen to be
-using a temporary file with the same name.
-
- However, with `gawk', it is possible to open a _two-way_ pipe to
-another process. The second process is termed a "coprocess", since it
-runs in parallel with `gawk'. The two-way connection is created using
-the `|&' operator (borrowed from the Korn shell, `ksh'):(1)
-
- do {
- print DATA |& "subprogram"
- "subprogram" |& getline results
- } while (DATA LEFT TO PROCESS)
- close("subprogram")
-
- The first time an I/O operation is executed using the `|&' operator,
-`gawk' creates a two-way pipeline to a child process that runs the
-other program. Output created with `print' or `printf' is written to
-the program's standard input, and output from the program's standard
-output can be read by the `gawk' program using `getline'. As is the
-case with processes started by `|', the subprogram can be any program,
-or pipeline of programs, that can be started by the shell.
-
- There are some cautionary items to be aware of:
-
- * As the code inside `gawk' currently stands, the coprocess's
- standard error goes to the same place that the parent `gawk''s
- standard error goes. It is not possible to read the child's
- standard error separately.
-
- * I/O buffering may be a problem. `gawk' automatically flushes all
- output down the pipe to the coprocess. However, if the coprocess
- does not flush its output, `gawk' may hang when doing a `getline'
- in order to read the coprocess's results. This could lead to a
- situation known as "deadlock", where each process is waiting for
- the other one to do something.
-
- It is possible to close just one end of the two-way pipe to a
-coprocess, by supplying a second argument to the `close()' function of
-either `"to"' or `"from"' (*note Close Files And Pipes::). These
-strings tell `gawk' to close the end of the pipe that sends data to the
-coprocess or the end that reads from it, respectively.
-
- This is particularly necessary in order to use the system `sort'
-utility as part of a coprocess; `sort' must read _all_ of its input
-data before it can produce any output. The `sort' program does not
-receive an end-of-file indication until `gawk' closes the write end of
-the pipe.
-
- When you have finished writing data to the `sort' utility, you can
-close the `"to"' end of the pipe, and then start reading sorted data
-via `getline'. For example:
-
- BEGIN {
- command = "LC_ALL=C sort"
- n = split("abcdefghijklmnopqrstuvwxyz", a, "")
-
- for (i = n; i > 0; i--)
- print a[i] |& command
- close(command, "to")
-
- while ((command |& getline line) > 0)
- print "got", line
- close(command)
- }
-
- This program writes the letters of the alphabet in reverse order, one
-per line, down the two-way pipe to `sort'. It then closes the write
-end of the pipe, so that `sort' receives an end-of-file indication.
-This causes `sort' to sort the data and write the sorted data back to
-the `gawk' program. Once all of the data has been read, `gawk'
-terminates the coprocess and exits.
-
- As a side note, the assignment `LC_ALL=C' in the `sort' command
-ensures traditional Unix (ASCII) sorting from `sort'.
-
- You may also use pseudo-ttys (ptys) for two-way communication
-instead of pipes, if your system supports them. This is done on a
-per-command basis, by setting a special element in the `PROCINFO' array
-(*note Auto-set::), like so:
-
- command = "sort -nr" # command, save in convenience variable
- PROCINFO[command, "pty"] = 1 # update PROCINFO
- print ... |& command # start two-way pipe
- ...
-
-Using ptys avoids the buffer deadlock issues described earlier, at some
-loss in performance. If your system does not have ptys, or if all the
-system's ptys are in use, `gawk' automatically falls back to using
-regular pipes.
-
- ---------- Footnotes ----------
-
- (1) This is very different from the same operator in the C shell.
-
-
-File: gawk.info, Node: TCP/IP Networking, Next: Profiling, Prev: Two-way I/O, Up: Advanced Features
-
-11.4 Using `gawk' for Network Programming
-=========================================
-
- `EMISTERED':
- A host is a host from coast to coast,
- and no-one can talk to host that's close,
- unless the host that isn't close
- is busy hung or dead.
-
- In addition to being able to open a two-way pipeline to a coprocess
-on the same system (*note Two-way I/O::), it is possible to make a
-two-way connection to another process on another system across an IP
-network connection.
-
- You can think of this as just a _very long_ two-way pipeline to a
-coprocess. The way `gawk' decides that you want to use TCP/IP
-networking is by recognizing special file names that begin with one of
-`/inet/', `/inet4/' or `/inet6'.
-
- The full syntax of the special file name is
-`/NET-TYPE/PROTOCOL/LOCAL-PORT/REMOTE-HOST/REMOTE-PORT'. The
-components are:
-
-NET-TYPE
- Specifies the kind of Internet connection to make. Use `/inet4/'
- to force IPv4, and `/inet6/' to force IPv6. Plain `/inet/' (which
- used to be the only option) uses the system default, most likely
- IPv4.
-
-PROTOCOL
- The protocol to use over IP. This must be either `tcp', or `udp',
- for a TCP or UDP IP connection, respectively. The use of TCP is
- recommended for most applications.
-
-LOCAL-PORT
- The local TCP or UDP port number to use. Use a port number of `0'
- when you want the system to pick a port. This is what you should do
- when writing a TCP or UDP client. You may also use a well-known
- service name, such as `smtp' or `http', in which case `gawk'
- attempts to determine the predefined port number using the C
- `getaddrinfo()' function.
-
-REMOTE-HOST
- The IP address or fully-qualified domain name of the Internet host
- to which you want to connect.
-
-REMOTE-PORT
- The TCP or UDP port number to use on the given REMOTE-HOST.
- Again, use `0' if you don't care, or else a well-known service
- name.
-
- NOTE: Failure in opening a two-way socket will result in a
- non-fatal error being returned to the calling code. The value of
- `ERRNO' indicates the error (*note Auto-set::).
-
- Consider the following very simple example:
-
- BEGIN {
- Service = "/inet/tcp/0/localhost/daytime"
- Service |& getline
- print $0
- close(Service)
- }
-
- This program reads the current date and time from the local system's
-TCP `daytime' server. It then prints the results and closes the
-connection.
-
- Because this topic is extensive, the use of `gawk' for TCP/IP
-programming is documented separately. See *note (General
-Introduction)Top:: gawkinet, TCP/IP Internetworking with `gawk', for a
-much more complete introduction and discussion, as well as extensive
-examples.
-
-
-File: gawk.info, Node: Profiling, Prev: TCP/IP Networking, Up: Advanced Features
-
-11.5 Profiling Your `awk' Programs
-==================================
-
-You may produce execution traces of your `awk' programs. This is done
-by passing the option `--profile' to `gawk'. When `gawk' has finished
-running, it creates a profile of your program in a file named
-`awkprof.out'. Because it is profiling, it also executes up to 45%
-slower than `gawk' normally does.
-
- As shown in the following example, the `--profile' option can be
-used to change the name of the file where `gawk' will write the profile:
-
- gawk --profile=myprog.prof -f myprog.awk data1 data2
-
-In the above example, `gawk' places the profile in `myprog.prof'
-instead of in `awkprof.out'.
-
- Here is a sample session showing a simple `awk' program, its input
-data, and the results from running `gawk' with the `--profile' option.
-First, the `awk' program:
-
- BEGIN { print "First BEGIN rule" }
-
- END { print "First END rule" }
-
- /foo/ {
- print "matched /foo/, gosh"
- for (i = 1; i <= 3; i++)
- sing()
- }
-
- {
- if (/foo/)
- print "if is true"
- else
- print "else is true"
- }
-
- BEGIN { print "Second BEGIN rule" }
-
- END { print "Second END rule" }
-
- function sing( dummy)
- {
- print "I gotta be me!"
- }
-
- Following is the input data:
-
- foo
- bar
- baz
- foo
- junk
-
- Here is the `awkprof.out' that results from running the `gawk'
-profiler on this program and data (this example also illustrates that
-`awk' programmers sometimes have to work late):
-
- # gawk profile, created Sun Aug 13 00:00:15 2000
-
- # BEGIN block(s)
-
- BEGIN {
- 1 print "First BEGIN rule"
- 1 print "Second BEGIN rule"
- }
-
- # Rule(s)
-
- 5 /foo/ { # 2
- 2 print "matched /foo/, gosh"
- 6 for (i = 1; i <= 3; i++) {
- 6 sing()
- }
- }
-
- 5 {
- 5 if (/foo/) { # 2
- 2 print "if is true"
- 3 } else {
- 3 print "else is true"
- }
- }
-
- # END block(s)
-
- END {
- 1 print "First END rule"
- 1 print "Second END rule"
- }
-
- # Functions, listed alphabetically
-
- 6 function sing(dummy)
- {
- 6 print "I gotta be me!"
- }
-
- This example illustrates many of the basic features of profiling
-output. They are as follows:
-
- * The program is printed in the order `BEGIN' rule, `BEGINFILE' rule,
- pattern/action rules, `ENDFILE' rule, `END' rule and functions,
- listed alphabetically. Multiple `BEGIN' and `END' rules are
- merged together, as are multiple `BEGINFILE' and `ENDFILE' rules.
-
- * Pattern-action rules have two counts. The first count, to the
- left of the rule, shows how many times the rule's pattern was
- _tested_. The second count, to the right of the rule's opening
- left brace in a comment, shows how many times the rule's action
- was _executed_. The difference between the two indicates how many
- times the rule's pattern evaluated to false.
-
- * Similarly, the count for an `if'-`else' statement shows how many
- times the condition was tested. To the right of the opening left
- brace for the `if''s body is a count showing how many times the
- condition was true. The count for the `else' indicates how many
- times the test failed.
-
- * The count for a loop header (such as `for' or `while') shows how
- many times the loop test was executed. (Because of this, you
- can't just look at the count on the first statement in a rule to
- determine how many times the rule was executed. If the first
- statement is a loop, the count is misleading.)
-
- * For user-defined functions, the count next to the `function'
- keyword indicates how many times the function was called. The
- counts next to the statements in the body show how many times
- those statements were executed.
-
- * The layout uses "K&R" style with TABs. Braces are used
- everywhere, even when the body of an `if', `else', or loop is only
- a single statement.
-
- * Parentheses are used only where needed, as indicated by the
- structure of the program and the precedence rules. For example,
- `(3 + 5) * 4' means add three plus five, then multiply the total
- by four. However, `3 + 5 * 4' has no parentheses, and means `3 +
- (5 * 4)'.
-
- * Parentheses are used around the arguments to `print' and `printf'
- only when the `print' or `printf' statement is followed by a
- redirection. Similarly, if the target of a redirection isn't a
- scalar, it gets parenthesized.
-
- * `gawk' supplies leading comments in front of the `BEGIN' and `END'
- rules, the pattern/action rules, and the functions.
-
-
- The profiled version of your program may not look exactly like what
-you typed when you wrote it. This is because `gawk' creates the
-profiled version by "pretty printing" its internal representation of
-the program. The advantage to this is that `gawk' can produce a
-standard representation. The disadvantage is that all source-code
-comments are lost, as are the distinctions among multiple `BEGIN',
-`END', `BEGINFILE', and `ENDFILE' rules. Also, things such as:
-
- /foo/
-
-come out as:
-
- /foo/ {
- print $0
- }
+File: gawk.info, Node: Library Functions, Next: Sample Programs, Prev: Functions, Up: Top
-which is correct, but possibly surprising.
-
- Besides creating profiles when a program has completed, `gawk' can
-produce a profile while it is running. This is useful if your `awk'
-program goes into an infinite loop and you want to see what has been
-executed. To use this feature, run `gawk' with the `--profile' option
-in the background:
-
- $ gawk --profile -f myprog &
- [1] 13992
-
-The shell prints a job number and process ID number; in this case,
-13992. Use the `kill' command to send the `USR1' signal to `gawk':
-
- $ kill -USR1 13992
-
-As usual, the profiled version of the program is written to
-`awkprof.out', or to a different file if one specified with the
-`--profile' option.
-
- Along with the regular profile, as shown earlier, the profile
-includes a trace of any active functions:
-
- # Function Call Stack:
-
- # 3. baz
- # 2. bar
- # 1. foo
- # -- main --
-
- You may send `gawk' the `USR1' signal as many times as you like.
-Each time, the profile and function call trace are appended to the
-output profile file.
-
- If you use the `HUP' signal instead of the `USR1' signal, `gawk'
-produces the profile and the function call trace and then exits.
-
- When `gawk' runs on MS-Windows systems, it uses the `INT' and `QUIT'
-signals for producing the profile and, in the case of the `INT' signal,
-`gawk' exits. This is because these systems don't support the `kill'
-command, so the only signals you can deliver to a program are those
-generated by the keyboard. The `INT' signal is generated by the
-`Ctrl-<C>' or `Ctrl-<BREAK>' key, while the `QUIT' signal is generated
-by the `Ctrl-<\>' key.
-
- Finally, `gawk' also accepts another option `--pretty-print'. When
-called this way, `gawk' "pretty prints" the program into `awkprof.out',
-without any execution counts.
-
-
-File: gawk.info, Node: Library Functions, Next: Sample Programs, Prev: Advanced Features, Up: Top
-
-12 A Library of `awk' Functions
+10 A Library of `awk' Functions
*******************************
*note User-defined::, describes how to write your own `awk' functions.
@@ -14878,7 +13469,7 @@ contents of the input record.

File: gawk.info, Node: Library Names, Next: General Functions, Up: Library Functions
-12.1 Naming Library Function Global Variables
+10.1 Naming Library Function Global Variables
=============================================
Due to the way the `awk' language evolved, variables are either
@@ -14958,7 +13549,7 @@ verifying this.

File: gawk.info, Node: General Functions, Next: Data File Management, Prev: Library Names, Up: Library Functions
-12.2 General Programming
+10.2 General Programming
========================
This minor node presents a number of functions that are of general
@@ -14981,7 +13572,7 @@ programming use.

File: gawk.info, Node: Strtonum Function, Next: Assert Function, Up: General Functions
-12.2.1 Converting Strings To Numbers
+10.2.1 Converting Strings To Numbers
------------------------------------
The `strtonum()' function (*note String Functions::) is a `gawk'
@@ -15065,7 +13656,7 @@ be tested with `gawk' and the results compared to the built-in

File: gawk.info, Node: Assert Function, Next: Round Function, Prev: Strtonum Function, Up: General Functions
-12.2.2 Assertions
+10.2.2 Assertions
-----------------
When writing large programs, it is often useful to know that a
@@ -15151,7 +13742,7 @@ rule always ends with an `exit' statement.

File: gawk.info, Node: Round Function, Next: Cliff Random Function, Prev: Assert Function, Up: General Functions
-12.2.3 Rounding Numbers
+10.2.3 Rounding Numbers
-----------------------
The way `printf' and `sprintf()' (*note Printf::) perform rounding
@@ -15197,7 +13788,7 @@ might be useful if your `awk''s `printf' does unbiased rounding:

File: gawk.info, Node: Cliff Random Function, Next: Ordinal Functions, Prev: Round Function, Up: General Functions
-12.2.4 The Cliff Random Number Generator
+10.2.4 The Cliff Random Number Generator
----------------------------------------
The Cliff random number generator
@@ -15226,7 +13817,7 @@ might try using this function instead.

File: gawk.info, Node: Ordinal Functions, Next: Join Function, Prev: Cliff Random Function, Up: General Functions
-12.2.5 Translating Between Characters and Numbers
+10.2.5 Translating Between Characters and Numbers
-------------------------------------------------
One commercial implementation of `awk' supplies a built-in function,
@@ -15324,7 +13915,7 @@ extensions, you can simplify `_ord_init' to loop from 0 to 255.

File: gawk.info, Node: Join Function, Next: Getlocaltime Function, Prev: Ordinal Functions, Up: General Functions
-12.2.6 Merging an Array into a String
+10.2.6 Merging an Array into a String
-------------------------------------
When doing string processing, it is often useful to be able to join all
@@ -15371,7 +13962,7 @@ makes string operations more difficult than they really need to be.

File: gawk.info, Node: Getlocaltime Function, Prev: Join Function, Up: General Functions
-12.2.7 Managing the Time of Day
+10.2.7 Managing the Time of Day
-------------------------------
The `systime()' and `strftime()' functions described in *note Time
@@ -15453,7 +14044,7 @@ optional timestamp value to use instead of the current time.

File: gawk.info, Node: Data File Management, Next: Getopt Function, Prev: General Functions, Up: Library Functions
-12.3 Data File Management
+10.3 Data File Management
=========================
This minor node presents functions that are useful for managing
@@ -15470,7 +14061,7 @@ command-line data files.

File: gawk.info, Node: Filetrans Function, Next: Rewind Function, Up: Data File Management
-12.3.1 Noting Data File Boundaries
+10.3.1 Noting Data File Boundaries
----------------------------------
The `BEGIN' and `END' rules are each executed exactly once at the
@@ -15568,7 +14159,7 @@ it provides an easy way to do per-file cleanup processing.

File: gawk.info, Node: Rewind Function, Next: File Checking, Prev: Filetrans Function, Up: Data File Management
-12.3.2 Rereading the Current File
+10.3.2 Rereading the Current File
---------------------------------
Another request for a new built-in function was for a `rewind()'
@@ -15610,7 +14201,7 @@ Nextfile Statement::).

File: gawk.info, Node: File Checking, Next: Empty Files, Prev: Rewind Function, Up: Data File Management
-12.3.3 Checking for Readable Data Files
+10.3.3 Checking for Readable Data Files
---------------------------------------
Normally, if you give `awk' a data file that isn't readable, it stops
@@ -15639,7 +14230,7 @@ in the list). See also *note ARGC and ARGV::.

File: gawk.info, Node: Empty Files, Next: Ignoring Assigns, Prev: File Checking, Up: Data File Management
-12.3.4 Checking For Zero-length Files
+10.3.4 Checking For Zero-length Files
-------------------------------------
All known `awk' implementations silently skip over zero-length files.
@@ -15696,7 +14287,7 @@ intervening value in `ARGV' is a variable assignment.

File: gawk.info, Node: Ignoring Assigns, Prev: Empty Files, Up: Data File Management
-12.3.5 Treating Assignments as File Names
+10.3.5 Treating Assignments as File Names
-----------------------------------------
Occasionally, you might not want `awk' to process command-line variable
@@ -15739,7 +14330,7 @@ arguments are left alone.

File: gawk.info, Node: Getopt Function, Next: Passwd Functions, Prev: Data File Management, Up: Library Functions
-12.4 Processing Command-Line Options
+10.4 Processing Command-Line Options
====================================
Most utilities on POSIX compatible systems take options on the command
@@ -16032,7 +14623,7 @@ have left it alone, since using `substr()' is more portable.

File: gawk.info, Node: Passwd Functions, Next: Group Functions, Prev: Getopt Function, Up: Library Functions
-12.5 Reading the User Database
+10.5 Reading the User Database
==============================
The `PROCINFO' array (*note Built-in Variables::) provides access to
@@ -16275,7 +14866,7 @@ network database.

File: gawk.info, Node: Group Functions, Next: Walking Arrays, Prev: Passwd Functions, Up: Library Functions
-12.6 Reading the Group Database
+10.6 Reading the Group Database
===============================
Much of the discussion presented in *note Passwd Functions::, applies
@@ -16509,7 +15100,7 @@ very simple, relying on `awk''s associative arrays to do work.

File: gawk.info, Node: Walking Arrays, Prev: Group Functions, Up: Library Functions
-12.7 Traversing Arrays of Arrays
+10.7 Traversing Arrays of Arrays
================================
*note Arrays of Arrays::, described how `gawk' provides arrays of
@@ -16558,9 +15149,9 @@ value. Here is a main program to demonstrate:
-| a[3] = 3

-File: gawk.info, Node: Sample Programs, Next: Debugger, Prev: Library Functions, Up: Top
+File: gawk.info, Node: Sample Programs, Next: Internationalization, Prev: Library Functions, Up: Top
-13 Practical `awk' Programs
+11 Practical `awk' Programs
***************************
*note Library Functions::, presents the idea that reading programs in a
@@ -16580,7 +15171,7 @@ Library Functions::.

File: gawk.info, Node: Running Examples, Next: Clones, Up: Sample Programs
-13.1 Running the Example Programs
+11.1 Running the Example Programs
=================================
To run a given program, you would typically do something like this:
@@ -16603,7 +15194,7 @@ OPTIONS are any command-line options for the program that start with a

File: gawk.info, Node: Clones, Next: Miscellaneous Programs, Prev: Running Examples, Up: Sample Programs
-13.2 Reinventing Wheels for Fun and Profit
+11.2 Reinventing Wheels for Fun and Profit
==========================================
This minor node presents a number of POSIX utilities implemented in
@@ -16633,7 +15224,7 @@ programming for "real world" tasks.

File: gawk.info, Node: Cut Program, Next: Egrep Program, Up: Clones
-13.2.1 Cutting out Fields and Columns
+11.2.1 Cutting out Fields and Columns
-------------------------------------
The `cut' utility selects, or "cuts," characters or fields from its
@@ -16892,7 +15483,7 @@ solution to the problem of picking the input line apart by characters.

File: gawk.info, Node: Egrep Program, Next: Id Program, Prev: Cut Program, Up: Clones
-13.2.2 Searching for Regular Expressions in Files
+11.2.2 Searching for Regular Expressions in Files
-------------------------------------------------
The `egrep' utility searches files for patterns. It uses regular
@@ -17124,7 +15715,7 @@ the translated line, not the original.

File: gawk.info, Node: Id Program, Next: Split Program, Prev: Egrep Program, Up: Clones
-13.2.3 Printing out User Information
+11.2.3 Printing out User Information
------------------------------------
The `id' utility lists a user's real and effective user ID numbers,
@@ -17231,7 +15822,7 @@ body never executes.

File: gawk.info, Node: Split Program, Next: Tee Program, Prev: Id Program, Up: Clones
-13.2.4 Splitting a Large File into Pieces
+11.2.4 Splitting a Large File into Pieces
-----------------------------------------
The `split' program splits large text files into smaller pieces. Usage
@@ -17339,7 +15930,7 @@ not relevant for what the program aims to demonstrate.

File: gawk.info, Node: Tee Program, Next: Uniq Program, Prev: Split Program, Up: Clones
-13.2.5 Duplicating Output into Multiple Files
+11.2.5 Duplicating Output into Multiple Files
---------------------------------------------
The `tee' program is known as a "pipe fitting." `tee' copies its
@@ -17427,7 +16018,7 @@ N input records and M output files, the first method only executes N

File: gawk.info, Node: Uniq Program, Next: Wc Program, Prev: Tee Program, Up: Clones
-13.2.6 Printing Nonduplicated Lines of Text
+11.2.6 Printing Nonduplicated Lines of Text
-------------------------------------------
The `uniq' utility reads sorted lines of data on its standard input,
@@ -17646,7 +16237,7 @@ line of input data:

File: gawk.info, Node: Wc Program, Prev: Uniq Program, Up: Clones
-13.2.7 Counting Things
+11.2.7 Counting Things
----------------------
The `wc' (word count) utility counts lines, words, and characters in
@@ -17791,7 +16382,7 @@ characters, not bytes.

File: gawk.info, Node: Miscellaneous Programs, Prev: Clones, Up: Sample Programs
-13.3 A Grab Bag of `awk' Programs
+11.3 A Grab Bag of `awk' Programs
=================================
This minor node is a large "grab bag" of miscellaneous programs. We
@@ -17818,7 +16409,7 @@ hope you find them both interesting and enjoyable.

File: gawk.info, Node: Dupword Program, Next: Alarm Program, Up: Miscellaneous Programs
-13.3.1 Finding Duplicated Words in a Document
+11.3.1 Finding Duplicated Words in a Document
---------------------------------------------
A common error when writing large amounts of prose is to accidentally
@@ -17866,7 +16457,7 @@ word, comparing it to the previous one:

File: gawk.info, Node: Alarm Program, Next: Translate Program, Prev: Dupword Program, Up: Miscellaneous Programs
-13.3.2 An Alarm Clock Program
+11.3.2 An Alarm Clock Program
-----------------------------
Nothing cures insomnia like a ringing alarm clock.
@@ -17999,7 +16590,7 @@ necessary:

File: gawk.info, Node: Translate Program, Next: Labels Program, Prev: Alarm Program, Up: Miscellaneous Programs
-13.3.3 Transliterating Characters
+11.3.3 Transliterating Characters
---------------------------------
The system `tr' utility transliterates characters. For example, it is
@@ -18125,7 +16716,7 @@ split each character in a string into separate array elements.

File: gawk.info, Node: Labels Program, Next: Word Sorting, Prev: Translate Program, Up: Miscellaneous Programs
-13.3.4 Printing Mailing Labels
+11.3.4 Printing Mailing Labels
------------------------------
Here is a "real world"(1) program. This script reads lists of names and
@@ -18232,7 +16823,7 @@ something done."

File: gawk.info, Node: Word Sorting, Next: History Sorting, Prev: Labels Program, Up: Miscellaneous Programs
-13.3.5 Generating Word-Usage Counts
+11.3.5 Generating Word-Usage Counts
-----------------------------------
When working with large amounts of text, it can be interesting to know
@@ -18336,7 +16927,7 @@ operating system documentation for more information on how to use the

File: gawk.info, Node: History Sorting, Next: Extract Program, Prev: Word Sorting, Up: Miscellaneous Programs
-13.3.6 Removing Duplicates from Unsorted Text
+11.3.6 Removing Duplicates from Unsorted Text
---------------------------------------------
The `uniq' program (*note Uniq Program::), removes duplicate lines from
@@ -18383,7 +16974,7 @@ seen.

File: gawk.info, Node: Extract Program, Next: Simple Sed, Prev: History Sorting, Up: Miscellaneous Programs
-13.3.7 Extracting Programs from Texinfo Source Files
+11.3.7 Extracting Programs from Texinfo Source Files
----------------------------------------------------
The nodes *note Library Functions::, and *note Sample Programs::, are
@@ -18583,7 +17174,7 @@ function. Consider how you might use it to simplify the code.

File: gawk.info, Node: Simple Sed, Next: Igawk Program, Prev: Extract Program, Up: Miscellaneous Programs
-13.3.8 A Simple Stream Editor
+11.3.8 A Simple Stream Editor
-----------------------------
The `sed' utility is a stream editor, a program that reads a stream of
@@ -18664,7 +17255,7 @@ the single rule handles the printing scheme outlined above, using

File: gawk.info, Node: Igawk Program, Next: Anagram Program, Prev: Simple Sed, Up: Miscellaneous Programs
-13.3.9 An Easy Way to Use Library Functions
+11.3.9 An Easy Way to Use Library Functions
-------------------------------------------
In *note Include Files::, we saw how `gawk' provides a built-in
@@ -19061,7 +17652,7 @@ can loop forever if the file exists but is empty. Caveat emptor.

File: gawk.info, Node: Anagram Program, Next: Signature Program, Prev: Igawk Program, Up: Miscellaneous Programs
-13.3.10 Finding Anagrams From A Dictionary
+11.3.10 Finding Anagrams From A Dictionary
------------------------------------------
An interesting programming challenge is to search for "anagrams" in a
@@ -19151,7 +17742,7 @@ otherwise the anagrams would appear in arbitrary order:

File: gawk.info, Node: Signature Program, Prev: Anagram Program, Up: Miscellaneous Programs
-13.3.11 And Now For Something Completely Different
+11.3.11 And Now For Something Completely Different
--------------------------------------------------
The following program was written by Davide Brini and is published on
@@ -19176,7 +17767,1437 @@ supplies the following copyright terms:
We leave it to you to determine what the program does.

-File: gawk.info, Node: Debugger, Next: Arbitrary Precision Arithmetic, Prev: Sample Programs, Up: Top
+File: gawk.info, Node: Internationalization, Next: Advanced Features, Prev: Sample Programs, Up: Top
+
+12 Internationalization with `gawk'
+***********************************
+
+Once upon a time, computer makers wrote software that worked only in
+English. Eventually, hardware and software vendors noticed that if
+their systems worked in the native languages of non-English-speaking
+countries, they were able to sell more systems. As a result,
+internationalization and localization of programs and software systems
+became a common practice.
+
+ For many years, the ability to provide internationalization was
+largely restricted to programs written in C and C++. This major node
+describes the underlying library `gawk' uses for internationalization,
+as well as how `gawk' makes internationalization features available at
+the `awk' program level. Having internationalization available at the
+`awk' level gives software developers additional flexibility--they are
+no longer forced to write in C or C++ when internationalization is a
+requirement.
+
+* Menu:
+
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU `gettext' works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: `gawk' is also internationalized.
+
+
+File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internationalization
+
+12.1 Internationalization and Localization
+==========================================
+
+"Internationalization" means writing (or modifying) a program once, in
+such a way that it can use multiple languages without requiring further
+source-code changes. "Localization" means providing the data necessary
+for an internationalized program to work in a particular language.
+Most typically, these terms refer to features such as the language used
+for printing error messages, the language used to read responses, and
+information related to how numerical and monetary values are printed
+and read.
+
+
+File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N and L10N, Up: Internationalization
+
+12.2 GNU `gettext'
+==================
+
+The facilities in GNU `gettext' focus on messages; strings printed by a
+program, either directly or via formatting with `printf' or
+`sprintf()'.(1)
+
+ When using GNU `gettext', each application has its own "text
+domain". This is a unique name, such as `kpilot' or `gawk', that
+identifies the application. A complete application may have multiple
+components--programs written in C or C++, as well as scripts written in
+`sh' or `awk'. All of the components use the same text domain.
+
+ To make the discussion concrete, assume we're writing an application
+named `guide'. Internationalization consists of the following steps,
+in this order:
+
+ 1. The programmer goes through the source for all of `guide''s
+ components and marks each string that is a candidate for
+ translation. For example, `"`-F': option required"' is a good
+ candidate for translation. A table with strings of option names
+ is not (e.g., `gawk''s `--profile' option should remain the same,
+ no matter what the local language).
+
+ 2. The programmer indicates the application's text domain (`"guide"')
+ to the `gettext' library, by calling the `textdomain()' function.
+
+ 3. Messages from the application are extracted from the source code
+ and collected into a portable object template file (`guide.pot'),
+ which lists the strings and their translations. The translations
+ are initially empty. The original (usually English) messages
+ serve as the key for lookup of the translations.
+
+ 4. For each language with a translator, `guide.pot' is copied to a
+ portable object file (`.po') and translations are created and
+ shipped with the application. For example, there might be a
+ `fr.po' for a French translation.
+
+ 5. Each language's `.po' file is converted into a binary message
+ object (`.mo') file. A message object file contains the original
+ messages and their translations in a binary format that allows
+ fast lookup of translations at runtime.
+
+ 6. When `guide' is built and installed, the binary translation files
+ are installed in a standard place.
+
+ 7. For testing and development, it is possible to tell `gettext' to
+ use `.mo' files in a different directory than the standard one by
+ using the `bindtextdomain()' function.
+
+ 8. At runtime, `guide' looks up each string via a call to
+ `gettext()'. The returned string is the translated string if
+ available, or the original string if not.
+
+ 9. If necessary, it is possible to access messages from a different
+ text domain than the one belonging to the application, without
+ having to switch the application's default text domain back and
+ forth.
+
+ In C (or C++), the string marking and dynamic translation lookup are
+accomplished by wrapping each string in a call to `gettext()':
+
+ printf("%s", gettext("Don't Panic!\n"));
+
+ The tools that extract messages from source code pull out all
+strings enclosed in calls to `gettext()'.
+
+ The GNU `gettext' developers, recognizing that typing `gettext(...)'
+over and over again is both painful and ugly to look at, use the macro
+`_' (an underscore) to make things easier:
+
+ /* In the standard header file: */
+ #define _(str) gettext(str)
+
+ /* In the program text: */
+ printf("%s", _("Don't Panic!\n"));
+
+This reduces the typing overhead to just three extra characters per
+string and is considerably easier to read as well.
+
+ There are locale "categories" for different types of locale-related
+information. The defined locale categories that `gettext' knows about
+are:
+
+`LC_MESSAGES'
+ Text messages. This is the default category for `gettext'
+ operations, but it is possible to supply a different one
+ explicitly, if necessary. (It is almost never necessary to supply
+ a different category.)
+
+`LC_COLLATE'
+ Text-collation information; i.e., how different characters and/or
+ groups of characters sort in a given language.
+
+`LC_CTYPE'
+ Character-type information (alphabetic, digit, upper- or
+ lowercase, and so on). This information is accessed via the POSIX
+ character classes in regular expressions, such as `/[[:alnum:]]/'
+ (*note Regexp Operators::).
+
+`LC_MONETARY'
+ Monetary information, such as the currency symbol, and whether the
+ symbol goes before or after a number.
+
+`LC_NUMERIC'
+ Numeric information, such as which characters to use for the
+ decimal point and the thousands separator.(2)
+
+`LC_RESPONSE'
+ Response information, such as how "yes" and "no" appear in the
+ local language, and possibly other information as well.
+
+`LC_TIME'
+ Time- and date-related information, such as 12- or 24-hour clock,
+ month printed before or after the day in a date, local month
+ abbreviations, and so on.
+
+`LC_ALL'
+ All of the above. (Not too useful in the context of `gettext'.)
+
+ ---------- Footnotes ----------
+
+ (1) For some operating systems, the `gawk' port doesn't support GNU
+`gettext'. Therefore, these features are not available if you are
+using one of those operating systems. Sorry.
+
+ (2) Americans use a comma every three decimal places and a period
+for the decimal point, while many Europeans do exactly the opposite:
+1,234.56 versus 1.234,56.
+
+
+File: gawk.info, Node: Programmer i18n, Next: Translator i18n, Prev: Explaining gettext, Up: Internationalization
+
+12.3 Internationalizing `awk' Programs
+======================================
+
+`gawk' provides the following variables and functions for
+internationalization:
+
+`TEXTDOMAIN'
+ This variable indicates the application's text domain. For
+ compatibility with GNU `gettext', the default value is
+ `"messages"'.
+
+`_"your message here"'
+ String constants marked with a leading underscore are candidates
+ for translation at runtime. String constants without a leading
+ underscore are not translated.
+
+`dcgettext(STRING [, DOMAIN [, CATEGORY]])'
+ Return the translation of STRING in text domain DOMAIN for locale
+ category CATEGORY. The default value for DOMAIN is the current
+ value of `TEXTDOMAIN'. The default value for CATEGORY is
+ `"LC_MESSAGES"'.
+
+ If you supply a value for CATEGORY, it must be a string equal to
+ one of the known locale categories described in *note Explaining
+ gettext::. You must also supply a text domain. Use `TEXTDOMAIN'
+ if you want to use the current domain.
+
+ CAUTION: The order of arguments to the `awk' version of the
+ `dcgettext()' function is purposely different from the order
+ for the C version. The `awk' version's order was chosen to
+ be simple and to allow for reasonable `awk'-style default
+ arguments.
+
+`dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])'
+ Return the plural form used for NUMBER of the translation of
+ STRING1 and STRING2 in text domain DOMAIN for locale category
+ CATEGORY. STRING1 is the English singular variant of a message,
+ and STRING2 the English plural variant of the same message. The
+ default value for DOMAIN is the current value of `TEXTDOMAIN'.
+ The default value for CATEGORY is `"LC_MESSAGES"'.
+
+ The same remarks about argument order as for the `dcgettext()'
+ function apply.
+
+`bindtextdomain(DIRECTORY [, DOMAIN])'
+ Change the directory in which `gettext' looks for `.mo' files, in
+ case they will not or cannot be placed in the standard locations
+ (e.g., during testing). Return the directory in which DOMAIN is
+ "bound."
+
+ The default DOMAIN is the value of `TEXTDOMAIN'. If DIRECTORY is
+ the null string (`""'), then `bindtextdomain()' returns the
+ current binding for the given DOMAIN.
+
+ To use these facilities in your `awk' program, follow the steps
+outlined in *note Explaining gettext::, like so:
+
+ 1. Set the variable `TEXTDOMAIN' to the text domain of your program.
+ This is best done in a `BEGIN' rule (*note BEGIN/END::), or it can
+ also be done via the `-v' command-line option (*note Options::):
+
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ ...
+ }
+
+ 2. Mark all translatable strings with a leading underscore (`_')
+ character. It _must_ be adjacent to the opening quote of the
+ string. For example:
+
+ print _"hello, world"
+ x = _"you goofed"
+ printf(_"Number of users is %d\n", nusers)
+
+ 3. If you are creating strings dynamically, you can still translate
+ them, using the `dcgettext()' built-in function:
+
+ message = nusers " users logged in"
+ message = dcgettext(message, "adminprog")
+ print message
+
+ Here, the call to `dcgettext()' supplies a different text domain
+ (`"adminprog"') in which to find the message, but it uses the
+ default `"LC_MESSAGES"' category.
+
+ 4. During development, you might want to put the `.mo' file in a
+ private directory for testing. This is done with the
+ `bindtextdomain()' built-in function:
+
+ BEGIN {
+ TEXTDOMAIN = "guide" # our text domain
+ if (Testing) {
+ # where to find our files
+ bindtextdomain("testdir")
+ # joe is in charge of adminprog
+ bindtextdomain("../joe/testdir", "adminprog")
+ }
+ ...
+ }
+
+
+ *Note I18N Example::, for an example program showing the steps to
+create and use translations from `awk'.
+
+
+File: gawk.info, Node: Translator i18n, Next: I18N Example, Prev: Programmer i18n, Up: Internationalization
+
+12.4 Translating `awk' Programs
+===============================
+
+Once a program's translatable strings have been marked, they must be
+extracted to create the initial `.po' file. As part of translation, it
+is often helpful to rearrange the order in which arguments to `printf'
+are output.
+
+ `gawk''s `--gen-pot' command-line option extracts the messages and
+is discussed next. After that, `printf''s ability to rearrange the
+order for `printf' arguments at runtime is covered.
+
+* Menu:
+
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging `printf' arguments.
+* I18N Portability:: `awk'-level portability issues.
+
+
+File: gawk.info, Node: String Extraction, Next: Printf Ordering, Up: Translator i18n
+
+12.4.1 Extracting Marked Strings
+--------------------------------
+
+Once your `awk' program is working, and all the strings have been
+marked and you've set (and perhaps bound) the text domain, it is time
+to produce translations. First, use the `--gen-pot' command-line
+option to create the initial `.pot' file:
+
+ $ gawk --gen-pot -f guide.awk > guide.pot
+
+ When run with `--gen-pot', `gawk' does not execute your program.
+Instead, it parses it as usual and prints all marked strings to
+standard output in the format of a GNU `gettext' Portable Object file.
+Also included in the output are any constant strings that appear as the
+first argument to `dcgettext()' or as the first and second argument to
+`dcngettext()'.(1) *Note I18N Example::, for the full list of steps to
+go through to create and test translations for `guide'.
+
+ ---------- Footnotes ----------
+
+ (1) The `xgettext' utility that comes with GNU `gettext' can handle
+`.awk' files.
+
+
+File: gawk.info, Node: Printf Ordering, Next: I18N Portability, Prev: String Extraction, Up: Translator i18n
+
+12.4.2 Rearranging `printf' Arguments
+-------------------------------------
+
+Format strings for `printf' and `sprintf()' (*note Printf::) present a
+special problem for translation. Consider the following:(1)
+
+ printf(_"String `%s' has %d characters\n",
+ string, length(string)))
+
+ A possible German translation for this might be:
+
+ "%d Zeichen lang ist die Zeichenkette `%s'\n"
+
+ The problem should be obvious: the order of the format
+specifications is different from the original! Even though `gettext()'
+can return the translated string at runtime, it cannot change the
+argument order in the call to `printf'.
+
+ To solve this problem, `printf' format specifiers may have an
+additional optional element, which we call a "positional specifier".
+For example:
+
+ "%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
+
+ Here, the positional specifier consists of an integer count, which
+indicates which argument to use, and a `$'. Counts are one-based, and
+the format string itself is _not_ included. Thus, in the following
+example, `string' is the first argument and `length(string)' is the
+second:
+
+ $ gawk 'BEGIN {
+ > string = "Dont Panic"
+ > printf _"%2$d characters live in \"%1$s\"\n",
+ > string, length(string)
+ > }'
+ -| 10 characters live in "Dont Panic"
+
+ If present, positional specifiers come first in the format
+specification, before the flags, the field width, and/or the precision.
+
+ Positional specifiers can be used with the dynamic field width and
+precision capability:
+
+ $ gawk 'BEGIN {
+ > printf("%*.*s\n", 10, 20, "hello")
+ > printf("%3$*2$.*1$s\n", 20, 10, "hello")
+ > }'
+ -| hello
+ -| hello
+
+ NOTE: When using `*' with a positional specifier, the `*' comes
+ first, then the integer position, and then the `$'. This is
+ somewhat counterintuitive.
+
+ `gawk' does not allow you to mix regular format specifiers and those
+with positional specifiers in the same string:
+
+ $ gawk 'BEGIN { printf _"%d %3$s\n", 1, 2, "hi" }'
+ error--> gawk: cmd. line:1: fatal: must use `count$' on all formats or none
+
+ NOTE: There are some pathological cases that `gawk' may fail to
+ diagnose. In such cases, the output may not be what you expect.
+ It's still a bad idea to try mixing them, even if `gawk' doesn't
+ detect it.
+
+ Although positional specifiers can be used directly in `awk'
+programs, their primary purpose is to help in producing correct
+translations of format strings into languages different from the one in
+which the program is first written.
+
+ ---------- Footnotes ----------
+
+ (1) This example is borrowed from the GNU `gettext' manual.
+
+
+File: gawk.info, Node: I18N Portability, Prev: Printf Ordering, Up: Translator i18n
+
+12.4.3 `awk' Portability Issues
+-------------------------------
+
+`gawk''s internationalization features were purposely chosen to have as
+little impact as possible on the portability of `awk' programs that use
+them to other versions of `awk'. Consider this program:
+
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ if (Test_Guide) # set with -v
+ bindtextdomain("/test/guide/messages")
+ print _"don't panic!"
+ }
+
+As written, it won't work on other versions of `awk'. However, it is
+actually almost portable, requiring very little change:
+
+ * Assignments to `TEXTDOMAIN' won't have any effect, since
+ `TEXTDOMAIN' is not special in other `awk' implementations.
+
+ * Non-GNU versions of `awk' treat marked strings as the
+ concatenation of a variable named `_' with the string following
+ it.(1) Typically, the variable `_' has the null string (`""') as
+ its value, leaving the original string constant as the result.
+
+ * By defining "dummy" functions to replace `dcgettext()',
+ `dcngettext()' and `bindtextdomain()', the `awk' program can be
+ made to run, but all the messages are output in the original
+ language. For example:
+
+ function bindtextdomain(dir, domain)
+ {
+ return dir
+ }
+
+ function dcgettext(string, domain, category)
+ {
+ return string
+ }
+
+ function dcngettext(string1, string2, number, domain, category)
+ {
+ return (number == 1 ? string1 : string2)
+ }
+
+ * The use of positional specifications in `printf' or `sprintf()' is
+ _not_ portable. To support `gettext()' at the C level, many
+ systems' C versions of `sprintf()' do support positional
+ specifiers. But it works only if enough arguments are supplied in
+ the function call. Many versions of `awk' pass `printf' formats
+ and arguments unchanged to the underlying C library version of
+ `sprintf()', but only one format and argument at a time. What
+ happens if a positional specification is used is anybody's guess.
+ However, since the positional specifications are primarily for use
+ in _translated_ format strings, and since non-GNU `awk's never
+ retrieve the translated string, this should not be a problem in
+ practice.
+
+ ---------- Footnotes ----------
+
+ (1) This is good fodder for an "Obfuscated `awk'" contest.
+
+
+File: gawk.info, Node: I18N Example, Next: Gawk I18N, Prev: Translator i18n, Up: Internationalization
+
+12.5 A Simple Internationalization Example
+==========================================
+
+Now let's look at a step-by-step example of how to internationalize and
+localize a simple `awk' program, using `guide.awk' as our original
+source:
+
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ bindtextdomain(".") # for testing
+ print _"Don't Panic"
+ print _"The Answer Is", 42
+ print "Pardon me, Zaphod who?"
+ }
+
+Run `gawk --gen-pot' to create the `.pot' file:
+
+ $ gawk --gen-pot -f guide.awk > guide.pot
+
+This produces:
+
+ #: guide.awk:4
+ msgid "Don't Panic"
+ msgstr ""
+
+ #: guide.awk:5
+ msgid "The Answer Is"
+ msgstr ""
+
+ This original portable object template file is saved and reused for
+each language into which the application is translated. The `msgid' is
+the original string and the `msgstr' is the translation.
+
+ NOTE: Strings not marked with a leading underscore do not appear
+ in the `guide.pot' file.
+
+ Next, the messages must be translated. Here is a translation to a
+hypothetical dialect of English, called "Mellow":(1)
+
+ $ cp guide.pot guide-mellow.po
+ ADD TRANSLATIONS TO guide-mellow.po ...
+
+Following are the translations:
+
+ #: guide.awk:4
+ msgid "Don't Panic"
+ msgstr "Hey man, relax!"
+
+ #: guide.awk:5
+ msgid "The Answer Is"
+ msgstr "Like, the scoop is"
+
+ The next step is to make the directory to hold the binary message
+object file and then to create the `guide.mo' file. The directory
+layout shown here is standard for GNU `gettext' on GNU/Linux systems.
+Other versions of `gettext' may use a different layout:
+
+ $ mkdir en_US en_US/LC_MESSAGES
+
+ The `msgfmt' utility does the conversion from human-readable `.po'
+file to machine-readable `.mo' file. By default, `msgfmt' creates a
+file named `messages'. This file must be renamed and placed in the
+proper directory so that `gawk' can find it:
+
+ $ msgfmt guide-mellow.po
+ $ mv messages en_US/LC_MESSAGES/guide.mo
+
+ Finally, we run the program to test it:
+
+ $ gawk -f guide.awk
+ -| Hey man, relax!
+ -| Like, the scoop is 42
+ -| Pardon me, Zaphod who?
+
+ If the three replacement functions for `dcgettext()', `dcngettext()'
+and `bindtextdomain()' (*note I18N Portability::) are in a file named
+`libintl.awk', then we can run `guide.awk' unchanged as follows:
+
+ $ gawk --posix -f guide.awk -f libintl.awk
+ -| Don't Panic
+ -| The Answer Is 42
+ -| Pardon me, Zaphod who?
+
+ ---------- Footnotes ----------
+
+ (1) Perhaps it would be better if it were called "Hippy." Ah, well.
+
+
+File: gawk.info, Node: Gawk I18N, Prev: I18N Example, Up: Internationalization
+
+12.6 `gawk' Can Speak Your Language
+===================================
+
+`gawk' itself has been internationalized using the GNU `gettext'
+package. (GNU `gettext' is described in complete detail in *note (GNU
+`gettext' utilities)Top:: gettext, GNU gettext tools.) As of this
+writing, the latest version of GNU `gettext' is version 0.18.1
+(ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.1.tar.gz).
+
+ If a translation of `gawk''s messages exists, then `gawk' produces
+usage messages, warnings, and fatal errors in the local language.
+
+
+File: gawk.info, Node: Advanced Features, Next: Debugger, Prev: Internationalization, Up: Top
+
+13 Advanced Features of `gawk'
+******************************
+
+ Write documentation as if whoever reads it is a violent psychopath
+ who knows where you live.
+ Steve English, as quoted by Peter Langston
+
+ This major node discusses advanced features in `gawk'. It's a bit
+of a "grab bag" of items that are otherwise unrelated to each other.
+First, a command-line option allows `gawk' to recognize nondecimal
+numbers in input data, not just in `awk' programs. Then, `gawk''s
+special features for sorting arrays are presented. Next, two-way I/O,
+discussed briefly in earlier parts of this Info file, is described in
+full detail, along with the basics of TCP/IP networking. Finally,
+`gawk' can "profile" an `awk' program, making it possible to tune it
+for performance.
+
+ *note Dynamic Extensions::, discusses the ability to dynamically add
+new built-in functions to `gawk'. As this feature is still immature
+and likely to change, its description is relegated to an appendix.
+
+* Menu:
+
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array traversal and
+ sorting arrays.
+* Two-way I/O:: Two-way communications with another process.
+* TCP/IP Networking:: Using `gawk' for network programming.
+* Profiling:: Profiling your `awk' programs.
+
+
+File: gawk.info, Node: Nondecimal Data, Next: Array Sorting, Up: Advanced Features
+
+13.1 Allowing Nondecimal Input Data
+===================================
+
+If you run `gawk' with the `--non-decimal-data' option, you can have
+nondecimal constants in your input data:
+
+ $ echo 0123 123 0x123 |
+ > gawk --non-decimal-data '{ printf "%d, %d, %d\n",
+ > $1, $2, $3 }'
+ -| 83, 123, 291
+
+ For this feature to work, write your program so that `gawk' treats
+your data as numeric:
+
+ $ echo 0123 123 0x123 | gawk '{ print $1, $2, $3 }'
+ -| 0123 123 0x123
+
+The `print' statement treats its expressions as strings. Although the
+fields can act as numbers when necessary, they are still strings, so
+`print' does not try to treat them numerically. You may need to add
+zero to a field to force it to be treated as a number. For example:
+
+ $ echo 0123 123 0x123 | gawk --non-decimal-data '
+ > { print $1, $2, $3
+ > print $1 + 0, $2 + 0, $3 + 0 }'
+ -| 0123 123 0x123
+ -| 83 123 291
+
+ Because it is common to have decimal data with leading zeros, and
+because using this facility could lead to surprising results, the
+default is to leave it disabled. If you want it, you must explicitly
+request it.
+
+ CAUTION: _Use of this option is not recommended._ It can break old
+ programs very badly. Instead, use the `strtonum()' function to
+ convert your data (*note Nondecimal-numbers::). This makes your
+ programs easier to write and easier to read, and leads to less
+ surprising results.
+
+
+File: gawk.info, Node: Array Sorting, Next: Two-way I/O, Prev: Nondecimal Data, Up: Advanced Features
+
+13.2 Controlling Array Traversal and Array Sorting
+==================================================
+
+`gawk' lets you control the order in which a `for (i in array)' loop
+traverses an array.
+
+ In addition, two built-in functions, `asort()' and `asorti()', let
+you sort arrays based on the array values and indices, respectively.
+These two functions also provide control over the sorting criteria used
+to order the elements during sorting.
+
+* Menu:
+
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use `asort()' and `asorti()'.
+
+
+File: gawk.info, Node: Controlling Array Traversal, Next: Array Sorting Functions, Up: Array Sorting
+
+13.2.1 Controlling Array Traversal
+----------------------------------
+
+By default, the order in which a `for (i in array)' loop scans an array
+is not defined; it is generally based upon the internal implementation
+of arrays inside `awk'.
+
+ Often, though, it is desirable to be able to loop over the elements
+in a particular order that you, the programmer, choose. `gawk' lets
+you do this.
+
+ *note Controlling Scanning::, describes how you can assign special,
+pre-defined values to `PROCINFO["sorted_in"]' in order to control the
+order in which `gawk' will traverse an array during a `for' loop.
+
+ In addition, the value of `PROCINFO["sorted_in"]' can be a function
+name. This lets you traverse an array based on any custom criterion.
+The array elements are ordered according to the return value of this
+function. The comparison function should be defined with at least four
+arguments:
+
+ function comp_func(i1, v1, i2, v2)
+ {
+ COMPARE ELEMENTS 1 AND 2 IN SOME FASHION
+ RETURN < 0; 0; OR > 0
+ }
+
+ Here, I1 and I2 are the indices, and V1 and V2 are the corresponding
+values of the two elements being compared. Either V1 or V2, or both,
+can be arrays if the array being traversed contains subarrays as values.
+(*Note Arrays of Arrays::, for more information about subarrays.) The
+three possible return values are interpreted as follows:
+
+`comp_func(i1, v1, i2, v2) < 0'
+ Index I1 comes before index I2 during loop traversal.
+
+`comp_func(i1, v1, i2, v2) == 0'
+ Indices I1 and I2 come together but the relative order with
+ respect to each other is undefined.
+
+`comp_func(i1, v1, i2, v2) > 0'
+ Index I1 comes after index I2 during loop traversal.
+
+ Our first comparison function can be used to scan an array in
+numerical order of the indices:
+
+ function cmp_num_idx(i1, v1, i2, v2)
+ {
+ # numerical index comparison, ascending order
+ return (i1 - i2)
+ }
+
+ Our second function traverses an array based on the string order of
+the element values rather than by indices:
+
+ function cmp_str_val(i1, v1, i2, v2)
+ {
+ # string value comparison, ascending order
+ v1 = v1 ""
+ v2 = v2 ""
+ if (v1 < v2)
+ return -1
+ return (v1 != v2)
+ }
+
+ The third comparison function makes all numbers, and numeric strings
+without any leading or trailing spaces, come out first during loop
+traversal:
+
+ function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
+ {
+ # numbers before string value comparison, ascending order
+ n1 = v1 + 0
+ n2 = v2 + 0
+ if (n1 == v1)
+ return (n2 == v2) ? (n1 - n2) : -1
+ else if (n2 == v2)
+ return 1
+ return (v1 < v2) ? -1 : (v1 != v2)
+ }
+
+ Here is a main program to demonstrate how `gawk' behaves using each
+of the previous functions:
+
+ BEGIN {
+ data["one"] = 10
+ data["two"] = 20
+ data[10] = "one"
+ data[100] = 100
+ data[20] = "two"
+
+ f[1] = "cmp_num_idx"
+ f[2] = "cmp_str_val"
+ f[3] = "cmp_num_str_val"
+ for (i = 1; i <= 3; i++) {
+ printf("Sort function: %s\n", f[i])
+ PROCINFO["sorted_in"] = f[i]
+ for (j in data)
+ printf("\tdata[%s] = %s\n", j, data[j])
+ print ""
+ }
+ }
+
+ Here are the results when the program is run:
+
+ $ gawk -f compdemo.awk
+ -| Sort function: cmp_num_idx Sort by numeric index
+ -| data[two] = 20
+ -| data[one] = 10 Both strings are numerically zero
+ -| data[10] = one
+ -| data[20] = two
+ -| data[100] = 100
+ -|
+ -| Sort function: cmp_str_val Sort by element values as strings
+ -| data[one] = 10
+ -| data[100] = 100 String 100 is less than string 20
+ -| data[two] = 20
+ -| data[10] = one
+ -| data[20] = two
+ -|
+ -| Sort function: cmp_num_str_val Sort all numeric values before all strings
+ -| data[one] = 10
+ -| data[two] = 20
+ -| data[100] = 100
+ -| data[10] = one
+ -| data[20] = two
+
+ Consider sorting the entries of a GNU/Linux system password file
+according to login name. The following program sorts records by a
+specific field position and can be used for this purpose:
+
+ # sort.awk --- simple program to sort by field position
+ # field position is specified by the global variable POS
+
+ function cmp_field(i1, v1, i2, v2)
+ {
+ # comparison by value, as string, and ascending order
+ return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
+ }
+
+ {
+ for (i = 1; i <= NF; i++)
+ a[NR][i] = $i
+ }
+
+ END {
+ PROCINFO["sorted_in"] = "cmp_field"
+ if (POS < 1 || POS > NF)
+ POS = 1
+ for (i in a) {
+ for (j = 1; j <= NF; j++)
+ printf("%s%c", a[i][j], j < NF ? ":" : "")
+ print ""
+ }
+ }
+
+ The first field in each entry of the password file is the user's
+login name, and the fields are separated by colons. Each record
+defines a subarray, with each field as an element in the subarray.
+Running the program produces the following output:
+
+ $ gawk -v POS=1 -F: -f sort.awk /etc/passwd
+ -| adm:x:3:4:adm:/var/adm:/sbin/nologin
+ -| apache:x:48:48:Apache:/var/www:/sbin/nologin
+ -| avahi:x:70:70:Avahi daemon:/:/sbin/nologin
+ ...
+
+ The comparison should normally always return the same value when
+given a specific pair of array elements as its arguments. If
+inconsistent results are returned then the order is undefined. This
+behavior can be exploited to introduce random order into otherwise
+seemingly ordered data:
+
+ function cmp_randomize(i1, v1, i2, v2)
+ {
+ # random order
+ return (2 - 4 * rand())
+ }
+
+ As mentioned above, the order of the indices is arbitrary if two
+elements compare equal. This is usually not a problem, but letting the
+tied elements come out in arbitrary order can be an issue, especially
+when comparing item values. The partial ordering of the equal elements
+may change during the next loop traversal, if other elements are added
+or removed from the array. One way to resolve ties when comparing
+elements with otherwise equal values is to include the indices in the
+comparison rules. Note that doing this may make the loop traversal
+less efficient, so consider it only if necessary. The following
+comparison functions force a deterministic order, and are based on the
+fact that the indices of two elements are never equal:
+
+ function cmp_numeric(i1, v1, i2, v2)
+ {
+ # numerical value (and index) comparison, descending order
+ return (v1 != v2) ? (v2 - v1) : (i2 - i1)
+ }
+
+ function cmp_string(i1, v1, i2, v2)
+ {
+ # string value (and index) comparison, descending order
+ v1 = v1 i1
+ v2 = v2 i2
+ return (v1 > v2) ? -1 : (v1 != v2)
+ }
+
+ A custom comparison function can often simplify ordered loop
+traversal, and the sky is really the limit when it comes to designing
+such a function.
+
+ When string comparisons are made during a sort, either for element
+values where one or both aren't numbers, or for element indices handled
+as strings, the value of `IGNORECASE' (*note Built-in Variables::)
+controls whether the comparisons treat corresponding uppercase and
+lowercase letters as equivalent or distinct.
+
+ Another point to keep in mind is that in the case of subarrays the
+element values can themselves be arrays; a production comparison
+function should use the `isarray()' function (*note Type Functions::),
+to check for this, and choose a defined sorting order for subarrays.
+
+ All sorting based on `PROCINFO["sorted_in"]' is disabled in POSIX
+mode, since the `PROCINFO' array is not special in that case.
+
+ As a side note, sorting the array indices before traversing the
+array has been reported to add 15% to 20% overhead to the execution
+time of `awk' programs. For this reason, sorted array traversal is not
+the default.
+
+
+File: gawk.info, Node: Array Sorting Functions, Prev: Controlling Array Traversal, Up: Array Sorting
+
+13.2.2 Sorting Array Values and Indices with `gawk'
+---------------------------------------------------
+
+In most `awk' implementations, sorting an array requires writing a
+`sort()' function. While this can be educational for exploring
+different sorting algorithms, usually that's not the point of the
+program. `gawk' provides the built-in `asort()' and `asorti()'
+functions (*note String Functions::) for sorting arrays. For example:
+
+ POPULATE THE ARRAY data
+ n = asort(data)
+ for (i = 1; i <= n; i++)
+ DO SOMETHING WITH data[i]
+
+ After the call to `asort()', the array `data' is indexed from 1 to
+some number N, the total number of elements in `data'. (This count is
+`asort()''s return value.) `data[1]' <= `data[2]' <= `data[3]', and so
+on. The comparison is based on the type of the elements (*note Typing
+and Comparison::). All numeric values come before all string values,
+which in turn come before all subarrays.
+
+ An important side effect of calling `asort()' is that _the array's
+original indices are irrevocably lost_. As this isn't always
+desirable, `asort()' accepts a second argument:
+
+ POPULATE THE ARRAY source
+ n = asort(source, dest)
+ for (i = 1; i <= n; i++)
+ DO SOMETHING WITH dest[i]
+
+ In this case, `gawk' copies the `source' array into the `dest' array
+and then sorts `dest', destroying its indices. However, the `source'
+array is not affected.
+
+ `asort()' accepts a third string argument to control comparison of
+array elements. As with `PROCINFO["sorted_in"]', this argument may be
+one of the predefined names that `gawk' provides (*note Controlling
+Scanning::), or the name of a user-defined function (*note Controlling
+Array Traversal::).
+
+ NOTE: In all cases, the sorted element values consist of the
+ original array's element values. The ability to control
+ comparison merely affects the way in which they are sorted.
+
+ Often, what's needed is to sort on the values of the _indices_
+instead of the values of the elements. To do that, use the `asorti()'
+function. The interface is identical to that of `asort()', except that
+the index values are used for sorting, and become the values of the
+result array:
+
+ { source[$0] = some_func($0) }
+
+ END {
+ n = asorti(source, dest)
+ for (i = 1; i <= n; i++) {
+ Work with sorted indices directly:
+ DO SOMETHING WITH dest[i]
+ ...
+ Access original array via sorted indices:
+ DO SOMETHING WITH source[dest[i]]
+ }
+ }
+
+ Similar to `asort()', in all cases, the sorted element values
+consist of the original array's indices. The ability to control
+comparison merely affects the way in which they are sorted.
+
+ Sorting the array by replacing the indices provides maximal
+flexibility. To traverse the elements in decreasing order, use a loop
+that goes from N down to 1, either over the elements or over the
+indices.(1)
+
+ Copying array indices and elements isn't expensive in terms of
+memory. Internally, `gawk' maintains "reference counts" to data. For
+example, when `asort()' copies the first array to the second one, there
+is only one copy of the original array elements' data, even though both
+arrays use the values.
+
+ Because `IGNORECASE' affects string comparisons, the value of
+`IGNORECASE' also affects sorting for both `asort()' and `asorti()'.
+Note also that the locale's sorting order does _not_ come into play;
+comparisons are based on character values only.(2) Caveat Emptor.
+
+ ---------- Footnotes ----------
+
+ (1) You may also use one of the predefined sorting names that sorts
+in decreasing order.
+
+ (2) This is true because locale-based comparison occurs only when in
+POSIX compatibility mode, and since `asort()' and `asorti()' are `gawk'
+extensions, they are not available in that case.
+
+
+File: gawk.info, Node: Two-way I/O, Next: TCP/IP Networking, Prev: Array Sorting, Up: Advanced Features
+
+13.3 Two-Way Communications with Another Process
+================================================
+
+ From: brennan@whidbey.com (Mike Brennan)
+ Newsgroups: comp.lang.awk
+ Subject: Re: Learn the SECRET to Attract Women Easily
+ Date: 4 Aug 1997 17:34:46 GMT
+ Message-ID: <5s53rm$eca@news.whidbey.com>
+
+ On 3 Aug 1997 13:17:43 GMT, Want More Dates???
+ <tracy78@kilgrona.com> wrote:
+ >Learn the SECRET to Attract Women Easily
+ >
+ >The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
+
+ The scent of awk programmers is a lot more attractive to women than
+ the scent of perl programmers.
+ --
+ Mike Brennan
+
+ It is often useful to be able to send data to a separate program for
+processing and then read the result. This can always be done with
+temporary files:
+
+ # Write the data for processing
+ tempfile = ("mydata." PROCINFO["pid"])
+ while (NOT DONE WITH DATA)
+ print DATA | ("subprogram > " tempfile)
+ close("subprogram > " tempfile)
+
+ # Read the results, remove tempfile when done
+ while ((getline newdata < tempfile) > 0)
+ PROCESS newdata APPROPRIATELY
+ close(tempfile)
+ system("rm " tempfile)
+
+This works, but not elegantly. Among other things, it requires that
+the program be run in a directory that cannot be shared among users;
+for example, `/tmp' will not do, as another user might happen to be
+using a temporary file with the same name.
+
+ However, with `gawk', it is possible to open a _two-way_ pipe to
+another process. The second process is termed a "coprocess", since it
+runs in parallel with `gawk'. The two-way connection is created using
+the `|&' operator (borrowed from the Korn shell, `ksh'):(1)
+
+ do {
+ print DATA |& "subprogram"
+ "subprogram" |& getline results
+ } while (DATA LEFT TO PROCESS)
+ close("subprogram")
+
+ The first time an I/O operation is executed using the `|&' operator,
+`gawk' creates a two-way pipeline to a child process that runs the
+other program. Output created with `print' or `printf' is written to
+the program's standard input, and output from the program's standard
+output can be read by the `gawk' program using `getline'. As is the
+case with processes started by `|', the subprogram can be any program,
+or pipeline of programs, that can be started by the shell.
+
+ There are some cautionary items to be aware of:
+
+ * As the code inside `gawk' currently stands, the coprocess's
+ standard error goes to the same place that the parent `gawk''s
+ standard error goes. It is not possible to read the child's
+ standard error separately.
+
+ * I/O buffering may be a problem. `gawk' automatically flushes all
+ output down the pipe to the coprocess. However, if the coprocess
+ does not flush its output, `gawk' may hang when doing a `getline'
+ in order to read the coprocess's results. This could lead to a
+ situation known as "deadlock", where each process is waiting for
+ the other one to do something.
+
+ It is possible to close just one end of the two-way pipe to a
+coprocess, by supplying a second argument to the `close()' function of
+either `"to"' or `"from"' (*note Close Files And Pipes::). These
+strings tell `gawk' to close the end of the pipe that sends data to the
+coprocess or the end that reads from it, respectively.
+
+ This is particularly necessary in order to use the system `sort'
+utility as part of a coprocess; `sort' must read _all_ of its input
+data before it can produce any output. The `sort' program does not
+receive an end-of-file indication until `gawk' closes the write end of
+the pipe.
+
+ When you have finished writing data to the `sort' utility, you can
+close the `"to"' end of the pipe, and then start reading sorted data
+via `getline'. For example:
+
+ BEGIN {
+ command = "LC_ALL=C sort"
+ n = split("abcdefghijklmnopqrstuvwxyz", a, "")
+
+ for (i = n; i > 0; i--)
+ print a[i] |& command
+ close(command, "to")
+
+ while ((command |& getline line) > 0)
+ print "got", line
+ close(command)
+ }
+
+ This program writes the letters of the alphabet in reverse order, one
+per line, down the two-way pipe to `sort'. It then closes the write
+end of the pipe, so that `sort' receives an end-of-file indication.
+This causes `sort' to sort the data and write the sorted data back to
+the `gawk' program. Once all of the data has been read, `gawk'
+terminates the coprocess and exits.
+
+ As a side note, the assignment `LC_ALL=C' in the `sort' command
+ensures traditional Unix (ASCII) sorting from `sort'.
+
+ You may also use pseudo-ttys (ptys) for two-way communication
+instead of pipes, if your system supports them. This is done on a
+per-command basis, by setting a special element in the `PROCINFO' array
+(*note Auto-set::), like so:
+
+ command = "sort -nr" # command, save in convenience variable
+ PROCINFO[command, "pty"] = 1 # update PROCINFO
+ print ... |& command # start two-way pipe
+ ...
+
+Using ptys avoids the buffer deadlock issues described earlier, at some
+loss in performance. If your system does not have ptys, or if all the
+system's ptys are in use, `gawk' automatically falls back to using
+regular pipes.
+
+ ---------- Footnotes ----------
+
+ (1) This is very different from the same operator in the C shell.
+
+
+File: gawk.info, Node: TCP/IP Networking, Next: Profiling, Prev: Two-way I/O, Up: Advanced Features
+
+13.4 Using `gawk' for Network Programming
+=========================================
+
+ `EMISTERED':
+ A host is a host from coast to coast,
+ and no-one can talk to host that's close,
+ unless the host that isn't close
+ is busy hung or dead.
+
+ In addition to being able to open a two-way pipeline to a coprocess
+on the same system (*note Two-way I/O::), it is possible to make a
+two-way connection to another process on another system across an IP
+network connection.
+
+ You can think of this as just a _very long_ two-way pipeline to a
+coprocess. The way `gawk' decides that you want to use TCP/IP
+networking is by recognizing special file names that begin with one of
+`/inet/', `/inet4/' or `/inet6'.
+
+ The full syntax of the special file name is
+`/NET-TYPE/PROTOCOL/LOCAL-PORT/REMOTE-HOST/REMOTE-PORT'. The
+components are:
+
+NET-TYPE
+ Specifies the kind of Internet connection to make. Use `/inet4/'
+ to force IPv4, and `/inet6/' to force IPv6. Plain `/inet/' (which
+ used to be the only option) uses the system default, most likely
+ IPv4.
+
+PROTOCOL
+ The protocol to use over IP. This must be either `tcp', or `udp',
+ for a TCP or UDP IP connection, respectively. The use of TCP is
+ recommended for most applications.
+
+LOCAL-PORT
+ The local TCP or UDP port number to use. Use a port number of `0'
+ when you want the system to pick a port. This is what you should do
+ when writing a TCP or UDP client. You may also use a well-known
+ service name, such as `smtp' or `http', in which case `gawk'
+ attempts to determine the predefined port number using the C
+ `getaddrinfo()' function.
+
+REMOTE-HOST
+ The IP address or fully-qualified domain name of the Internet host
+ to which you want to connect.
+
+REMOTE-PORT
+ The TCP or UDP port number to use on the given REMOTE-HOST.
+ Again, use `0' if you don't care, or else a well-known service
+ name.
+
+ NOTE: Failure in opening a two-way socket will result in a
+ non-fatal error being returned to the calling code. The value of
+ `ERRNO' indicates the error (*note Auto-set::).
+
+ Consider the following very simple example:
+
+ BEGIN {
+ Service = "/inet/tcp/0/localhost/daytime"
+ Service |& getline
+ print $0
+ close(Service)
+ }
+
+ This program reads the current date and time from the local system's
+TCP `daytime' server. It then prints the results and closes the
+connection.
+
+ Because this topic is extensive, the use of `gawk' for TCP/IP
+programming is documented separately. See *note (General
+Introduction)Top:: gawkinet, TCP/IP Internetworking with `gawk', for a
+much more complete introduction and discussion, as well as extensive
+examples.
+
+
+File: gawk.info, Node: Profiling, Prev: TCP/IP Networking, Up: Advanced Features
+
+13.5 Profiling Your `awk' Programs
+==================================
+
+You may produce execution traces of your `awk' programs. This is done
+by passing the option `--profile' to `gawk'. When `gawk' has finished
+running, it creates a profile of your program in a file named
+`awkprof.out'. Because it is profiling, it also executes up to 45%
+slower than `gawk' normally does.
+
+ As shown in the following example, the `--profile' option can be
+used to change the name of the file where `gawk' will write the profile:
+
+ gawk --profile=myprog.prof -f myprog.awk data1 data2
+
+In the above example, `gawk' places the profile in `myprog.prof'
+instead of in `awkprof.out'.
+
+ Here is a sample session showing a simple `awk' program, its input
+data, and the results from running `gawk' with the `--profile' option.
+First, the `awk' program:
+
+ BEGIN { print "First BEGIN rule" }
+
+ END { print "First END rule" }
+
+ /foo/ {
+ print "matched /foo/, gosh"
+ for (i = 1; i <= 3; i++)
+ sing()
+ }
+
+ {
+ if (/foo/)
+ print "if is true"
+ else
+ print "else is true"
+ }
+
+ BEGIN { print "Second BEGIN rule" }
+
+ END { print "Second END rule" }
+
+ function sing( dummy)
+ {
+ print "I gotta be me!"
+ }
+
+ Following is the input data:
+
+ foo
+ bar
+ baz
+ foo
+ junk
+
+ Here is the `awkprof.out' that results from running the `gawk'
+profiler on this program and data (this example also illustrates that
+`awk' programmers sometimes have to work late):
+
+ # gawk profile, created Sun Aug 13 00:00:15 2000
+
+ # BEGIN block(s)
+
+ BEGIN {
+ 1 print "First BEGIN rule"
+ 1 print "Second BEGIN rule"
+ }
+
+ # Rule(s)
+
+ 5 /foo/ { # 2
+ 2 print "matched /foo/, gosh"
+ 6 for (i = 1; i <= 3; i++) {
+ 6 sing()
+ }
+ }
+
+ 5 {
+ 5 if (/foo/) { # 2
+ 2 print "if is true"
+ 3 } else {
+ 3 print "else is true"
+ }
+ }
+
+ # END block(s)
+
+ END {
+ 1 print "First END rule"
+ 1 print "Second END rule"
+ }
+
+ # Functions, listed alphabetically
+
+ 6 function sing(dummy)
+ {
+ 6 print "I gotta be me!"
+ }
+
+ This example illustrates many of the basic features of profiling
+output. They are as follows:
+
+ * The program is printed in the order `BEGIN' rule, `BEGINFILE' rule,
+ pattern/action rules, `ENDFILE' rule, `END' rule and functions,
+ listed alphabetically. Multiple `BEGIN' and `END' rules are
+ merged together, as are multiple `BEGINFILE' and `ENDFILE' rules.
+
+ * Pattern-action rules have two counts. The first count, to the
+ left of the rule, shows how many times the rule's pattern was
+ _tested_. The second count, to the right of the rule's opening
+ left brace in a comment, shows how many times the rule's action
+ was _executed_. The difference between the two indicates how many
+ times the rule's pattern evaluated to false.
+
+ * Similarly, the count for an `if'-`else' statement shows how many
+ times the condition was tested. To the right of the opening left
+ brace for the `if''s body is a count showing how many times the
+ condition was true. The count for the `else' indicates how many
+ times the test failed.
+
+ * The count for a loop header (such as `for' or `while') shows how
+ many times the loop test was executed. (Because of this, you
+ can't just look at the count on the first statement in a rule to
+ determine how many times the rule was executed. If the first
+ statement is a loop, the count is misleading.)
+
+ * For user-defined functions, the count next to the `function'
+ keyword indicates how many times the function was called. The
+ counts next to the statements in the body show how many times
+ those statements were executed.
+
+ * The layout uses "K&R" style with TABs. Braces are used
+ everywhere, even when the body of an `if', `else', or loop is only
+ a single statement.
+
+ * Parentheses are used only where needed, as indicated by the
+ structure of the program and the precedence rules. For example,
+ `(3 + 5) * 4' means add three plus five, then multiply the total
+ by four. However, `3 + 5 * 4' has no parentheses, and means `3 +
+ (5 * 4)'.
+
+ * Parentheses are used around the arguments to `print' and `printf'
+ only when the `print' or `printf' statement is followed by a
+ redirection. Similarly, if the target of a redirection isn't a
+ scalar, it gets parenthesized.
+
+ * `gawk' supplies leading comments in front of the `BEGIN' and `END'
+ rules, the pattern/action rules, and the functions.
+
+
+ The profiled version of your program may not look exactly like what
+you typed when you wrote it. This is because `gawk' creates the
+profiled version by "pretty printing" its internal representation of
+the program. The advantage to this is that `gawk' can produce a
+standard representation. The disadvantage is that all source-code
+comments are lost, as are the distinctions among multiple `BEGIN',
+`END', `BEGINFILE', and `ENDFILE' rules. Also, things such as:
+
+ /foo/
+
+come out as:
+
+ /foo/ {
+ print $0
+ }
+
+which is correct, but possibly surprising.
+
+ Besides creating profiles when a program has completed, `gawk' can
+produce a profile while it is running. This is useful if your `awk'
+program goes into an infinite loop and you want to see what has been
+executed. To use this feature, run `gawk' with the `--profile' option
+in the background:
+
+ $ gawk --profile -f myprog &
+ [1] 13992
+
+The shell prints a job number and process ID number; in this case,
+13992. Use the `kill' command to send the `USR1' signal to `gawk':
+
+ $ kill -USR1 13992
+
+As usual, the profiled version of the program is written to
+`awkprof.out', or to a different file if one specified with the
+`--profile' option.
+
+ Along with the regular profile, as shown earlier, the profile
+includes a trace of any active functions:
+
+ # Function Call Stack:
+
+ # 3. baz
+ # 2. bar
+ # 1. foo
+ # -- main --
+
+ You may send `gawk' the `USR1' signal as many times as you like.
+Each time, the profile and function call trace are appended to the
+output profile file.
+
+ If you use the `HUP' signal instead of the `USR1' signal, `gawk'
+produces the profile and the function call trace and then exits.
+
+ When `gawk' runs on MS-Windows systems, it uses the `INT' and `QUIT'
+signals for producing the profile and, in the case of the `INT' signal,
+`gawk' exits. This is because these systems don't support the `kill'
+command, so the only signals you can deliver to a program are those
+generated by the keyboard. The `INT' signal is generated by the
+`Ctrl-<C>' or `Ctrl-<BREAK>' key, while the `QUIT' signal is generated
+by the `Ctrl-<\>' key.
+
+ Finally, `gawk' also accepts another option `--pretty-print'. When
+called this way, `gawk' "pretty prints" the program into `awkprof.out',
+without any execution counts.
+
+
+File: gawk.info, Node: Debugger, Next: Arbitrary Precision Arithmetic, Prev: Advanced Features, Up: Top
14 Debugging `awk' Programs
***************************
@@ -29489,8 +29510,8 @@ Index
* break debugger command: Breakpoint Control. (line 11)
* break statement: Break Statement. (line 6)
* Brennan, Michael <1>: Other Versions. (line 6)
-* Brennan, Michael <2>: Simple Sed. (line 25)
-* Brennan, Michael <3>: Two-way I/O. (line 6)
+* Brennan, Michael <2>: Two-way I/O. (line 6)
+* Brennan, Michael <3>: Simple Sed. (line 25)
* Brennan, Michael: Delete. (line 56)
* Brian Kernighan's awk, extensions <1>: Other Versions. (line 13)
* Brian Kernighan's awk, extensions: BTL. (line 6)
@@ -31042,10 +31063,10 @@ Index
* private variables: Library Names. (line 11)
* processes, two-way communications with: Two-way I/O. (line 23)
* processing data: Basic High Level. (line 6)
-* PROCINFO array <1>: Id Program. (line 15)
-* PROCINFO array <2>: Group Functions. (line 6)
-* PROCINFO array <3>: Passwd Functions. (line 6)
-* PROCINFO array <4>: Two-way I/O. (line 116)
+* PROCINFO array <1>: Two-way I/O. (line 116)
+* PROCINFO array <2>: Id Program. (line 15)
+* PROCINFO array <3>: Group Functions. (line 6)
+* PROCINFO array <4>: Passwd Functions. (line 6)
* PROCINFO array <5>: Time Functions. (line 46)
* PROCINFO array <6>: Auto-set. (line 130)
* PROCINFO array: Obsolete. (line 11)
@@ -31672,507 +31693,507 @@ Node: History47786
Node: Names50177
Ref: Names-Footnote-151654
Node: This Manual51726
-Ref: This Manual-Footnote-156854
-Node: Conventions56954
-Node: Manual History59088
-Ref: Manual History-Footnote-162358
-Ref: Manual History-Footnote-262399
-Node: How To Contribute62473
-Node: Acknowledgments63617
-Node: Getting Started68113
-Node: Running gawk70492
-Node: One-shot71678
-Node: Read Terminal72903
-Ref: Read Terminal-Footnote-174553
-Ref: Read Terminal-Footnote-274829
-Node: Long75000
-Node: Executable Scripts76376
-Ref: Executable Scripts-Footnote-178245
-Ref: Executable Scripts-Footnote-278347
-Node: Comments78894
-Node: Quoting81361
-Node: DOS Quoting85984
-Node: Sample Data Files86659
-Node: Very Simple89691
-Node: Two Rules94290
-Node: More Complex96437
-Ref: More Complex-Footnote-199367
-Node: Statements/Lines99452
-Ref: Statements/Lines-Footnote-1103914
-Node: Other Features104179
-Node: When105107
-Node: Invoking Gawk107254
-Node: Command Line108715
-Node: Options109498
-Ref: Options-Footnote-1124896
-Node: Other Arguments124921
-Node: Naming Standard Input127579
-Node: Environment Variables128673
-Node: AWKPATH Variable129231
-Ref: AWKPATH Variable-Footnote-1131989
-Node: AWKLIBPATH Variable132249
-Node: Other Environment Variables132846
-Node: Exit Status135341
-Node: Include Files136016
-Node: Loading Shared Libraries139585
-Node: Obsolete140810
-Node: Undocumented141507
-Node: Regexp141750
-Node: Regexp Usage143139
-Node: Escape Sequences145165
-Node: Regexp Operators150928
-Ref: Regexp Operators-Footnote-1158308
-Ref: Regexp Operators-Footnote-2158455
-Node: Bracket Expressions158553
-Ref: table-char-classes160443
-Node: GNU Regexp Operators162966
-Node: Case-sensitivity166689
-Ref: Case-sensitivity-Footnote-1169657
-Ref: Case-sensitivity-Footnote-2169892
-Node: Leftmost Longest170000
-Node: Computed Regexps171201
-Node: Reading Files174611
-Node: Records176614
-Ref: Records-Footnote-1185538
-Node: Fields185575
-Ref: Fields-Footnote-1188608
-Node: Nonconstant Fields188694
-Node: Changing Fields190896
-Node: Field Separators196877
-Node: Default Field Splitting199506
-Node: Regexp Field Splitting200623
-Node: Single Character Fields203965
-Node: Command Line Field Separator205024
-Node: Field Splitting Summary208465
-Ref: Field Splitting Summary-Footnote-1211657
-Node: Constant Size211758
-Node: Splitting By Content216342
-Ref: Splitting By Content-Footnote-1220068
-Node: Multiple Line220108
-Ref: Multiple Line-Footnote-1225955
-Node: Getline226134
-Node: Plain Getline228350
-Node: Getline/Variable230439
-Node: Getline/File231580
-Node: Getline/Variable/File232902
-Ref: Getline/Variable/File-Footnote-1234501
-Node: Getline/Pipe234588
-Node: Getline/Variable/Pipe237148
-Node: Getline/Coprocess238255
-Node: Getline/Variable/Coprocess239498
-Node: Getline Notes240212
-Node: Getline Summary242999
-Ref: table-getline-variants243407
-Node: Read Timeout244263
-Ref: Read Timeout-Footnote-1248008
-Node: Command line directories248065
-Node: Printing248695
-Node: Print250326
-Node: Print Examples251663
-Node: Output Separators254447
-Node: OFMT256207
-Node: Printf257565
-Node: Basic Printf258471
-Node: Control Letters260010
-Node: Format Modifiers263822
-Node: Printf Examples269831
-Node: Redirection272546
-Node: Special Files279530
-Node: Special FD280063
-Ref: Special FD-Footnote-1283688
-Node: Special Network283762
-Node: Special Caveats284612
-Node: Close Files And Pipes285408
-Ref: Close Files And Pipes-Footnote-1292431
-Ref: Close Files And Pipes-Footnote-2292579
-Node: Expressions292729
-Node: Values293861
-Node: Constants294537
-Node: Scalar Constants295217
-Ref: Scalar Constants-Footnote-1296076
-Node: Nondecimal-numbers296258
-Node: Regexp Constants299317
-Node: Using Constant Regexps299792
-Node: Variables302847
-Node: Using Variables303502
-Node: Assignment Options305226
-Node: Conversion307098
-Ref: table-locale-affects312474
-Ref: Conversion-Footnote-1313098
-Node: All Operators313207
-Node: Arithmetic Ops313837
-Node: Concatenation316342
-Ref: Concatenation-Footnote-1319135
-Node: Assignment Ops319255
-Ref: table-assign-ops324243
-Node: Increment Ops325651
-Node: Truth Values and Conditions329121
-Node: Truth Values330204
-Node: Typing and Comparison331253
-Node: Variable Typing332042
-Ref: Variable Typing-Footnote-1335939
-Node: Comparison Operators336061
-Ref: table-relational-ops336471
-Node: POSIX String Comparison340020
-Ref: POSIX String Comparison-Footnote-1340976
-Node: Boolean Ops341114
-Ref: Boolean Ops-Footnote-1345192
-Node: Conditional Exp345283
-Node: Function Calls347015
-Node: Precedence350609
-Node: Locales354278
-Node: Patterns and Actions355367
-Node: Pattern Overview356421
-Node: Regexp Patterns358090
-Node: Expression Patterns358633
-Node: Ranges362318
-Node: BEGIN/END365284
-Node: Using BEGIN/END366046
-Ref: Using BEGIN/END-Footnote-1368777
-Node: I/O And BEGIN/END368883
-Node: BEGINFILE/ENDFILE371165
-Node: Empty374069
-Node: Using Shell Variables374385
-Node: Action Overview376670
-Node: Statements379027
-Node: If Statement380881
-Node: While Statement382380
-Node: Do Statement384424
-Node: For Statement385580
-Node: Switch Statement388732
-Node: Break Statement390829
-Node: Continue Statement392819
-Node: Next Statement394612
-Node: Nextfile Statement397002
-Node: Exit Statement399643
-Node: Built-in Variables402059
-Node: User-modified403154
-Ref: User-modified-Footnote-1411509
-Node: Auto-set411571
-Ref: Auto-set-Footnote-1423922
-Ref: Auto-set-Footnote-2424127
-Node: ARGC and ARGV424183
-Node: Arrays428034
-Node: Array Basics429539
-Node: Array Intro430365
-Node: Reference to Elements434683
-Node: Assigning Elements436953
-Node: Array Example437444
-Node: Scanning an Array439176
-Node: Controlling Scanning441490
-Ref: Controlling Scanning-Footnote-1446423
-Node: Delete446739
-Ref: Delete-Footnote-1449504
-Node: Numeric Array Subscripts449561
-Node: Uninitialized Subscripts451744
-Node: Multi-dimensional453372
-Node: Multi-scanning456466
-Node: Arrays of Arrays458057
-Node: Functions462702
-Node: Built-in463524
-Node: Calling Built-in464602
-Node: Numeric Functions466590
-Ref: Numeric Functions-Footnote-1470422
-Ref: Numeric Functions-Footnote-2470779
-Ref: Numeric Functions-Footnote-3470827
-Node: String Functions471096
-Ref: String Functions-Footnote-1494593
-Ref: String Functions-Footnote-2494722
-Ref: String Functions-Footnote-3494970
-Node: Gory Details495057
-Ref: table-sub-escapes496736
-Ref: table-sub-posix-92498090
-Ref: table-sub-proposed499433
-Ref: table-posix-sub500783
-Ref: table-gensub-escapes502329
-Ref: Gory Details-Footnote-1503536
-Ref: Gory Details-Footnote-2503587
-Node: I/O Functions503738
-Ref: I/O Functions-Footnote-1510393
-Node: Time Functions510540
-Ref: Time Functions-Footnote-1521432
-Ref: Time Functions-Footnote-2521500
-Ref: Time Functions-Footnote-3521658
-Ref: Time Functions-Footnote-4521769
-Ref: Time Functions-Footnote-5521881
-Ref: Time Functions-Footnote-6522108
-Node: Bitwise Functions522374
-Ref: table-bitwise-ops522932
-Ref: Bitwise Functions-Footnote-1527153
-Node: Type Functions527337
-Node: I18N Functions527807
-Node: User-defined529434
-Node: Definition Syntax530238
-Ref: Definition Syntax-Footnote-1535148
-Node: Function Example535217
-Node: Function Caveats537811
-Node: Calling A Function538232
-Node: Variable Scope539347
-Node: Pass By Value/Reference541322
-Node: Return Statement544762
-Node: Dynamic Typing547743
-Node: Indirect Calls548478
-Node: Internationalization558163
-Node: I18N and L10N559589
-Node: Explaining gettext560275
-Ref: Explaining gettext-Footnote-1565341
-Ref: Explaining gettext-Footnote-2565525
-Node: Programmer i18n565690
-Node: Translator i18n569890
-Node: String Extraction570683
-Ref: String Extraction-Footnote-1571644
-Node: Printf Ordering571730
-Ref: Printf Ordering-Footnote-1574514
-Node: I18N Portability574578
-Ref: I18N Portability-Footnote-1577027
-Node: I18N Example577090
-Ref: I18N Example-Footnote-1579725
-Node: Gawk I18N579797
-Node: Advanced Features580414
-Node: Nondecimal Data581927
-Node: Array Sorting583510
-Node: Controlling Array Traversal584207
-Node: Array Sorting Functions592445
-Ref: Array Sorting Functions-Footnote-1596119
-Ref: Array Sorting Functions-Footnote-2596212
-Node: Two-way I/O596406
-Ref: Two-way I/O-Footnote-1601838
-Node: TCP/IP Networking601908
-Node: Profiling604752
-Node: Library Functions612206
-Ref: Library Functions-Footnote-1615213
-Node: Library Names615384
-Ref: Library Names-Footnote-1618855
-Ref: Library Names-Footnote-2619075
-Node: General Functions619161
-Node: Strtonum Function620114
-Node: Assert Function623044
-Node: Round Function626370
-Node: Cliff Random Function627913
-Node: Ordinal Functions628929
-Ref: Ordinal Functions-Footnote-1631999
-Ref: Ordinal Functions-Footnote-2632251
-Node: Join Function632460
-Ref: Join Function-Footnote-1634231
-Node: Getlocaltime Function634431
-Node: Data File Management638146
-Node: Filetrans Function638778
-Node: Rewind Function642917
-Node: File Checking644304
-Node: Empty Files645398
-Node: Ignoring Assigns647628
-Node: Getopt Function649181
-Ref: Getopt Function-Footnote-1660485
-Node: Passwd Functions660688
-Ref: Passwd Functions-Footnote-1669663
-Node: Group Functions669751
-Node: Walking Arrays677835
-Node: Sample Programs679404
-Node: Running Examples680069
-Node: Clones680797
-Node: Cut Program682021
-Node: Egrep Program691866
-Ref: Egrep Program-Footnote-1699639
-Node: Id Program699749
-Node: Split Program703365
-Ref: Split Program-Footnote-1706884
-Node: Tee Program707012
-Node: Uniq Program709815
-Node: Wc Program717244
-Ref: Wc Program-Footnote-1721510
-Ref: Wc Program-Footnote-2721710
-Node: Miscellaneous Programs721802
-Node: Dupword Program722990
-Node: Alarm Program725021
-Node: Translate Program729770
-Ref: Translate Program-Footnote-1734157
-Ref: Translate Program-Footnote-2734385
-Node: Labels Program734519
-Ref: Labels Program-Footnote-1737890
-Node: Word Sorting737974
-Node: History Sorting741858
-Node: Extract Program743697
-Ref: Extract Program-Footnote-1751180
-Node: Simple Sed751308
-Node: Igawk Program754370
-Ref: Igawk Program-Footnote-1769527
-Ref: Igawk Program-Footnote-2769728
-Node: Anagram Program769866
-Node: Signature Program772934
-Node: Debugger774034
-Node: Debugging775000
-Node: Debugging Concepts775433
-Node: Debugging Terms777289
-Node: Awk Debugging779886
-Node: Sample Debugging Session780778
-Node: Debugger Invocation781298
-Node: Finding The Bug782627
-Node: List of Debugger Commands789115
-Node: Breakpoint Control790449
-Node: Debugger Execution Control794113
-Node: Viewing And Changing Data797473
-Node: Execution Stack800829
-Node: Debugger Info802296
-Node: Miscellaneous Debugger Commands806277
-Node: Readline Support811722
-Node: Limitations812553
-Node: Arbitrary Precision Arithmetic814805
-Ref: Arbitrary Precision Arithmetic-Footnote-1816447
-Node: General Arithmetic816595
-Node: Floating Point Issues818315
-Node: String Conversion Precision819196
-Ref: String Conversion Precision-Footnote-1820902
-Node: Unexpected Results821011
-Node: POSIX Floating Point Problems823164
-Ref: POSIX Floating Point Problems-Footnote-1826989
-Node: Integer Programming827027
-Node: Floating-point Programming828780
-Ref: Floating-point Programming-Footnote-1835089
-Node: Floating-point Representation835353
-Node: Floating-point Context836518
-Ref: table-ieee-formats837360
-Node: Rounding Mode838744
-Ref: table-rounding-modes839223
-Ref: Rounding Mode-Footnote-1842227
-Node: Gawk and MPFR842408
-Node: Arbitrary Precision Floats843650
-Ref: Arbitrary Precision Floats-Footnote-1846079
-Node: Setting Precision846390
-Node: Setting Rounding Mode849123
-Ref: table-gawk-rounding-modes849527
-Node: Floating-point Constants850707
-Node: Changing Precision852131
-Ref: Changing Precision-Footnote-1853531
-Node: Exact Arithmetic853705
-Node: Arbitrary Precision Integers856813
-Ref: Arbitrary Precision Integers-Footnote-1859813
-Node: Dynamic Extensions859960
-Node: Extension Intro861283
-Node: Plugin License862486
-Node: Extension Design863160
-Node: Old Extension Problems864231
-Ref: Old Extension Problems-Footnote-1865741
-Node: Extension New Mechanism Goals865798
-Ref: Extension New Mechanism Goals-Footnote-1868510
-Node: Extension Other Design Decisions868696
-Node: Extension Mechanism Outline870443
-Ref: load-extension871468
-Ref: load-new-function872946
-Ref: call-new-function873927
-Node: Extension Future Growth875908
-Node: Extension API Description876650
-Node: Extension API Functions Introduction877970
-Node: General Data Types882045
-Ref: General Data Types-Footnote-1887678
-Node: Requesting Values887977
-Ref: table-value-types-returned888708
-Node: Constructor Functions889662
-Node: Registration Functions892658
-Node: Extension Functions893343
-Node: Exit Callback Functions895162
-Node: Extension Version String896405
-Node: Input Parsers897055
-Node: Output Wrappers905636
-Node: Two-way processors910029
-Node: Printing Messages912151
-Ref: Printing Messages-Footnote-1913228
-Node: Updating `ERRNO'913380
-Node: Accessing Parameters914119
-Node: Symbol Table Access915349
-Node: Symbol table by name915861
-Ref: Symbol table by name-Footnote-1918033
-Node: Symbol table by cookie918113
-Ref: Symbol table by cookie-Footnote-1922242
-Node: Cached values922305
-Ref: Cached values-Footnote-1925506
-Node: Array Manipulation925597
-Ref: Array Manipulation-Footnote-1926695
-Node: Array Data Types926734
-Ref: Array Data Types-Footnote-1929456
-Node: Array Functions929548
-Node: Flattening Arrays933314
-Node: Creating Arrays940145
-Node: Extension API Variables944941
-Node: Extension Versioning945577
-Node: Extension API Informational Variables947478
-Node: Extension API Boilerplate948564
-Node: Finding Extensions952398
-Node: Extension Example952945
-Node: Internal File Description953683
-Node: Internal File Ops957371
-Ref: Internal File Ops-Footnote-1968455
-Node: Using Internal File Ops968595
-Ref: Using Internal File Ops-Footnote-1970951
-Node: Extension Samples971217
-Node: Extension Sample File Functions972660
-Node: Extension Sample Fnmatch981029
-Node: Extension Sample Fork982755
-Node: Extension Sample Ord983969
-Node: Extension Sample Readdir984745
-Node: Extension Sample Revout987083
-Node: Extension Sample Rev2way987676
-Node: Extension Sample Read write array988366
-Node: Extension Sample Readfile990249
-Node: Extension Sample API Tests991004
-Node: Extension Sample Time991529
-Node: gawkextlib992838
-Node: Language History995221
-Node: V7/SVR3.1996743
-Node: SVR4999064
-Node: POSIX1000506
-Node: BTL1001514
-Node: POSIX/GNU1002248
-Node: Common Extensions1007783
-Node: Ranges and Locales1008890
-Ref: Ranges and Locales-Footnote-11013508
-Ref: Ranges and Locales-Footnote-21013535
-Ref: Ranges and Locales-Footnote-31013795
-Node: Contributors1014016
-Node: Installation1018312
-Node: Gawk Distribution1019206
-Node: Getting1019690
-Node: Extracting1020516
-Node: Distribution contents1022208
-Node: Unix Installation1027430
-Node: Quick Installation1028047
-Node: Additional Configuration Options1030009
-Node: Configuration Philosophy1031486
-Node: Non-Unix Installation1033828
-Node: PC Installation1034286
-Node: PC Binary Installation1035585
-Node: PC Compiling1037433
-Node: PC Testing1040377
-Node: PC Using1041553
-Node: Cygwin1045738
-Node: MSYS1046738
-Node: VMS Installation1047252
-Node: VMS Compilation1047855
-Ref: VMS Compilation-Footnote-11048862
-Node: VMS Installation Details1048920
-Node: VMS Running1050555
-Node: VMS Old Gawk1052162
-Node: Bugs1052636
-Node: Other Versions1056488
-Node: Notes1061803
-Node: Compatibility Mode1062390
-Node: Additions1063173
-Node: Accessing The Source1064100
-Node: Adding Code1065526
-Node: New Ports1071568
-Node: Derived Files1075703
-Ref: Derived Files-Footnote-11081008
-Ref: Derived Files-Footnote-21081042
-Ref: Derived Files-Footnote-31081642
-Node: Future Extensions1081740
-Node: Basic Concepts1083227
-Node: Basic High Level1083908
-Ref: figure-general-flow1084179
-Ref: figure-process-flow1084778
-Ref: Basic High Level-Footnote-11088007
-Node: Basic Data Typing1088192
-Node: Glossary1091547
-Node: Copying1116858
-Node: GNU Free Documentation License1154415
-Node: Index1179552
+Ref: This Manual-Footnote-157632
+Node: Conventions57732
+Node: Manual History59866
+Ref: Manual History-Footnote-163136
+Ref: Manual History-Footnote-263177
+Node: How To Contribute63251
+Node: Acknowledgments64395
+Node: Getting Started68891
+Node: Running gawk71270
+Node: One-shot72456
+Node: Read Terminal73681
+Ref: Read Terminal-Footnote-175331
+Ref: Read Terminal-Footnote-275607
+Node: Long75778
+Node: Executable Scripts77154
+Ref: Executable Scripts-Footnote-179023
+Ref: Executable Scripts-Footnote-279125
+Node: Comments79672
+Node: Quoting82139
+Node: DOS Quoting86762
+Node: Sample Data Files87437
+Node: Very Simple90469
+Node: Two Rules95068
+Node: More Complex97215
+Ref: More Complex-Footnote-1100145
+Node: Statements/Lines100230
+Ref: Statements/Lines-Footnote-1104692
+Node: Other Features104957
+Node: When105885
+Node: Invoking Gawk108032
+Node: Command Line109493
+Node: Options110276
+Ref: Options-Footnote-1125674
+Node: Other Arguments125699
+Node: Naming Standard Input128357
+Node: Environment Variables129451
+Node: AWKPATH Variable130009
+Ref: AWKPATH Variable-Footnote-1132767
+Node: AWKLIBPATH Variable133027
+Node: Other Environment Variables133624
+Node: Exit Status136119
+Node: Include Files136794
+Node: Loading Shared Libraries140363
+Node: Obsolete141588
+Node: Undocumented142285
+Node: Regexp142528
+Node: Regexp Usage143917
+Node: Escape Sequences145943
+Node: Regexp Operators151706
+Ref: Regexp Operators-Footnote-1159086
+Ref: Regexp Operators-Footnote-2159233
+Node: Bracket Expressions159331
+Ref: table-char-classes161221
+Node: GNU Regexp Operators163744
+Node: Case-sensitivity167467
+Ref: Case-sensitivity-Footnote-1170435
+Ref: Case-sensitivity-Footnote-2170670
+Node: Leftmost Longest170778
+Node: Computed Regexps171979
+Node: Reading Files175389
+Node: Records177392
+Ref: Records-Footnote-1186316
+Node: Fields186353
+Ref: Fields-Footnote-1189386
+Node: Nonconstant Fields189472
+Node: Changing Fields191674
+Node: Field Separators197655
+Node: Default Field Splitting200284
+Node: Regexp Field Splitting201401
+Node: Single Character Fields204743
+Node: Command Line Field Separator205802
+Node: Field Splitting Summary209243
+Ref: Field Splitting Summary-Footnote-1212435
+Node: Constant Size212536
+Node: Splitting By Content217120
+Ref: Splitting By Content-Footnote-1220846
+Node: Multiple Line220886
+Ref: Multiple Line-Footnote-1226733
+Node: Getline226912
+Node: Plain Getline229128
+Node: Getline/Variable231217
+Node: Getline/File232358
+Node: Getline/Variable/File233680
+Ref: Getline/Variable/File-Footnote-1235279
+Node: Getline/Pipe235366
+Node: Getline/Variable/Pipe237926
+Node: Getline/Coprocess239033
+Node: Getline/Variable/Coprocess240276
+Node: Getline Notes240990
+Node: Getline Summary243777
+Ref: table-getline-variants244185
+Node: Read Timeout245041
+Ref: Read Timeout-Footnote-1248786
+Node: Command line directories248843
+Node: Printing249473
+Node: Print251104
+Node: Print Examples252441
+Node: Output Separators255225
+Node: OFMT256985
+Node: Printf258343
+Node: Basic Printf259249
+Node: Control Letters260788
+Node: Format Modifiers264600
+Node: Printf Examples270609
+Node: Redirection273324
+Node: Special Files280308
+Node: Special FD280841
+Ref: Special FD-Footnote-1284466
+Node: Special Network284540
+Node: Special Caveats285390
+Node: Close Files And Pipes286186
+Ref: Close Files And Pipes-Footnote-1293209
+Ref: Close Files And Pipes-Footnote-2293357
+Node: Expressions293507
+Node: Values294639
+Node: Constants295315
+Node: Scalar Constants295995
+Ref: Scalar Constants-Footnote-1296854
+Node: Nondecimal-numbers297036
+Node: Regexp Constants300095
+Node: Using Constant Regexps300570
+Node: Variables303625
+Node: Using Variables304280
+Node: Assignment Options306004
+Node: Conversion307876
+Ref: table-locale-affects313252
+Ref: Conversion-Footnote-1313876
+Node: All Operators313985
+Node: Arithmetic Ops314615
+Node: Concatenation317120
+Ref: Concatenation-Footnote-1319913
+Node: Assignment Ops320033
+Ref: table-assign-ops325021
+Node: Increment Ops326429
+Node: Truth Values and Conditions329899
+Node: Truth Values330982
+Node: Typing and Comparison332031
+Node: Variable Typing332820
+Ref: Variable Typing-Footnote-1336717
+Node: Comparison Operators336839
+Ref: table-relational-ops337249
+Node: POSIX String Comparison340798
+Ref: POSIX String Comparison-Footnote-1341754
+Node: Boolean Ops341892
+Ref: Boolean Ops-Footnote-1345970
+Node: Conditional Exp346061
+Node: Function Calls347793
+Node: Precedence351387
+Node: Locales355056
+Node: Patterns and Actions356145
+Node: Pattern Overview357199
+Node: Regexp Patterns358868
+Node: Expression Patterns359411
+Node: Ranges363096
+Node: BEGIN/END366062
+Node: Using BEGIN/END366824
+Ref: Using BEGIN/END-Footnote-1369555
+Node: I/O And BEGIN/END369661
+Node: BEGINFILE/ENDFILE371943
+Node: Empty374847
+Node: Using Shell Variables375163
+Node: Action Overview377448
+Node: Statements379805
+Node: If Statement381659
+Node: While Statement383158
+Node: Do Statement385202
+Node: For Statement386358
+Node: Switch Statement389510
+Node: Break Statement391607
+Node: Continue Statement393597
+Node: Next Statement395390
+Node: Nextfile Statement397780
+Node: Exit Statement400421
+Node: Built-in Variables402837
+Node: User-modified403932
+Ref: User-modified-Footnote-1412287
+Node: Auto-set412349
+Ref: Auto-set-Footnote-1424700
+Ref: Auto-set-Footnote-2424905
+Node: ARGC and ARGV424961
+Node: Arrays428812
+Node: Array Basics430317
+Node: Array Intro431143
+Node: Reference to Elements435461
+Node: Assigning Elements437731
+Node: Array Example438222
+Node: Scanning an Array439954
+Node: Controlling Scanning442268
+Ref: Controlling Scanning-Footnote-1447201
+Node: Delete447517
+Ref: Delete-Footnote-1450282
+Node: Numeric Array Subscripts450339
+Node: Uninitialized Subscripts452522
+Node: Multi-dimensional454150
+Node: Multi-scanning457244
+Node: Arrays of Arrays458835
+Node: Functions463480
+Node: Built-in464299
+Node: Calling Built-in465377
+Node: Numeric Functions467365
+Ref: Numeric Functions-Footnote-1471197
+Ref: Numeric Functions-Footnote-2471554
+Ref: Numeric Functions-Footnote-3471602
+Node: String Functions471871
+Ref: String Functions-Footnote-1495368
+Ref: String Functions-Footnote-2495497
+Ref: String Functions-Footnote-3495745
+Node: Gory Details495832
+Ref: table-sub-escapes497511
+Ref: table-sub-posix-92498865
+Ref: table-sub-proposed500208
+Ref: table-posix-sub501558
+Ref: table-gensub-escapes503104
+Ref: Gory Details-Footnote-1504311
+Ref: Gory Details-Footnote-2504362
+Node: I/O Functions504513
+Ref: I/O Functions-Footnote-1511168
+Node: Time Functions511315
+Ref: Time Functions-Footnote-1522207
+Ref: Time Functions-Footnote-2522275
+Ref: Time Functions-Footnote-3522433
+Ref: Time Functions-Footnote-4522544
+Ref: Time Functions-Footnote-5522656
+Ref: Time Functions-Footnote-6522883
+Node: Bitwise Functions523149
+Ref: table-bitwise-ops523707
+Ref: Bitwise Functions-Footnote-1527928
+Node: Type Functions528112
+Node: I18N Functions528582
+Node: User-defined530209
+Node: Definition Syntax531013
+Ref: Definition Syntax-Footnote-1535923
+Node: Function Example535992
+Node: Function Caveats538586
+Node: Calling A Function539007
+Node: Variable Scope540122
+Node: Pass By Value/Reference542097
+Node: Return Statement545537
+Node: Dynamic Typing548518
+Node: Indirect Calls549253
+Node: Library Functions558938
+Ref: Library Functions-Footnote-1561937
+Node: Library Names562108
+Ref: Library Names-Footnote-1565579
+Ref: Library Names-Footnote-2565799
+Node: General Functions565885
+Node: Strtonum Function566838
+Node: Assert Function569768
+Node: Round Function573094
+Node: Cliff Random Function574637
+Node: Ordinal Functions575653
+Ref: Ordinal Functions-Footnote-1578723
+Ref: Ordinal Functions-Footnote-2578975
+Node: Join Function579184
+Ref: Join Function-Footnote-1580955
+Node: Getlocaltime Function581155
+Node: Data File Management584870
+Node: Filetrans Function585502
+Node: Rewind Function589641
+Node: File Checking591028
+Node: Empty Files592122
+Node: Ignoring Assigns594352
+Node: Getopt Function595905
+Ref: Getopt Function-Footnote-1607209
+Node: Passwd Functions607412
+Ref: Passwd Functions-Footnote-1616387
+Node: Group Functions616475
+Node: Walking Arrays624559
+Node: Sample Programs626128
+Node: Running Examples626805
+Node: Clones627533
+Node: Cut Program628757
+Node: Egrep Program638602
+Ref: Egrep Program-Footnote-1646375
+Node: Id Program646485
+Node: Split Program650101
+Ref: Split Program-Footnote-1653620
+Node: Tee Program653748
+Node: Uniq Program656551
+Node: Wc Program663980
+Ref: Wc Program-Footnote-1668246
+Ref: Wc Program-Footnote-2668446
+Node: Miscellaneous Programs668538
+Node: Dupword Program669726
+Node: Alarm Program671757
+Node: Translate Program676506
+Ref: Translate Program-Footnote-1680893
+Ref: Translate Program-Footnote-2681121
+Node: Labels Program681255
+Ref: Labels Program-Footnote-1684626
+Node: Word Sorting684710
+Node: History Sorting688594
+Node: Extract Program690433
+Ref: Extract Program-Footnote-1697916
+Node: Simple Sed698044
+Node: Igawk Program701106
+Ref: Igawk Program-Footnote-1716263
+Ref: Igawk Program-Footnote-2716464
+Node: Anagram Program716602
+Node: Signature Program719670
+Node: Internationalization720770
+Node: I18N and L10N722202
+Node: Explaining gettext722888
+Ref: Explaining gettext-Footnote-1727954
+Ref: Explaining gettext-Footnote-2728138
+Node: Programmer i18n728303
+Node: Translator i18n732503
+Node: String Extraction733296
+Ref: String Extraction-Footnote-1734257
+Node: Printf Ordering734343
+Ref: Printf Ordering-Footnote-1737127
+Node: I18N Portability737191
+Ref: I18N Portability-Footnote-1739640
+Node: I18N Example739703
+Ref: I18N Example-Footnote-1742338
+Node: Gawk I18N742410
+Node: Advanced Features743027
+Node: Nondecimal Data744531
+Node: Array Sorting746114
+Node: Controlling Array Traversal746811
+Node: Array Sorting Functions755049
+Ref: Array Sorting Functions-Footnote-1758723
+Ref: Array Sorting Functions-Footnote-2758816
+Node: Two-way I/O759010
+Ref: Two-way I/O-Footnote-1764442
+Node: TCP/IP Networking764512
+Node: Profiling767356
+Node: Debugger774810
+Node: Debugging775778
+Node: Debugging Concepts776211
+Node: Debugging Terms778067
+Node: Awk Debugging780664
+Node: Sample Debugging Session781556
+Node: Debugger Invocation782076
+Node: Finding The Bug783405
+Node: List of Debugger Commands789893
+Node: Breakpoint Control791227
+Node: Debugger Execution Control794891
+Node: Viewing And Changing Data798251
+Node: Execution Stack801607
+Node: Debugger Info803074
+Node: Miscellaneous Debugger Commands807055
+Node: Readline Support812500
+Node: Limitations813331
+Node: Arbitrary Precision Arithmetic815583
+Ref: Arbitrary Precision Arithmetic-Footnote-1817225
+Node: General Arithmetic817373
+Node: Floating Point Issues819093
+Node: String Conversion Precision819974
+Ref: String Conversion Precision-Footnote-1821680
+Node: Unexpected Results821789
+Node: POSIX Floating Point Problems823942
+Ref: POSIX Floating Point Problems-Footnote-1827767
+Node: Integer Programming827805
+Node: Floating-point Programming829558
+Ref: Floating-point Programming-Footnote-1835867
+Node: Floating-point Representation836131
+Node: Floating-point Context837296
+Ref: table-ieee-formats838138
+Node: Rounding Mode839522
+Ref: table-rounding-modes840001
+Ref: Rounding Mode-Footnote-1843005
+Node: Gawk and MPFR843186
+Node: Arbitrary Precision Floats844428
+Ref: Arbitrary Precision Floats-Footnote-1846857
+Node: Setting Precision847168
+Node: Setting Rounding Mode849901
+Ref: table-gawk-rounding-modes850305
+Node: Floating-point Constants851485
+Node: Changing Precision852909
+Ref: Changing Precision-Footnote-1854309
+Node: Exact Arithmetic854483
+Node: Arbitrary Precision Integers857591
+Ref: Arbitrary Precision Integers-Footnote-1860591
+Node: Dynamic Extensions860738
+Node: Extension Intro862061
+Node: Plugin License863264
+Node: Extension Design863938
+Node: Old Extension Problems865009
+Ref: Old Extension Problems-Footnote-1866519
+Node: Extension New Mechanism Goals866576
+Ref: Extension New Mechanism Goals-Footnote-1869288
+Node: Extension Other Design Decisions869474
+Node: Extension Mechanism Outline871221
+Ref: load-extension872246
+Ref: load-new-function873724
+Ref: call-new-function874705
+Node: Extension Future Growth876686
+Node: Extension API Description877428
+Node: Extension API Functions Introduction878748
+Node: General Data Types882823
+Ref: General Data Types-Footnote-1888456
+Node: Requesting Values888755
+Ref: table-value-types-returned889486
+Node: Constructor Functions890440
+Node: Registration Functions893436
+Node: Extension Functions894121
+Node: Exit Callback Functions895940
+Node: Extension Version String897183
+Node: Input Parsers897833
+Node: Output Wrappers906414
+Node: Two-way processors910807
+Node: Printing Messages912929
+Ref: Printing Messages-Footnote-1914006
+Node: Updating `ERRNO'914158
+Node: Accessing Parameters914897
+Node: Symbol Table Access916127
+Node: Symbol table by name916639
+Ref: Symbol table by name-Footnote-1918811
+Node: Symbol table by cookie918891
+Ref: Symbol table by cookie-Footnote-1923020
+Node: Cached values923083
+Ref: Cached values-Footnote-1926284
+Node: Array Manipulation926375
+Ref: Array Manipulation-Footnote-1927473
+Node: Array Data Types927512
+Ref: Array Data Types-Footnote-1930234
+Node: Array Functions930326
+Node: Flattening Arrays934092
+Node: Creating Arrays940923
+Node: Extension API Variables945719
+Node: Extension Versioning946355
+Node: Extension API Informational Variables948256
+Node: Extension API Boilerplate949342
+Node: Finding Extensions953176
+Node: Extension Example953723
+Node: Internal File Description954461
+Node: Internal File Ops958149
+Ref: Internal File Ops-Footnote-1969233
+Node: Using Internal File Ops969373
+Ref: Using Internal File Ops-Footnote-1971729
+Node: Extension Samples971995
+Node: Extension Sample File Functions973438
+Node: Extension Sample Fnmatch981807
+Node: Extension Sample Fork983533
+Node: Extension Sample Ord984747
+Node: Extension Sample Readdir985523
+Node: Extension Sample Revout987861
+Node: Extension Sample Rev2way988454
+Node: Extension Sample Read write array989144
+Node: Extension Sample Readfile991027
+Node: Extension Sample API Tests991782
+Node: Extension Sample Time992307
+Node: gawkextlib993616
+Node: Language History995999
+Node: V7/SVR3.1997521
+Node: SVR4999842
+Node: POSIX1001284
+Node: BTL1002292
+Node: POSIX/GNU1003026
+Node: Common Extensions1008561
+Node: Ranges and Locales1009668
+Ref: Ranges and Locales-Footnote-11014286
+Ref: Ranges and Locales-Footnote-21014313
+Ref: Ranges and Locales-Footnote-31014573
+Node: Contributors1014794
+Node: Installation1019090
+Node: Gawk Distribution1019984
+Node: Getting1020468
+Node: Extracting1021294
+Node: Distribution contents1022986
+Node: Unix Installation1028208
+Node: Quick Installation1028825
+Node: Additional Configuration Options1030787
+Node: Configuration Philosophy1032264
+Node: Non-Unix Installation1034606
+Node: PC Installation1035064
+Node: PC Binary Installation1036363
+Node: PC Compiling1038211
+Node: PC Testing1041155
+Node: PC Using1042331
+Node: Cygwin1046516
+Node: MSYS1047516
+Node: VMS Installation1048030
+Node: VMS Compilation1048633
+Ref: VMS Compilation-Footnote-11049640
+Node: VMS Installation Details1049698
+Node: VMS Running1051333
+Node: VMS Old Gawk1052940
+Node: Bugs1053414
+Node: Other Versions1057266
+Node: Notes1062581
+Node: Compatibility Mode1063168
+Node: Additions1063951
+Node: Accessing The Source1064878
+Node: Adding Code1066304
+Node: New Ports1072346
+Node: Derived Files1076481
+Ref: Derived Files-Footnote-11081786
+Ref: Derived Files-Footnote-21081820
+Ref: Derived Files-Footnote-31082420
+Node: Future Extensions1082518
+Node: Basic Concepts1084005
+Node: Basic High Level1084686
+Ref: figure-general-flow1084957
+Ref: figure-process-flow1085556
+Ref: Basic High Level-Footnote-11088785
+Node: Basic Data Typing1088970
+Node: Glossary1092325
+Node: Copying1117636
+Node: GNU Free Documentation License1155193
+Node: Index1180330

End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 7584f35f..e4ffc222 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -20,7 +20,7 @@
@c applies to and all the info about who's publishing this edition
@c These apply across the board.
-@set UPDATE-MONTH October, 2012
+@set UPDATE-MONTH November, 2012
@set VERSION 4.0
@set PATCHLEVEL 1
@@ -294,13 +294,13 @@ particular records in a file and perform operations upon them.
* Arrays:: The description and use of arrays. Also
includes array-oriented control statements.
* Functions:: Built-in and user-defined functions.
+* Library Functions:: A Library of @command{awk} Functions.
+* Sample Programs:: Many @command{awk} programs with complete
+ explanations.
* Internationalization:: Getting @command{gawk} to speak your
language.
* Advanced Features:: Stuff for advanced users, specific to
@command{gawk}.
-* Library Functions:: A Library of @command{awk} Functions.
-* Sample Programs:: Many @command{awk} programs with complete
- explanations.
* Debugger:: The @code{gawk} debugger.
* Arbitrary Precision Arithmetic:: Arbitrary precision arithmetic with
@command{gawk}.
@@ -593,28 +593,6 @@ particular records in a file and perform operations upon them.
runtime.
* Indirect Calls:: Choosing the function to call at
runtime.
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU @code{gettext} works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging @code{printf} arguments.
-* I18N Portability:: @command{awk}-level portability
- issues.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: @command{gawk} is also
- internationalized.
-* Nondecimal Data:: Allowing nondecimal input data.
-* Array Sorting:: Facilities for controlling array
- traversal and sorting arrays.
-* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
-* Array Sorting Functions:: How to use @code{asort()} and
- @code{asorti()}.
-* Two-way I/O:: Two-way communications with another
- process.
-* TCP/IP Networking:: Using @command{gawk} for network
- programming.
-* Profiling:: Profiling your @command{awk} programs.
* Library Names:: How to best name private global
variables in library functions.
* General Functions:: Functions that are of general use.
@@ -676,6 +654,28 @@ particular records in a file and perform operations upon them.
* Anagram Program:: Finding anagrams from a dictionary.
* Signature Program:: People do amazing things with too much
time on their hands.
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU @code{gettext} works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging @code{printf} arguments.
+* I18N Portability:: @command{awk}-level portability
+ issues.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: @command{gawk} is also
+ internationalized.
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array
+ traversal and sorting arrays.
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use @code{asort()} and
+ @code{asorti()}.
+* Two-way I/O:: Two-way communications with another
+ process.
+* TCP/IP Networking:: Using @command{gawk} for network
+ programming.
+* Profiling:: Profiling your @command{awk} programs.
* Debugging:: Introduction to @command{gawk}
debugger.
* Debugging Concepts:: Debugging in General.
@@ -1251,6 +1251,12 @@ expert should find useful. In particular, the description of POSIX
@ref{Sample Programs},
should be of interest.
+This @value{DOCUMENT} is split into several parts, as follows:
+
+Part I describes the @command{awk} language and @command{gawk} program in detail.
+It starts with the basics, and continues through all of the features of @command{awk}.
+It contains the following chapters:
+
@ref{Getting Started},
provides the essentials you need to know to begin using @command{awk}.
@@ -1294,6 +1300,22 @@ describes the built-in functions @command{awk} and
@command{gawk} provide, as well as how to define
your own functions.
+Part II shows how to use @command{awk} and @command{gawk} for problem solving.
+There is lots of code here for you to read and learn from.
+It contains the following chapters:
+
+@ref{Library Functions}, which provides a number of functions meant to
+be used from main @command{awk} programs.
+
+@ref{Sample Programs},
+which provides many sample @command{awk} programs.
+
+Reading these two chapters allows you to see @command{awk}
+solving real problems.
+
+Part III focuses on features specific to @command{gawk}.
+It contains the following chapters:
+
@ref{Internationalization},
describes special features in @command{gawk} for translating program
messages into different languages at runtime.
@@ -1305,12 +1327,6 @@ are the abilities to have two-way communications with another process,
perform TCP/IP networking, and
profile your @command{awk} programs.
-@ref{Library Functions}, and
-@ref{Sample Programs},
-provide many sample @command{awk} programs.
-Reading them allows you to see @command{awk}
-solving real problems.
-
@ref{Debugger}, describes the @command{awk} debugger.
@ref{Arbitrary Precision Arithmetic},
@@ -1320,6 +1336,10 @@ describes advanced arithmetic facilities provided by
@ref{Dynamic Extensions}, describes how to add new variables and
functions to @command{gawk} by writing extensions in C.
+Part IV provides the appendices, the Glossary, and two licenses that cover
+the @command{gawk} source code and this @value{DOCUMENT}, respectively.
+It contains the following appendices:
+
@ref{Language History},
describes how the @command{awk} language has evolved since
its first release to present. It also describes how @command{gawk}
@@ -1780,12 +1800,14 @@ Nof Ayalon @*
ISRAEL @*
March, 2011
-@ignore
-@c Try this
@iftex
-@page
-@headings off
-@majorheading I@ @ @ @ The @command{awk} Language and @command{gawk}
+@part Part I:@* The @command{awk} Language
+@end iftex
+
+@ignore
+@ifdocbook
+@part Part I:@* The @command{awk} Language
+
Part I describes the @command{awk} language and @command{gawk} program in detail.
It starts with the basics, and continues through all of the features of @command{awk}
and @command{gawk}. It contains the following chapters:
@@ -1795,6 +1817,9 @@ and @command{gawk}. It contains the following chapters:
@ref{Getting Started}.
@item
+@ref{Invoking Gawk}.
+
+@item
@ref{Regexp}.
@item
@@ -1814,21 +1839,8 @@ and @command{gawk}. It contains the following chapters:
@item
@ref{Functions}.
-
-@item
-@ref{Internationalization}.
-
-@item
-@ref{Advanced Features}.
-
-@item
-@ref{Invoking Gawk}.
@end itemize
-
-@page
-@evenheading @thispage@ @ @ @strong{@value{TITLE}} @| @|
-@oddheading @| @| @strong{@thischapter}@ @ @ @thispage
-@end iftex
+@end ifdocbook
@end ignore
@node Getting Started
@@ -4181,31 +4193,6 @@ long-undocumented ``feature'' of Unix @code{awk}.
@end ignore
-@ignore
-@c Try this
-@iftex
-@page
-@headings off
-@majorheading II@ @ @ Using @command{awk} and @command{gawk}
-Part II shows how to use @command{awk} and @command{gawk} for problem solving.
-There is lots of code here for you to read and learn from.
-It contains the following chapters:
-
-@itemize @bullet
-@item
-@ref{Library Functions}.
-
-@item
-@ref{Sample Programs}.
-
-@end itemize
-
-@page
-@evenheading @thispage@ @ @ @strong{@value{TITLE}} @| @|
-@oddheading @| @| @strong{@thischapter}@ @ @ @thispage
-@end iftex
-@end ignore
-
@node Regexp
@chapter Regular Expressions
@cindex regexp, See regular expressions
@@ -17932,1891 +17919,28 @@ for (i = 1; i <= n; i++)
@c ENDOFRANGE funcud
-@node Internationalization
-@chapter Internationalization with @command{gawk}
-
-Once upon a time, computer makers
-wrote software that worked only in English.
-Eventually, hardware and software vendors noticed that if their
-systems worked in the native languages of non-English-speaking
-countries, they were able to sell more systems.
-As a result, internationalization and localization
-of programs and software systems became a common practice.
-
-@c STARTOFRANGE inloc
-@cindex internationalization, localization
-@cindex @command{gawk}, internationalization and, See internationalization
-@cindex internationalization, localization, @command{gawk} and
-For many years, the ability to provide internationalization
-was largely restricted to programs written in C and C++.
-This @value{CHAPTER} describes the underlying library @command{gawk}
-uses for internationalization, as well as how
-@command{gawk} makes internationalization
-features available at the @command{awk} program level.
-Having internationalization available at the @command{awk} level
-gives software developers additional flexibility---they are no
-longer forced to write in C or C++ when internationalization is
-a requirement.
-
-@menu
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU @code{gettext} works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: @command{gawk} is also internationalized.
-@end menu
-
-@node I18N and L10N
-@section Internationalization and Localization
-
-@cindex internationalization
-@cindex localization, See internationalization@comma{} localization
-@cindex localization
-@dfn{Internationalization} means writing (or modifying) a program once,
-in such a way that it can use multiple languages without requiring
-further source-code changes.
-@dfn{Localization} means providing the data necessary for an
-internationalized program to work in a particular language.
-Most typically, these terms refer to features such as the language
-used for printing error messages, the language used to read
-responses, and information related to how numerical and
-monetary values are printed and read.
-
-@node Explaining gettext
-@section GNU @code{gettext}
-
-@cindex internationalizing a program
-@c STARTOFRANGE gettex
-@cindex @code{gettext} library
-The facilities in GNU @code{gettext} focus on messages; strings printed
-by a program, either directly or via formatting with @code{printf} or
-@code{sprintf()}.@footnote{For some operating systems, the @command{gawk}
-port doesn't support GNU @code{gettext}.
-Therefore, these features are not available
-if you are using one of those operating systems. Sorry.}
-
-@cindex portability, @code{gettext} library and
-When using GNU @code{gettext}, each application has its own
-@dfn{text domain}. This is a unique name, such as @samp{kpilot} or @samp{gawk},
-that identifies the application.
-A complete application may have multiple components---programs written
-in C or C++, as well as scripts written in @command{sh} or @command{awk}.
-All of the components use the same text domain.
-
-To make the discussion concrete, assume we're writing an application
-named @command{guide}. Internationalization consists of the
-following steps, in this order:
-
-@enumerate
-@item
-The programmer goes
-through the source for all of @command{guide}'s components
-and marks each string that is a candidate for translation.
-For example, @code{"`-F': option required"} is a good candidate for translation.
-A table with strings of option names is not (e.g., @command{gawk}'s
-@option{--profile} option should remain the same, no matter what the local
-language).
-
-@cindex @code{textdomain()} function (C library)
-@item
-The programmer indicates the application's text domain
-(@code{"guide"}) to the @code{gettext} library,
-by calling the @code{textdomain()} function.
-
-@cindex @code{.pot} files
-@cindex files, @code{.pot}
-@cindex portable object template files
-@cindex files, portable object template
-@item
-Messages from the application are extracted from the source code and
-collected into a portable object template file (@file{guide.pot}),
-which lists the strings and their translations.
-The translations are initially empty.
-The original (usually English) messages serve as the key for
-lookup of the translations.
-
-@cindex @code{.po} files
-@cindex files, @code{.po}
-@cindex portable object files
-@cindex files, portable object
-@item
-For each language with a translator, @file{guide.pot}
-is copied to a portable object file (@code{.po})
-and translations are created and shipped with the application.
-For example, there might be a @file{fr.po} for a French translation.
-
-@cindex @code{.mo} files
-@cindex files, @code{.mo}
-@cindex message object files
-@cindex files, message object
-@item
-Each language's @file{.po} file is converted into a binary
-message object (@file{.mo}) file.
-A message object file contains the original messages and their
-translations in a binary format that allows fast lookup of translations
-at runtime.
-
-@item
-When @command{guide} is built and installed, the binary translation files
-are installed in a standard place.
-
-@cindex @code{bindtextdomain()} function (C library)
-@item
-For testing and development, it is possible to tell @code{gettext}
-to use @file{.mo} files in a different directory than the standard
-one by using the @code{bindtextdomain()} function.
-
-@cindex @code{.mo} files, specifying directory of
-@cindex files, @code{.mo}, specifying directory of
-@cindex message object files, specifying directory of
-@cindex files, message object, specifying directory of
-@item
-At runtime, @command{guide} looks up each string via a call
-to @code{gettext()}. The returned string is the translated string
-if available, or the original string if not.
-
-@item
-If necessary, it is possible to access messages from a different
-text domain than the one belonging to the application, without
-having to switch the application's default text domain back
-and forth.
-@end enumerate
-
-@cindex @code{gettext()} function (C library)
-In C (or C++), the string marking and dynamic translation lookup
-are accomplished by wrapping each string in a call to @code{gettext()}:
-
-@example
-printf("%s", gettext("Don't Panic!\n"));
-@end example
-
-The tools that extract messages from source code pull out all
-strings enclosed in calls to @code{gettext()}.
-
-@cindex @code{_} (underscore), @code{_} C macro
-@cindex underscore (@code{_}), @code{_} C macro
-The GNU @code{gettext} developers, recognizing that typing
-@samp{gettext(@dots{})} over and over again is both painful and ugly to look
-at, use the macro @samp{_} (an underscore) to make things easier:
-
-@example
-/* In the standard header file: */
-#define _(str) gettext(str)
-
-/* In the program text: */
-printf("%s", _("Don't Panic!\n"));
-@end example
-
-@cindex internationalization, localization, locale categories
-@cindex @code{gettext} library, locale categories
-@cindex locale categories
-@noindent
-This reduces the typing overhead to just three extra characters per string
-and is considerably easier to read as well.
-
-There are locale @dfn{categories}
-for different types of locale-related information.
-The defined locale categories that @code{gettext} knows about are:
-
-@table @code
-@cindex @code{LC_MESSAGES} locale category
-@item LC_MESSAGES
-Text messages. This is the default category for @code{gettext}
-operations, but it is possible to supply a different one explicitly,
-if necessary. (It is almost never necessary to supply a different category.)
-
-@cindex sorting characters in different languages
-@cindex @code{LC_COLLATE} locale category
-@item LC_COLLATE
-Text-collation information; i.e., how different characters
-and/or groups of characters sort in a given language.
-
-@cindex @code{LC_CTYPE} locale category
-@item LC_CTYPE
-Character-type information (alphabetic, digit, upper- or lowercase, and
-so on).
-This information is accessed via the
-POSIX character classes in regular expressions,
-such as @code{/[[:alnum:]]/}
-(@pxref{Regexp Operators}).
-
-@cindex monetary information, localization
-@cindex currency symbols, localization
-@cindex @code{LC_MONETARY} locale category
-@item LC_MONETARY
-Monetary information, such as the currency symbol, and whether the
-symbol goes before or after a number.
-
-@cindex @code{LC_NUMERIC} locale category
-@item LC_NUMERIC
-Numeric information, such as which characters to use for the decimal
-point and the thousands separator.@footnote{Americans
-use a comma every three decimal places and a period for the decimal
-point, while many Europeans do exactly the opposite:
-1,234.56 versus 1.234,56.}
-
-@cindex @code{LC_RESPONSE} locale category
-@item LC_RESPONSE
-Response information, such as how ``yes'' and ``no'' appear in the
-local language, and possibly other information as well.
-
-@cindex time, localization and
-@cindex dates, information related to@comma{} localization
-@cindex @code{LC_TIME} locale category
-@item LC_TIME
-Time- and date-related information, such as 12- or 24-hour clock, month printed
-before or after the day in a date, local month abbreviations, and so on.
-
-@cindex @code{LC_ALL} locale category
-@item LC_ALL
-All of the above. (Not too useful in the context of @code{gettext}.)
-@end table
-@c ENDOFRANGE gettex
-
-@node Programmer i18n
-@section Internationalizing @command{awk} Programs
-@c STARTOFRANGE inap
-@cindex @command{awk} programs, internationalizing
-
-@command{gawk} provides the following variables and functions for
-internationalization:
-
-@table @code
-@cindex @code{TEXTDOMAIN} variable
-@item TEXTDOMAIN
-This variable indicates the application's text domain.
-For compatibility with GNU @code{gettext}, the default
-value is @code{"messages"}.
-
-@cindex internationalization, localization, marked strings
-@cindex strings, for localization
-@item _"your message here"
-String constants marked with a leading underscore
-are candidates for translation at runtime.
-String constants without a leading underscore are not translated.
-
-@cindex @code{dcgettext()} function (@command{gawk})
-@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
-Return the translation of @var{string} in
-text domain @var{domain} for locale category @var{category}.
-The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
-The default value for @var{category} is @code{"LC_MESSAGES"}.
-
-If you supply a value for @var{category}, it must be a string equal to
-one of the known locale categories described in
-@ifnotinfo
-the previous @value{SECTION}.
-@end ifnotinfo
-@ifinfo
-@ref{Explaining gettext}.
-@end ifinfo
-You must also supply a text domain. Use @code{TEXTDOMAIN} if
-you want to use the current domain.
-
-@quotation CAUTION
-The order of arguments to the @command{awk} version
-of the @code{dcgettext()} function is purposely different from the order for
-the C version. The @command{awk} version's order was
-chosen to be simple and to allow for reasonable @command{awk}-style
-default arguments.
-@end quotation
-
-@cindex @code{dcngettext()} function (@command{gawk})
-@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
-Return the plural form used for @var{number} of the
-translation of @var{string1} and @var{string2} in text domain
-@var{domain} for locale category @var{category}. @var{string1} is the
-English singular variant of a message, and @var{string2} the English plural
-variant of the same message.
-The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
-The default value for @var{category} is @code{"LC_MESSAGES"}.
-
-The same remarks about argument order as for the @code{dcgettext()} function apply.
-
-@cindex @code{.mo} files, specifying directory of
-@cindex files, @code{.mo}, specifying directory of
-@cindex message object files, specifying directory of
-@cindex files, message object, specifying directory of
-@cindex @code{bindtextdomain()} function (@command{gawk})
-@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]})
-Change the directory in which
-@code{gettext} looks for @file{.mo} files, in case they
-will not or cannot be placed in the standard locations
-(e.g., during testing).
-Return the directory in which @var{domain} is ``bound.''
-
-The default @var{domain} is the value of @code{TEXTDOMAIN}.
-If @var{directory} is the null string (@code{""}), then
-@code{bindtextdomain()} returns the current binding for the
-given @var{domain}.
-@end table
-
-To use these facilities in your @command{awk} program, follow the steps
-outlined in
-@ifnotinfo
-the previous @value{SECTION},
-@end ifnotinfo
-@ifinfo
-@ref{Explaining gettext},
-@end ifinfo
-like so:
-
-@enumerate
-@cindex @code{BEGIN} pattern, @code{TEXTDOMAIN} variable and
-@cindex @code{TEXTDOMAIN} variable, @code{BEGIN} pattern and
-@item
-Set the variable @code{TEXTDOMAIN} to the text domain of
-your program. This is best done in a @code{BEGIN} rule
-(@pxref{BEGIN/END}),
-or it can also be done via the @option{-v} command-line
-option (@pxref{Options}):
-
-@example
-BEGIN @{
- TEXTDOMAIN = "guide"
- @dots{}
-@}
-@end example
-
-@cindex @code{_} (underscore), translatable string
-@cindex underscore (@code{_}), translatable string
-@item
-Mark all translatable strings with a leading underscore (@samp{_})
-character. It @emph{must} be adjacent to the opening
-quote of the string. For example:
-
-@example
-print _"hello, world"
-x = _"you goofed"
-printf(_"Number of users is %d\n", nusers)
-@end example
-
-@item
-If you are creating strings dynamically, you can
-still translate them, using the @code{dcgettext()}
-built-in function:
-
-@example
-message = nusers " users logged in"
-message = dcgettext(message, "adminprog")
-print message
-@end example
-
-Here, the call to @code{dcgettext()} supplies a different
-text domain (@code{"adminprog"}) in which to find the
-message, but it uses the default @code{"LC_MESSAGES"} category.
-
-@cindex @code{LC_MESSAGES} locale category, @code{bindtextdomain()} function (@command{gawk})
-@item
-During development, you might want to put the @file{.mo}
-file in a private directory for testing. This is done
-with the @code{bindtextdomain()} built-in function:
-
-@example
-BEGIN @{
- TEXTDOMAIN = "guide" # our text domain
- if (Testing) @{
- # where to find our files
- bindtextdomain("testdir")
- # joe is in charge of adminprog
- bindtextdomain("../joe/testdir", "adminprog")
- @}
- @dots{}
-@}
-@end example
-
-@end enumerate
-
-@xref{I18N Example},
-for an example program showing the steps to create
-and use translations from @command{awk}.
-
-@node Translator i18n
-@section Translating @command{awk} Programs
-
-@cindex @code{.po} files
-@cindex files, @code{.po}
-@cindex portable object files
-@cindex files, portable object
-Once a program's translatable strings have been marked, they must
-be extracted to create the initial @file{.po} file.
-As part of translation, it is often helpful to rearrange the order
-in which arguments to @code{printf} are output.
-
-@command{gawk}'s @option{--gen-pot} command-line option extracts
-the messages and is discussed next.
-After that, @code{printf}'s ability to
-rearrange the order for @code{printf} arguments at runtime
-is covered.
-
-@menu
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging @code{printf} arguments.
-* I18N Portability:: @command{awk}-level portability issues.
-@end menu
-
-@node String Extraction
-@subsection Extracting Marked Strings
-@cindex strings, extracting
-@cindex marked strings@comma{} extracting
-@cindex @code{--gen-pot} option
-@cindex command-line options, string extraction
-@cindex string extraction (internationalization)
-@cindex marked string extraction (internationalization)
-@cindex extraction, of marked strings (internationalization)
-
-@cindex @code{--gen-pot} option
-Once your @command{awk} program is working, and all the strings have
-been marked and you've set (and perhaps bound) the text domain,
-it is time to produce translations.
-First, use the @option{--gen-pot} command-line option to create
-the initial @file{.pot} file:
-
-@example
-$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
-@end example
-
-@cindex @code{xgettext} utility
-When run with @option{--gen-pot}, @command{gawk} does not execute your
-program. Instead, it parses it as usual and prints all marked strings
-to standard output in the format of a GNU @code{gettext} Portable Object
-file. Also included in the output are any constant strings that
-appear as the first argument to @code{dcgettext()} or as the first and
-second argument to @code{dcngettext()}.@footnote{The
-@command{xgettext} utility that comes with GNU
-@code{gettext} can handle @file{.awk} files.}
-@xref{I18N Example},
-for the full list of steps to go through to create and test
-translations for @command{guide}.
-
-@node Printf Ordering
-@subsection Rearranging @code{printf} Arguments
-
-@cindex @code{printf} statement, positional specifiers
-@cindex positional specifiers, @code{printf} statement
-Format strings for @code{printf} and @code{sprintf()}
-(@pxref{Printf})
-present a special problem for translation.
-Consider the following:@footnote{This example is borrowed
-from the GNU @code{gettext} manual.}
-
-@c line broken here only for smallbook format
-@example
-printf(_"String `%s' has %d characters\n",
- string, length(string)))
-@end example
-
-A possible German translation for this might be:
-
-@example
-"%d Zeichen lang ist die Zeichenkette `%s'\n"
-@end example
-
-The problem should be obvious: the order of the format
-specifications is different from the original!
-Even though @code{gettext()} can return the translated string
-at runtime,
-it cannot change the argument order in the call to @code{printf}.
-
-To solve this problem, @code{printf} format specifiers may have
-an additional optional element, which we call a @dfn{positional specifier}.
-For example:
-
-@example
-"%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
-@end example
-
-Here, the positional specifier consists of an integer count, which indicates which
-argument to use, and a @samp{$}. Counts are one-based, and the
-format string itself is @emph{not} included. Thus, in the following
-example, @samp{string} is the first argument and @samp{length(string)} is the second:
-
-@example
-$ @kbd{gawk 'BEGIN @{}
-> @kbd{string = "Dont Panic"}
-> @kbd{printf _"%2$d characters live in \"%1$s\"\n",}
-> @kbd{string, length(string)}
-> @kbd{@}'}
-@print{} 10 characters live in "Dont Panic"
-@end example
-
-If present, positional specifiers come first in the format specification,
-before the flags, the field width, and/or the precision.
-
-Positional specifiers can be used with the dynamic field width and
-precision capability:
-
-@example
-$ @kbd{gawk 'BEGIN @{}
-> @kbd{printf("%*.*s\n", 10, 20, "hello")}
-> @kbd{printf("%3$*2$.*1$s\n", 20, 10, "hello")}
-> @kbd{@}'}
-@print{} hello
-@print{} hello
-@end example
-
-@quotation NOTE
-When using @samp{*} with a positional specifier, the @samp{*}
-comes first, then the integer position, and then the @samp{$}.
-This is somewhat counterintuitive.
-@end quotation
-
-@cindex @code{printf} statement, positional specifiers, mixing with regular formats
-@cindex positional specifiers, @code{printf} statement, mixing with regular formats
-@cindex format specifiers, mixing regular with positional specifiers
-@command{gawk} does not allow you to mix regular format specifiers
-and those with positional specifiers in the same string:
-
-@example
-$ @kbd{gawk 'BEGIN @{ printf _"%d %3$s\n", 1, 2, "hi" @}'}
-@error{} gawk: cmd. line:1: fatal: must use `count$' on all formats or none
-@end example
-
-@quotation NOTE
-There are some pathological cases that @command{gawk} may fail to
-diagnose. In such cases, the output may not be what you expect.
-It's still a bad idea to try mixing them, even if @command{gawk}
-doesn't detect it.
-@end quotation
-
-Although positional specifiers can be used directly in @command{awk} programs,
-their primary purpose is to help in producing correct translations of
-format strings into languages different from the one in which the program
-is first written.
-
-@node I18N Portability
-@subsection @command{awk} Portability Issues
-
-@cindex portability, internationalization and
-@cindex internationalization, localization, portability and
-@command{gawk}'s internationalization features were purposely chosen to
-have as little impact as possible on the portability of @command{awk}
-programs that use them to other versions of @command{awk}.
-Consider this program:
-
-@example
-BEGIN @{
- TEXTDOMAIN = "guide"
- if (Test_Guide) # set with -v
- bindtextdomain("/test/guide/messages")
- print _"don't panic!"
-@}
-@end example
-
-@noindent
-As written, it won't work on other versions of @command{awk}.
-However, it is actually almost portable, requiring very little
-change:
-
-@itemize @bullet
-@cindex @code{TEXTDOMAIN} variable, portability and
-@item
-Assignments to @code{TEXTDOMAIN} won't have any effect,
-since @code{TEXTDOMAIN} is not special in other @command{awk} implementations.
-
-@item
-Non-GNU versions of @command{awk} treat marked strings
-as the concatenation of a variable named @code{_} with the string
-following it.@footnote{This is good fodder for an ``Obfuscated
-@command{awk}'' contest.} Typically, the variable @code{_} has
-the null string (@code{""}) as its value, leaving the original string constant as
-the result.
-
-@item
-By defining ``dummy'' functions to replace @code{dcgettext()}, @code{dcngettext()}
-and @code{bindtextdomain()}, the @command{awk} program can be made to run, but
-all the messages are output in the original language.
-For example:
-
-@cindex @code{bindtextdomain()} function (@command{gawk}), portability and
-@cindex @code{dcgettext()} function (@command{gawk}), portability and
-@cindex @code{dcngettext()} function (@command{gawk}), portability and
-@example
-@c file eg/lib/libintl.awk
-function bindtextdomain(dir, domain)
-@{
- return dir
-@}
-
-function dcgettext(string, domain, category)
-@{
- return string
-@}
-
-function dcngettext(string1, string2, number, domain, category)
-@{
- return (number == 1 ? string1 : string2)
-@}
-@c endfile
-@end example
-
-@item
-The use of positional specifications in @code{printf} or
-@code{sprintf()} is @emph{not} portable.
-To support @code{gettext()} at the C level, many systems' C versions of
-@code{sprintf()} do support positional specifiers. But it works only if
-enough arguments are supplied in the function call. Many versions of
-@command{awk} pass @code{printf} formats and arguments unchanged to the
-underlying C library version of @code{sprintf()}, but only one format and
-argument at a time. What happens if a positional specification is
-used is anybody's guess.
-However, since the positional specifications are primarily for use in
-@emph{translated} format strings, and since non-GNU @command{awk}s never
-retrieve the translated string, this should not be a problem in practice.
-@end itemize
-@c ENDOFRANGE inap
-
-@node I18N Example
-@section A Simple Internationalization Example
-
-Now let's look at a step-by-step example of how to internationalize and
-localize a simple @command{awk} program, using @file{guide.awk} as our
-original source:
-
-@example
-@c file eg/prog/guide.awk
-BEGIN @{
- TEXTDOMAIN = "guide"
- bindtextdomain(".") # for testing
- print _"Don't Panic"
- print _"The Answer Is", 42
- print "Pardon me, Zaphod who?"
-@}
-@c endfile
-@end example
-
-@noindent
-Run @samp{gawk --gen-pot} to create the @file{.pot} file:
-
-@example
-$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
-@end example
-
-@noindent
-This produces:
-
-@example
-@c file eg/data/guide.po
-#: guide.awk:4
-msgid "Don't Panic"
-msgstr ""
-
-#: guide.awk:5
-msgid "The Answer Is"
-msgstr ""
-
-@c endfile
-@end example
-
-This original portable object template file is saved and reused for each language
-into which the application is translated. The @code{msgid}
-is the original string and the @code{msgstr} is the translation.
-
-@quotation NOTE
-Strings not marked with a leading underscore do not
-appear in the @file{guide.pot} file.
-@end quotation
-
-Next, the messages must be translated.
-Here is a translation to a hypothetical dialect of English,
-called ``Mellow'':@footnote{Perhaps it would be better if it were
-called ``Hippy.'' Ah, well.}
-
-@example
-@group
-$ cp guide.pot guide-mellow.po
-@var{Add translations to} guide-mellow.po @dots{}
-@end group
-@end example
-
-@noindent
-Following are the translations:
-
-@example
-@c file eg/data/guide-mellow.po
-#: guide.awk:4
-msgid "Don't Panic"
-msgstr "Hey man, relax!"
-
-#: guide.awk:5
-msgid "The Answer Is"
-msgstr "Like, the scoop is"
-
-@c endfile
-@end example
-
-@cindex Linux
-@cindex GNU/Linux
-The next step is to make the directory to hold the binary message object
-file and then to create the @file{guide.mo} file.
-The directory layout shown here is standard for GNU @code{gettext} on
-GNU/Linux systems. Other versions of @code{gettext} may use a different
-layout:
-
-@example
-$ @kbd{mkdir en_US en_US/LC_MESSAGES}
-@end example
-
-@cindex @code{.po} files, converting to @code{.mo}
-@cindex files, @code{.po}, converting to @code{.mo}
-@cindex @code{.mo} files, converting from @code{.po}
-@cindex files, @code{.mo}, converting from @code{.po}
-@cindex portable object files, converting to message object files
-@cindex files, portable object, converting to message object files
-@cindex message object files, converting from portable object files
-@cindex files, message object, converting from portable object files
-@cindex @command{msgfmt} utility
-The @command{msgfmt} utility does the conversion from human-readable
-@file{.po} file to machine-readable @file{.mo} file.
-By default, @command{msgfmt} creates a file named @file{messages}.
-This file must be renamed and placed in the proper directory so that
-@command{gawk} can find it:
-
-@example
-$ @kbd{msgfmt guide-mellow.po}
-$ @kbd{mv messages en_US/LC_MESSAGES/guide.mo}
-@end example
-
-Finally, we run the program to test it:
-
-@example
-$ @kbd{gawk -f guide.awk}
-@print{} Hey man, relax!
-@print{} Like, the scoop is 42
-@print{} Pardon me, Zaphod who?
-@end example
-
-If the three replacement functions for @code{dcgettext()}, @code{dcngettext()}
-and @code{bindtextdomain()}
-(@pxref{I18N Portability})
-are in a file named @file{libintl.awk},
-then we can run @file{guide.awk} unchanged as follows:
-
-@example
-$ @kbd{gawk --posix -f guide.awk -f libintl.awk}
-@print{} Don't Panic
-@print{} The Answer Is 42
-@print{} Pardon me, Zaphod who?
-@end example
-
-@node Gawk I18N
-@section @command{gawk} Can Speak Your Language
-
-@command{gawk} itself has been internationalized
-using the GNU @code{gettext} package.
-(GNU @code{gettext} is described in
-complete detail in
-@ifinfo
-@inforef{Top, , GNU @code{gettext} utilities, gettext, GNU gettext tools}.)
-@end ifinfo
-@ifnotinfo
-@cite{GNU gettext tools}.)
-@end ifnotinfo
-As of this writing, the latest version of GNU @code{gettext} is
-@uref{ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.1.tar.gz, @value{PVERSION} 0.18.1}.
-
-If a translation of @command{gawk}'s messages exists,
-then @command{gawk} produces usage messages, warnings,
-and fatal errors in the local language.
-@c ENDOFRANGE inloc
+@iftex
+@part Part II:@* Problem Solving With @command{awk}
+@end iftex
-@node Advanced Features
-@chapter Advanced Features of @command{gawk}
-@cindex advanced features, network connections, See Also networks, connections
-@c STARTOFRANGE gawadv
-@cindex @command{gawk}, features, advanced
-@c STARTOFRANGE advgaw
-@cindex advanced features, @command{gawk}
@ignore
-Contributed by: Peter Langston <pud!psl@bellcore.bellcore.com>
-
- Found in Steve English's "signature" line:
-
-"Write documentation as if whoever reads it is a violent psychopath
-who knows where you live."
-@end ignore
-@quotation
-@i{Write documentation as if whoever reads it is
-a violent psychopath who knows where you live.}@*
-Steve English, as quoted by Peter Langston
-@end quotation
-
-This @value{CHAPTER} discusses advanced features in @command{gawk}.
-It's a bit of a ``grab bag'' of items that are otherwise unrelated
-to each other.
-First, a command-line option allows @command{gawk} to recognize
-nondecimal numbers in input data, not just in @command{awk}
-programs.
-Then, @command{gawk}'s special features for sorting arrays are presented.
-Next, two-way I/O, discussed briefly in earlier parts of this
-@value{DOCUMENT}, is described in full detail, along with the basics
-of TCP/IP networking. Finally, @command{gawk}
-can @dfn{profile} an @command{awk} program, making it possible to tune
-it for performance.
-
-@ref{Dynamic Extensions},
-discusses the ability to dynamically add new built-in functions to
-@command{gawk}. As this feature is still immature and likely to change,
-its description is relegated to an appendix.
-
-@menu
-* Nondecimal Data:: Allowing nondecimal input data.
-* Array Sorting:: Facilities for controlling array traversal and
- sorting arrays.
-* Two-way I/O:: Two-way communications with another process.
-* TCP/IP Networking:: Using @command{gawk} for network programming.
-* Profiling:: Profiling your @command{awk} programs.
-@end menu
-
-@node Nondecimal Data
-@section Allowing Nondecimal Input Data
-@cindex @code{--non-decimal-data} option
-@cindex advanced features, @command{gawk}, nondecimal input data
-@cindex input, data@comma{} nondecimal
-@cindex constants, nondecimal
-
-If you run @command{gawk} with the @option{--non-decimal-data} option,
-you can have nondecimal constants in your input data:
-
-@c line break here for small book format
-@example
-$ @kbd{echo 0123 123 0x123 |}
-> @kbd{gawk --non-decimal-data '@{ printf "%d, %d, %d\n",}
-> @kbd{$1, $2, $3 @}'}
-@print{} 83, 123, 291
-@end example
-
-For this feature to work, write your program so that
-@command{gawk} treats your data as numeric:
-
-@example
-$ @kbd{echo 0123 123 0x123 | gawk '@{ print $1, $2, $3 @}'}
-@print{} 0123 123 0x123
-@end example
-
-@noindent
-The @code{print} statement treats its expressions as strings.
-Although the fields can act as numbers when necessary,
-they are still strings, so @code{print} does not try to treat them
-numerically. You may need to add zero to a field to force it to
-be treated as a number. For example:
-
-@example
-$ @kbd{echo 0123 123 0x123 | gawk --non-decimal-data '}
-> @kbd{@{ print $1, $2, $3}
-> @kbd{print $1 + 0, $2 + 0, $3 + 0 @}'}
-@print{} 0123 123 0x123
-@print{} 83 123 291
-@end example
-
-Because it is common to have decimal data with leading zeros, and because
-using this facility could lead to surprising results, the default is to leave it
-disabled. If you want it, you must explicitly request it.
-
-@cindex programming conventions, @code{--non-decimal-data} option
-@cindex @code{--non-decimal-data} option, @code{strtonum()} function and
-@cindex @code{strtonum()} function (@command{gawk}), @code{--non-decimal-data} option and
-@quotation CAUTION
-@emph{Use of this option is not recommended.}
-It can break old programs very badly.
-Instead, use the @code{strtonum()} function to convert your data
-(@pxref{Nondecimal-numbers}).
-This makes your programs easier to write and easier to read, and
-leads to less surprising results.
-@end quotation
-
-@node Array Sorting
-@section Controlling Array Traversal and Array Sorting
-
-@command{gawk} lets you control the order in which a @samp{for (i in array)}
-loop traverses an array.
-
-In addition, two built-in functions, @code{asort()} and @code{asorti()},
-let you sort arrays based on the array values and indices, respectively.
-These two functions also provide control over the sorting criteria used
-to order the elements during sorting.
-
-@menu
-* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
-* Array Sorting Functions:: How to use @code{asort()} and @code{asorti()}.
-@end menu
-
-@node Controlling Array Traversal
-@subsection Controlling Array Traversal
-
-By default, the order in which a @samp{for (i in array)} loop
-scans an array is not defined; it is generally based upon
-the internal implementation of arrays inside @command{awk}.
-
-Often, though, it is desirable to be able to loop over the elements
-in a particular order that you, the programmer, choose. @command{gawk}
-lets you do this.
-
-@ref{Controlling Scanning}, describes how you can assign special,
-pre-defined values to @code{PROCINFO["sorted_in"]} in order to
-control the order in which @command{gawk} will traverse an array
-during a @code{for} loop.
-
-In addition, the value of @code{PROCINFO["sorted_in"]} can be a function name.
-This lets you traverse an array based on any custom criterion.
-The array elements are ordered according to the return value of this
-function. The comparison function should be defined with at least
-four arguments:
-
-@example
-function comp_func(i1, v1, i2, v2)
-@{
- @var{compare elements 1 and 2 in some fashion}
- @var{return < 0; 0; or > 0}
-@}
-@end example
-
-Here, @var{i1} and @var{i2} are the indices, and @var{v1} and @var{v2}
-are the corresponding values of the two elements being compared.
-Either @var{v1} or @var{v2}, or both, can be arrays if the array being
-traversed contains subarrays as values.
-(@xref{Arrays of Arrays}, for more information about subarrays.)
-The three possible return values are interpreted as follows:
-
-@table @code
-@item comp_func(i1, v1, i2, v2) < 0
-Index @var{i1} comes before index @var{i2} during loop traversal.
-
-@item comp_func(i1, v1, i2, v2) == 0
-Indices @var{i1} and @var{i2}
-come together but the relative order with respect to each other is undefined.
-
-@item comp_func(i1, v1, i2, v2) > 0
-Index @var{i1} comes after index @var{i2} during loop traversal.
-@end table
-
-Our first comparison function can be used to scan an array in
-numerical order of the indices:
-
-@example
-function cmp_num_idx(i1, v1, i2, v2)
-@{
- # numerical index comparison, ascending order
- return (i1 - i2)
-@}
-@end example
-
-Our second function traverses an array based on the string order of
-the element values rather than by indices:
-
-@example
-function cmp_str_val(i1, v1, i2, v2)
-@{
- # string value comparison, ascending order
- v1 = v1 ""
- v2 = v2 ""
- if (v1 < v2)
- return -1
- return (v1 != v2)
-@}
-@end example
-
-The third
-comparison function makes all numbers, and numeric strings without
-any leading or trailing spaces, come out first during loop traversal:
-
-@example
-function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
-@{
- # numbers before string value comparison, ascending order
- n1 = v1 + 0
- n2 = v2 + 0
- if (n1 == v1)
- return (n2 == v2) ? (n1 - n2) : -1
- else if (n2 == v2)
- return 1
- return (v1 < v2) ? -1 : (v1 != v2)
-@}
-@end example
-
-Here is a main program to demonstrate how @command{gawk}
-behaves using each of the previous functions:
-
-@example
-BEGIN @{
- data["one"] = 10
- data["two"] = 20
- data[10] = "one"
- data[100] = 100
- data[20] = "two"
-
- f[1] = "cmp_num_idx"
- f[2] = "cmp_str_val"
- f[3] = "cmp_num_str_val"
- for (i = 1; i <= 3; i++) @{
- printf("Sort function: %s\n", f[i])
- PROCINFO["sorted_in"] = f[i]
- for (j in data)
- printf("\tdata[%s] = %s\n", j, data[j])
- print ""
- @}
-@}
-@end example
-
-Here are the results when the program is run:
-@page
-
-@example
-$ @kbd{gawk -f compdemo.awk}
-@print{} Sort function: cmp_num_idx @ii{Sort by numeric index}
-@print{} data[two] = 20
-@print{} data[one] = 10 @ii{Both strings are numerically zero}
-@print{} data[10] = one
-@print{} data[20] = two
-@print{} data[100] = 100
-@print{}
-@print{} Sort function: cmp_str_val @ii{Sort by element values as strings}
-@print{} data[one] = 10
-@print{} data[100] = 100 @ii{String 100 is less than string 20}
-@print{} data[two] = 20
-@print{} data[10] = one
-@print{} data[20] = two
-@print{}
-@print{} Sort function: cmp_num_str_val @ii{Sort all numeric values before all strings}
-@print{} data[one] = 10
-@print{} data[two] = 20
-@print{} data[100] = 100
-@print{} data[10] = one
-@print{} data[20] = two
-@end example
-
-Consider sorting the entries of a GNU/Linux system password file
-according to login name. The following program sorts records
-by a specific field position and can be used for this purpose:
-
-@example
-# sort.awk --- simple program to sort by field position
-# field position is specified by the global variable POS
-
-function cmp_field(i1, v1, i2, v2)
-@{
- # comparison by value, as string, and ascending order
- return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
-@}
-
-@{
- for (i = 1; i <= NF; i++)
- a[NR][i] = $i
-@}
-
-END @{
- PROCINFO["sorted_in"] = "cmp_field"
- if (POS < 1 || POS > NF)
- POS = 1
- for (i in a) @{
- for (j = 1; j <= NF; j++)
- printf("%s%c", a[i][j], j < NF ? ":" : "")
- print ""
- @}
-@}
-@end example
-
-The first field in each entry of the password file is the user's login name,
-and the fields are separated by colons.
-Each record defines a subarray,
-with each field as an element in the subarray.
-Running the program produces the
-following output:
-
-@example
-$ @kbd{gawk -v POS=1 -F: -f sort.awk /etc/passwd}
-@print{} adm:x:3:4:adm:/var/adm:/sbin/nologin
-@print{} apache:x:48:48:Apache:/var/www:/sbin/nologin
-@print{} avahi:x:70:70:Avahi daemon:/:/sbin/nologin
-@dots{}
-@end example
-
-The comparison should normally always return the same value when given a
-specific pair of array elements as its arguments. If inconsistent
-results are returned then the order is undefined. This behavior can be
-exploited to introduce random order into otherwise seemingly
-ordered data:
-
-@example
-function cmp_randomize(i1, v1, i2, v2)
-@{
- # random order
- return (2 - 4 * rand())
-@}
-@end example
-
-As mentioned above, the order of the indices is arbitrary if two
-elements compare equal. This is usually not a problem, but letting
-the tied elements come out in arbitrary order can be an issue, especially
-when comparing item values. The partial ordering of the equal elements
-may change during the next loop traversal, if other elements are added or
-removed from the array. One way to resolve ties when comparing elements
-with otherwise equal values is to include the indices in the comparison
-rules. Note that doing this may make the loop traversal less efficient,
-so consider it only if necessary. The following comparison functions
-force a deterministic order, and are based on the fact that the
-indices of two elements are never equal:
-
-@example
-function cmp_numeric(i1, v1, i2, v2)
-@{
- # numerical value (and index) comparison, descending order
- return (v1 != v2) ? (v2 - v1) : (i2 - i1)
-@}
-
-function cmp_string(i1, v1, i2, v2)
-@{
- # string value (and index) comparison, descending order
- v1 = v1 i1
- v2 = v2 i2
- return (v1 > v2) ? -1 : (v1 != v2)
-@}
-@end example
-
-@c Avoid using the term ``stable'' when describing the unpredictable behavior
-@c if two items compare equal. Usually, the goal of a "stable algorithm"
-@c is to maintain the original order of the items, which is a meaningless
-@c concept for a list constructed from a hash.
-
-A custom comparison function can often simplify ordered loop
-traversal, and the sky is really the limit when it comes to
-designing such a function.
-
-When string comparisons are made during a sort, either for element
-values where one or both aren't numbers, or for element indices
-handled as strings, the value of @code{IGNORECASE}
-(@pxref{Built-in Variables}) controls whether
-the comparisons treat corresponding uppercase and lowercase letters as
-equivalent or distinct.
-
-Another point to keep in mind is that in the case of subarrays
-the element values can themselves be arrays; a production comparison
-function should use the @code{isarray()} function
-(@pxref{Type Functions}),
-to check for this, and choose a defined sorting order for subarrays.
-
-All sorting based on @code{PROCINFO["sorted_in"]}
-is disabled in POSIX mode,
-since the @code{PROCINFO} array is not special in that case.
-
-As a side note, sorting the array indices before traversing
-the array has been reported to add 15% to 20% overhead to the
-execution time of @command{awk} programs. For this reason,
-sorted array traversal is not the default.
-
-@c The @command{gawk}
-@c maintainers believe that only the people who wish to use a
-@c feature should have to pay for it.
-
-@node Array Sorting Functions
-@subsection Sorting Array Values and Indices with @command{gawk}
-
-@cindex arrays, sorting
-@cindex @code{asort()} function (@command{gawk})
-@cindex @code{asort()} function (@command{gawk}), arrays@comma{} sorting
-@cindex sort function, arrays, sorting
-In most @command{awk} implementations, sorting an array requires
-writing a @code{sort()} function.
-While this can be educational for exploring different sorting algorithms,
-usually that's not the point of the program.
-@command{gawk} provides the built-in @code{asort()}
-and @code{asorti()} functions
-(@pxref{String Functions})
-for sorting arrays. For example:
-
-@example
-@var{populate the array} data
-n = asort(data)
-for (i = 1; i <= n; i++)
- @var{do something with} data[i]
-@end example
-
-After the call to @code{asort()}, the array @code{data} is indexed from 1
-to some number @var{n}, the total number of elements in @code{data}.
-(This count is @code{asort()}'s return value.)
-@code{data[1]} @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
-The comparison is based on the type of the elements
-(@pxref{Typing and Comparison}).
-All numeric values come before all string values,
-which in turn come before all subarrays.
-
-@cindex side effects, @code{asort()} function
-An important side effect of calling @code{asort()} is that
-@emph{the array's original indices are irrevocably lost}.
-As this isn't always desirable, @code{asort()} accepts a
-second argument:
-
-@example
-@var{populate the array} source
-n = asort(source, dest)
-for (i = 1; i <= n; i++)
- @var{do something with} dest[i]
-@end example
-
-In this case, @command{gawk} copies the @code{source} array into the
-@code{dest} array and then sorts @code{dest}, destroying its indices.
-However, the @code{source} array is not affected.
-
-@code{asort()} accepts a third string argument to control comparison of
-array elements. As with @code{PROCINFO["sorted_in"]}, this argument
-may be one of the predefined names that @command{gawk} provides
-(@pxref{Controlling Scanning}), or the name of a user-defined function
-(@pxref{Controlling Array Traversal}).
-
-@quotation NOTE
-In all cases, the sorted element values consist of the original
-array's element values. The ability to control comparison merely
-affects the way in which they are sorted.
-@end quotation
-
-Often, what's needed is to sort on the values of the @emph{indices}
-instead of the values of the elements.
-To do that, use the
-@code{asorti()} function. The interface is identical to that of
-@code{asort()}, except that the index values are used for sorting, and
-become the values of the result array:
-
-@example
-@{ source[$0] = some_func($0) @}
-
-END @{
- n = asorti(source, dest)
- for (i = 1; i <= n; i++) @{
- @ii{Work with sorted indices directly:}
- @var{do something with} dest[i]
- @dots{}
- @ii{Access original array via sorted indices:}
- @var{do something with} source[dest[i]]
- @}
-@}
-@end example
-
-Similar to @code{asort()},
-in all cases, the sorted element values consist of the original
-array's indices. The ability to control comparison merely
-affects the way in which they are sorted.
-
-Sorting the array by replacing the indices provides maximal flexibility.
-To traverse the elements in decreasing order, use a loop that goes from
-@var{n} down to 1, either over the elements or over the indices.@footnote{You
-may also use one of the predefined sorting names that sorts in
-decreasing order.}
-
-@cindex reference counting, sorting arrays
-Copying array indices and elements isn't expensive in terms of memory.
-Internally, @command{gawk} maintains @dfn{reference counts} to data.
-For example, when @code{asort()} copies the first array to the second one,
-there is only one copy of the original array elements' data, even though
-both arrays use the values.
-
-@c Document It And Call It A Feature. Sigh.
-@cindex @command{gawk}, @code{IGNORECASE} variable in
-@cindex @code{IGNORECASE} variable
-@cindex arrays, sorting, @code{IGNORECASE} variable and
-@cindex @code{IGNORECASE} variable, array sorting and
-Because @code{IGNORECASE} affects string comparisons, the value
-of @code{IGNORECASE} also affects sorting for both @code{asort()} and @code{asorti()}.
-Note also that the locale's sorting order does @emph{not}
-come into play; comparisons are based on character values only.@footnote{This
-is true because locale-based comparison occurs only when in POSIX
-compatibility mode, and since @code{asort()} and @code{asorti()} are
-@command{gawk} extensions, they are not available in that case.}
-Caveat Emptor.
-
-@node Two-way I/O
-@section Two-Way Communications with Another Process
-@cindex Brennan, Michael
-@cindex programmers, attractiveness of
-@smallexample
-@c Path: cssun.mathcs.emory.edu!gatech!newsxfer3.itd.umich.edu!news-peer.sprintlink.net!news-sea-19.sprintlink.net!news-in-west.sprintlink.net!news.sprintlink.net!Sprint!204.94.52.5!news.whidbey.com!brennan
-From: brennan@@whidbey.com (Mike Brennan)
-Newsgroups: comp.lang.awk
-Subject: Re: Learn the SECRET to Attract Women Easily
-Date: 4 Aug 1997 17:34:46 GMT
-@c Organization: WhidbeyNet
-@c Lines: 12
-Message-ID: <5s53rm$eca@@news.whidbey.com>
-@c References: <5s20dn$2e1@chronicle.concentric.net>
-@c Reply-To: brennan@whidbey.com
-@c NNTP-Posting-Host: asn202.whidbey.com
-@c X-Newsreader: slrn (0.9.4.1 UNIX)
-@c Xref: cssun.mathcs.emory.edu comp.lang.awk:5403
-
-On 3 Aug 1997 13:17:43 GMT, Want More Dates???
-<tracy78@@kilgrona.com> wrote:
->Learn the SECRET to Attract Women Easily
->
->The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
-
-The scent of awk programmers is a lot more attractive to women than
-the scent of perl programmers.
---
-Mike Brennan
-@c brennan@@whidbey.com
-@end smallexample
-
-@cindex advanced features, @command{gawk}, processes@comma{} communicating with
-@cindex processes, two-way communications with
-It is often useful to be able to
-send data to a separate program for
-processing and then read the result. This can always be
-done with temporary files:
-
-@example
-# Write the data for processing
-tempfile = ("mydata." PROCINFO["pid"])
-while (@var{not done with data})
- print @var{data} | ("subprogram > " tempfile)
-close("subprogram > " tempfile)
-
-# Read the results, remove tempfile when done
-while ((getline newdata < tempfile) > 0)
- @var{process} newdata @var{appropriately}
-close(tempfile)
-system("rm " tempfile)
-@end example
-
-@noindent
-This works, but not elegantly. Among other things, it requires that
-the program be run in a directory that cannot be shared among users;
-for example, @file{/tmp} will not do, as another user might happen
-to be using a temporary file with the same name.
-
-@cindex coprocesses
-@cindex input/output, two-way
-@cindex @code{|} (vertical bar), @code{|&} operator (I/O)
-@cindex vertical bar (@code{|}), @code{|&} operator (I/O)
-@cindex @command{csh} utility, @code{|&} operator, comparison with
-However, with @command{gawk}, it is possible to
-open a @emph{two-way} pipe to another process. The second process is
-termed a @dfn{coprocess}, since it runs in parallel with @command{gawk}.
-The two-way connection is created using the @samp{|&} operator
-(borrowed from the Korn shell, @command{ksh}):@footnote{This is very
-different from the same operator in the C shell.}
-
-@example
-do @{
- print @var{data} |& "subprogram"
- "subprogram" |& getline results
-@} while (@var{data left to process})
-close("subprogram")
-@end example
-
-The first time an I/O operation is executed using the @samp{|&}
-operator, @command{gawk} creates a two-way pipeline to a child process
-that runs the other program. Output created with @code{print}
-or @code{printf} is written to the program's standard input, and
-output from the program's standard output can be read by the @command{gawk}
-program using @code{getline}.
-As is the case with processes started by @samp{|}, the subprogram
-can be any program, or pipeline of programs, that can be started by
-the shell.
+@ifdocbook
+@part Part II:@* Problem Solving With @command{awk}
-There are some cautionary items to be aware of:
+Part II shows how to use @command{awk} and @command{gawk} for problem solving.
+There is lots of code here for you to read and learn from.
+It contains the following chapters:
@itemize @bullet
@item
-As the code inside @command{gawk} currently stands, the coprocess's
-standard error goes to the same place that the parent @command{gawk}'s
-standard error goes. It is not possible to read the child's
-standard error separately.
+@ref{Library Functions}.
-@cindex deadlocks
-@cindex buffering, input/output
-@cindex @code{getline} command, deadlock and
@item
-I/O buffering may be a problem. @command{gawk} automatically
-flushes all output down the pipe to the coprocess.
-However, if the coprocess does not flush its output,
-@command{gawk} may hang when doing a @code{getline} in order to read
-the coprocess's results. This could lead to a situation
-known as @dfn{deadlock}, where each process is waiting for the
-other one to do something.
+@ref{Sample Programs}.
@end itemize
-
-@cindex @code{close()} function, two-way pipes and
-It is possible to close just one end of the two-way pipe to
-a coprocess, by supplying a second argument to the @code{close()}
-function of either @code{"to"} or @code{"from"}
-(@pxref{Close Files And Pipes}).
-These strings tell @command{gawk} to close the end of the pipe
-that sends data to the coprocess or the end that reads from it,
-respectively.
-
-@cindex @command{sort} utility, coprocesses and
-This is particularly necessary in order to use
-the system @command{sort} utility as part of a coprocess;
-@command{sort} must read @emph{all} of its input
-data before it can produce any output.
-The @command{sort} program does not receive an end-of-file indication
-until @command{gawk} closes the write end of the pipe.
-
-When you have finished writing data to the @command{sort}
-utility, you can close the @code{"to"} end of the pipe, and
-then start reading sorted data via @code{getline}.
-For example:
-
-@example
-BEGIN @{
- command = "LC_ALL=C sort"
- n = split("abcdefghijklmnopqrstuvwxyz", a, "")
-
- for (i = n; i > 0; i--)
- print a[i] |& command
- close(command, "to")
-
- while ((command |& getline line) > 0)
- print "got", line
- close(command)
-@}
-@end example
-
-This program writes the letters of the alphabet in reverse order, one
-per line, down the two-way pipe to @command{sort}. It then closes the
-write end of the pipe, so that @command{sort} receives an end-of-file
-indication. This causes @command{sort} to sort the data and write the
-sorted data back to the @command{gawk} program. Once all of the data
-has been read, @command{gawk} terminates the coprocess and exits.
-
-As a side note, the assignment @samp{LC_ALL=C} in the @command{sort}
-command ensures traditional Unix (ASCII) sorting from @command{sort}.
-
-@cindex @command{gawk}, @code{PROCINFO} array in
-@cindex @code{PROCINFO} array
-You may also use pseudo-ttys (ptys) for
-two-way communication instead of pipes, if your system supports them.
-This is done on a per-command basis, by setting a special element
-in the @code{PROCINFO} array
-(@pxref{Auto-set}),
-like so:
-
-@example
-command = "sort -nr" # command, save in convenience variable
-PROCINFO[command, "pty"] = 1 # update PROCINFO
-print @dots{} |& command # start two-way pipe
-@dots{}
-@end example
-
-@noindent
-Using ptys avoids the buffer deadlock issues described earlier, at some
-loss in performance. If your system does not have ptys, or if all the
-system's ptys are in use, @command{gawk} automatically falls back to
-using regular pipes.
-
-@node TCP/IP Networking
-@section Using @command{gawk} for Network Programming
-@cindex advanced features, @command{gawk}, network programming
-@cindex networks, programming
-@c STARTOFRANGE tcpip
-@cindex TCP/IP
-@cindex @code{/inet/@dots{}} special files (@command{gawk})
-@cindex files, @code{/inet/@dots{}} (@command{gawk})
-@cindex @code{/inet4/@dots{}} special files (@command{gawk})
-@cindex files, @code{/inet4/@dots{}} (@command{gawk})
-@cindex @code{/inet6/@dots{}} special files (@command{gawk})
-@cindex files, @code{/inet6/@dots{}} (@command{gawk})
-@cindex @code{EMISTERED}
-@quotation
-@code{EMISTERED}:@*
-@ @ @ @ @i{A host is a host from coast to coast,@*
-@ @ @ @ and no-one can talk to host that's close,@*
-@ @ @ @ unless the host that isn't close@*
-@ @ @ @ is busy hung or dead.}
-@end quotation
-
-In addition to being able to open a two-way pipeline to a coprocess
-on the same system
-(@pxref{Two-way I/O}),
-it is possible to make a two-way connection to
-another process on another system across an IP network connection.
-
-You can think of this as just a @emph{very long} two-way pipeline to
-a coprocess.
-The way @command{gawk} decides that you want to use TCP/IP networking is
-by recognizing special @value{FN}s that begin with one of @samp{/inet/},
-@samp{/inet4/} or @samp{/inet6}.
-
-The full syntax of the special @value{FN} is
-@file{/@var{net-type}/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}.
-The components are:
-
-@table @var
-@item net-type
-Specifies the kind of Internet connection to make.
-Use @samp{/inet4/} to force IPv4, and
-@samp{/inet6/} to force IPv6.
-Plain @samp{/inet/} (which used to be the only option) uses
-the system default, most likely IPv4.
-
-@item protocol
-The protocol to use over IP. This must be either @samp{tcp}, or
-@samp{udp}, for a TCP or UDP IP connection,
-respectively. The use of TCP is recommended for most applications.
-
-@item local-port
-@cindex @code{getaddrinfo()} function (C library)
-The local TCP or UDP port number to use. Use a port number of @samp{0}
-when you want the system to pick a port. This is what you should do
-when writing a TCP or UDP client.
-You may also use a well-known service name, such as @samp{smtp}
-or @samp{http}, in which case @command{gawk} attempts to determine
-the predefined port number using the C @code{getaddrinfo()} function.
-
-@item remote-host
-The IP address or fully-qualified domain name of the Internet
-host to which you want to connect.
-
-@item remote-port
-The TCP or UDP port number to use on the given @var{remote-host}.
-Again, use @samp{0} if you don't care, or else a well-known
-service name.
-@end table
-
-@cindex @command{gawk}, @code{ERRNO} variable in
-@cindex @code{ERRNO} variable
-@quotation NOTE
-Failure in opening a two-way socket will result in a non-fatal error
-being returned to the calling code. The value of @code{ERRNO} indicates
-the error (@pxref{Auto-set}).
-@end quotation
-
-Consider the following very simple example:
-
-@example
-BEGIN @{
- Service = "/inet/tcp/0/localhost/daytime"
- Service |& getline
- print $0
- close(Service)
-@}
-@end example
-
-This program reads the current date and time from the local system's
-TCP @samp{daytime} server.
-It then prints the results and closes the connection.
-
-Because this topic is extensive, the use of @command{gawk} for
-TCP/IP programming is documented separately.
-@ifinfo
-See
-@inforef{Top, , General Introduction, gawkinet, TCP/IP Internetworking with @command{gawk}},
-@end ifinfo
-@ifnotinfo
-See @cite{TCP/IP Internetworking with @command{gawk}},
-which comes as part of the @command{gawk} distribution,
-@end ifnotinfo
-for a much more complete introduction and discussion, as well as
-extensive examples.
-
-@c ENDOFRANGE tcpip
-
-@node Profiling
-@section Profiling Your @command{awk} Programs
-@c STARTOFRANGE awkp
-@cindex @command{awk} programs, profiling
-@c STARTOFRANGE proawk
-@cindex profiling @command{awk} programs
-@cindex profiling @command{gawk}
-@cindex @code{awkprof.out} file
-@cindex files, @code{awkprof.out}
-
-You may produce execution traces of your @command{awk} programs.
-This is done by passing the option @option{--profile} to @command{gawk}.
-When @command{gawk} has finished running, it creates a profile of your program in a file
-named @file{awkprof.out}. Because it is profiling, it also executes up to 45% slower than
-@command{gawk} normally does.
-
-@cindex @code{--profile} option
-As shown in the following example,
-the @option{--profile} option can be used to change the name of the file
-where @command{gawk} will write the profile:
-
-@example
-gawk --profile=myprog.prof -f myprog.awk data1 data2
-@end example
-
-@noindent
-In the above example, @command{gawk} places the profile in
-@file{myprog.prof} instead of in @file{awkprof.out}.
-
-Here is a sample session showing a simple @command{awk} program, its input data, and the
-results from running @command{gawk} with the @option{--profile} option.
-First, the @command{awk} program:
-
-@example
-BEGIN @{ print "First BEGIN rule" @}
-
-END @{ print "First END rule" @}
-
-/foo/ @{
- print "matched /foo/, gosh"
- for (i = 1; i <= 3; i++)
- sing()
-@}
-
-@{
- if (/foo/)
- print "if is true"
- else
- print "else is true"
-@}
-
-BEGIN @{ print "Second BEGIN rule" @}
-
-END @{ print "Second END rule" @}
-
-function sing( dummy)
-@{
- print "I gotta be me!"
-@}
-@end example
-
-Following is the input data:
-
-@example
-foo
-bar
-baz
-foo
-junk
-@end example
-
-Here is the @file{awkprof.out} that results from running the @command{gawk}
-profiler on this program and data (this example also illustrates that @command{awk}
-programmers sometimes have to work late):
-
-@cindex @code{BEGIN} pattern
-@cindex @code{END} pattern
-@example
- # gawk profile, created Sun Aug 13 00:00:15 2000
-
- # BEGIN block(s)
-
- BEGIN @{
- 1 print "First BEGIN rule"
- 1 print "Second BEGIN rule"
- @}
-
- # Rule(s)
-
- 5 /foo/ @{ # 2
- 2 print "matched /foo/, gosh"
- 6 for (i = 1; i <= 3; i++) @{
- 6 sing()
- @}
- @}
-
- 5 @{
- 5 if (/foo/) @{ # 2
- 2 print "if is true"
- 3 @} else @{
- 3 print "else is true"
- @}
- @}
-
- # END block(s)
-
- END @{
- 1 print "First END rule"
- 1 print "Second END rule"
- @}
-
- # Functions, listed alphabetically
-
- 6 function sing(dummy)
- @{
- 6 print "I gotta be me!"
- @}
-@end example
-
-This example illustrates many of the basic features of profiling output.
-They are as follows:
-
-@itemize @bullet
-@item
-The program is printed in the order @code{BEGIN} rule,
-@code{BEGINFILE} rule,
-pattern/action rules,
-@code{ENDFILE} rule, @code{END} rule and functions, listed
-alphabetically.
-Multiple @code{BEGIN} and @code{END} rules are merged together,
-as are multiple @code{BEGINFILE} and @code{ENDFILE} rules.
-
-@cindex patterns, counts
-@item
-Pattern-action rules have two counts.
-The first count, to the left of the rule, shows how many times
-the rule's pattern was @emph{tested}.
-The second count, to the right of the rule's opening left brace
-in a comment,
-shows how many times the rule's action was @emph{executed}.
-The difference between the two indicates how many times the rule's
-pattern evaluated to false.
-
-@item
-Similarly,
-the count for an @code{if}-@code{else} statement shows how many times
-the condition was tested.
-To the right of the opening left brace for the @code{if}'s body
-is a count showing how many times the condition was true.
-The count for the @code{else}
-indicates how many times the test failed.
-
-@cindex loops, count for header
-@item
-The count for a loop header (such as @code{for}
-or @code{while}) shows how many times the loop test was executed.
-(Because of this, you can't just look at the count on the first
-statement in a rule to determine how many times the rule was executed.
-If the first statement is a loop, the count is misleading.)
-
-@cindex functions, user-defined, counts
-@cindex user-defined, functions, counts
-@item
-For user-defined functions, the count next to the @code{function}
-keyword indicates how many times the function was called.
-The counts next to the statements in the body show how many times
-those statements were executed.
-
-@cindex @code{@{@}} (braces)
-@cindex braces (@code{@{@}})
-@item
-The layout uses ``K&R'' style with TABs.
-Braces are used everywhere, even when
-the body of an @code{if}, @code{else}, or loop is only a single statement.
-
-@cindex @code{()} (parentheses)
-@cindex parentheses @code{()}
-@item
-Parentheses are used only where needed, as indicated by the structure
-of the program and the precedence rules.
-@c extra verbiage here satisfies the copyeditor. ugh.
-For example, @samp{(3 + 5) * 4} means add three plus five, then multiply
-the total by four. However, @samp{3 + 5 * 4} has no parentheses, and
-means @samp{3 + (5 * 4)}.
-
-@ignore
-@item
-All string concatenations are parenthesized too.
-(This could be made a bit smarter.)
+@end ifdocbook
@end ignore
-@item
-Parentheses are used around the arguments to @code{print}
-and @code{printf} only when
-the @code{print} or @code{printf} statement is followed by a redirection.
-Similarly, if
-the target of a redirection isn't a scalar, it gets parenthesized.
-
-@item
-@command{gawk} supplies leading comments in
-front of the @code{BEGIN} and @code{END} rules,
-the pattern/action rules, and the functions.
-
-@end itemize
-
-The profiled version of your program may not look exactly like what you
-typed when you wrote it. This is because @command{gawk} creates the
-profiled version by ``pretty printing'' its internal representation of
-the program. The advantage to this is that @command{gawk} can produce
-a standard representation. The disadvantage is that all source-code
-comments are lost, as are the distinctions among multiple @code{BEGIN},
-@code{END}, @code{BEGINFILE}, and @code{ENDFILE} rules. Also, things such as:
-
-@example
-/foo/
-@end example
-
-@noindent
-come out as:
-
-@example
-/foo/ @{
- print $0
-@}
-@end example
-
-@noindent
-which is correct, but possibly surprising.
-
-@cindex profiling @command{awk} programs, dynamically
-@cindex @command{gawk} program, dynamic profiling
-Besides creating profiles when a program has completed,
-@command{gawk} can produce a profile while it is running.
-This is useful if your @command{awk} program goes into an
-infinite loop and you want to see what has been executed.
-To use this feature, run @command{gawk} with the @option{--profile}
-option in the background:
-
-@example
-$ @kbd{gawk --profile -f myprog &}
-[1] 13992
-@end example
-
-@cindex @command{kill} command@comma{} dynamic profiling
-@cindex @code{USR1} signal
-@cindex @code{SIGUSR1} signal
-@cindex signals, @code{USR1}/@code{SIGUSR1}
-@noindent
-The shell prints a job number and process ID number; in this case, 13992.
-Use the @command{kill} command to send the @code{USR1} signal
-to @command{gawk}:
-
-@example
-$ @kbd{kill -USR1 13992}
-@end example
-
-@noindent
-As usual, the profiled version of the program is written to
-@file{awkprof.out}, or to a different file if one specified with
-the @option{--profile} option.
-
-Along with the regular profile, as shown earlier, the profile
-includes a trace of any active functions:
-
-@example
-# Function Call Stack:
-
-# 3. baz
-# 2. bar
-# 1. foo
-# -- main --
-@end example
-
-You may send @command{gawk} the @code{USR1} signal as many times as you like.
-Each time, the profile and function call trace are appended to the output
-profile file.
-
-@cindex @code{HUP} signal
-@cindex @code{SIGHUP} signal
-@cindex signals, @code{HUP}/@code{SIGHUP}
-If you use the @code{HUP} signal instead of the @code{USR1} signal,
-@command{gawk} produces the profile and the function call trace and then exits.
-
-@cindex @code{INT} signal (MS-Windows)
-@cindex @code{SIGINT} signal (MS-Windows)
-@cindex signals, @code{INT}/@code{SIGINT} (MS-Windows)
-@cindex @code{QUIT} signal (MS-Windows)
-@cindex @code{SIGQUIT} signal (MS-Windows)
-@cindex signals, @code{QUIT}/@code{SIGQUIT} (MS-Windows)
-When @command{gawk} runs on MS-Windows systems, it uses the
-@code{INT} and @code{QUIT} signals for producing the profile and, in
-the case of the @code{INT} signal, @command{gawk} exits. This is
-because these systems don't support the @command{kill} command, so the
-only signals you can deliver to a program are those generated by the
-keyboard. The @code{INT} signal is generated by the
-@kbd{@value{CTL}-@key{C}} or @kbd{@value{CTL}-@key{BREAK}} key, while the
-@code{QUIT} signal is generated by the @kbd{@value{CTL}-@key{\}} key.
-
-Finally, @command{gawk} also accepts another option @option{--pretty-print}.
-When called this way, @command{gawk} ``pretty prints'' the program into
-@file{awkprof.out}, without any execution counts.
-@c ENDOFRANGE advgaw
-@c ENDOFRANGE gawadv
-@c ENDOFRANGE awkp
-@c ENDOFRANGE proawk
-
@node Library Functions
@chapter A Library of @command{awk} Functions
@c STARTOFRANGE libf
@@ -25691,6 +23815,1921 @@ BEGIN {
}
@end ignore
+@iftex
+@part Part III:@* Moving Beyond Standard @command{awk} With @command{gawk}
+@end iftex
+
+@ignore
+@ifdocbook
+
+@part Part III:@* Moving Beyond Standard @command{awk} With @command{gawk}
+
+Part III focuses on features specific to @command{gawk}.
+It contains the following chapters:
+
+@itemize @bullet
+@item
+@ref{Internationalization}.
+
+@item
+@ref{Advanced Features}.
+
+@item
+@ref{Debugger}.
+
+@item
+@ref{Arbitrary Precision Arithmetic}.
+
+@item
+@ref{Dynamic Extensions}.
+@end ifdocbook
+@end ignore
+
+@node Internationalization
+@chapter Internationalization with @command{gawk}
+
+Once upon a time, computer makers
+wrote software that worked only in English.
+Eventually, hardware and software vendors noticed that if their
+systems worked in the native languages of non-English-speaking
+countries, they were able to sell more systems.
+As a result, internationalization and localization
+of programs and software systems became a common practice.
+
+@c STARTOFRANGE inloc
+@cindex internationalization, localization
+@cindex @command{gawk}, internationalization and, See internationalization
+@cindex internationalization, localization, @command{gawk} and
+For many years, the ability to provide internationalization
+was largely restricted to programs written in C and C++.
+This @value{CHAPTER} describes the underlying library @command{gawk}
+uses for internationalization, as well as how
+@command{gawk} makes internationalization
+features available at the @command{awk} program level.
+Having internationalization available at the @command{awk} level
+gives software developers additional flexibility---they are no
+longer forced to write in C or C++ when internationalization is
+a requirement.
+
+@menu
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU @code{gettext} works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: @command{gawk} is also internationalized.
+@end menu
+
+@node I18N and L10N
+@section Internationalization and Localization
+
+@cindex internationalization
+@cindex localization, See internationalization@comma{} localization
+@cindex localization
+@dfn{Internationalization} means writing (or modifying) a program once,
+in such a way that it can use multiple languages without requiring
+further source-code changes.
+@dfn{Localization} means providing the data necessary for an
+internationalized program to work in a particular language.
+Most typically, these terms refer to features such as the language
+used for printing error messages, the language used to read
+responses, and information related to how numerical and
+monetary values are printed and read.
+
+@node Explaining gettext
+@section GNU @code{gettext}
+
+@cindex internationalizing a program
+@c STARTOFRANGE gettex
+@cindex @code{gettext} library
+The facilities in GNU @code{gettext} focus on messages; strings printed
+by a program, either directly or via formatting with @code{printf} or
+@code{sprintf()}.@footnote{For some operating systems, the @command{gawk}
+port doesn't support GNU @code{gettext}.
+Therefore, these features are not available
+if you are using one of those operating systems. Sorry.}
+
+@cindex portability, @code{gettext} library and
+When using GNU @code{gettext}, each application has its own
+@dfn{text domain}. This is a unique name, such as @samp{kpilot} or @samp{gawk},
+that identifies the application.
+A complete application may have multiple components---programs written
+in C or C++, as well as scripts written in @command{sh} or @command{awk}.
+All of the components use the same text domain.
+
+To make the discussion concrete, assume we're writing an application
+named @command{guide}. Internationalization consists of the
+following steps, in this order:
+
+@enumerate
+@item
+The programmer goes
+through the source for all of @command{guide}'s components
+and marks each string that is a candidate for translation.
+For example, @code{"`-F': option required"} is a good candidate for translation.
+A table with strings of option names is not (e.g., @command{gawk}'s
+@option{--profile} option should remain the same, no matter what the local
+language).
+
+@cindex @code{textdomain()} function (C library)
+@item
+The programmer indicates the application's text domain
+(@code{"guide"}) to the @code{gettext} library,
+by calling the @code{textdomain()} function.
+
+@cindex @code{.pot} files
+@cindex files, @code{.pot}
+@cindex portable object template files
+@cindex files, portable object template
+@item
+Messages from the application are extracted from the source code and
+collected into a portable object template file (@file{guide.pot}),
+which lists the strings and their translations.
+The translations are initially empty.
+The original (usually English) messages serve as the key for
+lookup of the translations.
+
+@cindex @code{.po} files
+@cindex files, @code{.po}
+@cindex portable object files
+@cindex files, portable object
+@item
+For each language with a translator, @file{guide.pot}
+is copied to a portable object file (@code{.po})
+and translations are created and shipped with the application.
+For example, there might be a @file{fr.po} for a French translation.
+
+@cindex @code{.mo} files
+@cindex files, @code{.mo}
+@cindex message object files
+@cindex files, message object
+@item
+Each language's @file{.po} file is converted into a binary
+message object (@file{.mo}) file.
+A message object file contains the original messages and their
+translations in a binary format that allows fast lookup of translations
+at runtime.
+
+@item
+When @command{guide} is built and installed, the binary translation files
+are installed in a standard place.
+
+@cindex @code{bindtextdomain()} function (C library)
+@item
+For testing and development, it is possible to tell @code{gettext}
+to use @file{.mo} files in a different directory than the standard
+one by using the @code{bindtextdomain()} function.
+
+@cindex @code{.mo} files, specifying directory of
+@cindex files, @code{.mo}, specifying directory of
+@cindex message object files, specifying directory of
+@cindex files, message object, specifying directory of
+@item
+At runtime, @command{guide} looks up each string via a call
+to @code{gettext()}. The returned string is the translated string
+if available, or the original string if not.
+
+@item
+If necessary, it is possible to access messages from a different
+text domain than the one belonging to the application, without
+having to switch the application's default text domain back
+and forth.
+@end enumerate
+
+@cindex @code{gettext()} function (C library)
+In C (or C++), the string marking and dynamic translation lookup
+are accomplished by wrapping each string in a call to @code{gettext()}:
+
+@example
+printf("%s", gettext("Don't Panic!\n"));
+@end example
+
+The tools that extract messages from source code pull out all
+strings enclosed in calls to @code{gettext()}.
+
+@cindex @code{_} (underscore), @code{_} C macro
+@cindex underscore (@code{_}), @code{_} C macro
+The GNU @code{gettext} developers, recognizing that typing
+@samp{gettext(@dots{})} over and over again is both painful and ugly to look
+at, use the macro @samp{_} (an underscore) to make things easier:
+
+@example
+/* In the standard header file: */
+#define _(str) gettext(str)
+
+/* In the program text: */
+printf("%s", _("Don't Panic!\n"));
+@end example
+
+@cindex internationalization, localization, locale categories
+@cindex @code{gettext} library, locale categories
+@cindex locale categories
+@noindent
+This reduces the typing overhead to just three extra characters per string
+and is considerably easier to read as well.
+
+There are locale @dfn{categories}
+for different types of locale-related information.
+The defined locale categories that @code{gettext} knows about are:
+
+@table @code
+@cindex @code{LC_MESSAGES} locale category
+@item LC_MESSAGES
+Text messages. This is the default category for @code{gettext}
+operations, but it is possible to supply a different one explicitly,
+if necessary. (It is almost never necessary to supply a different category.)
+
+@cindex sorting characters in different languages
+@cindex @code{LC_COLLATE} locale category
+@item LC_COLLATE
+Text-collation information; i.e., how different characters
+and/or groups of characters sort in a given language.
+
+@cindex @code{LC_CTYPE} locale category
+@item LC_CTYPE
+Character-type information (alphabetic, digit, upper- or lowercase, and
+so on).
+This information is accessed via the
+POSIX character classes in regular expressions,
+such as @code{/[[:alnum:]]/}
+(@pxref{Regexp Operators}).
+
+@cindex monetary information, localization
+@cindex currency symbols, localization
+@cindex @code{LC_MONETARY} locale category
+@item LC_MONETARY
+Monetary information, such as the currency symbol, and whether the
+symbol goes before or after a number.
+
+@cindex @code{LC_NUMERIC} locale category
+@item LC_NUMERIC
+Numeric information, such as which characters to use for the decimal
+point and the thousands separator.@footnote{Americans
+use a comma every three decimal places and a period for the decimal
+point, while many Europeans do exactly the opposite:
+1,234.56 versus 1.234,56.}
+
+@cindex @code{LC_RESPONSE} locale category
+@item LC_RESPONSE
+Response information, such as how ``yes'' and ``no'' appear in the
+local language, and possibly other information as well.
+
+@cindex time, localization and
+@cindex dates, information related to@comma{} localization
+@cindex @code{LC_TIME} locale category
+@item LC_TIME
+Time- and date-related information, such as 12- or 24-hour clock, month printed
+before or after the day in a date, local month abbreviations, and so on.
+
+@cindex @code{LC_ALL} locale category
+@item LC_ALL
+All of the above. (Not too useful in the context of @code{gettext}.)
+@end table
+@c ENDOFRANGE gettex
+
+@node Programmer i18n
+@section Internationalizing @command{awk} Programs
+@c STARTOFRANGE inap
+@cindex @command{awk} programs, internationalizing
+
+@command{gawk} provides the following variables and functions for
+internationalization:
+
+@table @code
+@cindex @code{TEXTDOMAIN} variable
+@item TEXTDOMAIN
+This variable indicates the application's text domain.
+For compatibility with GNU @code{gettext}, the default
+value is @code{"messages"}.
+
+@cindex internationalization, localization, marked strings
+@cindex strings, for localization
+@item _"your message here"
+String constants marked with a leading underscore
+are candidates for translation at runtime.
+String constants without a leading underscore are not translated.
+
+@cindex @code{dcgettext()} function (@command{gawk})
+@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
+Return the translation of @var{string} in
+text domain @var{domain} for locale category @var{category}.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
+
+If you supply a value for @var{category}, it must be a string equal to
+one of the known locale categories described in
+@ifnotinfo
+the previous @value{SECTION}.
+@end ifnotinfo
+@ifinfo
+@ref{Explaining gettext}.
+@end ifinfo
+You must also supply a text domain. Use @code{TEXTDOMAIN} if
+you want to use the current domain.
+
+@quotation CAUTION
+The order of arguments to the @command{awk} version
+of the @code{dcgettext()} function is purposely different from the order for
+the C version. The @command{awk} version's order was
+chosen to be simple and to allow for reasonable @command{awk}-style
+default arguments.
+@end quotation
+
+@cindex @code{dcngettext()} function (@command{gawk})
+@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
+Return the plural form used for @var{number} of the
+translation of @var{string1} and @var{string2} in text domain
+@var{domain} for locale category @var{category}. @var{string1} is the
+English singular variant of a message, and @var{string2} the English plural
+variant of the same message.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
+
+The same remarks about argument order as for the @code{dcgettext()} function apply.
+
+@cindex @code{.mo} files, specifying directory of
+@cindex files, @code{.mo}, specifying directory of
+@cindex message object files, specifying directory of
+@cindex files, message object, specifying directory of
+@cindex @code{bindtextdomain()} function (@command{gawk})
+@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]})
+Change the directory in which
+@code{gettext} looks for @file{.mo} files, in case they
+will not or cannot be placed in the standard locations
+(e.g., during testing).
+Return the directory in which @var{domain} is ``bound.''
+
+The default @var{domain} is the value of @code{TEXTDOMAIN}.
+If @var{directory} is the null string (@code{""}), then
+@code{bindtextdomain()} returns the current binding for the
+given @var{domain}.
+@end table
+
+To use these facilities in your @command{awk} program, follow the steps
+outlined in
+@ifnotinfo
+the previous @value{SECTION},
+@end ifnotinfo
+@ifinfo
+@ref{Explaining gettext},
+@end ifinfo
+like so:
+
+@enumerate
+@cindex @code{BEGIN} pattern, @code{TEXTDOMAIN} variable and
+@cindex @code{TEXTDOMAIN} variable, @code{BEGIN} pattern and
+@item
+Set the variable @code{TEXTDOMAIN} to the text domain of
+your program. This is best done in a @code{BEGIN} rule
+(@pxref{BEGIN/END}),
+or it can also be done via the @option{-v} command-line
+option (@pxref{Options}):
+
+@example
+BEGIN @{
+ TEXTDOMAIN = "guide"
+ @dots{}
+@}
+@end example
+
+@cindex @code{_} (underscore), translatable string
+@cindex underscore (@code{_}), translatable string
+@item
+Mark all translatable strings with a leading underscore (@samp{_})
+character. It @emph{must} be adjacent to the opening
+quote of the string. For example:
+
+@example
+print _"hello, world"
+x = _"you goofed"
+printf(_"Number of users is %d\n", nusers)
+@end example
+
+@item
+If you are creating strings dynamically, you can
+still translate them, using the @code{dcgettext()}
+built-in function:
+
+@example
+message = nusers " users logged in"
+message = dcgettext(message, "adminprog")
+print message
+@end example
+
+Here, the call to @code{dcgettext()} supplies a different
+text domain (@code{"adminprog"}) in which to find the
+message, but it uses the default @code{"LC_MESSAGES"} category.
+
+@cindex @code{LC_MESSAGES} locale category, @code{bindtextdomain()} function (@command{gawk})
+@item
+During development, you might want to put the @file{.mo}
+file in a private directory for testing. This is done
+with the @code{bindtextdomain()} built-in function:
+
+@example
+BEGIN @{
+ TEXTDOMAIN = "guide" # our text domain
+ if (Testing) @{
+ # where to find our files
+ bindtextdomain("testdir")
+ # joe is in charge of adminprog
+ bindtextdomain("../joe/testdir", "adminprog")
+ @}
+ @dots{}
+@}
+@end example
+
+@end enumerate
+
+@xref{I18N Example},
+for an example program showing the steps to create
+and use translations from @command{awk}.
+
+@node Translator i18n
+@section Translating @command{awk} Programs
+
+@cindex @code{.po} files
+@cindex files, @code{.po}
+@cindex portable object files
+@cindex files, portable object
+Once a program's translatable strings have been marked, they must
+be extracted to create the initial @file{.po} file.
+As part of translation, it is often helpful to rearrange the order
+in which arguments to @code{printf} are output.
+
+@command{gawk}'s @option{--gen-pot} command-line option extracts
+the messages and is discussed next.
+After that, @code{printf}'s ability to
+rearrange the order for @code{printf} arguments at runtime
+is covered.
+
+@menu
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging @code{printf} arguments.
+* I18N Portability:: @command{awk}-level portability issues.
+@end menu
+
+@node String Extraction
+@subsection Extracting Marked Strings
+@cindex strings, extracting
+@cindex marked strings@comma{} extracting
+@cindex @code{--gen-pot} option
+@cindex command-line options, string extraction
+@cindex string extraction (internationalization)
+@cindex marked string extraction (internationalization)
+@cindex extraction, of marked strings (internationalization)
+
+@cindex @code{--gen-pot} option
+Once your @command{awk} program is working, and all the strings have
+been marked and you've set (and perhaps bound) the text domain,
+it is time to produce translations.
+First, use the @option{--gen-pot} command-line option to create
+the initial @file{.pot} file:
+
+@example
+$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
+@end example
+
+@cindex @code{xgettext} utility
+When run with @option{--gen-pot}, @command{gawk} does not execute your
+program. Instead, it parses it as usual and prints all marked strings
+to standard output in the format of a GNU @code{gettext} Portable Object
+file. Also included in the output are any constant strings that
+appear as the first argument to @code{dcgettext()} or as the first and
+second argument to @code{dcngettext()}.@footnote{The
+@command{xgettext} utility that comes with GNU
+@code{gettext} can handle @file{.awk} files.}
+@xref{I18N Example},
+for the full list of steps to go through to create and test
+translations for @command{guide}.
+
+@node Printf Ordering
+@subsection Rearranging @code{printf} Arguments
+
+@cindex @code{printf} statement, positional specifiers
+@cindex positional specifiers, @code{printf} statement
+Format strings for @code{printf} and @code{sprintf()}
+(@pxref{Printf})
+present a special problem for translation.
+Consider the following:@footnote{This example is borrowed
+from the GNU @code{gettext} manual.}
+
+@c line broken here only for smallbook format
+@example
+printf(_"String `%s' has %d characters\n",
+ string, length(string)))
+@end example
+
+A possible German translation for this might be:
+
+@example
+"%d Zeichen lang ist die Zeichenkette `%s'\n"
+@end example
+
+The problem should be obvious: the order of the format
+specifications is different from the original!
+Even though @code{gettext()} can return the translated string
+at runtime,
+it cannot change the argument order in the call to @code{printf}.
+
+To solve this problem, @code{printf} format specifiers may have
+an additional optional element, which we call a @dfn{positional specifier}.
+For example:
+
+@example
+"%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
+@end example
+
+Here, the positional specifier consists of an integer count, which indicates which
+argument to use, and a @samp{$}. Counts are one-based, and the
+format string itself is @emph{not} included. Thus, in the following
+example, @samp{string} is the first argument and @samp{length(string)} is the second:
+
+@example
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{string = "Dont Panic"}
+> @kbd{printf _"%2$d characters live in \"%1$s\"\n",}
+> @kbd{string, length(string)}
+> @kbd{@}'}
+@print{} 10 characters live in "Dont Panic"
+@end example
+
+If present, positional specifiers come first in the format specification,
+before the flags, the field width, and/or the precision.
+
+Positional specifiers can be used with the dynamic field width and
+precision capability:
+
+@example
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{printf("%*.*s\n", 10, 20, "hello")}
+> @kbd{printf("%3$*2$.*1$s\n", 20, 10, "hello")}
+> @kbd{@}'}
+@print{} hello
+@print{} hello
+@end example
+
+@quotation NOTE
+When using @samp{*} with a positional specifier, the @samp{*}
+comes first, then the integer position, and then the @samp{$}.
+This is somewhat counterintuitive.
+@end quotation
+
+@cindex @code{printf} statement, positional specifiers, mixing with regular formats
+@cindex positional specifiers, @code{printf} statement, mixing with regular formats
+@cindex format specifiers, mixing regular with positional specifiers
+@command{gawk} does not allow you to mix regular format specifiers
+and those with positional specifiers in the same string:
+
+@example
+$ @kbd{gawk 'BEGIN @{ printf _"%d %3$s\n", 1, 2, "hi" @}'}
+@error{} gawk: cmd. line:1: fatal: must use `count$' on all formats or none
+@end example
+
+@quotation NOTE
+There are some pathological cases that @command{gawk} may fail to
+diagnose. In such cases, the output may not be what you expect.
+It's still a bad idea to try mixing them, even if @command{gawk}
+doesn't detect it.
+@end quotation
+
+Although positional specifiers can be used directly in @command{awk} programs,
+their primary purpose is to help in producing correct translations of
+format strings into languages different from the one in which the program
+is first written.
+
+@node I18N Portability
+@subsection @command{awk} Portability Issues
+
+@cindex portability, internationalization and
+@cindex internationalization, localization, portability and
+@command{gawk}'s internationalization features were purposely chosen to
+have as little impact as possible on the portability of @command{awk}
+programs that use them to other versions of @command{awk}.
+Consider this program:
+
+@example
+BEGIN @{
+ TEXTDOMAIN = "guide"
+ if (Test_Guide) # set with -v
+ bindtextdomain("/test/guide/messages")
+ print _"don't panic!"
+@}
+@end example
+
+@noindent
+As written, it won't work on other versions of @command{awk}.
+However, it is actually almost portable, requiring very little
+change:
+
+@itemize @bullet
+@cindex @code{TEXTDOMAIN} variable, portability and
+@item
+Assignments to @code{TEXTDOMAIN} won't have any effect,
+since @code{TEXTDOMAIN} is not special in other @command{awk} implementations.
+
+@item
+Non-GNU versions of @command{awk} treat marked strings
+as the concatenation of a variable named @code{_} with the string
+following it.@footnote{This is good fodder for an ``Obfuscated
+@command{awk}'' contest.} Typically, the variable @code{_} has
+the null string (@code{""}) as its value, leaving the original string constant as
+the result.
+
+@item
+By defining ``dummy'' functions to replace @code{dcgettext()}, @code{dcngettext()}
+and @code{bindtextdomain()}, the @command{awk} program can be made to run, but
+all the messages are output in the original language.
+For example:
+
+@cindex @code{bindtextdomain()} function (@command{gawk}), portability and
+@cindex @code{dcgettext()} function (@command{gawk}), portability and
+@cindex @code{dcngettext()} function (@command{gawk}), portability and
+@example
+@c file eg/lib/libintl.awk
+function bindtextdomain(dir, domain)
+@{
+ return dir
+@}
+
+function dcgettext(string, domain, category)
+@{
+ return string
+@}
+
+function dcngettext(string1, string2, number, domain, category)
+@{
+ return (number == 1 ? string1 : string2)
+@}
+@c endfile
+@end example
+
+@item
+The use of positional specifications in @code{printf} or
+@code{sprintf()} is @emph{not} portable.
+To support @code{gettext()} at the C level, many systems' C versions of
+@code{sprintf()} do support positional specifiers. But it works only if
+enough arguments are supplied in the function call. Many versions of
+@command{awk} pass @code{printf} formats and arguments unchanged to the
+underlying C library version of @code{sprintf()}, but only one format and
+argument at a time. What happens if a positional specification is
+used is anybody's guess.
+However, since the positional specifications are primarily for use in
+@emph{translated} format strings, and since non-GNU @command{awk}s never
+retrieve the translated string, this should not be a problem in practice.
+@end itemize
+@c ENDOFRANGE inap
+
+@node I18N Example
+@section A Simple Internationalization Example
+
+Now let's look at a step-by-step example of how to internationalize and
+localize a simple @command{awk} program, using @file{guide.awk} as our
+original source:
+
+@example
+@c file eg/prog/guide.awk
+BEGIN @{
+ TEXTDOMAIN = "guide"
+ bindtextdomain(".") # for testing
+ print _"Don't Panic"
+ print _"The Answer Is", 42
+ print "Pardon me, Zaphod who?"
+@}
+@c endfile
+@end example
+
+@noindent
+Run @samp{gawk --gen-pot} to create the @file{.pot} file:
+
+@example
+$ @kbd{gawk --gen-pot -f guide.awk > guide.pot}
+@end example
+
+@noindent
+This produces:
+
+@example
+@c file eg/data/guide.po
+#: guide.awk:4
+msgid "Don't Panic"
+msgstr ""
+
+#: guide.awk:5
+msgid "The Answer Is"
+msgstr ""
+
+@c endfile
+@end example
+
+This original portable object template file is saved and reused for each language
+into which the application is translated. The @code{msgid}
+is the original string and the @code{msgstr} is the translation.
+
+@quotation NOTE
+Strings not marked with a leading underscore do not
+appear in the @file{guide.pot} file.
+@end quotation
+
+Next, the messages must be translated.
+Here is a translation to a hypothetical dialect of English,
+called ``Mellow'':@footnote{Perhaps it would be better if it were
+called ``Hippy.'' Ah, well.}
+
+@example
+@group
+$ cp guide.pot guide-mellow.po
+@var{Add translations to} guide-mellow.po @dots{}
+@end group
+@end example
+
+@noindent
+Following are the translations:
+
+@example
+@c file eg/data/guide-mellow.po
+#: guide.awk:4
+msgid "Don't Panic"
+msgstr "Hey man, relax!"
+
+#: guide.awk:5
+msgid "The Answer Is"
+msgstr "Like, the scoop is"
+
+@c endfile
+@end example
+
+@cindex Linux
+@cindex GNU/Linux
+The next step is to make the directory to hold the binary message object
+file and then to create the @file{guide.mo} file.
+The directory layout shown here is standard for GNU @code{gettext} on
+GNU/Linux systems. Other versions of @code{gettext} may use a different
+layout:
+
+@example
+$ @kbd{mkdir en_US en_US/LC_MESSAGES}
+@end example
+
+@cindex @code{.po} files, converting to @code{.mo}
+@cindex files, @code{.po}, converting to @code{.mo}
+@cindex @code{.mo} files, converting from @code{.po}
+@cindex files, @code{.mo}, converting from @code{.po}
+@cindex portable object files, converting to message object files
+@cindex files, portable object, converting to message object files
+@cindex message object files, converting from portable object files
+@cindex files, message object, converting from portable object files
+@cindex @command{msgfmt} utility
+The @command{msgfmt} utility does the conversion from human-readable
+@file{.po} file to machine-readable @file{.mo} file.
+By default, @command{msgfmt} creates a file named @file{messages}.
+This file must be renamed and placed in the proper directory so that
+@command{gawk} can find it:
+
+@example
+$ @kbd{msgfmt guide-mellow.po}
+$ @kbd{mv messages en_US/LC_MESSAGES/guide.mo}
+@end example
+
+Finally, we run the program to test it:
+
+@example
+$ @kbd{gawk -f guide.awk}
+@print{} Hey man, relax!
+@print{} Like, the scoop is 42
+@print{} Pardon me, Zaphod who?
+@end example
+
+If the three replacement functions for @code{dcgettext()}, @code{dcngettext()}
+and @code{bindtextdomain()}
+(@pxref{I18N Portability})
+are in a file named @file{libintl.awk},
+then we can run @file{guide.awk} unchanged as follows:
+
+@example
+$ @kbd{gawk --posix -f guide.awk -f libintl.awk}
+@print{} Don't Panic
+@print{} The Answer Is 42
+@print{} Pardon me, Zaphod who?
+@end example
+
+@node Gawk I18N
+@section @command{gawk} Can Speak Your Language
+
+@command{gawk} itself has been internationalized
+using the GNU @code{gettext} package.
+(GNU @code{gettext} is described in
+complete detail in
+@ifinfo
+@inforef{Top, , GNU @code{gettext} utilities, gettext, GNU gettext tools}.)
+@end ifinfo
+@ifnotinfo
+@cite{GNU gettext tools}.)
+@end ifnotinfo
+As of this writing, the latest version of GNU @code{gettext} is
+@uref{ftp://ftp.gnu.org/gnu/gettext/gettext-0.18.1.tar.gz, @value{PVERSION} 0.18.1}.
+
+If a translation of @command{gawk}'s messages exists,
+then @command{gawk} produces usage messages, warnings,
+and fatal errors in the local language.
+@c ENDOFRANGE inloc
+
+@node Advanced Features
+@chapter Advanced Features of @command{gawk}
+@cindex advanced features, network connections, See Also networks, connections
+@c STARTOFRANGE gawadv
+@cindex @command{gawk}, features, advanced
+@c STARTOFRANGE advgaw
+@cindex advanced features, @command{gawk}
+@ignore
+Contributed by: Peter Langston <pud!psl@bellcore.bellcore.com>
+
+ Found in Steve English's "signature" line:
+
+"Write documentation as if whoever reads it is a violent psychopath
+who knows where you live."
+@end ignore
+@quotation
+@i{Write documentation as if whoever reads it is
+a violent psychopath who knows where you live.}@*
+Steve English, as quoted by Peter Langston
+@end quotation
+
+This @value{CHAPTER} discusses advanced features in @command{gawk}.
+It's a bit of a ``grab bag'' of items that are otherwise unrelated
+to each other.
+First, a command-line option allows @command{gawk} to recognize
+nondecimal numbers in input data, not just in @command{awk}
+programs.
+Then, @command{gawk}'s special features for sorting arrays are presented.
+Next, two-way I/O, discussed briefly in earlier parts of this
+@value{DOCUMENT}, is described in full detail, along with the basics
+of TCP/IP networking. Finally, @command{gawk}
+can @dfn{profile} an @command{awk} program, making it possible to tune
+it for performance.
+
+@ref{Dynamic Extensions},
+discusses the ability to dynamically add new built-in functions to
+@command{gawk}. As this feature is still immature and likely to change,
+its description is relegated to an appendix.
+
+@menu
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array traversal and
+ sorting arrays.
+* Two-way I/O:: Two-way communications with another process.
+* TCP/IP Networking:: Using @command{gawk} for network programming.
+* Profiling:: Profiling your @command{awk} programs.
+@end menu
+
+@node Nondecimal Data
+@section Allowing Nondecimal Input Data
+@cindex @code{--non-decimal-data} option
+@cindex advanced features, @command{gawk}, nondecimal input data
+@cindex input, data@comma{} nondecimal
+@cindex constants, nondecimal
+
+If you run @command{gawk} with the @option{--non-decimal-data} option,
+you can have nondecimal constants in your input data:
+
+@c line break here for small book format
+@example
+$ @kbd{echo 0123 123 0x123 |}
+> @kbd{gawk --non-decimal-data '@{ printf "%d, %d, %d\n",}
+> @kbd{$1, $2, $3 @}'}
+@print{} 83, 123, 291
+@end example
+
+For this feature to work, write your program so that
+@command{gawk} treats your data as numeric:
+
+@example
+$ @kbd{echo 0123 123 0x123 | gawk '@{ print $1, $2, $3 @}'}
+@print{} 0123 123 0x123
+@end example
+
+@noindent
+The @code{print} statement treats its expressions as strings.
+Although the fields can act as numbers when necessary,
+they are still strings, so @code{print} does not try to treat them
+numerically. You may need to add zero to a field to force it to
+be treated as a number. For example:
+
+@example
+$ @kbd{echo 0123 123 0x123 | gawk --non-decimal-data '}
+> @kbd{@{ print $1, $2, $3}
+> @kbd{print $1 + 0, $2 + 0, $3 + 0 @}'}
+@print{} 0123 123 0x123
+@print{} 83 123 291
+@end example
+
+Because it is common to have decimal data with leading zeros, and because
+using this facility could lead to surprising results, the default is to leave it
+disabled. If you want it, you must explicitly request it.
+
+@cindex programming conventions, @code{--non-decimal-data} option
+@cindex @code{--non-decimal-data} option, @code{strtonum()} function and
+@cindex @code{strtonum()} function (@command{gawk}), @code{--non-decimal-data} option and
+@quotation CAUTION
+@emph{Use of this option is not recommended.}
+It can break old programs very badly.
+Instead, use the @code{strtonum()} function to convert your data
+(@pxref{Nondecimal-numbers}).
+This makes your programs easier to write and easier to read, and
+leads to less surprising results.
+@end quotation
+
+@node Array Sorting
+@section Controlling Array Traversal and Array Sorting
+
+@command{gawk} lets you control the order in which a @samp{for (i in array)}
+loop traverses an array.
+
+In addition, two built-in functions, @code{asort()} and @code{asorti()},
+let you sort arrays based on the array values and indices, respectively.
+These two functions also provide control over the sorting criteria used
+to order the elements during sorting.
+
+@menu
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use @code{asort()} and @code{asorti()}.
+@end menu
+
+@node Controlling Array Traversal
+@subsection Controlling Array Traversal
+
+By default, the order in which a @samp{for (i in array)} loop
+scans an array is not defined; it is generally based upon
+the internal implementation of arrays inside @command{awk}.
+
+Often, though, it is desirable to be able to loop over the elements
+in a particular order that you, the programmer, choose. @command{gawk}
+lets you do this.
+
+@ref{Controlling Scanning}, describes how you can assign special,
+pre-defined values to @code{PROCINFO["sorted_in"]} in order to
+control the order in which @command{gawk} will traverse an array
+during a @code{for} loop.
+
+In addition, the value of @code{PROCINFO["sorted_in"]} can be a function name.
+This lets you traverse an array based on any custom criterion.
+The array elements are ordered according to the return value of this
+function. The comparison function should be defined with at least
+four arguments:
+
+@example
+function comp_func(i1, v1, i2, v2)
+@{
+ @var{compare elements 1 and 2 in some fashion}
+ @var{return < 0; 0; or > 0}
+@}
+@end example
+
+Here, @var{i1} and @var{i2} are the indices, and @var{v1} and @var{v2}
+are the corresponding values of the two elements being compared.
+Either @var{v1} or @var{v2}, or both, can be arrays if the array being
+traversed contains subarrays as values.
+(@xref{Arrays of Arrays}, for more information about subarrays.)
+The three possible return values are interpreted as follows:
+
+@table @code
+@item comp_func(i1, v1, i2, v2) < 0
+Index @var{i1} comes before index @var{i2} during loop traversal.
+
+@item comp_func(i1, v1, i2, v2) == 0
+Indices @var{i1} and @var{i2}
+come together but the relative order with respect to each other is undefined.
+
+@item comp_func(i1, v1, i2, v2) > 0
+Index @var{i1} comes after index @var{i2} during loop traversal.
+@end table
+
+Our first comparison function can be used to scan an array in
+numerical order of the indices:
+
+@example
+function cmp_num_idx(i1, v1, i2, v2)
+@{
+ # numerical index comparison, ascending order
+ return (i1 - i2)
+@}
+@end example
+
+Our second function traverses an array based on the string order of
+the element values rather than by indices:
+
+@example
+function cmp_str_val(i1, v1, i2, v2)
+@{
+ # string value comparison, ascending order
+ v1 = v1 ""
+ v2 = v2 ""
+ if (v1 < v2)
+ return -1
+ return (v1 != v2)
+@}
+@end example
+
+The third
+comparison function makes all numbers, and numeric strings without
+any leading or trailing spaces, come out first during loop traversal:
+
+@example
+function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
+@{
+ # numbers before string value comparison, ascending order
+ n1 = v1 + 0
+ n2 = v2 + 0
+ if (n1 == v1)
+ return (n2 == v2) ? (n1 - n2) : -1
+ else if (n2 == v2)
+ return 1
+ return (v1 < v2) ? -1 : (v1 != v2)
+@}
+@end example
+
+Here is a main program to demonstrate how @command{gawk}
+behaves using each of the previous functions:
+
+@example
+BEGIN @{
+ data["one"] = 10
+ data["two"] = 20
+ data[10] = "one"
+ data[100] = 100
+ data[20] = "two"
+
+ f[1] = "cmp_num_idx"
+ f[2] = "cmp_str_val"
+ f[3] = "cmp_num_str_val"
+ for (i = 1; i <= 3; i++) @{
+ printf("Sort function: %s\n", f[i])
+ PROCINFO["sorted_in"] = f[i]
+ for (j in data)
+ printf("\tdata[%s] = %s\n", j, data[j])
+ print ""
+ @}
+@}
+@end example
+
+Here are the results when the program is run:
+@page
+
+@example
+$ @kbd{gawk -f compdemo.awk}
+@print{} Sort function: cmp_num_idx @ii{Sort by numeric index}
+@print{} data[two] = 20
+@print{} data[one] = 10 @ii{Both strings are numerically zero}
+@print{} data[10] = one
+@print{} data[20] = two
+@print{} data[100] = 100
+@print{}
+@print{} Sort function: cmp_str_val @ii{Sort by element values as strings}
+@print{} data[one] = 10
+@print{} data[100] = 100 @ii{String 100 is less than string 20}
+@print{} data[two] = 20
+@print{} data[10] = one
+@print{} data[20] = two
+@print{}
+@print{} Sort function: cmp_num_str_val @ii{Sort all numeric values before all strings}
+@print{} data[one] = 10
+@print{} data[two] = 20
+@print{} data[100] = 100
+@print{} data[10] = one
+@print{} data[20] = two
+@end example
+
+Consider sorting the entries of a GNU/Linux system password file
+according to login name. The following program sorts records
+by a specific field position and can be used for this purpose:
+
+@example
+# sort.awk --- simple program to sort by field position
+# field position is specified by the global variable POS
+
+function cmp_field(i1, v1, i2, v2)
+@{
+ # comparison by value, as string, and ascending order
+ return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
+@}
+
+@{
+ for (i = 1; i <= NF; i++)
+ a[NR][i] = $i
+@}
+
+END @{
+ PROCINFO["sorted_in"] = "cmp_field"
+ if (POS < 1 || POS > NF)
+ POS = 1
+ for (i in a) @{
+ for (j = 1; j <= NF; j++)
+ printf("%s%c", a[i][j], j < NF ? ":" : "")
+ print ""
+ @}
+@}
+@end example
+
+The first field in each entry of the password file is the user's login name,
+and the fields are separated by colons.
+Each record defines a subarray,
+with each field as an element in the subarray.
+Running the program produces the
+following output:
+
+@example
+$ @kbd{gawk -v POS=1 -F: -f sort.awk /etc/passwd}
+@print{} adm:x:3:4:adm:/var/adm:/sbin/nologin
+@print{} apache:x:48:48:Apache:/var/www:/sbin/nologin
+@print{} avahi:x:70:70:Avahi daemon:/:/sbin/nologin
+@dots{}
+@end example
+
+The comparison should normally always return the same value when given a
+specific pair of array elements as its arguments. If inconsistent
+results are returned then the order is undefined. This behavior can be
+exploited to introduce random order into otherwise seemingly
+ordered data:
+
+@example
+function cmp_randomize(i1, v1, i2, v2)
+@{
+ # random order
+ return (2 - 4 * rand())
+@}
+@end example
+
+As mentioned above, the order of the indices is arbitrary if two
+elements compare equal. This is usually not a problem, but letting
+the tied elements come out in arbitrary order can be an issue, especially
+when comparing item values. The partial ordering of the equal elements
+may change during the next loop traversal, if other elements are added or
+removed from the array. One way to resolve ties when comparing elements
+with otherwise equal values is to include the indices in the comparison
+rules. Note that doing this may make the loop traversal less efficient,
+so consider it only if necessary. The following comparison functions
+force a deterministic order, and are based on the fact that the
+indices of two elements are never equal:
+
+@example
+function cmp_numeric(i1, v1, i2, v2)
+@{
+ # numerical value (and index) comparison, descending order
+ return (v1 != v2) ? (v2 - v1) : (i2 - i1)
+@}
+
+function cmp_string(i1, v1, i2, v2)
+@{
+ # string value (and index) comparison, descending order
+ v1 = v1 i1
+ v2 = v2 i2
+ return (v1 > v2) ? -1 : (v1 != v2)
+@}
+@end example
+
+@c Avoid using the term ``stable'' when describing the unpredictable behavior
+@c if two items compare equal. Usually, the goal of a "stable algorithm"
+@c is to maintain the original order of the items, which is a meaningless
+@c concept for a list constructed from a hash.
+
+A custom comparison function can often simplify ordered loop
+traversal, and the sky is really the limit when it comes to
+designing such a function.
+
+When string comparisons are made during a sort, either for element
+values where one or both aren't numbers, or for element indices
+handled as strings, the value of @code{IGNORECASE}
+(@pxref{Built-in Variables}) controls whether
+the comparisons treat corresponding uppercase and lowercase letters as
+equivalent or distinct.
+
+Another point to keep in mind is that in the case of subarrays
+the element values can themselves be arrays; a production comparison
+function should use the @code{isarray()} function
+(@pxref{Type Functions}),
+to check for this, and choose a defined sorting order for subarrays.
+
+All sorting based on @code{PROCINFO["sorted_in"]}
+is disabled in POSIX mode,
+since the @code{PROCINFO} array is not special in that case.
+
+As a side note, sorting the array indices before traversing
+the array has been reported to add 15% to 20% overhead to the
+execution time of @command{awk} programs. For this reason,
+sorted array traversal is not the default.
+
+@c The @command{gawk}
+@c maintainers believe that only the people who wish to use a
+@c feature should have to pay for it.
+
+@node Array Sorting Functions
+@subsection Sorting Array Values and Indices with @command{gawk}
+
+@cindex arrays, sorting
+@cindex @code{asort()} function (@command{gawk})
+@cindex @code{asort()} function (@command{gawk}), arrays@comma{} sorting
+@cindex sort function, arrays, sorting
+In most @command{awk} implementations, sorting an array requires
+writing a @code{sort()} function.
+While this can be educational for exploring different sorting algorithms,
+usually that's not the point of the program.
+@command{gawk} provides the built-in @code{asort()}
+and @code{asorti()} functions
+(@pxref{String Functions})
+for sorting arrays. For example:
+
+@example
+@var{populate the array} data
+n = asort(data)
+for (i = 1; i <= n; i++)
+ @var{do something with} data[i]
+@end example
+
+After the call to @code{asort()}, the array @code{data} is indexed from 1
+to some number @var{n}, the total number of elements in @code{data}.
+(This count is @code{asort()}'s return value.)
+@code{data[1]} @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
+The comparison is based on the type of the elements
+(@pxref{Typing and Comparison}).
+All numeric values come before all string values,
+which in turn come before all subarrays.
+
+@cindex side effects, @code{asort()} function
+An important side effect of calling @code{asort()} is that
+@emph{the array's original indices are irrevocably lost}.
+As this isn't always desirable, @code{asort()} accepts a
+second argument:
+
+@example
+@var{populate the array} source
+n = asort(source, dest)
+for (i = 1; i <= n; i++)
+ @var{do something with} dest[i]
+@end example
+
+In this case, @command{gawk} copies the @code{source} array into the
+@code{dest} array and then sorts @code{dest}, destroying its indices.
+However, the @code{source} array is not affected.
+
+@code{asort()} accepts a third string argument to control comparison of
+array elements. As with @code{PROCINFO["sorted_in"]}, this argument
+may be one of the predefined names that @command{gawk} provides
+(@pxref{Controlling Scanning}), or the name of a user-defined function
+(@pxref{Controlling Array Traversal}).
+
+@quotation NOTE
+In all cases, the sorted element values consist of the original
+array's element values. The ability to control comparison merely
+affects the way in which they are sorted.
+@end quotation
+
+Often, what's needed is to sort on the values of the @emph{indices}
+instead of the values of the elements.
+To do that, use the
+@code{asorti()} function. The interface is identical to that of
+@code{asort()}, except that the index values are used for sorting, and
+become the values of the result array:
+
+@example
+@{ source[$0] = some_func($0) @}
+
+END @{
+ n = asorti(source, dest)
+ for (i = 1; i <= n; i++) @{
+ @ii{Work with sorted indices directly:}
+ @var{do something with} dest[i]
+ @dots{}
+ @ii{Access original array via sorted indices:}
+ @var{do something with} source[dest[i]]
+ @}
+@}
+@end example
+
+Similar to @code{asort()},
+in all cases, the sorted element values consist of the original
+array's indices. The ability to control comparison merely
+affects the way in which they are sorted.
+
+Sorting the array by replacing the indices provides maximal flexibility.
+To traverse the elements in decreasing order, use a loop that goes from
+@var{n} down to 1, either over the elements or over the indices.@footnote{You
+may also use one of the predefined sorting names that sorts in
+decreasing order.}
+
+@cindex reference counting, sorting arrays
+Copying array indices and elements isn't expensive in terms of memory.
+Internally, @command{gawk} maintains @dfn{reference counts} to data.
+For example, when @code{asort()} copies the first array to the second one,
+there is only one copy of the original array elements' data, even though
+both arrays use the values.
+
+@c Document It And Call It A Feature. Sigh.
+@cindex @command{gawk}, @code{IGNORECASE} variable in
+@cindex @code{IGNORECASE} variable
+@cindex arrays, sorting, @code{IGNORECASE} variable and
+@cindex @code{IGNORECASE} variable, array sorting and
+Because @code{IGNORECASE} affects string comparisons, the value
+of @code{IGNORECASE} also affects sorting for both @code{asort()} and @code{asorti()}.
+Note also that the locale's sorting order does @emph{not}
+come into play; comparisons are based on character values only.@footnote{This
+is true because locale-based comparison occurs only when in POSIX
+compatibility mode, and since @code{asort()} and @code{asorti()} are
+@command{gawk} extensions, they are not available in that case.}
+Caveat Emptor.
+
+@node Two-way I/O
+@section Two-Way Communications with Another Process
+@cindex Brennan, Michael
+@cindex programmers, attractiveness of
+@smallexample
+@c Path: cssun.mathcs.emory.edu!gatech!newsxfer3.itd.umich.edu!news-peer.sprintlink.net!news-sea-19.sprintlink.net!news-in-west.sprintlink.net!news.sprintlink.net!Sprint!204.94.52.5!news.whidbey.com!brennan
+From: brennan@@whidbey.com (Mike Brennan)
+Newsgroups: comp.lang.awk
+Subject: Re: Learn the SECRET to Attract Women Easily
+Date: 4 Aug 1997 17:34:46 GMT
+@c Organization: WhidbeyNet
+@c Lines: 12
+Message-ID: <5s53rm$eca@@news.whidbey.com>
+@c References: <5s20dn$2e1@chronicle.concentric.net>
+@c Reply-To: brennan@whidbey.com
+@c NNTP-Posting-Host: asn202.whidbey.com
+@c X-Newsreader: slrn (0.9.4.1 UNIX)
+@c Xref: cssun.mathcs.emory.edu comp.lang.awk:5403
+
+On 3 Aug 1997 13:17:43 GMT, Want More Dates???
+<tracy78@@kilgrona.com> wrote:
+>Learn the SECRET to Attract Women Easily
+>
+>The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women
+
+The scent of awk programmers is a lot more attractive to women than
+the scent of perl programmers.
+--
+Mike Brennan
+@c brennan@@whidbey.com
+@end smallexample
+
+@cindex advanced features, @command{gawk}, processes@comma{} communicating with
+@cindex processes, two-way communications with
+It is often useful to be able to
+send data to a separate program for
+processing and then read the result. This can always be
+done with temporary files:
+
+@example
+# Write the data for processing
+tempfile = ("mydata." PROCINFO["pid"])
+while (@var{not done with data})
+ print @var{data} | ("subprogram > " tempfile)
+close("subprogram > " tempfile)
+
+# Read the results, remove tempfile when done
+while ((getline newdata < tempfile) > 0)
+ @var{process} newdata @var{appropriately}
+close(tempfile)
+system("rm " tempfile)
+@end example
+
+@noindent
+This works, but not elegantly. Among other things, it requires that
+the program be run in a directory that cannot be shared among users;
+for example, @file{/tmp} will not do, as another user might happen
+to be using a temporary file with the same name.
+
+@cindex coprocesses
+@cindex input/output, two-way
+@cindex @code{|} (vertical bar), @code{|&} operator (I/O)
+@cindex vertical bar (@code{|}), @code{|&} operator (I/O)
+@cindex @command{csh} utility, @code{|&} operator, comparison with
+However, with @command{gawk}, it is possible to
+open a @emph{two-way} pipe to another process. The second process is
+termed a @dfn{coprocess}, since it runs in parallel with @command{gawk}.
+The two-way connection is created using the @samp{|&} operator
+(borrowed from the Korn shell, @command{ksh}):@footnote{This is very
+different from the same operator in the C shell.}
+
+@example
+do @{
+ print @var{data} |& "subprogram"
+ "subprogram" |& getline results
+@} while (@var{data left to process})
+close("subprogram")
+@end example
+
+The first time an I/O operation is executed using the @samp{|&}
+operator, @command{gawk} creates a two-way pipeline to a child process
+that runs the other program. Output created with @code{print}
+or @code{printf} is written to the program's standard input, and
+output from the program's standard output can be read by the @command{gawk}
+program using @code{getline}.
+As is the case with processes started by @samp{|}, the subprogram
+can be any program, or pipeline of programs, that can be started by
+the shell.
+
+There are some cautionary items to be aware of:
+
+@itemize @bullet
+@item
+As the code inside @command{gawk} currently stands, the coprocess's
+standard error goes to the same place that the parent @command{gawk}'s
+standard error goes. It is not possible to read the child's
+standard error separately.
+
+@cindex deadlocks
+@cindex buffering, input/output
+@cindex @code{getline} command, deadlock and
+@item
+I/O buffering may be a problem. @command{gawk} automatically
+flushes all output down the pipe to the coprocess.
+However, if the coprocess does not flush its output,
+@command{gawk} may hang when doing a @code{getline} in order to read
+the coprocess's results. This could lead to a situation
+known as @dfn{deadlock}, where each process is waiting for the
+other one to do something.
+@end itemize
+
+@cindex @code{close()} function, two-way pipes and
+It is possible to close just one end of the two-way pipe to
+a coprocess, by supplying a second argument to the @code{close()}
+function of either @code{"to"} or @code{"from"}
+(@pxref{Close Files And Pipes}).
+These strings tell @command{gawk} to close the end of the pipe
+that sends data to the coprocess or the end that reads from it,
+respectively.
+
+@cindex @command{sort} utility, coprocesses and
+This is particularly necessary in order to use
+the system @command{sort} utility as part of a coprocess;
+@command{sort} must read @emph{all} of its input
+data before it can produce any output.
+The @command{sort} program does not receive an end-of-file indication
+until @command{gawk} closes the write end of the pipe.
+
+When you have finished writing data to the @command{sort}
+utility, you can close the @code{"to"} end of the pipe, and
+then start reading sorted data via @code{getline}.
+For example:
+
+@example
+BEGIN @{
+ command = "LC_ALL=C sort"
+ n = split("abcdefghijklmnopqrstuvwxyz", a, "")
+
+ for (i = n; i > 0; i--)
+ print a[i] |& command
+ close(command, "to")
+
+ while ((command |& getline line) > 0)
+ print "got", line
+ close(command)
+@}
+@end example
+
+This program writes the letters of the alphabet in reverse order, one
+per line, down the two-way pipe to @command{sort}. It then closes the
+write end of the pipe, so that @command{sort} receives an end-of-file
+indication. This causes @command{sort} to sort the data and write the
+sorted data back to the @command{gawk} program. Once all of the data
+has been read, @command{gawk} terminates the coprocess and exits.
+
+As a side note, the assignment @samp{LC_ALL=C} in the @command{sort}
+command ensures traditional Unix (ASCII) sorting from @command{sort}.
+
+@cindex @command{gawk}, @code{PROCINFO} array in
+@cindex @code{PROCINFO} array
+You may also use pseudo-ttys (ptys) for
+two-way communication instead of pipes, if your system supports them.
+This is done on a per-command basis, by setting a special element
+in the @code{PROCINFO} array
+(@pxref{Auto-set}),
+like so:
+
+@example
+command = "sort -nr" # command, save in convenience variable
+PROCINFO[command, "pty"] = 1 # update PROCINFO
+print @dots{} |& command # start two-way pipe
+@dots{}
+@end example
+
+@noindent
+Using ptys avoids the buffer deadlock issues described earlier, at some
+loss in performance. If your system does not have ptys, or if all the
+system's ptys are in use, @command{gawk} automatically falls back to
+using regular pipes.
+
+@node TCP/IP Networking
+@section Using @command{gawk} for Network Programming
+@cindex advanced features, @command{gawk}, network programming
+@cindex networks, programming
+@c STARTOFRANGE tcpip
+@cindex TCP/IP
+@cindex @code{/inet/@dots{}} special files (@command{gawk})
+@cindex files, @code{/inet/@dots{}} (@command{gawk})
+@cindex @code{/inet4/@dots{}} special files (@command{gawk})
+@cindex files, @code{/inet4/@dots{}} (@command{gawk})
+@cindex @code{/inet6/@dots{}} special files (@command{gawk})
+@cindex files, @code{/inet6/@dots{}} (@command{gawk})
+@cindex @code{EMISTERED}
+@quotation
+@code{EMISTERED}:@*
+@ @ @ @ @i{A host is a host from coast to coast,@*
+@ @ @ @ and no-one can talk to host that's close,@*
+@ @ @ @ unless the host that isn't close@*
+@ @ @ @ is busy hung or dead.}
+@end quotation
+
+In addition to being able to open a two-way pipeline to a coprocess
+on the same system
+(@pxref{Two-way I/O}),
+it is possible to make a two-way connection to
+another process on another system across an IP network connection.
+
+You can think of this as just a @emph{very long} two-way pipeline to
+a coprocess.
+The way @command{gawk} decides that you want to use TCP/IP networking is
+by recognizing special @value{FN}s that begin with one of @samp{/inet/},
+@samp{/inet4/} or @samp{/inet6}.
+
+The full syntax of the special @value{FN} is
+@file{/@var{net-type}/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}.
+The components are:
+
+@table @var
+@item net-type
+Specifies the kind of Internet connection to make.
+Use @samp{/inet4/} to force IPv4, and
+@samp{/inet6/} to force IPv6.
+Plain @samp{/inet/} (which used to be the only option) uses
+the system default, most likely IPv4.
+
+@item protocol
+The protocol to use over IP. This must be either @samp{tcp}, or
+@samp{udp}, for a TCP or UDP IP connection,
+respectively. The use of TCP is recommended for most applications.
+
+@item local-port
+@cindex @code{getaddrinfo()} function (C library)
+The local TCP or UDP port number to use. Use a port number of @samp{0}
+when you want the system to pick a port. This is what you should do
+when writing a TCP or UDP client.
+You may also use a well-known service name, such as @samp{smtp}
+or @samp{http}, in which case @command{gawk} attempts to determine
+the predefined port number using the C @code{getaddrinfo()} function.
+
+@item remote-host
+The IP address or fully-qualified domain name of the Internet
+host to which you want to connect.
+
+@item remote-port
+The TCP or UDP port number to use on the given @var{remote-host}.
+Again, use @samp{0} if you don't care, or else a well-known
+service name.
+@end table
+
+@cindex @command{gawk}, @code{ERRNO} variable in
+@cindex @code{ERRNO} variable
+@quotation NOTE
+Failure in opening a two-way socket will result in a non-fatal error
+being returned to the calling code. The value of @code{ERRNO} indicates
+the error (@pxref{Auto-set}).
+@end quotation
+
+Consider the following very simple example:
+
+@example
+BEGIN @{
+ Service = "/inet/tcp/0/localhost/daytime"
+ Service |& getline
+ print $0
+ close(Service)
+@}
+@end example
+
+This program reads the current date and time from the local system's
+TCP @samp{daytime} server.
+It then prints the results and closes the connection.
+
+Because this topic is extensive, the use of @command{gawk} for
+TCP/IP programming is documented separately.
+@ifinfo
+See
+@inforef{Top, , General Introduction, gawkinet, TCP/IP Internetworking with @command{gawk}},
+@end ifinfo
+@ifnotinfo
+See @cite{TCP/IP Internetworking with @command{gawk}},
+which comes as part of the @command{gawk} distribution,
+@end ifnotinfo
+for a much more complete introduction and discussion, as well as
+extensive examples.
+
+@c ENDOFRANGE tcpip
+
+@node Profiling
+@section Profiling Your @command{awk} Programs
+@c STARTOFRANGE awkp
+@cindex @command{awk} programs, profiling
+@c STARTOFRANGE proawk
+@cindex profiling @command{awk} programs
+@cindex profiling @command{gawk}
+@cindex @code{awkprof.out} file
+@cindex files, @code{awkprof.out}
+
+You may produce execution traces of your @command{awk} programs.
+This is done by passing the option @option{--profile} to @command{gawk}.
+When @command{gawk} has finished running, it creates a profile of your program in a file
+named @file{awkprof.out}. Because it is profiling, it also executes up to 45% slower than
+@command{gawk} normally does.
+
+@cindex @code{--profile} option
+As shown in the following example,
+the @option{--profile} option can be used to change the name of the file
+where @command{gawk} will write the profile:
+
+@example
+gawk --profile=myprog.prof -f myprog.awk data1 data2
+@end example
+
+@noindent
+In the above example, @command{gawk} places the profile in
+@file{myprog.prof} instead of in @file{awkprof.out}.
+
+Here is a sample session showing a simple @command{awk} program, its input data, and the
+results from running @command{gawk} with the @option{--profile} option.
+First, the @command{awk} program:
+
+@example
+BEGIN @{ print "First BEGIN rule" @}
+
+END @{ print "First END rule" @}
+
+/foo/ @{
+ print "matched /foo/, gosh"
+ for (i = 1; i <= 3; i++)
+ sing()
+@}
+
+@{
+ if (/foo/)
+ print "if is true"
+ else
+ print "else is true"
+@}
+
+BEGIN @{ print "Second BEGIN rule" @}
+
+END @{ print "Second END rule" @}
+
+function sing( dummy)
+@{
+ print "I gotta be me!"
+@}
+@end example
+
+Following is the input data:
+
+@example
+foo
+bar
+baz
+foo
+junk
+@end example
+
+Here is the @file{awkprof.out} that results from running the @command{gawk}
+profiler on this program and data (this example also illustrates that @command{awk}
+programmers sometimes have to work late):
+
+@cindex @code{BEGIN} pattern
+@cindex @code{END} pattern
+@example
+ # gawk profile, created Sun Aug 13 00:00:15 2000
+
+ # BEGIN block(s)
+
+ BEGIN @{
+ 1 print "First BEGIN rule"
+ 1 print "Second BEGIN rule"
+ @}
+
+ # Rule(s)
+
+ 5 /foo/ @{ # 2
+ 2 print "matched /foo/, gosh"
+ 6 for (i = 1; i <= 3; i++) @{
+ 6 sing()
+ @}
+ @}
+
+ 5 @{
+ 5 if (/foo/) @{ # 2
+ 2 print "if is true"
+ 3 @} else @{
+ 3 print "else is true"
+ @}
+ @}
+
+ # END block(s)
+
+ END @{
+ 1 print "First END rule"
+ 1 print "Second END rule"
+ @}
+
+ # Functions, listed alphabetically
+
+ 6 function sing(dummy)
+ @{
+ 6 print "I gotta be me!"
+ @}
+@end example
+
+This example illustrates many of the basic features of profiling output.
+They are as follows:
+
+@itemize @bullet
+@item
+The program is printed in the order @code{BEGIN} rule,
+@code{BEGINFILE} rule,
+pattern/action rules,
+@code{ENDFILE} rule, @code{END} rule and functions, listed
+alphabetically.
+Multiple @code{BEGIN} and @code{END} rules are merged together,
+as are multiple @code{BEGINFILE} and @code{ENDFILE} rules.
+
+@cindex patterns, counts
+@item
+Pattern-action rules have two counts.
+The first count, to the left of the rule, shows how many times
+the rule's pattern was @emph{tested}.
+The second count, to the right of the rule's opening left brace
+in a comment,
+shows how many times the rule's action was @emph{executed}.
+The difference between the two indicates how many times the rule's
+pattern evaluated to false.
+
+@item
+Similarly,
+the count for an @code{if}-@code{else} statement shows how many times
+the condition was tested.
+To the right of the opening left brace for the @code{if}'s body
+is a count showing how many times the condition was true.
+The count for the @code{else}
+indicates how many times the test failed.
+
+@cindex loops, count for header
+@item
+The count for a loop header (such as @code{for}
+or @code{while}) shows how many times the loop test was executed.
+(Because of this, you can't just look at the count on the first
+statement in a rule to determine how many times the rule was executed.
+If the first statement is a loop, the count is misleading.)
+
+@cindex functions, user-defined, counts
+@cindex user-defined, functions, counts
+@item
+For user-defined functions, the count next to the @code{function}
+keyword indicates how many times the function was called.
+The counts next to the statements in the body show how many times
+those statements were executed.
+
+@cindex @code{@{@}} (braces)
+@cindex braces (@code{@{@}})
+@item
+The layout uses ``K&R'' style with TABs.
+Braces are used everywhere, even when
+the body of an @code{if}, @code{else}, or loop is only a single statement.
+
+@cindex @code{()} (parentheses)
+@cindex parentheses @code{()}
+@item
+Parentheses are used only where needed, as indicated by the structure
+of the program and the precedence rules.
+@c extra verbiage here satisfies the copyeditor. ugh.
+For example, @samp{(3 + 5) * 4} means add three plus five, then multiply
+the total by four. However, @samp{3 + 5 * 4} has no parentheses, and
+means @samp{3 + (5 * 4)}.
+
+@ignore
+@item
+All string concatenations are parenthesized too.
+(This could be made a bit smarter.)
+@end ignore
+
+@item
+Parentheses are used around the arguments to @code{print}
+and @code{printf} only when
+the @code{print} or @code{printf} statement is followed by a redirection.
+Similarly, if
+the target of a redirection isn't a scalar, it gets parenthesized.
+
+@item
+@command{gawk} supplies leading comments in
+front of the @code{BEGIN} and @code{END} rules,
+the pattern/action rules, and the functions.
+
+@end itemize
+
+The profiled version of your program may not look exactly like what you
+typed when you wrote it. This is because @command{gawk} creates the
+profiled version by ``pretty printing'' its internal representation of
+the program. The advantage to this is that @command{gawk} can produce
+a standard representation. The disadvantage is that all source-code
+comments are lost, as are the distinctions among multiple @code{BEGIN},
+@code{END}, @code{BEGINFILE}, and @code{ENDFILE} rules. Also, things such as:
+
+@example
+/foo/
+@end example
+
+@noindent
+come out as:
+
+@example
+/foo/ @{
+ print $0
+@}
+@end example
+
+@noindent
+which is correct, but possibly surprising.
+
+@cindex profiling @command{awk} programs, dynamically
+@cindex @command{gawk} program, dynamic profiling
+Besides creating profiles when a program has completed,
+@command{gawk} can produce a profile while it is running.
+This is useful if your @command{awk} program goes into an
+infinite loop and you want to see what has been executed.
+To use this feature, run @command{gawk} with the @option{--profile}
+option in the background:
+
+@example
+$ @kbd{gawk --profile -f myprog &}
+[1] 13992
+@end example
+
+@cindex @command{kill} command@comma{} dynamic profiling
+@cindex @code{USR1} signal
+@cindex @code{SIGUSR1} signal
+@cindex signals, @code{USR1}/@code{SIGUSR1}
+@noindent
+The shell prints a job number and process ID number; in this case, 13992.
+Use the @command{kill} command to send the @code{USR1} signal
+to @command{gawk}:
+
+@example
+$ @kbd{kill -USR1 13992}
+@end example
+
+@noindent
+As usual, the profiled version of the program is written to
+@file{awkprof.out}, or to a different file if one specified with
+the @option{--profile} option.
+
+Along with the regular profile, as shown earlier, the profile
+includes a trace of any active functions:
+
+@example
+# Function Call Stack:
+
+# 3. baz
+# 2. bar
+# 1. foo
+# -- main --
+@end example
+
+You may send @command{gawk} the @code{USR1} signal as many times as you like.
+Each time, the profile and function call trace are appended to the output
+profile file.
+
+@cindex @code{HUP} signal
+@cindex @code{SIGHUP} signal
+@cindex signals, @code{HUP}/@code{SIGHUP}
+If you use the @code{HUP} signal instead of the @code{USR1} signal,
+@command{gawk} produces the profile and the function call trace and then exits.
+
+@cindex @code{INT} signal (MS-Windows)
+@cindex @code{SIGINT} signal (MS-Windows)
+@cindex signals, @code{INT}/@code{SIGINT} (MS-Windows)
+@cindex @code{QUIT} signal (MS-Windows)
+@cindex @code{SIGQUIT} signal (MS-Windows)
+@cindex signals, @code{QUIT}/@code{SIGQUIT} (MS-Windows)
+When @command{gawk} runs on MS-Windows systems, it uses the
+@code{INT} and @code{QUIT} signals for producing the profile and, in
+the case of the @code{INT} signal, @command{gawk} exits. This is
+because these systems don't support the @command{kill} command, so the
+only signals you can deliver to a program are those generated by the
+keyboard. The @code{INT} signal is generated by the
+@kbd{@value{CTL}-@key{C}} or @kbd{@value{CTL}-@key{BREAK}} key, while the
+@code{QUIT} signal is generated by the @kbd{@value{CTL}-@key{\}} key.
+
+Finally, @command{gawk} also accepts another option @option{--pretty-print}.
+When called this way, @command{gawk} ``pretty prints'' the program into
+@file{awkprof.out}, without any execution counts.
+@c ENDOFRANGE advgaw
+@c ENDOFRANGE gawadv
+@c ENDOFRANGE awkp
+@c ENDOFRANGE proawk
+
@c The original text for this chapter was contributed by Efraim Yawitz.
@c FIXME: Add more indexing.
@@ -31858,16 +31897,18 @@ If you write an extension that you wish to share with other
@command{gawk} users, please consider doing so through the
@code{gawkextlib} project.
+@iftex
+@part Part IV:@* Appendices
+@end iftex
@ignore
-@c Try this
-@iftex
-@page
-@headings off
-@majorheading III@ @ @ Appendixes
-Part III provides the appendixes, the Glossary, and two licenses that cover
+@ifdocbook
+
+@part Part IV:@* Appendices
+
+Part IV provides the appendices, the Glossary, and two licenses that cover
the @command{gawk} source code and this @value{DOCUMENT}, respectively.
-It contains the following appendixes:
+It contains the following appendices:
@itemize @bullet
@item
@@ -31891,11 +31932,7 @@ It contains the following appendixes:
@item
@ref{GNU Free Documentation License}.
@end itemize
-
-@page
-@evenheading @thispage@ @ @ @strong{@value{TITLE}} @| @|
-@oddheading @| @| @strong{@thischapter}@ @ @ @thispage
-@end iftex
+@end ifdocbook
@end ignore
@node Language History