author: Arnold D. Robbins <arnold@skeeve.com> 2011-05-04 23:39:43 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2011-05-04 23:39:43 +0300
commit: 1387c9a6046ba3a3e9ce8343daac42e1086efa6b
tree: d203a0f14aae778c0e907f1fca66c12fee3346e1 /doc/gawk.texi
parent: f2b825c82aa6b0b2eabed734244148206f3c01a5
Revamp array sorting.
Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- doc/gawk.texi 1710
1 files changed, 916 insertions, 794 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 60cfd1d7..49229d19 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -306,394 +306,437 @@ particular records in a file and perform operations upon them.
* Index:: Concept and Variable Index.
@detailmenu
-* History:: The history of @command{gawk} and
- @command{awk}.
-* Names:: What name to use to find @command{awk}.
-* This Manual:: Using this @value{DOCUMENT}. Includes
- sample input files that you can use.
-* Conventions:: Typographical Conventions.
-* Manual History:: Brief history of the GNU project and this
- @value{DOCUMENT}.
-* How To Contribute:: Helping to save the world.
-* Acknowledgments:: Acknowledgments.
-* Running gawk:: How to run @command{gawk} programs;
- includes command-line syntax.
-* One-shot:: Running a short throwaway @command{awk}
- program.
-* Read Terminal:: Using no input files (input from terminal
- instead).
-* Long:: Putting permanent @command{awk} programs in
- files.
-* Executable Scripts:: Making self-contained @command{awk}
- programs.
-* Comments:: Adding documentation to @command{gawk}
- programs.
-* Quoting:: More discussion of shell quoting issues.
-* DOS Quoting:: Quoting in Windows Batch Files.
-* Sample Data Files:: Sample data files for use in the
- @command{awk} programs illustrated in this
- @value{DOCUMENT}.
-* Very Simple:: A very simple example.
-* Two Rules:: A less simple one-line example using two
- rules.
-* More Complex:: A more complex example.
-* Statements/Lines:: Subdividing or combining statements into
- lines.
-* Other Features:: Other Features of @command{awk}.
-* When:: When to use @command{gawk} and when to use
- other things.
-* Command Line:: How to run @command{awk}.
-* Options:: Command-line options and their meanings.
-* Other Arguments:: Input file names and variable assignments.
-* Naming Standard Input:: How to specify standard input with other
- files.
-* Environment Variables:: The environment variables @command{gawk}
- uses.
-* AWKPATH Variable:: Searching directories for @command{awk}
- programs.
-* Other Environment Variables:: The environment variables.
-* Exit Status:: @command{gawk}'s exit status.
-* Include Files:: Including other files into your program.
-* Obsolete:: Obsolete Options and/or features.
-* Undocumented:: Undocumented Options and Features.
-* Regexp Usage:: How to Use Regular Expressions.
-* Escape Sequences:: How to write nonprinting characters.
-* Regexp Operators:: Regular Expression Operators.
-* Bracket Expressions:: What can go between @samp{[...]}.
-* GNU Regexp Operators:: Operators specific to GNU software.
-* Case-sensitivity:: How to do case-insensitive matching.
-* Leftmost Longest:: How much text matches.
-* Computed Regexps:: Using Dynamic Regexps.
-* Locales:: How the locale affects things.
-* Records:: Controlling how data is split into records.
-* Fields:: An introduction to fields.
-* Nonconstant Fields:: Nonconstant Field Numbers.
-* Changing Fields:: Changing the Contents of a Field.
-* Field Separators:: The field separator and how to change it.
-* Default Field Splitting:: How fields are normally separated.
-* Regexp Field Splitting:: Using regexps as the field separator.
-* Single Character Fields:: Making each character a separate field.
-* Command Line Field Separator:: Setting @code{FS} from the command-line.
-* Field Splitting Summary:: Some final points and a summary table.
-* Constant Size:: Reading constant width data.
-* Splitting By Content:: Defining Fields By Content
-* Multiple Line:: Reading multi-line records.
-* Getline:: Reading files under explicit program
- control using the @code{getline} function.
-* Plain Getline:: Using @code{getline} with no arguments.
-* Getline/Variable:: Using @code{getline} into a variable.
-* Getline/File:: Using @code{getline} from a file.
-* Getline/Variable/File:: Using @code{getline} into a variable from a
- file.
-* Getline/Pipe:: Using @code{getline} from a pipe.
-* Getline/Variable/Pipe:: Using @code{getline} into a variable from a
- pipe.
-* Getline/Coprocess:: Using @code{getline} from a coprocess.
-* Getline/Variable/Coprocess:: Using @code{getline} into a variable from a
- coprocess.
-* Getline Notes:: Important things to know about
- @code{getline}.
-* Getline Summary:: Summary of @code{getline} Variants.
-* Command line directories:: What happens if you put a directory on the
- command line.
-* Print:: The @code{print} statement.
-* Print Examples:: Simple examples of @code{print} statements.
-* Output Separators:: The output separators and how to change
- them.
-* OFMT:: Controlling Numeric Output With
- @code{print}.
-* Printf:: The @code{printf} statement.
-* Basic Printf:: Syntax of the @code{printf} statement.
-* Control Letters:: Format-control letters.
-* Format Modifiers:: Format-specification modifiers.
-* Printf Examples:: Several examples.
-* Redirection:: How to redirect output to multiple files
- and pipes.
-* Special Files:: File name interpretation in @command{gawk}.
- @command{gawk} allows access to inherited
- file descriptors.
-* Special FD:: Special files for I/O.
-* Special Network:: Special files for network communications.
-* Special Caveats:: Things to watch out for.
-* Close Files And Pipes:: Closing Input and Output Files and Pipes.
-* Values:: Constants, Variables, and Regular
- Expressions.
-* Constants:: String, numeric and regexp constants.
-* Scalar Constants:: Numeric and string constants.
-* Nondecimal-numbers:: What are octal and hex numbers.
-* Regexp Constants:: Regular Expression constants.
-* Using Constant Regexps:: When and how to use a regexp constant.
-* Variables:: Variables give names to values for later
- use.
-* Using Variables:: Using variables in your programs.
-* Assignment Options:: Setting variables on the command-line and a
- summary of command-line syntax. This is an
- advanced method of input.
-* Conversion:: The conversion of strings to numbers and
- vice versa.
-* All Operators:: @command{gawk}'s operators.
-* Arithmetic Ops:: Arithmetic operations (@samp{+}, @samp{-},
- etc.)
-* Concatenation:: Concatenating strings.
-* Assignment Ops:: Changing the value of a variable or a
- field.
-* Increment Ops:: Incrementing the numeric value of a
- variable.
-* Truth Values and Conditions:: Testing for true and false.
-* Truth Values:: What is ``true'' and what is ``false''.
-* Typing and Comparison:: How variables acquire types and how this
- affects comparison of numbers and strings
- with @samp{<}, etc.
-* Variable Typing:: String type versus numeric type.
-* Comparison Operators:: The comparison operators.
-* POSIX String Comparison:: String comparison with POSIX rules.
-* Boolean Ops:: Combining comparison expressions using
- boolean operators @samp{||} (``or''),
- @samp{&&} (``and'') and @samp{!} (``not'').
-* Conditional Exp:: Conditional expressions select between two
- subexpressions under control of a third
- subexpression.
-* Function Calls:: A function call is an expression.
-* Precedence:: How various operators nest.
-* Pattern Overview:: What goes into a pattern.
-* Regexp Patterns:: Using regexps as patterns.
-* Expression Patterns:: Any expression can be used as a pattern.
-* Ranges:: Pairs of patterns specify record ranges.
-* BEGIN/END:: Specifying initialization and cleanup
- rules.
-* Using BEGIN/END:: How and why to use BEGIN/END rules.
-* I/O And BEGIN/END:: I/O issues in BEGIN/END rules.
-* Empty:: The empty pattern, which matches every
- record.
-* BEGINFILE/ENDFILE:: Two special patterns for advanced control.
-* Using Shell Variables:: How to use shell variables with
- @command{awk}.
-* Action Overview:: What goes into an action.
-* Statements:: Describes the various control statements in
- detail.
-* If Statement:: Conditionally execute some @command{awk}
- statements.
-* While Statement:: Loop until some condition is satisfied.
-* Do Statement:: Do specified action while looping until
- some condition is satisfied.
-* For Statement:: Another looping statement, that provides
- initialization and increment clauses.
-* Switch Statement:: Switch/case evaluation for conditional
- execution of statements based on a value.
-* Break Statement:: Immediately exit the innermost enclosing
- loop.
-* Continue Statement:: Skip to the end of the innermost enclosing
- loop.
-* Next Statement:: Stop processing the current input record.
-* Nextfile Statement:: Stop processing the current file.
-* Exit Statement:: Stop execution of @command{awk}.
-* Built-in Variables:: Summarizes the built-in variables.
-* User-modified:: Built-in variables that you change to
- control @command{awk}.
-* Auto-set:: Built-in variables where @command{awk}
- gives you information.
-* ARGC and ARGV:: Ways to use @code{ARGC} and @code{ARGV}.
-* Array Basics:: The basics of arrays.
-* Array Intro:: Introduction to Arrays
-* Reference to Elements:: How to examine one element of an array.
-* Assigning Elements:: How to change an element of an array.
-* Array Example:: Basic Example of an Array
-* Scanning an Array:: A variation of the @code{for} statement. It
- loops through the indices of an array's
- existing elements.
-* Controlling Scanning:: Controlling the order in which arrays
- are scanned.
-* Delete:: The @code{delete} statement removes an
- element from an array.
-* Numeric Array Subscripts:: How to use numbers as subscripts in
- @command{awk}.
-* Uninitialized Subscripts:: Using Uninitialized variables as
- subscripts.
-* Multi-dimensional:: Emulating multidimensional arrays in
- @command{awk}.
-* Multi-scanning:: Scanning multidimensional arrays.
-* Array Sorting:: Sorting array values and indices.
-* Arrays of Arrays:: True multidimensional arrays.
-* Built-in:: Summarizes the built-in functions.
-* Calling Built-in:: How to call built-in functions.
-* Numeric Functions:: Functions that work with numbers, including
- @code{int()}, @code{sin()} and
- @code{rand()}.
-* String Functions:: Functions for string manipulation, such as
- @code{split()}, @code{match()} and
- @code{sprintf()}.
-* Gory Details:: More than you want to know about @samp{\}
- and @samp{&} with @code{sub()},
- @code{gsub()}, and @code{gensub()}.
-* I/O Functions:: Functions for files and shell commands.
-* Time Functions:: Functions for dealing with timestamps.
-* Bitwise Functions:: Functions for bitwise operations.
-* Type Functions:: Functions for type information.
-* I18N Functions:: Functions for string translation.
-* User-defined:: Describes User-defined functions in detail.
-* Definition Syntax:: How to write definitions and what they
- mean.
-* Function Example:: An example function definition and what it
- does.
-* Function Caveats:: Things to watch out for.
-* Calling A Function:: Don't use spaces.
-* Variable Scope:: Controlling variable scope.
-* Pass By Value/Reference:: Passing parameters.
-* Return Statement:: Specifying the value a function returns.
-* Dynamic Typing:: How variable types can change at runtime.
-* Indirect Calls:: Choosing the function to call at runtime.
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU @code{gettext} works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging @code{printf} arguments.
-* I18N Portability:: @command{awk}-level portability issues.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: @command{gawk} is also internationalized.
-* Nondecimal Data:: Allowing nondecimal input data.
-* Two-way I/O:: Two-way communications with another
- process.
-* TCP/IP Networking:: Using @command{gawk} for network
- programming.
-* Profiling:: Profiling your @command{awk} programs.
-* Library Names:: How to best name private global variables
- in library functions.
-* General Functions:: Functions that are of general use.
-* Strtonum Function:: A replacement for the built-in
- @code{strtonum()} function.
-* Assert Function:: A function for assertions in @command{awk}
- programs.
-* Round Function:: A function for rounding if @code{sprintf()}
- does not do it correctly.
-* Cliff Random Function:: The Cliff Random Number Generator.
-* Ordinal Functions:: Functions for using characters as numbers
- and vice versa.
-* Join Function:: A function to join an array into a string.
-* Gettimeofday Function:: A function to get formatted times.
-* Data File Management:: Functions for managing command-line data
- files.
-* Filetrans Function:: A function for handling data file
- transitions.
-* Rewind Function:: A function for rereading the current file.
-* File Checking:: Checking that data files are readable.
-* Empty Files:: Checking for zero-length files.
-* Ignoring Assigns:: Treating assignments as file names.
-* Getopt Function:: A function for processing command-line
- arguments.
-* Passwd Functions:: Functions for getting user information.
-* Group Functions:: Functions for getting group information.
-* Running Examples:: How to run these examples.
-* Clones:: Clones of common utilities.
-* Cut Program:: The @command{cut} utility.
-* Egrep Program:: The @command{egrep} utility.
-* Id Program:: The @command{id} utility.
-* Split Program:: The @command{split} utility.
-* Tee Program:: The @command{tee} utility.
-* Uniq Program:: The @command{uniq} utility.
-* Wc Program:: The @command{wc} utility.
-* Miscellaneous Programs:: Some interesting @command{awk} programs.
-* Dupword Program:: Finding duplicated words in a document.
-* Alarm Program:: An alarm clock.
-* Translate Program:: A program similar to the @command{tr}
- utility.
-* Labels Program:: Printing mailing labels.
-* Word Sorting:: A program to produce a word usage count.
-* History Sorting:: Eliminating duplicate entries from a
- history file.
-* Extract Program:: Pulling out programs from Texinfo source
- files.
-* Simple Sed:: A Simple Stream Editor.
-* Igawk Program:: A wrapper for @command{awk} that includes
- files.
-* Anagram Program:: Finding anagrams from a dictionary.
-* Signature Program:: People do amazing things with too much time
- on their hands.
-* Debugging:: Introduction to @command{dgawk}.
-* Debugging Concepts:: Debugging In General.
-* Debugging Terms:: Additional Debugging Concepts.
-* Awk Debugging:: Awk Debugging.
-* Sample dgawk session:: Sample @command{dgawk} session.
-* dgawk invocation:: @command{dgawk} Invocation.
-* Finding The Bug:: Finding The Bug.
-* List of Debugger Commands:: Main @command{dgawk} Commands.
-* Breakpoint Control:: Control of breakpoints.
-* Dgawk Execution Control:: Control of execution.
-* Viewing And Changing Data:: Viewing and changing data.
-* Dgawk Stack:: Dealing with the stack.
-* Dgawk Info:: Obtaining information about the program and
- the debugger state.
-* Miscellaneous Dgawk Commands:: Miscellaneous Commands.
-* Readline Support:: Readline Support.
-* Dgawk Limitations:: Limitations and future plans.
-* V7/SVR3.1:: The major changes between V7 and System V
- Release 3.1.
-* SVR4:: Minor changes between System V Releases 3.1
- and 4.
-* POSIX:: New features from the POSIX standard.
-* BTL:: New features from Brian Kernighan's
- version of @command{awk}.
-* POSIX/GNU:: The extensions in @command{gawk} not in
- POSIX @command{awk}.
-* Contributors:: The major contributors to @command{gawk}.
-* Common Extensions:: Common Extensions Summary.
-* Gawk Distribution:: What is in the @command{gawk} distribution.
-* Getting:: How to get the distribution.
-* Extracting:: How to extract the distribution.
-* Distribution contents:: What is in the distribution.
-* Unix Installation:: Installing @command{gawk} under various
- versions of Unix.
-* Quick Installation:: Compiling @command{gawk} under Unix.
-* Additional Configuration Options:: Other compile-time options.
-* Configuration Philosophy:: How it's all supposed to work.
-* Non-Unix Installation:: Installation on Other Operating Systems.
-* PC Installation:: Installing and Compiling @command{gawk} on
- MS-DOS and OS/2.
-* PC Binary Installation:: Installing a prepared distribution.
-* PC Compiling:: Compiling @command{gawk} for MS-DOS,
- Windows32, and OS/2.
-* PC Testing:: Testing @command{gawk} on PC
- Operating Systems.
-* PC Using:: Running @command{gawk} on MS-DOS, Windows32
- and OS/2.
-* Cygwin:: Building and running @command{gawk} for
- Cygwin.
-* MSYS:: Using @command{gawk} In The MSYS
- Environment.
-* VMS Installation:: Installing @command{gawk} on VMS.
-* VMS Compilation:: How to compile @command{gawk} under VMS.
-* VMS Installation Details:: How to install @command{gawk} under VMS.
-* VMS Running:: How to run @command{gawk} under VMS.
-* VMS Old Gawk:: An old version comes with some VMS systems.
-* Bugs:: Reporting Problems and Bugs.
-* Other Versions:: Other freely available @command{awk}
- implementations.
-* Compatibility Mode:: How to disable certain @command{gawk}
- extensions.
-* Additions:: Making Additions To @command{gawk}.
-* Accessing The Source:: Accessing the Git repository.
-* Adding Code:: Adding code to the main body of
- @command{gawk}.
-* New Ports:: Porting @command{gawk} to a new operating
- system.
-* Dynamic Extensions:: Adding new built-in functions to
- @command{gawk}.
-* Internals:: A brief look at some @command{gawk}
- internals.
-* Plugin License:: A note about licensing.
-* Sample Library:: A example of new functions.
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
-* Future Extensions:: New features that may be implemented one
- day.
-* Basic High Level:: The high level view.
-* Basic Data Typing:: A very quick intro to data types.
-* Floating Point Issues:: Stuff to know about floating-point numbers.
-* String Conversion Precision:: The String Value Can Lie.
-* Unexpected Results:: Floating Point Numbers Are Not Abstract
- Numbers.
-* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* History:: The history of @command{gawk} and
+ @command{awk}.
+* Names:: What name to use to find @command{awk}.
+* This Manual:: Using this @value{DOCUMENT}. Includes
+ sample input files that you can use.
+* Conventions:: Typographical Conventions.
+* Manual History:: Brief history of the GNU project and
+ this @value{DOCUMENT}.
+* How To Contribute:: Helping to save the world.
+* Acknowledgments:: Acknowledgments.
+* Running gawk:: How to run @command{gawk} programs;
+ includes command-line syntax.
+* One-shot:: Running a short throwaway @command{awk}
+ program.
+* Read Terminal:: Using no input files (input from
+ terminal instead).
+* Long:: Putting permanent @command{awk}
+ programs in files.
+* Executable Scripts:: Making self-contained @command{awk}
+ programs.
+* Comments:: Adding documentation to @command{gawk}
+ programs.
+* Quoting:: More discussion of shell quoting
+ issues.
+* DOS Quoting:: Quoting in Windows Batch Files.
+* Sample Data Files:: Sample data files for use in the
+ @command{awk} programs illustrated in
+ this @value{DOCUMENT}.
+* Very Simple:: A very simple example.
+* Two Rules:: A less simple one-line example using
+ two rules.
+* More Complex:: A more complex example.
+* Statements/Lines:: Subdividing or combining statements
+ into lines.
+* Other Features:: Other Features of @command{awk}.
+* When:: When to use @command{gawk} and when to
+ use other things.
+* Command Line:: How to run @command{awk}.
+* Options:: Command-line options and their
+ meanings.
+* Other Arguments:: Input file names and variable
+ assignments.
+* Naming Standard Input:: How to specify standard input with
+ other files.
+* Environment Variables:: The environment variables
+ @command{gawk} uses.
+* AWKPATH Variable:: Searching directories for @command{awk}
+ programs.
+* Other Environment Variables:: The environment variables.
+* Exit Status:: @command{gawk}'s exit status.
+* Include Files:: Including other files into your
+ program.
+* Obsolete:: Obsolete Options and/or features.
+* Undocumented:: Undocumented Options and Features.
+* Regexp Usage:: How to Use Regular Expressions.
+* Escape Sequences:: How to write nonprinting characters.
+* Regexp Operators:: Regular Expression Operators.
+* Bracket Expressions:: What can go between @samp{[...]}.
+* GNU Regexp Operators:: Operators specific to GNU software.
+* Case-sensitivity:: How to do case-insensitive matching.
+* Leftmost Longest:: How much text matches.
+* Computed Regexps:: Using Dynamic Regexps.
+* Locales:: How the locale affects things.
+* Records:: Controlling how data is split into
+ records.
+* Fields:: An introduction to fields.
+* Nonconstant Fields:: Nonconstant Field Numbers.
+* Changing Fields:: Changing the Contents of a Field.
+* Field Separators:: The field separator and how to change
+ it.
+* Default Field Splitting:: How fields are normally separated.
+* Regexp Field Splitting:: Using regexps as the field separator.
+* Single Character Fields:: Making each character a separate field.
+* Command Line Field Separator:: Setting @code{FS} from the
+ command-line.
+* Field Splitting Summary:: Some final points and a summary table.
+* Constant Size:: Reading constant width data.
+* Splitting By Content:: Defining Fields By Content
+* Multiple Line:: Reading multi-line records.
+* Getline:: Reading files under explicit program
+ control using the @code{getline}
+ function.
+* Plain Getline:: Using @code{getline} with no arguments.
+* Getline/Variable:: Using @code{getline} into a variable.
+* Getline/File:: Using @code{getline} from a file.
+* Getline/Variable/File:: Using @code{getline} into a variable
+ from a file.
+* Getline/Pipe:: Using @code{getline} from a pipe.
+* Getline/Variable/Pipe:: Using @code{getline} into a variable
+ from a pipe.
+* Getline/Coprocess:: Using @code{getline} from a coprocess.
+* Getline/Variable/Coprocess:: Using @code{getline} into a variable
+ from a coprocess.
+* Getline Notes:: Important things to know about
+ @code{getline}.
+* Getline Summary:: Summary of @code{getline} Variants.
+* Command line directories:: What happens if you put a directory on
+ the command line.
+* Print:: The @code{print} statement.
+* Print Examples:: Simple examples of @code{print}
+ statements.
+* Output Separators:: The output separators and how to change
+ them.
+* OFMT:: Controlling Numeric Output With
+ @code{print}.
+* Printf:: The @code{printf} statement.
+* Basic Printf:: Syntax of the @code{printf} statement.
+* Control Letters:: Format-control letters.
+* Format Modifiers:: Format-specification modifiers.
+* Printf Examples:: Several examples.
+* Redirection:: How to redirect output to multiple
+ files and pipes.
+* Special Files:: File name interpretation in
+ @command{gawk}. @command{gawk} allows
+ access to inherited file descriptors.
+* Special FD:: Special files for I/O.
+* Special Network:: Special files for network
+ communications.
+* Special Caveats:: Things to watch out for.
+* Close Files And Pipes:: Closing Input and Output Files and
+ Pipes.
+* Values:: Constants, Variables, and Regular
+ Expressions.
+* Constants:: String, numeric and regexp constants.
+* Scalar Constants:: Numeric and string constants.
+* Nondecimal-numbers:: What are octal and hex numbers.
+* Regexp Constants:: Regular Expression constants.
+* Using Constant Regexps:: When and how to use a regexp constant.
+* Variables:: Variables give names to values for
+ later use.
+* Using Variables:: Using variables in your programs.
+* Assignment Options:: Setting variables on the command-line
+ and a summary of command-line syntax.
+ This is an advanced method of input.
+* Conversion:: The conversion of strings to numbers
+ and vice versa.
+* All Operators:: @command{gawk}'s operators.
+* Arithmetic Ops:: Arithmetic operations (@samp{+},
+ @samp{-}, etc.)
+* Concatenation:: Concatenating strings.
+* Assignment Ops:: Changing the value of a variable or a
+ field.
+* Increment Ops:: Incrementing the numeric value of a
+ variable.
+* Truth Values and Conditions:: Testing for true and false.
+* Truth Values:: What is ``true'' and what is ``false''.
+* Typing and Comparison:: How variables acquire types and how
+ this affects comparison of numbers and
+ strings with @samp{<}, etc.
+* Variable Typing:: String type versus numeric type.
+* Comparison Operators:: The comparison operators.
+* POSIX String Comparison:: String comparison with POSIX rules.
+* Boolean Ops:: Combining comparison expressions using
+ boolean operators @samp{||} (``or''),
+ @samp{&&} (``and'') and @samp{!}
+ (``not'').
+* Conditional Exp:: Conditional expressions select between
+ two subexpressions under control of a
+ third subexpression.
+* Function Calls:: A function call is an expression.
+* Precedence:: How various operators nest.
+* Pattern Overview:: What goes into a pattern.
+* Regexp Patterns:: Using regexps as patterns.
+* Expression Patterns:: Any expression can be used as a
+ pattern.
+* Ranges:: Pairs of patterns specify record
+ ranges.
+* BEGIN/END:: Specifying initialization and cleanup
+ rules.
+* Using BEGIN/END:: How and why to use BEGIN/END rules.
+* I/O And BEGIN/END:: I/O issues in BEGIN/END rules.
+* BEGINFILE/ENDFILE:: Two special patterns for advanced
+ control.
+* Empty:: The empty pattern, which matches every
+ record.
+* Using Shell Variables:: How to use shell variables with
+ @command{awk}.
+* Action Overview:: What goes into an action.
+* Statements:: Describes the various control
+ statements in detail.
+* If Statement:: Conditionally execute some
+ @command{awk} statements.
+* While Statement:: Loop until some condition is satisfied.
+* Do Statement:: Do specified action while looping until
+ some condition is satisfied.
+* For Statement:: Another looping statement, that
+ provides initialization and increment
+ clauses.
+* Switch Statement:: Switch/case evaluation for conditional
+ execution of statements based on a
+ value.
+* Break Statement:: Immediately exit the innermost
+ enclosing loop.
+* Continue Statement:: Skip to the end of the innermost
+ enclosing loop.
+* Next Statement:: Stop processing the current input
+ record.
+* Nextfile Statement:: Stop processing the current file.
+* Exit Statement:: Stop execution of @command{awk}.
+* Built-in Variables:: Summarizes the built-in variables.
+* User-modified:: Built-in variables that you change to
+ control @command{awk}.
+* Auto-set:: Built-in variables where @command{awk}
+ gives you information.
+* ARGC and ARGV:: Ways to use @code{ARGC} and
+ @code{ARGV}.
+* Array Basics:: The basics of arrays.
+* Array Intro:: Introduction to Arrays
+* Reference to Elements:: How to examine one element of an array.
+* Assigning Elements:: How to change an element of an array.
+* Array Example:: Basic Example of an Array
+* Scanning an Array:: A variation of the @code{for}
+ statement. It loops through the indices
+ of an array's existing elements.
+* Delete:: The @code{delete} statement removes an
+ element from an array.
+* Numeric Array Subscripts:: How to use numbers as subscripts in
+ @command{awk}.
+* Uninitialized Subscripts:: Using Uninitialized variables as
+ subscripts.
+* Multi-dimensional:: Emulating multidimensional arrays in
+ @command{awk}.
+* Multi-scanning:: Scanning multidimensional arrays.
+* Arrays of Arrays:: True multidimensional arrays.
+* Built-in:: Summarizes the built-in functions.
+* Calling Built-in:: How to call built-in functions.
+* Numeric Functions:: Functions that work with numbers,
+ including @code{int()}, @code{sin()}
+ and @code{rand()}.
+* String Functions:: Functions for string manipulation, such
+ as @code{split()}, @code{match()} and
+ @code{sprintf()}.
+* Gory Details:: More than you want to know about
+ @samp{\} and @samp{&} with
+ @code{sub()}, @code{gsub()}, and
+ @code{gensub()}.
+* I/O Functions:: Functions for files and shell commands.
+* Time Functions:: Functions for dealing with timestamps.
+* Bitwise Functions:: Functions for bitwise operations.
+* Type Functions:: Functions for type information.
+* I18N Functions:: Functions for string translation.
+* User-defined:: Describes User-defined functions in
+ detail.
+* Definition Syntax:: How to write definitions and what they
+ mean.
+* Function Example:: An example function definition and what
+ it does.
+* Function Caveats:: Things to watch out for.
+* Calling A Function:: Don't use spaces.
+* Variable Scope:: Controlling variable scope.
+* Pass By Value/Reference:: Passing parameters.
+* Return Statement:: Specifying the value a function
+ returns.
+* Dynamic Typing:: How variable types can change at
+ runtime.
+* Indirect Calls:: Choosing the function to call at
+ runtime.
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU @code{gettext} works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging @code{printf} arguments.
+* I18N Portability:: @command{awk}-level portability issues.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: @command{gawk} is also
+ internationalized.
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array
+ traversal and sorting arrays.
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Controlling Scanning With A Function:: Using a function to control scanning.
+* Controlling Scanning:: Controlling the order in which arrays
+ are scanned.
+* Array Sorting Functions:: How to use @code{asort()} and
+ @code{asorti()}.
+* Two-way I/O:: Two-way communications with another
+ process.
+* TCP/IP Networking:: Using @command{gawk} for network
+ programming.
+* Profiling:: Profiling your @command{awk} programs.
+* Library Names:: How to best name private global
+ variables in library functions.
+* General Functions:: Functions that are of general use.
+* Strtonum Function:: A replacement for the built-in
+ @code{strtonum()} function.
+* Assert Function:: A function for assertions in
+ @command{awk} programs.
+* Round Function:: A function for rounding if
+ @code{sprintf()} does not do it
+ correctly.
+* Cliff Random Function:: The Cliff Random Number Generator.
+* Ordinal Functions:: Functions for using characters as
+ numbers and vice versa.
+* Join Function:: A function to join an array into a
+ string.
+* Gettimeofday Function:: A function to get formatted times.
+* Data File Management:: Functions for managing command-line
+ data files.
+* Filetrans Function:: A function for handling data file
+ transitions.
+* Rewind Function:: A function for rereading the current
+ file.
+* File Checking:: Checking that data files are readable.
+* Empty Files:: Checking for zero-length files.
+* Ignoring Assigns:: Treating assignments as file names.
+* Getopt Function:: A function for processing command-line
+ arguments.
+* Passwd Functions:: Functions for getting user information.
+* Group Functions:: Functions for getting group
+ information.
+* Walking Arrays:: A function to walk arrays of arrays.
+* Running Examples:: How to run these examples.
+* Clones:: Clones of common utilities.
+* Cut Program:: The @command{cut} utility.
+* Egrep Program:: The @command{egrep} utility.
+* Id Program:: The @command{id} utility.
+* Split Program:: The @command{split} utility.
+* Tee Program:: The @command{tee} utility.
+* Uniq Program:: The @command{uniq} utility.
+* Wc Program:: The @command{wc} utility.
+* Miscellaneous Programs:: Some interesting @command{awk}
+ programs.
+* Dupword Program:: Finding duplicated words in a document.
+* Alarm Program:: An alarm clock.
+* Translate Program:: A program similar to the @command{tr}
+ utility.
+* Labels Program:: Printing mailing labels.
+* Word Sorting:: A program to produce a word usage
+ count.
+* History Sorting:: Eliminating duplicate entries from a
+ history file.
+* Extract Program:: Pulling out programs from Texinfo
+ source files.
+* Simple Sed:: A Simple Stream Editor.
+* Igawk Program:: A wrapper for @command{awk} that
+ includes files.
+* Anagram Program:: Finding anagrams from a dictionary.
+* Signature Program:: People do amazing things with too much
+ time on their hands.
+* Debugging:: Introduction to @command{dgawk}.
+* Debugging Concepts:: Debugging In General.
+* Debugging Terms:: Additional Debugging Concepts.
+* Awk Debugging:: Awk Debugging.
+* Sample dgawk session:: Sample @command{dgawk} session.
+* dgawk invocation:: @command{dgawk} Invocation.
+* Finding The Bug:: Finding The Bug.
+* List of Debugger Commands:: Main @command{dgawk} Commands.
+* Breakpoint Control:: Control of breakpoints.
+* Dgawk Execution Control:: Control of execution.
+* Viewing And Changing Data:: Viewing and changing data.
+* Dgawk Stack:: Dealing with the stack.
+* Dgawk Info:: Obtaining information about the program
+ and the debugger state.
+* Miscellaneous Dgawk Commands:: Miscellaneous Commands.
+* Readline Support:: Readline Support.
+* Dgawk Limitations:: Limitations and future plans.
+* V7/SVR3.1:: The major changes between V7 and System
+ V Release 3.1.
+* SVR4:: Minor changes between System V Releases
+ 3.1 and 4.
+* POSIX:: New features from the POSIX standard.
+* BTL:: New features from Brian Kernighan's
+ version of @command{awk}.
+* POSIX/GNU:: The extensions in @command{gawk} not in
+ POSIX @command{awk}.
+* Common Extensions:: Common Extensions Summary.
+* Contributors:: The major contributors to
+ @command{gawk}.
+* Gawk Distribution:: What is in the @command{gawk}
+ distribution.
+* Getting:: How to get the distribution.
+* Extracting:: How to extract the distribution.
+* Distribution contents:: What is in the distribution.
+* Unix Installation:: Installing @command{gawk} under various
+ versions of Unix.
+* Quick Installation:: Compiling @command{gawk} under Unix.
+* Additional Configuration Options:: Other compile-time options.
+* Configuration Philosophy:: How it's all supposed to work.
+* Non-Unix Installation:: Installation on Other Operating
+ Systems.
+* PC Installation:: Installing and Compiling @command{gawk}
+ on MS-DOS and OS/2.
+* PC Binary Installation:: Installing a prepared distribution.
+* PC Compiling:: Compiling @command{gawk} for MS-DOS,
+ Windows32, and OS/2.
+* PC Testing:: Testing @command{gawk} on PC systems.
+* PC Using:: Running @command{gawk} on MS-DOS,
+ Windows32 and OS/2.
+* Cygwin:: Building and running @command{gawk} for
+ Cygwin.
+* MSYS:: Using @command{gawk} In The MSYS
+ Environment.
+* VMS Installation:: Installing @command{gawk} on VMS.
+* VMS Compilation:: How to compile @command{gawk} under
+ VMS.
+* VMS Installation Details:: How to install @command{gawk} under
+ VMS.
+* VMS Running:: How to run @command{gawk} under VMS.
+* VMS Old Gawk:: An old version comes with some VMS
+ systems.
+* Bugs:: Reporting Problems and Bugs.
+* Other Versions:: Other freely available @command{awk}
+ implementations.
+* Compatibility Mode:: How to disable certain @command{gawk}
+ extensions.
+* Additions:: Making Additions To @command{gawk}.
+* Accessing The Source:: Accessing the Git repository.
+* Adding Code:: Adding code to the main body of
+ @command{gawk}.
+* New Ports:: Porting @command{gawk} to a new
+ operating system.
+* Dynamic Extensions:: Adding new built-in functions to
+ @command{gawk}.
+* Internals:: A brief look at some @command{gawk}
+ internals.
+* Plugin License:: A note about licensing.
+* Sample Library:: An example of new functions.
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
+* Future Extensions:: New features that may be implemented
+ one day.
+* Basic High Level:: The high level view.
+* Basic Data Typing:: A very quick intro to data types.
+* Floating Point Issues:: Stuff to know about floating-point
+ numbers.
+* String Conversion Precision:: The String Value Can Lie.
+* Unexpected Results:: Floating Point Numbers Are Not Abstract
+ Numbers.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
@end detailmenu
@end menu
@@ -13036,7 +13079,6 @@ same @command{awk} program.
* Uninitialized Subscripts:: Using Uninitialized variables as subscripts.
* Multi-dimensional:: Emulating multidimensional arrays in
@command{awk}.
-* Array Sorting:: Sorting array values and indices.
* Arrays of Arrays:: True multidimensional arrays.
@end menu
@@ -13378,11 +13420,6 @@ END @{
@cindex elements in arrays, scanning
@cindex arrays, scanning
-@menu
-* Controlling Scanning:: Controlling the order in which arrays are scanned.
-* Controlling Scanning With A Function:: Using a function to control scanning.
-@end menu
-
In programs that use arrays, it is often necessary to use a loop that
executes once for each element of an array. In other languages, where
arrays are contiguous and indices are limited to positive integers,
@@ -13447,286 +13484,14 @@ the loop body; it is not predictable whether the @code{for} loop will
reach them. Similarly, changing @var{var} inside the loop may produce
strange results. It is best to avoid such things.
-@node Controlling Scanning
-@subsubsection Controlling Array Scanning Order
-
As an extension, @command{gawk} makes it possible for you to
loop over the elements of an array in order, based on the value of
@code{PROCINFO["sorted_in"]} (@pxref{Auto-set}).
-Several sorting options are available:
-
-@table @samp
-@item ascending index string
-Order by indices compared as strings; this is the most basic sort.
-(Internally, array indices are always strings, so with @samp{a[2*5] = 1}
-the index is actually @code{"10"} rather than numeric 10.)
-
-@item ascending index number
-Order by indices but force them to be treated as numbers in the process.
-Any index with non-numeric value will end up positioned as if it were zero.
-
-@item ascending value string
-Order by element values rather than by indices. Scalar values are
-compared as strings. Subarrays, if present, come out last.
-
-@item ascending value number
-Order by values but force scalar values to be treated as numbers
-for the purpose of comparison. If there are subarrays, those appear
-at the end of the sorted list.
-
-@item descending index string
-Reverse order from the most basic sort.
-
-@item descending index number
-Numeric indices ordered from high to low.
-
-@item descending value string
-Element values, treated as strings, ordered from high to low. Subarrays, if present,
-come out first.
-
-@item descending value number
-Element values, treated as numbers, ordered from high to low. Subarrays, if present,
-come out first.
-
-@item unsorted
-Array elements are processed in arbitrary order, the normal @command{awk}
-behavior. You can also get the normal behavior by just
-deleting the @code{"sorted_in"} item from the @code{PROCINFO} array, if
-it previously had a value assigned to it.
-@end table
-
-The array traversal order is determined before the @code{for} loop
-starts to run. Changing @code{PROCINFO["sorted_in"]} in the loop body
-will not affect the loop.
-
-Portions of the sort specification string may be truncated or omitted.
-The default is @samp{ascending} for direction, @samp{index} for sort key type,
-and @samp{string} for comparison mode. This implies that one can
-simply assign the empty string, "", instead of "ascending index string" to
-@code{PROCINFO["sorted_in"]} for the same effect.
-
-For example:
-
-@example
-$ @kbd{gawk 'BEGIN @{}
-> @kbd{ a[4] = 4}
-> @kbd{ a[3] = 3}
-> @kbd{ for (i in a)}
-> @kbd{ print i, a[i]}
-> @kbd{@}'}
-@print{} 4 4
-@print{} 3 3
-$ @kbd{gawk 'BEGIN @{}
-> @kbd{ PROCINFO["sorted_in"] = "asc index"}
-> @kbd{ a[4] = 4}
-> @kbd{ a[3] = 3}
-> @kbd{ for (i in a)}
-> @kbd{ print i, a[i]}
-> @kbd{@}'}
-@print{} 3 3
-@print{} 4 4
-@end example
-
-When sorting an array by element values, if a value happens to be
-a subarray then it is considered to be greater than any string or
-numeric value, regardless of what the subarray itself contains,
-and all subarrays are treated as being equal to each other. Their
-order relative to each other is determined by their index strings.
+This is an advanced feature, so discussion of it is delayed
+until @ref{Controlling Array Traversal}.
-@node Controlling Scanning With A Function
-@subsubsection Controlling Array Scanning Order With a User-defined Function
-
-The value of @code{PROCINFO["sorted_in"]} can also be a function name.
-This lets you traverse an array based on any custom criterion.
-The array elements are ordered according to the return value of this
-function. This comparison function should be defined with at least
-four arguments:
-
-@example
-function comp_func(i1, v1, i2, v2)
-@{
- @var{compare elements 1 and 2 in some fashion}
- @var{return < 0; 0; or > 0}
-@}
-@end example
-
-Here, @var{i1} and @var{i2} are the indices, and @var{v1} and @var{v2}
-are the corresponding values of the two elements being compared.
-Either @var{v1} or @var{v2}, or both, can be arrays if the array being
-traversed contains subarrays as values. The three possible return values
-are interpreted this way:
-
-@itemize @bullet
-@item
-If the return value of @code{comp_func(i1, v1, i2, v2)} is less than zero,
-index @var{i1} comes before index @var{i2} during loop traversal.
-
-@item
-If @code{comp_func(i1, v1, i2, v2)} returns zero, @var{i1} and @var{i2}
-come together but the relative order with respect to each other is undefined.
-
-@item
-If the return value of @code{comp_func(i1, v1, i2, v2)} is greater than zero,
-@var{i1} comes after @var{i2}.
-@end itemize
-
-The following comparison function can be used to scan an array in
-numerical order of the indices:
-
-@example
-function cmp_num_idx(i1, v1, i2, v2)
-@{
- # numerical index comparison, ascending order
- return (i1 - i2)
-@}
-@end example
-
-This function traverses an array based on an order by element values
-rather than by indices:
-
-@example
-function cmp_str_val(i1, v1, i2, v2)
-@{
- # string value comparison, ascending order
- v1 = v1 ""
- v2 = v2 ""
- if (v1 < v2)
- return -1
- return (v1 != v2)
-@}
-@end example
-
-Here is a
-comparison function to make all numbers, and numeric strings without
-any leading or trailing spaces, come out first during loop traversal:
-
-@example
-function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
-@{
- # numbers before string value comparison, ascending order
- n1 = v1 + 0
- n2 = v2 + 0
- if (n1 == v1)
- return (n2 == v2) ? (n1 - n2) : -1
- else if (n2 == v2)
- return 1
- return (v1 < v2) ? -1 : (v1 != v2)
-@}
-@end example
-
-Consider sorting the entries of a GNU/Linux system password file
-according to login names. The following program which sorts records
-by a specific field position can be used for this purpose:
-
-@example
-# sort.awk --- simple program to sort by field position
-# field position is specified by the global variable POS
-
-function cmp_field(i1, v1, i2, v2)
-@{
- # comparison by value, as string, and ascending order
- return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
-@}
-
-@{
- for (i = 1; i <= NF; i++)
- a[NR][i] = $i
-@}
-
-END @{
- PROCINFO["sorted_in"] = "cmp_field"
- if (POS < 1 || POS > NF)
- POS = 1
- for (i in a) @{
- for (j = 1; j <= NF; j++)
- printf("%s%c", a[i][j], j < NF ? ":" : "")
- print ""
- @}
-@}
-@end example
-
-The first field in each entry of the password file is the user's login name,
-and the fields are seperated by colons. Running the program produces the
-following output:
-
-@example
-$ @kbd{gawk -vPOS=1 -F: -f sort.awk /etc/passwd}
-@print{} adm:x:3:4:adm:/var/adm:/sbin/nologin
-@print{} apache:x:48:48:Apache:/var/www:/sbin/nologin
-@print{} avahi:x:70:70:Avahi daemon:/:/sbin/nologin
-@dots{}
-@end example
-
-The comparison normally should always return the same value when given a
-specific pair of array elements as its arguments. If inconsistent
-results are returned then the order is undefined. This behavior is
-sometimes exploited to introduce random order in otherwise seemingly
-ordered data:
-
-@example
-function cmp_randomize(i1, v1, i2, v2)
-@{
- # random order
- return (2 - 4 * rand())
-@}
-@end example
-
-As mentioned above, the order of the indices is arbitrary if two
-elements compare equal. This is usually not a problem, but letting
-the tied elements come out in arbitrary order can be an issue, especially
-when comparing item values. The partial ordering of the equal elements
-may change during the next loop traversal, if other elements are added or
-removed from the array. One way to resolve ties when comparing elements
-with otherwise equal values is to include the indices in the comparison
-rules. Note that doing this may make the loop traversal less efficient,
-so consider it only if necessary. The following comparison functions
-force a deterministic order, and are based on the fact that the
-indices of two elements are never equal:
-
-@example
-function cmp_numeric(i1, v1, i2, v2)
-@{
- # numerical value (and index) comparison, descending order
- return (v1 != v2) ? (v2 - v1) : (i2 - i1)
-@}
-
-function cmp_string(i1, v1, i2, v2)
-@{
- # string value (and index) comparison, descending order
- v1 = v1 i1
- v2 = v2 i2
- return (v1 > v2) ? -1 : (v1 != v2)
-@}
-@end example
-
-@c Avoid using the term ``stable'' when describing the unpredictable behavior
-@c if two items compare equal. Usually, the goal of a "stable algorithm"
-@c is to maintain the original order of the items, which is a meaningless
-@c concept for a list constructed from a hash.
-
-A custom comparison function can often simplify ordered loop
-traversal, and the the sky is really the limit when it comes to
-designing such a function.
-
-When string comparisons are made during a sort, either for element
-values where one or both aren't numbers, or for element indices
-handled as strings, the value of @code{IGNORECASE}
-(@pxref{Built-in Variables}) controls whether
-the comparisons treat corresponding uppercase and lowercase letters as
-equivalent or distinct.
-
-All sorting based on @code{PROCINFO["sorted_in"]}
-is disabled in POSIX mode,
-since the @code{PROCINFO} array is not special in that case.
-
-As a side note, sorting the array indices before traversing
-the array has been reported to add 15% to 20% overhead to the
-execution time of @command{awk} programs. For this reason,
-sorted array traversal is not the default.
-
-@c The @command{gawk}
-@c maintainers believe that only the people who wish to use a
-@c feature should have to pay for it.
+In addition, @command{gawk} provides built-in functions for
+sorting arrays; see @ref{Array Sorting Functions}.
@node Delete
@section The @code{delete} Statement
@@ -14107,124 +13872,6 @@ The result is to set @code{separate[1]} to @code{"1"} and
@code{separate[2]} to @code{"foo"}. Presto! The original sequence of
separate indices is recovered.
-@node Array Sorting
-@section Sorting Array Values and Indices with @command{gawk}
-
-@cindex arrays, sorting
-@cindex @code{asort()} function (@command{gawk})
-@cindex @code{asort()} function (@command{gawk}), arrays@comma{} sorting
-@cindex sort function, arrays, sorting
-The order in which an array is scanned with a @samp{for (i in array)}
-loop is essentially arbitrary.
-In most @command{awk} implementations, sorting an array requires
-writing a @code{sort} function.
-While this can be educational for exploring different sorting algorithms,
-usually that's not the point of the program.
-@command{gawk} provides the built-in @code{asort()}
-and @code{asorti()} functions
-(@pxref{String Functions})
-for sorting arrays. For example:
-
-@example
-@var{populate the array} data
-n = asort(data)
-for (i = 1; i <= n; i++)
- @var{do something with} data[i]
-@end example
-
-After the call to @code{asort()}, the array @code{data} is indexed from 1
-to some number @var{n}, the total number of elements in @code{data}.
-(This count is @code{asort()}'s return value.)
-@code{data[1]} @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
-The array elements are compared as strings.
-
-@cindex side effects, @code{asort()} function
-An important side effect of calling @code{asort()} is that
-@emph{the array's original indices are irrevocably lost}.
-As this isn't always desirable, @code{asort()} accepts a
-second argument:
-
-@example
-@var{populate the array} source
-n = asort(source, dest)
-for (i = 1; i <= n; i++)
- @var{do something with} dest[i]
-@end example
-
-In this case, @command{gawk} copies the @code{source} array into the
-@code{dest} array and then sorts @code{dest}, destroying its indices.
-However, the @code{source} array is not affected.
-
-@code{asort()} and @code{asorti()} accept a third string argument
-to control the comparison rule for the array elements, and the direction
-of the sorted results. The valid comparison modes are @samp{string} and @samp{number},
-and the direction can be either @samp{ascending} or @samp{descending}.
-Either mode or direction, or both, can be omitted in which
-case the defaults, @samp{string} or @samp{ascending} is assumed
-for the comparison mode and the direction, respectively. Seperate comparison
-mode from direction with a single space, and they can appear in any
-order. To compare the elements as numbers, and to reverse the elements
-of the @code{dest} array, the call to asort in the above example can be
-replaced with:
-
-@example
-asort(source, dest, "descending number")
-@end example
-
-The third argument to @code{asort()} can also be a user-defined
-function name which is used to order the array elements before
-constructing the result array.
-@xref{Scanning an Array}, for more information.
-
-
-Often, what's needed is to sort on the values of the @emph{indices}
-instead of the values of the elements.
-To do that, use the
-@code{asorti()} function. The interface is identical to that of
-@code{asort()}, except that the index values are used for sorting, and
-become the values of the result array:
-
-@example
-@{ source[$0] = some_func($0) @}
-
-END @{
- n = asorti(source, dest)
- for (i = 1; i <= n; i++) @{
- @ii{Work with sorted indices directly:}
- @var{do something with} dest[i]
- @dots{}
- @ii{Access original array via sorted indices:}
- @var{do something with} source[dest[i]]
- @}
-@}
-@end example
-
-Sorting the array by replacing the indices provides maximal flexibility.
-To traverse the elements in decreasing order, use a loop that goes from
-@var{n} down to 1, either over the elements or over the indices. This
-is an alternative to specifying @samp{descending} for the sorting order
-using the optional third argument.
-
-@cindex reference counting, sorting arrays
-Copying array indices and elements isn't expensive in terms of memory.
-Internally, @command{gawk} maintains @dfn{reference counts} to data.
-For example, when @code{asort()} copies the first array to the second one,
-there is only one copy of the original array elements' data, even though
-both arrays use the values.
-
-@c Document It And Call It A Feature. Sigh.
-@cindex @command{gawk}, @code{IGNORECASE} variable in
-@cindex @code{IGNORECASE} variable
-@cindex arrays, sorting, @code{IGNORECASE} variable and
-@cindex @code{IGNORECASE} variable, array sorting and
-Because @code{IGNORECASE} affects string comparisons, the value
-of @code{IGNORECASE} also affects sorting for both @code{asort()} and @code{asorti()}.
-Note also that the locale's sorting order does @emph{not}
-come into play; comparisons are based on character values only.@footnote{This
-is true because locale-based comparison occurs only when in POSIX
-compatibility mode, and since @code{asort()} and @code{asorti()} are
-@command{gawk} extensions, they are not available in that case.}
-Caveat Emptor.
@node Arrays of Arrays
@section Arrays of Arrays
@@ -14667,7 +14314,7 @@ order specification. The value of @code{IGNORECASE} affects the sorting.
The third argument can also be a user-defined function name in which case
the value returned by the function is used to order the array elements
before constructing the result array.
-@xref{Scanning an Array}, for more information.
+@xref{Array Sorting Functions}, for more information.
For example, if the contents of @code{a} are as follows:
@@ -14701,7 +14348,7 @@ asort(a, a, "descending")
@end example
The @code{asort()} function is described in more detail in
-@ref{Array Sorting}.
+@ref{Array Sorting Functions}.
@code{asort()} is a @command{gawk} extension; it is not available
in compatibility mode (@pxref{Options}).
@@ -14713,7 +14360,7 @@ are sorted, instead of the values. (Here too,
@code{IGNORECASE} affects the sorting.)
The @code{asorti()} function is described in more detail in
-@ref{Array Sorting}.
+@ref{Array Sorting Functions}.
@code{asorti()} is a @command{gawk} extension; it is not available
in compatibility mode (@pxref{Options}).
@@ -18474,7 +18121,9 @@ It's a bit of a ``grab bag'' of items that are otherwise unrelated
to each other.
First, a command-line option allows @command{gawk} to recognize
nondecimal numbers in input data, not just in @command{awk}
-programs. Next, two-way I/O, discussed briefly in earlier parts of this
+programs.
+Then, @command{gawk}'s special features for sorting arrays are presented.
+Next, two-way I/O, discussed briefly in earlier parts of this
@value{DOCUMENT}, is described in full detail, along with the basics
of TCP/IP networking. Finally, @command{gawk}
can @dfn{profile} an @command{awk} program, making it possible to tune
@@ -18487,6 +18136,8 @@ its description is relegated to an appendix.
@menu
* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array traversal and
+ sorting arrays.
* Two-way I/O:: Two-way communications with another process.
* TCP/IP Networking:: Using @command{gawk} for network programming.
* Profiling:: Profiling your @command{awk} programs.
@@ -18549,6 +18200,473 @@ This makes your programs easier to write and easier to read, and
leads to less surprising results.
@end quotation
+@node Array Sorting
+@section Controlling Array Traversal and Array Sorting
+
+@command{gawk} lets you control the order in which @samp{for (i in array)} loops
+will traverse an array.
+
+In addition, two built-in functions, @code{asort()} and @code{asorti()},
+let you sort arrays based on the array values and indices, respectively.
+These two functions also provide control over the sorting criteria used
+to order the elements during sorting.
+
+@menu
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use @code{asort()} and @code{asorti()}.
+@end menu
+
+@node Controlling Array Traversal
+@subsection Controlling Array Traversal
+
+By default, the order in which a @samp{for (i in array)} loop
+will scan an array is not defined; it is generally based upon
+the internal implementation of arrays inside @command{awk}.
+
+Often, though, it is desirable to be able to loop over the elements
+in a particular order that you, the programmer, choose. @command{gawk}
+lets you do this; this @value{SUBSECTION} describes how.
+
+@menu
+* Controlling Scanning With A Function:: Using a function to control scanning.
+* Controlling Scanning:: Controlling the order in which arrays
+ are scanned.
+@end menu
+
+@node Controlling Scanning With A Function
+@subsubsection Controlling Array Scanning Order With a User-defined Function
+
+The value of @code{PROCINFO["sorted_in"]} can be a function name.
+This lets you traverse an array based on any custom criterion.
+The array elements are ordered according to the return value of this
+function. This comparison function should be defined with at least
+four arguments:
+
+@example
+function comp_func(i1, v1, i2, v2)
+@{
+ @var{compare elements 1 and 2 in some fashion}
+ @var{return < 0; 0; or > 0}
+@}
+@end example
+
+Here, @var{i1} and @var{i2} are the indices, and @var{v1} and @var{v2}
+are the corresponding values of the two elements being compared.
+Either @var{v1} or @var{v2}, or both, can be arrays if the array being
+traversed contains subarrays as values. The three possible return values
+are interpreted this way:
+
+@itemize @bullet
+@item
+If the return value of @code{comp_func(i1, v1, i2, v2)} is less than zero,
+index @var{i1} comes before index @var{i2} during loop traversal.
+
+@item
+If @code{comp_func(i1, v1, i2, v2)} returns zero, @var{i1} and @var{i2}
+come together but the relative order with respect to each other is undefined.
+
+@item
+If the return value of @code{comp_func(i1, v1, i2, v2)} is greater than zero,
+@var{i1} comes after @var{i2}.
+@end itemize
+
+The following comparison function can be used to scan an array in
+numerical order of the indices:
+
+@example
+function cmp_num_idx(i1, v1, i2, v2)
+@{
+ # numerical index comparison, ascending order
+ return (i1 - i2)
+@}
+@end example
+
+This function traverses an array based on the string order of the element values
+rather than by indices:
+
+@example
+function cmp_str_val(i1, v1, i2, v2)
+@{
+ # string value comparison, ascending order
+ v1 = v1 ""
+ v2 = v2 ""
+ if (v1 < v2)
+ return -1
+ return (v1 != v2)
+@}
+@end example
+
+Here is a
+comparison function to make all numbers, and numeric strings without
+any leading or trailing spaces, come out first during loop traversal:
+
+@example
+function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
+@{
+ # numbers before string value comparison, ascending order
+ n1 = v1 + 0
+ n2 = v2 + 0
+ if (n1 == v1)
+ return (n2 == v2) ? (n1 - n2) : -1
+ else if (n2 == v2)
+ return 1
+ return (v1 < v2) ? -1 : (v1 != v2)
+@}
+@end example
+
+@strong{FIXME}: Put in a fuller example here of some data
+and show the different results when traversing.
+
+Consider sorting the entries of a GNU/Linux system password file
+according to login names. The following program, which sorts records
+by a specific field position, can be used for this purpose:
+
+@example
+# sort.awk --- simple program to sort by field position
+# field position is specified by the global variable POS
+
+function cmp_field(i1, v1, i2, v2)
+@{
+ # comparison by value, as string, and ascending order
+ return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
+@}
+
+@{
+ for (i = 1; i <= NF; i++)
+ a[NR][i] = $i
+@}
+
+END @{
+ PROCINFO["sorted_in"] = "cmp_field"
+ if (POS < 1 || POS > NF)
+ POS = 1
+ for (i in a) @{
+ for (j = 1; j <= NF; j++)
+ printf("%s%c", a[i][j], j < NF ? ":" : "")
+ print ""
+ @}
+@}
+@end example
+
+The first field in each entry of the password file is the user's login name,
+and the fields are separated by colons.
+Each record defines a subarray, with each field as an element in the subarray.
+Running the program produces the
+following output:
+
+@example
+$ @kbd{gawk -vPOS=1 -F: -f sort.awk /etc/passwd}
+@print{} adm:x:3:4:adm:/var/adm:/sbin/nologin
+@print{} apache:x:48:48:Apache:/var/www:/sbin/nologin
+@print{} avahi:x:70:70:Avahi daemon:/:/sbin/nologin
+@dots{}
+@end example
+
+The comparison normally should always return the same value when given a
+specific pair of array elements as its arguments. If inconsistent
+results are returned then the order is undefined. This behavior is
+sometimes exploited to introduce random order in otherwise seemingly
+ordered data:
+
+@example
+function cmp_randomize(i1, v1, i2, v2)
+@{
+ # random order
+ return (2 - 4 * rand())
+@}
+@end example
+
+As mentioned above, the order of the indices is arbitrary if two
+elements compare equal. This is usually not a problem, but letting
+the tied elements come out in arbitrary order can be an issue, especially
+when comparing item values. The partial ordering of the equal elements
+may change during the next loop traversal, if other elements are added or
+removed from the array. One way to resolve ties when comparing elements
+with otherwise equal values is to include the indices in the comparison
+rules. Note that doing this may make the loop traversal less efficient,
+so consider it only if necessary. The following comparison functions
+force a deterministic order, and are based on the fact that the
+indices of two elements are never equal:
+
+@example
+function cmp_numeric(i1, v1, i2, v2)
+@{
+ # numerical value (and index) comparison, descending order
+ return (v1 != v2) ? (v2 - v1) : (i2 - i1)
+@}
+
+function cmp_string(i1, v1, i2, v2)
+@{
+ # string value (and index) comparison, descending order
+ v1 = v1 i1
+ v2 = v2 i2
+ return (v1 > v2) ? -1 : (v1 != v2)
+@}
+@end example
+
+@c Avoid using the term ``stable'' when describing the unpredictable behavior
+@c if two items compare equal. Usually, the goal of a "stable algorithm"
+@c is to maintain the original order of the items, which is a meaningless
+@c concept for a list constructed from a hash.
+
+A custom comparison function can often simplify ordered loop
+traversal, and the sky is really the limit when it comes to
+designing such a function.
+
+When string comparisons are made during a sort, either for element
+values where one or both aren't numbers, or for element indices
+handled as strings, the value of @code{IGNORECASE}
+(@pxref{Built-in Variables}) controls whether
+the comparisons treat corresponding uppercase and lowercase letters as
+equivalent or distinct.
+
+Another point to keep in mind is that in the case of subarrays
+the element values can themselves be arrays; a production comparison
+function should use the @code{isarray()} function
+(@pxref{Type Functions})
+to check for this and choose a defined sorting order for subarrays.
+
+All sorting based on @code{PROCINFO["sorted_in"]}
+is disabled in POSIX mode,
+since the @code{PROCINFO} array is not special in that case.
+
+As a side note, sorting the array indices before traversing
+the array has been reported to add 15% to 20% overhead to the
+execution time of @command{awk} programs. For this reason,
+sorted array traversal is not the default.
+
+@c The @command{gawk}
+@c maintainers believe that only the people who wish to use a
+@c feature should have to pay for it.
+
+@node Controlling Scanning
+@subsubsection Controlling Array Scanning Order
+
+As described in
+@iftex
+the previous subsubsection,
+@end iftex
+@ifnottex
+@ref{Controlling Scanning With A Function},
+@end ifnottex
+you can provide the name of a function as the value of
+@code{PROCINFO["sorted_in"]} to specify custom sorting criteria.
+
+Often, though, you may wish to do something simple, such as
+``sort based on comparing the indices in ascending order,''
+or ``sort based on comparing the values in descending order.''
+Having to write a simple comparison function for this purpose
+for use in all of your programs becomes tedious.
+For the most likely simple cases @command{gawk} provides
+the option of supplying special names that do the requested
+sorting for you.
+You can think of them as ``predefined'' sorting functions,
+if you like, although the names purposely include characters
+that are not valid in real @command{awk} function names.
+
+The following special values are available:
+
+@table @code
+@item "@@ind_str_asc"
+Order by indices compared as strings; this is the most basic sort.
+(Internally, array indices are always strings, so with @samp{a[2*5] = 1}
+the index is actually @code{"10"} rather than numeric 10.)
+
+@item "@@ind_num_asc"
+Order by indices but force them to be treated as numbers in the process.
+Any index with a non-numeric value will end up positioned as if it were zero.
+
+@item "@@val_type_asc"
+Order by element values rather than indices.
+Ordering is by the type assigned to the element
+(@pxref{Typing and Comparison}).
+All numeric values come before all string values,
+which in turn come before all subarrays.
+
+@item "@@val_str_asc"
+Order by element values rather than by indices. Scalar values are
+compared as strings. Subarrays, if present, come out last.
+
+@item "@@val_num_asc"
+Order by values but force scalar values to be treated as numbers
+for the purpose of comparison. If there are subarrays, those appear
+at the end of the sorted list.
+
+@item "@@ind_str_desc"
+Reverse order from the most basic sort.
+
+@item "@@ind_num_desc"
+Numeric indices ordered from high to low.
+
+@item "@@val_type_desc"
+Element values, based on type, in descending order.
+
+@item "@@val_str_desc"
+Element values, treated as strings, ordered from high to low. Subarrays, if present,
+come out first.
+
+@item "@@val_num_desc"
+Element values, treated as numbers, ordered from high to low. Subarrays, if present,
+come out first.
+
+@item "@@unsorted"
+Array elements are processed in arbitrary order, which is the normal @command{awk}
+behavior. You can also get the normal behavior by just
+deleting the @code{"sorted_in"} element from the @code{PROCINFO} array, if
+it previously had a value assigned to it.
+@end table
+
+The array traversal order is determined before the @code{for} loop
+starts to run. Changing @code{PROCINFO["sorted_in"]} in the loop body
+will not affect the loop.
+
+For example:
+
+@example
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{ a[4] = 4}
+> @kbd{ a[3] = 3}
+> @kbd{ for (i in a)}
+> @kbd{ print i, a[i]}
+> @kbd{@}'}
+@print{} 4 4
+@print{} 3 3
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{ PROCINFO["sorted_in"] = "@@ind_str_asc"}
+> @kbd{ a[4] = 4}
+> @kbd{ a[3] = 3}
+> @kbd{ for (i in a)}
+> @kbd{ print i, a[i]}
+> @kbd{@}'}
+@print{} 3 3
+@print{} 4 4
+@end example
+
+When sorting an array by element values, if a value happens to be
+a subarray then it is considered to be greater than any string or
+numeric value, regardless of what the subarray itself contains,
+and all subarrays are treated as being equal to each other. Their
+order relative to each other is determined by their index strings.
+
+@node Array Sorting Functions
+@subsection Sorting Array Values and Indices with @command{gawk}
+
+@cindex arrays, sorting
+@cindex @code{asort()} function (@command{gawk})
+@cindex @code{asort()} function (@command{gawk}), arrays@comma{} sorting
+@cindex sort function, arrays, sorting
+The order in which an array is scanned with a @samp{for (i in array)}
+loop is essentially arbitrary.
+In most @command{awk} implementations, sorting an array requires
+writing a @code{sort} function.
+While this can be educational for exploring different sorting algorithms,
+usually that's not the point of the program.
+@command{gawk} provides the built-in @code{asort()}
+and @code{asorti()} functions
+(@pxref{String Functions})
+for sorting arrays. For example:
+
+@example
+@var{populate the array} data
+n = asort(data)
+for (i = 1; i <= n; i++)
+ @var{do something with} data[i]
+@end example
+
+After the call to @code{asort()}, the array @code{data} is indexed from 1
+to some number @var{n}, the total number of elements in @code{data}.
+(This count is @code{asort()}'s return value.)
+@code{data[1]} @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
+The array elements are compared as strings.
+
+@cindex side effects, @code{asort()} function
+An important side effect of calling @code{asort()} is that
+@emph{the array's original indices are irrevocably lost}.
+As this isn't always desirable, @code{asort()} accepts a
+second argument:
+
+@example
+@var{populate the array} source
+n = asort(source, dest)
+for (i = 1; i <= n; i++)
+ @var{do something with} dest[i]
+@end example
+
+In this case, @command{gawk} copies the @code{source} array into the
+@code{dest} array and then sorts @code{dest}, destroying its indices.
+However, the @code{source} array is not affected.
+
+@code{asort()} and @code{asorti()} accept a third string argument
+to control the comparison rule for the array elements, and the direction
+of the sorted results. The valid comparison modes are @samp{string} and @samp{number},
+and the direction can be either @samp{ascending} or @samp{descending}.
+Either mode or direction, or both, can be omitted, in which
+case the defaults, @samp{string} and @samp{ascending}, are assumed
+for the comparison mode and the direction, respectively. Separate the comparison
+mode from the direction with a single space; they can appear in either
+order. To compare the elements as numbers, and to reverse the elements
+of the @code{dest} array, the call to @code{asort()} in the above example can be
+replaced with:
+
+@example
+asort(source, dest, "descending number")
+@end example
+
+The third argument to @code{asort()} can also be the name of a
+user-defined function, which is used to order the array elements before
+constructing the result array.
+@xref{Scanning an Array}, for more information.
+
+
+Often, what's needed is to sort on the values of the @emph{indices}
+instead of the values of the elements.
+To do that, use the
+@code{asorti()} function. The interface is identical to that of
+@code{asort()}, except that the index values are used for sorting, and
+become the values of the result array:
+
+@example
+@{ source[$0] = some_func($0) @}
+
+END @{
+ n = asorti(source, dest)
+ for (i = 1; i <= n; i++) @{
+ @ii{Work with sorted indices directly:}
+ @var{do something with} dest[i]
+ @dots{}
+ @ii{Access original array via sorted indices:}
+ @var{do something with} source[dest[i]]
+ @}
+@}
+@end example
+
+Sorting the array by replacing the indices provides maximal flexibility.
+To traverse the elements in decreasing order, use a loop that goes from
+@var{n} down to 1, either over the elements or over the indices. This
+is an alternative to specifying @samp{descending} for the sorting order
+using the optional third argument.
+
+@cindex reference counting, sorting arrays
+Copying array indices and elements isn't expensive in terms of memory.
+Internally, @command{gawk} maintains @dfn{reference counts} to data.
+For example, when @code{asort()} copies the first array to the second one,
+there is only one copy of the original array elements' data, even though
+both arrays use the values.
+
+@c Document It And Call It A Feature. Sigh.
+@cindex @command{gawk}, @code{IGNORECASE} variable in
+@cindex @code{IGNORECASE} variable
+@cindex arrays, sorting, @code{IGNORECASE} variable and
+@cindex @code{IGNORECASE} variable, array sorting and
+Because @code{IGNORECASE} affects string comparisons, the value
+of @code{IGNORECASE} also affects sorting for both @code{asort()} and @code{asorti()}.
+Note also that the locale's sorting order does @emph{not}
+come into play; comparisons are based on character values only.@footnote{This
+is true because locale-based comparison occurs only when in POSIX
+compatibility mode, and since @code{asort()} and @code{asorti()} are
+@command{gawk} extensions, they are not available in that case.}
+Caveat Emptor.
+
@node Two-way I/O
@section Two-Way Communications with Another Process
@cindex Brennan, Michael
@@ -26252,8 +26370,8 @@ of the @value{DOCUMENT} where you can find more information.
* SVR4:: Minor changes between System V Releases 3.1
and 4.
* POSIX:: New features from the POSIX standard.
-* BTL:: New features from Brian Kernighan's
- version of @command{awk}.
+* BTL:: New features from Brian Kernighan's version of
+ @command{awk}.
* POSIX/GNU:: The extensions in @command{gawk} not in POSIX
@command{awk}.
* Common Extensions:: Common Extensions Summary.
@@ -26762,6 +26880,9 @@ SunOS 3.x, Sun 386 (Road Runner)
@item
Tandem (non-POSIX)
+@item
+Prestandard VAX C compiler for VAX/VMS
+
@end itemize
@end itemize
@@ -26887,6 +27008,7 @@ provided the initial port to OS/2 and its documentation.
@cindex Jaegermann, Michal
Michal Jaegermann
provided the port to Atari systems and its documentation.
+(This port is no longer supported.)
He continues to provide portability checking with DEC Alpha
systems, and has done a lot of work to make sure @command{gawk}
works on non-32-bit systems.