Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- doc/gawk.texi | 4521
1 files changed, 3946 insertions, 575 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 59695171..573768ea 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -321,419 +321,531 @@ particular records in a file and perform operations upon them.
* Index:: Concept and Variable Index.
@detailmenu
-* History:: The history of @command{gawk} and
- @command{awk}.
-* Names:: What name to use to find @command{awk}.
-* This Manual:: Using this @value{DOCUMENT}. Includes
- sample input files that you can use.
-* Conventions:: Typographical Conventions.
-* Manual History:: Brief history of the GNU project and this
- @value{DOCUMENT}.
-* How To Contribute:: Helping to save the world.
-* Acknowledgments:: Acknowledgments.
-* Running gawk:: How to run @command{gawk} programs;
- includes command-line syntax.
-* One-shot:: Running a short throwaway @command{awk}
- program.
-* Read Terminal:: Using no input files (input from terminal
- instead).
-* Long:: Putting permanent @command{awk} programs in
- files.
-* Executable Scripts:: Making self-contained @command{awk}
- programs.
-* Comments:: Adding documentation to @command{gawk}
- programs.
-* Quoting:: More discussion of shell quoting issues.
-* DOS Quoting:: Quoting in Windows Batch Files.
-* Sample Data Files:: Sample data files for use in the
- @command{awk} programs illustrated in this
- @value{DOCUMENT}.
-* Very Simple:: A very simple example.
-* Two Rules:: A less simple one-line example using two
- rules.
-* More Complex:: A more complex example.
-* Statements/Lines:: Subdividing or combining statements into
- lines.
-* Other Features:: Other Features of @command{awk}.
-* When:: When to use @command{gawk} and when to use
- other things.
-* Command Line:: How to run @command{awk}.
-* Options:: Command-line options and their meanings.
-* Other Arguments:: Input file names and variable assignments.
-* Naming Standard Input:: How to specify standard input with other
- files.
-* Environment Variables:: The environment variables @command{gawk}
- uses.
-* AWKPATH Variable:: Searching directories for @command{awk}
- programs.
-* AWKLIBPATH Variable:: Searching directories for @command{awk}
- shared libraries.
-* Other Environment Variables:: The environment variables.
-* Exit Status:: @command{gawk}'s exit status.
-* Include Files:: Including other files into your program.
-* Loading Shared Libraries:: Loading shared libraries into your program.
-* Obsolete:: Obsolete Options and/or features.
-* Undocumented:: Undocumented Options and Features.
-* Regexp Usage:: How to Use Regular Expressions.
-* Escape Sequences:: How to write nonprinting characters.
-* Regexp Operators:: Regular Expression Operators.
-* Bracket Expressions:: What can go between @samp{[...]}.
-* GNU Regexp Operators:: Operators specific to GNU software.
-* Case-sensitivity:: How to do case-insensitive matching.
-* Leftmost Longest:: How much text matches.
-* Computed Regexps:: Using Dynamic Regexps.
-* Records:: Controlling how data is split into records.
-* Fields:: An introduction to fields.
-* Nonconstant Fields:: Nonconstant Field Numbers.
-* Changing Fields:: Changing the Contents of a Field.
-* Field Separators:: The field separator and how to change it.
-* Default Field Splitting:: How fields are normally separated.
-* Regexp Field Splitting:: Using regexps as the field separator.
-* Single Character Fields:: Making each character a separate field.
-* Command Line Field Separator:: Setting @code{FS} from the command-line.
-* Field Splitting Summary:: Some final points and a summary table.
-* Constant Size:: Reading constant width data.
-* Splitting By Content:: Defining Fields By Content
-* Multiple Line:: Reading multi-line records.
-* Getline:: Reading files under explicit program
- control using the @code{getline} function.
-* Plain Getline:: Using @code{getline} with no arguments.
-* Getline/Variable:: Using @code{getline} into a variable.
-* Getline/File:: Using @code{getline} from a file.
-* Getline/Variable/File:: Using @code{getline} into a variable from a
- file.
-* Getline/Pipe:: Using @code{getline} from a pipe.
-* Getline/Variable/Pipe:: Using @code{getline} into a variable from a
- pipe.
-* Getline/Coprocess:: Using @code{getline} from a coprocess.
-* Getline/Variable/Coprocess:: Using @code{getline} into a variable from a
- coprocess.
-* Getline Notes:: Important things to know about
- @code{getline}.
-* Getline Summary:: Summary of @code{getline} Variants.
-* Read Timeout:: Reading input with a timeout.
-* Command line directories:: What happens if you put a directory on the
- command line.
-* Print:: The @code{print} statement.
-* Print Examples:: Simple examples of @code{print} statements.
-* Output Separators:: The output separators and how to change
- them.
-* OFMT:: Controlling Numeric Output With
- @code{print}.
-* Printf:: The @code{printf} statement.
-* Basic Printf:: Syntax of the @code{printf} statement.
-* Control Letters:: Format-control letters.
-* Format Modifiers:: Format-specification modifiers.
-* Printf Examples:: Several examples.
-* Redirection:: How to redirect output to multiple files
- and pipes.
-* Special Files:: File name interpretation in @command{gawk}.
- @command{gawk} allows access to inherited
- file descriptors.
-* Special FD:: Special files for I/O.
-* Special Network:: Special files for network communications.
-* Special Caveats:: Things to watch out for.
-* Close Files And Pipes:: Closing Input and Output Files and Pipes.
-* Values:: Constants, Variables, and Regular
- Expressions.
-* Constants:: String, numeric and regexp constants.
-* Scalar Constants:: Numeric and string constants.
-* Nondecimal-numbers:: What are octal and hex numbers.
-* Regexp Constants:: Regular Expression constants.
-* Using Constant Regexps:: When and how to use a regexp constant.
-* Variables:: Variables give names to values for later
- use.
-* Using Variables:: Using variables in your programs.
-* Assignment Options:: Setting variables on the command-line and a
- summary of command-line syntax. This is an
- advanced method of input.
-* Conversion:: The conversion of strings to numbers and
- vice versa.
-* All Operators:: @command{gawk}'s operators.
-* Arithmetic Ops:: Arithmetic operations (@samp{+}, @samp{-},
- etc.)
-* Concatenation:: Concatenating strings.
-* Assignment Ops:: Changing the value of a variable or a
- field.
-* Increment Ops:: Incrementing the numeric value of a
- variable.
-* Truth Values and Conditions:: Testing for true and false.
-* Truth Values:: What is ``true'' and what is ``false''.
-* Typing and Comparison:: How variables acquire types and how this
- affects comparison of numbers and strings
- with @samp{<}, etc.
-* Variable Typing:: String type versus numeric type.
-* Comparison Operators:: The comparison operators.
-* POSIX String Comparison:: String comparison with POSIX rules.
-* Boolean Ops:: Combining comparison expressions using
- boolean operators @samp{||} (``or''),
- @samp{&&} (``and'') and @samp{!} (``not'').
-* Conditional Exp:: Conditional expressions select between two
- subexpressions under control of a third
- subexpression.
-* Function Calls:: A function call is an expression.
-* Precedence:: How various operators nest.
-* Locales:: How the locale affects things.
-* Pattern Overview:: What goes into a pattern.
-* Regexp Patterns:: Using regexps as patterns.
-* Expression Patterns:: Any expression can be used as a pattern.
-* Ranges:: Pairs of patterns specify record ranges.
-* BEGIN/END:: Specifying initialization and cleanup
- rules.
-* Using BEGIN/END:: How and why to use BEGIN/END rules.
-* I/O And BEGIN/END:: I/O issues in BEGIN/END rules.
-* BEGINFILE/ENDFILE:: Two special patterns for advanced control.
-* Empty:: The empty pattern, which matches every
- record.
-* Using Shell Variables:: How to use shell variables with
- @command{awk}.
-* Action Overview:: What goes into an action.
-* Statements:: Describes the various control statements in
- detail.
-* If Statement:: Conditionally execute some @command{awk}
- statements.
-* While Statement:: Loop until some condition is satisfied.
-* Do Statement:: Do specified action while looping until
- some condition is satisfied.
-* For Statement:: Another looping statement, that provides
- initialization and increment clauses.
-* Switch Statement:: Switch/case evaluation for conditional
- execution of statements based on a value.
-* Break Statement:: Immediately exit the innermost enclosing
- loop.
-* Continue Statement:: Skip to the end of the innermost enclosing
- loop.
-* Next Statement:: Stop processing the current input record.
-* Nextfile Statement:: Stop processing the current file.
-* Exit Statement:: Stop execution of @command{awk}.
-* Built-in Variables:: Summarizes the built-in variables.
-* User-modified:: Built-in variables that you change to
- control @command{awk}.
-* Auto-set:: Built-in variables where @command{awk}
- gives you information.
-* ARGC and ARGV:: Ways to use @code{ARGC} and @code{ARGV}.
-* Array Basics:: The basics of arrays.
-* Array Intro:: Introduction to Arrays
-* Reference to Elements:: How to examine one element of an array.
-* Assigning Elements:: How to change an element of an array.
-* Array Example:: Basic Example of an Array
-* Scanning an Array:: A variation of the @code{for} statement. It
- loops through the indices of an array's
- existing elements.
-* Controlling Scanning:: Controlling the order in which arrays are
- scanned.
-* Delete:: The @code{delete} statement removes an
- element from an array.
-* Numeric Array Subscripts:: How to use numbers as subscripts in
- @command{awk}.
-* Uninitialized Subscripts:: Using Uninitialized variables as
- subscripts.
-* Multi-dimensional:: Emulating multidimensional arrays in
- @command{awk}.
-* Multi-scanning:: Scanning multidimensional arrays.
-* Arrays of Arrays:: True multidimensional arrays.
-* Built-in:: Summarizes the built-in functions.
-* Calling Built-in:: How to call built-in functions.
-* Numeric Functions:: Functions that work with numbers, including
- @code{int()}, @code{sin()} and
- @code{rand()}.
-* String Functions:: Functions for string manipulation, such as
- @code{split()}, @code{match()} and
- @code{sprintf()}.
-* Gory Details:: More than you want to know about @samp{\}
- and @samp{&} with @code{sub()},
- @code{gsub()}, and @code{gensub()}.
-* I/O Functions:: Functions for files and shell commands.
-* Time Functions:: Functions for dealing with timestamps.
-* Bitwise Functions:: Functions for bitwise operations.
-* Type Functions:: Functions for type information.
-* I18N Functions:: Functions for string translation.
-* User-defined:: Describes User-defined functions in detail.
-* Definition Syntax:: How to write definitions and what they
- mean.
-* Function Example:: An example function definition and what it
- does.
-* Function Caveats:: Things to watch out for.
-* Calling A Function:: Don't use spaces.
-* Variable Scope:: Controlling variable scope.
-* Pass By Value/Reference:: Passing parameters.
-* Return Statement:: Specifying the value a function returns.
-* Dynamic Typing:: How variable types can change at runtime.
-* Indirect Calls:: Choosing the function to call at runtime.
-* I18N and L10N:: Internationalization and Localization.
-* Explaining gettext:: How GNU @code{gettext} works.
-* Programmer i18n:: Features for the programmer.
-* Translator i18n:: Features for the translator.
-* String Extraction:: Extracting marked strings.
-* Printf Ordering:: Rearranging @code{printf} arguments.
-* I18N Portability:: @command{awk}-level portability issues.
-* I18N Example:: A simple i18n example.
-* Gawk I18N:: @command{gawk} is also internationalized.
-* Nondecimal Data:: Allowing nondecimal input data.
-* Array Sorting:: Facilities for controlling array traversal
- and sorting arrays.
-* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
-* Array Sorting Functions:: How to use @code{asort()} and
- @code{asorti()}.
-* Two-way I/O:: Two-way communications with another
- process.
-* TCP/IP Networking:: Using @command{gawk} for network
- programming.
-* Profiling:: Profiling your @command{awk} programs.
-* Library Names:: How to best name private global variables
- in library functions.
-* General Functions:: Functions that are of general use.
-* Strtonum Function:: A replacement for the built-in
- @code{strtonum()} function.
-* Assert Function:: A function for assertions in @command{awk}
- programs.
-* Round Function:: A function for rounding if @code{sprintf()}
- does not do it correctly.
-* Cliff Random Function:: The Cliff Random Number Generator.
-* Ordinal Functions:: Functions for using characters as numbers
- and vice versa.
-* Join Function:: A function to join an array into a string.
-* Getlocaltime Function:: A function to get formatted times.
-* Data File Management:: Functions for managing command-line data
- files.
-* Filetrans Function:: A function for handling data file
- transitions.
-* Rewind Function:: A function for rereading the current file.
-* File Checking:: Checking that data files are readable.
-* Empty Files:: Checking for zero-length files.
-* Ignoring Assigns:: Treating assignments as file names.
-* Getopt Function:: A function for processing command-line
- arguments.
-* Passwd Functions:: Functions for getting user information.
-* Group Functions:: Functions for getting group information.
-* Walking Arrays:: A function to walk arrays of arrays.
-* Running Examples:: How to run these examples.
-* Clones:: Clones of common utilities.
-* Cut Program:: The @command{cut} utility.
-* Egrep Program:: The @command{egrep} utility.
-* Id Program:: The @command{id} utility.
-* Split Program:: The @command{split} utility.
-* Tee Program:: The @command{tee} utility.
-* Uniq Program:: The @command{uniq} utility.
-* Wc Program:: The @command{wc} utility.
-* Miscellaneous Programs:: Some interesting @command{awk} programs.
-* Dupword Program:: Finding duplicated words in a document.
-* Alarm Program:: An alarm clock.
-* Translate Program:: A program similar to the @command{tr}
- utility.
-* Labels Program:: Printing mailing labels.
-* Word Sorting:: A program to produce a word usage count.
-* History Sorting:: Eliminating duplicate entries from a
- history file.
-* Extract Program:: Pulling out programs from Texinfo source
- files.
-* Simple Sed:: A Simple Stream Editor.
-* Igawk Program:: A wrapper for @command{awk} that includes
- files.
-* Anagram Program:: Finding anagrams from a dictionary.
-* Signature Program:: People do amazing things with too much time
- on their hands.
-* Debugging:: Introduction to @command{gawk} debugger.
-* Debugging Concepts:: Debugging in General.
-* Debugging Terms:: Additional Debugging Concepts.
-* Awk Debugging:: Awk Debugging.
-* Sample Debugging Session:: Sample debugging session.
-* Debugger Invocation:: How to Start the Debugger.
-* Finding The Bug:: Finding the Bug.
-* List of Debugger Commands:: Main debugger commands.
-* Breakpoint Control:: Control of Breakpoints.
-* Debugger Execution Control:: Control of Execution.
-* Viewing And Changing Data:: Viewing and Changing Data.
-* Execution Stack:: Dealing with the Stack.
-* Debugger Info:: Obtaining Information about the Program and
- the Debugger State.
-* Miscellaneous Debugger Commands:: Miscellaneous Commands.
-* Readline Support:: Readline support.
-* Limitations:: Limitations and future plans.
-* General Arithmetic:: An introduction to computer arithmetic.
-* Floating Point Issues:: Stuff to know about floating-point numbers.
-* String Conversion Precision:: The String Value Can Lie.
-* Unexpected Results:: Floating Point Numbers Are Not Abstract
- Numbers.
-* POSIX Floating Point Problems:: Standards Versus Existing Practice.
-* Integer Programming:: Effective integer programming.
-* Floating-point Programming:: Effective Floating-point Programming.
-* Floating-point Representation:: Binary floating-point representation.
-* Floating-point Context:: Floating-point context.
-* Rounding Mode:: Floating-point rounding mode.
-* Gawk and MPFR:: How @command{gawk} provides
- arbitrary-precision arithmetic.
-* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
- Arithmetic with @command{gawk}.
-* Setting Precision:: Setting the working precision.
-* Setting Rounding Mode:: Setting the rounding mode.
-* Floating-point Constants:: Representing floating-point constants.
-* Changing Precision:: Changing the precision of a number.
-* Exact Arithmetic:: Exact arithmetic with floating-point
- numbers.
-* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
- @command{gawk}.
-* Plugin License:: A note about licensing.
-* Sample Library:: A example of new functions.
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
-* V7/SVR3.1:: The major changes between V7 and System V
- Release 3.1.
-* SVR4:: Minor changes between System V Releases 3.1
- and 4.
-* POSIX:: New features from the POSIX standard.
-* BTL:: New features from Brian Kernighan's version
- of @command{awk}.
-* POSIX/GNU:: The extensions in @command{gawk} not in
- POSIX @command{awk}.
-* Common Extensions:: Common Extensions Summary.
-* Ranges and Locales:: How locales used to affect regexp ranges.
-* Contributors:: The major contributors to @command{gawk}.
-* Gawk Distribution:: What is in the @command{gawk} distribution.
-* Getting:: How to get the distribution.
-* Extracting:: How to extract the distribution.
-* Distribution contents:: What is in the distribution.
-* Unix Installation:: Installing @command{gawk} under various
- versions of Unix.
-* Quick Installation:: Compiling @command{gawk} under Unix.
-* Additional Configuration Options:: Other compile-time options.
-* Configuration Philosophy:: How it's all supposed to work.
-* Non-Unix Installation:: Installation on Other Operating Systems.
-* PC Installation:: Installing and Compiling @command{gawk} on
- MS-DOS and OS/2.
-* PC Binary Installation:: Installing a prepared distribution.
-* PC Compiling:: Compiling @command{gawk} for MS-DOS,
- Windows32, and OS/2.
-* PC Testing:: Testing @command{gawk} on PC systems.
-* PC Using:: Running @command{gawk} on MS-DOS, Windows32
- and OS/2.
-* Cygwin:: Building and running @command{gawk} for
- Cygwin.
-* MSYS:: Using @command{gawk} In The MSYS
- Environment.
-* VMS Installation:: Installing @command{gawk} on VMS.
-* VMS Compilation:: How to compile @command{gawk} under VMS.
-* VMS Installation Details:: How to install @command{gawk} under VMS.
-* VMS Running:: How to run @command{gawk} under VMS.
-* VMS Old Gawk:: An old version comes with some VMS systems.
-* Bugs:: Reporting Problems and Bugs.
-* Other Versions:: Other freely available @command{awk}
- implementations.
-* Compatibility Mode:: How to disable certain @command{gawk}
- extensions.
-* Additions:: Making Additions To @command{gawk}.
-* Accessing The Source:: Accessing the Git repository.
-* Adding Code:: Adding code to the main body of
- @command{gawk}.
-* New Ports:: Porting @command{gawk} to a new operating
- system.
-* Derived Files:: Why derived files are kept in the
- @command{git} repository.
-* Future Extensions:: New features that may be implemented one
- day.
-* Basic High Level:: The high level view.
-* Basic Data Typing:: A very quick intro to data types.
+* History:: The history of @command{gawk} and
+ @command{awk}.
+* Names:: What name to use to find
+ @command{awk}.
+* This Manual:: Using this @value{DOCUMENT}. Includes
+ sample input files that you can use.
+* Conventions:: Typographical Conventions.
+* Manual History:: Brief history of the GNU project and
+ this @value{DOCUMENT}.
+* How To Contribute:: Helping to save the world.
+* Acknowledgments:: Acknowledgments.
+* Running gawk:: How to run @command{gawk} programs;
+ includes command-line syntax.
+* One-shot:: Running a short throwaway
+ @command{awk} program.
+* Read Terminal:: Using no input files (input from
+ terminal instead).
+* Long:: Putting permanent @command{awk}
+ programs in files.
+* Executable Scripts:: Making self-contained @command{awk}
+ programs.
+* Comments:: Adding documentation to @command{gawk}
+ programs.
+* Quoting:: More discussion of shell quoting
+ issues.
+* DOS Quoting:: Quoting in Windows Batch Files.
+* Sample Data Files:: Sample data files for use in the
+ @command{awk} programs illustrated in
+ this @value{DOCUMENT}.
+* Very Simple:: A very simple example.
+* Two Rules:: A less simple one-line example using
+ two rules.
+* More Complex:: A more complex example.
+* Statements/Lines:: Subdividing or combining statements
+ into lines.
+* Other Features:: Other Features of @command{awk}.
+* When:: When to use @command{gawk} and when to
+ use other things.
+* Command Line:: How to run @command{awk}.
+* Options:: Command-line options and their
+ meanings.
+* Other Arguments:: Input file names and variable
+ assignments.
+* Naming Standard Input:: How to specify standard input with
+ other files.
+* Environment Variables:: The environment variables
+ @command{gawk} uses.
+* AWKPATH Variable:: Searching directories for
+ @command{awk} programs.
+* AWKLIBPATH Variable:: Searching directories for
+ @command{awk} shared libraries.
+* Other Environment Variables:: The environment variables.
+* Exit Status:: @command{gawk}'s exit status.
+* Include Files:: Including other files into your
+ program.
+* Loading Shared Libraries:: Loading shared libraries into your
+ program.
+* Obsolete:: Obsolete Options and/or features.
+* Undocumented:: Undocumented Options and Features.
+* Regexp Usage:: How to Use Regular Expressions.
+* Escape Sequences:: How to write nonprinting characters.
+* Regexp Operators:: Regular Expression Operators.
+* Bracket Expressions:: What can go between @samp{[...]}.
+* GNU Regexp Operators:: Operators specific to GNU software.
+* Case-sensitivity:: How to do case-insensitive matching.
+* Leftmost Longest:: How much text matches.
+* Computed Regexps:: Using Dynamic Regexps.
+* Records:: Controlling how data is split into
+ records.
+* Fields:: An introduction to fields.
+* Nonconstant Fields:: Nonconstant Field Numbers.
+* Changing Fields:: Changing the Contents of a Field.
+* Field Separators:: The field separator and how to change
+ it.
+* Default Field Splitting:: How fields are normally separated.
+* Regexp Field Splitting:: Using regexps as the field separator.
+* Single Character Fields:: Making each character a separate
+ field.
+* Command Line Field Separator:: Setting @code{FS} from the
+ command-line.
+* Field Splitting Summary:: Some final points and a summary table.
+* Constant Size:: Reading constant width data.
+* Splitting By Content:: Defining Fields By Content
+* Multiple Line:: Reading multi-line records.
+* Getline:: Reading files under explicit program
+ control using the @code{getline}
+ function.
+* Plain Getline:: Using @code{getline} with no
+ arguments.
+* Getline/Variable:: Using @code{getline} into a variable.
+* Getline/File:: Using @code{getline} from a file.
+* Getline/Variable/File:: Using @code{getline} into a variable
+ from a file.
+* Getline/Pipe:: Using @code{getline} from a pipe.
+* Getline/Variable/Pipe:: Using @code{getline} into a variable
+ from a pipe.
+* Getline/Coprocess:: Using @code{getline} from a coprocess.
+* Getline/Variable/Coprocess:: Using @code{getline} into a variable
+ from a coprocess.
+* Getline Notes:: Important things to know about
+ @code{getline}.
+* Getline Summary:: Summary of @code{getline} Variants.
+* Read Timeout:: Reading input with a timeout.
+* Command line directories:: What happens if you put a directory on
+ the command line.
+* Print:: The @code{print} statement.
+* Print Examples:: Simple examples of @code{print}
+ statements.
+* Output Separators:: The output separators and how to
+ change them.
+* OFMT:: Controlling Numeric Output With
+ @code{print}.
+* Printf:: The @code{printf} statement.
+* Basic Printf:: Syntax of the @code{printf} statement.
+* Control Letters:: Format-control letters.
+* Format Modifiers:: Format-specification modifiers.
+* Printf Examples:: Several examples.
+* Redirection:: How to redirect output to multiple
+ files and pipes.
+* Special Files:: File name interpretation in
+ @command{gawk}. @command{gawk} allows
+ access to inherited file descriptors.
+* Special FD:: Special files for I/O.
+* Special Network:: Special files for network
+ communications.
+* Special Caveats:: Things to watch out for.
+* Close Files And Pipes:: Closing Input and Output Files and
+ Pipes.
+* Values:: Constants, Variables, and Regular
+ Expressions.
+* Constants:: String, numeric and regexp constants.
+* Scalar Constants:: Numeric and string constants.
+* Nondecimal-numbers:: What are octal and hex numbers.
+* Regexp Constants:: Regular Expression constants.
+* Using Constant Regexps:: When and how to use a regexp constant.
+* Variables:: Variables give names to values for
+ later use.
+* Using Variables:: Using variables in your programs.
+* Assignment Options:: Setting variables on the command-line
+ and a summary of command-line syntax.
+ This is an advanced method of input.
+* Conversion:: The conversion of strings to numbers
+ and vice versa.
+* All Operators:: @command{gawk}'s operators.
+* Arithmetic Ops:: Arithmetic operations (@samp{+},
+ @samp{-}, etc.)
+* Concatenation:: Concatenating strings.
+* Assignment Ops:: Changing the value of a variable or a
+ field.
+* Increment Ops:: Incrementing the numeric value of a
+ variable.
+* Truth Values and Conditions:: Testing for true and false.
+* Truth Values:: What is ``true'' and what is
+ ``false''.
+* Typing and Comparison:: How variables acquire types and how
+ this affects comparison of numbers and
+ strings with @samp{<}, etc.
+* Variable Typing:: String type versus numeric type.
+* Comparison Operators:: The comparison operators.
+* POSIX String Comparison:: String comparison with POSIX rules.
+* Boolean Ops:: Combining comparison expressions using
+ boolean operators @samp{||} (``or''),
+ @samp{&&} (``and'') and @samp{!}
+ (``not'').
+* Conditional Exp:: Conditional expressions select between
+ two subexpressions under control of a
+ third subexpression.
+* Function Calls:: A function call is an expression.
+* Precedence:: How various operators nest.
+* Locales:: How the locale affects things.
+* Pattern Overview:: What goes into a pattern.
+* Regexp Patterns:: Using regexps as patterns.
+* Expression Patterns:: Any expression can be used as a
+ pattern.
+* Ranges:: Pairs of patterns specify record
+ ranges.
+* BEGIN/END:: Specifying initialization and cleanup
+ rules.
+* Using BEGIN/END:: How and why to use BEGIN/END rules.
+* I/O And BEGIN/END:: I/O issues in BEGIN/END rules.
+* BEGINFILE/ENDFILE:: Two special patterns for advanced
+ control.
+* Empty:: The empty pattern, which matches every
+ record.
+* Using Shell Variables:: How to use shell variables with
+ @command{awk}.
+* Action Overview:: What goes into an action.
+* Statements:: Describes the various control
+ statements in detail.
+* If Statement:: Conditionally execute some
+ @command{awk} statements.
+* While Statement:: Loop until some condition is
+ satisfied.
+* Do Statement:: Do specified action while looping
+ until some condition is satisfied.
+* For Statement:: Another looping statement, that
+ provides initialization and increment
+ clauses.
+* Switch Statement:: Switch/case evaluation for conditional
+ execution of statements based on a
+ value.
+* Break Statement:: Immediately exit the innermost
+ enclosing loop.
+* Continue Statement:: Skip to the end of the innermost
+ enclosing loop.
+* Next Statement:: Stop processing the current input
+ record.
+* Nextfile Statement:: Stop processing the current file.
+* Exit Statement:: Stop execution of @command{awk}.
+* Built-in Variables:: Summarizes the built-in variables.
+* User-modified:: Built-in variables that you change to
+ control @command{awk}.
+* Auto-set:: Built-in variables where @command{awk}
+ gives you information.
+* ARGC and ARGV:: Ways to use @code{ARGC} and
+ @code{ARGV}.
+* Array Basics:: The basics of arrays.
+* Array Intro:: Introduction to Arrays
+* Reference to Elements:: How to examine one element of an
+ array.
+* Assigning Elements:: How to change an element of an array.
+* Array Example:: Basic Example of an Array
+* Scanning an Array:: A variation of the @code{for}
+ statement. It loops through the
+ indices of an array's existing
+ elements.
+* Controlling Scanning:: Controlling the order in which arrays
+ are scanned.
+* Delete:: The @code{delete} statement removes an
+ element from an array.
+* Numeric Array Subscripts:: How to use numbers as subscripts in
+ @command{awk}.
+* Uninitialized Subscripts:: Using Uninitialized variables as
+ subscripts.
+* Multi-dimensional:: Emulating multidimensional arrays in
+ @command{awk}.
+* Multi-scanning:: Scanning multidimensional arrays.
+* Arrays of Arrays:: True multidimensional arrays.
+* Built-in:: Summarizes the built-in functions.
+* Calling Built-in:: How to call built-in functions.
+* Numeric Functions:: Functions that work with numbers,
+ including @code{int()}, @code{sin()}
+ and @code{rand()}.
+* String Functions:: Functions for string manipulation,
+ such as @code{split()}, @code{match()}
+ and @code{sprintf()}.
+* Gory Details:: More than you want to know about
+ @samp{\} and @samp{&} with
+ @code{sub()}, @code{gsub()}, and
+ @code{gensub()}.
+* I/O Functions:: Functions for files and shell
+ commands.
+* Time Functions:: Functions for dealing with timestamps.
+* Bitwise Functions:: Functions for bitwise operations.
+* Type Functions:: Functions for type information.
+* I18N Functions:: Functions for string translation.
+* User-defined:: Describes User-defined functions in
+ detail.
+* Definition Syntax:: How to write definitions and what they
+ mean.
+* Function Example:: An example function definition and
+ what it does.
+* Function Caveats:: Things to watch out for.
+* Calling A Function:: Don't use spaces.
+* Variable Scope:: Controlling variable scope.
+* Pass By Value/Reference:: Passing parameters.
+* Return Statement:: Specifying the value a function
+ returns.
+* Dynamic Typing:: How variable types can change at
+ runtime.
+* Indirect Calls:: Choosing the function to call at
+ runtime.
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU @code{gettext} works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging @code{printf} arguments.
+* I18N Portability:: @command{awk}-level portability
+ issues.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: @command{gawk} is also
+ internationalized.
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array
+ traversal and sorting arrays.
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use @code{asort()} and
+ @code{asorti()}.
+* Two-way I/O:: Two-way communications with another
+ process.
+* TCP/IP Networking:: Using @command{gawk} for network
+ programming.
+* Profiling:: Profiling your @command{awk} programs.
+* Library Names:: How to best name private global
+ variables in library functions.
+* General Functions:: Functions that are of general use.
+* Strtonum Function:: A replacement for the built-in
+ @code{strtonum()} function.
+* Assert Function:: A function for assertions in
+ @command{awk} programs.
+* Round Function:: A function for rounding if
+ @code{sprintf()} does not do it
+ correctly.
+* Cliff Random Function:: The Cliff Random Number Generator.
+* Ordinal Functions:: Functions for using characters as
+ numbers and vice versa.
+* Join Function:: A function to join an array into a
+ string.
+* Getlocaltime Function:: A function to get formatted times.
+* Data File Management:: Functions for managing command-line
+ data files.
+* Filetrans Function:: A function for handling data file
+ transitions.
+* Rewind Function:: A function for rereading the current
+ file.
+* File Checking:: Checking that data files are readable.
+* Empty Files:: Checking for zero-length files.
+* Ignoring Assigns:: Treating assignments as file names.
+* Getopt Function:: A function for processing command-line
+ arguments.
+* Passwd Functions:: Functions for getting user
+ information.
+* Group Functions:: Functions for getting group
+ information.
+* Walking Arrays:: A function to walk arrays of arrays.
+* Running Examples:: How to run these examples.
+* Clones:: Clones of common utilities.
+* Cut Program:: The @command{cut} utility.
+* Egrep Program:: The @command{egrep} utility.
+* Id Program:: The @command{id} utility.
+* Split Program:: The @command{split} utility.
+* Tee Program:: The @command{tee} utility.
+* Uniq Program:: The @command{uniq} utility.
+* Wc Program:: The @command{wc} utility.
+* Miscellaneous Programs:: Some interesting @command{awk}
+ programs.
+* Dupword Program:: Finding duplicated words in a
+ document.
+* Alarm Program:: An alarm clock.
+* Translate Program:: A program similar to the @command{tr}
+ utility.
+* Labels Program:: Printing mailing labels.
+* Word Sorting:: A program to produce a word usage
+ count.
+* History Sorting:: Eliminating duplicate entries from a
+ history file.
+* Extract Program:: Pulling out programs from Texinfo
+ source files.
+* Simple Sed:: A Simple Stream Editor.
+* Igawk Program:: A wrapper for @command{awk} that
+ includes files.
+* Anagram Program:: Finding anagrams from a dictionary.
+* Signature Program:: People do amazing things with too much
+ time on their hands.
+* Debugging:: Introduction to the @command{gawk}
+ debugger.
+* Debugging Concepts:: Debugging in General.
+* Debugging Terms:: Additional Debugging Concepts.
+* Awk Debugging:: Awk Debugging.
+* Sample Debugging Session:: Sample debugging session.
+* Debugger Invocation:: How to Start the Debugger.
+* Finding The Bug:: Finding the Bug.
+* List of Debugger Commands:: Main debugger commands.
+* Breakpoint Control:: Control of Breakpoints.
+* Debugger Execution Control:: Control of Execution.
+* Viewing And Changing Data:: Viewing and Changing Data.
+* Execution Stack:: Dealing with the Stack.
+* Debugger Info:: Obtaining Information about the
+ Program and the Debugger State.
+* Miscellaneous Debugger Commands:: Miscellaneous Commands.
+* Readline Support:: Readline support.
+* Limitations:: Limitations and future plans.
+* General Arithmetic:: An introduction to computer
+ arithmetic.
+* Floating Point Issues:: Stuff to know about floating-point
+ numbers.
+* String Conversion Precision:: The String Value Can Lie.
+* Unexpected Results:: Floating Point Numbers Are Not
+ Abstract Numbers.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* Integer Programming:: Effective integer programming.
+* Floating-point Programming:: Effective Floating-point Programming.
+* Floating-point Representation:: Binary floating-point representation.
+* Floating-point Context:: Floating-point context.
+* Rounding Mode:: Floating-point rounding mode.
+* Gawk and MPFR:: How @command{gawk} provides
+ arbitrary-precision arithmetic.
+* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
+ Arithmetic with @command{gawk}.
+* Setting Precision:: Setting the working precision.
+* Setting Rounding Mode:: Setting the rounding mode.
+* Floating-point Constants:: Representing floating-point constants.
+* Changing Precision:: Changing the precision of a number.
+* Exact Arithmetic:: Exact arithmetic with floating-point
+ numbers.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic
+ with @command{gawk}.
+* Extension Intro:: What is an extension.
+* Plugin License:: A note about licensing.
+* Extension Design:: Design notes about the extension API.
+* Old Extension Problems:: Problems with the old mechanism.
+* Extension New Mechanism Goals:: Goals for the new mechanism.
+* Extension Other Design Decisions:: Some other design decisions.
+* Extension Mechanism Outline:: An outline of how it works.
+* Extension Future Growth:: Some room for future growth.
+* Extension API Description:: A full description of the API.
+* Extension API Functions Introduction:: Introduction to the API functions.
+* General Data Types:: The data types.
+* Requesting Values:: How to get a value.
+* Constructor Functions:: Functions for creating values.
+* Registration Functions:: Functions to register things with
+ @command{gawk}.
+* Extension Functions:: Registering extension functions.
+* Exit Callback Functions:: Registering an exit callback.
+* Extension Version String:: Registering a version string.
+* Input Parsers:: Registering an input parser.
+* Output Wrappers:: Registering an output wrapper.
+* Two-way processors:: Registering a two-way processor.
+* Printing Messages:: Functions for printing messages.
+* Updating @code{ERRNO}:: Functions for updating @code{ERRNO}.
+* Accessing Parameters:: Functions for accessing parameters.
+* Symbol Table Access:: Functions for accessing global
+ variables.
+* Symbol table by name:: Accessing variables by name.
+* Symbol table by cookie:: Accessing variables by ``cookie''.
+* Cached values:: Creating and using cached values.
+* Array Manipulation:: Functions for working with arrays.
+* Array Data Types:: Data types for working with arrays.
+* Array Functions:: Functions for working with arrays.
+* Flattening Arrays:: How to flatten arrays.
+* Creating Arrays:: How to create and populate arrays.
+* Extension API Variables:: Variables provided by the API.
+* Extension Versioning:: API Version information.
+* Extension API Informational Variables:: Variables providing information about
+ @command{gawk}'s invocation.
+* Extension API Boilerplate:: Boilerplate code for using the API.
+* Finding Extensions:: How @command{gawk} finds compiled
+ extensions.
+* Extension Example:: Example C code for an extension.
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
+* Extension Samples:: The sample extensions that ship with
+ @code{gawk}.
+* Extension Sample File Functions:: The file functions sample.
+* Extension Sample Fnmatch:: An interface to @code{fnmatch()}.
+* Extension Sample Fork:: An interface to @code{fork()} and
+ other process functions.
+* Extension Sample Ord:: Character to value to character
+ conversions.
+* Extension Sample Readdir:: An interface to @code{readdir()}.
+* Extension Sample Revout:: Reversing output sample output
+ wrapper.
+* Extension Sample Rev2way:: Reversing data sample two-way
+ processor.
+* Extension Sample Read write array:: Serializing an array to a file.
+* Extension Sample Readfile:: Reading an entire file into a string.
+* Extension Sample API Tests:: Tests for the API.
+* Extension Sample Time:: An interface to @code{gettimeofday()}
+ and @code{sleep()}.
+* gawkextlib:: The @code{gawkextlib} project.
+* V7/SVR3.1:: The major changes between V7 and
+ System V Release 3.1.
+* SVR4:: Minor changes between System V
+ Releases 3.1 and 4.
+* POSIX:: New features from the POSIX standard.
+* BTL:: New features from Brian Kernighan's
+ version of @command{awk}.
+* POSIX/GNU:: The extensions in @command{gawk} not
+ in POSIX @command{awk}.
+* Common Extensions:: Common Extensions Summary.
+* Ranges and Locales:: How locales used to affect regexp
+ ranges.
+* Contributors:: The major contributors to
+ @command{gawk}.
+* Gawk Distribution:: What is in the @command{gawk}
+ distribution.
+* Getting:: How to get the distribution.
+* Extracting:: How to extract the distribution.
+* Distribution contents:: What is in the distribution.
+* Unix Installation:: Installing @command{gawk} under
+ various versions of Unix.
+* Quick Installation:: Compiling @command{gawk} under Unix.
+* Additional Configuration Options:: Other compile-time options.
+* Configuration Philosophy:: How it's all supposed to work.
+* Non-Unix Installation:: Installation on Other Operating
+ Systems.
+* PC Installation:: Installing and Compiling
+ @command{gawk} on MS-DOS and OS/2.
+* PC Binary Installation:: Installing a prepared distribution.
+* PC Compiling:: Compiling @command{gawk} for MS-DOS,
+ Windows32, and OS/2.
+* PC Testing:: Testing @command{gawk} on PC systems.
+* PC Using:: Running @command{gawk} on MS-DOS,
+ Windows32 and OS/2.
+* Cygwin:: Building and running @command{gawk}
+ for Cygwin.
+* MSYS:: Using @command{gawk} In The MSYS
+ Environment.
+* VMS Installation:: Installing @command{gawk} on VMS.
+* VMS Compilation:: How to compile @command{gawk} under
+ VMS.
+* VMS Installation Details:: How to install @command{gawk} under
+ VMS.
+* VMS Running:: How to run @command{gawk} under VMS.
+* VMS Old Gawk:: An old version comes with some VMS
+ systems.
+* Bugs:: Reporting Problems and Bugs.
+* Other Versions:: Other freely available @command{awk}
+ implementations.
+* Compatibility Mode:: How to disable certain @command{gawk}
+ extensions.
+* Additions:: Making Additions To @command{gawk}.
+* Accessing The Source:: Accessing the Git repository.
+* Adding Code:: Adding code to the main body of
+ @command{gawk}.
+* New Ports:: Porting @command{gawk} to a new
+ operating system.
+* Derived Files:: Why derived files are kept in the
+ @command{git} repository.
+* Future Extensions:: New features that may be implemented
+ one day.
+* Basic High Level:: The high level view.
+* Basic Data Typing:: A very quick intro to data types.
@end detailmenu
@end menu
@@ -28049,42 +28161,62 @@ gawk -M 'BEGIN @{ n = 13; print n % 2 @}'
@node Dynamic Extensions
@chapter Writing Extensions for @command{gawk}
-This chapter is a placeholder, pending a rewrite for the new API.
-Some of the old bits remain, since they can be partially reused.
-
-
-@c STARTOFRANGE gladfgaw
-@cindex @command{gawk}, functions, adding
-@c STARTOFRANGE adfugaw
-@cindex adding, functions to @command{gawk}
-@c STARTOFRANGE fubadgaw
-@cindex functions, built-in, adding to @command{gawk}
-It is possible to add new built-in
-functions to @command{gawk} using dynamically loaded libraries. This
-facility is available on systems (such as GNU/Linux) that support
-the C @code{dlopen()} and @code{dlsym()} functions.
-This @value{CHAPTER} describes how to write and use dynamically
-loaded extensions for @command{gawk}.
-Experience with programming in
-C or C++ is necessary when reading this @value{SECTION}.
+It is possible to add new built-in functions to @command{gawk} using
+dynamically loaded libraries. This facility is available on systems (such
+as GNU/Linux) that support the C @code{dlopen()} and @code{dlsym()}
+functions. This @value{CHAPTER} describes how to create extensions
+using code written in C or C++. If you don't know anything about C
+programming, you can safely skip this @value{CHAPTER}, although you
+may wish to review the documentation on the extensions that come with
+@command{gawk} (@pxref{Extension Samples}), and the section on the
+@code{gawkextlib} project (@pxref{gawkextlib}).
@quotation NOTE
When @option{--sandbox} is specified, extensions are disabled
-(@pxref{Options}.
+(@pxref{Options}).
@end quotation
@menu
+* Extension Intro:: What is an extension.
* Plugin License:: A note about licensing.
-* Sample Library:: A example of new functions.
+* Extension Design:: Design notes about the extension API.
+* Extension API Description:: A full description of the API.
+* Extension Example:: Example C code for an extension.
+* Extension Samples:: The sample extensions that ship with
+ @code{gawk}.
+* gawkextlib:: The @code{gawkextlib} project.
@end menu
+@node Extension Intro
+@section Introduction
+
+An @dfn{extension} (sometimes called a @dfn{plug-in}) is a piece of
+external compiled code that @command{gawk} can load at runtime to
+provide additional functionality, over and above the built-in capabilities
+described in the rest of this @value{DOCUMENT}.
+
+Extensions are useful because they allow you (of course) to extend
+@command{gawk}'s functionality. For example, they can provide access to
+system calls (such as @code{chdir()} to change directory) and to other
+C library routines that could be of use. As with most software,
+``the sky is the limit;'' if you can imagine something that you might
+want to do and can write in C or C++, you can write an extension to do it!
+
+Extensions are written in C or C++, using the @dfn{Application Programming
+Interface} (API) defined for this purpose by the @command{gawk}
+developers. The rest of this @value{CHAPTER} explains the design
+decisions behind the API, the facilities it provides and how to use
+them, and presents a small sample extension. In addition, it documents
+the sample extensions included in the @command{gawk} distribution,
+and describes the @code{gawkextlib} project.
+
@node Plugin License
@section Extension Licensing
Every dynamic extension should define the global symbol
@code{plugin_is_GPL_compatible} to assert that it has been licensed under
a GPL-compatible license. If this symbol does not exist, @command{gawk}
-will emit a fatal error and exit.
+emits a fatal error and exits when it tries to load your extension.
The declared type of the symbol should be @code{int}. It does not need
to be in any allocated section, though. The code merely asserts that
@@ -28094,23 +28226,2383 @@ the symbol exists in the global scope. Something like this is enough:
int plugin_is_GPL_compatible;
@end example
-@node Sample Library
-@section Example: Directory and File Operation Built-ins
-@c STARTOFRANGE chdirg
-@cindex @code{chdir()} function@comma{} implementing in @command{gawk}
-@c STARTOFRANGE statg
-@cindex @code{stat()} function@comma{} implementing in @command{gawk}
-@c STARTOFRANGE filre
-@cindex files, information about@comma{} retrieving
-@c STARTOFRANGE dirch
-@cindex directories, changing
-
-Two useful functions that are not in @command{awk} are @code{chdir()}
-(so that an @command{awk} program can change its directory) and
-@code{stat()} (so that an @command{awk} program can gather information about
-a file).
-This @value{SECTION} implements these functions for @command{gawk} in an
-external extension library.
+@node Extension Design
+@section Extension API Design
+
+The first version of extensions for @command{gawk} was developed in
+the mid-1990s and released with @command{gawk} 3.1 in the late 1990s.
+The basic mechanisms and design remained unchanged for close to 15 years,
+until 2012.
+
+The old extension mechanism used data types and functions from
+@command{gawk} itself, with a ``clever hack'' to install extension
+functions.
+
+@command{gawk} included some sample extensions, of which a few were
+really useful. However, it was clear from the outset that the extension
+mechanism was bolted onto the side and was not really thought out.
+
+@menu
+* Old Extension Problems:: Problems with the old mechanism.
+* Extension New Mechanism Goals:: Goals for the new mechanism.
+* Extension Other Design Decisions:: Some other design decisions.
+* Extension Mechanism Outline:: An outline of how it works.
+* Extension Future Growth:: Some room for future growth.
+@end menu
+
+@node Old Extension Problems
+@subsection Problems With The Old Mechanism
+
+The old extension mechanism had several problems:
+
+@itemize @bullet
+@item
+It depended heavily upon @command{gawk} internals. Any time the
+@code{NODE} structure@footnote{A critical central data structure
+inside @command{gawk}.} changed, an extension would have to be
+recompiled. Furthermore, writing extensions required understanding
+something about @command{gawk}'s internal functions. There was some
+documentation in this @value{DOCUMENT}, but it was quite minimal.
+
+@item
+Being able to call into @command{gawk} from an extension required linker
+facilities that are common on Unix-derived systems but that did
+not work on Windows systems; users wanting extensions on Windows
+had to statically link them into @command{gawk}, even though Windows supports
+dynamic loading of shared objects.
+
+@item
+The API would change occasionally as @command{gawk} changed; no compatibility
+between versions was ever offered or planned for.
+@end itemize
+
+Despite the drawbacks, the @command{xgawk} project developers forked
+@command{gawk} and developed several significant extensions. They also
+enhanced @command{gawk}'s facilities relating to file inclusion and
+shared object access.
+
+A new API was desired for a long time, but only in 2012 did the
+@command{gawk} maintainer and the @command{xgawk} developers finally
+start working on it together. More information about the @command{xgawk}
+project is provided in @ref{gawkextlib}.
+
+@node Extension New Mechanism Goals
+@subsection Goals For A New Mechanism
+
+Some goals for the new API were:
+
+@itemize @bullet
+@item
+The API should be independent of @command{gawk} internals. Changes in
+@command{gawk} internals should not be visible to the writer of an
+extension function.
+
+@item
+The API should provide @emph{binary} compatibility across @command{gawk}
+releases as long as the API itself does not change.
+
+@item
+The API should enable extensions written in C to have roughly the
+same ``appearance'' to @command{awk}-level code as @command{awk}
+functions do. This means that extensions should have:
+
+@itemize @minus
+@item
+The ability to access function parameters.
+
+@item
+The ability to turn an undefined parameter into an array (call by reference).
+
+@item
+The ability to create, access and update global variables.
+
+@item
+Easy access to all the elements of an array at once (``array flattening'')
+in order to loop over all the elements in an easy fashion for C code.
+
+@item
+The ability to create arrays (including @command{gawk}'s true
+multi-dimensional arrays).
+@end itemize
+@end itemize
+
+Some additional important goals were:
+
+@itemize @bullet
+@item
+The API should use only features in ISO C 90, so that extensions
+can be written using the widest range of C and C++ compilers. The header
+should include the appropriate @samp{#ifdef __cplusplus} and @samp{extern "C"}
+magic so that a C++ compiler could be used. (If using C++, the runtime
+system has to be smart enough to call any constructors and destructors,
+as @command{gawk} is a C program. As of this writing, this has not been
+tested.)
+
+@item
+The API mechanism should not require access to @command{gawk}'s
+symbols@footnote{The @dfn{symbols} are the variables and functions
+defined inside @command{gawk}. Access to these symbols by code
+external to @command{gawk} loaded dynamically at runtime is
+problematic on Windows.} by the compile-time or dynamic linker,
+in order to enable creation of extensions that also work on Windows.
+@end itemize
+
+During development, it became clear that there were other features
+that should be available to extensions, which were also subsequently
+provided:
+
+@itemize @bullet
+@item
+Extensions should have the ability to hook into @command{gawk}'s
+I/O redirection mechanism. In particular, the @command{xgawk}
+developers provided a so-called ``open hook'' to take over reading
+records. During development, this was generalized to allow
+extensions to hook into input processing, output processing, and
+two-way I/O.
+
+@item
+An extension should be able to provide a ``call back'' function
+to perform clean up actions when @command{gawk} exits.
+
+@item
+An extension should be able to provide a version string so that
+@command{gawk}'s @option{--version} option can provide information
+about extensions as well.
+@end itemize
+
+@node Extension Other Design Decisions
+@subsection Other Design Decisions
+
+As an ``arbitrary'' design decision, extensions can read the values of
+built-in variables and arrays (such as @code{ARGV} and @code{FS}), but cannot
+change them, with the exception of @code{PROCINFO}.
+
+The reason for this is to prevent an extension function from affecting
+the flow of an @command{awk} program outside its control. While a real
+@command{awk} function can do what it likes, that is at the discretion
+of the programmer. An extension function should provide a service or
+make a C API available for use within @command{awk}, and not mess with
+@code{FS} or @code{ARGC} and @code{ARGV}.
+
+In addition, it becomes easy to start down a slippery slope. How
+much access to @command{gawk} facilities do extensions need?
+Do they need @code{getline}? What about calling @code{gsub()} or
+compiling regular expressions? What about calling into @command{awk}
+functions? (@emph{That} would be messy.)
+
+In order to avoid these issues, the @command{gawk} developers chose
+to start with the simplest, most basic features that are still truly useful.
+
+Another decision is that although @command{gawk} provides nice things like
+MPFR, and arrays indexed internally by integers, these features are not
+being brought out to the API in order to keep things simple and close to
+traditional @command{awk} semantics. (In fact, arrays indexed internally
+by integers are so transparent that they aren't even documented!)
+
+With time, the API will undoubtedly evolve; the @command{gawk} developers
+expect this to be driven by user needs. For now, the current API seems
+to provide a minimal yet powerful set of features for creating extensions.
+
+@node Extension Mechanism Outline
+@subsection How It Works At A High Level
+
+The requirement to avoid access to @command{gawk}'s symbols is, at first
+glance, a difficult one to meet.
+
+One design, apparently used by Perl and Ruby and maybe others, would
+be to make the mainline @command{gawk} code into a library, with the
+@command{gawk} utility a small C @code{main()} function linked against
+the library.
+
+This seemed like the tail wagging the dog, complicating build and
+installation and making a simple copy of the @command{gawk} executable
+from one system to another (or one place to another on the same
+system!) into a chancy operation.
+
+Pat Rankin suggested the solution that was adopted. Communication between
+@command{gawk} and an extension is two-way. First, when an extension
+is loaded, it is passed a pointer to a @code{struct} whose fields are
+function pointers.
+@iftex
+This is shown in @ref{load-extension}.
+@end iftex
+
+@float Figure,load-extension
+@caption{Loading the extension}
+@ifinfo
+@center @image{api-figure1, , , Loading the extension, txt}
+@end ifinfo
+@ifhtml
+@center @image{api-figure1, , , Loading the extension, png}
+@end ifhtml
+@ifnotinfo
+@ifnothtml
+@center @image{api-figure1, , , Loading the extension}
+@end ifnothtml
+@end ifnotinfo
+@end float
+
+The extension can call functions inside @command{gawk} through these
+function pointers, at runtime, without needing (link-time) access
+to @command{gawk}'s symbols. One of these function pointers is to a
+function for ``registering'' new built-in functions.
+@iftex
+This is shown in @ref{load-new-function}.
+@end iftex
+
+@float Figure,load-new-function
+@caption{Loading the new function}
+@ifinfo
+@center @image{api-figure2, , , Loading the new function, txt}
+@end ifinfo
+@ifhtml
+@center @image{api-figure2, , , Loading the new function, png}
+@end ifhtml
+@ifnotinfo
+@ifnothtml
+@center @image{api-figure2, , , Loading the new function}
+@end ifnothtml
+@end ifnotinfo
+@end float
+
+In the other direction, the extension registers its new functions
+with @command{gawk} by passing function pointers to the functions that
+provide the new feature (@code{do_chdir()}, for example). @command{gawk}
+associates the function pointer with a name and can then call it, using a
+defined calling convention.
+@iftex
+This is shown in @ref{call-new-function}.
+@end iftex
+
+@float Figure,call-new-function
+@caption{Calling the new function}
+@ifinfo
+@center @image{api-figure3, , , Calling the new function, txt}
+@end ifinfo
+@ifhtml
+@center @image{api-figure3, , , Calling the new function, png}
+@end ifhtml
+@ifnotinfo
+@ifnothtml
+@center @image{api-figure3, , , Calling the new function}
+@end ifnothtml
+@end ifnotinfo
+@end float
+
+The @code{do_@var{xxx}()} function, in turn, then uses the function
+pointers in the API @code{struct} to do its work, such as updating
+variables or arrays, printing messages, setting @code{ERRNO}, and so on.
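+
+The two directions of communication described above can be modeled in a
+few lines of plain C. The following is only a sketch under stated
+assumptions: every name in it (@code{sketch_api_t}, @code{host_call()},
+@code{ext_load()}, @code{do_answer()}, and so on) is invented for
+illustration and is not part of @file{gawkapi.h}, and the calling
+convention is reduced to a single @code{int} argument count.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT the real gawkapi.h. */

typedef void *sketch_ext_id_t;            /* opaque id handed to the extension */
typedef int (*sketch_func_t)(int nargs);  /* simplified calling convention */

/* The table of function pointers that the host passes to the extension. */
typedef struct sketch_api {
    int  (*add_function)(sketch_ext_id_t id, const char *name, sketch_func_t fn);
    void (*warning)(sketch_ext_id_t id, const char *msg);
} sketch_api_t;

/* ---- Host side (plays the role of the interpreter) ---- */

#define MAX_FUNCS 8
static struct { const char *name; sketch_func_t fn; } registry[MAX_FUNCS];
static size_t n_funcs;
static int warnings_seen;

/* Associate a name with a function pointer; returns 1 on success. */
static int host_add_function(sketch_ext_id_t id, const char *name, sketch_func_t fn)
{
    (void) id;                  /* a real host could use this to identify the caller */
    if (n_funcs == MAX_FUNCS)
        return 0;
    registry[n_funcs].name = name;
    registry[n_funcs].fn = fn;
    n_funcs++;
    return 1;
}

/* Count warnings instead of printing them, to keep the sketch quiet. */
static void host_warning(sketch_ext_id_t id, const char *msg)
{
    (void) id;
    (void) msg;
    warnings_seen++;
}

static const sketch_api_t host_api = { host_add_function, host_warning };

/* Look up a registered function by name and call it; -1 if not found. */
static int host_call(const char *name, int nargs)
{
    size_t i;

    for (i = 0; i < n_funcs; i++)
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].fn(nargs);
    return -1;
}

/* ---- Extension side (sees the host only through the pointers) ---- */

static const sketch_api_t *api;   /* saved at load time */
static sketch_ext_id_t ext_id;    /* passed back on every call into the host */

/* The new "built-in": warns through the API if given arguments. */
static int do_answer(int nargs)
{
    if (nargs != 0)
        api->warning(ext_id, "answer: no arguments expected");
    return 42;
}

/* Called once by the host when the extension is loaded. */
static int ext_load(const sketch_api_t *api_p, sketch_ext_id_t id)
{
    api = api_p;
    ext_id = id;
    return api->add_function(ext_id, "answer", do_answer);
}

/* Drive the whole round trip; returns 0 if every step behaved as expected. */
int sketch_demo(void)
{
    static int token;           /* its address serves as the opaque id */

    if (!ext_load(&host_api, &token))
        return 1;
    if (host_call("answer", 0) != 42 || warnings_seen != 0)
        return 2;
    if (host_call("answer", 3) != 42 || warnings_seen != 1)
        return 3;
    if (host_call("missing", 0) != -1)
        return 4;
    return 0;
}
```
+
+Note that the extension side never refers to a host symbol by name; it
+reaches the host only through the pointers it was handed at load time,
+which is the property the design relies upon.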
+
+Convenience macros in the @file{gawkapi.h} header file make calling
+through the function pointers look like regular function calls so that
+extension code is quite readable and understandable.
+
+Although all of this sounds somewhat complicated, the result is that
+extension code is quite clean and straightforward. This can be seen in
+the sample extension @file{filefuncs.c} (@pxref{Extension Example})
+and also in the @file{testext.c} code for testing the APIs.
+
+Some other bits and pieces:
+
+@itemize @bullet
+@item
+The API provides access to @command{gawk}'s @code{do_@var{xxx}} values,
+reflecting command line options, like @code{do_lint}, @code{do_profiling}
+and so on (@pxref{Extension API Variables}).
+These are informational: an extension cannot affect these
+inside @command{gawk}. In addition, attempting to assign to them
+produces a compile-time error.
+
+@item
+The API also provides major and minor version numbers, so that an
+extension can check if the @command{gawk} it is loaded with supports the
+facilities it was compiled with. (Version mismatches ``shouldn't''
+happen, but we all know how @emph{that} goes.)
+@xref{Extension Versioning}, for details.
+@end itemize
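+
+The version check mentioned in the last item might look roughly like
+the following. This is a sketch only: the structure, the constants and
+the function name are hypothetical, and the acceptance policy (the major
+versions must match exactly, and the host's minor version must be at
+least the one the extension was compiled against) is an assumption made
+for the example, not a statement of what the API requires.
+
```c
#include <assert.h>

/* Hypothetical: the API version this extension was compiled against. */
enum { EXT_API_MAJOR = 1, EXT_API_MINOR = 2 };

/* Hypothetical slice of the API struct carrying the host's version. */
struct sketch_version {
    int major_version;
    int minor_version;
};

/*
 * Return 1 if the host can support this extension, 0 otherwise.
 * Assumed policy: same major version, and a host minor version that is
 * not older than the one the extension was built with.
 */
int sketch_version_ok(const struct sketch_version *host)
{
    if (host->major_version != EXT_API_MAJOR)
        return 0;
    return host->minor_version >= EXT_API_MINOR;
}
```
+
+An extension would typically make such a check first thing when it is
+loaded, and refuse to register anything if the check fails.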
+
+@node Extension Future Growth
+@subsection Room For Future Growth
+
+The API provides room for future growth, in two ways.
+
+An ``extension id'' is passed into the extension when it is loaded. This
+extension id is then passed back to @command{gawk} with each function
+call. This allows @command{gawk} to identify the extension calling into it,
+should it need to know.
+
+A ``name space'' is passed into @command{gawk} when an extension function
+is registered. This provides for a future mechanism for grouping
+extension functions and possibly avoiding name conflicts.
+
+Of course, as of this writing, no decisions have been made with respect
+to any of the above.
+
+@node Extension API Description
+@section API Description
+
+This (rather large) @value{SECTION} describes the API in detail.
+
+@menu
+* Extension API Functions Introduction:: Introduction to the API functions.
+* General Data Types:: The data types.
+* Requesting Values:: How to get a value.
+* Constructor Functions:: Functions for creating values.
+* Registration Functions:: Functions to register things with
+ @command{gawk}.
+* Printing Messages:: Functions for printing messages.
+* Updating @code{ERRNO}:: Functions for updating @code{ERRNO}.
+* Accessing Parameters:: Functions for accessing parameters.
+* Symbol Table Access:: Functions for accessing global
+ variables.
+* Array Manipulation:: Functions for working with arrays.
+* Extension API Variables:: Variables provided by the API.
+* Extension API Boilerplate:: Boilerplate code for using the API.
+* Finding Extensions:: How @command{gawk} finds compiled
+ extensions.
+@end menu
+
+@node Extension API Functions Introduction
+@subsection Introduction
+
+Access to facilities within @command{gawk} is made available
+by calling through function pointers passed into your extension.
+
+API function pointers are provided for the following kinds of operations:
+
+@itemize @bullet
+@item
+Registration functions. You may register:
+@itemize @minus
+@item
+extension functions,
+@item
+exit callbacks,
+@item
+a version string,
+@item
+input parsers,
+@item
+output wrappers,
+@item
+and two-way processors.
+@end itemize
+All of these are discussed in detail, later in this @value{CHAPTER}.
+
+@item
+Printing fatal, warning, and ``lint'' warning messages.
+
+@item
+Updating @code{ERRNO}, or unsetting it.
+
+@item
+Accessing parameters, including converting an undefined parameter into
+an array.
+
+@item
+Symbol table access: retrieving a global variable, creating one,
+or changing one. This also includes the ability to create a scalar
+variable that will be @emph{constant} within @command{awk} code.
+
+@item
+Creating and releasing cached values; this provides an
+efficient way to use values for multiple variables and
+can be a big performance win.
+
+@item
+Manipulating arrays:
+@itemize @minus
+@item
+Retrieving, adding, deleting, and modifying elements
+@item
+Getting the count of elements in an array
+@item
+Creating a new array
+@item
+Clearing an array
+@item
+Flattening an array for easy C style looping over all its indices and elements
+@end itemize
+@end itemize
+
+Some points about using the API:
+
+@itemize @bullet
+@item
+You must include @code{<sys/types.h>} and @code{<sys/stat.h>} before including
+the @file{gawkapi.h} header file. In addition, you must include either
+@code{<stddef.h>} or @code{<stdlib.h>} to get the definition of @code{size_t}.
+If you wish to use the boilerplate @code{dl_load_func()} macro, you will
+need to include @code{<stdio.h>} as well.
+Finally, to pass reasonable integer values for @code{ERRNO}, you
+will need to include @code{<errno.h>}.
+
+@item
+Although the API only uses ISO C 90 features, there is an exception; the
+``constructor'' functions use the @code{inline} keyword. If your compiler
+does not support this keyword, you should either place
+@samp{-Dinline=''} on your command line, or use the GNU Autotools and include a
+@file{config.h} file in your extensions.
+
+@item
+All pointers filled in by @command{gawk} are to memory
+managed by @command{gawk} and should be treated by the extension as
+read-only. Memory for @emph{all} strings passed into @command{gawk}
+from the extension @emph{must} come from @code{malloc()} and is managed
+by @command{gawk} from then on.
+
+@item
+The API defines several simple structs that map values as seen
+from @command{awk}. A value can be a @code{double}, a string, or an
+array (as in multidimensional arrays, or when creating a new array).
+Strings maintain both pointer and length since embedded @code{NUL}
+characters are allowed.
+
+By intent, strings are maintained using the current multibyte encoding (as
+defined by @env{LC_@var{xxx}} environment variables) and not using wide
+characters. This matches how @command{gawk} stores strings internally
+and also how characters are likely to be input and output from files.
+
+@item
+When retrieving a value (such as a parameter or that of a global variable
+or array element), the extension requests a specific type (number, string,
+scalar, value cookie, array, or ``undefined''). When the request is
+``undefined,'' the returned value will have the real underlying type.
+
+However, if the request and actual type don't match, the access function
+returns ``false'' and fills in the type of the actual value that is there,
+so that the extension can, e.g., print an error message
+(``scalar passed where array expected'').
+
+@c This is documented in the header file and needs some expanding upon.
+@c The table there should be presented here
+@end itemize
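+
+The request-and-mismatch behavior described in the last item can be
+sketched as follows. The types and the function @code{sk_get_value()}
+are hypothetical stand-ins, not the API's own declarations, and the
+sketch deliberately ignores any conversions between types that the real
+access functions may perform. It also carries an explicit length with
+each string, since embedded @code{NUL} characters are allowed.
+
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the API's value types. */
typedef enum { SK_UNDEFINED, SK_NUMBER, SK_STRING, SK_ARRAY } sk_valtype_t;

typedef struct {
    sk_valtype_t type;
    double num;        /* meaningful when type == SK_NUMBER */
    const char *str;   /* meaningful when type == SK_STRING; may contain NULs */
    size_t len;        /* length of str in bytes */
} sk_value_t;

/*
 * Fetch a value of the wanted type.
 *
 * If the caller asks for SK_UNDEFINED, or the wanted type matches the
 * actual one, copy the whole value and return 1 ("true").  Otherwise
 * fill in only the actual type and return 0 ("false"), so that the
 * caller can produce a useful diagnostic.
 */
int sk_get_value(const sk_value_t *actual, sk_valtype_t wanted, sk_value_t *result)
{
    if (wanted == SK_UNDEFINED || wanted == actual->type) {
        *result = *actual;
        return 1;
    }
    result->type = actual->type;
    return 0;
}
```
+
+A caller that wanted an array but received ``false'' with the type set
+to the string type can then report that a scalar was passed where an
+array was expected.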
+
+While you may call the API functions by using the function pointers
+directly, the interface is not so pretty. To make extension code look
+more like regular code, the @file{gawkapi.h} header file defines a number
+of macros which you should use in your code. This @value{SECTION} presents
+the macros as if they were functions.
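+
+To give a feel for how such macros can hide the function pointers,
+here is a minimal sketch. The names @code{sk_api_t}, @code{api_add} and
+@code{sk_add()} are invented for this example, and the sketch assumes
+that variables named @code{api} and @code{ext_id} are in scope wherever
+the macro is used; consult @file{gawkapi.h} for the actual definitions.
+
```c
#include <assert.h>

/* Hypothetical names; see gawkapi.h for the real definitions. */
typedef void *sk_ext_id_t;

typedef struct sk_api {
    int (*api_add)(sk_ext_id_t id, int a, int b);
} sk_api_t;

/* Host-side implementation reached through the pointer. */
static int host_add(sk_ext_id_t id, int a, int b)
{
    (void) id;
    return a + b;
}

static const sk_api_t host = { host_add };

/* What an extension would have saved when it was loaded. */
static const sk_api_t *api = &host;
static sk_ext_id_t ext_id;

/*
 * The convenience macro: supplies `api' and `ext_id' implicitly, so the
 * call site reads like an ordinary function call.
 */
#define sk_add(a, b)  (api->api_add(ext_id, (a), (b)))

/* Extension code can now be written without visible function pointers. */
int sk_macro_demo(void)
{
    return sk_add(2, 3);
}
```
+
+Without the macro, the same call would have to be spelled
+@code{api->api_add(ext_id, 2, 3)} at every call site.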
+
+@node General Data Types
+@subsection General Purpose Data Types
+
+@quotation
+@i{I have a true love/hate relationship with unions.}@*
+Arnold Robbins
+
+@i{That's the thing about unions: the compiler will arrange things so they
+can accommodate both love and hate.}@*
+Chet Ramey
+@end quotation
+
+The extension API defines a number of simple types and structures for general
+purpose use. Additional, more specialized data structures are introduced
+in subsequent @value{SECTION}s, together with the functions that use them.
+
+@table @code
+@item typedef void *awk_ext_id_t;
+A value of this type is received from @command{gawk} when an extension is loaded.
+That value must then be passed back to @command{gawk} as the first parameter of
+each API function.
+
+@item #define awk_const @dots{}
+This macro expands to @samp{const} when compiling an extension,
+and to nothing when compiling @command{gawk} itself. This makes
+certain fields in the API data structures unwritable from extension code,
+while allowing @command{gawk} to use them as it needs to.
+
+@item typedef int awk_bool_t;
+A simple boolean type. At the moment, the API does not define special
+``true'' and ``false'' values, although perhaps it should.
+
+@item typedef struct @{
+@itemx @ @ @ @ char *str;@ @ @ @ @ @ /* data */
+@itemx @ @ @ @ size_t len;@ @ @ @ @ /* length thereof, in chars */
+@itemx @} awk_string_t;
+This represents a mutable string. @command{gawk}
+owns the memory pointed to if it supplied
+the value. Otherwise, it takes ownership of the memory pointed to.
+@strong{Such memory must come from @code{malloc()}!}
+
+As mentioned earlier, strings are maintained using the current
+multibyte encoding.
+
+@item typedef enum @{
+@itemx @ @ @ @ AWK_UNDEFINED,
+@itemx @ @ @ @ AWK_NUMBER,
+@itemx @ @ @ @ AWK_STRING,
+@itemx @ @ @ @ AWK_ARRAY,
+@itemx @ @ @ @ AWK_SCALAR,@ @ @ @ @ @ @ @ @ /* opaque access to a variable */
+@itemx @ @ @ @ AWK_VALUE_COOKIE@ @ @ /* for updating a previously created value */
+@itemx @} awk_valtype_t;
+This @code{enum} indicates the type of a value.
+It is used in the following @code{struct}.
+
+@item typedef struct @{
+@itemx @ @ @ @ awk_valtype_t val_type;
+@itemx @ @ @ @ union @{
+@itemx @ @ @ @ @ @ @ @ awk_string_t@ @ @ @ @ @ @ s;
+@itemx @ @ @ @ @ @ @ @ double@ @ @ @ @ @ @ @ @ @ @ @ @ d;
+@itemx @ @ @ @ @ @ @ @ awk_array_t@ @ @ @ @ @ @ @ a;
+@itemx @ @ @ @ @ @ @ @ awk_scalar_t@ @ @ @ @ @ @ scl;
+@itemx @ @ @ @ @ @ @ @ awk_value_cookie_t@ vc;
+@itemx @ @ @ @ @} u;
+@itemx @} awk_value_t;
+An ``@command{awk} value.''
+The @code{val_type} member indicates what kind of value the
+@code{union} holds, and each member is of the appropriate type.
+
+@item #define str_value@ @ @ @ @ @ u.s
+@itemx #define num_value@ @ @ @ @ @ u.d
+@itemx #define array_cookie@ @ @ u.a
+@itemx #define scalar_cookie@ @ u.scl
+@itemx #define value_cookie@ @ @ u.vc
+These macros make accessing the fields of the @code{awk_value_t} more
+readable.
+
+@item typedef void *awk_scalar_t;
+Scalars can be represented as an opaque type. These values are obtained from
+@command{gawk} and then passed back into it. This is discussed in a general fashion below,
+and in more detail in @ref{Symbol table by cookie}.
+
+@item typedef void *awk_value_cookie_t;
+A ``value cookie'' is an opaque type representing a cached value.
+This is also discussed in a general fashion below,
+and in more detail in @ref{Cached values}.
+
+@end table
+
+Scalar values in @command{awk} are either numbers or strings. The
+@code{awk_value_t} struct represents values. The @code{val_type} member
+indicates what is in the @code{union}.
+
+Representing numbers is easy---the API uses a C @code{double}. Strings
+require more work. Since @command{gawk} allows embedded @code{NUL} bytes
+in string values, a string must be represented as a pair containing a
+data-pointer and length. This is the @code{awk_string_t} type.
+
+Identifiers (i.e., the names of global variables) can be associated
+with either scalar values or with arrays. In addition, @command{gawk}
+provides true arrays of arrays, where any given array element can
+itself be an array. Discussion of arrays is delayed until
+@ref{Array Manipulation}.
+
+The various macros listed earlier make it easier to use the elements
+of the @code{union} as if they were fields in a @code{struct}; this
+is a common coding practice in C. Such code is easier to write and to
+read; however, it remains @emph{your} responsibility to make sure that
+the @code{val_type} member correctly reflects the type of the value in
+the @code{awk_value_t}.
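+
+As a minimal sketch of that responsibility, the following fragment checks
+@code{val_type} before touching the @code{union}. The type definitions are
+local stand-ins copied from the declarations above so that the fragment
+compiles on its own (real extension code gets them from @file{gawkapi.h}).
+The @code{void *} definition of @code{awk_array_t} and the helper
+@code{describe_value()} are assumptions made for illustration; they are
+not part of the API.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the declarations shown above. */
typedef struct {
    char *str;      /* data */
    size_t len;     /* length thereof, in chars */
} awk_string_t;

typedef void *awk_array_t;          /* assumption: opaque, like the cookies */
typedef void *awk_scalar_t;
typedef void *awk_value_cookie_t;

typedef enum {
    AWK_UNDEFINED,
    AWK_NUMBER,
    AWK_STRING,
    AWK_ARRAY,
    AWK_SCALAR,
    AWK_VALUE_COOKIE
} awk_valtype_t;

typedef struct {
    awk_valtype_t val_type;
    union {
        awk_string_t s;
        double d;
        awk_array_t a;
        awk_scalar_t scl;
        awk_value_cookie_t vc;
    } u;
} awk_value_t;

#define str_value   u.s
#define num_value   u.d

/*
 * Hypothetical helper: write a short description of *v into buf.
 * Returns whatever snprintf() returns.
 */
static int
describe_value(const awk_value_t *v, char *buf, size_t size)
{
    assert(v != NULL && buf != NULL);

    switch (v->val_type) {
    case AWK_NUMBER:
        return snprintf(buf, size, "number %g", v->num_value);
    case AWK_STRING:
        /* Report the stored length; the data may contain NUL bytes. */
        return snprintf(buf, size, "string of %lu bytes",
                        (unsigned long) v->str_value.len);
    case AWK_ARRAY:
        return snprintf(buf, size, "array");
    case AWK_UNDEFINED:
        return snprintf(buf, size, "undefined");
    default:
        return snprintf(buf, size, "opaque cookie");
    }
}
```
+
+The @code{switch} reads only the member that matches @code{val_type}, and
+the string case relies on @code{len} instead of scanning for a terminating
+@code{NUL}.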
+
+Conceptually, the first three members of the @code{union} (string, number,
+and array) are all that is needed for working with @command{awk} values.
+However, since the API provides routines for accessing and changing
+the value of global scalar variables only by using the variable's name,
+there is a performance penalty: @command{gawk} must find the variable
+each time it is accessed and changed. This turns out to be a real issue,
+not just a theoretical one.
+
+Thus, if you know that your extension will spend considerable time
+reading and/or changing the value of one or more scalar variables, you
+can obtain a @dfn{scalar cookie}@footnote{See
+@uref{http://catb.org/jargon/html/C/cookie.html, the ``cookie'' entry in the Jargon file} for a
+definition of @dfn{cookie}, and @uref{http://catb.org/jargon/html/M/magic-cookie.html,
+the ``magic cookie'' entry in the Jargon file} for a nice example. See
+also the entry for ``Cookie'' in the @ref{Glossary}.}
+object for that variable, and then use
+the cookie for getting the variable's value or for changing the variable's
+value.
+This is the @code{awk_scalar_t} type and @code{scalar_cookie} macro.
+Given a scalar cookie, @command{gawk} can directly retrieve or
+modify the value, as required, without having to first find it.
+
+The @code{awk_value_cookie_t} type and @code{value_cookie} macro are similar.
+If you know that you wish to
+use the same numeric or string @emph{value} for one or more variables,
+you can create the value once, retaining a @dfn{value cookie} for it,
+and then pass in that value cookie whenever you wish to set the value of a
+variable. This saves both storage space within the running @command{gawk}
+process as well as the time needed to create the value.
+
+@node Requesting Values
+@subsection Requesting Values
+
+All of the functions that return values from @command{gawk}
+work in the same way. You pass in an @code{awk_valtype_t} value
+to indicate what kind of value you expect. If the actual value
+matches what you requested, the function returns true and fills
+in the @code{awk_value_t} result.
+Otherwise, the function returns false, and the @code{val_type}
+member indicates the type of the actual value. You may then
+print an error message, or reissue the request for the actual
+value type, as appropriate. This behavior is summarized in
+@ref{table-value-types-returned}.
+
+@ifnotplaintext
+@float Table,table-value-types-returned
+@caption{Value Types Returned}
+@multitable @columnfractions .50 .50
+@headitem @tab Type of Actual Value:
+@end multitable
+@multitable @columnfractions .166 .166 .198 .15 .15 .166
+@headitem @tab @tab String @tab Number @tab Array @tab Undefined
+@item @tab @b{String} @tab String @tab String @tab false @tab false
+@item @tab @b{Number} @tab Number if can be converted, else false @tab Number @tab false @tab false
+@item @b{Type} @tab @b{Array} @tab false @tab false @tab Array @tab false
+@item @b{Requested:} @tab @b{Scalar} @tab Scalar @tab Scalar @tab false @tab false
+@item @tab @b{Undefined} @tab String @tab Number @tab Array @tab Undefined
+@item @tab @b{Value Cookie} @tab false @tab false @tab false @tab false
+@end multitable
+@end float
+@end ifnotplaintext
+@ifplaintext
+@float Table,table-value-types-returned
+@caption{Value Types Returned}
+@example
+ +-------------------------------------------------+
+ | Type of Actual Value: |
+ +------------+------------+-----------+-----------+
+ | String | Number | Array | Undefined |
++-----------+-----------+------------+------------+-----------+-----------+
+| | String | String | String | false | false |
+| |-----------+------------+------------+-----------+-----------+
+| | Number | Number if | Number | false | false |
+| | | can be | | | |
+| | | converted, | | | |
+| | | else false | | | |
+| |-----------+------------+------------+-----------+-----------+
+| Type | Array | false | false | Array | false |
+| Requested |-----------+------------+------------+-----------+-----------+
+| | Scalar | Scalar | Scalar | false | false |
+| |-----------+------------+------------+-----------+-----------+
+| | Undefined | String | Number | Array | Undefined |
+| |-----------+------------+------------+-----------+-----------+
+| | Value | false | false | false | false |
+| | Cookie | | | | |
++-----------+-----------+------------+------------+-----------+-----------+
+@end example
+@end float
+@end ifplaintext
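+
+The table can also be read as a small decision function. The following is
+only a model of the table, not the code @command{gawk} uses: the function
+@code{model_request()} and its @code{string_is_numeric} flag (standing for
+``can be converted'') are hypothetical, and the @code{enum} is a local
+stand-in for the one in @file{gawkapi.h}. It assumes that a request for
+``undefined'' always succeeds.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the enum shown earlier. */
typedef enum {
    AWK_UNDEFINED,
    AWK_NUMBER,
    AWK_STRING,
    AWK_ARRAY,
    AWK_SCALAR,
    AWK_VALUE_COOKIE
} awk_valtype_t;

/*
 * Hypothetical model of the table.  `wanted' is the requested type and
 * `actual' is the real type (string, number, array, or undefined).
 * Returns true if the request succeeds.  *got receives the type of the
 * value returned on success, or the actual type on failure.
 */
static bool
model_request(awk_valtype_t wanted, awk_valtype_t actual,
              bool string_is_numeric, awk_valtype_t *got)
{
    bool is_scalar = (actual == AWK_STRING || actual == AWK_NUMBER);

    assert(got != NULL);
    *got = actual;      /* default: report what is really there */

    switch (wanted) {
    case AWK_UNDEFINED: /* "give me whatever it is" */
        return true;
    case AWK_STRING:    /* numbers can always be turned into strings */
        if (is_scalar) {
            *got = AWK_STRING;
            return true;
        }
        return false;
    case AWK_NUMBER:    /* strings only if they can be converted */
        if (actual == AWK_NUMBER
            || (actual == AWK_STRING && string_is_numeric)) {
            *got = AWK_NUMBER;
            return true;
        }
        return false;
    case AWK_ARRAY:
        return actual == AWK_ARRAY;
    case AWK_SCALAR:
        if (is_scalar) {
            *got = AWK_SCALAR;
            return true;
        }
        return false;
    default:            /* AWK_VALUE_COOKIE: every column is false */
        return false;
    }
}
```
+
+For instance, requesting an array when the actual value is a string fails
+and reports @code{AWK_STRING}, which is the information needed for a
+message such as ``scalar passed where array expected.''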
+
+@node Constructor Functions
+@subsection Constructor Functions and Convenience Macros
+
+The API provides a number of @dfn{constructor} functions for creating
+string and numeric values, as well as a number of convenience macros.
+This @value{SUBSECTION} presents them all as function prototypes, in
+the way that extension code would use them.
+
+@table @code
+@item static inline awk_value_t *
+@itemx make_const_string(const char *string, size_t length, awk_value_t *result)
+This function creates a string value in the @code{awk_value_t} variable
+pointed to by @code{result}. It expects @code{string} to be a C string constant
+(or other string data), and automatically creates a @emph{copy} of the data
+for storage in @code{result}. It returns @code{result}.
+
+@item static inline awk_value_t *
+@itemx make_malloced_string(const char *string, size_t length, awk_value_t *result)
+This function creates a string value in the @code{awk_value_t} variable
+pointed to by @code{result}. It expects @code{string} to be a @samp{char *}
+value pointing to data previously obtained from @code{malloc()}. The idea here
+is that the data is passed directly to @command{gawk}, which assumes
+responsibility for it. It returns @code{result}.
+
+@item static inline awk_value_t *
+@itemx make_null_string(awk_value_t *result)
+This specialized function creates a null string (the ``undefined'' value)
+in the @code{awk_value_t} variable pointed to by @code{result}.
+It returns @code{result}.
+
+@item static inline awk_value_t *
+@itemx make_number(double num, awk_value_t *result)
+This function simply creates a numeric value in the @code{awk_value_t} variable
+pointed to by @code{result}. It returns @code{result}.
+@end table
+
+Two convenience macros may be used for allocating storage from @code{malloc()}
+and @code{realloc()}. If the allocation fails, they cause @command{gawk} to
+exit with a fatal error message. They should be used as if they were
+procedure calls that do not return a value.
+
+@table @code
+@item emalloc(pointer, type, size, message)
+The arguments to this macro are as follows:
+@c nested table
+@table @code
+@item pointer
+The pointer variable to point at the allocated storage.
+
+@item type
+The type of the pointer variable, used to create a cast for the call to @code{malloc()}.
+
+@item size
+The total number of bytes to be allocated.
+
+@item message
+A message to be prefixed to the fatal error message. Typically this is the name
+of the function using the macro.
+@end table
+
+@noindent
+For example, you might allocate a string value like so:
+
+@example
+awk_value_t result;
+char *message;
+const char greet[] = "Don't Panic!";
+
+emalloc(message, char *, sizeof(greet), "myfunc");
+strcpy(message, greet);
+make_malloced_string(message, strlen(message), & result);
+@end example
+
+@item erealloc(pointer, type, size, message)
+This is like @code{emalloc()}, but it calls @code{realloc()},
+instead of @code{malloc()}.
+The arguments are the same as for the @code{emalloc()} macro.
+@end table
+
+@node Registration Functions
+@subsection Registration Functions
+
+This @value{SECTION} describes the API functions for
+registering parts of your extension with @command{gawk}.
+
+@menu
+* Extension Functions:: Registering extension functions.
+* Exit Callback Functions:: Registering an exit callback.
+* Extension Version String:: Registering a version string.
+* Input Parsers:: Registering an input parser.
+* Output Wrappers:: Registering an output wrapper.
+* Two-way processors:: Registering a two-way processor.
+@end menu
+
+@node Extension Functions
+@subsubsection Registering An Extension Function
+
+Extension functions are described by the following record:
+
+@example
+typedef struct @{
+@ @ @ @ const char *name;
+@ @ @ @ awk_value_t *(*function)(int num_actual_args, awk_value_t *result);
+@ @ @ @ size_t num_expected_args;
+@} awk_ext_func_t;
+@end example
+
+The fields are:
+
+@table @code
+@item const char *name;
+The name of the new function.
+@command{awk} level code calls the function by this name.
+This is a regular C string.
+
+@item awk_value_t *(*function)(int num_actual_args, awk_value_t *result);
+This is a pointer to the C function that provides the desired
+functionality.
+The function must fill in the result with either a number
+or a string. @command{awk} takes ownership of any string memory.
+As mentioned earlier, string memory @strong{must} come from @code{malloc()}.
+
+The function must return the value of @code{result}.
+This is for the convenience of the calling code inside @command{gawk}.
+
+@item size_t num_expected_args;
+This is the number of arguments the function expects to receive.
+Each extension function may decide what to do if the number of
+arguments isn't what it expected. Following @command{awk} functions, it
+is likely OK to ignore extra arguments.
+@end table
+
+Once you have a record representing your extension function, you register
+it with @command{gawk} using this API function:
+
+@table @code
+@item awk_bool_t add_ext_func(const char *namespace, const awk_ext_func_t *func);
+This function returns true upon success, false otherwise.
+The @code{namespace} parameter is currently not used; you should pass in an
+empty string (@code{""}). The @code{func} pointer is the address of a
+@code{struct} representing your function, as just described.
+@end table
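+
+Putting the pieces together, a function record might be set up as in the
+following sketch. To keep it compilable on its own, it uses trimmed local
+stand-ins for @code{awk_value_t}, @code{awk_ext_func_t}, and
+@code{make_number()}; real code takes all of these from @file{gawkapi.h}.
+The function @code{do_answer()} and the @command{awk}-level name
+@code{answer} are hypothetical.
+
```c
#include <assert.h>
#include <stddef.h>

/* Trimmed stand-ins for the API declarations. */
typedef enum {
    AWK_UNDEFINED,
    AWK_NUMBER,
    AWK_STRING,
    AWK_ARRAY,
    AWK_SCALAR,
    AWK_VALUE_COOKIE
} awk_valtype_t;

typedef struct {
    awk_valtype_t val_type;
    union {
        double d;       /* only the numeric member is needed here */
    } u;
} awk_value_t;

typedef struct {
    const char *name;
    awk_value_t *(*function)(int num_actual_args, awk_value_t *result);
    size_t num_expected_args;
} awk_ext_func_t;

/* Stand-in with the documented make_number() prototype. */
static awk_value_t *
make_number(double num, awk_value_t *result)
{
    result->val_type = AWK_NUMBER;
    result->u.d = num;
    return result;
}

/* Hypothetical extension function: answer() always returns 42. */
static awk_value_t *
do_answer(int num_actual_args, awk_value_t *result)
{
    assert(result != NULL);
    (void) num_actual_args;     /* extra arguments are simply ignored */

    /* Fill in the result and return it, as the API requires. */
    return make_number(42.0, result);
}

/* The record describing the function: name, C function, argument count. */
static const awk_ext_func_t answer_func = { "answer", do_answer, 0 };

/*
 * In a real extension, registration would then look something like:
 *
 *     if (! add_ext_func("", & answer_func))
 *         ... report the failure ...
 */
```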
+
+@node Exit Callback Functions
+@subsubsection Registering An Exit Callback Function
+
+An @dfn{exit callback} function is a function that
+@command{gawk} calls before it exits.
+Such functions are useful if you have general ``clean up'' tasks
+that should be performed in your extension (such as closing database
+connections or releasing other resources).
+You can register such
+a function with @command{gawk} using the following function.
+
+@table @code
+@item void awk_atexit(void (*funcp)(void *data, int exit_status),
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ void *arg0);
+The parameters are:
+@c nested table
+@table @code
+@item funcp
+A pointer to the function to be called before @command{gawk} exits. The @code{data}
+parameter will be the original value of @code{arg0}.
+The @code{exit_status} parameter is
+the exit status value that @command{gawk} will pass to the @code{exit()} system call.
+
+@item arg0
+A pointer to private data which @command{gawk} saves in order to pass to
+the function pointed to by @code{funcp}.
+@end table
+@end table
+
+Exit callback functions are called in Last-In-First-Out (LIFO) order---that is, in
+the reverse of the order in which they are registered with @command{gawk}.
+
+@node Extension Version String
+@subsubsection Registering An Extension Version String
+
+You can register with @command{gawk} a version string that indicates the
+name and version of your extension, as follows:
+
+@table @code
+@item void register_ext_version(const char *version);
+Register the string pointed to by @code{version} with @command{gawk}.
+@command{gawk} does @emph{not} copy the @code{version} string, so
+it should not be changed.
+@end table
+
+@command{gawk} prints all registered extension version strings when it
+is invoked with the @option{--version} option.
+
+@node Input Parsers
+@subsubsection Customized Input Parsers
+
+By default, @command{gawk} reads text files as its input. It uses the value
+of @code{RS} to find the end of the record, and then uses @code{FS}
+(or @code{FIELDWIDTHS}) to split it into fields (@pxref{Reading Files}).
+Additionally, it sets the value of @code{RT} (@pxref{Built-in Variables}).
+
+If you want, you can provide your own, custom, input parser. An input
+parser's job is to return a record to the @command{gawk} record processing
+code, along with indicators for the value and length of the data to be
+used for @code{RT}, if any.
+
+To provide an input parser, you must first provide two functions
+(where @var{XXX} is a prefix name for your extension):
+
+@table @code
+@item awk_bool_t @var{XXX}_can_take_file(const awk_input_buf_t *iobuf)
+This function examines the information available in @code{iobuf}
+(which we discuss shortly). Based on the information there, it
+decides if the input parser should be used for this file.
+If so, it should return true. Otherwise, it should return false.
+It should not change any state (variable values, etc.) within @command{gawk}.
+
+@item awk_bool_t @var{XXX}_take_control_of(awk_input_buf_t *iobuf)
+When @command{gawk} decides to hand control of the file over to the
+input parser, it calls this function. This function in turn must fill
+in certain fields in the @code{awk_input_buf_t} structure, and ensure
+that certain conditions are true. It should then return true. If an
+error of some kind occurs, it should not fill in any fields, and should
+return false; then @command{gawk} will not use the input parser.
+The details are presented shortly.
+@end table
+
+Your extension should package these functions inside an
+@code{awk_input_parser_t}, which looks like this:
+
+@example
+typedef struct input_parser @{
+ const char *name; /* name of parser */
+ awk_bool_t (*can_take_file)(const awk_input_buf_t *iobuf);
+ awk_bool_t (*take_control_of)(awk_input_buf_t *iobuf);
+ awk_const struct input_parser *awk_const next; /* for use by gawk */
+@} awk_input_parser_t;
+@end example
+
+The fields are:
+
+@table @code
+@item const char *name;
+The name of the input parser. This is a regular C string.
+
+@item awk_bool_t (*can_take_file)(const awk_input_buf_t *iobuf);
+A pointer to your @code{@var{XXX}_can_take_file()} function.
+
+@item awk_bool_t (*take_control_of)(awk_input_buf_t *iobuf);
+A pointer to your @code{@var{XXX}_take_control_of()} function.
+
+@item awk_const struct input_parser *awk_const next;
+This pointer is used by @command{gawk}.
+The extension cannot modify it.
+@end table
+
+The steps are as follows:
+
+@enumerate
+@item
+Create a @code{static awk_input_parser_t} variable and initialize it
+appropriately.
+
+@item
+When your extension is loaded, register your input parser with
+@command{gawk} using the @code{register_input_parser()} API function
+(described below).
+@end enumerate
+
+An @code{awk_input_buf_t} looks like this:
+
+@example
+typedef struct awk_input @{
+ const char *name; /* filename */
+ int fd; /* file descriptor */
+#define INVALID_HANDLE (-1)
+ void *opaque; /* private data for input parsers */
+ int (*get_record)(char **out, struct awk_input *iobuf,
+ int *errcode, char **rt_start, size_t *rt_len);
+ void (*close_func)(struct awk_input *iobuf);
+ struct stat sbuf; /* stat buf */
+@} awk_input_buf_t;
+@end example
+
+The fields can be divided into two categories: those for use (initially,
+at least) by @code{@var{XXX}_can_take_file()}, and those for use by
+@code{@var{XXX}_take_control_of()}. The first group of fields and their uses
+are as follows:
+
+@table @code
+@item const char *name;
+The name of the file.
+
+@item int fd;
+A file descriptor for the file. If @command{gawk} was able to
+open the file, then @code{fd} will @emph{not} be equal to
+@code{INVALID_HANDLE}. Otherwise, it will.
+
+@item struct stat sbuf;
+If the file descriptor is valid, then @command{gawk} will have filled
+in this structure via a call to the @code{fstat()} system call.
+@end table
+
+The @code{@var{XXX}_can_take_file()} function should examine these
+fields and decide if the input parser should be used for the file.
+The decision can be made based upon @command{gawk} state (the value
+of a variable defined previously by the extension and set by
+@command{awk} code), the name of the
+file, whether or not the file descriptor is valid, the information
+in the @code{struct stat}, or any combination of the above.
+
+Once @code{@var{XXX}_can_take_file()} has returned true, and
+@command{gawk} has decided to use your input parser, it calls
+@code{@var{XXX}_take_control_of()}. That function then fills in at
+least the @code{get_record} field of the @code{awk_input_buf_t}. It must
+also ensure that @code{fd} is not set to @code{INVALID_HANDLE}. All of
+the fields that may be filled by @code{@var{XXX}_take_control_of()}
+are as follows:
+
+@table @code
+@item void *opaque;
+This is used to hold any state information needed by the input parser
+for this file. It is ``opaque'' to @command{gawk}. The input parser
+is not required to use this pointer.
+
+@item int@ (*get_record)(char@ **out,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ struct@ awk_input *iobuf,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ int *errcode,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ char **rt_start,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ size_t *rt_len);
+This function pointer should point to a function that creates the input
+records. Said function is the core of the input parser. Its behavior
+is described below.
+
+@item void (*close_func)(struct awk_input *iobuf);
+This function pointer should point to a function that does
+the ``tear down.'' It should release any resources allocated by
+@code{@var{XXX}_take_control_of()}. It may also close the file. If it
+does so, it should set the @code{fd} field to @code{INVALID_HANDLE}.
+
+If @code{fd} is still not @code{INVALID_HANDLE} after the call to this
+function, @command{gawk} calls the regular @code{close()} system call.
+
+Having a ``tear down'' function is optional. If your input parser does
+not need it, do not set this field. Then, @command{gawk} calls the
+regular @code{close()} system call on the file descriptor, so it should
+be valid.
+@end table
+
+The @code{@var{XXX}_get_record()} function does the work of creating
+input records. The parameters are as follows:
+
+@table @code
+@item char **out
+This is a pointer to a @code{char *} variable which is set to point
+to the record. @command{gawk} makes its own copy of the data, so
+the extension must manage this storage.
+
+@item struct awk_input *iobuf
+This is the @code{awk_input_buf_t} for the file. The fields should be
+used for reading data (@code{fd}) and for managing private state
+(@code{opaque}), if any.
+
+@item int *errcode
+If an error occurs, @code{*errcode} should be set to an appropriate
+code from @code{<errno.h>}.
+
+@item char **rt_start
+@itemx size_t *rt_len
+If the concept of a ``record terminator'' makes sense, then
+@code{*rt_start} should be set to point to the data to be used for
+@code{RT}, and @code{*rt_len} should be set to the length of the
+data. Otherwise, @code{*rt_len} should be set to zero.
+@command{gawk} makes its own copy of this data, so the
+extension must manage the storage.
+@end table
+
+The return value is the length of the buffer pointed to by
+@code{*out}, or @code{EOF} if end-of-file was reached or an
+error occurred.
+
+It is guaranteed that @code{errcode} is a valid pointer, so there is no
+need to test for a @code{NULL} value. @command{gawk} sets @code{*errcode}
+to zero, so there is no need to set it unless an error occurs.
+
+If an error does occur, the function should return @code{EOF} and set
+@code{*errcode} to a non-zero value. In that case, if @code{*errcode}
+does not equal @minus{}1, @command{gawk} automatically updates
+the @code{ERRNO} variable based on the value of @code{*errcode} (e.g.,
+setting @samp{*errcode = errno} should do the right thing).
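+
+The following sketch shows one way the three functions could fit together
+for newline-terminated records. It is an illustration under stated
+assumptions, not code from @command{gawk}: the @code{line_} prefix and the
+@code{line_state_t} structure are hypothetical, the type definitions are
+local stand-ins for those in @file{gawkapi.h}, and reading one byte per
+@code{read()} call is chosen for brevity, not speed.
+
```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>      /* for EOF */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

typedef int awk_bool_t;                 /* stand-in */

typedef struct awk_input {              /* stand-in copy of the structure */
    const char *name;
    int fd;
#define INVALID_HANDLE (-1)
    void *opaque;
    int (*get_record)(char **out, struct awk_input *iobuf,
                      int *errcode, char **rt_start, size_t *rt_len);
    void (*close_func)(struct awk_input *iobuf);
    struct stat sbuf;
} awk_input_buf_t;

typedef struct {                        /* hypothetical private state */
    char *buf;                          /* storage for the current record */
    size_t cap;                         /* allocated size of buf */
} line_state_t;

/* Return the next record; the newline, if any, becomes RT. */
static int
line_get_record(char **out, struct awk_input *iobuf, int *errcode,
                char **rt_start, size_t *rt_len)
{
    line_state_t *st = iobuf->opaque;
    size_t len = 0;
    ssize_t n;
    char c;

    *rt_len = 0;
    while ((n = read(iobuf->fd, &c, 1)) == 1) {
        if (len + 1 > st->cap) {        /* grow the record buffer */
            size_t ncap = st->cap ? st->cap * 2 : 128;
            char *tmp = realloc(st->buf, ncap);

            if (tmp == NULL) {
                *errcode = ENOMEM;
                return EOF;
            }
            st->buf = tmp;
            st->cap = ncap;
        }
        st->buf[len++] = c;
        if (c == '\n') {
            *rt_start = &st->buf[len - 1];
            *rt_len = 1;
            len--;                      /* terminator is not record data */
            break;
        }
    }
    if (n < 0) {
        *errcode = errno;               /* gawk updates ERRNO from this */
        return EOF;
    }
    if (n == 0 && len == 0)
        return EOF;                     /* end of file; *errcode stays zero */
    *out = st->buf;                     /* the extension owns this storage */
    return (int) len;
}

/* Release private state; fd is left open so that gawk closes it. */
static void
line_close(struct awk_input *iobuf)
{
    line_state_t *st = iobuf->opaque;

    if (st != NULL) {
        free(st->buf);
        free(st);
        iobuf->opaque = NULL;
    }
}

static awk_bool_t
line_take_control_of(awk_input_buf_t *iobuf)
{
    line_state_t *st;

    assert(iobuf != NULL);
    if (iobuf->fd == INVALID_HANDLE)
        return 0;
    st = calloc(1, sizeof(*st));
    if (st == NULL)
        return 0;                       /* no fields filled in on failure */
    iobuf->opaque = st;
    iobuf->get_record = line_get_record;
    iobuf->close_func = line_close;
    return 1;
}
```
+
+Because the record and its terminator live in the parser's own buffer, they
+remain valid only until the next call, which is acceptable since
+@command{gawk} makes its own copy of both.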
+
+@command{gawk} ships with a sample extension that reads directories,
+returning records for each entry in the directory (@pxref{Extension
+Sample Readdir}). You may wish to use that code as a guide for writing
+your own input parser.
+
+When writing an input parser, you should think about (and document)
+how it is expected to interact with @command{awk} code. You may want
+it to always be called, and take effect as appropriate (as the
+@code{readdir} extension does). Or you may want it to take effect
+based upon the value of an @code{awk} variable, as the XML extension
+from the @code{gawkextlib} project does (@pxref{gawkextlib}).
+In the latter case, code in a @code{BEGINFILE} section
+can look at @code{FILENAME} and @code{ERRNO} to decide whether or
+not to activate an input parser (@pxref{BEGINFILE/ENDFILE}).
+
+You register your input parser with the following function:
+
+@table @code
+@item void register_input_parser(awk_input_parser_t *input_parser);
+Register the input parser pointed to by @code{input_parser} with
+@command{gawk}.
+@end table
+
+@node Output Wrappers
+@subsubsection Customized Output Wrappers
+
+An @dfn{output wrapper} is the mirror image of an input parser.
+It allows an extension to take over the output to a file opened
+with the @samp{>} or @samp{>>} operators (@pxref{Redirection}).
+
+The output wrapper is very similar to the input parser structure:
+
+@example
+typedef struct output_wrapper @{
+ const char *name; /* name of the wrapper */
+ awk_bool_t (*can_take_file)(const awk_output_buf_t *outbuf);
+ awk_bool_t (*take_control_of)(awk_output_buf_t *outbuf);
+ awk_const struct output_wrapper *awk_const next; /* for use by gawk */
+@} awk_output_wrapper_t;
+@end example
+
+The members are as follows:
+
+@table @code
+@item const char *name;
+This is the name of the output wrapper.
+
+@item awk_bool_t (*can_take_file)(const awk_output_buf_t *outbuf);
+This points to a function that examines the information in
+the @code{awk_output_buf_t} structure pointed to by @code{outbuf}.
+It should return true if the output wrapper wants to take over the
+file, and false otherwise. It should not change any state (variable
+values, etc.) within @command{gawk}.
+
+@item awk_bool_t (*take_control_of)(awk_output_buf_t *outbuf);
+The function pointed to by this field is called when @command{gawk}
+decides to let the output wrapper take control of the file. It should
+fill in appropriate members of the @code{awk_output_buf_t} structure,
+as described below, and return true if successful, false otherwise.
+
+@item awk_const struct output_wrapper *awk_const next;
+This is for use by @command{gawk}.
+@end table
+
+The @code{awk_output_buf_t} structure looks like this:
+
+@example
+typedef struct @{
+ const char *name; /* name of output file */
+ const char *mode; /* mode argument to fopen */
+ FILE *fp; /* stdio file pointer */
+ awk_bool_t redirected; /* true if a wrapper is active */
+ void *opaque; /* for use by output wrapper */
+ size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,
+ FILE *fp, void *opaque);
+ int (*gawk_fflush)(FILE *fp, void *opaque);
+ int (*gawk_ferror)(FILE *fp, void *opaque);
+ int (*gawk_fclose)(FILE *fp, void *opaque);
+@} awk_output_buf_t;
+@end example
+
+Here too, your extension will define @code{@var{XXX}_can_take_file()}
+and @code{@var{XXX}_take_control_of()} functions that examine and update
+data members in the @code{awk_output_buf_t}.
+The data members are as follows:
+
+@table @code
+@item const char *name;
+The name of the output file.
+
+@item const char *mode;
+The mode string (as would be used in the second argument to @code{fopen()})
+with which the file was opened.
+
+@item FILE *fp;
+The @code{FILE} pointer from @code{<stdio.h>}. @command{gawk} opens the file
+before attempting to find an output wrapper.
+
+@item awk_bool_t redirected;
+This field must be set to true by the @code{@var{XXX}_take_control_of()} function.
+
+@item void *opaque;
+This pointer is opaque to @command{gawk}. The extension should use it to store
+a pointer to any private data associated with the file.
+
+@item size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ FILE *fp, void *opaque);
+@itemx int (*gawk_fflush)(FILE *fp, void *opaque);
+@itemx int (*gawk_ferror)(FILE *fp, void *opaque);
+@itemx int (*gawk_fclose)(FILE *fp, void *opaque);
+These pointers should be set to point to functions that do the equivalent
+of what the corresponding @code{<stdio.h>} functions do, if appropriate.
+@command{gawk} uses these function pointers for all output.
+@command{gawk} initializes the pointers to point to internal, ``pass through''
+functions that just call the regular @code{<stdio.h>} functions, so an
+extension only needs to redefine those functions that are appropriate for
+what it does.
+@end table
+
+The @code{@var{XXX}_can_take_file()} function should make a decision based
+upon the @code{name} and @code{mode} fields, and any additional state
+(such as @command{awk} variable values) that is appropriate.
+
+When @command{gawk} calls @code{@var{XXX}_take_control_of()}, it should fill
+in the other fields, as appropriate, except for @code{fp}, which it should just
+use normally.
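+
+As a rough sketch, an output wrapper that upper-cases everything written to
+files whose names end in @file{.up} could look like the following. The
+@code{upper_} functions and the @file{.up} naming policy are hypothetical,
+and the type definitions are local stand-ins for those in
+@file{gawkapi.h}. Only @code{gawk_fwrite} is replaced; the other pointers
+are assumed to keep the ``pass through'' defaults that @command{gawk}
+installs.
+
```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef int awk_bool_t;                 /* stand-in */

typedef struct {                        /* stand-in copy of the structure */
    const char *name;
    const char *mode;
    FILE *fp;
    awk_bool_t redirected;
    void *opaque;
    size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,
                          FILE *fp, void *opaque);
    int (*gawk_fflush)(FILE *fp, void *opaque);
    int (*gawk_ferror)(FILE *fp, void *opaque);
    int (*gawk_fclose)(FILE *fp, void *opaque);
} awk_output_buf_t;

/*
 * Replacement for fwrite(): write each byte in upper case.
 * Like fwrite(), returns the number of complete items written.
 */
static size_t
upper_fwrite(const void *buf, size_t size, size_t count,
             FILE *fp, void *opaque)
{
    const unsigned char *p = buf;
    size_t total = size * count;
    size_t i;

    (void) opaque;                      /* no private state in this sketch */
    for (i = 0; i < total; i++) {
        if (putc(toupper(p[i]), fp) == EOF)
            return i / size;            /* items completed before the error */
    }
    return total == 0 ? 0 : count;
}

/* Decide based on the file name only; change no gawk state. */
static awk_bool_t
upper_can_take_file(const awk_output_buf_t *outbuf)
{
    size_t len;

    if (outbuf == NULL || outbuf->name == NULL)
        return 0;
    len = strlen(outbuf->name);
    return len >= 3 && strcmp(outbuf->name + len - 3, ".up") == 0;
}

/* Install the replacement writer and mark the file as redirected. */
static awk_bool_t
upper_take_control_of(awk_output_buf_t *outbuf)
{
    assert(outbuf != NULL);
    outbuf->gawk_fwrite = upper_fwrite;
    outbuf->redirected = 1;             /* required by the API */
    return 1;
}
```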
+
+You register your output wrapper with the following function:
+
+@table @code
+@item void register_output_wrapper(awk_output_wrapper_t *output_wrapper);
+Register the output wrapper pointed to by @code{output_wrapper} with
+@command{gawk}.
+@end table
+
+@node Two-way processors
+@subsubsection Customized Two-way Processors
+
+A @dfn{two-way processor} combines an input parser and an output wrapper for
+two-way I/O with the @samp{|&} operator (@pxref{Redirection}). It makes identical
+use of the @code{awk_input_buf_t} and @code{awk_output_buf_t} structures
+as described earlier.
+
+A two-way processor is represented by the following structure:
+
+@example
+typedef struct two_way_processor @{
+ const char *name; /* name of the two-way processor */
+ awk_bool_t (*can_take_two_way)(const char *name);
+ awk_bool_t (*take_control_of)(const char *name,
+ awk_input_buf_t *inbuf,
+ awk_output_buf_t *outbuf);
+ awk_const struct two_way_processor *awk_const next; /* for use by gawk */
+@} awk_two_way_processor_t;
+@end example
+
+The fields are as follows:
+
+@table @code
+@item const char *name;
+The name of the two-way processor.
+
+@item awk_bool_t (*can_take_two_way)(const char *name);
+This function returns true if it wants to take over two-way I/O for this filename.
+It should not change any state (variable
+values, etc.) within @command{gawk}.
+
+@item awk_bool_t (*take_control_of)(const char *name,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_input_buf_t *inbuf,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_output_buf_t *outbuf);
+This function should fill in the @code{awk_input_buf_t} and
+@code{awk_output_buf_t} structures pointed to by @code{inbuf} and
+@code{outbuf}, respectively. These structures were described earlier.
+
+@item awk_const struct two_way_processor *awk_const next;
+This is for use by @command{gawk}.
+@end table
+
+As with the input parser and output wrapper, you provide
+``yes I can take this'' and ``take over for this'' functions,
+@code{@var{XXX}_can_take_two_way()} and @code{@var{XXX}_take_control_of()}.
+
+You register your two-way processor with the following function:
+
+@table @code
+@item void register_two_way_processor(awk_two_way_processor_t *two_way_processor);
+Register the two-way processor pointed to by @code{two_way_processor} with
+@command{gawk}.
+@end table
+
+@node Printing Messages
+@subsection Printing Messages
+
+You can print different kinds of warning messages from your
+extension, as described below. Note that for these functions,
+you must pass in the extension id received from @command{gawk}
+when the extension was loaded.@footnote{Because the API uses only ISO C 90
+features, it cannot make use of the ISO C 99 variadic macro feature to hide
+that parameter. More's the pity.}
+
+@table @code
+@item void fatal(awk_ext_id_t id, const char *format, ...);
+Print a message and then cause @command{gawk} to exit immediately.
+
+@item void warning(awk_ext_id_t id, const char *format, ...);
+Print a warning message.
+
+@item void lintwarn(awk_ext_id_t id, const char *format, ...);
+Print a ``lint warning.'' Normally this is the same as printing a
+warning message, but if @command{gawk} was invoked with @samp{--lint=fatal},
+then lint warnings become fatal error messages.
+@end table
+
+All of these functions are otherwise like the C @code{printf()}
+family of functions, where the @code{format} parameter is a string
+with literal characters and formatting codes intermixed.
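+
+As a rough sketch of what ``like @code{printf()}'' means in practice,
+the following standalone C fragment builds a message from a format
+string and a variable argument list. The helper name
+@code{format_message()} and its @code{prefix} parameter are
+hypothetical and are not part of the API; the sketch also assumes a
+C 99 library, since it relies on @code{snprintf()} and @code{vsnprintf()}:
+
```c
#include <stdarg.h>
#include <stdio.h>

/*
 * format_message --- hypothetical helper, NOT part of the gawk API.
 * Writes "prefix: " followed by the printf-style expansion of format
 * into buf, truncating to size bytes. Returns the number of characters
 * the full message needs (excluding the trailing NUL), or -1 on error.
 */
static int
format_message(char *buf, size_t size, const char *prefix,
               const char *format, ...)
{
    va_list args;
    int used, body;

    /* Fixed part of the message: the prefix. */
    used = snprintf(buf, size, "%s: ", prefix);
    if (used < 0)
        return -1;

    /* Variable part: literal characters and formatting codes. */
    va_start(args, format);
    if ((size_t) used < size)
        body = vsnprintf(buf + used, size - used, format, args);
    else
        body = vsnprintf(NULL, 0, format, args);   /* only measure */
    va_end(args);

    return body < 0 ? -1 : used + body;
}
```
+
+@noindent
+A call such as
+@code{format_message(buf, sizeof(buf), "warning", "%s has %d fields", "record", 3)}
+leaves @samp{warning: record has 3 fields} in @code{buf}.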
+
+@node Updating @code{ERRNO}
+@subsection Updating @code{ERRNO}
+
+The following functions allow you to update the @code{ERRNO}
+variable:
+
+@table @code
+@item void update_ERRNO_int(int errno_val);
+Set @code{ERRNO} to the string equivalent of the error code
+in @code{errno_val}. The value should be one of the defined
+error codes in @code{<errno.h>}, and @command{gawk} turns it
+into a (possibly translated) string using the C @code{strerror()} function.
+
+@item void update_ERRNO_string(const char *string);
+Set @code{ERRNO} directly to the string value of @code{string}.
+@command{gawk} makes a copy of the value of @code{string}.
+
+@item void unset_ERRNO();
+Unset @code{ERRNO}.
+@end table
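+
+The behavior described above can be modeled in a few lines of standalone C.
+This is only a sketch: the @code{model_} names and the fixed-size buffer
+standing in for @code{ERRNO} are hypothetical, and representing an unset
+@code{ERRNO} as an empty string is an assumption of the model. The points
+it illustrates are the use of @code{strerror()} and the fact that the
+string is copied:
+
```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for gawk's ERRNO variable; purely illustrative. */
static char model_errno[256];

/* Model of update_ERRNO_int(): store strerror()'s text for the code. */
static void
model_update_ERRNO_int(int errno_val)
{
    snprintf(model_errno, sizeof(model_errno), "%s", strerror(errno_val));
}

/* Model of update_ERRNO_string(): store a private copy of the text. */
static void
model_update_ERRNO_string(const char *string)
{
    snprintf(model_errno, sizeof(model_errno), "%s", string);
}

/* Model of unset_ERRNO(): assumed here to leave an empty value. */
static void
model_unset_ERRNO(void)
{
    model_errno[0] = '\0';
}
```
+
+@noindent
+Because the model copies the string, the caller remains free to reuse
+or modify its own buffer after the call returns.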
+
+@node Accessing Parameters
+@subsection Accessing and Updating Parameters
+
+Two functions give you access to the arguments (parameters)
+passed to your extension function. They are:
+
+@table @code
+@item awk_bool_t get_argument(size_t count,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
+Fill in the @code{awk_value_t} structure pointed to by @code{result}
+with the @code{count}'th argument. Return true if the actual
+type matches @code{wanted}, false otherwise. In the latter
+case, @code{result@w{->}val_type} indicates the actual type
+(@pxref{table-value-types-returned}). Counts are zero based---the first
+argument is numbered zero, the second one, and so on. @code{wanted}
+indicates the type of value expected.
+
+@item awk_bool_t set_argument(size_t count, awk_array_t array);
+Convert a parameter that was undefined into an array; this provides
+call-by-reference for arrays. Return false if @code{count} is too big,
+or if the argument's type is not undefined. @xref{Array Manipulation},
+for more information on creating arrays.
+@end table
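+
+The ``wanted versus actual type'' protocol of @code{get_argument()} can
+be modeled as follows. This is a simplified standalone sketch, not the
+real implementation: the @code{model_} types and function are
+hypothetical, the argument list is passed in explicitly, and no type
+conversions are attempted. Marking the result as undefined when
+@code{count} is out of range is also an assumption of the model:
+
```c
#include <stddef.h>

typedef enum { MODEL_UNDEFINED, MODEL_NUMBER, MODEL_STRING } model_valtype_t;

typedef struct {
    model_valtype_t val_type;   /* actual type of the value */
    double num_value;           /* valid when val_type is MODEL_NUMBER */
    const char *str_value;      /* valid when val_type is MODEL_STRING */
} model_value_t;

/*
 * model_get_argument --- fill in *result with argument number count
 * (zero based). Return 1 if its type matches wanted, 0 otherwise;
 * on a mismatch, result->val_type still reports the actual type.
 */
static int
model_get_argument(const model_value_t *args, size_t nargs,
                   size_t count, model_valtype_t wanted,
                   model_value_t *result)
{
    if (count >= nargs) {       /* no such argument */
        result->val_type = MODEL_UNDEFINED;
        return 0;
    }

    *result = args[count];
    return result->val_type == wanted;
}
```
+
+@noindent
+A caller that receives false can therefore inspect @code{val_type} to
+decide whether to retry with a different @code{wanted} type or to
+report an error.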
+
+@node Symbol Table Access
+@subsection Symbol Table Access
+
+Two sets of routines provide access to global variables, and one set
+allows you to create and release cached values.
+
+@menu
+* Symbol table by name:: Accessing variables by name.
+* Symbol table by cookie:: Accessing variables by ``cookie''.
+* Cached values:: Creating and using cached values.
+@end menu
+
+@node Symbol table by name
+@subsubsection Variable Access and Update by Name
+
+The following routines provide the ability to access and update
+global @command{awk}-level variables by name. In compiler terminology,
+identifiers of different kinds are termed @dfn{symbols}, thus the ``sym''
+in the routines' names. The data structure which stores information
+about symbols is termed a @dfn{symbol table}.
+
+@table @code
+@item awk_bool_t sym_lookup(const char *name,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
+Fill in the @code{awk_value_t} structure pointed to by @code{result}
+with the value of the variable named by the string @code{name}, which is
+a regular C string. @code{wanted} indicates the type of value expected.
+Return true if the actual type matches @code{wanted}, false otherwise.
+In the latter case, @code{result->val_type} indicates the actual type
+(@pxref{table-value-types-returned}).
+
+@item awk_bool_t sym_update(const char *name, awk_value_t *value);
+Update the variable named by the string @code{name}, which is a regular
+C string. The variable is added to @command{gawk}'s symbol table
+if it is not there. Return true if everything worked, false otherwise.
+
+Changing types (scalar to array or vice versa) of an existing variable
+is @emph{not} allowed, nor may this routine be used to update an array.
+This routine cannot be used to update any of the predefined
+variables (such as @code{ARGC} or @code{NF}).
+
+@item awk_bool_t sym_constant(const char *name, awk_value_t *value);
+Create a variable named by the string @code{name}, which is
+a regular C string, that has the constant value as given by
+@code{value}. @command{awk}-level code cannot change the value of this
+variable.@footnote{There (currently) is no @code{awk}-level feature that
+provides this ability.} The extension may change the value of @code{name}'s
+variable with subsequent calls to this routine, and may also convert
+a variable created by @code{sym_update()} into a constant. However,
+once a variable becomes a constant it cannot later be reverted into a
+mutable variable.
+@end table
+
+@node Symbol table by cookie
+@subsubsection Variable Access and Update by Cookie
+
+A @dfn{scalar cookie} is an opaque handle that provides access
+to a global variable or array. It is an optimization that
+avoids looking up variables in @command{gawk}'s symbol table every time
+access is needed. This was discussed earlier, in @ref{General Data Types}.
+
+The following functions let you work with scalar cookies.
+
+@table @code
+@item awk_bool_t sym_lookup_scalar(awk_scalar_t cookie,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
+Retrieve the current value of a scalar cookie.
+Once you have obtained a scalar cookie using @code{sym_lookup()}, you can
+use this function to get its value more efficiently.
+Return false if the value cannot be retrieved.
+
+@item awk_bool_t sym_update_scalar(awk_scalar_t cookie, awk_value_t *value);
+Update the value associated with a scalar cookie. Return false if
+the new value is not one of @code{AWK_STRING} or @code{AWK_NUMBER}.
+Here too, the built-in variables may not be updated.
+@end table
+
+It is not obvious at first glance how to work with scalar cookies or
+what their @i{raison d'etre} really is. In theory, the @code{sym_lookup()}
+and @code{sym_update()} routines are all you really need to work with
+variables. For example, you might have code that looked up the value of
+a variable, evaluated a condition, and then possibly changed the value
+of the variable based on the result of that evaluation, like so:
+
+@example
+/* do_magic --- do something really great */
+
+static awk_value_t *
+do_magic(int nargs, awk_value_t *result)
+@{
+ awk_value_t value;
+
+ if ( sym_lookup("MAGIC_VAR", AWK_NUMBER, & value)
+ && some_condition(value.num_value)) @{
+ value.num_value += 42;
+ sym_update("MAGIC_VAR", & value);
+ @}
+
+ return make_number(0.0, result);
+@}
+@end example
+
+@noindent
+This code looks (and is) simple and straightforward. So what's the problem?
+
+Consider what happens if @command{awk}-level code associated with your
+extension calls the @code{magic()} function (implemented in C by @code{do_magic()}),
+once per record, while processing hundreds of thousands or millions of records.
+The @code{MAGIC_VAR} variable is looked up in the symbol table once or twice per function call!
+
+The symbol table lookup is really pure overhead; it is considerably more efficient
+to get a cookie that represents the variable, and use that to get the variable's
+value and update it as needed.@footnote{The difference is measurable and quite real. Trust us.}
+
+Thus, the way to use cookies is as follows. First, install your extension's variable
+in @command{gawk}'s symbol table using @code{sym_update()}, as usual. Then get a
+scalar cookie for the variable using @code{sym_lookup()}:
+
+@example
+static awk_scalar_t magic_var_cookie; /* cookie for MAGIC_VAR */
+
+static void
+my_extension_init()
+@{
+ awk_value_t value;
+
+ /* install initial value */
+ sym_update("MAGIC_VAR", make_number(42.0, & value));
+
+ /* get cookie */
+ sym_lookup("MAGIC_VAR", AWK_SCALAR, & value);
+
+ /* save the cookie */
+ magic_var_cookie = value.scalar_cookie;
+ @dots{}
+@}
+@end example
+
+Next, use the routines in this section for retrieving and updating
+the value through the cookie. Thus, @code{do_magic()} now becomes
+something like this:
+
+@example
+/* do_magic --- do something really great */
+
+static awk_value_t *
+do_magic(int nargs, awk_value_t *result)
+@{
+ awk_value_t value;
+
+ if ( sym_lookup_scalar(magic_var_cookie, AWK_NUMBER, & value)
+ && some_condition(value.num_value)) @{
+ value.num_value += 42;
+ sym_update_scalar(magic_var_cookie, & value);
+ @}
+ @dots{}
+
+ return make_number(0.0, result);
+@}
+@end example
+
+@quotation NOTE
+The previous code omitted error checking for
+presentation purposes. Your extension code should be more robust
+and carefully check the return values from the API functions.
+@end quotation
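+
+To see why a cookie saves work, consider the following toy model. It
+is a standalone sketch under stated assumptions: the table is a tiny
+linear array rather than @command{gawk}'s real symbol table, the
+``cookie'' is simply a saved pointer, and all of the @code{model_}
+names are hypothetical. The counter records how many name comparisons
+have been performed:
+
```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    double value;
} model_symbol_t;

static model_symbol_t model_table[] = {
    { "NR_SEEN", 0.0 },
    { "MAGIC_VAR", 42.0 },
    { "TOTAL", 0.0 },
};

static unsigned long model_name_compares;   /* work done by name lookups */

/* Lookup by name: every call pays for string comparisons. */
static model_symbol_t *
model_lookup(const char *name)
{
    size_t i;
    size_t n = sizeof(model_table) / sizeof(model_table[0]);

    for (i = 0; i < n; i++) {
        model_name_compares++;
        if (strcmp(model_table[i].name, name) == 0)
            return &model_table[i];
    }
    return NULL;
}

/* "Cookie" access: uses a handle saved from one earlier lookup. */
static double
model_add_via_cookie(model_symbol_t *cookie, double delta)
{
    cookie->value += delta;
    return cookie->value;
}
```
+
+@noindent
+In this model, one call to @code{model_lookup()} at initialization
+time is followed by any number of calls to @code{model_add_via_cookie()},
+none of which compares a single name.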
+
+@node Cached values
+@subsubsection Creating and Using Cached Values
+
+The routines in this section allow you to create and release
+cached values. As with scalar cookies, in theory, cached values
+are not necessary. You can create numbers and strings using
+the functions in @ref{Constructor Functions}. You can then
+assign those values to variables using @code{sym_update()}
+or @code{sym_update_scalar()}, as you like.
+
+However, you can understand the point of cached values if you remember that
+@emph{every} string value's storage @emph{must} come from @code{malloc()}.
+If you have 20 variables, all of which have the same string value, you
+must create 20 identical copies of the string.@footnote{Numeric values
+are clearly less problematic, requiring only a C @code{double} to store.}
+
+It is clearly more efficient, if possible, to create a value once, and
+then tell @command{gawk} to reuse the value for multiple variables. That
+is what the routines in this section let you do. The functions are as follows:
+
+@table @code
+@item awk_bool_t create_value(awk_value_t *value, awk_value_cookie_t *result);
+Create a cached string or numeric value from @code{value} for efficient later
+assignment.
+Only @code{AWK_NUMBER} and @code{AWK_STRING} values are allowed. Any other type
+is rejected. While @code{AWK_UNDEFINED} could be allowed, doing so would
+result in inferior performance.
+
+@item awk_bool_t release_value(awk_value_cookie_t vc);
+Release the memory associated with a value cookie obtained
+from @code{create_value()}.
+@end table
+
+You use value cookies in a fashion similar to the way you use scalar cookies.
+In the extension initialization routine, you create the value cookie:
+
+@example
+static awk_value_cookie_t answer_cookie; /* static value cookie */
+
+static void
+my_extension_init()
+@{
+ awk_value_t value;
+ char *long_string;
+ size_t long_string_len;
+
+ /* code from earlier */
+ @dots{}
+ /* @dots{} fill in long_string and long_string_len @dots{} */
+ make_malloced_string(long_string, long_string_len, & value);
+ create_value(& value, & answer_cookie); /* create cookie */
+ @dots{}
+@}
+@end example
+
+Once the value is created, you can use it as the value of any number
+of variables:
+
+@example
+static awk_value_t *
+do_magic(int nargs, awk_value_t *result)
+@{
+ awk_value_t value;
+
+ @dots{} /* as earlier */
+
+ value.val_type = AWK_VALUE_COOKIE;
+ value.value_cookie = answer_cookie;
+ sym_update("VAR1", & value);
+ sym_update("VAR2", & value);
+ @dots{}
+ sym_update("VAR100", & value);
+ @dots{}
+@}
+@end example
+
+@noindent
+Using value cookies in this way saves considerable storage, since all of
+@code{VAR1} through @code{VAR100} share the same value.
+
+You might be wondering, ``Is this sharing problematic?
+What happens if @command{awk} code assigns a new value to @code{VAR1},
+are all the others changed too?''
+
+That's a great question. The answer is that no, it's not a problem.
+@command{gawk} is smart enough to avoid such problems.
+
+Finally, as part of your clean up action (@pxref{Exit Callback Functions})
+you should release any cached values that you created, using
+@code{release_value()}.
+
+@node Array Manipulation
+@subsection Array Manipulation
+
+The primary data structure@footnote{Okay, the only data structure.} in @command{awk}
+is the associative array (@pxref{Arrays}).
+Extensions need to be able to manipulate @command{awk} arrays.
+The API provides a number of data structures for working with arrays,
+functions for working with individual elements, and functions for
+working with arrays as a whole. This includes the ability to
+``flatten'' an array so that it is easy for C code to traverse
+every element in an array. The array data structures integrate
+nicely with the data structures for values to make it easy to
+both work with and create true arrays of arrays (@pxref{General Data Types}).
+
+@menu
+* Array Data Types:: Data types for working with arrays.
+* Array Functions:: Functions for working with arrays.
+* Flattening Arrays:: How to flatten arrays.
+* Creating Arrays:: How to create and populate arrays.
+@end menu
+
+@node Array Data Types
+@subsubsection Array Data Types
+
+The data types associated with arrays are listed below.
+
+@table @code
+@item typedef void *awk_array_t;
+If you request the value of an array variable, you get back an
+@code{awk_array_t} value. This value is opaque@footnote{It is also
+a ``cookie,'' but the @command{gawk} developers did not wish to overuse this
+term.} to the extension; it uniquely identifies the array but can
+only be used by passing it into API functions or receiving it from API
+functions. This is very similar to the way @samp{FILE *} values are used
+with the @code{<stdio.h>} library routines.
+
+@item typedef struct awk_element @{
+@itemx @ @ @ @ /* convenience linked list pointer, not used by gawk */
+@itemx @ @ @ @ struct awk_element *next;
+@itemx @ @ @ @ enum @{
+@itemx @ @ @ @ @ @ @ @ AWK_ELEMENT_DEFAULT = 0,@ @ /* set by gawk */
+@itemx @ @ @ @ @ @ @ @ AWK_ELEMENT_DELETE = 1@ @ @ @ /* set by extension if should be deleted */
+@itemx @ @ @ @ @} flags;
+@itemx @ @ @ @ awk_value_t index;
+@itemx @ @ @ @ awk_value_t value;
+@itemx @} awk_element_t;
+The @code{awk_element_t} is a ``flattened''
+array element. @command{gawk} produces an array of these
+inside the @code{awk_flat_array_t} (see the next item).
+Individual elements may be marked for deletion. New elements must be added
+individually, one at a time, using the separate API for that purpose.
+The fields are as follows:
+
+@c nested table
+@table @code
+@item struct awk_element *next;
+This pointer is for the convenience of extension writers. It allows
+an extension to create a linked list of new elements which can then be
+added to an array in a loop that traverses the list.
+
+@item enum @{ @dots{} @} flags;
+A set of flag values that convey information between @command{gawk}
+and the extension. Currently there is only one: @code{AWK_ELEMENT_DELETE},
+which the extension can set to cause @command{gawk} to delete the
+element from the original array upon release of the flattened array.
+
+@item index
+@itemx value
+The index and value of the element, respectively.
+@emph{All} memory pointed to by @code{index} and @code{value} belongs to @command{gawk}.
+@end table
+
+@item typedef struct awk_flat_array @{
+@itemx @ @ @ @ awk_const void *awk_const opaque1;@ @ @ @ /* private data for use by gawk */
+@itemx @ @ @ @ awk_const void *awk_const opaque2;@ @ @ @ /* private data for use by gawk */
+@itemx @ @ @ @ awk_const size_t count;@ @ @ @ @ /* how many elements */
+@itemx @ @ @ @ awk_element_t elements[1];@ @ /* will be extended */
+@itemx @} awk_flat_array_t;
+This is a flattened array. When an extension gets one of these
+from @command{gawk}, the @code{elements} array is of actual
+size @code{count}.
+The @code{opaque1} and @code{opaque2} pointers are for use by @command{gawk};
+therefore they are marked @code{awk_const} so that the extension cannot
+modify them.
+@end table
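+
+The @code{elements[1]} member with its ``will be extended'' comment
+suggests the traditional C 90 idiom of allocating a structure together
+with extra room for a trailing array. The following standalone sketch
+shows that idiom with hypothetical @code{model_} types; it is an
+assumption about the general technique, not a description of how
+@command{gawk} actually allocates an @code{awk_flat_array_t}:
+
```c
#include <stdlib.h>

typedef struct {
    int index;
    int value;
} model_element_t;

typedef struct {
    size_t count;                   /* how many elements */
    model_element_t elements[1];    /* will be extended */
} model_flat_array_t;

/*
 * model_alloc_flat --- allocate a flat array with room for count
 * elements. One element is already part of the structure, so only
 * count - 1 additional ones are requested.
 */
static model_flat_array_t *
model_alloc_flat(size_t count)
{
    model_flat_array_t *flat;
    size_t extra = (count > 1) ? count - 1 : 0;
    size_t i;

    flat = malloc(sizeof(*flat) + extra * sizeof(model_element_t));
    if (flat == NULL)
        return NULL;

    flat->count = count;
    for (i = 0; i < count; i++) {
        flat->elements[i].index = (int) i + 1;
        flat->elements[i].value = 0;
    }
    return flat;
}
```
+
+@noindent
+Code that receives such a structure loops from zero up to, but not
+including, @code{count}, exactly as if @code{elements} had been
+declared with that many members.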
+
+@node Array Functions
+@subsubsection Array Functions
+
+The following functions relate to individual array elements.
+
+@table @code
+@item awk_bool_t get_element_count(awk_array_t a_cookie, size_t *count);
+For the array represented by @code{a_cookie}, return in @code{*count}
+the number of elements it contains. A subarray counts as a single element.
+Return false if there is an error.
+
+@item awk_bool_t get_array_element(awk_array_t a_cookie,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const awk_value_t *const index,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
+For the array represented by @code{a_cookie}, return in @code{*result}
+the value of the element whose index is @code{index}.
+@code{wanted} specifies the type of value you wish to retrieve.
+Return false if @code{wanted} does not match the actual type or if
+@code{index} is not in the array (@pxref{table-value-types-returned}).
+
+The value for @code{index} can be numeric, in which case @command{gawk}
+converts it to a string. Using non-integral values is possible, but
+requires that you understand how such values are converted to strings
+(@pxref{Conversion}); thus using integral values is safest.
+
+As with @emph{all} strings passed into @code{gawk} from an extension,
+the string value of @code{index} must come from @code{malloc()}, and
+@command{gawk} releases the storage.
+
+@item awk_bool_t set_array_element(awk_array_t a_cookie,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const@ awk_value_t *const index,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const@ awk_value_t *const value);
+In the array represented by @code{a_cookie}, create or modify
+the element whose index is given by @code{index}.
+The @code{ARGV} and @code{ENVIRON} arrays may not be changed.
+
+@item awk_bool_t set_array_element_by_elem(awk_array_t a_cookie,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_element_t element);
+Like @code{set_array_element()}, but take the @code{index} and @code{value}
+from @code{element}. This is a convenience macro.
+
+@item awk_bool_t del_array_element(awk_array_t a_cookie,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const awk_value_t* const index);
+Remove the element with the given index from the array
+represented by @code{a_cookie}.
+Return true if the element was removed, or false if the element did
+not exist in the array.
+@end table
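+
+The remark about numeric indices deserves a concrete illustration.
+In @command{awk}, integral values are converted to strings as integers,
+while other values are formatted using @code{CONVFMT}, whose default
+is @code{"%.6g"}. The following standalone sketch mimics that rule.
+The function @code{model_index_to_string()} is hypothetical, and it
+hard-codes the default @code{CONVFMT} rather than consulting the real
+variable:
+
```c
#include <stdio.h>

/*
 * model_index_to_string --- hypothetical helper that mimics how a
 * numeric array index becomes a string, assuming the default CONVFMT.
 * Returns the value returned by snprintf().
 */
static int
model_index_to_string(double num, char *buf, size_t size)
{
    /* Range check first so the cast below is always well defined. */
    if (num > -1e15 && num < 1e15 && (double) (long long) num == num)
        return snprintf(buf, size, "%lld", (long long) num);

    /* Non-integral value: formatted as by the default CONVFMT. */
    return snprintf(buf, size, "%.6g", num);
}
```
+
+@noindent
+With this rule, @code{3.0} becomes @code{"3"}, but @code{3.14159265}
+becomes @code{"3.14159"}, and both @code{0.3} and @code{0.1 + 0.2} end up
+as the same index, @code{"0.3"}. This is why integral index values
+are the safest choice.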
+
+The following functions relate to arrays as a whole:
+
+@table @code
+@item awk_array_t create_array();
+Create a new array to which elements may be added.
+@xref{Creating Arrays}, for a discussion of how to
+create a new array and add elements to it.
+
+@item awk_bool_t clear_array(awk_array_t a_cookie);
+Clear the array represented by @code{a_cookie}.
+Return false if there was some kind of problem, true otherwise.
+The array remains an array, but after calling this function, it
+has no elements. This is equivalent to using the @code{delete}
+statement (@pxref{Delete}).
+
+@item awk_bool_t flatten_array(awk_array_t a_cookie, awk_flat_array_t **data);
+For the array represented by @code{a_cookie}, create an @code{awk_flat_array_t}
+structure and fill it in. Set the pointer whose address is passed as @code{data}
+to point to this structure.
+Return true upon success, or false otherwise.
+@xref{Flattening Arrays}, for a discussion of how to
+flatten an array and work with it.
+
+@item awk_bool_t release_flattened_array(awk_array_t a_cookie,
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_flat_array_t *data);
+When done with a flattened array, release the storage using this function.
+You must pass in both the original array cookie, and the address of
+the created @code{awk_flat_array_t} structure.
+The function returns true upon success, false otherwise.
+@end table
+
+@node Flattening Arrays
+@subsubsection Working With All The Elements of an Array
+
+To @dfn{flatten} an array is to create a structure that
+represents the full array in a fashion that makes it easy
+for C code to traverse the entire array. Test code
+in @file{extension/testext.c} does this, and also serves
+as a nice example to show how to use the APIs.
+
+First, the @command{gawk} script that drives the test extension:
+
+@example
+@@load "testext"
+BEGIN @{
+ n = split("blacky rusty sophie raincloud lucky", pets)
+ printf "pets has %d elements\n", length(pets)
+ ret = dump_array_and_delete("pets", "3")
+ printf "dump_array_and_delete(pets) returned %d\n", ret
+ if ("3" in pets)
+ printf("dump_array_and_delete() did NOT remove index \"3\"!\n")
+ else
+ printf("dump_array_and_delete() did remove index \"3\"!\n")
+ print ""
+@}
+@end example
+
+@noindent
+This code creates an array with @code{split()} (@pxref{String Functions})
+and then calls @code{dump_array_and_delete()}. That function looks up
+the array whose name is passed as the first argument, and
+deletes the element at the index passed in the second argument.
+It then prints the return value and checks if the element
+was indeed deleted. Here is the C code that implements
+@code{dump_array_and_delete()}. It has been edited slightly for
+presentation.
+
+The first part declares variables, sets up the default
+return value in @code{result}, and checks that the function
+was called with the correct number of arguments:
+
+@example
+static awk_value_t *
+dump_array_and_delete(int nargs, awk_value_t *result)
+@{
+ awk_value_t value, value2, value3;
+ awk_flat_array_t *flat_array;
+ size_t count;
+ char *name;
+ int i;
+
+ assert(result != NULL);
+ make_number(0.0, result);
+
+ if (nargs != 2) @{
+ printf("dump_array_and_delete: nargs not right "
+ "(%d should be 2)\n", nargs);
+ goto out;
+ @}
+@end example
+
+The function then proceeds in steps, as follows. First, retrieve
+the name of the array, passed as the first argument. Then
+retrieve the array itself. If either operation fails, print
+error messages and return:
+
+@example
+ /* get argument named array as flat array and print it */
+ if (get_argument(0, AWK_STRING, & value)) @{
+ name = value.str_value.str;
+ if (sym_lookup(name, AWK_ARRAY, & value2))
+ printf("dump_array_and_delete: sym_lookup of %s passed\n",
+ name);
+ else @{
+ printf("dump_array_and_delete: sym_lookup of %s failed\n",
+ name);
+ goto out;
+ @}
+ @} else @{
+ printf("dump_array_and_delete: get_argument(0) failed\n");
+ goto out;
+ @}
+@end example
+
+For testing purposes and to make sure that the C code sees
+the same number of elements as the @command{awk} code,
+the second step is to get the count of elements in the array
+and print it:
+
+@example
+ if (! get_element_count(value2.array_cookie, & count)) @{
+ printf("dump_array_and_delete: get_element_count failed\n");
+ goto out;
+ @}
+
+ printf("dump_array_and_delete: incoming size is %lu\n",
+ (unsigned long) count);
+@end example
+
+The third step is to actually flatten the array, and then
+to double check that the count in the @code{awk_flat_array_t}
+is the same as the count just retrieved:
+
+@example
+ if (! flatten_array(value2.array_cookie, & flat_array)) @{
+ printf("dump_array_and_delete: could not flatten array\n");
+ goto out;
+ @}
+
+ if (flat_array->count != count) @{
+ printf("dump_array_and_delete: flat_array->count (%lu)"
+ " != count (%lu)\n",
+ (unsigned long) flat_array->count,
+ (unsigned long) count);
+ goto out;
+ @}
+@end example
+
+The fourth step is to retrieve the index of the element
+to be deleted, which was passed as the second argument.
+Remember that argument counts passed to @code{get_argument()}
+are zero-based, thus the second argument is numbered one:
+
+@example
+ if (! get_argument(1, AWK_STRING, & value3)) @{
+ printf("dump_array_and_delete: get_argument(1) failed\n");
+ goto out;
+ @}
+@end example
+
+The fifth step is where the ``real work'' is done. The function
+loops over every element in the array, printing the index and
+element values. In addition, upon finding the element with the
+index that is supposed to be deleted, the function sets the
+@code{AWK_ELEMENT_DELETE} bit in the @code{flags} field
+of the element. When the array is released, @command{gawk}
+traverses the flattened array, and deletes any elements which
+have this flag bit set:
+
+@example
+ for (i = 0; i < flat_array->count; i++) @{
+ printf("\t%s[\"%.*s\"] = %s\n",
+ name,
+ (int) flat_array->elements[i].index.str_value.len,
+ flat_array->elements[i].index.str_value.str,
+ valrep2str(& flat_array->elements[i].value));
+
+ if (strcmp(value3.str_value.str,
+ flat_array->elements[i].index.str_value.str)
+ == 0) @{
+ flat_array->elements[i].flags |= AWK_ELEMENT_DELETE;
+ printf("dump_array_and_delete: marking element \"%s\" "
+ "for deletion\n",
+ flat_array->elements[i].index.str_value.str);
+ @}
+ @}
+@end example
+
+The sixth step is to release the flattened array. This tells
+@command{gawk} that the extension is no longer using the array,
+and that it should delete any elements marked for deletion.
+@command{gawk} also frees any storage that was allocated,
+so you should not use the pointer (@code{flat_array} in this
+code) once you have called @code{release_flattened_array()}:
+
+@example
+ if (! release_flattened_array(value2.array_cookie, flat_array)) @{
+ printf("dump_array_and_delete: could not release flattened array\n");
+ goto out;
+ @}
+@end example
+
+Finally, since everything was successful, the function sets the
+return value to success, and returns:
+
+@example
+ make_number(1.0, result);
+out:
+ return result;
+@}
+@end example
+
+Here is the output from running this part of the test:
+
+@example
+pets has 5 elements
+dump_array_and_delete: sym_lookup of pets passed
+dump_array_and_delete: incoming size is 5
+ pets["1"] = "blacky"
+ pets["2"] = "rusty"
+ pets["3"] = "sophie"
+dump_array_and_delete: marking element "3" for deletion
+ pets["4"] = "raincloud"
+ pets["5"] = "lucky"
+dump_array_and_delete(pets) returned 1
+dump_array_and_delete() did remove index "3"!
+@end example
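+
+The mark-then-release pattern used above can be reduced to the
+following standalone model. It is a simplified sketch: the
+@code{model_} names are hypothetical, indices and values are plain C
+strings, and the ``release'' step compacts the flattened copy itself,
+whereas @command{gawk} deletes the marked elements from the original
+array:
+
```c
#include <stddef.h>
#include <string.h>

enum { MODEL_ELEMENT_DEFAULT = 0, MODEL_ELEMENT_DELETE = 1 };

typedef struct {
    int flags;
    const char *index;
    const char *value;
} model_flat_element_t;

/* Step 1: walk every element, marking those whose index matches. */
static size_t
model_mark(model_flat_element_t *elems, size_t count, const char *index)
{
    size_t i, marked = 0;

    for (i = 0; i < count; i++) {
        if (strcmp(elems[i].index, index) == 0) {
            elems[i].flags |= MODEL_ELEMENT_DELETE;
            marked++;
        }
    }
    return marked;
}

/* Step 2: on "release", drop marked elements; return how many remain. */
static size_t
model_release(model_flat_element_t *elems, size_t count)
{
    size_t i, kept = 0;

    for (i = 0; i < count; i++) {
        if ((elems[i].flags & MODEL_ELEMENT_DELETE) == 0)
            elems[kept++] = elems[i];
    }
    return kept;
}
```
+
+@noindent
+Separating the two steps lets the extension examine every element in a
+single pass and defer the actual deletions until it has finished with
+the flattened data.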
+
+@node Creating Arrays
+@subsubsection How To Create and Populate Arrays
+
+Besides working with arrays created by @command{awk} code, you can
+create arrays and populate them as you see fit, and then @command{awk}
+code can access them and manipulate them.
+
+There are two important points about creating arrays from extension code:
+
+@enumerate 1
+@item
+You must install a new array into @command{gawk}'s symbol
+table immediately upon creating it. Once you have done so,
+you can then populate the array.
+
+@ignore
+Strictly speaking, this is required only
+for arrays that will have subarrays as elements; however it is
+a good idea to always do this. This restriction may be relaxed
+in a subsequent revision of the API.
+@end ignore
+
+Similarly, if installing a new array as a subarray of an existing array,
+you must add the new array to its parent before adding any elements to it.
+
+Thus, the correct way to build an array is to work ``top down.'' Create
+the array, and immediately install it in @command{gawk}'s symbol table
+using @code{sym_update()}, or install it as an element in a previously
+existing array using @code{set_array_element()}. Example code is coming shortly.
+
+@item
+Due to @command{gawk} internals, after using @code{sym_update()} to install an array
+into @command{gawk}, you have to retrieve the array cookie from the value
+passed in to @code{sym_update()} before doing anything else with it, like so:
+
+@example
+awk_value_t val;
+awk_array_t new_array;
+
+new_array = create_array();
+val.val_type = AWK_ARRAY;
+val.array_cookie = new_array;
+
+/* install array in the symbol table */
+sym_update("array", & val);
+
+new_array = val.array_cookie; /* YOU MUST DO THIS */
+@end example
+
+If installing an array as a subarray, you must also retrieve the value
+of the array cookie after the call to @code{set_array_element()}.
+@end enumerate
+
+The following C code is a simple test extension to create an array
+with two regular elements and with a subarray. The leading @samp{#include}
+directives and boilerplate variable declarations are omitted for brevity.
+The first step is to create a new array and then install it
+in the symbol table:
+
+@example
+@ignore
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <stdio.h>
+#include <assert.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include "gawkapi.h"
+
+static const gawk_api_t *api; /* for convenience macros to work */
+static awk_ext_id_t *ext_id;
+static const char *ext_version = "testarray extension: version 1.0";
+
+int plugin_is_GPL_compatible;
+
+@end ignore
+/* create_new_array --- create a named array */
+
+static void
+create_new_array()
+@{
+ awk_array_t a_cookie;
+ awk_array_t subarray;
+ awk_value_t index, value;
+
+ a_cookie = create_array();
+ value.val_type = AWK_ARRAY;
+ value.array_cookie = a_cookie;
+
+ if (! sym_update("new_array", & value))
+ printf("create_new_array: sym_update(\"new_array\") failed!\n");
+ a_cookie = value.array_cookie;
+@end example
+
+@noindent
+Note how @code{a_cookie} is reset from the @code{array_cookie} field in
+the @code{value} structure.
+
+The second step is to install two regular values into @code{new_array}:
+
+@example
+ (void) make_const_string("hello", 5, & index);
+ (void) make_const_string("world", 5, & value);
+ if (! set_array_element(a_cookie, & index, & value)) @{
+ printf("fill_in_array: set_array_element failed\n");
+ return;
+ @}
+
+ (void) make_const_string("answer", 6, & index);
+ (void) make_number(42.0, & value);
+ if (! set_array_element(a_cookie, & index, & value)) @{
+ printf("fill_in_array: set_array_element failed\n");
+ return;
+ @}
+@end example
+
+The third step is to create the subarray and install it:
+
+@example
+ (void) make_const_string("subarray", 8, & index);
+ subarray = create_array();
+ value.val_type = AWK_ARRAY;
+ value.array_cookie = subarray;
+ if (! set_array_element(a_cookie, & index, & value)) @{
+ printf("fill_in_array: set_array_element failed\n");
+ return;
+ @}
+ subarray = value.array_cookie;
+@end example
+
+The final step is to populate the subarray with its own element:
+
+@example
+ (void) make_const_string("foo", 3, & index);
+ (void) make_const_string("bar", 3, & value);
+ if (! set_array_element(subarray, & index, & value)) @{
+ printf("fill_in_array: set_array_element failed\n");
+ return;
+ @}
+@}
+@ignore
+static awk_ext_func_t func_table[] = @{
+ @{ NULL, NULL, 0 @}
+@};
+
+/* init_testarray --- additional initialization function */
+
+static awk_bool_t init_testarray(void)
+@{
+ create_new_array();
+
+ return 1;
+@}
+
+static awk_bool_t (*init_func)(void) = init_testarray;
+
+dl_load_func(func_table, testarray, "")
+@end ignore
+@end example
+
+Here is a sample script that loads the extension
+and then dumps the array:
+
+@example
+@@load "subarray"
+
+function dumparray(name, array, i)
+@{
+ for (i in array)
+ if (isarray(array[i]))
+ dumparray(name "[\"" i "\"]", array[i])
+ else
+ printf("%s[\"%s\"] = %s\n", name, i, array[i])
+@}
+
+BEGIN @{
+ dumparray("new_array", new_array);
+@}
+@end example
+
+Here is the result of running the script:
+
+@example
+$ @kbd{AWKLIBPATH=$PWD ./gawk -f subarray.awk}
+@print{} new_array["subarray"]["foo"] = bar
+@print{} new_array["hello"] = world
+@print{} new_array["answer"] = 42
+@end example
+
+@noindent
+(@xref{Finding Extensions}, for more information on the
+@env{AWKLIBPATH} environment variable.)
+
+@node Extension API Variables
+@subsection API Variables
+
+The API provides two sets of variables. The first provides information
+about the version of the API (both the version with which the extension
+was compiled and the version with which @command{gawk} was compiled).
+The second provides information about how @command{gawk} was invoked.
+
+@menu
+* Extension Versioning:: API Version information.
+* Extension API Informational Variables:: Variables providing information about
+ @command{gawk}'s invocation.
+@end menu
+
+@node Extension Versioning
+@subsubsection API Version Constants and Variables
+
+The API provides both a ``major'' and a ``minor'' version number.
+The API versions are available at compile time as constants:
+
+@table @code
+@item GAWK_API_MAJOR_VERSION
+The major version of the API.
+
+@item GAWK_API_MINOR_VERSION
+The minor version of the API.
+@end table
+
+The minor version increases when new functions are added to the API. Such
+new functions are always added to the end of the API @code{struct}.
+
+The major version increases (and the minor version is reset to zero) if any
+of the data types change size or member order, or if any of the existing
+functions change signature.
+
+An extension may be compiled against one version
+of the API but loaded by a @command{gawk} that uses a different
+version. For this reason, the major and minor API versions of the
+running @command{gawk} are included in the API @code{struct} as read-only
+constant integers:
+
+@table @code
+@item api->major_version
+The major version of the running @command{gawk}.
+
+@item api->minor_version
+The minor version of the running @command{gawk}.
+@end table
+
+It is up to the extension to decide if there are API incompatibilities.
+Typically a check like this is enough:
+
+@example
+if (api->major_version != GAWK_API_MAJOR_VERSION
+ || api->minor_version < GAWK_API_MINOR_VERSION) @{
+ fprintf(stderr, "foo_extension: version mismatch with gawk!\n");
+ fprintf(stderr, "\tmy version (%d, %d), gawk version (%d, %d)\n",
+ GAWK_API_MAJOR_VERSION, GAWK_API_MINOR_VERSION,
+ api->major_version, api->minor_version);
+ exit(1);
+@}
+@end example
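+
+The rule that this check implements can be isolated in a small function.
+The following is a minimal sketch, not code from @file{gawkapi.h}: the
+function @code{api_versions_compatible()} and its parameter names are
+hypothetical, and the version numbers are passed as plain integers so
+that the sketch compiles without the @command{gawk} headers:
+
```c
#include <stdbool.h>

/*
 * api_versions_compatible --- hypothetical helper, not part of gawkapi.h.
 * ext_major/ext_minor: versions the extension was compiled against.
 * gawk_major/gawk_minor: versions reported by the running gawk.
 */
static bool
api_versions_compatible(int ext_major, int ext_minor,
                        int gawk_major, int gawk_minor)
{
    /* A different major version means data types or signatures changed. */
    if (gawk_major != ext_major)
        return false;

    /* An older minor version may lack functions the extension calls. */
    if (gawk_minor < ext_minor)
        return false;

    return true;
}
```
+
+With this rule, an extension built against a hypothetical version (1, 0)
+is accepted by a @command{gawk} that reports (1, 2), but rejected by one
+that reports (2, 0).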
+
+Such code is included in the boilerplate @code{dl_load_func()} macro
+provided in @file{gawkapi.h} (discussed later, in
+@ref{Extension API Boilerplate}).
+
+@node Extension API Informational Variables
+@subsubsection Informational Variables
+
+The API provides access to several variables that describe
+whether the corresponding command-line options were enabled when
+@command{gawk} was invoked. The variables are:
+
+@table @code
+@item do_lint
+This variable is true if @command{gawk} was invoked with the @option{--lint} option
+(@pxref{Options}).
+
+@item do_traditional
+This variable is true if @command{gawk} was invoked with the @option{--traditional} option.
+
+@item do_profile
+This variable is true if @command{gawk} was invoked with the @option{--profile} option.
+
+@item do_sandbox
+This variable is true if @command{gawk} was invoked with the @option{--sandbox} option.
+
+@item do_debug
+This variable is true if @command{gawk} was invoked with the @option{--debug} option.
+
+@item do_mpfr
+This variable is true if @command{gawk} was invoked with the @option{--bignum} option.
+@end table
+
+The value of @code{do_lint} can change if @command{awk} code
+modifies the @code{LINT} built-in variable (@pxref{Built-in Variables}).
+The others should not change during execution.
+
+@node Extension API Boilerplate
+@subsection Boilerplate Code
+
+As mentioned earlier (@pxref{Extension Mechanism Outline}), the function
+definitions as presented are really macros. To use these macros, your
+extension must provide a small amount of boilerplate code (variables and
+functions) towards the top of your source file, using pre-defined names
+as described below. The boilerplate needed is also provided in comments
+in the @file{gawkapi.h} header file:
+
+@example
+/* Boiler plate code: */
+int plugin_is_GPL_compatible;
+
+static const gawk_api_t *api;
+static awk_ext_id_t ext_id;
+static const char *ext_version = NULL; /* or @dots{} = "some string" */
+
+static awk_ext_func_t func_table[] = @{
+ @{ "name", do_name, 1 @},
+ /* @dots{} */
+@};
+
+/* EITHER: */
+
+static awk_bool_t (*init_func)(void) = NULL;
+
+/* OR: */
+
+static awk_bool_t
+init_my_module(void)
+@{
+ @dots{}
+@}
+
+static awk_bool_t (*init_func)(void) = init_my_module;
+
+dl_load_func(func_table, some_name, "name_space_in_quotes")
+@end example
+
+These variables and functions are as follows:
+
+@table @code
+@item int plugin_is_GPL_compatible;
+This asserts that the extension is compatible with the GNU GPL
+(@pxref{Copying}). If your extension does not have this, @command{gawk}
+will not load it (@pxref{Plugin License}).
+
+@item static const gawk_api_t *api;
+This global @code{static} variable should be set to point to
+the @code{gawk_api_t} pointer that @command{gawk} passes to your
+@code{dl_load()} function. This variable is used by all of the macros.
+
+@item static awk_ext_id_t ext_id;
+This global static variable should be set to the @code{awk_ext_id_t}
+value that @command{gawk} passes to your @code{dl_load()} function.
+This variable is used by all of the macros.
+
+@item static const char *ext_version = NULL; /* or @dots{} = "some string" */
+This global @code{static} variable should be set either
+to @code{NULL}, or to point to a string giving the name and version of
+your extension.
+
+@item static awk_ext_func_t func_table[] = @{ @dots{} @};
+This is an array of one or more @code{awk_ext_func_t} structures
+as described earlier (@pxref{Extension Functions}).
+It can then be looped over for multiple calls to
+@code{add_ext_func()}.
+
+@item static awk_bool_t (*init_func)(void) = NULL;
+@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @r{OR}
+@itemx static awk_bool_t init_my_module(void) @{ @dots{} @}
+@itemx static awk_bool_t (*init_func)(void) = init_my_module;
+If you need to do some initialization work, you should define a
+function that does it (creates variables, opens files, etc.)
+and then define the @code{init_func} pointer to point to your
+function.
+The function should return zero (false) upon failure, non-zero
+(success) if everything goes well.
+
+If you don't need to do any initialization, define the pointer and
+initialize it to @code{NULL}.
+
+@item dl_load_func(func_table, some_name, "name_space_in_quotes")
+This macro expands to a @code{dl_load()} function that performs
+all the necessary initializations.
+@end table
+
+The point of all the variables and arrays is to let the
+@code{dl_load()} function (from the @code{dl_load_func()}
+macro) do all the standard work. It does the following:
+
+@enumerate 1
+@item
+Check the API versions. If the extension major version does not match
+@command{gawk}'s, or if the extension minor version is greater than
+@command{gawk}'s, it prints a fatal error message and exits.
+
+@item
+Load the functions defined in @code{func_table}.
+If any of them fails to load, it prints a warning message but
+continues on.
+
+@item
+If the @code{init_func} pointer is not @code{NULL}, call the
+function it points to. If it returns zero (false), print a
+warning message.
+
+@item
+If @code{ext_version} is not @code{NULL}, register
+the version string with @command{gawk}.
+@end enumerate
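+
+The sequence above can be sketched as a self-contained mock. This is
+not the expansion of @code{dl_load_func()}: every @code{sketch_} name
+is invented for illustration, the function signature is simplified,
+and a version mismatch returns @minus{}1 instead of exiting. A zero
+return from the initialization function is treated as failure, following
+the description of @code{init_func} earlier in this @value{SECTION}:
+
```c
#include <stdio.h>
#include <stddef.h>

/* All names here are stand-ins for illustration; none come from gawkapi.h. */
typedef int sketch_bool_t;

typedef struct {
    const char *name;       /* name of the function at the awk level */
    int (*function)(int);   /* C implementation (simplified signature) */
} sketch_func_t;

/* Trivial implementation and initialization functions for experiments. */
static int sketch_impl(int nargs) { return nargs; }
static sketch_bool_t sketch_init_ok(void) { return 1; }
static sketch_bool_t sketch_init_fail(void) { return 0; }

/* Pretend registration: fails if a table entry is incomplete. */
static sketch_bool_t
sketch_add_func(const sketch_func_t *f)
{
    return f->name != NULL && f->function != NULL;
}

/*
 * sketch_load --- follow the four steps; return the number of warnings,
 * or -1 for a version mismatch.
 */
static int
sketch_load(int ext_major, int ext_minor, int gawk_major, int gawk_minor,
            const sketch_func_t *table, size_t n,
            sketch_bool_t (*init)(void), const char *version)
{
    int warnings = 0;
    size_t i;

    /* Step 1: check the API versions. */
    if (gawk_major != ext_major || gawk_minor < ext_minor)
        return -1;

    /* Step 2: register each function; warn on failure but keep going. */
    for (i = 0; i < n; i++) {
        if (! sketch_add_func(& table[i])) {
            fprintf(stderr, "warning: could not add %s\n",
                    table[i].name != NULL ? table[i].name : "(null)");
            warnings++;
        }
    }

    /* Step 3: optional initialization; zero (false) means failure. */
    if (init != NULL && ! init()) {
        fprintf(stderr, "warning: initialization function failed\n");
        warnings++;
    }

    /* Step 4: register the version string, if there is one. */
    if (version != NULL)
        printf("registered: %s\n", version);

    return warnings;
}
```
+
+Only a version mismatch stops the load in this sketch. A function that
+cannot be registered or an initialization function that fails merely
+adds to the warning count.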
+
+@node Finding Extensions
+@subsection How @command{gawk} Finds Extensions
+
+Compiled extensions have to be installed in a directory where
+@command{gawk} can find them. If @command{gawk} is configured and
+built in the default fashion, the directory in which to find
+extensions is @file{/usr/local/lib/gawk}. You can also specify a search
+path with a list of directories to search for compiled extensions.
+@xref{AWKLIBPATH Variable}, for more information.
+
+@node Extension Example
+@section Example: Some File Functions
+
+@quotation
+@i{No matter where you go, there you are.} @*
+Buckaroo Banzai
+@end quotation
+
+@c It's enough to show chdir and stat, no need for fts
+
+Two useful functions that are not in @command{awk} are @code{chdir()} (so
+that an @command{awk} program can change its directory) and @code{stat()}
+(so that an @command{awk} program can gather information about a file).
+This @value{SECTION} implements these functions for @command{gawk}
+in an extension.
@menu
* Internal File Description:: What the new functions will do.
@@ -28121,13 +30613,13 @@ external extension library.
@node Internal File Description
@subsection Using @code{chdir()} and @code{stat()}
-This @value{SECTION} shows how to use the new functions at the @command{awk}
-level once they've been integrated into the running @command{gawk}
-interpreter.
-Using @code{chdir()} is very straightforward. It takes one argument,
-the new directory to change to:
+This @value{SECTION} shows how to use the new functions at
+the @command{awk} level once they've been integrated into the
+running @command{gawk} interpreter. Using @code{chdir()} is very
+straightforward. It takes one argument, the new directory to change to:
@example
+@@load "filefuncs"
@dots{}
newdir = "/home/arnold/funstuff"
ret = chdir(newdir)
@@ -28139,21 +30631,18 @@ if (ret < 0) @{
@dots{}
@end example
-The return value is negative if the @code{chdir} failed,
-and @code{ERRNO}
-(@pxref{Built-in Variables})
-is set to a string indicating the error.
+The return value is negative if the @code{chdir()} failed, and
+@code{ERRNO} (@pxref{Built-in Variables}) is set to a string indicating
+the error.
-Using @code{stat()} is a bit more complicated.
-The C @code{stat()} function fills in a structure that has a fair
-amount of information.
+Using @code{stat()} is a bit more complicated. The C @code{stat()}
+function fills in a structure that has a fair amount of information.
The right way to model this in @command{awk} is to fill in an associative
array with the appropriate information:
@c broke printf for page breaking
@example
file = "/home/arnold/.profile"
-fdata[1] = "x" # force `fdata' to be an array
ret = stat(file, fdata)
if (ret < 0) @{
printf("could not stat %s: %s\n",
@@ -28198,11 +30687,11 @@ be a function of the file's size if the file has holes.
The file's last access, modification, and inode update times,
respectively. These are numeric timestamps, suitable for formatting
with @code{strftime()}
-(@pxref{Built-in}).
+(@pxref{Time Functions}).
@item "pmode"
The file's ``printable mode.'' This is a string representation of
-the file's type and permissions, such as what is produced by
+the file's type and permissions, such as is produced by
@samp{ls -l}---for example, @code{"drwxr-xr-x"}.
@item "type"
@@ -28263,64 +30752,96 @@ of that number, respectively.
@node Internal File Ops
@subsection C Code for @code{chdir()} and @code{stat()}
-Here is the C code for these extensions. They were written for
-GNU/Linux. The code needs some more work for complete portability
-to other POSIX-compliant systems:@footnote{This version is edited
-slightly for presentation. See
-@file{extension/filefuncs.c} in the @command{gawk} distribution
-for the complete version.}
+Here is the C code for these extensions.@footnote{This version is
+edited slightly for presentation. See @file{extension/filefuncs.c}
+in the @command{gawk} distribution for the complete version.}
+
+The file includes a number of standard header files, and then includes
+the @file{gawkapi.h} header file which provides the API definitions.
+Those are followed by the necessary variable declarations
+to make use of the API macros and boilerplate code
+(@pxref{Extension API Boilerplate}).
@c break line for page breaking
@example
-#include "awk.h"
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <stdio.h>
+#include <assert.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <sys/types.h>
+#include <sys/stat.h>
-#include <sys/sysmacros.h>
+#include "gawkapi.h"
+
+#include "gettext.h"
+#define _(msgid) gettext(msgid)
+#define N_(msgid) msgid
+
+#include "gawkfts.h"
+#include "stack.h"
+
+static const gawk_api_t *api; /* for convenience macros to work */
+static awk_ext_id_t *ext_id;
+static awk_bool_t init_filefuncs(void);
+static awk_bool_t (*init_func)(void) = init_filefuncs;
+static const char *ext_version = "filefuncs extension: version 1.0";
int plugin_is_GPL_compatible;
+@end example
+@cindex programming conventions, @command{gawk} internals
+By convention, for an @command{awk} function @code{foo()}, the C function
+that implements it is called @code{do_foo()}. The function should have
+two arguments: the first is an @code{int} usually called @code{nargs},
+that represents the number of actual arguments for the function.
+The second is a pointer to an @code{awk_value_t}, usually named
+@code{result}.
+
+@example
/* do_chdir --- provide dynamically loaded chdir() builtin for gawk */
-static NODE *
-do_chdir(int nargs)
+static awk_value_t *
+do_chdir(int nargs, awk_value_t *result)
@{
- NODE *newdir;
+ awk_value_t newdir;
int ret = -1;
- if (do_lint && nargs != 1)
- lintwarn("chdir: called with incorrect number of arguments");
+ assert(result != NULL);
- newdir = get_scalar_argument(0, FALSE);
+ if (do_lint && nargs != 1)
+ lintwarn(ext_id,
+ _("chdir: called with incorrect number of arguments, "
+ "expecting 1"));
@end example
-The file includes the @code{"awk.h"} header file for definitions
-for the @command{gawk} internals. It includes @code{<sys/sysmacros.h>}
-for access to the @code{major()} and @code{minor}() macros.
-
-@cindex programming conventions, @command{gawk} internals
-By convention, for an @command{awk} function @code{foo}, the function that
-implements it is called @samp{do_foo}. The function should take
-a @samp{int} argument, usually called @code{nargs}, that
-represents the number of defined arguments for the function. The @code{newdir}
+The @code{newdir}
variable represents the new directory to change to, retrieved
-with @code{get_scalar_argument()}. Note that the first argument is
+with @code{get_argument()}. Note that the first argument is
numbered zero.
-This code actually accomplishes the @code{chdir()}. It first forces
-the argument to be a string and passes the string value to the
+If the argument is retrieved successfully, the function calls the
@code{chdir()} system call. If the @code{chdir()} fails, @code{ERRNO}
is updated.
@example
- (void) force_string(newdir);
- ret = chdir(newdir->stptr);
- if (ret < 0)
- update_ERRNO_int(errno);
+ if (get_argument(0, AWK_STRING, & newdir)) @{
+ ret = chdir(newdir.str_value.str);
+ if (ret < 0)
+ update_ERRNO_int(errno);
+ @}
@end example
Finally, the function returns the return value to the @command{awk} level:
@example
- return make_number((AWKNUM) ret);
+ return make_number(ret, result);
@}
@end example
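+
+The error-handling pattern in @code{do_chdir()} can be tried outside
+of @command{gawk}. The following is a minimal sketch under stated
+assumptions: @code{try_chdir()} is a hypothetical helper, and copying
+the text from @code{strerror()} into a caller-supplied buffer only
+stands in for what @code{update_ERRNO_int()} does for @code{ERRNO}:
+
```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * try_chdir --- hypothetical standalone helper, not part of the extension.
 * Returns the result of chdir(); on failure, copies the system's
 * error text into msg (at most msglen - 1 characters).
 */
static int
try_chdir(const char *dir, char *msg, size_t msglen)
{
    int ret = chdir(dir);
    int saved_errno = errno;    /* capture before any other library call */

    if (msglen > 0) {
        msg[0] = '\0';          /* empty message means success */
        if (ret < 0) {
            strncpy(msg, strerror(saved_errno), msglen - 1);
            msg[msglen - 1] = '\0';
        }
    }
    return ret;
}
```
+
+As in the extension, a negative return value signals failure, and the
+error text is only meaningful in that case.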
@@ -28339,7 +30860,168 @@ format_mode(unsigned long fmode)
@}
@end example
-Next comes the @code{do_stat()} function. It starts with
+Next comes a function for reading symbolic links, which is also
+omitted here for brevity:
+
+@example
+/* read_symlink --- read a symbolic link into an allocated buffer.
+ @dots{} */
+
+static char *
+read_symlink(const char *fname, size_t bufsize, ssize_t *linksize)
+@{
+ @dots{}
+@}
+@end example
+
+Two helper functions simplify entering values in the
+array that will contain the result of the @code{stat()}:
+
+@example
+/* array_set --- set an array element */
+
+static void
+array_set(awk_array_t array, const char *sub, awk_value_t *value)
+@{
+ awk_value_t index;
+
+ set_array_element(array,
+ make_const_string(sub, strlen(sub), & index),
+ value);
+
+@}
+
+/* array_set_numeric --- set an array element with a number */
+
+static void
+array_set_numeric(awk_array_t array, const char *sub, double num)
+@{
+ awk_value_t tmp;
+
+ array_set(array, sub, make_number(num, & tmp));
+@}
+@end example
+
+The following function does most of the work to fill in
+the @code{awk_array_t} result array with values obtained
+from a valid @code{struct stat}. It is done in a separate function
+to support the @code{stat()} function for @command{gawk} and also
+to support the @code{fts()} extension which is included in
+the same file but whose code is not shown here
+(@pxref{Extension Sample File Functions}).
+
+The first part of the function is variable declarations,
+including a table to map file types to strings:
+
+@example
+/* fill_stat_array --- do the work to fill an array with stat info */
+
+static int
+fill_stat_array(const char *name, awk_array_t array, struct stat *sbuf)
+@{
+ char *pmode; /* printable mode */
+ const char *type = "unknown";
+ awk_value_t tmp;
+ static struct ftype_map @{
+ unsigned int mask;
+ const char *type;
+ @} ftype_map[] = @{
+ @{ S_IFREG, "file" @},
+ @{ S_IFBLK, "blockdev" @},
+ @{ S_IFCHR, "chardev" @},
+ @{ S_IFDIR, "directory" @},
+#ifdef S_IFSOCK
+ @{ S_IFSOCK, "socket" @},
+#endif
+#ifdef S_IFIFO
+ @{ S_IFIFO, "fifo" @},
+#endif
+#ifdef S_IFLNK
+ @{ S_IFLNK, "symlink" @},
+#endif
+#ifdef S_IFDOOR /* Solaris weirdness */
+ @{ S_IFDOOR, "door" @},
+#endif /* S_IFDOOR */
+ @};
+ int j, k;
+@end example
+
+The destination array is cleared, and then code fills in
+various elements based on values in the @code{struct stat}:
+
+@example
+ /* empty out the array */
+ clear_array(array);
+
+ /* fill in the array */
+ array_set(array, "name", make_const_string(name, strlen(name),
+ & tmp));
+ array_set_numeric(array, "dev", sbuf->st_dev);
+ array_set_numeric(array, "ino", sbuf->st_ino);
+ array_set_numeric(array, "mode", sbuf->st_mode);
+ array_set_numeric(array, "nlink", sbuf->st_nlink);
+ array_set_numeric(array, "uid", sbuf->st_uid);
+ array_set_numeric(array, "gid", sbuf->st_gid);
+ array_set_numeric(array, "size", sbuf->st_size);
+ array_set_numeric(array, "blocks", sbuf->st_blocks);
+ array_set_numeric(array, "atime", sbuf->st_atime);
+ array_set_numeric(array, "mtime", sbuf->st_mtime);
+ array_set_numeric(array, "ctime", sbuf->st_ctime);
+
+ /* for block and character devices, add rdev,
+ major and minor numbers */
+ if (S_ISBLK(sbuf->st_mode) || S_ISCHR(sbuf->st_mode)) @{
+ array_set_numeric(array, "rdev", sbuf->st_rdev);
+ array_set_numeric(array, "major", major(sbuf->st_rdev));
+ array_set_numeric(array, "minor", minor(sbuf->st_rdev));
+ @}
+@end example
+
+@noindent
+The latter part of the function makes selective additions
+to the destination array, depending upon the availability of
+certain members and/or the type of the file. It then returns zero,
+for success:
+
+@example
+#ifdef HAVE_ST_BLKSIZE
+ array_set_numeric(array, "blksize", sbuf->st_blksize);
+#endif /* HAVE_ST_BLKSIZE */
+
+ pmode = format_mode(sbuf->st_mode);
+ array_set(array, "pmode", make_const_string(pmode, strlen(pmode),
+ & tmp));
+
+ /* for symbolic links, add a linkval field */
+ if (S_ISLNK(sbuf->st_mode)) @{
+ char *buf;
+ ssize_t linksize;
+
+ if ((buf = read_symlink(name, sbuf->st_size,
+ & linksize)) != NULL)
+ array_set(array, "linkval",
+ make_malloced_string(buf, linksize, & tmp));
+ else
+ warning(ext_id, _("stat: unable to read symbolic link `%s'"),
+ name);
+ @}
+
+ /* add a type field */
+ type = "unknown"; /* shouldn't happen */
+ for (j = 0, k = sizeof(ftype_map)/sizeof(ftype_map[0]); j < k; j++) @{
+ if ((sbuf->st_mode & S_IFMT) == ftype_map[j].mask) @{
+ type = ftype_map[j].type;
+ break;
+ @}
+ @}
+
+ array_set(array, "type", make_const_string(type, strlen(type), &tmp));
+
+ return 0;
+@}
+@end example
+
+Finally, here is the @code{do_stat()} function. It starts with
variable declarations and argument checking:
@ignore
@@ -28349,116 +31031,140 @@ Changed message for page breaking. Used to be:
@example
/* do_stat --- provide a stat() function for gawk */
-static NODE *
-do_stat(int nargs)
+static awk_value_t *
+do_stat(int nargs, awk_value_t *result)
@{
- NODE *file, *array, *tmp;
- struct stat sbuf;
+ awk_value_t file_param, array_param;
+ char *name;
+ awk_array_t array;
int ret;
- NODE **aptr;
- char *pmode; /* printable mode */
- char *type = "unknown";
+ struct stat sbuf;
- if (do_lint && nargs > 2)
- lintwarn("stat: called with too many arguments");
+ assert(result != NULL);
+
+ if (do_lint && nargs != 2) @{
+ lintwarn(ext_id,
+ _("stat: called with wrong number of arguments"));
+ return make_number(-1, result);
+ @}
@end example
Then comes the actual work. First, the function gets the arguments.
-Then, it always clears the array.
+Next, it gets the information for the file.
 The code uses @code{lstat()} (instead of @code{stat()})
to get the file information,
in case the file is a symbolic link.
If there's an error, it sets @code{ERRNO} and returns:
-@c comment made multiline for page breaking
@example
/* file is first arg, array to hold results is second */
- file = get_scalar_argument(0, FALSE);
- array = get_array_argument(1, FALSE);
+ if ( ! get_argument(0, AWK_STRING, & file_param)
+ || ! get_argument(1, AWK_ARRAY, & array_param)) @{
+ warning(ext_id, _("stat: bad parameters"));
+ return make_number(-1, result);
+ @}
- /* empty out the array */
- assoc_clear(array);
+ name = file_param.str_value.str;
+ array = array_param.array_cookie;
+
+ /* always empty out the array */
+ clear_array(array);
/* lstat the file, if error, set ERRNO and return */
- (void) force_string(file);
- ret = lstat(file->stptr, & sbuf);
+ ret = lstat(name, & sbuf);
if (ret < 0) @{
update_ERRNO_int(errno);
- return make_number((AWKNUM) ret);
+ return make_number(ret, result);
@}
@end example
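+
+The reason for calling @code{lstat()} is that it reports on a symbolic
+link itself rather than on the file to which the link points. The
+following is a minimal sketch of that check; @code{path_is_symlink()}
+is a hypothetical helper and is not part of the extension:
+
```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * path_is_symlink --- hypothetical helper.
 * Returns 1 if path itself is a symbolic link, 0 if it is not,
 * and -1 if lstat() fails (errno is left set by lstat()).
 */
static int
path_is_symlink(const char *path)
{
    struct stat sbuf;

    /* lstat() does not follow a final symbolic link; stat() would. */
    if (lstat(path, & sbuf) < 0)
        return -1;

    return S_ISLNK(sbuf.st_mode) ? 1 : 0;
}
```
+
+Had the sketch called @code{stat()} instead, @code{S_ISLNK()} could
+never be true, since @code{stat()} describes the linked-to file.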
-Now comes the tedious part: filling in the array. Only a few of the
-calls are shown here, since they all follow the same pattern:
+The tedious work is done by @code{fill_stat_array()}, shown
+earlier. When done, return the result from @code{fill_stat_array()}:
@example
- /* fill in the array */
- aptr = assoc_lookup(array, tmp = make_string("name", 4));
- *aptr = dupnode(file);
- unref(tmp);
+ ret = fill_stat_array(name, array, & sbuf);
- aptr = assoc_lookup(array, tmp = make_string("mode", 4));
- *aptr = make_number((AWKNUM) sbuf.st_mode);
- unref(tmp);
-
- aptr = assoc_lookup(array, tmp = make_string("pmode", 5));
- pmode = format_mode(sbuf.st_mode);
- *aptr = make_string(pmode, strlen(pmode));
- unref(tmp);
+ return make_number(ret, result);
+@}
@end example
-When done, return the @code{lstat()} return value:
+@cindex programming conventions, @command{gawk} internals
+Finally, it's necessary to provide the ``glue'' that loads the
+new function(s) into @command{gawk}.
+
+The @code{filefuncs} extension also provides an @code{fts()}
+function, which we omit here. For its sake there is an initialization
+function:
@example
+/* init_filefuncs --- initialization routine */
- return make_number((AWKNUM) ret);
+static awk_bool_t
+init_filefuncs(void)
+@{
+ @dots{}
@}
@end example
-@cindex programming conventions, @command{gawk} internals
-Finally, it's necessary to provide the ``glue'' that loads the
-new function(s) into @command{gawk}. By convention, each library has
-a routine named @code{dl_load()} that does the job. The simplest way
-is to use the @code{dl_load_func} macro in @code{gawkapi.h}.
+We are almost done. We need an array of @code{awk_ext_func_t}
+structures for loading each function into @command{gawk}:
+
+@example
+static awk_ext_func_t func_table[] = @{
+ @{ "chdir", do_chdir, 1 @},
+ @{ "stat", do_stat, 2 @},
+ @{ "fts", do_fts, 3 @},
+@};
+@end example
+
+Each extension must have a routine named @code{dl_load()} to load
+everything that needs to be loaded. It is simplest to use the
+@code{dl_load_func()} macro in @file{gawkapi.h}:
+
+@example
+/* define the dl_load() function using the boilerplate macro */
+
+dl_load_func(func_table, filefuncs, "")
+@end example
And that's it! As an exercise, consider adding functions to
implement system calls such as @code{chown()}, @code{chmod()},
and @code{umask()}.
@node Using Internal File Ops
-@subsection Integrating the Extensions
+@subsection Integrating The Extensions
@cindex @command{gawk}, interpreter@comma{} adding code to
Now that the code is written, it must be possible to add it at
runtime to the running @command{gawk} interpreter. First, the
code must be compiled. Assuming that the functions are in
a file named @file{filefuncs.c}, and @var{idir} is the location
-of the @command{gawk} include files,
-the following steps create
-a GNU/Linux shared library:
+of the @file{gawkapi.h} header file,
+the following steps@footnote{In practice, you would probably want to
+use the GNU Autotools---Automake, Autoconf, Libtool, and Gettext---to
+configure and build your libraries. Instructions for doing so are beyond
+the scope of this @value{DOCUMENT}. @xref{gawkextlib}, for WWW links to
+the tools.} create a GNU/Linux shared library:
@example
$ @kbd{gcc -fPIC -shared -DHAVE_CONFIG_H -c -O -g -I@var{idir} filefuncs.c}
-$ @kbd{ld -o filefuncs.so -shared filefuncs.o}
+$ @kbd{ld -o filefuncs.so -shared filefuncs.o -lc}
@end example
-@cindex @code{extension()} function (@command{gawk})
-Once the library exists, it is loaded by calling the @code{extension()}
-built-in function.
-This function takes two arguments: the name of the
-library to load and the name of a function to call when the library
-is first loaded. This function adds the new functions to @command{gawk}.
-It returns the value returned by the initialization function
-within the shared library:
+Once the library exists, it is loaded by using the @code{@@load} keyword.
@example
# file testff.awk
+@@load "filefuncs"
+
BEGIN @{
- extension("./filefuncs.so", "dl_load")
+ "pwd" | getline curdir # save current directory
+ close("pwd")
- chdir(".") # no-op
+ chdir("/tmp")
+ system("pwd") # test it
+ chdir(curdir) # go back
- data[1] = 1 # force `data' to be an array
print "Info for testff.awk"
ret = stat("testff.awk", data)
print "ret =", ret
@@ -28476,40 +31182,705 @@ BEGIN @{
@}
@end example
-Here are the results of running the program:
+The @env{AWKLIBPATH} environment variable tells
+@command{gawk} where to find shared libraries (@pxref{Finding Extensions}).
+We set it to the current directory and run the program:
@example
-$ @kbd{gawk -f testff.awk}
+$ @kbd{AWKLIBPATH=$PWD gawk -f testff.awk}
+@print{} /tmp
@print{} Info for testff.awk
@print{} ret = 0
-@print{} data["size"] = 607
-@print{} data["ino"] = 14945891
-@print{} data["name"] = testff.awk
-@print{} data["pmode"] = -rw-rw-r--
-@print{} data["nlink"] = 1
-@print{} data["atime"] = 1293993369
-@print{} data["mtime"] = 1288520752
-@print{} data["mode"] = 33204
@print{} data["blksize"] = 4096
-@print{} data["dev"] = 2054
+@print{} data["mtime"] = 1350838628
+@print{} data["mode"] = 33204
@print{} data["type"] = file
-@print{} data["gid"] = 500
-@print{} data["uid"] = 500
+@print{} data["dev"] = 2053
+@print{} data["gid"] = 1000
+@print{} data["ino"] = 1719496
+@print{} data["ctime"] = 1350838628
@print{} data["blocks"] = 8
-@print{} data["ctime"] = 1290113572
-@print{} testff.awk modified: 10 31 10 12:25:52
+@print{} data["nlink"] = 1
+@print{} data["name"] = testff.awk
+@print{} data["atime"] = 1350838632
+@print{} data["pmode"] = -rw-rw-r--
+@print{} data["size"] = 662
+@print{} data["uid"] = 1000
+@print{} testff.awk modified: 10 21 12 18:57:08
@print{}
@print{} Info for JUNK
@print{} ret = -1
@print{} JUNK modified: 01 01 70 02:00:00
@end example
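+
+The @samp{modified:} lines are formatted timestamps. The part of the
+script that prints them is not shown above, so the format string in
+the following sketch is an assumption inferred from the output, and
+@code{format_timestamp_utc()} is a hypothetical name. The sketch uses
+@code{gmtime()} so that its result does not depend on the local time zone:
+
```c
#include <time.h>

/*
 * format_timestamp_utc --- hypothetical helper.
 * Formats a numeric timestamp as "MM DD YY HH:MM:SS" in UTC.
 * Returns the number of characters stored, or 0 on failure.
 */
static size_t
format_timestamp_utc(time_t when, char *buf, size_t buflen)
{
    struct tm *tm = gmtime(& when);

    if (tm == NULL)
        return 0;

    /* The format string is an assumption, not taken from testff.awk. */
    return strftime(buf, buflen, "%m %d %y %H:%M:%S", tm);
}
```
+
+A timestamp of zero formats as @samp{01 01 70 00:00:00} in UTC. The
+@samp{02:00:00} shown for @file{JUNK} is presumably the same instant in
+a time zone two hours ahead of UTC: @code{stat()} failed, the array was
+left empty, and the missing @code{"mtime"} element was treated as zero.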
-@c ENDOFRANGE filre
-@c ENDOFRANGE dirch
-@c ENDOFRANGE statg
-@c ENDOFRANGE chdirg
-@c ENDOFRANGE gladfgaw
-@c ENDOFRANGE adfugaw
-@c ENDOFRANGE fubadgaw
+
+@node Extension Samples
+@section The Sample Extensions In The @command{gawk} Distribution
+
+This @value{SECTION} provides brief overviews of the sample extensions
+that come in the @command{gawk} distribution. Some of them are intended
+for production use, such as the @code{filefuncs} and @code{readdir} extensions.
+Others mainly provide example code that shows how to use the extension API.
+
+@menu
+* Extension Sample File Functions:: The file functions sample.
+* Extension Sample Fnmatch:: An interface to @code{fnmatch()}.
+* Extension Sample Fork:: An interface to @code{fork()} and other
+ process functions.
+* Extension Sample Ord:: Character to value to character
+ conversions.
+* Extension Sample Readdir:: An interface to @code{readdir()}.
+* Extension Sample Revout:: Reversing output sample output wrapper.
+* Extension Sample Rev2way:: Reversing data sample two-way processor.
+* Extension Sample Read write array:: Serializing an array to a file.
+* Extension Sample Readfile:: Reading an entire file into a string.
+* Extension Sample API Tests:: Tests for the API.
+* Extension Sample Time:: An interface to @code{gettimeofday()}
+ and @code{sleep()}.
+@end menu
+
+@node Extension Sample File Functions
+@subsection File Related Functions
+
+The @code{filefuncs} extension provides three different functions.
+The usage is:
+
+@table @code
+@item @@load "filefuncs"
+This is how you load the extension.
+
+@item result = chdir("/some/directory")
+The @code{chdir()} function is a direct hook to the @code{chdir()}
+system call to change the current directory. It returns zero
+upon success or less than zero upon error. In the latter case it updates
+@code{ERRNO}.
+
+@item result = stat("/some/path", statdata)
+The @code{stat()} function provides a hook into the
+@code{stat()} system call. In fact, it uses @code{lstat()}.
+It returns zero upon success or less than zero upon error.
+In the latter case it updates @code{ERRNO}.
+
+In all cases, it clears the @code{statdata} array.
+When the call is successful, @code{stat()} fills the @code{statdata}
+array with information retrieved from the filesystem, as follows:
+
+@c nested table
+@multitable @columnfractions .25 .60
+@item @code{statdata["name"]} @tab
+The name of the file.
+
+@item @code{statdata["dev"]} @tab
+Corresponds to the @code{st_dev} field in the @code{struct stat}.
+
+@item @code{statdata["ino"]} @tab
+Corresponds to the @code{st_ino} field in the @code{struct stat}.
+
+@item @code{statdata["mode"]} @tab
+Corresponds to the @code{st_mode} field in the @code{struct stat}.
+
+@item @code{statdata["nlink"]} @tab
+Corresponds to the @code{st_nlink} field in the @code{struct stat}.
+
+@item @code{statdata["uid"]} @tab
+Corresponds to the @code{st_uid} field in the @code{struct stat}.
+
+@item @code{statdata["gid"]} @tab
+Corresponds to the @code{st_gid} field in the @code{struct stat}.
+
+@item @code{statdata["size"]} @tab
+Corresponds to the @code{st_size} field in the @code{struct stat}.
+
+@item @code{statdata["atime"]} @tab
+Corresponds to the @code{st_atime} field in the @code{struct stat}.
+
+@item @code{statdata["mtime"]} @tab
+Corresponds to the @code{st_mtime} field in the @code{struct stat}.
+
+@item @code{statdata["ctime"]} @tab
+Corresponds to the @code{st_ctime} field in the @code{struct stat}.
+
+@item @code{statdata["rdev"]} @tab
+Corresponds to the @code{st_rdev} field in the @code{struct stat}.
+This element is only present for device files.
+
+@item @code{statdata["major"]} @tab
+The major device number, obtained by applying the @code{major()} macro
+to the @code{st_rdev} field in the @code{struct stat}.
+This element is only present for device files.
+
+@item @code{statdata["minor"]} @tab
+The minor device number, obtained by applying the @code{minor()} macro
+to the @code{st_rdev} field in the @code{struct stat}.
+This element is only present for device files.
+
+@item @code{statdata["blksize"]} @tab
+Corresponds to the @code{st_blksize} field in the @code{struct stat},
+if this field is present on your system.
+(It is present on all modern systems that we know of.)
+
+@item @code{statdata["pmode"]} @tab
+A human-readable version of the mode value, such as that printed by
+@command{ls}. For example, @code{"-rwxr-xr-x"}.
+
+@item @code{statdata["linkval"]} @tab
+If the named file is a symbolic link, this element will exist
+and its value is the value of the symbolic link (where the
+symbolic link points to).
+
+@item @code{statdata["type"]} @tab
+The type of the file as a string. One of
+@code{"file"},
+@code{"blockdev"},
+@code{"chardev"},
+@code{"directory"},
+@code{"socket"},
+@code{"fifo"},
+@code{"symlink"},
+@code{"door"},
+or
+@code{"unknown"}.
+Not all systems support all file types.
+@end multitable
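+
+As a minimal sketch of how these elements might be used (the path
+@file{/etc/passwd} is only an illustration, and the sketch assumes the
+@code{stat()} call and its zero-on-success return value described
+earlier for this extension):
+
+@example
+@@load "filefuncs"
+@dots{}
+if (stat("/etc/passwd", statdata) == 0) @{
+    printf("%s %s %d bytes\n", statdata["pmode"],
+           statdata["type"], statdata["size"])
+    if ("linkval" in statdata)     # only present for symbolic links
+        print "points to", statdata["linkval"]
+@} else
+    print "stat failed:", ERRNO > "/dev/stderr"
+@end example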
+
+@item flags = or(FTS_PHYSICAL, ...)
+@itemx result = fts(pathlist, flags, filedata)
+Walk the file trees provided in @code{pathlist} and fill in the
+@code{filedata} array as described below. @code{flags} is the bitwise
+OR of several predefined constant values, also as described below.
+Return zero if there were no errors, otherwise return @minus{}1.
+@end table
+
+The @code{fts()} function provides a hook to the C library @code{fts()}
+routines for traversing file hierarchies. Instead of returning data
+about one file at a time in a stream, it fills in a multi-dimensional
+array with data about each file and directory encountered in the requested
+hierarchies.
+
+The arguments are as follows:
+
+@table @code
+@item pathlist
+An array of filenames. The element values are used; the index values are ignored.
+
+@item flags
+This should be the bitwise OR of one or more of the following
+predefined constant flag values. At least one of
+@code{FTS_LOGICAL} or @code{FTS_PHYSICAL} must be provided; otherwise
+@code{fts()} returns an error value and sets @code{ERRNO}.
+The flags are:
+
+@c nested table
+@table @code
+@item FTS_LOGICAL
+Do a ``logical'' file traversal, where the information returned for
+a symbolic link refers to the linked-to file, and not to the symbolic
+link itself. This flag is mutually exclusive with @code{FTS_PHYSICAL}.
+
+@item FTS_PHYSICAL
+Do a ``physical'' file traversal, where the information returned for a
+symbolic link refers to the symbolic link itself. This flag is mutually
+exclusive with @code{FTS_LOGICAL}.
+
+@item FTS_NOCHDIR
+As a performance optimization, the C library @code{fts()} routines
+change directory as they traverse a file hierarchy. This flag disables
+that optimization.
+
+@item FTS_COMFOLLOW
+Immediately follow a symbolic link named in @code{pathlist},
+whether or not @code{FTS_LOGICAL} is set.
+
+@item FTS_SEEDOT
+By default, the @code{fts()} routines do not return entries for @file{.}
+and @file{..}. This option causes entries for @file{..} to also
+be included. (The extension always includes an entry for @file{.},
+see below.)
+
+@item FTS_XDEV
+During a traversal, do not cross onto a different mounted filesystem.
+@end table
+
+@item filedata
+The @code{filedata} array is first cleared. Then, @code{fts()} creates
+an element in @code{filedata} for every element in @code{pathlist}.
+The index is the name of the directory or file given in @code{pathlist}.
+The element for this index is itself an array. There are two cases.
+
+@c nested table
+@table @emph
+@item The path is a file.
+In this case, the array contains two or three elements:
+
+@c doubly nested table
+@table @code
+@item "path"
+The full path to this file, starting from the ``root'' that was given
+in the @code{pathlist} array.
+
+@item "stat"
+This element is itself an array, containing the same information as provided
+by the @code{stat()} function described earlier for its
+@code{statdata} argument. The element may not be present if
+the @code{stat()} system call for the file failed.
+
+@item "error"
+If some kind of error was encountered, the array will also
+contain an element named @code{"error"}, which is a string describing the error.
+@end table
+
+@item The path is a directory.
+In this case, the array contains one element for each entry in the
+directory. If an entry is a file, that element is as for files, just
+described. If the entry is a directory, that element is (recursively)
+an array describing the subdirectory. If @code{FTS_SEEDOT} was provided
+in the flags, then there will also be an element named @code{".."}. This
+element will be an array containing the data as provided by @code{stat()}.
+
+In addition, there will be an element whose index is @code{"."}.
+This element is an array containing the same two or three elements as
+for a file: @code{"path"}, @code{"stat"}, and @code{"error"}.
+@end table
+@end table
+
+The @code{fts()} function returns zero if there were no errors.
+Otherwise it returns @minus{}1.
+
+@quotation NOTE
+The @code{fts()} extension does not exactly mimic the
+interface of the C library @code{fts()} routines, choosing instead to
+provide an interface that is based on associative arrays, which should
+be more comfortable to use from an @command{awk} program. This includes the
+lack of a comparison function, since @command{gawk} already provides
+powerful array sorting facilities. While an @code{fts_read()}-like
+interface could have been provided, this felt less natural than simply
+creating a multi-dimensional array to represent the file hierarchy and
+its information.
+@end quotation
+
+See @file{test/fts.awk} in the @command{gawk} distribution for an example.
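+
+The following is only a rough sketch of how the resulting array might be
+walked; it is not the code from @file{test/fts.awk}. The function name
+@code{walk()} is invented for this illustration, and the sketch assumes
+that @code{fts()} is loaded as part of the @code{filefuncs} extension,
+together with @code{stat()}:
+
+@example
+@@load "filefuncs"
+
+# walk --- print the path and size of everything below one element
+function walk(data,    i)
+@{
+    # A scalar "path" element marks the array for a file (or for ".")
+    if (("path" in data) && ! isarray(data["path"])) @{
+        if ("stat" in data)
+            print data["path"], data["stat"]["size"]
+        else if ("error" in data)
+            print data["path"], data["error"] > "/dev/stderr"
+        return
+    @}
+    for (i in data)
+        if (i != "..")      # ".." holds stat() data only, not a subtree
+            walk(data[i])
+@}
+
+BEGIN @{
+    pathlist[1] = "."
+    if (fts(pathlist, FTS_PHYSICAL, filedata) != 0) @{
+        print "fts failed:", ERRNO > "/dev/stderr"
+        exit 1
+    @}
+    for (root in filedata)
+        walk(filedata[root])
+@}
+@end example
+
+The test on @code{"path"} uses @code{isarray()} so that a directory
+that happens to contain an entry named @file{path} is still treated
+as a directory.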
+
+@node Extension Sample Fnmatch
+@subsection Interface To @code{fnmatch()}
+
+This extension provides an interface to the C library
+@code{fnmatch()} function. The usage is:
+
+@example
+@@load "fnmatch"
+
+result = fnmatch(pattern, string, flags)
+@end example
+
+The @code{fnmatch} extension adds a single function named
+@code{fnmatch()}, one constant (@code{FNM_NOMATCH}), and an array of
+flag values named @code{FNM}.
+
+The arguments to @code{fnmatch()} are:
+
+@table @code
+@item pattern
+The filename wildcard to match.
+
+@item string
+The filename string to match against @code{pattern}.
+
+@item flags
+Either zero, or the bitwise OR of one or more of the
+flags in the @code{FNM} array.
+@end table
+
+The return value is zero on success, @code{FNM_NOMATCH}
+if the string did not match the pattern, or
+a different non-zero value if an error occurred.
+
+The flags are as follows:
+
+@multitable @columnfractions .25 .75
+@item @code{FNM["CASEFOLD"]} @tab
+Corresponds to the @code{FNM_CASEFOLD} flag as defined in @code{fnmatch()}.
+
+@item @code{FNM["FILE_NAME"]} @tab
+Corresponds to the @code{FNM_FILE_NAME} flag as defined in @code{fnmatch()}.
+
+@item @code{FNM["LEADING_DIR"]} @tab
+Corresponds to the @code{FNM_LEADING_DIR} flag as defined in @code{fnmatch()}.
+
+@item @code{FNM["NOESCAPE"]} @tab
+Corresponds to the @code{FNM_NOESCAPE} flag as defined in @code{fnmatch()}.
+
+@item @code{FNM["PATHNAME"]} @tab
+Corresponds to the @code{FNM_PATHNAME} flag as defined in @code{fnmatch()}.
+
+@item @code{FNM["PERIOD"]} @tab
+Corresponds to the @code{FNM_PERIOD} flag as defined in @code{fnmatch()}.
+@end multitable
+
+Here is an example:
+
+@example
+@@load "fnmatch"
+@dots{}
+flags = or(FNM["PERIOD"], FNM["NOESCAPE"])
+if (fnmatch("*.a", "foo.c", flags) == FNM_NOMATCH)
+ print "no match"
+@end example
+
+@node Extension Sample Fork
+@subsection Interface To @code{fork()}, @code{wait()} and @code{waitpid()}
+
+The @code{fork} extension adds three functions, as follows.
+
+@table @code
+@item @@load "fork"
+This is how you load the extension.
+
+@item pid = fork()
+This function creates a new process. The return value is zero in the
+child and the process-id number of the child in the parent, or @minus{}1
+upon error. In the latter case, @code{ERRNO} indicates the problem.
+In the child, @code{PROCINFO["pid"]} and @code{PROCINFO["ppid"]} are
+updated to reflect the correct values.
+
+@item ret = waitpid(pid)
+This function takes a numeric argument, which is the process-id to
+wait for. The return value is that of the
+@code{waitpid()} system call.
+
+@item ret = wait()
+This function waits for the first child to die.
+The return value is that of the
+@code{wait()} system call.
+@end table
+
+There is no corresponding @code{exec()} function.
+
+Here is an example:
+
+@example
+@@load "fork"
+@dots{}
+if ((pid = fork()) == 0)
+ print "hello from the child"
+else
+ print "hello from the parent"
+@end example
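+
+A slightly fuller sketch, in which the parent waits for the child to
+exit, might look like the following. The messages are arbitrary, and
+the value of @code{ret} is simply whatever the @code{wait()} system
+call returned:
+
+@example
+@@load "fork"
+
+BEGIN @{
+    pid = fork()
+    if (pid < 0) @{
+        print "fork failed:", ERRNO > "/dev/stderr"
+        exit 1
+    @}
+    if (pid == 0) @{
+        print "child running as process", PROCINFO["pid"]
+        exit 0
+    @}
+    ret = wait()            # parent: wait for a child to exit
+    print "parent: wait() returned", ret
+@}
+@end example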
+
+@node Extension Sample Ord
+@subsection Character and Numeric Values: @code{ord()} and @code{chr()}
+
+The @code{ordchr} extension adds two functions, named
+@code{ord()} and @code{chr()}, as follows.
+
+@table @code
+@item number = ord(string)
+Return the numeric value of the first character in @code{string}.
+
+@item char = chr(number)
+Return the string whose first character is that represented by @code{number}.
+@end table
+
+These functions are inspired by the Pascal language functions
+of the same name. Here is an example:
+
+@example
+@@load "ordchr"
+@dots{}
+printf("The numeric value of 'A' is %d\n", ord("A"))
+printf("The string value of 65 is %s\n", chr(65))
+@end example
+
+@node Extension Sample Readdir
+@subsection Reading Directories
+
+The @code{readdir} extension adds an input parser for directories, and
+adds a single function named @code{readdir_do_ftype()}.
+The usage is as follows:
+
+@example
+@@load "readdir"
+
+readdir_do_ftype("stat") # or "dirent" or "never"
+@end example
+
+When this extension is in use, instead of skipping directories named
+on the command line (or with @code{getline}),
+they are read, with each entry returned as a record.
+
+The record consists of at least two fields: the inode number and the
+filename, separated by a forward slash character.
+On systems where the directory entry contains the file type, the record
+has a third field which is a single letter indicating the type of the
+file:
+
+@multitable @columnfractions .1 .9
+@headitem Letter @tab File Type
+@item @code{b} @tab Block device
+@item @code{c} @tab Character device
+@item @code{d} @tab Directory
+@item @code{f} @tab Regular file
+@item @code{l} @tab Symbolic link
+@item @code{p} @tab Named pipe (FIFO)
+@item @code{s} @tab Socket
+@item @code{u} @tab Anything else (unknown)
+@end multitable
+
+On systems without the file type information, calling
+@samp{readdir_do_ftype("stat")} causes the extension to use the
+@code{lstat()} system call to retrieve the appropriate information. This
+is not the default, since @code{lstat()} is a potentially expensive
+operation. By calling @samp{readdir_do_ftype("never")} one can ensure
+that the file type information is never displayed, even when readily
+available in the directory entry.
+
+The third option, @samp{readdir_do_ftype("dirent")}, takes file type
+information from the directory entry, if it is available. This is the
+default on systems that supply this information.
+
+The @code{readdir_do_ftype()} function sets @code{ERRNO} if called
+without arguments or with invalid arguments.
+
+@quotation NOTE
+On GNU/Linux systems, there are filesystems that don't support the
+@code{d_type} entry (see the @i{readdir}(3) manual page), and so the file
+type is always @samp{u}. Therefore, using @samp{readdir_do_ftype("stat")}
+is advisable even on GNU/Linux systems. In this case, the @code{readdir}
+extension falls back to using @code{lstat()} when it encounters an
+unknown file type.
+@end quotation
+
+Here is an example:
+
+@example
+@@load "readdir"
+@dots{}
+BEGIN @{ FS = "/" @}
+@{ print "file name is", $2 @}
+@end example
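+
+As a further sketch, the optional third field can be used to select
+entries by type. This assumes that the program is run with a directory
+name as its command-line argument; the choice of @code{"stat"} here is
+only an illustration:
+
+@example
+@@load "readdir"
+
+BEGIN @{
+    FS = "/"
+    readdir_do_ftype("stat")    # ask for the type even without d_type
+@}
+NF >= 3 && $3 == "d" @{ print "directory:", $2 @}
+@end example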
+
+@node Extension Sample Revout
+@subsection Reversing Output
+
+The @code{revoutput} extension adds a simple output wrapper that reverses
+the characters in each output line. Its main purpose is to show how to
+write an output wrapper, although it may be mildly amusing for the unwary.
+Here is an example:
+
+@example
+@@load "revoutput"
+
+BEGIN @{
+ REVOUT = 1
+ print "hello, world" > "/dev/stdout"
+@}
+@end example
+
+The output from this program is:
+@samp{dlrow ,olleh}.
+
+@node Extension Sample Rev2way
+@subsection Two-Way I/O Example
+
+The @code{revtwoway} extension adds a simple two-way processor that
+reverses the characters in each line sent to it for reading back by
+the @command{awk} program. Its main purpose is to show how to write
+a two-way processor, although it may also be mildly amusing.
+The following example shows how to use it:
+
+@example
+@@load "revtwoway"
+
+BEGIN @{
+ cmd = "/magic/mirror"
+ print "hello, world" |& cmd
+ cmd |& getline result
+ print result
+ close(cmd)
+@}
+@end example
+
+@node Extension Sample Read write array
+@subsection Dumping and Restoring An Array
+
+The @code{rwarray} extension adds two functions,
+named @code{writea()} and @code{reada()}, as follows:
+
+@table @code
+@item ret = writea(file, array)
+This function takes a string argument, which is the name of the file
+to which to dump the array, and the array itself as the second argument.
+@code{writea()} understands multidimensional arrays. It returns one on
+success, or zero upon failure.
+
+@item ret = reada(file, array)
+@code{reada()} is the inverse of @code{writea()};
+it reads the file named as its first argument, filling in
+the array named as the second argument. It clears the array first.
+Here too, the return value is one on success and zero upon failure.
+@end table
+
+The array created by @code{reada()} is identical to that written by
+@code{writea()} in the sense that the contents are the same. However,
+due to implementation issues, the array traversal order of the recreated
+array is likely to be different from that of the original array. As array
+traversal order in @command{awk} is by default undefined, this is not
+(technically) a problem. If you need to guarantee a particular traversal
+order, use the array sorting features in @command{gawk} to do so
+(@pxref{Array Sorting}).
+
+The file contains binary data. All integral values are written in network
+byte order. However, double precision floating-point values are written
+as native binary data. Thus, arrays containing only string data can
+theoretically be dumped on systems with one byte order and restored on
+systems with a different one, but this has not been tried.
+
+Here is an example:
+
+@example
+@@load "rwarray"
+@dots{}
+ret = writea("arraydump.bin", array)
+@dots{}
+ret = reada("arraydump.bin", array)
+@end example
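+
+A more complete sketch of a round trip follows. The array names and the
+file name @file{arraydump.bin} are arbitrary:
+
+@example
+@@load "rwarray"
+
+BEGIN @{
+    data["name"] = "gawk"
+    data["sub"]["x"] = 42           # writea() handles subarrays
+    if (! writea("arraydump.bin", data)) @{
+        print "writea failed" > "/dev/stderr"
+        exit 1
+    @}
+    split("", copy)                 # make sure copy is an array
+    if (! reada("arraydump.bin", copy)) @{
+        print "reada failed" > "/dev/stderr"
+        exit 1
+    @}
+    print copy["name"], copy["sub"]["x"]
+@}
+@end example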
+
+@node Extension Sample Readfile
+@subsection Reading An Entire File
+
+The @code{readfile} extension adds a single function
+named @code{readfile()}:
+
+@table @code
+@item result = readfile("/some/path")
+The argument is the name of the file to read. The return value is a
+string containing the entire contents of the requested file. Upon error,
+the function returns the empty string and sets @code{ERRNO}.
+@end table
+
+Here is an example:
+
+@example
+@@load "readfile"
+@dots{}
+contents = readfile("/path/to/file");
+if (contents == "" && ERRNO != "") @{
+ print("problem reading file", ERRNO) > "/dev/stderr"
+ ...
+@}
+@end example
+
+@node Extension Sample API Tests
+@subsection API Tests
+
+The @code{testext} extension exercises parts of the extension API that
+are not tested by the other samples. The @file{extension/testext.c}
+file contains both the C code for the extension and @command{awk}
+test code inside C comments that run the tests. The testing framework
+extracts the @command{awk} code and runs the tests. See the source file
+for more information.
+
+@node Extension Sample Time
+@subsection Extension Time Functions
+
+@cindex time
+@cindex sleep
+
+These functions can be used by either invoking @command{gawk}
+with a command-line argument of @samp{-l time} or by
+inserting @samp{@@load "time"} in your script.
+
+@table @code
+
+@cindex @code{gettimeofday} time extension function
+@item the_time = gettimeofday()
+Return the time in seconds that has elapsed since 1970-01-01 UTC as a
+floating-point value. If the time is unavailable on this platform, return
+@minus{}1 and set @code{ERRNO}. The returned time should have sub-second
+precision, but the actual precision will vary based on the platform.
+If the standard C @code{gettimeofday()} system call is available on this
+platform, then it simply returns the value. Otherwise, if on Windows,
+it tries to use @code{GetSystemTimeAsFileTime()}.
+
+@cindex @code{sleep} time extension function
+@item result = sleep(@var{seconds})
+Attempt to sleep for @var{seconds} seconds. If @var{seconds} is negative,
+or the attempt to sleep fails, return @minus{}1 and set @code{ERRNO}.
+Otherwise, return zero after sleeping for the indicated amount of time.
+Note that @var{seconds} may be a floating-point (non-integral) value.
+Implementation details: depending on platform availability, this function
+tries to use @code{nanosleep()} or @code{select()} to implement the delay.
+@end table
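+
+For example, the two functions might be combined as follows to time a
+short delay (a sketch only; the quarter-second delay is arbitrary and
+the elapsed time printed will vary by platform):
+
+@example
+@@load "time"
+
+BEGIN @{
+    start = gettimeofday()
+    if (sleep(0.25) < 0) @{
+        print "sleep failed:", ERRNO > "/dev/stderr"
+        exit 1
+    @}
+    printf("slept for about %.3f seconds\n", gettimeofday() - start)
+@}
+@end example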
+
+@node gawkextlib
+@section The @code{gawkextlib} Project
+
+The @uref{http://sourceforge.net/projects/gawkextlib/, @code{gawkextlib}}
+project provides a number of @command{gawk} extensions, including one for
+processing XML files. This is the evolution of the original @command{xgawk}
+(XML @command{gawk}) project.
+
+As of this writing, there are four extensions:
+
+@itemize @bullet
+@item
+XML parser extension, using the @uref{http://expat.sourceforge.net, Expat}
+XML parsing library.
+
+@item
+PostgreSQL extension.
+
+@item
+GD graphics library extension.
+
+@item
+MPFR library extension.
+This provides access to a number of MPFR functions which @command{gawk}'s
+native MPFR support does not.
+@end itemize
+
+The @code{time} extension described earlier (@pxref{Extension Sample
+Time}) was originally from this project but has been moved into the
+main @command{gawk} distribution.
+
+You can check out the code for the @code{gawkextlib} project
+using the @uref{http://git-scm.com, Git} distributed source
+code control system. The command is as follows:
+
+@example
+git clone git://git.code.sf.net/p/gawkextlib/code gawkextlib-code
+@end example
+
+You will need to have the @uref{http://expat.sourceforge.net, Expat}
+XML parser library installed in order to build and use the XML extension.
+
+In addition, you must have the GNU Autotools installed
+(@uref{http://www.gnu.org/software/autoconf, Autoconf},
+@uref{http://www.gnu.org/software/automake, Automake},
+@uref{http://www.gnu.org/software/libtool, Libtool},
+and
+@uref{http://www.gnu.org/software/gettext, Gettext}).
+
+The simple recipe for building and testing @code{gawkextlib} is as follows.
+First, build and install @command{gawk}:
+
+@example
+cd .../path/to/gawk/code
+./configure --prefix=/tmp/newgawk @ii{Install in /tmp/newgawk for now}
+make && make check @ii{Build and check that all is OK}
+make install @ii{Install gawk}
+@end example
+
+Next, build @code{gawkextlib} and test it:
+
+@example
+cd .../path/to/gawkextlib-code
+./update-autotools @ii{Generate configure, etc.}
+ @ii{You may have to run this command twice}
+./configure --with-gawk=/tmp/newgawk @ii{Configure, point at ``installed'' gawk}
+make && make check @ii{Build and check that all is OK}
+@end example
+
+If you write an extension that you wish to share with other
+@command{gawk} users, please consider doing so through the
+@code{gawkextlib} project.
+
@ignore
@c Try this