author Arnold D. Robbins <arnold@skeeve.com> 2012-11-06 21:40:09 +0200
committer Arnold D. Robbins <arnold@skeeve.com> 2012-11-06 21:40:09 +0200
commit 9e49de573a7ea6f84e6577511aec5a5fc1f47cb6
tree 0db957bbab72342b446618fed50eaf7268778de2
parent d5cc356948eb6d3ed024b1addad6daccb809448b
Remove temp API doc after merge into gawk.texi.
-rw-r--r-- doc/api.texi 4103
1 files changed, 0 insertions, 4103 deletions
diff --git a/doc/api.texi b/doc/api.texi
deleted file mode 100644
index 6d048def..00000000
--- a/doc/api.texi
+++ /dev/null
@@ -1,4103 +0,0 @@
-\input texinfo @c -*-texinfo-*-
-@c %**start of header (This is for running Texinfo on a region.)
-@setfilename api.info
-@settitle Writing Extensions For Gawk
-@c %**end of header (This is for running Texinfo on a region.)
-
-@dircategory Text creation and manipulation
-@direntry
-* Gawk: (gawk). A text scanning and processing language.
-@end direntry
-@dircategory Individual utilities
-@direntry
-* awk: (gawk)Invoking gawk. Text scanning and processing.
-@end direntry
-
-@set xref-automatic-section-title
-
-@c The following information should be updated here only!
-@c This sets the edition of the document, the version of gawk it
-@c applies to and all the info about who's publishing this edition
-
-@c These apply across the board.
-@set UPDATE-MONTH October, 2012
-@set VERSION 4.1
-@set PATCHLEVEL 0
-
-@set FSF
-
-@set TITLE Writing Extensions for Gawk
-@set SUBTITLE A Temporary Manual
-@set EDITION 1
-
-@iftex
-@set DOCUMENT book
-@set CHAPTER chapter
-@set APPENDIX appendix
-@set SECTION section
-@set SUBSECTION subsection
-@set DARKCORNER @inmargin{@image{lflashlight,1cm}, @image{rflashlight,1cm}}
-@set COMMONEXT (c.e.)
-@end iftex
-@ifinfo
-@set DOCUMENT Info file
-@set CHAPTER major node
-@set APPENDIX major node
-@set SECTION minor node
-@set SUBSECTION node
-@set DARKCORNER (d.c.)
-@set COMMONEXT (c.e.)
-@end ifinfo
-@ifhtml
-@set DOCUMENT Web page
-@set CHAPTER chapter
-@set APPENDIX appendix
-@set SECTION section
-@set SUBSECTION subsection
-@set DARKCORNER (d.c.)
-@set COMMONEXT (c.e.)
-@end ifhtml
-@ifdocbook
-@set DOCUMENT book
-@set CHAPTER chapter
-@set APPENDIX appendix
-@set SECTION section
-@set SUBSECTION subsection
-@set DARKCORNER (d.c.)
-@set COMMONEXT (c.e.)
-@end ifdocbook
-@ifplaintext
-@set DOCUMENT book
-@set CHAPTER chapter
-@set APPENDIX appendix
-@set SECTION section
-@set SUBSECTION subsection
-@set DARKCORNER (d.c.)
-@set COMMONEXT (c.e.)
-@end ifplaintext
-
-@c some special symbols
-@iftex
-@set LEQ @math{@leq}
-@set PI @math{@pi}
-@end iftex
-@ifnottex
-@set LEQ <=
-@set PI @i{pi}
-@end ifnottex
-
-@ifnottex
-@macro ii{text}
-@i{\text\}
-@end macro
-@end ifnottex
-
-@c For HTML, spell out email addresses, to avoid problems with
-@c address harvesters for spammers.
-@ifhtml
-@macro EMAIL{real,spelled}
-``\spelled\''
-@end macro
-@end ifhtml
-@ifnothtml
-@macro EMAIL{real,spelled}
-@email{\real\}
-@end macro
-@end ifnothtml
-
-@set FN file name
-@set FFN File Name
-@set DF data file
-@set DDF Data File
-@set PVERSION version
-@set CTL Ctrl
-
-@ignore
-Some comments on the layout for TeX.
-1. Use at least texinfo.tex 2000-09-06.09
-2. I have done A LOT of work to make this look good. There are `@page' commands
- and use of `@group ... @end group' in a number of places. If you muck
- with anything, it's your responsibility not to break the layout.
-@end ignore
-
-@c merge the function and variable indexes into the concept index
-@ifinfo
-@synindex fn cp
-@synindex vr cp
-@end ifinfo
-@iftex
-@syncodeindex fn cp
-@syncodeindex vr cp
-@end iftex
-@ifxml
-@syncodeindex fn cp
-@syncodeindex vr cp
-@end ifxml
-
-@c If "finalout" is commented out, the printed output will show
-@c black boxes that mark lines that are too long. Thus, it is
-@c unwise to comment it out when running a master in case there are
-@c overfulls which are deemed okay.
-
-@iftex
-@finalout
-@end iftex
-
-@copying
-Copyright @copyright{} 2012
-Free Software Foundation, Inc.
-@sp 2
-
-This is Edition @value{EDITION} of @cite{@value{TITLE}: @value{SUBTITLE}},
-for the @value{VERSION}.@value{PATCHLEVEL} (or later) version of the GNU
-implementation of AWK.
-
-Permission is granted to copy, distribute and/or modify this document
-under the terms of the GNU Free Documentation License, Version 1.3 or
-any later version published by the Free Software Foundation; with the
-Invariant Sections being ``GNU General Public License'', the Front-Cover
-texts being (a) (see below), and with the Back-Cover Texts being (b)
-(see below). A copy of the license is included in the section entitled
-``GNU Free Documentation License''.
-
-@enumerate a
-@item
-``A GNU Manual''
-
-@item
-``You have the freedom to
-copy and modify this GNU manual. Buying copies from the FSF
-supports it in developing GNU and promoting software freedom.''
-@end enumerate
-@end copying
-
-@c Comment out the "smallbook" for technical review. Saves
-@c considerable paper. Remember to turn it back on *before*
-@c starting the page-breaking work.
-
-@c 4/2002: Karl Berry recommends commenting out this and the
-@c `@setchapternewpage odd', and letting users use `texi2dvi -t'
-@c if they want to waste paper.
-@c @smallbook
-
-
-@c Uncomment this for the release. Leaving it off saves paper
-@c during editing and review.
-@setchapternewpage odd
-
-@titlepage
-@title @value{TITLE}
-@subtitle @value{SUBTITLE}
-@subtitle Edition @value{EDITION}
-@subtitle @value{UPDATE-MONTH}
-@author Arnold D. Robbins
-
-@c Include the Distribution inside the titlepage environment so
-@c that headings are turned off. Headings on and off do not work.
-
-@page
-@vskip 0pt plus 1filll
-``To boldly go where no man has gone before'' is a
-Registered Trademark of Paramount Pictures Corporation. @*
-@c sorry, i couldn't resist
-@sp 3
-Published by:
-@sp 1
-
-Free Software Foundation @*
-51 Franklin Street, Fifth Floor @*
-Boston, MA 02110-1301 USA @*
-Phone: +1-617-542-5942 @*
-Fax: +1-617-542-2652 @*
-Email: @email{gnu@@gnu.org} @*
-URL: @uref{http://www.gnu.org/} @*
-
-@c This one is correct for gawk 3.1.0 from the FSF
-ISBN 1-882114-28-0 @*
-@sp 2
-@insertcopying
-@end titlepage
-
-@ifnottex
-@node Top
-@top Top Node
-
-Fake top node.
-
-@insertcopying
-
-@end ifnottex
-
-@menu
-* Extension API:: Writing Extensions for @command{gawk}.
-* Fake Chapter:: Fake Sections For Cross References.
-
-@detailmenu
-* Extension Intro:: What is an extension.
-* Plugin License:: A note about licensing.
-* Extension Design:: Design notes about the extension API.
-* Old Extension Problems:: Problems with the old mechanism.
-* Extension New Mechanism Goals:: Goals for the new mechanism.
-* Extension Other Design Decisions:: Some other design decisions.
-* Extension Mechanism Outline:: An outline of how it works.
-* Extension Future Growth:: Some room for future growth.
-* Extension API Description:: A full description of the API.
-* Extension API Functions Introduction:: Introduction to the API functions.
-* General Data Types:: The data types.
-* Requesting Values:: How to get a value.
-* Constructor Functions:: Functions for creating values.
-* Registration Functions:: Functions to register things with
- @command{gawk}.
-* Extension Functions:: Registering extension functions.
-* Input Parsers:: Registering an input parser.
-* Output Wrappers:: Registering an output wrapper.
-* Two-way processors:: Registering a two-way processor.
-* Exit Callback Functions:: Registering an exit callback.
-* Extension Version String:: Registering a version string.
-* Printing Messages:: Functions for printing messages.
-* Updating @code{ERRNO}:: Functions for updating @code{ERRNO}.
-* Accessing Parameters:: Functions for accessing parameters.
-* Symbol Table Access:: Functions for accessing global
- variables.
-* Symbol table by name:: Accessing variables by name.
-* Symbol table by cookie:: Accessing variables by ``cookie''.
-* Cached values:: Creating and using cached values.
-* Array Manipulation:: Functions for working with arrays.
-* Array Data Types:: Data types for working with arrays.
-* Array Functions:: Functions for working with arrays.
-* Flattening Arrays:: How to flatten arrays.
-* Creating Arrays:: How to create and populate arrays.
-* Extension API Variables:: Variables provided by the API.
-* Extension Versioning:: API Version information.
-* Extension API Informational Variables:: Variables providing information about
- @command{gawk}'s invocation.
-* Extension API Boilerplate:: Boilerplate code for using the API.
-* Finding Extensions:: How @command{gawk} finds compiled
- extensions.
-* Extension Example:: Example C code for an extension.
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
-* Extension Samples:: The sample extensions that ship with
- @code{gawk}.
-* Extension Sample File Functions:: The file functions sample.
-* Extension Sample Fnmatch:: An interface to @code{fnmatch()}.
-* Extension Sample Fork:: An interface to @code{fork()} and
- other process functions.
-* Extension Sample Ord:: Character to value to character
- conversions.
-* Extension Sample Readdir:: An interface to @code{readdir()}.
-* Extension Sample Revout:: Reversing output sample output
- wrapper.
-* Extension Sample Rev2way:: Reversing data sample two-way
- processor.
-* Extension Sample Read write array:: Serializing an array to a file.
-* Extension Sample Readfile:: Reading an entire file into a string.
-* Extension Sample API Tests:: Tests for the API.
-* Extension Sample Time:: An interface to @code{gettimeofday()}
- and @code{sleep()}.
-* gawkextlib:: The @code{gawkextlib} project.
-* Reference to Elements:: Referring to an Array Element.
-* Built-in:: Built-in Functions.
-* Built-in Variables:: Built-in Variables.
-* Options:: Command-Line Options.
-@end detailmenu
-@end menu
-
-@contents
-
-@node Extension API
-@chapter Writing Extensions for @command{gawk}
-
-It is possible to add new built-in functions to @command{gawk} using
-dynamically loaded libraries. This facility is available on systems (such
-as GNU/Linux) that support the C @code{dlopen()} and @code{dlsym()}
-functions. This @value{CHAPTER} describes how to create extensions
-using code written in C or C++. If you don't know anything about C
-programming, you can safely skip this @value{CHAPTER}, although you
-may wish to review the documentation on the extensions that come with
-@command{gawk} (@pxref{Extension Samples}), and the section on the
-@code{gawkextlib} project (@pxref{gawkextlib}).
-
-@quotation NOTE
-When @option{--sandbox} is specified, extensions are disabled
-(@pxref{Options}).
-@end quotation
-
-@menu
-* Extension Intro:: What is an extension.
-* Plugin License:: A note about licensing.
-* Extension Design:: Design notes about the extension API.
-* Extension API Description:: A full description of the API.
-* Extension Example:: Example C code for an extension.
-* Extension Samples:: The sample extensions that ship with
- @code{gawk}.
-* gawkextlib:: The @code{gawkextlib} project.
-@end menu
-
-@node Extension Intro
-@section Introduction
-
-An @dfn{extension} (sometimes called a @dfn{plug-in}) is a piece of
-external compiled code that @command{gawk} can load at runtime to
-provide additional functionality, over and above the built-in capabilities
-described in the rest of this @value{DOCUMENT}.
-
-Extensions are useful because they allow you (of course) to extend
-@command{gawk}'s functionality. For example, they can provide access to
-system calls (such as @code{chdir()} to change directory) and to other
-C library routines that could be of use. As with most software,
-``the sky is the limit;'' if you can imagine something that you might
-want to do and can write in C or C++, you can write an extension to do it!
-
-Extensions are written in C or C++, using the @dfn{Application Programming
-Interface} (API) defined for this purpose by the @command{gawk}
-developers. The rest of this @value{CHAPTER} explains the design
-decisions behind the API, the facilities it provides and how to use
-them, and presents a small sample extension. In addition, it documents
-the sample extensions included in the @command{gawk} distribution,
-and describes the @code{gawkextlib} project.
-
-@node Plugin License
-@section Extension Licensing
-
-Every dynamic extension should define the global symbol
-@code{plugin_is_GPL_compatible} to assert that it has been licensed under
-a GPL-compatible license. If this symbol does not exist, @command{gawk}
-emits a fatal error and exits when it tries to load your extension.
-
-The declared type of the symbol should be @code{int}. It does not need
-to be in any allocated section, though. The code merely asserts that
-the symbol exists in the global scope. Something like this is enough:
-
-@example
-int plugin_is_GPL_compatible;
-@end example
-
-@node Extension Design
-@section Extension API Design
-
-The first version of extensions for @command{gawk} was developed in
-the mid-1990s and released with @command{gawk} 3.1 in the late 1990s.
-The basic mechanisms and design remained unchanged for close to 15 years,
-until 2012.
-
-The old extension mechanism used data types and functions from
-@command{gawk} itself, with a ``clever hack'' to install extension
-functions.
-
-@command{gawk} included some sample extensions, of which a few were
-really useful. However, it was clear from the outset that the extension
-mechanism was bolted onto the side and was not really thought out.
-
-@menu
-* Old Extension Problems:: Problems with the old mechanism.
-* Extension New Mechanism Goals:: Goals for the new mechanism.
-* Extension Other Design Decisions:: Some other design decisions.
-* Extension Mechanism Outline:: An outline of how it works.
-* Extension Future Growth:: Some room for future growth.
-@end menu
-
-@node Old Extension Problems
-@subsection Problems With The Old Mechanism
-
-The old extension mechanism had several problems:
-
-@itemize @bullet
-@item
-It depended heavily upon @command{gawk} internals. Any time the
-@code{NODE} structure@footnote{A critical central data structure
-inside @command{gawk}.} changed, an extension would have to be
-recompiled. Furthermore, to really write extensions required understanding
-something about @command{gawk}'s internal functions. There was some
-documentation in this @value{DOCUMENT}, but it was quite minimal.
-
-@item
-Being able to call into @command{gawk} from an extension required linker
-facilities that are common on Unix-derived systems but that did
-not work on Windows systems; users wanting extensions on Windows
-had to statically link them into @command{gawk}, even though Windows supports
-dynamic loading of shared objects.
-
-@item
-The API would change occasionally as @command{gawk} changed; no compatibility
-between versions was ever offered or planned for.
-@end itemize
-
-Despite the drawbacks, the @command{xgawk} project developers forked
-@command{gawk} and developed several significant extensions. They also
-enhanced @command{gawk}'s facilities relating to file inclusion and
-shared object access.
-
-A new API was desired for a long time, but only in 2012 did the
-@command{gawk} maintainer and the @command{xgawk} developers finally
-start working on it together. More information about the @command{xgawk}
-project is provided in @ref{gawkextlib}.
-
-@node Extension New Mechanism Goals
-@subsection Goals For A New Mechanism
-
-Some goals for the new API were:
-
-@itemize @bullet
-@item
-The API should be independent of @command{gawk} internals. Changes in
-@command{gawk} internals should not be visible to the writer of an
-extension function.
-
-@item
-The API should provide @emph{binary} compatibility across @command{gawk}
-releases as long as the API itself does not change.
-
-@item
-The API should enable extensions written in C to have roughly the
-same ``appearance'' to @command{awk}-level code as @command{awk}
-functions do. This means that extensions should have:
-
-@itemize @minus
-@item
-The ability to access function parameters.
-
-@item
-The ability to turn an undefined parameter into an array (call by reference).
-
-@item
-The ability to create, access and update global variables.
-
-@item
-Easy access to all the elements of an array at once (``array flattening'')
-in order to loop over all the elements in an easy fashion for C code.
-
-@item
-The ability to create arrays (including @command{gawk}'s true
-multi-dimensional arrays).
-@end itemize
-@end itemize
-
-Some additional important goals were:
-
-@itemize @bullet
-@item
-The API should use only features in ISO C 90, so that extensions
-can be written using the widest range of C and C++ compilers. The header
-should include the appropriate @samp{#ifdef __cplusplus} and @samp{extern "C"}
-magic so that a C++ compiler could be used. (If using C++, the runtime
-system has to be smart enough to call any constructors and destructors,
-as @command{gawk} is a C program. As of this writing, this has not been
-tested.)
-
-@item
-The API mechanism should not require access to @command{gawk}'s
-symbols@footnote{The @dfn{symbols} are the variables and functions
-defined inside @command{gawk}. Access to these symbols by code
-external to @command{gawk} loaded dynamically at runtime is
-problematic on Windows.} by the compile-time or dynamic linker,
-in order to enable creation of extensions that also work on Windows.
-@end itemize
-
-During development, it became clear that there were other features
-that should be available to extensions, which were also subsequently
-provided:
-
-@itemize @bullet
-@item
-Extensions should have the ability to hook into @command{gawk}'s
-I/O redirection mechanism. In particular, the @command{xgawk}
-developers provided a so-called ``open hook'' to take over reading
-records. During development, this was generalized to allow
-extensions to hook into input processing, output processing, and
-two-way I/O.
-
-@item
-An extension should be able to provide a ``call back'' function
-to perform clean up actions when @command{gawk} exits.
-
-@item
-An extension should be able to provide a version string so that
-@command{gawk}'s @option{--version} option can provide information
-about extensions as well.
-@end itemize
-
-@node Extension Other Design Decisions
-@subsection Other Design Decisions
-
-As an ``arbitrary'' design decision, extensions can read the values of
-built-in variables and arrays (such as @code{ARGV} and @code{FS}), but cannot
-change them, with the exception of @code{PROCINFO}.
-
-The reason for this is to prevent an extension function from affecting
-the flow of an @command{awk} program outside its control. While a real
-@command{awk} function can do what it likes, that is at the discretion
-of the programmer. An extension function should provide a service or
-make a C API available for use within @command{awk}, and not mess with
-@code{FS} or @code{ARGC} and @code{ARGV}.
-
-In addition, it becomes easy to start down a slippery slope. How
-much access to @command{gawk} facilities do extensions need?
-Do they need @code{getline}? What about calling @code{gsub()} or
-compiling regular expressions? What about calling into @command{awk}
-functions? (@emph{That} would be messy.)
-
-In order to avoid these issues, the @command{gawk} developers chose
-to start with the simplest, most basic features that are still truly useful.
-
-Another decision is that although @command{gawk} provides nice things like
-MPFR, and arrays indexed internally by integers, these features are not
-being brought out to the API in order to keep things simple and close to
-traditional @command{awk} semantics. (In fact, arrays indexed internally
-by integers are so transparent that they aren't even documented!)
-
-With time, the API will undoubtedly evolve; the @command{gawk} developers
-expect this to be driven by user needs. For now, the current API seems
-to provide a minimal yet powerful set of features for creating extensions.
-
-@node Extension Mechanism Outline
-@subsection At A High Level How It Works
-
-The requirement to avoid access to @command{gawk}'s symbols is, at first
-glance, a difficult one to meet.
-
-One design, apparently used by Perl and Ruby and maybe others, would
-be to make the mainline @command{gawk} code into a library, with the
-@command{gawk} utility a small C @code{main()} function linked against
-the library.
-
-This seemed like the tail wagging the dog, complicating build and
-installation and making a simple copy of the @command{gawk} executable
-from one system to another (or one place to another on the same
-system!) into a chancy operation.
-
-Pat Rankin suggested the solution that was adopted. Communication between
-@command{gawk} and an extension is two-way. First, when an extension
-is loaded, it is passed a pointer to a @code{struct} whose fields are
-function pointers.
-@iftex
-This is shown in @ref{load-extension}.
-@end iftex
-
-@float Figure,load-extension
-@caption{Loading the extension}
-@ifinfo
-@center @image{api-figure1, , , Loading the extension, txt}
-@end ifinfo
-@ifhtml
-@center @image{api-figure1, , , Loading the extension, png}
-@end ifhtml
-@ifnotinfo
-@ifnothtml
-@center @image{api-figure1, , , Loading the extension}
-@end ifnothtml
-@end ifnotinfo
-@end float
-
-The extension can call functions inside @command{gawk} through these
-function pointers, at runtime, without needing (link-time) access
-to @command{gawk}'s symbols. One of these function pointers is to a
-function for ``registering'' new built-in functions.
-@iftex
-This is shown in @ref{load-new-function}.
-@end iftex
-
-@float Figure,load-new-function
-@caption{Loading the new function}
-@ifinfo
-@center @image{api-figure2, , , Loading the new function, txt}
-@end ifinfo
-@ifhtml
-@center @image{api-figure2, , , Loading the new function, png}
-@end ifhtml
-@ifnotinfo
-@ifnothtml
-@center @image{api-figure2, , , Loading the new function}
-@end ifnothtml
-@end ifnotinfo
-@end float
-
-In the other direction, the extension registers its new functions
-with @command{gawk} by passing function pointers to the functions that
-provide the new feature (@code{do_chdir()}, for example). @command{gawk}
-associates the function pointer with a name and can then call it, using a
-defined calling convention.
-@iftex
-This is shown in @ref{call-new-function}.
-@end iftex
-
-@float Figure,call-new-function
-@caption{Calling the new function}
-@ifinfo
-@center @image{api-figure3, , , Calling the new function, txt}
-@end ifinfo
-@ifhtml
-@center @image{api-figure3, , , Calling the new function, png}
-@end ifhtml
-@ifnotinfo
-@ifnothtml
-@center @image{api-figure3, , , Calling the new function}
-@end ifnothtml
-@end ifnotinfo
-@end float
-
-The @code{do_@var{xxx}()} function, in turn, then uses the function
-pointers in the API @code{struct} to do its work, such as updating
-variables or arrays, printing messages, setting @code{ERRNO}, and so on.
-
-Convenience macros in the @file{gawkapi.h} header file make calling
-through the function pointers look like regular function calls so that
-extension code is quite readable and understandable.
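
The two-way arrangement described above can be sketched in plain C. This is a minimal, self-contained illustration under stated assumptions, not the real gawkapi.h: every name in it (host_api_t, ext_load, add_ext_func, warning, host_call, do_twice) is hypothetical, and the calling convention is reduced to a function that takes and returns a double. It shows the host handing over a struct of function pointers, the extension registering a function through that struct, and convenience macros that hide the pointer and the extension id.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real gawkapi.h declarations. */
typedef void *ext_id_t;                 /* opaque id handed to the extension */
typedef double (*ext_func_t)(double);   /* simplified calling convention */

/* The host passes the extension a struct whose fields are function pointers. */
typedef struct host_api {
    int (*api_add_ext_func)(ext_id_t id, const char *name, ext_func_t fn);
    void (*api_warning)(ext_id_t id, const char *msg);
} host_api_t;

/* ---- host side ---- */
#define MAX_FUNCS 8
static struct { const char *name; ext_func_t fn; } registry[MAX_FUNCS];
static size_t registry_len;
static int warnings_seen;

/* Associate a name with a function pointer supplied by the extension. */
static int host_add_ext_func(ext_id_t id, const char *name, ext_func_t fn)
{
    (void) id;
    if (registry_len == MAX_FUNCS)
        return 0;
    registry[registry_len].name = name;
    registry[registry_len].fn = fn;
    registry_len++;
    return 1;
}

static void host_warning(ext_id_t id, const char *msg)
{
    (void) id;
    (void) msg;
    warnings_seen++;
}

int host_warning_count(void)
{
    return warnings_seen;
}

static const host_api_t host_api = { host_add_ext_func, host_warning };

/* Look up a registered function by name and call it. */
int host_call(const char *name, double arg, double *result)
{
    size_t i;

    for (i = 0; i < registry_len; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            *result = registry[i].fn(arg);
            return 1;
        }
    }
    return 0;
}

/* ---- extension side ---- */
static const host_api_t *api;   /* saved at load time */
static ext_id_t ext_id;         /* passed back on every call */

/* Convenience macros: calls through the table read like ordinary calls. */
#define add_ext_func(name, fn)  (api->api_add_ext_func(ext_id, (name), (fn)))
#define warning(msg)            (api->api_warning(ext_id, (msg)))

static double do_twice(double x)
{
    return 2.0 * x;
}

/* Entry point the host calls when it loads the extension. */
int ext_load(const host_api_t *api_p, ext_id_t id)
{
    api = api_p;
    ext_id = id;
    if (!add_ext_func("twice", do_twice)) {
        warning("could not register twice");
        return 0;
    }
    return 1;
}
```

The extension code never refers to a host symbol by name. Everything it needs arrives through the pointer passed to ext_load(), which is the property that removes the need for link-time access to the host's symbols.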
-
-Although all of this sounds medium complicated, the result is that
-extension code is quite clean and straightforward. This can be seen in
-the sample extensions @file{filefuncs.c} (@pxref{Extension Example})
-and also the @file{testext.c} code for testing the APIs.
-
-Some other bits and pieces:
-
-@itemize @bullet
-@item
-The API provides access to @command{gawk}'s @code{do_@var{xxx}} values,
-reflecting command line options, like @code{do_lint}, @code{do_profiling}
-and so on (@pxref{Extension API Variables}).
-These are informational: an extension cannot affect these
-inside @command{gawk}. In addition, attempting to assign to them
-produces a compile-time error.
-
-@item
-The API also provides major and minor version numbers, so that an
-extension can check if the @command{gawk} it is loaded with supports the
-facilities it was compiled with. (Version mismatches ``shouldn't''
-happen, but we all know how @emph{that} goes.)
-@xref{Extension Versioning}, for details.
-@end itemize
-
-@node Extension Future Growth
-@subsection Room For Future Growth
-
-The API provides room for future growth, in two ways.
-
-An ``extension id'' is passed into the extension when it's loaded. This
-extension id is then passed back to @command{gawk} with each function
-call. This allows @command{gawk} to identify the extension calling into it,
-should it need to know.
-
-A ``name space'' is passed into @command{gawk} when an extension function
-is registered. This provides for a future mechanism for grouping
-extension functions and possibly avoiding name conflicts.
-
-Of course, as of this writing, no decisions have been made with respect
-to any of the above.
-
-@node Extension API Description
-@section API Description
-
-This (rather large) @value{SECTION} describes the API in detail.
-
-@menu
-* Extension API Functions Introduction:: Introduction to the API functions.
-* General Data Types:: The data types.
-* Requesting Values:: How to get a value.
-* Constructor Functions:: Functions for creating values.
-* Registration Functions:: Functions to register things with
- @command{gawk}.
-* Printing Messages:: Functions for printing messages.
-* Updating @code{ERRNO}:: Functions for updating @code{ERRNO}.
-* Accessing Parameters:: Functions for accessing parameters.
-* Symbol Table Access:: Functions for accessing global
- variables.
-* Array Manipulation:: Functions for working with arrays.
-* Extension API Variables:: Variables provided by the API.
-* Extension API Boilerplate:: Boilerplate code for using the API.
-* Finding Extensions:: How @command{gawk} finds compiled
- extensions.
-@end menu
-
-@node Extension API Functions Introduction
-@subsection Introduction
-
-Access to facilities within @command{gawk} is made available
-by calling through function pointers passed into your extension.
-
-API function pointers are provided for the following kinds of operations:
-
-@itemize @bullet
-@item
-Registration functions. You may register:
-@itemize @minus
-@item
-extension functions,
-@item
-exit callbacks,
-@item
-a version string,
-@item
-input parsers,
-@item
-output wrappers,
-@item
-and two-way processors.
-@end itemize
-All of these are discussed in detail, later in this @value{CHAPTER}.
-
-@item
-Printing fatal, warning, and ``lint'' warning messages.
-
-@item
-Updating @code{ERRNO}, or unsetting it.
-
-@item
-Accessing parameters, including converting an undefined parameter into
-an array.
-
-@item
-Symbol table access: retrieving a global variable, creating one,
-or changing one. This also includes the ability to create a scalar
-variable that will be @emph{constant} within @command{awk} code.
-
-@item
-Creating and releasing cached values; this provides an
-efficient way to use values for multiple variables and
-can be a big performance win.
-
-@item
-Manipulating arrays:
-@itemize @minus
-@item
-Retrieving, adding, deleting, and modifying elements
-@item
-Getting the count of elements in an array
-@item
-Creating a new array
-@item
-Clearing an array
-@item
-Flattening an array for easy C style looping over all its indices and elements
-@end itemize
-@end itemize
-
-Some points about using the API:
-
-@itemize @bullet
-@item
-You must include @code{<sys/types.h>} and @code{<sys/stat.h>} before including
-the @file{gawkapi.h} header file. In addition, you must include either
-@code{<stddef.h>} or @code{<stdlib.h>} to get the definition of @code{size_t}.
-If you wish to use the boilerplate @code{dl_load_func()} macro, you will
-need to include @code{<stdio.h>} as well.
-Finally, to pass reasonable integer values for @code{ERRNO}, you
-will need to include @code{<errno.h>}.
-
-@item
-Although the API only uses ISO C 90 features, there is an exception; the
-``constructor'' functions use the @code{inline} keyword. If your compiler
-does not support this keyword, you should either place
-@samp{-Dinline=''} on your command line, or use the GNU Autotools and include a
-@file{config.h} file in your extensions.
-
-@item
-All pointers filled in by @command{gawk} are to memory
-managed by @command{gawk} and should be treated by the extension as
-read-only. Memory for @emph{all} strings passed into @command{gawk}
-from the extension @emph{must} come from @code{malloc()} and is managed
-by @command{gawk} from then on.
-
-@item
-The API defines several simple structs that map values as seen
-from @command{awk}. A value can be a @code{double}, a string, or an
-array (as in multidimensional arrays, or when creating a new array).
-Strings maintain both pointer and length since embedded @code{NUL}
-characters are allowed.
-
-By intent, strings are maintained using the current multibyte encoding (as
-defined by @env{LC_@var{xxx}} environment variables) and not using wide
-characters. This matches how @command{gawk} stores strings internally
-and also how characters are likely to be input and output from files.
-
-@item
-When retrieving a value (such as a parameter or that of a global variable
-or array element), the extension requests a specific type (number, string,
-scalar, value cookie, array, or ``undefined''). When the request is
-``undefined,'' the returned value will have the real underlying type.
-
-However, if the request and actual type don't match, the access function
-returns ``false'' and fills in the type of the actual value that is there,
-so that the extension can, e.g., print an error message
-(``scalar passed where array expected'').
-
-@c This is documented in the header file and needs some expanding upon.
-@c The table there should be presented here
-@end itemize
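
The request rule in the last item above can be illustrated with a small mock. This is a sketch only: value_t, valtype_t, request_value() and check_is_array() are hypothetical names invented for the illustration, the value struct is simplified, and the mock does not model any conversions the real API may perform. It shows the two behaviors the text describes: a request for ``undefined'' returns whatever is really there, and a mismatched request returns false while still reporting the actual type.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the API's value types. */
typedef enum { V_UNDEFINED, V_NUMBER, V_STRING, V_ARRAY } valtype_t;

typedef struct {
    valtype_t val_type;
    union {
        double d;
        struct { const char *str; size_t len; } s;
    } u;
} value_t;

/*
 * Mock of a typed request.  Returns 1 and copies the value when the
 * wanted type matches the actual type, or when the caller asks for
 * V_UNDEFINED.  Otherwise returns 0 and fills in only the actual type.
 */
int request_value(const value_t *actual, valtype_t wanted, value_t *result)
{
    if (wanted == V_UNDEFINED || wanted == actual->val_type) {
        *result = *actual;
        return 1;
    }
    result->val_type = actual->val_type;
    return 0;
}

/*
 * Example caller: insist on an array.  Returns NULL on success, or a
 * message built from the type that was actually found.
 */
const char *check_is_array(const value_t *param)
{
    value_t got;

    if (request_value(param, V_ARRAY, &got))
        return NULL;
    if (got.val_type == V_NUMBER || got.val_type == V_STRING)
        return "scalar passed where array expected";
    return "undefined value passed where array expected";
}
```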
-
-While you may call the API functions by using the function pointers
-directly, the interface is not so pretty. To make extension code look
-more like regular code, the @file{gawkapi.h} header file defines a number
-of macros which you should use in your code. This @value{SECTION} presents
-the macros as if they were functions.
-
-@node General Data Types
-@subsection General Purpose Data Types
-
-@quotation
-@i{I have a true love/hate relationship with unions.}@*
-Arnold Robbins
-
-@i{That's the thing about unions: the compiler will arrange things so they
-can accommodate both love and hate.}@*
-Chet Ramey
-@end quotation
-
-The extension API defines a number of simple types and structures for general
-purpose use. Additional, more specialized data structures are introduced
-in subsequent @value{SECTION}s, together with the functions that use them.
-
-@table @code
-@item typedef void *awk_ext_id_t;
-A value of this type is received from @command{gawk} when an extension is loaded.
-That value must then be passed back to @command{gawk} as the first parameter of
-each API function.
-
-@item #define awk_const @dots{}
-This macro expands to @samp{const} when compiling an extension,
-and to nothing when compiling @command{gawk} itself. This makes
-certain fields in the API data structures unwritable from extension code,
-while allowing @command{gawk} to use them as it needs to.
-
-@item typedef int awk_bool_t;
-A simple boolean type. At the moment, the API does not define special
-``true'' and ``false'' values, although perhaps it should.
-
-@item typedef struct @{
-@itemx @ @ @ @ char *str;@ @ @ @ @ @ /* data */
-@itemx @ @ @ @ size_t len;@ @ @ @ @ /* length thereof, in chars */
-@itemx @} awk_string_t;
-This represents a mutable string. @command{gawk}
-owns the memory pointed to if it supplied
-the value. Otherwise, it takes ownership of the memory pointed to.
-@strong{Such memory must come from @code{malloc()}!}
-
-As mentioned earlier, strings are maintained using the current
-multibyte encoding.
-
-@item typedef enum @{
-@itemx @ @ @ @ AWK_UNDEFINED,
-@itemx @ @ @ @ AWK_NUMBER,
-@itemx @ @ @ @ AWK_STRING,
-@itemx @ @ @ @ AWK_ARRAY,
-@itemx @ @ @ @ AWK_SCALAR,@ @ @ @ @ @ @ @ @ /* opaque access to a variable */
-@itemx @ @ @ @ AWK_VALUE_COOKIE@ @ @ /* for updating a previously created value */
-@itemx @} awk_valtype_t;
-This @code{enum} indicates the type of a value.
-It is used in the following @code{struct}.
-
-@item typedef struct @{
-@itemx @ @ @ @ awk_valtype_t val_type;
-@itemx @ @ @ @ union @{
-@itemx @ @ @ @ @ @ @ @ awk_string_t@ @ @ @ @ @ @ s;
-@itemx @ @ @ @ @ @ @ @ double@ @ @ @ @ @ @ @ @ @ @ @ @ d;
-@itemx @ @ @ @ @ @ @ @ awk_array_t@ @ @ @ @ @ @ @ a;
-@itemx @ @ @ @ @ @ @ @ awk_scalar_t@ @ @ @ @ @ @ scl;
-@itemx @ @ @ @ @ @ @ @ awk_value_cookie_t@ vc;
-@itemx @ @ @ @ @} u;
-@itemx @} awk_value_t;
-An ``@command{awk} value.''
-The @code{val_type} member indicates what kind of value the
-@code{union} holds, and each member is of the appropriate type.
-
-@item #define str_value@ @ @ @ @ @ u.s
-@itemx #define num_value@ @ @ @ @ @ u.d
-@itemx #define array_cookie@ @ @ u.a
-@itemx #define scalar_cookie@ @ u.scl
-@itemx #define value_cookie@ @ @ u.vc
-These macros make accessing the fields of the @code{awk_value_t} more
-readable.
-
-@item typedef void *awk_scalar_t;
-Scalars can be represented as an opaque type. These values are obtained from
-@command{gawk} and then passed back into it. This is discussed in a general fashion below,
-and in more detail in @ref{Symbol table by cookie}.
-
-@item typedef void *awk_value_cookie_t;
-A ``value cookie'' is an opaque type representing a cached value.
-This is also discussed in a general fashion below,
-and in more detail in @ref{Cached values}.
-
-@end table
-
-Scalar values in @command{awk} are either numbers or strings. The
-@code{awk_value_t} struct represents values. The @code{val_type} member
-indicates what is in the @code{union}.
-
-Representing numbers is easy---the API uses a C @code{double}. Strings
-require more work. Since @command{gawk} allows embedded @code{NUL} bytes
-in string values, a string must be represented as a pair containing a
-data-pointer and length. This is the @code{awk_string_t} type.
-
-Identifiers (i.e., the names of global variables) can be associated
-with either scalar values or with arrays. In addition, @command{gawk}
-provides true arrays of arrays, where any given array element can
-itself be an array. Discussion of arrays is delayed until
-@ref{Array Manipulation}.
-
-The various macros listed earlier make it easier to use the elements
-of the @code{union} as if they were fields in a @code{struct}; this
-is a common coding practice in C. Such code is easier to write and to
-read; however, it remains @emph{your} responsibility to make sure that
-the @code{val_type} member correctly reflects the type of the value in
-the @code{awk_value_t}.
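-
-As a minimal sketch of that responsibility, the following code keeps
-@code{val_type} and the @code{union} in step. The type declarations are
-local stand-ins copied from the table above (a real extension gets them
-from @file{gawkapi.h}), the @code{void *} definition of @code{awk_array_t}
-is an assumption, and the three helper functions are hypothetical names
-used only for illustration; they are not part of the API.
-
```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the declarations in gawkapi.h. */
typedef struct {
    char *str;      /* data */
    size_t len;     /* length thereof, in chars */
} awk_string_t;

typedef enum {
    AWK_UNDEFINED, AWK_NUMBER, AWK_STRING,
    AWK_ARRAY, AWK_SCALAR, AWK_VALUE_COOKIE
} awk_valtype_t;

typedef void *awk_array_t;          /* assumed to be opaque */
typedef void *awk_scalar_t;
typedef void *awk_value_cookie_t;

typedef struct {
    awk_valtype_t val_type;
    union {
        awk_string_t s;
        double d;
        awk_array_t a;
        awk_scalar_t scl;
        awk_value_cookie_t vc;
    } u;
} awk_value_t;

#define str_value u.s
#define num_value u.d

/* Hypothetical helper: store a number and tag the value to match. */
static awk_value_t *set_number(awk_value_t *v, double num)
{
    assert(v != NULL);
    v->val_type = AWK_NUMBER;
    v->num_value = num;
    return v;
}

/* Hypothetical helper: point at string data and tag the value to match.
 * The explicit length lets the data contain embedded NUL bytes. */
static awk_value_t *set_string(awk_value_t *v, char *data, size_t len)
{
    assert(v != NULL);
    v->val_type = AWK_STRING;
    v->str_value.str = data;
    v->str_value.len = len;
    return v;
}

/* Hypothetical helper: read the number only if the tag says it is one.
 * Returns 1 on success, 0 if the value holds something else. */
static int get_number(const awk_value_t *v, double *out)
{
    assert(v != NULL && out != NULL);
    if (v->val_type != AWK_NUMBER)
        return 0;
    *out = v->num_value;
    return 1;
}
```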
-
-Conceptually, the first three members of the @code{union} (string, number,
-and array) are all that is needed for working with @command{awk} values.
-However, since the API provides routines for accessing and changing
-the value of global scalar variables only by using the variable's name,
-there is a performance penalty: @command{gawk} must find the variable
-each time it is accessed and changed. This turns out to be a real issue,
-not just a theoretical one.
-
-Thus, if you know that your extension will spend considerable time
-reading and/or changing the value of one or more scalar variables, you
-can obtain a @dfn{scalar cookie}@footnote{See
-@uref{http://catb.org/jargon/html/C/cookie.html, the ``cookie'' entry in the Jargon file} for a
-definition of @dfn{cookie}, and @uref{http://catb.org/jargon/html/M/magic-cookie.html,
-the ``magic cookie'' entry in the Jargon file} for a nice example. See
-also the entry for ``Cookie'' in the @ref{Glossary}.}
-object for that variable, and then use
-the cookie for getting the variable's value or for changing the variable's
-value.
-This is the @code{awk_scalar_t} type and @code{scalar_cookie} macro.
-Given a scalar cookie, @command{gawk} can directly retrieve or
-modify the value, as required, without having to first find it.
-
-The @code{awk_value_cookie_t} type and @code{value_cookie} macro are similar.
-If you know that you wish to
-use the same numeric or string @emph{value} for one or more variables,
-you can create the value once, retaining a @dfn{value cookie} for it,
-and then pass in that value cookie whenever you wish to set the value of a
-variable. This saves both storage space within the running @command{gawk}
-process as well as the time needed to create the value.
-
-@node Requesting Values
-@subsection Requesting Values
-
-All of the functions that return values from @command{gawk}
-work in the same way. You pass in an @code{awk_valtype_t} value
-to indicate what kind of value you expect. If the actual value
-matches what you requested, the function returns true and fills
-in the @code{awk_value_t} result.
-Otherwise, the function returns false, and the @code{val_type}
-member indicates the type of the actual value. You may then
-print an error message, or reissue the request for the actual
-value type, as appropriate. This behavior is summarized in
-@ref{table-value-types-returned}.
-
-@ifnotplaintext
-@float Table,table-value-types-returned
-@caption{Value Types Returned}
-@multitable @columnfractions .50 .50
-@headitem @tab Type of Actual Value:
-@end multitable
-@multitable @columnfractions .166 .166 .198 .15 .15 .166
-@headitem @tab @tab String @tab Number @tab Array @tab Undefined
-@item @tab @b{String} @tab String @tab String @tab false @tab false
-@item @tab @b{Number} @tab Number if can be converted, else false @tab Number @tab false @tab false
-@item @b{Type} @tab @b{Array} @tab false @tab false @tab Array @tab false
-@item @b{Requested:} @tab @b{Scalar} @tab Scalar @tab Scalar @tab false @tab false
-@item @tab @b{Undefined} @tab String @tab Number @tab Array @tab Undefined
-@item @tab @b{Value Cookie} @tab false @tab false @tab false @tab false
-@end multitable
-@end float
-@end ifnotplaintext
-@ifplaintext
-@float Table,table-value-types-returned
-@caption{Value Types Returned}
-@example
- +-------------------------------------------------+
- | Type of Actual Value: |
- +------------+------------+-----------+-----------+
- | String | Number | Array | Undefined |
-+-----------+-----------+------------+------------+-----------+-----------+
-| | String | String | String | false | false |
-| |-----------+------------+------------+-----------+-----------+
-| | Number | Number if | Number | false | false |
-| | | can be | | | |
-| | | converted, | | | |
-| | | else false | | | |
-| |-----------+------------+------------+-----------+-----------+
-| Type | Array | false | false | Array | false |
-| Requested |-----------+------------+------------+-----------+-----------+
-| | Scalar | Scalar | Scalar | false | false |
-| |-----------+------------+------------+-----------+-----------+
-| | Undefined | String | Number | Array | Undefined |
-| |-----------+------------+------------+-----------+-----------+
-| | Value | false | false | false | false |
-| | Cookie | | | | |
-+-----------+-----------+------------+------------+-----------+-----------+
-@end example
-@end float
-@end ifplaintext
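-
-One way to read the table is as a small decision function. The sketch
-below is a hypothetical model of the table only, not @command{gawk}'s own
-code: @code{request_outcome()} and its @code{string_is_numeric} flag are
-invented for illustration, and the function treats a request for
-``Undefined'' as always succeeding with the actual type, which is how the
-table's ``Undefined'' row is interpreted here.
-
```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    AWK_UNDEFINED, AWK_NUMBER, AWK_STRING,
    AWK_ARRAY, AWK_SCALAR, AWK_VALUE_COOKIE
} awk_valtype_t;

/*
 * Hypothetical model of the "Value Types Returned" table.
 * `actual` is AWK_STRING, AWK_NUMBER, AWK_ARRAY or AWK_UNDEFINED.
 * `string_is_numeric` matters only when a number is requested and the
 * actual value is a string.
 * Returns 1 and stores the delivered type in *delivered when the request
 * succeeds; returns 0 and stores the actual type otherwise.
 */
static int request_outcome(awk_valtype_t wanted, awk_valtype_t actual,
                           int string_is_numeric, awk_valtype_t *delivered)
{
    assert(delivered != NULL);
    *delivered = actual;    /* on failure the caller learns the actual type */

    switch (wanted) {
    case AWK_STRING:
        if (actual == AWK_STRING || actual == AWK_NUMBER) {
            *delivered = AWK_STRING;
            return 1;
        }
        return 0;
    case AWK_NUMBER:
        if (actual == AWK_NUMBER
            || (actual == AWK_STRING && string_is_numeric)) {
            *delivered = AWK_NUMBER;
            return 1;
        }
        return 0;
    case AWK_ARRAY:
        return actual == AWK_ARRAY;
    case AWK_SCALAR:
        if (actual == AWK_STRING || actual == AWK_NUMBER) {
            *delivered = AWK_SCALAR;
            return 1;
        }
        return 0;
    case AWK_UNDEFINED:
        return 1;           /* whatever is there is delivered as-is */
    case AWK_VALUE_COOKIE:
    default:
        return 0;           /* a value cookie can never be requested */
    }
}
```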
-
-@node Constructor Functions
-@subsection Constructor Functions and Convenience Macros
-
-The API provides a number of @dfn{constructor} functions for creating
-string and numeric values, as well as a number of convenience macros.
-This @value{SUBSECTION} presents them all as function prototypes, in
-the way that extension code would use them.
-
-@table @code
-@item static inline awk_value_t *
-@itemx make_const_string(const char *string, size_t length, awk_value_t *result)
-This function creates a string value in the @code{awk_value_t} variable
-pointed to by @code{result}. It expects @code{string} to be a C string constant
-(or other string data), and automatically creates a @emph{copy} of the data
-for storage in @code{result}. It returns @code{result}.
-
-@item static inline awk_value_t *
-@itemx make_malloced_string(const char *string, size_t length, awk_value_t *result)
-This function creates a string value in the @code{awk_value_t} variable
-pointed to by @code{result}. It expects @code{string} to be a @samp{char *}
-value pointing to data previously obtained from @code{malloc()}. The idea here
-is that the data is passed directly to @command{gawk}, which assumes
-responsibility for it. It returns @code{result}.
-
-@item static inline awk_value_t *
-@itemx make_null_string(awk_value_t *result)
-This specialized function creates a null string (the ``undefined'' value)
-in the @code{awk_value_t} variable pointed to by @code{result}.
-It returns @code{result}.
-
-@item static inline awk_value_t *
-@itemx make_number(double num, awk_value_t *result)
-This function simply creates a numeric value in the @code{awk_value_t} variable
-pointed to by @code{result}. It returns @code{result}.
-@end table
-
-Two convenience macros may be used for allocating storage from @code{malloc()}
-and @code{realloc()}. If the allocation fails, they cause @command{gawk} to
-exit with a fatal error message. They should be used as if they were
-procedure calls that do not return a value.
-
-@table @code
-@item emalloc(pointer, type, size, message)
-The arguments to this macro are as follows:
-@c nested table
-@table @code
-@item pointer
-The pointer variable to point at the allocated storage.
-
-@item type
-The type of the pointer variable, used to create a cast for the call to @code{malloc()}.
-
-@item size
-The total number of bytes to be allocated.
-
-@item message
-A message to be prefixed to the fatal error message. Typically this is the name
-of the function using the macro.
-@end table
-
-@noindent
-For example, you might allocate a string value like so:
-
-@example
-awk_value_t result;
-char *message;
-const char greet[] = "Don't Panic!";
-
-emalloc(message, char *, sizeof(greet), "myfunc");
-strcpy(message, greet);
-make_malloced_string(message, strlen(message), & result);
-@end example
-
-@item erealloc(pointer, type, size, message)
-This is like @code{emalloc()}, but it calls @code{realloc()},
-instead of @code{malloc()}.
-The arguments are the same as for the @code{emalloc()} macro.
-@end table
-
-@node Registration Functions
-@subsection Registration Functions
-
-This @value{SECTION} describes the API functions for
-registering parts of your extension with @command{gawk}.
-
-@menu
-* Extension Functions:: Registering extension functions.
-* Exit Callback Functions:: Registering an exit callback.
-* Extension Version String:: Registering a version string.
-* Input Parsers:: Registering an input parser.
-* Output Wrappers:: Registering an output wrapper.
-* Two-way processors:: Registering a two-way processor.
-@end menu
-
-@node Extension Functions
-@subsubsection Registering An Extension Function
-
-Extension functions are described by the following record:
-
-@example
-typedef struct @{
-@ @ @ @ const char *name;
-@ @ @ @ awk_value_t *(*function)(int num_actual_args, awk_value_t *result);
-@ @ @ @ size_t num_expected_args;
-@} awk_ext_func_t;
-@end example
-
-The fields are:
-
-@table @code
-@item const char *name;
-The name of the new function.
-@command{awk} level code calls the function by this name.
-This is a regular C string.
-
-@item awk_value_t *(*function)(int num_actual_args, awk_value_t *result);
-This is a pointer to the C function that provides the desired
-functionality.
-The function must fill in the result with either a number
-or a string. @command{awk} takes ownership of any string memory.
-As mentioned earlier, string memory @strong{must} come from @code{malloc()}.
-
-The function must return the value of @code{result}.
-This is for the convenience of the calling code inside @command{gawk}.
-
-@item size_t num_expected_args;
-This is the number of arguments the function expects to receive.
-Each extension function may decide what to do if the number of
-arguments isn't what it expected. Following @command{awk} functions, it
-is likely OK to ignore extra arguments.
-@end table
-
-Once you have a record representing your extension function, you register
-it with @command{gawk} using this API function:
-
-@table @code
-@item awk_bool_t add_ext_func(const char *namespace, const awk_ext_func_t *func);
-This function returns true upon success, false otherwise.
-The @code{namespace} parameter is currently not used; you should pass in an
-empty string (@code{""}). The @code{func} pointer is the address of a
-@code{struct} representing your function, as just described.
-@end table
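-
-The following sketch shows what such a record might look like for a
-trivial function. The types are local, reduced stand-ins for the ones in
-@file{gawkapi.h}, and @code{do_argcount()} and the @command{awk}-level
-name @code{argcount} are hypothetical. The call to @code{add_ext_func()}
-is omitted because it needs a running @command{gawk}.
-
```c
#include <assert.h>
#include <stddef.h>

/* Reduced local stand-ins for the declarations in gawkapi.h. */
typedef enum {
    AWK_UNDEFINED, AWK_NUMBER, AWK_STRING,
    AWK_ARRAY, AWK_SCALAR, AWK_VALUE_COOKIE
} awk_valtype_t;

typedef struct {
    awk_valtype_t val_type;
    union {
        double d;           /* only the member this sketch needs */
    } u;
} awk_value_t;

#define num_value u.d

typedef struct {
    const char *name;
    awk_value_t *(*function)(int num_actual_args, awk_value_t *result);
    size_t num_expected_args;
} awk_ext_func_t;

/* Hypothetical extension function: report how many arguments arrived. */
static awk_value_t *do_argcount(int num_actual_args, awk_value_t *result)
{
    assert(result != NULL);
    result->val_type = AWK_NUMBER;
    result->num_value = (double) num_actual_args;
    return result;          /* the function must return the value of result */
}

/* The record that would be handed to add_ext_func(). */
static const awk_ext_func_t argcount_func = {
    "argcount",             /* name used by awk-level code */
    do_argcount,            /* C function that does the work */
    0                       /* expects no arguments; extras are ignored */
};
```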
-
-@node Exit Callback Functions
-@subsubsection Registering An Exit Callback Function
-
-An @dfn{exit callback} function is a function that
-@command{gawk} calls before it exits.
-Such functions are useful if you have general ``clean up'' tasks
-that should be performed in your extension (such as closing database
-connections or other resource deallocations).
-You can register such
-a function with @command{gawk} using the following function.
-
-@table @code
-@item void awk_atexit(void (*funcp)(void *data, int exit_status),
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ void *arg0);
-The parameters are:
-@c nested table
-@table @code
-@item funcp
-A pointer to the function to be called before @command{gawk} exits. The @code{data}
-parameter will be the original value of @code{arg0}.
-The @code{exit_status} parameter is
-the exit status value that @command{gawk} will pass to the @code{exit()} system call.
-
-@item arg0
-A pointer to private data which @command{gawk} saves in order to pass to
-the function pointed to by @code{funcp}.
-@end table
-@end table
-
-Exit callback functions are called in Last-In-First-Out (LIFO) order---that is, in
-the reverse of the order in which they are registered with @command{gawk}.
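-
-The LIFO behavior can be illustrated with a toy registry. This is only
-a sketch of the ordering and of how @code{arg0} comes back as
-@code{data}; @code{toy_atexit()}, @code{toy_run_exit_callbacks()}, and
-@code{record_tag()} are hypothetical stand-ins and say nothing about how
-@command{gawk} actually stores its callbacks.
-
```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CALLBACKS 8

struct toy_callback {
    void (*funcp)(void *data, int exit_status);
    void *arg0;
};

static struct toy_callback toy_stack[TOY_MAX_CALLBACKS];
static size_t toy_count = 0;

/* Toy stand-in for awk_atexit(): remember funcp and arg0. */
static void toy_atexit(void (*funcp)(void *data, int exit_status), void *arg0)
{
    assert(funcp != NULL && toy_count < TOY_MAX_CALLBACKS);
    toy_stack[toy_count].funcp = funcp;
    toy_stack[toy_count].arg0 = arg0;
    toy_count++;
}

/* Run the callbacks most-recent-first, passing each its saved arg0. */
static void toy_run_exit_callbacks(int exit_status)
{
    while (toy_count > 0) {
        toy_count--;
        toy_stack[toy_count].funcp(toy_stack[toy_count].arg0, exit_status);
    }
}

/* Sample callback: append the tag character that `data` points at. */
static char toy_log[TOY_MAX_CALLBACKS + 1];
static size_t toy_log_len = 0;

static void record_tag(void *data, int exit_status)
{
    (void) exit_status;     /* unused in this sketch */
    if (toy_log_len < TOY_MAX_CALLBACKS) {
        toy_log[toy_log_len++] = *(const char *) data;
        toy_log[toy_log_len] = '\0';
    }
}
```
-
-Registering @code{record_tag()} three times with tags @samp{A}, @samp{B},
-and @samp{C}, and then running the callbacks, logs the tags in the order
-@samp{C}, @samp{B}, @samp{A}.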
-
-@node Extension Version String
-@subsubsection Registering An Extension Version String
-
-You can register a version string that indicates the name and
-version of your extension with @command{gawk}, as follows:
-
-@table @code
-@item void register_ext_version(const char *version);
-Register the string pointed to by @code{version} with @command{gawk}.
-@command{gawk} does @emph{not} copy the @code{version} string, so
-it should not be changed.
-@end table
-
-@command{gawk} prints all registered extension version strings when it
-is invoked with the @option{--version} option.
-
-@node Input Parsers
-@subsubsection Customized Input Parsers
-
-By default, @command{gawk} reads text files as its input. It uses the value
-of @code{RS} to find the end of the record, and then uses @code{FS}
-(or @code{FIELDWIDTHS}) to split it into fields (@pxref{Reading Files}).
-Additionally, it sets the value of @code{RT} (@pxref{Built-in Variables}).
-
-If you want, you can provide your own custom input parser. An input
-parser's job is to return a record to the @command{gawk} record processing
-code, along with indicators for the value and length of the data to be
-used for @code{RT}, if any.
-
-To provide an input parser, you must first provide two functions
-(where @var{XXX} is a prefix name for your extension):
-
-@table @code
-@item awk_bool_t @var{XXX}_can_take_file(const awk_input_buf_t *iobuf)
-This function examines the information available in @code{iobuf}
-(which we discuss shortly). Based on the information there, it
-decides if the input parser should be used for this file.
-If so, it should return true. Otherwise, it should return false.
-It should not change any state (variable values, etc.) within @command{gawk}.
-
-@item awk_bool_t @var{XXX}_take_control_of(awk_input_buf_t *iobuf)
-When @command{gawk} decides to hand control of the file over to the
-input parser, it calls this function. This function in turn must fill
-in certain fields in the @code{awk_input_buf_t} structure, and ensure
-that certain conditions are true. It should then return true. If an
-error of some kind occurs, it should not fill in any fields, and should
-return false; then @command{gawk} will not use the input parser.
-The details are presented shortly.
-@end table
-
-Your extension should package these functions inside an
-@code{awk_input_parser_t}, which looks like this:
-
-@example
-typedef struct input_parser @{
- const char *name; /* name of parser */
- awk_bool_t (*can_take_file)(const awk_input_buf_t *iobuf);
- awk_bool_t (*take_control_of)(awk_input_buf_t *iobuf);
- awk_const struct input_parser *awk_const next; /* for use by gawk */
-@} awk_input_parser_t;
-@end example
-
-The fields are:
-
-@table @code
-@item const char *name;
-The name of the input parser. This is a regular C string.
-
-@item awk_bool_t (*can_take_file)(const awk_input_buf_t *iobuf);
-A pointer to your @code{@var{XXX}_can_take_file()} function.
-
-@item awk_bool_t (*take_control_of)(awk_input_buf_t *iobuf);
-A pointer to your @code{@var{XXX}_take_control_of()} function.
-
-@item awk_const struct input_parser *awk_const next;
-This pointer is used by @command{gawk}.
-The extension cannot modify it.
-@end table
-
-The steps are as follows:
-
-@enumerate
-@item
-Create a @code{static awk_input_parser_t} variable and initialize it
-appropriately.
-
-@item
-When your extension is loaded, register your input parser with
-@command{gawk} using the @code{register_input_parser()} API function
-(described below).
-@end enumerate
-
-An @code{awk_input_buf_t} looks like this:
-
-@example
-typedef struct awk_input @{
- const char *name; /* filename */
- int fd; /* file descriptor */
-#define INVALID_HANDLE (-1)
- void *opaque; /* private data for input parsers */
- int (*get_record)(char **out, struct awk_input *iobuf,
- int *errcode, char **rt_start, size_t *rt_len);
- void (*close_func)(struct awk_input *iobuf);
- struct stat sbuf; /* stat buf */
-@} awk_input_buf_t;
-@end example
-
-The fields can be divided into two categories: those for use (initially,
-at least) by @code{@var{XXX}_can_take_file()}, and those for use by
-@code{@var{XXX}_take_control_of()}. The first group of fields and their uses
-are as follows:
-
-@table @code
-@item const char *name;
-The name of the file.
-
-@item int fd;
-A file descriptor for the file. If @command{gawk} was able to
-open the file, then @code{fd} will @emph{not} be equal to
-@code{INVALID_HANDLE}. Otherwise, it will.
-
-@item struct stat sbuf;
-If the file descriptor is valid, then @command{gawk} will have filled
-in this structure via a call to the @code{fstat()} system call.
-@end table
-
-The @code{@var{XXX}_can_take_file()} function should examine these
-fields and decide if the input parser should be used for the file.
-The decision can be made based upon @command{gawk} state (the value
-of a variable defined previously by the extension and set by
-@command{awk} code), the name of the
-file, whether or not the file descriptor is valid, the information
-in the @code{struct stat}, or any combination of the above.
-
-Once @code{@var{XXX}_can_take_file()} has returned true, and
-@command{gawk} has decided to use your input parser, it calls
-@code{@var{XXX}_take_control_of()}. That function then fills in at
-least the @code{get_record} field of the @code{awk_input_buf_t}. It must
-also ensure that @code{fd} is not set to @code{INVALID_HANDLE}. All of
-the fields that may be filled by @code{@var{XXX}_take_control_of()}
-are as follows:
-
-@table @code
-@item void *opaque;
-This is used to hold any state information needed by the input parser
-for this file. It is ``opaque'' to @command{gawk}. The input parser
-is not required to use this pointer.
-
-@item int@ (*get_record)(char@ **out,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ struct@ awk_input *iobuf,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ int *errcode,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ char **rt_start,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ size_t *rt_len);
-This function pointer should point to a function that creates the input
-records. Said function is the core of the input parser. Its behavior
-is described below.
-
-@item void (*close_func)(struct awk_input *iobuf);
-This function pointer should point to a function that does
-the ``tear down.'' It should release any resources allocated by
-@code{@var{XXX}_take_control_of()}. It may also close the file. If it
-does so, it should set the @code{fd} field to @code{INVALID_HANDLE}.
-
-If @code{fd} is still not @code{INVALID_HANDLE} after the call to this
-function, @command{gawk} calls the regular @code{close()} system call.
-
-Having a ``tear down'' function is optional. If your input parser does
-not need it, do not set this field. Then, @command{gawk} calls the
-regular @code{close()} system call on the file descriptor, so it should
-be valid.
-@end table
-
-The @code{@var{XXX}_get_record()} function does the work of creating
-input records. The parameters are as follows:
-
-@table @code
-@item char **out
-This is a pointer to a @code{char *} variable which is set to point
-to the record. @command{gawk} makes its own copy of the data, so
-the extension must manage this storage.
-
-@item struct awk_input *iobuf
-This is the @code{awk_input_buf_t} for the file. The fields should be
-used for reading data (@code{fd}) and for managing private state
-(@code{opaque}), if any.
-
-@item int *errcode
-If an error occurs, @code{*errcode} should be set to an appropriate
-code from @code{<errno.h>}.
-
-@item char **rt_start
-@itemx size_t *rt_len
-If the concept of a ``record terminator'' makes sense, then
-@code{*rt_start} should be set to point to the data to be used for
-@code{RT}, and @code{*rt_len} should be set to the length of the
-data. Otherwise, @code{*rt_len} should be set to zero.
-@command{gawk} makes its own copy of this data, so the
-extension must manage the storage.
-@end table
-
-The return value is the length of the buffer pointed to by
-@code{*out}, or @code{EOF} if end-of-file was reached or an
-error occurred.
-
-It is guaranteed that @code{errcode} is a valid pointer, so there is no
-need to test for a @code{NULL} value. @command{gawk} sets @code{*errcode}
-to zero, so there is no need to set it unless an error occurs.
-
-If an error does occur, the function should return @code{EOF} and set
-@code{*errcode} to a non-zero value. In that case, if @code{*errcode}
-does not equal @minus{}1, @command{gawk} automatically updates
-the @code{ERRNO} variable based on the value of @code{*errcode} (e.g.,
-setting @samp{*errcode = errno} should do the right thing).
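-
-The contract just described can be sketched with a record reader that
-splits an in-memory buffer at semicolons. Everything here is an
-assumption made for illustration: @code{struct awk_input} is reduced to
-the fields the sketch touches, and @code{struct semi_state} and
-@code{semi_get_record()} are hypothetical. A real parser would normally
-read from @code{iobuf->fd} instead.
-
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>      /* for EOF */
#include <string.h>

/* Reduced stand-in for awk_input_buf_t: only the fields used below. */
struct awk_input {
    const char *name;   /* filename */
    int fd;             /* file descriptor */
    void *opaque;       /* private data for the input parser */
};

/* Hypothetical private state kept in iobuf->opaque. */
struct semi_state {
    char *data;         /* whole input, owned by the parser */
    size_t len;         /* number of bytes in data */
    size_t pos;         /* offset of the next unread byte */
};

/* Hypothetical get_record function: records are terminated by ';'. */
static int semi_get_record(char **out, struct awk_input *iobuf, int *errcode,
                           char **rt_start, size_t *rt_len)
{
    struct semi_state *st;
    char *start, *end;

    assert(out != NULL && iobuf != NULL && errcode != NULL);
    assert(rt_start != NULL && rt_len != NULL);

    st = (struct semi_state *) iobuf->opaque;
    if (st == NULL) {
        *errcode = EINVAL;      /* error: return EOF with nonzero *errcode */
        return EOF;
    }
    if (st->pos >= st->len)
        return EOF;             /* plain end of input; *errcode is untouched */

    start = st->data + st->pos;
    end = memchr(start, ';', st->len - st->pos);
    *out = start;               /* the parser keeps ownership of this storage */
    if (end != NULL) {
        *rt_start = end;        /* the ';' becomes the value of RT */
        *rt_len = 1;
        st->pos = (size_t) (end - st->data) + 1;
        return (int) (end - start);
    }
    *rt_len = 0;                /* last record has no terminator */
    st->pos = st->len;
    return (int) (st->data + st->len - start);
}
```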
-
-@command{gawk} ships with a sample extension that reads directories,
-returning records for each entry in the directory (@pxref{Extension
-Sample Readdir}). You may wish to use that code as a guide for writing
-your own input parser.
-
-When writing an input parser, you should think about (and document)
-how it is expected to interact with @command{awk} code. You may want
-it to always be called, and take effect as appropriate (as the
-@code{readdir} extension does). Or you may want it to take effect
-based upon the value of an @command{awk} variable, as the XML extension
-from the @code{gawkextlib} project does (@pxref{gawkextlib}).
-In the latter case, code in a @code{BEGINFILE} section
-can look at @code{FILENAME} and @code{ERRNO} to decide whether or
-not to activate an input parser (@pxref{BEGINFILE/ENDFILE}).
-
-You register your input parser with the following function:
-
-@table @code
-@item void register_input_parser(awk_input_parser_t *input_parser);
-Register the input parser pointed to by @code{input_parser} with
-@command{gawk}.
-@end table
-
-@node Output Wrappers
-@subsubsection Customized Output Wrappers
-
-An @dfn{output wrapper} is the mirror image of an input parser.
-It allows an extension to take over the output to a file opened
-with the @samp{>} or @samp{>>} operators (@pxref{Redirection}).
-
-The output wrapper is very similar to the input parser structure:
-
-@example
-typedef struct output_wrapper @{
- const char *name; /* name of the wrapper */
- awk_bool_t (*can_take_file)(const awk_output_buf_t *outbuf);
- awk_bool_t (*take_control_of)(awk_output_buf_t *outbuf);
- awk_const struct output_wrapper *awk_const next; /* for use by gawk */
-@} awk_output_wrapper_t;
-@end example
-
-The members are as follows:
-
-@table @code
-@item const char *name;
-This is the name of the output wrapper.
-
-@item awk_bool_t (*can_take_file)(const awk_output_buf_t *outbuf);
-This points to a function that examines the information in
-the @code{awk_output_buf_t} structure pointed to by @code{outbuf}.
-It should return true if the output wrapper wants to take over the
-file, and false otherwise. It should not change any state (variable
-values, etc.) within @command{gawk}.
-
-@item awk_bool_t (*take_control_of)(awk_output_buf_t *outbuf);
-The function pointed to by this field is called when @command{gawk}
-decides to let the output wrapper take control of the file. It should
-fill in appropriate members of the @code{awk_output_buf_t} structure,
-as described below, and return true if successful, false otherwise.
-
-@item awk_const struct output_wrapper *awk_const next;
-This is for use by @command{gawk}.
-@end table
-
-The @code{awk_output_buf_t} structure looks like this:
-
-@example
-typedef struct @{
- const char *name; /* name of output file */
- const char *mode; /* mode argument to fopen */
- FILE *fp; /* stdio file pointer */
- awk_bool_t redirected; /* true if a wrapper is active */
- void *opaque; /* for use by output wrapper */
- size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,
- FILE *fp, void *opaque);
- int (*gawk_fflush)(FILE *fp, void *opaque);
- int (*gawk_ferror)(FILE *fp, void *opaque);
- int (*gawk_fclose)(FILE *fp, void *opaque);
-@} awk_output_buf_t;
-@end example
-
-Here too, your extension will define @code{@var{XXX}_can_take_file()}
-and @code{@var{XXX}_take_control_of()} functions that examine and update
-data members in the @code{awk_output_buf_t}.
-The data members are as follows:
-
-@table @code
-@item const char *name;
-The name of the output file.
-
-@item const char *mode;
-The mode string (as would be used in the second argument to @code{fopen()})
-with which the file was opened.
-
-@item FILE *fp;
-The @code{FILE} pointer from @code{<stdio.h>}. @command{gawk} opens the file
-before attempting to find an output wrapper.
-
-@item awk_bool_t redirected;
-This field must be set to true by the @code{@var{XXX}_take_control_of()} function.
-
-@item void *opaque;
-This pointer is opaque to @command{gawk}. The extension should use it to store
-a pointer to any private data associated with the file.
-
-@item size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ FILE *fp, void *opaque);
-@itemx int (*gawk_fflush)(FILE *fp, void *opaque);
-@itemx int (*gawk_ferror)(FILE *fp, void *opaque);
-@itemx int (*gawk_fclose)(FILE *fp, void *opaque);
-These pointers should be set to point to functions that perform
-the same task as the corresponding @code{<stdio.h>} functions, if appropriate.
-@command{gawk} uses these function pointers for all output.
-@command{gawk} initializes the pointers to point to internal, ``pass through''
-functions that just call the regular @code{<stdio.h>} functions, so an
-extension only needs to redefine those functions that are appropriate for
-what it does.
-@end table
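-
-As a sketch of overriding just one of these pointers, the code below
-replaces @code{gawk_fwrite} with a function that counts bytes and then
-passes the data through to @code{fwrite()}. The structure is a local copy
-of the one shown above, and @code{counting_fwrite()} and
-@code{counting_take_control_of()} are hypothetical names. The single
-shared counter is a simplification; a real wrapper would more likely
-allocate per-file state.
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef int awk_bool_t;

/* Local copy of the structure shown above. */
typedef struct {
    const char *name;           /* name of output file */
    const char *mode;           /* mode argument to fopen */
    FILE *fp;                   /* stdio file pointer */
    awk_bool_t redirected;      /* true if a wrapper is active */
    void *opaque;               /* for use by output wrapper */
    size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,
                          FILE *fp, void *opaque);
    int (*gawk_fflush)(FILE *fp, void *opaque);
    int (*gawk_ferror)(FILE *fp, void *opaque);
    int (*gawk_fclose)(FILE *fp, void *opaque);
} awk_output_buf_t;

/* Hypothetical replacement for fwrite(): pass through, then count bytes. */
static size_t counting_fwrite(const void *buf, size_t size, size_t count,
                              FILE *fp, void *opaque)
{
    size_t written;
    size_t *total = (size_t *) opaque;

    assert(fp != NULL);
    written = fwrite(buf, size, count, fp);
    if (total != NULL)
        *total += written * size;
    return written;
}

/* Hypothetical take_control_of function: override only gawk_fwrite and
 * leave the other pointers at whatever gawk installed. */
static awk_bool_t counting_take_control_of(awk_output_buf_t *outbuf)
{
    static size_t total_bytes;  /* one shared counter keeps the sketch short */

    if (outbuf == NULL || outbuf->fp == NULL)
        return 0;
    total_bytes = 0;
    outbuf->opaque = &total_bytes;
    outbuf->gawk_fwrite = counting_fwrite;
    outbuf->redirected = 1;     /* must be set to true by this function */
    return 1;
}
```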
-
-The @code{@var{XXX}_can_take_file()} function should make a decision based
-upon the @code{name} and @code{mode} fields, and any additional state
-(such as @command{awk} variable values) that is appropriate.
-
-When @command{gawk} calls @code{@var{XXX}_take_control_of()}, it should fill
-in the other fields, as appropriate, except for @code{fp}, which it should just
-use normally.
-
-You register your output wrapper with the following function:
-
-@table @code
-@item void register_output_wrapper(awk_output_wrapper_t *output_wrapper);
-Register the output wrapper pointed to by @code{output_wrapper} with
-@command{gawk}.
-@end table
-
-@node Two-way processors
-@subsubsection Customized Two-way Processors
-
-A @dfn{two-way processor} combines an input parser and an output wrapper for
-two-way I/O with the @samp{|&} operator (@pxref{Redirection}). It makes identical
-use of the @code{awk_input_buf_t} and @code{awk_output_buf_t} structures
-as described earlier.
-
-A two-way processor is represented by the following structure:
-
-@example
-typedef struct two_way_processor @{
- const char *name; /* name of the two-way processor */
- awk_bool_t (*can_take_two_way)(const char *name);
- awk_bool_t (*take_control_of)(const char *name,
- awk_input_buf_t *inbuf,
- awk_output_buf_t *outbuf);
- awk_const struct two_way_processor *awk_const next; /* for use by gawk */
-@} awk_two_way_processor_t;
-@end example
-
-The fields are as follows:
-
-@table @code
-@item const char *name;
-The name of the two-way processor.
-
-@item awk_bool_t (*can_take_two_way)(const char *name);
-This function returns true if it wants to take over two-way I/O for this filename.
-It should not change any state (variable
-values, etc.) within @command{gawk}.
-
-@item awk_bool_t (*take_control_of)(const char *name,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_input_buf_t *inbuf,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_output_buf_t *outbuf);
-This function should fill in the @code{awk_input_buf_t} and
-@code{awk_output_buf_t} structures pointed to by @code{inbuf} and
-@code{outbuf}, respectively. These structures were described earlier.
-
-@item awk_const struct two_way_processor *awk_const next;
-This is for use by @command{gawk}.
-@end table
-
-As with the input parser and output wrapper, you provide
-``yes I can take this'' and ``take over for this'' functions,
-@code{@var{XXX}_can_take_two_way()} and @code{@var{XXX}_take_control_of()}.
-
-You register your two-way processor with the following function:
-
-@table @code
-@item void register_two_way_processor(awk_two_way_processor_t *two_way_processor);
-Register the two-way processor pointed to by @code{two_way_processor} with
-@command{gawk}.
-@end table
-
-@node Printing Messages
-@subsection Printing Messages
-
-You can print different kinds of messages from your
-extension, as described below. Note that for these functions,
-you must pass in the extension id received from @command{gawk}
-when the extension was loaded.@footnote{Because the API uses only ISO C 90
-features, it cannot make use of the ISO C 99 variadic macro feature to hide
-that parameter. More's the pity.}
-
-@table @code
-@item void fatal(awk_ext_id_t id, const char *format, ...);
-Print a message and then cause @command{gawk} to exit immediately.
-
-@item void warning(awk_ext_id_t id, const char *format, ...);
-Print a warning message.
-
-@item void lintwarn(awk_ext_id_t id, const char *format, ...);
-Print a ``lint warning.'' Normally this is the same as printing a
-warning message, but if @command{gawk} was invoked with @samp{--lint=fatal},
-then lint warnings become fatal error messages.
-@end table
-
-All of these functions are otherwise like the C @code{printf()}
-family of functions, where the @code{format} parameter is a string
-with literal characters and formatting codes intermixed.
-
-@node Updating @code{ERRNO}
-@subsection Updating @code{ERRNO}
-
-The following functions allow you to update the @code{ERRNO}
-variable:
-
-@table @code
-@item void update_ERRNO_int(int errno_val);
-Set @code{ERRNO} to the string equivalent of the error code
-in @code{errno_val}. The value should be one of the defined
-error codes in @code{<errno.h>}, and @command{gawk} turns it
-into a (possibly translated) string using the C @code{strerror()} function.
-
-@item void update_ERRNO_string(const char *string);
-Set @code{ERRNO} directly to the string value of @code{string}.
-@command{gawk} makes a copy of the value of @code{string}.
-
-@item void unset_ERRNO();
-Unset @code{ERRNO}.
-@end table
-
-@node Accessing Parameters
-@subsection Accessing and Updating Parameters
-
-Two functions give you access to the arguments (parameters)
-passed to your extension function. They are:
-
-@table @code
-@item awk_bool_t get_argument(size_t count,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
-Fill in the @code{awk_value_t} structure pointed to by @code{result}
-with the @code{count}'th argument. Return true if the actual
-type matches @code{wanted}, false otherwise. In the latter
-case, @code{result@w{->}val_type} indicates the actual type
-(@pxref{table-value-types-returned}). Counts are zero based---the first
-argument is numbered zero, the second one, and so on. @code{wanted}
-indicates the type of value expected.
-
-@item awk_bool_t set_argument(size_t count, awk_array_t array);
-Convert a parameter that was undefined into an array; this provides
-call-by-reference for arrays. Return false if @code{count} is too big,
-or if the argument's type is not undefined. @xref{Array Manipulation},
-for more information on creating arrays.
-@end table
-
-@node Symbol Table Access
-@subsection Symbol Table Access
-
-Two sets of routines provide access to global variables, and one set
-allows you to create and release cached values.
-
-@menu
-* Symbol table by name:: Accessing variables by name.
-* Symbol table by cookie:: Accessing variables by ``cookie''.
-* Cached values:: Creating and using cached values.
-@end menu
-
-@node Symbol table by name
-@subsubsection Variable Access and Update by Name
-
-The following routines provide the ability to access and update
-global @command{awk}-level variables by name. In compiler terminology,
-identifiers of different kinds are termed @dfn{symbols}, thus the ``sym''
-in the routines' names. The data structure which stores information
-about symbols is termed a @dfn{symbol table}.
-
-@table @code
-@item awk_bool_t sym_lookup(const char *name,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
-Fill in the @code{awk_value_t} structure pointed to by @code{result}
-with the value of the variable named by the string @code{name}, which is
-a regular C string. @code{wanted} indicates the type of value expected.
-Return true if the actual type matches @code{wanted}, false otherwise.
-In the latter case, @code{result->val_type} indicates the actual type
-(@pxref{table-value-types-returned}).
-
-@item awk_bool_t sym_update(const char *name, awk_value_t *value);
-Update the variable named by the string @code{name}, which is a regular
-C string. The variable is added to @command{gawk}'s symbol table
-if it is not there. Return true if everything worked, false otherwise.
-
-Changing types (scalar to array or vice versa) of an existing variable
-is @emph{not} allowed, nor may this routine be used to update an array.
-This routine cannot be used to update any of the predefined
-variables (such as @code{ARGC} or @code{NF}).
-
-@item awk_bool_t sym_constant(const char *name, awk_value_t *value);
-Create a variable named by the string @code{name}, which is
-a regular C string, that has the constant value as given by
-@code{value}. @command{awk}-level code cannot change the value of this
-variable.@footnote{There (currently) is no @code{awk}-level feature that
-provides this ability.} The extension may change the value of @code{name}'s
-variable with subsequent calls to this routine, and may also convert
-a variable created by @code{sym_update()} into a constant. However,
-once a variable becomes a constant it cannot later be reverted into a
-mutable variable.
-@end table
-
-@node Symbol table by cookie
-@subsubsection Variable Access and Update by Cookie
-
-A @dfn{scalar cookie} is an opaque handle that provides access
-to a global variable or array. It is an optimization that
-avoids looking up variables in @command{gawk}'s symbol table every time
-access is needed. This was discussed earlier, in @ref{General Data Types}.
-
-The following functions let you work with scalar cookies.
-
-@table @code
-@item awk_bool_t sym_lookup_scalar(awk_scalar_t cookie,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
-Retrieve the current value of a scalar cookie.
-Once you have obtained a scalar cookie using @code{sym_lookup()}, you can
-use this function to get its value more efficiently.
-Return false if the value cannot be retrieved.
-
-@item awk_bool_t sym_update_scalar(awk_scalar_t cookie, awk_value_t *value);
-Update the value associated with a scalar cookie. Return false if
-the new value is not one of @code{AWK_STRING} or @code{AWK_NUMBER}.
-Here too, the built-in variables may not be updated.
-@end table
-
-It is not obvious at first glance how to work with scalar cookies or
-what their @i{raison d'etre} really is. In theory, the @code{sym_lookup()}
-and @code{sym_update()} routines are all you really need to work with
-variables. For example, you might have code that looked up the value of
-a variable, evaluated a condition, and then possibly changed the value
-of the variable based on the result of that evaluation, like so:
-
-@example
-/* do_magic --- do something really great */
-
-static awk_value_t *
-do_magic(int nargs, awk_value_t *result)
-@{
- awk_value_t value;
-
- if ( sym_lookup("MAGIC_VAR", AWK_NUMBER, & value)
- && some_condition(value.num_value)) @{
- value.num_value += 42;
- sym_update("MAGIC_VAR", & value);
- @}
-
- return make_number(0.0, result);
-@}
-@end example
-
-@noindent
-This code looks (and is) simple and straightforward. So what's the problem?
-
-Consider what happens if @command{awk}-level code associated with your
-extension calls the @code{magic()} function (implemented in C by @code{do_magic()}),
-once per record, while processing hundreds of thousands or millions of records.
-The @code{MAGIC_VAR} variable is looked up in the symbol table once or twice per function call!
-
-The symbol table lookup is really pure overhead; it is considerably more efficient
-to get a cookie that represents the variable, and use that to get the variable's
-value and update it as needed.@footnote{The difference is measurable and quite real. Trust us.}
-
-Thus, the way to use cookies is as follows. First, install your extension's variable
-in @command{gawk}'s symbol table using @code{sym_update()}, as usual. Then get a
-scalar cookie for the variable using @code{sym_lookup()}:
-
-@example
-static awk_scalar_t magic_var_cookie; /* cookie for MAGIC_VAR */
-
-static void
-my_extension_init()
-@{
- awk_value_t value;
-
- /* install initial value */
- sym_update("MAGIC_VAR", make_number(42.0, & value));
-
- /* get cookie */
- sym_lookup("MAGIC_VAR", AWK_SCALAR, & value);
-
- /* save the cookie */
- magic_var_cookie = value.scalar_cookie;
- @dots{}
-@}
-@end example
-
-Next, use the routines in this section for retrieving and updating
-the value through the cookie. Thus, @code{do_magic()} now becomes
-something like this:
-
-@example
-/* do_magic --- do something really great */
-
-static awk_value_t *
-do_magic(int nargs, awk_value_t *result)
-@{
- awk_value_t value;
-
- if ( sym_lookup_scalar(magic_var_cookie, AWK_NUMBER, & value)
- && some_condition(value.num_value)) @{
- value.num_value += 42;
- sym_update_scalar(magic_var_cookie, & value);
- @}
- @dots{}
-
- return make_number(0.0, result);
-@}
-@end example
-
-@quotation NOTE
-The previous code omitted error checking for
-presentation purposes. Your extension code should be more robust
-and carefully check the return values from the API functions.
-@end quotation
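-
-The cost that a cookie avoids can be seen in a toy model that is
-entirely separate from @command{gawk}. The sketch below assumes a tiny
-symbol table searched by name; every identifier in it
-(@code{toy_symbol}, @code{toy_lookup()}, @code{toy_add_via_cookie()})
-is hypothetical and is not part of the API. The saved pointer plays
-the role of the scalar cookie.
-

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy symbol table; every name here is hypothetical, not gawk's. */
struct toy_symbol {
    const char *name;
    double value;
};

static struct toy_symbol toy_table[] = {
    { "NR_SEEN", 0.0 },
    { "MAGIC_VAR", 42.0 },
};

static unsigned long toy_name_lookups;  /* counts searches by name */

/* toy_lookup --- search by name; a cookie lets you pay this cost once */
static struct toy_symbol *
toy_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    toy_name_lookups++;
    for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++)
        if (strcmp(toy_table[i].name, name) == 0)
            return & toy_table[i];

    return NULL;    /* no such variable */
}

/* toy_add_via_cookie --- update through a saved handle, with no search */
static void
toy_add_via_cookie(struct toy_symbol *cookie, double amount)
{
    assert(cookie != NULL);
    cookie->value += amount;
}
```

-
-In this model, one call to @code{toy_lookup()} at initialization is
-followed by any number of updates through the saved pointer, none of
-which searches the table again.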
-
-@node Cached values
-@subsubsection Creating and Using Cached Values
-
-The routines in this section allow you to create and release
-cached values. As with scalar cookies, in theory, cached values
-are not necessary. You can create numbers and strings using
-the functions in @ref{Constructor Functions}. You can then
-assign those values to variables using @code{sym_update()}
-or @code{sym_update_scalar()}, as you like.
-
-However, you can understand the point of cached values if you remember that
-@emph{every} string value's storage @emph{must} come from @code{malloc()}.
-If you have 20 variables, all of which have the same string value, you
-must create 20 identical copies of the string.@footnote{Numeric values
-are clearly less problematic, requiring only a C @code{double} to store.}
-
-It is clearly more efficient, if possible, to create a value once, and
-then tell @command{gawk} to reuse the value for multiple variables. That
-is what the routines in this section let you do. The functions are as follows:
-
-@table @code
-@item awk_bool_t create_value(awk_value_t *value, awk_value_cookie_t *result);
-Create a cached string or numeric value from @code{value} for efficient later
-assignment.
-Only @code{AWK_NUMBER} and @code{AWK_STRING} values are allowed. Any other type
-is rejected. While @code{AWK_UNDEFINED} could be allowed, doing so would
-result in inferior performance.
-
-@item awk_bool_t release_value(awk_value_cookie_t vc);
-Release the memory associated with a value cookie obtained
-from @code{create_value()}.
-@end table
-
-You use value cookies in a fashion similar to the way you use scalar cookies.
-In the extension initialization routine, you create the value cookie:
-
-@example
-static awk_value_cookie_t answer_cookie; /* static value cookie */
-
-static void
-my_extension_init()
-@{
- awk_value_t value;
- char *long_string;
- size_t long_string_len;
-
- /* code from earlier */
- @dots{}
- /* @dots{} fill in long_string and long_string_len @dots{} */
- make_malloced_string(long_string, long_string_len, & value);
- create_value(& value, & answer_cookie); /* create cookie */
- @dots{}
-@}
-@end example
-
-Once the value is created, you can use it as the value of any number
-of variables:
-
-@example
-static awk_value_t *
-do_magic(int nargs, awk_value_t *result)
-@{
- awk_value_t value;
-
- @dots{} /* as earlier */
-
- value.val_type = AWK_VALUE_COOKIE;
- value.value_cookie = answer_cookie;
- sym_update("VAR1", & value);
- sym_update("VAR2", & value);
- @dots{}
- sym_update("VAR100", & value);
- @dots{}
-@}
-@end example
-
-@noindent
-Using value cookies in this way saves considerable storage, since all of
-@code{VAR1} through @code{VAR100} share the same value.
-
-You might be wondering, ``Is this sharing problematic?
-If @command{awk} code assigns a new value to @code{VAR1},
-will all the others be changed too?''
-
-That's a great question. The answer is that no, it's not a problem.
-@command{gawk} is smart enough to avoid such problems.
-
-Finally, as part of your clean up action (@pxref{Exit Callback Functions})
-you should release any cached values that you created, using
-@code{release_value()}.
-
-@node Array Manipulation
-@subsection Array Manipulation
-
-The primary data structure@footnote{Okay, the only data structure.} in @command{awk}
-is the associative array (@pxref{Arrays}).
-Extensions need to be able to manipulate @command{awk} arrays.
-The API provides a number of data structures for working with arrays,
-functions for working with individual elements, and functions for
-working with arrays as a whole. This includes the ability to
-``flatten'' an array so that it is easy for C code to traverse
-every element in an array. The array data structures integrate
-nicely with the data structures for values to make it easy to
-both work with and create true arrays of arrays (@pxref{General Data Types}).
-
-@menu
-* Array Data Types:: Data types for working with arrays.
-* Array Functions:: Functions for working with arrays.
-* Flattening Arrays:: How to flatten arrays.
-* Creating Arrays:: How to create and populate arrays.
-@end menu
-
-@node Array Data Types
-@subsubsection Array Data Types
-
-The data types associated with arrays are listed below.
-
-@table @code
-@item typedef void *awk_array_t;
-If you request the value of an array variable, you get back an
-@code{awk_array_t} value. This value is opaque@footnote{It is also
-a ``cookie,'' but the @command{gawk} developers did not wish to overuse this
-term.} to the extension; it uniquely identifies the array but can
-only be used by passing it into API functions or receiving it from API
-functions. This is very similar to the way @samp{FILE *} values are used
-with the @code{<stdio.h>} library routines.
-
-@item typedef struct awk_element @{
-@itemx @ @ @ @ /* convenience linked list pointer, not used by gawk */
-@itemx @ @ @ @ struct awk_element *next;
-@itemx @ @ @ @ enum @{
-@itemx @ @ @ @ @ @ @ @ AWK_ELEMENT_DEFAULT = 0,@ @ /* set by gawk */
-@itemx @ @ @ @ @ @ @ @ AWK_ELEMENT_DELETE = 1@ @ @ @ /* set by extension if should be deleted */
-@itemx @ @ @ @ @} flags;
-@itemx @ @ @ @ awk_value_t index;
-@itemx @ @ @ @ awk_value_t value;
-@itemx @} awk_element_t;
-The @code{awk_element_t} is a ``flattened''
-array element. @command{gawk} produces an array of these
-inside the @code{awk_flat_array_t} (see the next item).
-Individual elements may be marked for deletion. New elements must be added
-individually, one at a time, using the separate API for that purpose.
-The fields are as follows:
-
-@c nested table
-@table @code
-@item struct awk_element *next;
-This pointer is for the convenience of extension writers. It allows
-an extension to create a linked list of new elements which can then be
-added to an array in a loop that traverses the list.
-
-@item enum @{ @dots{} @} flags;
-A set of flag values that convey information between @command{gawk}
-and the extension. Currently there is only one: @code{AWK_ELEMENT_DELETE},
-which the extension can set to cause @command{gawk} to delete the
-element from the original array upon release of the flattened array.
-
-@item index
-@itemx value
-The index and value of the element, respectively.
-@emph{All} memory pointed to by @code{index} and @code{value} belongs to @command{gawk}.
-@end table
-
-@item typedef struct awk_flat_array @{
-@itemx @ @ @ @ awk_const void *awk_const opaque1;@ @ @ @ /* private data for use by gawk */
-@itemx @ @ @ @ awk_const void *awk_const opaque2;@ @ @ @ /* private data for use by gawk */
-@itemx @ @ @ @ awk_const size_t count;@ @ @ @ @ /* how many elements */
-@itemx @ @ @ @ awk_element_t elements[1];@ @ /* will be extended */
-@itemx @} awk_flat_array_t;
-This is a flattened array. When an extension gets one of these
-from @command{gawk}, the @code{elements} array is of actual
-size @code{count}.
-The @code{opaque1} and @code{opaque2} pointers are for use by @command{gawk};
-therefore they are marked @code{awk_const} so that the extension cannot
-modify them.
-@end table
-
-@node Array Functions
-@subsubsection Array Functions
-
-The following functions relate to individual array elements.
-
-@table @code
-@item awk_bool_t get_element_count(awk_array_t a_cookie, size_t *count);
-For the array represented by @code{a_cookie}, return in @code{*count}
-the number of elements it contains. A subarray counts as a single element.
-Return false if there is an error.
-
-@item awk_bool_t get_array_element(awk_array_t a_cookie,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const awk_value_t *const index,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_valtype_t wanted,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_value_t *result);
-For the array represented by @code{a_cookie}, return in @code{*result}
-the value of the element whose index is @code{index}.
-@code{wanted} specifies the type of value you wish to retrieve.
-Return false if @code{wanted} does not match the actual type or if
-@code{index} is not in the array (@pxref{table-value-types-returned}).
-
-The value for @code{index} can be numeric, in which case @command{gawk}
-converts it to a string. Using non-integral values is possible, but
-requires that you understand how such values are converted to strings
-(@pxref{Conversion}); thus using integral values is safest.
-
-As with @emph{all} strings passed into @command{gawk} from an extension,
-the string value of @code{index} must come from @code{malloc()}, and
-@command{gawk} releases the storage.
-
-@item awk_bool_t set_array_element(awk_array_t a_cookie,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const@ awk_value_t *const index,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const@ awk_value_t *const value);
-In the array represented by @code{a_cookie}, create or modify
-the element whose index is given by @code{index}.
-The @code{ARGV} and @code{ENVIRON} arrays may not be changed.
-
-@item awk_bool_t set_array_element_by_elem(awk_array_t a_cookie,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_element_t element);
-Like @code{set_array_element()}, but take the @code{index} and @code{value}
-from @code{element}. This is a convenience macro.
-
-@item awk_bool_t del_array_element(awk_array_t a_cookie,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ const awk_value_t* const index);
-Remove the element with the given index from the array
-represented by @code{a_cookie}.
-Return true if the element was removed, or false if the element did
-not exist in the array.
-@end table
-
-The following functions relate to arrays as a whole:
-
-@table @code
-@item awk_array_t create_array();
-Create a new array to which elements may be added.
-@xref{Creating Arrays}, for a discussion of how to
-create a new array and add elements to it.
-
-@item awk_bool_t clear_array(awk_array_t a_cookie);
-Clear the array represented by @code{a_cookie}.
-Return false if there was some kind of problem, true otherwise.
-The array remains an array, but after calling this function, it
-has no elements. This is equivalent to using the @code{delete}
-statement (@pxref{Delete}).
-
-@item awk_bool_t flatten_array(awk_array_t a_cookie, awk_flat_array_t **data);
-For the array represented by @code{a_cookie}, create an @code{awk_flat_array_t}
-structure and fill it in. Set the pointer whose address is passed as @code{data}
-to point to this structure.
-Return true upon success, or false otherwise.
-@xref{Flattening Arrays}, for a discussion of how to
-flatten an array and work with it.
-
-@item awk_bool_t release_flattened_array(awk_array_t a_cookie,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ awk_flat_array_t *data);
-When done with a flattened array, release the storage using this function.
-You must pass in both the original array cookie, and the address of
-the created @code{awk_flat_array_t} structure.
-The function returns true upon success, false otherwise.
-@end table
-
-@node Flattening Arrays
-@subsubsection Working With All The Elements of an Array
-
-To @dfn{flatten} an array is to create a structure that
-represents the full array in a fashion that makes it easy
-for C code to traverse the entire array. Test code
-in @file{extension/testext.c} does this, and also serves
-as a nice example to show how to use the APIs.
-
-First, the @command{gawk} script that drives the test extension:
-
-@example
-@@load "testext"
-BEGIN @{
- n = split("blacky rusty sophie raincloud lucky", pets)
- printf "pets has %d elements\n", length(pets)
- ret = dump_array_and_delete("pets", "3")
- printf "dump_array_and_delete(pets) returned %d\n", ret
- if ("3" in pets)
- printf("dump_array_and_delete() did NOT remove index \"3\"!\n")
- else
- printf("dump_array_and_delete() did remove index \"3\"!\n")
- print ""
-@}
-@end example
-
-@noindent
-This code creates an array with @code{split()} (@pxref{String Functions})
-and then calls @code{dump_array_and_delete()}. That function looks up
-the array whose name is passed as the first argument, and
-deletes the element at the index passed in the second argument.
-It then prints the return value and checks if the element
-was indeed deleted. Here is the C code that implements
-@code{dump_array_and_delete()}. It has been edited slightly for
-presentation.
-
-The first part declares variables, sets up the default
-return value in @code{result}, and checks that the function
-was called with the correct number of arguments:
-
-@example
-static awk_value_t *
-dump_array_and_delete(int nargs, awk_value_t *result)
-@{
- awk_value_t value, value2, value3;
- awk_flat_array_t *flat_array;
- size_t count;
- char *name;
- int i;
-
- assert(result != NULL);
- make_number(0.0, result);
-
- if (nargs != 2) @{
- printf("dump_array_and_delete: nargs not right "
- "(%d should be 2)\n", nargs);
- goto out;
- @}
-@end example
-
-The function then proceeds in steps, as follows. First, retrieve
-the name of the array, passed as the first argument. Then
-retrieve the array itself. If either operation fails, print
-error messages and return:
-
-@example
- /* get argument named array as flat array and print it */
- if (get_argument(0, AWK_STRING, & value)) @{
- name = value.str_value.str;
- if (sym_lookup(name, AWK_ARRAY, & value2))
- printf("dump_array_and_delete: sym_lookup of %s passed\n",
- name);
- else @{
- printf("dump_array_and_delete: sym_lookup of %s failed\n",
- name);
- goto out;
- @}
- @} else @{
- printf("dump_array_and_delete: get_argument(0) failed\n");
- goto out;
- @}
-@end example
-
-For testing purposes and to make sure that the C code sees
-the same number of elements as the @command{awk} code,
-the second step is to get the count of elements in the array
-and print it:
-
-@example
- if (! get_element_count(value2.array_cookie, & count)) @{
- printf("dump_array_and_delete: get_element_count failed\n");
- goto out;
- @}
-
- printf("dump_array_and_delete: incoming size is %lu\n",
- (unsigned long) count);
-@end example
-
-The third step is to actually flatten the array, and then
-to double check that the count in the @code{awk_flat_array_t}
-is the same as the count just retrieved:
-
-@example
- if (! flatten_array(value2.array_cookie, & flat_array)) @{
- printf("dump_array_and_delete: could not flatten array\n");
- goto out;
- @}
-
- if (flat_array->count != count) @{
- printf("dump_array_and_delete: flat_array->count (%lu)"
- " != count (%lu)\n",
- (unsigned long) flat_array->count,
- (unsigned long) count);
- goto out;
- @}
-@end example
-
-The fourth step is to retrieve the index of the element
-to be deleted, which was passed as the second argument.
-Remember that argument counts passed to @code{get_argument()}
-are zero-based, thus the second argument is numbered one:
-
-@example
- if (! get_argument(1, AWK_STRING, & value3)) @{
- printf("dump_array_and_delete: get_argument(1) failed\n");
- goto out;
- @}
-@end example
-
-The fifth step is where the ``real work'' is done. The function
-loops over every element in the array, printing the index and
-element values. In addition, upon finding the element with the
-index that is supposed to be deleted, the function sets the
-@code{AWK_ELEMENT_DELETE} bit in the @code{flags} field
-of the element. When the array is released, @command{gawk}
-traverses the flattened array, and deletes any elements that
-have this flag bit set:
-
-@example
- for (i = 0; i < flat_array->count; i++) @{
- printf("\t%s[\"%.*s\"] = %s\n",
- name,
- (int) flat_array->elements[i].index.str_value.len,
- flat_array->elements[i].index.str_value.str,
- valrep2str(& flat_array->elements[i].value));
-
- if (strcmp(value3.str_value.str,
- flat_array->elements[i].index.str_value.str)
- == 0) @{
- flat_array->elements[i].flags |= AWK_ELEMENT_DELETE;
- printf("dump_array_and_delete: marking element \"%s\" "
- "for deletion\n",
- flat_array->elements[i].index.str_value.str);
- @}
- @}
-@end example
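-
-The loop above compares indices with @code{strcmp()}, which stops at
-the first zero byte. Since string values carry an explicit length
-(the @code{len} field used in the @code{printf()} call), a comparison
-that honors that length is a reasonable alternative. The following is
-only a sketch: @code{struct counted_string} and @code{same_index()}
-are hypothetical names that mimic the @code{str} and @code{len} fields,
-and they are not part of @file{gawkapi.h}.
-

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of a string value's str/len pair. */
struct counted_string {
    const char *str;    /* the bytes; may contain embedded zero bytes */
    size_t len;         /* how many bytes are valid */
};

/* same_index --- true if two counted strings hold identical bytes */
static int
same_index(const struct counted_string *a, const struct counted_string *b)
{
    assert(a != NULL && b != NULL);

    /* different lengths can never match; otherwise compare all bytes */
    return a->len == b->len && memcmp(a->str, b->str, a->len) == 0;
}
```

-
-With such a helper, an index like @code{"3"} does not match a longer
-index that merely begins with @code{"3"} followed by a zero byte.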
-
-The sixth step is to release the flattened array. This tells
-@command{gawk} that the extension is no longer using the array,
-and that it should delete any elements marked for deletion.
-@command{gawk} also frees any storage that was allocated,
-so you should not use the pointer (@code{flat_array} in this
-code) once you have called @code{release_flattened_array()}:
-
-@example
- if (! release_flattened_array(value2.array_cookie, flat_array)) @{
- printf("dump_array_and_delete: could not release flattened array\n");
- goto out;
- @}
-@end example
-
-Finally, since everything was successful, the function sets the
-return value to success, and returns:
-
-@example
- make_number(1.0, result);
-out:
- return result;
-@}
-@end example
-
-Here is the output from running this part of the test:
-
-@example
-pets has 5 elements
-dump_array_and_delete: sym_lookup of pets passed
-dump_array_and_delete: incoming size is 5
- pets["1"] = "blacky"
- pets["2"] = "rusty"
- pets["3"] = "sophie"
-dump_array_and_delete: marking element "3" for deletion
- pets["4"] = "raincloud"
- pets["5"] = "lucky"
-dump_array_and_delete(pets) returned 1
-dump_array_and_delete() did remove index "3"!
-@end example
-
-@node Creating Arrays
-@subsubsection How To Create and Populate Arrays
-
-Besides working with arrays created by @command{awk} code, you can
-create arrays and populate them as you see fit, and then @command{awk}
-code can access them and manipulate them.
-
-There are two important points about creating arrays from extension code:
-
-@enumerate 1
-@item
-You must install a new array into @command{gawk}'s symbol
-table immediately upon creating it. Once you have done so,
-you can then populate the array.
-
-@ignore
-Strictly speaking, this is required only
-for arrays that will have subarrays as elements; however it is
-a good idea to always do this. This restriction may be relaxed
-in a subsequent revision of the API.
-@end ignore
-
-Similarly, if installing a new array as a subarray of an existing array,
-you must add the new array to its parent before adding any elements to it.
-
-Thus, the correct way to build an array is to work ``top down.'' Create
-the array, and immediately install it in @command{gawk}'s symbol table
-using @code{sym_update()}, or install it as an element in a previously
-existing array using @code{set_array_element()}. Example code is coming shortly.
-
-@item
-Due to @command{gawk} internals, after using @code{sym_update()} to install an array
-into @command{gawk}, you have to retrieve the array cookie from the value
-passed in to @code{sym_update()} before doing anything else with it, like so:
-
-@example
-awk_value_t value;
-awk_array_t new_array;
-
-new_array = create_array();
-value.val_type = AWK_ARRAY;
-value.array_cookie = new_array;
-
-/* install array in the symbol table */
-sym_update("array", & value);
-
-new_array = value.array_cookie; /* YOU MUST DO THIS */
-@end example
-
-If installing an array as a subarray, you must also retrieve the value
-of the array cookie after the call to @code{set_array_element()}.
-@end enumerate
-
-The following C code is a simple test extension to create an array
-with two regular elements and with a subarray. The leading @samp{#include}
-directives and boilerplate variable declarations are omitted for brevity.
-The first step is to create a new array and then install it
-in the symbol table:
-
-@example
-@ignore
-#ifdef HAVE_CONFIG_H
-#include <config.h>
-#endif
-
-#include <stdio.h>
-#include <assert.h>
-#include <errno.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-
-#include <sys/types.h>
-#include <sys/stat.h>
-
-#include "gawkapi.h"
-
-static const gawk_api_t *api; /* for convenience macros to work */
-static awk_ext_id_t *ext_id;
-static const char *ext_version = "testarray extension: version 1.0";
-
-int plugin_is_GPL_compatible;
-
-@end ignore
-/* create_new_array --- create a named array */
-
-static void
-create_new_array()
-@{
- awk_array_t a_cookie;
- awk_array_t subarray;
- awk_value_t index, value;
-
- a_cookie = create_array();
- value.val_type = AWK_ARRAY;
- value.array_cookie = a_cookie;
-
- if (! sym_update("new_array", & value))
- printf("create_new_array: sym_update(\"new_array\") failed!\n");
- a_cookie = value.array_cookie;
-@end example
-
-@noindent
-Note how @code{a_cookie} is reset from the @code{array_cookie} field in
-the @code{value} structure.
-
-The second step is to install two regular values into @code{new_array}:
-
-@example
- (void) make_const_string("hello", 5, & index);
- (void) make_const_string("world", 5, & value);
- if (! set_array_element(a_cookie, & index, & value)) @{
- printf("fill_in_array: set_array_element failed\n");
- return;
- @}
-
- (void) make_const_string("answer", 6, & index);
- (void) make_number(42.0, & value);
- if (! set_array_element(a_cookie, & index, & value)) @{
- printf("fill_in_array: set_array_element failed\n");
- return;
- @}
-@end example
-
-The third step is to create the subarray and install it:
-
-@example
- (void) make_const_string("subarray", 8, & index);
- subarray = create_array();
- value.val_type = AWK_ARRAY;
- value.array_cookie = subarray;
- if (! set_array_element(a_cookie, & index, & value)) @{
- printf("fill_in_array: set_array_element failed\n");
- return;
- @}
- subarray = value.array_cookie;
-@end example
-
-The final step is to populate the subarray with its own element:
-
-@example
- (void) make_const_string("foo", 3, & index);
- (void) make_const_string("bar", 3, & value);
- if (! set_array_element(subarray, & index, & value)) @{
- printf("fill_in_array: set_array_element failed\n");
- return;
- @}
-@}
-@ignore
-static awk_ext_func_t func_table[] = @{
- @{ NULL, NULL, 0 @}
-@};
-
-/* init_testarray --- additional initialization function */
-
-static awk_bool_t init_testarray(void)
-@{
- create_new_array();
-
- return 1;
-@}
-
-static awk_bool_t (*init_func)(void) = init_testarray;
-
-dl_load_func(func_table, testarray, "")
-@end ignore
-@end example
-
-Here is a sample script that loads the extension
-and then dumps the array:
-
-@example
-@@load "subarray"
-
-function dumparray(name, array, i)
-@{
- for (i in array)
- if (isarray(array[i]))
- dumparray(name "[\"" i "\"]", array[i])
- else
- printf("%s[\"%s\"] = %s\n", name, i, array[i])
-@}
-
-BEGIN @{
- dumparray("new_array", new_array);
-@}
-@end example
-
-Here is the result of running the script:
-
-@example
-$ @kbd{AWKLIBPATH=$PWD ./gawk -f subarray.awk}
-@print{} new_array["subarray"]["foo"] = bar
-@print{} new_array["hello"] = world
-@print{} new_array["answer"] = 42
-@end example
-
-@noindent
-(@xref{Finding Extensions}, for more information on the
-@env{AWKLIBPATH} environment variable.)
-
-@node Extension API Variables
-@subsection API Variables
-
-The API provides two sets of variables. The first provides information
-about the version of the API (both with which the extension was compiled,
-and with which @command{gawk} was compiled). The second provides
-information about how @command{gawk} was invoked.
-
-@menu
-* Extension Versioning:: API Version information.
-* Extension API Informational Variables:: Variables providing information about
- @command{gawk}'s invocation.
-@end menu
-
-@node Extension Versioning
-@subsubsection API Version Constants and Variables
-
-The API provides both a ``major'' and a ``minor'' version number.
-The API versions are available at compile time as constants:
-
-@table @code
-@item GAWK_API_MAJOR_VERSION
-The major version of the API.
-
-@item GAWK_API_MINOR_VERSION
-The minor version of the API.
-@end table
-
-The minor version increases when new functions are added to the API. Such
-new functions are always added to the end of the API @code{struct}.
-
-The major version increases (and the minor version is reset to zero) if any
-of the data types change size or member order, or if any of the existing
-functions change signature.
-
-An extension may be compiled against one version
-of the API but loaded by a version of @command{gawk} using a different
-version. For this reason, the major and minor API versions of the
-running @command{gawk} are included in the API @code{struct} as read-only
-constant integers:
-
-@table @code
-@item api->major_version
-The major version of the running @command{gawk}.
-
-@item api->minor_version
-The minor version of the running @command{gawk}.
-@end table
-
-It is up to the extension to decide if there are API incompatibilities.
-Typically a check like this is enough:
-
-@example
-if (api->major_version != GAWK_API_MAJOR_VERSION
- || api->minor_version < GAWK_API_MINOR_VERSION) @{
- fprintf(stderr, "foo_extension: version mismatch with gawk!\n");
- fprintf(stderr, "\tmy version (%d, %d), gawk version (%d, %d)\n",
- GAWK_API_MAJOR_VERSION, GAWK_API_MINOR_VERSION,
- api->major_version, api->minor_version);
- exit(1);
-@}
-@end example
-
-Such code is included in the boilerplate @code{dl_load_func()} macro
-provided in @file{gawkapi.h} (discussed later, in
-@ref{Extension API Boilerplate}).
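-
-The compatibility rule can also be written as a small predicate.
-The following is only a sketch: @code{api_is_compatible()} is a
-hypothetical helper that is not part of @file{gawkapi.h}, and the
-version numbers are passed as plain integers so that the example
-stands alone:
-
```c
#include <stdbool.h>

/*
 * api_is_compatible --- hypothetical helper, not part of gawkapi.h.
 * ext_major/ext_minor: API version the extension was compiled against.
 * run_major/run_minor: API version reported by the running gawk.
 */
bool
api_is_compatible(int ext_major, int ext_minor, int run_major, int run_minor)
{
	/* a different major version means types or signatures changed */
	if (run_major != ext_major)
		return false;

	/* an older gawk lacks functions that a newer extension may call */
	if (run_minor < ext_minor)
		return false;

	return true;
}
```
-
-Under this rule, an extension built against a given minor version keeps
-working with any later minor version of the same major version, since
-new functions are only appended to the API @code{struct}.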
-
-@node Extension API Informational Variables
-@subsubsection Informational Variables
-
-The API provides access to several variables that describe
-whether the corresponding command-line options were enabled when
-@command{gawk} was invoked. The variables are:
-
-@table @code
-@item do_lint
-This variable is true if @command{gawk} was invoked with the @option{--lint} option
-(@pxref{Options}).
-
-@item do_traditional
-This variable is true if @command{gawk} was invoked with the @option{--traditional} option.
-
-@item do_profile
-This variable is true if @command{gawk} was invoked with the @option{--profile} option.
-
-@item do_sandbox
-This variable is true if @command{gawk} was invoked with the @option{--sandbox} option.
-
-@item do_debug
-This variable is true if @command{gawk} was invoked with the @option{--debug} option.
-
-@item do_mpfr
-This variable is true if @command{gawk} was invoked with the @option{--bignum} option.
-@end table
-
-The value of @code{do_lint} can change if @command{awk} code
-modifies the @code{LINT} built-in variable (@pxref{Built-in Variables}).
-The others should not change during execution.
-
-@node Extension API Boilerplate
-@subsection Boilerplate Code
-
-As mentioned earlier (@pxref{Extension Mechanism Outline}), the function
-definitions as presented are really macros. To use these macros, your
-extension must provide a small amount of boilerplate code (variables and
-functions) towards the top of your source file, using pre-defined names
-as described below. The boilerplate needed is also provided in comments
-in the @file{gawkapi.h} header file:
-
-@example
-/* Boilerplate code: */
-int plugin_is_GPL_compatible;
-
-static const gawk_api_t *api;
-static awk_ext_id_t ext_id;
-static const char *ext_version = NULL; /* or @dots{} = "some string" */
-
-static awk_ext_func_t func_table[] = @{
- @{ "name", do_name, 1 @},
- /* @dots{} */
-@};
-
-/* EITHER: */
-
-static awk_bool_t (*init_func)(void) = NULL;
-
-/* OR: */
-
-static awk_bool_t
-init_my_module(void)
-@{
- @dots{}
-@}
-
-static awk_bool_t (*init_func)(void) = init_my_module;
-
-dl_load_func(func_table, some_name, "name_space_in_quotes")
-@end example
-
-These variables and functions are as follows:
-
-@table @code
-@item int plugin_is_GPL_compatible;
-This asserts that the extension is compatible with the GNU GPL
-(@pxref{Copying}). If your extension does not have this, @command{gawk}
-will not load it (@pxref{Plugin License}).
-
-@item static const gawk_api_t *api;
-This global @code{static} variable should be set to point to
-the @code{gawk_api_t} pointer that @command{gawk} passes to your
-@code{dl_load()} function. This variable is used by all of the macros.
-
-@item static awk_ext_id_t ext_id;
-This global @code{static} variable should be set to the @code{awk_ext_id_t}
-value that @command{gawk} passes to your @code{dl_load()} function.
-This variable is used by all of the macros.
-
-@item static const char *ext_version = NULL; /* or @dots{} = "some string" */
-This global @code{static} variable should be set either
-to @code{NULL}, or to point to a string giving the name and version of
-your extension.
-
-@item static awk_ext_func_t func_table[] = @{ @dots{} @};
-This is an array of one or more @code{awk_ext_func_t} structures
-as described earlier (@pxref{Extension Functions}).
-It can then be looped over for multiple calls to
-@code{add_ext_func()}.
-
-@item static awk_bool_t (*init_func)(void) = NULL;
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @r{OR}
-@itemx static awk_bool_t init_my_module(void) @{ @dots{} @}
-@itemx static awk_bool_t (*init_func)(void) = init_my_module;
-If you need to do some initialization work, you should define a
-function that does it (creates variables, opens files, etc.)
-and then define the @code{init_func} pointer to point to your
-function.
-The function should return zero (false) upon failure, or non-zero
-(true) if everything goes well.
-
-If you don't need to do any initialization, define the pointer and
-initialize it to @code{NULL}.
-
-@item dl_load_func(func_table, some_name, "name_space_in_quotes")
-This macro expands to a @code{dl_load()} function that performs
-all the necessary initializations.
-@end table
-
-The point of all the variables and arrays is to let the
-@code{dl_load()} function (from the @code{dl_load_func()}
-macro) do all the standard work. It does the following:
-
-@enumerate 1
-@item
-Check the API versions. If the extension major version does not match
-@command{gawk}'s, or if the extension minor version is greater than
-@command{gawk}'s, it prints a fatal error message and exits.
-
-@item
-Load the functions defined in @code{func_table}.
-If any of them fails to load, it prints a warning message but
-continues on.
-
-@item
-If the @code{init_func} pointer is not @code{NULL}, call the
-function it points to. If it returns zero (false), print a
-warning message.
-
-@item
-If @code{ext_version} is not @code{NULL}, register
-the version string with @command{gawk}.
-@end enumerate
-
-@node Finding Extensions
-@subsection How @command{gawk} Finds Extensions
-
-Compiled extensions have to be installed in a directory where
-@command{gawk} can find them. If @command{gawk} is configured and
-built in the default fashion, the directory in which to find
-extensions is @file{/usr/local/lib/gawk}. You can also specify a search
-path with a list of directories to search for compiled extensions.
-@xref{AWKLIBPATH Variable}, for more information.
-
-@node Extension Example
-@section Example: Some File Functions
-
-@quotation
-@i{No matter where you go, there you are.} @*
-Buckaroo Banzai
-@end quotation
-
-@c It's enough to show chdir and stat, no need for fts
-
-Two useful functions that are not in @command{awk} are @code{chdir()} (so
-that an @command{awk} program can change its directory) and @code{stat()}
-(so that an @command{awk} program can gather information about a file).
-This @value{SECTION} implements these functions for @command{gawk}
-in an extension.
-
-@menu
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
-@end menu
-
-@node Internal File Description
-@subsection Using @code{chdir()} and @code{stat()}
-
-This @value{SECTION} shows how to use the new functions at
-the @command{awk} level once they've been integrated into the
-running @command{gawk} interpreter. Using @code{chdir()} is very
-straightforward. It takes one argument, the new directory to change to:
-
-@example
-@@load "filefuncs"
-@dots{}
-newdir = "/home/arnold/funstuff"
-ret = chdir(newdir)
-if (ret < 0) @{
- printf("could not change to %s: %s\n",
- newdir, ERRNO) > "/dev/stderr"
- exit 1
-@}
-@dots{}
-@end example
-
-The return value is negative if the @code{chdir()} failed, and
-@code{ERRNO} (@pxref{Built-in Variables}) is set to a string indicating
-the error.
-
-Using @code{stat()} is a bit more complicated. The C @code{stat()}
-function fills in a structure that has a fair amount of information.
-The right way to model this in @command{awk} is to fill in an associative
-array with the appropriate information:
-
-@c broke printf for page breaking
-@example
-file = "/home/arnold/.profile"
-ret = stat(file, fdata)
-if (ret < 0) @{
- printf("could not stat %s: %s\n",
- file, ERRNO) > "/dev/stderr"
- exit 1
-@}
-printf("size of %s is %d bytes\n", file, fdata["size"])
-@end example
-
-The @code{stat()} function always clears the data array, even if
-the @code{stat()} fails. It fills in the following elements:
-
-@table @code
-@item "name"
-The name of the file that was @code{stat()}'ed.
-
-@item "dev"
-@itemx "ino"
-The file's device and inode numbers, respectively.
-
-@item "mode"
-The file's mode, as a numeric value. This includes both the file's
-type and its permissions.
-
-@item "nlink"
-The number of hard links (directory entries) the file has.
-
-@item "uid"
-@itemx "gid"
-The numeric user and group ID numbers of the file's owner.
-
-@item "size"
-The size in bytes of the file.
-
-@item "blocks"
-The number of disk blocks the file actually occupies. This may not
-be a function of the file's size if the file has holes.
-
-@item "atime"
-@itemx "mtime"
-@itemx "ctime"
-The file's last access, modification, and inode update times,
-respectively. These are numeric timestamps, suitable for formatting
-with @code{strftime()}
-(@pxref{Time Functions}).
-
-@item "pmode"
-The file's ``printable mode.'' This is a string representation of
-the file's type and permissions, such as is produced by
-@samp{ls -l}---for example, @code{"drwxr-xr-x"}.
-
-@item "type"
-A printable string representation of the file's type. The value
-is one of the following:
-
-@table @code
-@item "blockdev"
-@itemx "chardev"
-The file is a block or character device (``special file'').
-
-@ignore
-@item "door"
-The file is a Solaris ``door'' (special file used for
-interprocess communications).
-@end ignore
-
-@item "directory"
-The file is a directory.
-
-@item "fifo"
-The file is a named pipe (also known as a FIFO).
-
-@item "file"
-The file is just a regular file.
-
-@item "socket"
-The file is an @code{AF_UNIX} (``Unix domain'') socket in the
-filesystem.
-
-@item "symlink"
-The file is a symbolic link.
-@end table
-@end table
-
-Several additional elements may be present depending upon the operating
-system and the type of the file. You can test for them in your @command{awk}
-program by using the @code{in} operator
-(@pxref{Reference to Elements}):
-
-@table @code
-@item "blksize"
-The preferred block size for I/O to the file. This field is not
-present on all POSIX-like systems in the C @code{stat} structure.
-
-@item "linkval"
-If the file is a symbolic link, this element is the name of the
-file the link points to (i.e., the value of the link).
-
-@item "rdev"
-@itemx "major"
-@itemx "minor"
-If the file is a block or character device file, then these values
-represent the numeric device number and the major and minor components
-of that number, respectively.
-@end table
-
-@node Internal File Ops
-@subsection C Code for @code{chdir()} and @code{stat()}
-
-Here is the C code for these extensions.@footnote{This version is
-edited slightly for presentation. See @file{extension/filefuncs.c}
-in the @command{gawk} distribution for the complete version.}
-
-The file includes a number of standard header files, and then includes
-the @file{gawkapi.h} header file which provides the API definitions.
-Those are followed by the necessary variable declarations
-to make use of the API macros and boilerplate code
-(@pxref{Extension API Boilerplate}).
-
-@c break line for page breaking
-@example
-#ifdef HAVE_CONFIG_H
-#include <config.h>
-#endif
-
-#include <stdio.h>
-#include <assert.h>
-#include <errno.h>
-#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-
-#include <sys/types.h>
-#include <sys/stat.h>
-
-#include "gawkapi.h"
-
-#include "gettext.h"
-#define _(msgid) gettext(msgid)
-#define N_(msgid) msgid
-
-#include "gawkfts.h"
-#include "stack.h"
-
-static const gawk_api_t *api; /* for convenience macros to work */
-static awk_ext_id_t ext_id;
-static awk_bool_t init_filefuncs(void);
-static awk_bool_t (*init_func)(void) = init_filefuncs;
-static const char *ext_version = "filefuncs extension: version 1.0";
-
-int plugin_is_GPL_compatible;
-@end example
-
-@cindex programming conventions, @command{gawk} internals
-By convention, for an @command{awk} function @code{foo()}, the C function
-that implements it is called @code{do_foo()}. The function should have
-two arguments: the first is an @code{int} usually called @code{nargs},
-that represents the number of actual arguments for the function.
-The second is a pointer to an @code{awk_value_t}, usually named
-@code{result}.
-
-@example
-/* do_chdir --- provide dynamically loaded chdir() builtin for gawk */
-
-static awk_value_t *
-do_chdir(int nargs, awk_value_t *result)
-@{
- awk_value_t newdir;
- int ret = -1;
-
- assert(result != NULL);
-
- if (do_lint && nargs != 1)
- lintwarn(ext_id,
- _("chdir: called with incorrect number of arguments, "
- "expecting 1"));
-@end example
-
-The @code{newdir}
-variable represents the new directory to change to, retrieved
-with @code{get_argument()}. Note that the first argument is
-numbered zero.
-
-If the argument is retrieved successfully, the function calls the
-@code{chdir()} system call. If the @code{chdir()} fails, @code{ERRNO}
-is updated.
-
-@example
- if (get_argument(0, AWK_STRING, & newdir)) @{
- ret = chdir(newdir.str_value.str);
- if (ret < 0)
- update_ERRNO_int(errno);
- @}
-@end example
-
-Finally, the function returns the return value to the @command{awk} level:
-
-@example
- return make_number(ret, result);
-@}
-@end example
-
-The @code{stat()} built-in is more involved. First comes a function
-that turns a numeric mode into a printable representation
-(e.g., octal 644 becomes @samp{-rw-r--r--}). This is omitted here for brevity:
-
-@c break line for page breaking
-@example
-/* format_mode --- turn a stat mode field into something readable */
-
-static char *
-format_mode(unsigned long fmode)
-@{
- @dots{}
-@}
-@end example
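-
-As a rough illustration only (this is not the code from
-@file{extension/filefuncs.c}), a simplified version of such a function
-might look like the following. The name @code{format_mode_sketch()} is
-hypothetical. The sketch ignores the setuid, setgid, and sticky bits, and
-assumes a POSIX system that defines all of the file type constants used:
-
```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX/XSI file type macros */

#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * format_mode_sketch --- hypothetical, simplified stand-in for format_mode().
 * Returns a pointer to a static buffer that each call overwrites.
 * The setuid, setgid, and sticky bits are deliberately ignored.
 */
const char *
format_mode_sketch(unsigned long fmode)
{
	static char buf[11];	/* 1 type char + 9 permission chars + NUL */
	static const char rwx[] = "rwxrwxrwx";
	mode_t mode = (mode_t) fmode;
	int i;

	memset(buf, '-', 10);	/* start with every position "off" */
	buf[10] = '\0';

	/* first character: the file type; regular files keep the dash */
	switch (mode & S_IFMT) {
	case S_IFDIR:	buf[0] = 'd'; break;
	case S_IFCHR:	buf[0] = 'c'; break;
	case S_IFBLK:	buf[0] = 'b'; break;
	case S_IFIFO:	buf[0] = 'p'; break;
	case S_IFLNK:	buf[0] = 'l'; break;
	case S_IFSOCK:	buf[0] = 's'; break;
	default:	break;
	}

	/* nine permission bits, from owner read (0400) to other execute (0001) */
	for (i = 0; i < 9; i++)
		if (mode & (0400 >> i))
			buf[i + 1] = rwx[i];

	return buf;
}
```
-
-Because the result lives in a static buffer, a caller that needs to keep
-the string must copy it before calling the function again.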
-
-Next comes a function for reading symbolic links, which is also
-omitted here for brevity:
-
-@example
-/* read_symlink --- read a symbolic link into an allocated buffer.
- @dots{} */
-
-static char *
-read_symlink(const char *fname, size_t bufsize, ssize_t *linksize)
-@{
- @dots{}
-@}
-@end example
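-
-Again purely as a sketch, and not the distributed code, one plausible
-way to write such a function uses the POSIX @code{readlink()} call and
-retries with a larger buffer whenever the value might have been truncated.
-The name @code{read_symlink_sketch()} and the fallback size of 128 bytes
-are assumptions made for this example:
-
```c
#define _XOPEN_SOURCE 700	/* ask for readlink() and ssize_t */

#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * read_symlink_sketch --- hypothetical stand-in for read_symlink().
 * bufsize is a size hint (such as st_size); zero means "unknown".
 * Returns a malloc()ed, NUL-terminated string and stores its length
 * in *linksize, or returns NULL on error. The caller frees the result.
 */
char *
read_symlink_sketch(const char *fname, size_t bufsize, ssize_t *linksize)
{
	/* one extra byte lets us tell "fits exactly" from "truncated" */
	size_t cap = (bufsize > 0 ? bufsize : 128) + 1;

	for (;;) {
		char *buf = malloc(cap + 1);	/* + 1 for the terminating NUL */
		ssize_t n;

		if (buf == NULL)
			return NULL;

		n = readlink(fname, buf, cap);
		if (n < 0) {
			free(buf);
			return NULL;
		}
		if ((size_t) n < cap) {
			buf[n] = '\0';	/* readlink() does not add a NUL */
			*linksize = n;
			return buf;
		}

		/* the value may have been truncated: retry with more room */
		free(buf);
		cap *= 2;
	}
}
```
-
-Returning an allocated buffer together with its length matches how the
-result is later handed to @code{make_malloced_string()}.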
-
-Two helper functions simplify entering values in the
-array that will contain the result of the @code{stat()}:
-
-@example
-/* array_set --- set an array element */
-
-static void
-array_set(awk_array_t array, const char *sub, awk_value_t *value)
-@{
- awk_value_t index;
-
- set_array_element(array,
- make_const_string(sub, strlen(sub), & index),
- value);
-
-@}
-
-/* array_set_numeric --- set an array element with a number */
-
-static void
-array_set_numeric(awk_array_t array, const char *sub, double num)
-@{
- awk_value_t tmp;
-
- array_set(array, sub, make_number(num, & tmp));
-@}
-@end example
-
-The following function does most of the work to fill in
-the @code{awk_array_t} result array with values obtained
-from a valid @code{struct stat}. It is done in a separate function
-to support the @code{stat()} function for @command{gawk} and also
-to support the @code{fts()} extension which is included in
-the same file but whose code is not shown here
-(@pxref{Extension Sample File Functions}).
-
-The first part of the function is variable declarations,
-including a table to map file types to strings:
-
-@example
-/* fill_stat_array --- do the work to fill an array with stat info */
-
-static int
-fill_stat_array(const char *name, awk_array_t array, struct stat *sbuf)
-@{
- char *pmode; /* printable mode */
- const char *type = "unknown";
- awk_value_t tmp;
- static struct ftype_map @{
- unsigned int mask;
- const char *type;
- @} ftype_map[] = @{
- @{ S_IFREG, "file" @},
- @{ S_IFBLK, "blockdev" @},
- @{ S_IFCHR, "chardev" @},
- @{ S_IFDIR, "directory" @},
-#ifdef S_IFSOCK
- @{ S_IFSOCK, "socket" @},
-#endif
-#ifdef S_IFIFO
- @{ S_IFIFO, "fifo" @},
-#endif
-#ifdef S_IFLNK
- @{ S_IFLNK, "symlink" @},
-#endif
-#ifdef S_IFDOOR /* Solaris weirdness */
- @{ S_IFDOOR, "door" @},
-#endif /* S_IFDOOR */
- @};
- int j, k;
-@end example
-
-The destination array is cleared, and then code fills in
-various elements based on values in the @code{struct stat}:
-
-@example
- /* empty out the array */
- clear_array(array);
-
- /* fill in the array */
- array_set(array, "name", make_const_string(name, strlen(name),
- & tmp));
- array_set_numeric(array, "dev", sbuf->st_dev);
- array_set_numeric(array, "ino", sbuf->st_ino);
- array_set_numeric(array, "mode", sbuf->st_mode);
- array_set_numeric(array, "nlink", sbuf->st_nlink);
- array_set_numeric(array, "uid", sbuf->st_uid);
- array_set_numeric(array, "gid", sbuf->st_gid);
- array_set_numeric(array, "size", sbuf->st_size);
- array_set_numeric(array, "blocks", sbuf->st_blocks);
- array_set_numeric(array, "atime", sbuf->st_atime);
- array_set_numeric(array, "mtime", sbuf->st_mtime);
- array_set_numeric(array, "ctime", sbuf->st_ctime);
-
- /* for block and character devices, add rdev,
- major and minor numbers */
- if (S_ISBLK(sbuf->st_mode) || S_ISCHR(sbuf->st_mode)) @{
- array_set_numeric(array, "rdev", sbuf->st_rdev);
- array_set_numeric(array, "major", major(sbuf->st_rdev));
- array_set_numeric(array, "minor", minor(sbuf->st_rdev));
- @}
-@end example
-
-@noindent
-The latter part of the function makes selective additions
-to the destination array, depending upon the availability of
-certain members and/or the type of the file. It then returns zero,
-for success:
-
-@example
-#ifdef HAVE_ST_BLKSIZE
- array_set_numeric(array, "blksize", sbuf->st_blksize);
-#endif /* HAVE_ST_BLKSIZE */
-
- pmode = format_mode(sbuf->st_mode);
- array_set(array, "pmode", make_const_string(pmode, strlen(pmode),
- & tmp));
-
- /* for symbolic links, add a linkval field */
- if (S_ISLNK(sbuf->st_mode)) @{
- char *buf;
- ssize_t linksize;
-
- if ((buf = read_symlink(name, sbuf->st_size,
- & linksize)) != NULL)
- array_set(array, "linkval",
- make_malloced_string(buf, linksize, & tmp));
- else
- warning(ext_id, _("stat: unable to read symbolic link `%s'"),
- name);
- @}
-
- /* add a type field */
- type = "unknown"; /* shouldn't happen */
- for (j = 0, k = sizeof(ftype_map)/sizeof(ftype_map[0]); j < k; j++) @{
- if ((sbuf->st_mode & S_IFMT) == ftype_map[j].mask) @{
- type = ftype_map[j].type;
- break;
- @}
- @}
-
- array_set(array, "type", make_const_string(type, strlen(type), &tmp));
-
- return 0;
-@}
-@end example
-
-Finally, here is the @code{do_stat()} function. It starts with
-variable declarations and argument checking:
-
-@ignore
-Changed message for page breaking. Used to be:
- "stat: called with incorrect number of arguments (%d), should be 2",
-@end ignore
-@example
-/* do_stat --- provide a stat() function for gawk */
-
-static awk_value_t *
-do_stat(int nargs, awk_value_t *result)
-@{
- awk_value_t file_param, array_param;
- char *name;
- awk_array_t array;
- int ret;
- struct stat sbuf;
-
- assert(result != NULL);
-
- if (do_lint && nargs != 2) @{
- lintwarn(ext_id,
- _("stat: called with wrong number of arguments"));
- return make_number(-1, result);
- @}
-@end example
-
-Then comes the actual work. First, the function gets the arguments.
-Next, it gets the information for the file.
-The code uses @code{lstat()} (instead of @code{stat()})
-to get the file information,
-in case the file is a symbolic link.
-If there's an error, it sets @code{ERRNO} and returns:
-
-@example
- /* file is first arg, array to hold results is second */
- if ( ! get_argument(0, AWK_STRING, & file_param)
- || ! get_argument(1, AWK_ARRAY, & array_param)) @{
- warning(ext_id, _("stat: bad parameters"));
- return make_number(-1, result);
- @}
-
- name = file_param.str_value.str;
- array = array_param.array_cookie;
-
- /* always empty out the array */
- clear_array(array);
-
- /* lstat the file, if error, set ERRNO and return */
- ret = lstat(name, & sbuf);
- if (ret < 0) @{
- update_ERRNO_int(errno);
- return make_number(ret, result);
- @}
-@end example
-
-The tedious work is done by @code{fill_stat_array()}, shown
-earlier. When done, return the result from @code{fill_stat_array()}:
-
-@example
- ret = fill_stat_array(name, array, & sbuf);
-
- return make_number(ret, result);
-@}
-@end example
-
-@cindex programming conventions, @command{gawk} internals
-Finally, it's necessary to provide the ``glue'' that loads the
-new function(s) into @command{gawk}.
-
-The @code{filefuncs} extension also provides an @code{fts()}
-function, which we omit here. For its sake there is an initialization
-function:
-
-@example
-/* init_filefuncs --- initialization routine */
-
-static awk_bool_t
-init_filefuncs(void)
-@{
- @dots{}
-@}
-@end example
-
-We are almost done. We need an array of @code{awk_ext_func_t}
-structures for loading each function into @command{gawk}:
-
-@example
-static awk_ext_func_t func_table[] = @{
- @{ "chdir", do_chdir, 1 @},
- @{ "stat", do_stat, 2 @},
- @{ "fts", do_fts, 3 @},
-@};
-@end example
-
-Each extension must have a routine named @code{dl_load()} to load
-everything that needs to be loaded. It is simplest to use the
-@code{dl_load_func()} macro in @file{gawkapi.h}:
-
-@example
-/* define the dl_load() function using the boilerplate macro */
-
-dl_load_func(func_table, filefuncs, "")
-@end example
-
-And that's it! As an exercise, consider adding functions to
-implement system calls such as @code{chown()}, @code{chmod()},
-and @code{umask()}.
-
-@node Using Internal File Ops
-@subsection Integrating The Extensions
-
-@cindex @command{gawk}, interpreter@comma{} adding code to
-Now that the code is written, it must be possible to add it at
-runtime to the running @command{gawk} interpreter. First, the
-code must be compiled. Assuming that the functions are in
-a file named @file{filefuncs.c}, and @var{idir} is the location
-of the @file{gawkapi.h} header file,
-the following steps@footnote{In practice, you would probably want to
-use the GNU Autotools---Automake, Autoconf, Libtool, and Gettext---to
-configure and build your libraries. Instructions for doing so are beyond
-the scope of this @value{DOCUMENT}. @xref{gawkextlib}, for WWW links to
-the tools.} create a GNU/Linux shared library:
-
-@example
-$ @kbd{gcc -fPIC -shared -DHAVE_CONFIG_H -c -O -g -I@var{idir} filefuncs.c}
-$ @kbd{ld -o filefuncs.so -shared filefuncs.o -lc}
-@end example
-
-Once the library exists, it is loaded by using the @code{@@load} keyword.
-
-@example
-# file testff.awk
-@@load "filefuncs"
-
-BEGIN @{
- "pwd" | getline curdir # save current directory
- close("pwd")
-
- chdir("/tmp")
- system("pwd") # test it
- chdir(curdir) # go back
-
- print "Info for testff.awk"
- ret = stat("testff.awk", data)
- print "ret =", ret
- for (i in data)
- printf "data[\"%s\"] = %s\n", i, data[i]
- print "testff.awk modified:",
- strftime("%m %d %y %H:%M:%S", data["mtime"])
-
- print "\nInfo for JUNK"
- ret = stat("JUNK", data)
- print "ret =", ret
- for (i in data)
- printf "data[\"%s\"] = %s\n", i, data[i]
- print "JUNK modified:", strftime("%m %d %y %H:%M:%S", data["mtime"])
-@}
-@end example
-
-The @env{AWKLIBPATH} environment variable tells
-@command{gawk} where to find shared libraries (@pxref{Finding Extensions}).
-We set it to the current directory and run the program:
-
-@example
-$ @kbd{AWKLIBPATH=$PWD gawk -f testff.awk}
-@print{} /tmp
-@print{} Info for testff.awk
-@print{} ret = 0
-@print{} data["blksize"] = 4096
-@print{} data["mtime"] = 1350838628
-@print{} data["mode"] = 33204
-@print{} data["type"] = file
-@print{} data["dev"] = 2053
-@print{} data["gid"] = 1000
-@print{} data["ino"] = 1719496
-@print{} data["ctime"] = 1350838628
-@print{} data["blocks"] = 8
-@print{} data["nlink"] = 1
-@print{} data["name"] = testff.awk
-@print{} data["atime"] = 1350838632
-@print{} data["pmode"] = -rw-rw-r--
-@print{} data["size"] = 662
-@print{} data["uid"] = 1000
-@print{} testff.awk modified: 10 21 12 18:57:08
-@print{}
-@print{} Info for JUNK
-@print{} ret = -1
-@print{} JUNK modified: 01 01 70 02:00:00
-@end example
-
-@node Extension Samples
-@section The Sample Extensions In The @command{gawk} Distribution
-
-This @value{SECTION} provides brief overviews of the sample extensions
-that come in the @command{gawk} distribution. Some of them are intended
-for production use, such as the @code{filefuncs} and @code{readdir} extensions.
-Others mainly provide example code that shows how to use the extension API.
-
-@menu
-* Extension Sample File Functions:: The file functions sample.
-* Extension Sample Fnmatch:: An interface to @code{fnmatch()}.
-* Extension Sample Fork:: An interface to @code{fork()} and other
- process functions.
-* Extension Sample Ord:: Character to value to character
- conversions.
-* Extension Sample Readdir:: An interface to @code{readdir()}.
-* Extension Sample Revout:: Reversing output sample output wrapper.
-* Extension Sample Rev2way:: Reversing data sample two-way processor.
-* Extension Sample Read write array:: Serializing an array to a file.
-* Extension Sample Readfile:: Reading an entire file into a string.
-* Extension Sample API Tests:: Tests for the API.
-* Extension Sample Time:: An interface to @code{gettimeofday()}
- and @code{sleep()}.
-@end menu
-
-@node Extension Sample File Functions
-@subsection File Related Functions
-
-The @code{filefuncs} extension provides three different functions.
-The usage is:
-
-@table @code
-@item @@load "filefuncs"
-This is how you load the extension.
-
-@item result = chdir("/some/directory")
-The @code{chdir()} function is a direct hook to the @code{chdir()}
-system call to change the current directory. It returns zero
-upon success or less than zero upon error. In the latter case it updates
-@code{ERRNO}.
-
-@item result = stat("/some/path", statdata)
-The @code{stat()} function provides a hook into the
-@code{stat()} system call. In fact, it uses @code{lstat()}.
-It returns zero upon success or less than zero upon error.
-In the latter case it updates @code{ERRNO}.
-
-In all cases, it clears the @code{statdata} array.
-When the call is successful, @code{stat()} fills the @code{statdata}
-array with information retrieved from the filesystem, as follows:
-
-@c nested table
-@multitable @columnfractions .25 .60
-@item @code{statdata["name"]} @tab
-The name of the file.
-
-@item @code{statdata["dev"]} @tab
-Corresponds to the @code{st_dev} field in the @code{struct stat}.
-
-@item @code{statdata["ino"]} @tab
-Corresponds to the @code{st_ino} field in the @code{struct stat}.
-
-@item @code{statdata["mode"]} @tab
-Corresponds to the @code{st_mode} field in the @code{struct stat}.
-
-@item @code{statdata["nlink"]} @tab
-Corresponds to the @code{st_nlink} field in the @code{struct stat}.
-
-@item @code{statdata["uid"]} @tab
-Corresponds to the @code{st_uid} field in the @code{struct stat}.
-
-@item @code{statdata["gid"]} @tab
-Corresponds to the @code{st_gid} field in the @code{struct stat}.
-
-@item @code{statdata["size"]} @tab
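-Corresponds to the @code{st_size} field in the @code{struct stat}.
-
-@item @code{statdata["blocks"]} @tab
-Corresponds to the @code{st_blocks} field in the @code{struct stat}.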
-
-@item @code{statdata["atime"]} @tab
-Corresponds to the @code{st_atime} field in the @code{struct stat}.
-
-@item @code{statdata["mtime"]} @tab
-Corresponds to the @code{st_mtime} field in the @code{struct stat}.
-
-@item @code{statdata["ctime"]} @tab
-Corresponds to the @code{st_ctime} field in the @code{struct stat}.
-
-@item @code{statdata["rdev"]} @tab
-Corresponds to the @code{st_rdev} field in the @code{struct stat}.
-This element is only present for device files.
-
-@item @code{statdata["major"]} @tab
-The major device number, extracted from the @code{st_rdev} field
-in the @code{struct stat}.
-This element is only present for device files.
-
-@item @code{statdata["minor"]} @tab
-The minor device number, extracted from the @code{st_rdev} field
-in the @code{struct stat}.
-This element is only present for device files.
-
-@item @code{statdata["blksize"]} @tab
-Corresponds to the @code{st_blksize} field in the @code{struct stat},
-if this field is present on your system.
-(It is present on all modern systems that we know of.)
-
-@item @code{statdata["pmode"]} @tab
-A human-readable version of the mode value, such as printed by
-@command{ls}. For example, @code{"-rwxr-xr-x"}.
-
-@item @code{statdata["linkval"]} @tab
-If the named file is a symbolic link, this element will exist
-and its value is the value of the symbolic link (where the
-symbolic link points to).
-
-@item @code{statdata["type"]} @tab
-The type of the file as a string. One of
-@code{"file"},
-@code{"blockdev"},
-@code{"chardev"},
-@code{"directory"},
-@code{"socket"},
-@code{"fifo"},
-@code{"symlink"},
-@code{"door"},
-or
-@code{"unknown"}.
-Not all systems support all file types.
-@end multitable
-
-@item flags = or(FTS_PHYSICAL, @dots{})
-@itemx result = fts(pathlist, flags, filedata)
-Walk the file trees provided in @code{pathlist} and fill in the
-@code{filedata} array as described below. @code{flags} is the bitwise
-OR of several predefined constant values, also as described below.
-Return zero if there were no errors, otherwise return @minus{}1.
-@end table
-
-The @code{fts()} function provides a hook to the C library @code{fts()}
-routines for traversing file hierarchies. Instead of returning data
-about one file at a time in a stream, it fills in a multi-dimensional
-array with data about each file and directory encountered in the requested
-hierarchies.
-
-The arguments are as follows:
-
-@table @code
-@item pathlist
-An array of filenames. The element values are used; the index values are ignored.
-
-@item flags
-This should be the bitwise OR of one or more of the following
-predefined constant flag values. At least one of
-@code{FTS_LOGICAL} or @code{FTS_PHYSICAL} must be provided; otherwise
-@code{fts()} returns an error value and sets @code{ERRNO}.
-The flags are:
-
-@c nested table
-@table @code
-@item FTS_LOGICAL
-Do a ``logical'' file traversal, where the information returned for
-a symbolic link refers to the linked-to file, and not to the symbolic
-link itself. This flag is mutually exclusive with @code{FTS_PHYSICAL}.
-
-@item FTS_PHYSICAL
-Do a ``physical'' file traversal, where the information returned for a
-symbolic link refers to the symbolic link itself. This flag is mutually
-exclusive with @code{FTS_LOGICAL}.
-
-@item FTS_NOCHDIR
-As a performance optimization, the C library @code{fts()} routines
-change directory as they traverse a file hierarchy. This flag disables
-that optimization.
-
-@item FTS_COMFOLLOW
-Immediately follow a symbolic link named in @code{pathlist},
-whether or not @code{FTS_LOGICAL} is set.
-
-@item FTS_SEEDOT
-By default, the @code{fts()} routines do not return entries for @file{.}
-and @file{..}. This option causes entries for @file{..} to also
-be included. (The extension always includes an entry for @file{.},
-see below.)
-
-@item FTS_XDEV
-During a traversal, do not cross onto a different mounted filesystem.
-@end table
-
-@item filedata
-The @code{filedata} array is first cleared. Then, @code{fts()} creates
-an element in @code{filedata} for every element in @code{pathlist}.
-The index is the name of the directory or file given in @code{pathlist}.
-The element for this index is itself an array. There are two cases.
-
-@c nested table
-@table @emph
-@item The path is a file.
-In this case, the array contains two or three elements:
-
-@c doubly nested table
-@table @code
-@item "path"
-The full path to this file, starting from the ``root'' that was given
-in the @code{pathlist} array.
-
-@item "stat"
-This element is itself an array, containing the same information as provided
-by the @code{stat()} function described earlier for its
-@code{statdata} argument. The element may not be present if
-the @code{stat()} system call for the file failed.
-
-@item "error"
-If some kind of error was encountered, the array will also
-contain an element named @code{"error"}, which is a string describing the error.
-@end table
-
-@item The path is a directory.
-In this case, the array contains one element for each entry in the
-directory. If an entry is a file, that element is as for files, just
-described. If the entry is a directory, that element is (recursively)
-an array describing the subdirectory. If @code{FTS_SEEDOT} was provided
-in the flags, then there will also be an element named @code{".."}. This
-element will be an array containing the data as provided by @code{stat()}.
-
-In addition, there will be an element whose index is @code{"."}.
-This element is an array containing the same two or three elements as
-for a file: @code{"path"}, @code{"stat"}, and @code{"error"}.
-@end table
-@end table
-
-The @code{fts()} function returns zero if there were no errors.
-Otherwise it returns @minus{}1.
-
-@quotation NOTE
-The @code{fts()} extension does not exactly mimic the
-interface of the C library @code{fts()} routines, choosing instead to
-provide an interface that is based on associative arrays, which should
-be more comfortable to use from an @command{awk} program. This includes the
-lack of a comparison function, since @command{gawk} already provides
-powerful array sorting facilities. While an @code{fts_read()}-like
-interface could have been provided, this felt less natural than simply
-creating a multi-dimensional array to represent the file hierarchy and
-its information.
-@end quotation
-
-See @file{test/fts.awk} in the @command{gawk} distribution for an example.
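-
-As a rough illustration of the array layout described above, the following
-sketch walks the @code{filedata} array and prints the @code{"path"} element
-of every file found. The function name @code{walk()} and the starting
-directory are invented for this example. It assumes the @code{filefuncs}
-extension and the @code{fts(pathlist, flags, filedata)} calling sequence,
-and it relies on only directory elements containing a @code{"."} entry:
-
-@example
-@@load "filefuncs"
-
-# walk --- print the path of each file below node (hypothetical helper)
-function walk(node,    entry)
-@{
-    if ("." in node) @{          # node describes a directory
-        for (entry in node)
-            if (entry != "." && entry != "..")
-                walk(node[entry])
-    @} else if ("path" in node)  # node describes a file
-        print node["path"]
-@}
-
-BEGIN @{
-    pathlist[1] = "/some/dir"   # placeholder starting point
-    if (fts(pathlist, FTS_PHYSICAL, filedata) != 0)
-        print "fts() reported an error", ERRNO > "/dev/stderr"
-    for (root in filedata)
-        walk(filedata[root])
-@}
-@end example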
-
-@node Extension Sample Fnmatch
-@subsection Interface To @code{fnmatch()}
-
-This extension provides an interface to the C library
-@code{fnmatch()} function. The usage is:
-
-@example
-@@load "fnmatch"
-
-result = fnmatch(pattern, string, flags)
-@end example
-
-The @code{fnmatch} extension adds a single function named
-@code{fnmatch()}, one constant (@code{FNM_NOMATCH}), and an array of
-flag values named @code{FNM}.
-
-The arguments to @code{fnmatch()} are:
-
-@table @code
-@item pattern
-The filename wildcard to match.
-
-@item string
-The filename string to match against @code{pattern}.
-
-@item flags
-Either zero, or the bitwise OR of one or more of the
-flags in the @code{FNM} array.
-@end table
-
-The return value is zero on success, @code{FNM_NOMATCH}
-if the string did not match the pattern, or
-a different non-zero value if an error occurred.
-
-The flags are as follows:
-
-@multitable @columnfractions .25 .75
-@item @code{FNM["CASEFOLD"]} @tab
-Corresponds to the @code{FNM_CASEFOLD} flag as defined in @code{fnmatch()}.
-
-@item @code{FNM["FILE_NAME"]} @tab
-Corresponds to the @code{FNM_FILE_NAME} flag as defined in @code{fnmatch()}.
-
-@item @code{FNM["LEADING_DIR"]} @tab
-Corresponds to the @code{FNM_LEADING_DIR} flag as defined in @code{fnmatch()}.
-
-@item @code{FNM["NOESCAPE"]} @tab
-Corresponds to the @code{FNM_NOESCAPE} flag as defined in @code{fnmatch()}.
-
-@item @code{FNM["PATHNAME"]} @tab
-Corresponds to the @code{FNM_PATHNAME} flag as defined in @code{fnmatch()}.
-
-@item @code{FNM["PERIOD"]} @tab
-Corresponds to the @code{FNM_PERIOD} flag as defined in @code{fnmatch()}.
-@end multitable
-
-Here is an example:
-
-@example
-@@load "fnmatch"
-@dots{}
-flags = or(FNM["PERIOD"], FNM["NOESCAPE"])
-if (fnmatch("*.a", "foo.c", flags) == FNM_NOMATCH)
- print "no match"
-@end example
-
-@node Extension Sample Fork
-@subsection Interface To @code{fork()}, @code{wait()} and @code{waitpid()}
-
-The @code{fork} extension adds three functions, as follows.
-
-@table @code
-@item @@load "fork"
-This is how you load the extension.
-
-@item pid = fork()
-This function creates a new process. The return value is zero in the
-child and the process-id number of the child in the parent, or @minus{}1
-upon error. In the latter case, @code{ERRNO} indicates the problem.
-In the child, @code{PROCINFO["pid"]} and @code{PROCINFO["ppid"]} are
-updated to reflect the correct values.
-
-@item ret = waitpid(pid)
-This function takes a numeric argument, which is the process-id to
-wait for. The return value is that of the
-@code{waitpid()} system call.
-
-@item ret = wait()
-This function waits for the first child to die.
-The return value is that of the
-@code{wait()} system call.
-@end table
-
-There is no corresponding @code{exec()} function.
-
-Here is an example:
-
-@example
-@@load "fork"
-@dots{}
-if ((pid = fork()) == 0)
- print "hello from the child"
-else
- print "hello from the parent"
-@end example
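-
-The following sketch extends the example above so that the parent checks
-for failure and then waits for the child. It uses only the functions
-described in this section; the messages are arbitrary:
-
-@example
-@@load "fork"
-
-BEGIN @{
-    pid = fork()
-    if (pid < 0) @{
-        print "fork failed:", ERRNO > "/dev/stderr"
-        exit 1
-    @} else if (pid == 0) @{
-        print "hello from the child, pid", PROCINFO["pid"]
-        exit 0
-    @}
-    ret = waitpid(pid)      # the value returned by the system call
-    print "parent: waitpid() returned", ret
-@}
-@end example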
-
-@node Extension Sample Ord
-@subsection Character and Numeric Values: @code{ord()} and @code{chr()}
-
-The @code{ordchr} extension adds two functions, named
-@code{ord()} and @code{chr()}, as follows.
-
-@table @code
-@item number = ord(string)
-Return the numeric value of the first character in @code{string}.
-
-@item char = chr(number)
-Return the string whose first character is that represented by @code{number}.
-@end table
-
-These functions are inspired by the Pascal language functions
-of the same name. Here is an example:
-
-@example
-@@load "ordchr"
-@dots{}
-printf("The numeric value of 'A' is %d\n", ord("A"))
-printf("The string value of 65 is %s\n", chr(65))
-@end example
-
-@node Extension Sample Readdir
-@subsection Reading Directories
-
-The @code{readdir} extension adds an input parser for directories, and
-adds a single function named @code{readdir_do_ftype()}.
-The usage is as follows:
-
-@example
-@@load "readdir"
-
-readdir_do_ftype("stat") # or "dirent" or "never"
-@end example
-
-When this extension is in use, instead of skipping directories named
-on the command line (or with @code{getline}),
-they are read, with each entry returned as a record.
-
-The record consists of at least two fields: the inode number and the
-filename, separated by a forward slash character.
-On systems where the directory entry contains the file type, the record
-has a third field which is a single letter indicating the type of the
-file:
-
-@multitable @columnfractions .1 .9
-@headitem Letter @tab File Type
-@item @code{b} @tab Block device
-@item @code{c} @tab Character device
-@item @code{d} @tab Directory
-@item @code{f} @tab Regular file
-@item @code{l} @tab Symbolic link
-@item @code{p} @tab Named pipe (FIFO)
-@item @code{s} @tab Socket
-@item @code{u} @tab Anything else (unknown)
-@end multitable
-
-On systems without the file type information, calling
-@samp{readdir_do_ftype("stat")} causes the extension to use the
-@code{lstat()} system call to retrieve the appropriate information. This
-is not the default, since @code{lstat()} is a potentially expensive
-operation. By calling @samp{readdir_do_ftype("never")} one can ensure
-that the file type information is never displayed, even when readily
-available in the directory entry.
-
-The third option, @samp{readdir_do_ftype("dirent")}, takes file type
-information from the directory entry, if it is available. This is the
-default on systems that supply this information.
-
-The @code{readdir_do_ftype()} function sets @code{ERRNO} if called
-without arguments or with invalid arguments.
-
-@quotation NOTE
-On GNU/Linux systems, there are filesystems that don't support the
-@code{d_type} entry (see the @i{readdir}(3) manual page), and so the file
-type is always @samp{u}. Therefore, using @samp{readdir_do_ftype("stat")}
-is advisable even on GNU/Linux systems. In this case, the @code{readdir}
-extension falls back to using @code{lstat()} when it encounters an
-unknown file type.
-@end quotation
-
-Here is an example:
-
-@example
-@@load "readdir"
-@dots{}
-BEGIN @{ FS = "/" @}
-@{ print "file name is", $2 @}
-@end example
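-
-As a further sketch, the following program prints only the subdirectories
-found in the directories named on the command line. It assumes that
-requesting @code{"stat"} makes the third field available, as described
-above:
-
-@example
-@@load "readdir"
-
-BEGIN @{
-    FS = "/"
-    readdir_do_ftype("stat")    # ask for the file type in field three
-@}
-
-# Skip the "." and ".." entries, which are directories too
-$3 == "d" && $2 != "." && $2 != ".." @{ print "subdirectory:", $2 @}
-@end example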
-
-@node Extension Sample Revout
-@subsection Reversing Output
-
-The @code{revoutput} extension adds a simple output wrapper that reverses
-the characters in each output line. Its main purpose is to show how to
-write an output wrapper, although it may be mildly amusing for the unwary.
-Here is an example:
-
-@example
-@@load "revoutput"
-
-BEGIN @{
- REVOUT = 1
- print "hello, world" > "/dev/stdout"
-@}
-@end example
-
-The output from this program is:
-@samp{dlrow ,olleh}.
-
-@node Extension Sample Rev2way
-@subsection Two-Way I/O Example
-
-The @code{revtwoway} extension adds a simple two-way processor that
-reverses the characters in each line sent to it for reading back by
-the @command{awk} program. Its main purpose is to show how to write
-a two-way processor, although it may also be mildly amusing.
-The following example shows how to use it:
-
-@example
-@@load "revtwoway"
-
-BEGIN @{
- cmd = "/magic/mirror"
- print "hello, world" |& cmd
- cmd |& getline result
- print result
- close(cmd)
-@}
-@end example
-
-@node Extension Sample Read write array
-@subsection Dumping and Restoring An Array
-
-The @code{rwarray} extension adds two functions,
-named @code{writea()} and @code{reada()}, as follows:
-
-@table @code
-@item ret = writea(file, array)
-This function takes a string argument, which is the name of the file
-to which to dump the array, and the array itself as the second argument.
-@code{writea()} understands multidimensional arrays. It returns one on
-success, or zero upon failure.
-
-@item ret = reada(file, array)
-@code{reada()} is the inverse of @code{writea()};
-it reads the file named as its first argument, filling in
-the array named as the second argument. It clears the array first.
-Here too, the return value is one on success and zero upon failure.
-@end table
-
-The array created by @code{reada()} is identical to that written by
-@code{writea()} in the sense that the contents are the same. However,
-due to implementation issues, the array traversal order of the recreated
-array is likely to be different from that of the original array. As array
-traversal order in @command{awk} is by default undefined, this is not
-(technically) a problem. If you need to guarantee a particular traversal
-order, use the array sorting features in @command{gawk} to do so
-(@pxref{Array Sorting}).
-
-The file contains binary data. All integral values are written in network
-byte order. However, double precision floating-point values are written
-as native binary data. Thus, arrays containing only string data can
-theoretically be dumped on systems with one byte order and restored on
-systems with a different one, but this has not been tried.
-
-Here is an example:
-
-@example
-@@load "rwarray"
-@dots{}
-ret = writea("arraydump.bin", array)
-@dots{}
-ret = reada("arraydump.bin", array)
-@end example
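-
-A slightly fuller sketch of a round trip, checking both return values,
-might look like the following. The array contents, the name @code{copy},
-and the file name are arbitrary:
-
-@example
-@@load "rwarray"
-
-BEGIN @{
-    data["a"] = 1
-    data["sub"]["b"] = "two"    # multidimensional arrays are understood
-
-    if (! writea("arraydump.bin", data)) @{
-        print "writea() failed" > "/dev/stderr"
-        exit 1
-    @}
-
-    split("", copy)             # make sure copy is an (empty) array
-    if (! reada("arraydump.bin", copy)) @{
-        print "reada() failed" > "/dev/stderr"
-        exit 1
-    @}
-    print copy["a"], copy["sub"]["b"]
-@}
-@end example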
-
-@node Extension Sample Readfile
-@subsection Reading An Entire File
-
-The @code{readfile} extension adds a single function
-named @code{readfile()}:
-
-@table @code
-@item result = readfile("/some/path")
-The argument is the name of the file to read. The return value is a
-string containing the entire contents of the requested file. Upon error,
-the function returns the empty string and sets @code{ERRNO}.
-@end table
-
-Here is an example:
-
-@example
-@@load "readfile"
-@dots{}
-contents = readfile("/path/to/file");
-if (contents == "" && ERRNO != "") @{
- print("problem reading file", ERRNO) > "/dev/stderr"
- @dots{}
-@}
-@end example
-
-@node Extension Sample API Tests
-@subsection API Tests
-
-The @code{testext} extension exercises parts of the extension API that
-are not tested by the other samples. The @file{extension/testext.c}
-file contains both the C code for the extension and @command{awk}
-test code inside C comments that run the tests. The testing framework
-extracts the @command{awk} code and runs the tests. See the source file
-for more information.
-
-@node Extension Sample Time
-@subsection Extension Time Functions
-
-@cindex time
-@cindex sleep
-
-These functions can be used by either invoking @command{gawk}
-with a command-line argument of @samp{-l time} or by
-inserting @samp{@@load "time"} in your script.
-
-@table @code
-
-@cindex @code{gettimeofday} time extension function
-@item the_time = gettimeofday()
-Return the time in seconds that has elapsed since 1970-01-01 UTC as a
-floating-point value. If the time is unavailable on this platform, return
-@minus{}1 and set @code{ERRNO}. The returned time should have sub-second
-precision, but the actual precision will vary based on the platform.
-If the standard C @code{gettimeofday()} system call is available on this
-platform, then this function simply returns that call's value. Otherwise, if on Windows,
-it tries to use @code{GetSystemTimeAsFileTime()}.
-
-@cindex @code{sleep} time extension function
-@item result = sleep(@var{seconds})
-Attempt to sleep for @var{seconds} seconds. If @var{seconds} is negative,
-or the attempt to sleep fails, return @minus{}1 and set @code{ERRNO}.
-Otherwise, return zero after sleeping for the indicated amount of time.
-Note that @var{seconds} may be a floating-point (non-integral) value.
-Implementation details: depending on platform availability, this function
-tries to use @code{nanosleep()} or @code{select()} to implement the delay.
-@end table
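-
-As a small sketch of how the two functions fit together, the following
-program measures how long a fractional-second sleep actually takes. The
-delay of 0.25 seconds is arbitrary, and the measured value will vary
-with the platform:
-
-@example
-@@load "time"
-
-BEGIN @{
-    start = gettimeofday()
-    if (sleep(0.25) < 0)        # sleep() returns -1 on failure
-        print "sleep failed:", ERRNO > "/dev/stderr"
-    printf("slept for about %.3f seconds\n", gettimeofday() - start)
-@}
-@end example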
-
-@node gawkextlib
-@section The @code{gawkextlib} Project
-
-The @uref{http://sourceforge.net/projects/gawkextlib/, @code{gawkextlib}}
-project provides a number of @command{gawk} extensions, including one for
-processing XML files. This is the evolution of the original @command{xgawk}
-(XML @command{gawk}) project.
-
-As of this writing, there are four extensions:
-
-@itemize @bullet
-@item
-XML parser extension, using the @uref{http://expat.sourceforge.net, Expat}
-XML parsing library.
-
-@item
-PostgreSQL extension.
-
-@item
-GD graphics library extension.
-
-@item
-MPFR library extension.
-This provides access to a number of MPFR functions which @command{gawk}'s
-native MPFR support does not.
-@end itemize
-
-The @code{time} extension described earlier (@pxref{Extension Sample
-Time}) was originally from this project but has been moved into the
-main @command{gawk} distribution.
-
-You can check out the code for the @code{gawkextlib} project
-using the @uref{http://git-scm.com, Git} distributed source
-code control system. The command is as follows:
-
-@example
-git clone git://git.code.sf.net/p/gawkextlib/code gawkextlib-code
-@end example
-
-You will need to have the @uref{http://expat.sourceforge.net, Expat}
-XML parser library installed in order to build and use the XML extension.
-
-In addition, you must have the GNU Autotools installed
-(@uref{http://www.gnu.org/software/autoconf, Autoconf},
-@uref{http://www.gnu.org/software/automake, Automake},
-@uref{http://www.gnu.org/software/libtool, Libtool},
-and
-@uref{http://www.gnu.org/software/gettext, Gettext}).
-
-The simple recipe for building and testing @code{gawkextlib} is as follows.
-First, build and install @command{gawk}:
-
-@example
-cd .../path/to/gawk/code
-./configure --prefix=/tmp/newgawk @ii{Install in /tmp/newgawk for now}
-make && make check @ii{Build and check that all is OK}
-make install @ii{Install gawk}
-@end example
-
-Next, build @code{gawkextlib} and test it:
-
-@example
-cd .../path/to/gawkextlib-code
-./update-autotools @ii{Generate configure, etc.}
- @ii{You may have to run this command twice}
-./configure --with-gawk=/tmp/newgawk @ii{Configure, point at ``installed'' gawk}
-make && make check @ii{Build and check that all is OK}
-@end example
-
-If you write an extension that you wish to share with other
-@command{gawk} users, please consider doing so through the
-@code{gawkextlib} project.
-
-@node Fake Chapter
-@chapter Fake Sections For Cross References
-
-@menu
-* Reference to Elements:: Referring to an Array Element.
-* Built-in:: Built-in Functions.
-* Built-in Variables:: Built-in Variables.
-* Options:: Command-Line Options.
-* AWKLIBPATH Variable:: The @env{AWKLIBPATH} Environment Variable.
-* BEGINFILE/ENDFILE:: The @code{BEGINFILE} and @code{ENDFILE} Special Patterns.
-* Redirection:: Redirecting Output of @code{print} and @code{printf}.
-* Arrays:: Arrays in @command{awk}.
-* Conversion:: Conversion of Strings and Numbers.
-* Delete:: The @code{delete} Statement.
-* String Functions:: String-Manipulation Functions.
-* Glossary:: Glossary.
-* Copying:: GNU General Public License.
-* Reading Files:: Reading Input Files.
-* Time Functions:: Time Functions.
-* Array Sorting:: Controlling Array Traversal and Array Sorting.
-@end menu
-
-@node Reference to Elements
-@section Referring to an Array Element
-
-@node Built-in
-@section Built-in Functions
-
-@node Built-in Variables
-@section Built-in Variables
-
-@node Options
-@section Command-Line Options
-
-@node AWKLIBPATH Variable
-@section The @env{AWKLIBPATH} Environment Variable
-
-@node BEGINFILE/ENDFILE
-@section The @code{BEGINFILE} and @code{ENDFILE} Special Patterns
-
-@node Redirection
-@section Redirecting Output of @code{print} and @code{printf}
-
-@node Arrays
-@section Arrays in @command{awk}
-
-@node Conversion
-@section Conversion of Strings and Numbers
-
-@node Delete
-@section The @code{delete} Statement
-
-@node String Functions
-@section String-Manipulation Functions
-
-@node Glossary
-@section Glossary
-
-@node Copying
-@section GNU General Public License
-
-@node Reading Files
-@section Reading Input Files
-
-@node Time Functions
-@section Time Functions
-
-@node Array Sorting
-@section Controlling Array Traversal and Array Sorting
-
-@bye