author | Arnold D. Robbins <arnold@skeeve.com> | 2012-12-16 17:10:48 +0200 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2012-12-16 17:10:48 +0200 |
commit | 52c4d6df2661e9ebdde8fcc0ea2e9308f37efd2e (patch) | |
tree | f06af9e0fa8173f260e900452a6e7bba3f877637 | |
parent | 287b218ce09459ba4d66dd0c4ad6f1c48f525c82 (diff) | |
parent | b68a7db5669521f4c56dc690a12588422548fa53 (diff) | |
Merge branch 'master' into array-iface
-rw-r--r-- | doc/ChangeLog | 5 |
-rw-r--r-- | doc/gawk.info | 735 |
-rw-r--r-- | doc/gawk.texi | 476 |
3 files changed, 617 insertions, 599 deletions
diff --git a/doc/ChangeLog b/doc/ChangeLog index a2c1eaa2..d66dbb87 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,7 +1,12 @@ +2012-12-16 Arnold D. Robbins <arnold@skeeve.com> + + * gawk.texi: Move design decisions on new API to appendix C. + 2012-12-15 Arnold D. Robbins <arnold@skeeve.com> * macros: Update to GPL Version 3 and add copyright year. * texinfo.tex: Updated, from automake 1.12.6. + * gawk.texi (Derived Files): A few minor fixes. 2012-12-09 Arnold D. Robbins <arnold@skeeve.com> diff --git a/doc/gawk.info b/doc/gawk.info index 24838360..693d2dcf 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -513,12 +513,7 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) with `gawk'. * Extension Intro:: What is an extension. * Plugin License:: A note about licensing. -* Extension Design:: Design notes about the extension API. -* Old Extension Problems:: Problems with the old mechanism. -* Extension New Mechanism Goals:: Goals for the new mechanism. -* Extension Other Design Decisions:: Some other design decisions. * Extension Mechanism Outline:: An outline of how it works. -* Extension Future Growth:: Some room for future growth. * Extension API Description:: A full description of the API. * Extension API Functions Introduction:: Introduction to the API functions. * General Data Types:: The data types. @@ -638,6 +633,11 @@ texts being (a) (see below), and with the Back-Cover Texts being (b) one day. * Implementation Limitations:: Some limitations of the implementation. * Old Extension Mechansim:: Some compatibility for old extensions. +* Extension Design:: Design notes about the extension API. +* Old Extension Problems:: Problems with the old mechanism. +* Extension New Mechanism Goals:: Goals for the new mechanism. +* Extension Other Design Decisions:: Some other design decisions. +* Extension Future Growth:: Some room for future growth. * Basic High Level:: The high level view. * Basic Data Typing:: A very quick intro to data types. 
@@ -21369,7 +21369,7 @@ describes how to create extensions using code written in C or C++. If you don't know anything about C programming, you can safely skip this major node, although you may wish to review the documentation on the extensions that come with `gawk' (*note Extension Samples::), and -the minor node on the `gawkextlib' project (*note gawkextlib::). The +the information on the `gawkextlib' project (*note gawkextlib::). The sample extensions are automatically built and installed when `gawk' is. NOTE: When `--sandbox' is specified, extensions are disabled @@ -21379,7 +21379,7 @@ sample extensions are automatically built and installed when `gawk' is. * Extension Intro:: What is an extension. * Plugin License:: A note about licensing. -* Extension Design:: Design notes about the extension API. +* Extension Mechanism Outline:: An outline of how it works. * Extension API Description:: A full description of the API. * Extension Example:: Example C code for an extension. * Extension Samples:: The sample extensions that ship with @@ -21413,7 +21413,7 @@ sample extensions included in the `gawk' distribution, and describes the `gawkextlib' project. -File: gawk.info, Node: Plugin License, Next: Extension Design, Prev: Extension Intro, Up: Dynamic Extensions +File: gawk.info, Node: Plugin License, Next: Extension Mechanism Outline, Prev: Extension Intro, Up: Dynamic Extensions 16.2 Extension Licensing ======================== @@ -21430,203 +21430,14 @@ the symbol exists in the global scope. Something like this is enough: int plugin_is_GPL_compatible; -File: gawk.info, Node: Extension Design, Next: Extension API Description, Prev: Plugin License, Up: Dynamic Extensions - -16.3 Extension API Design -========================= - -The first version of extensions for `gawk' was developed in the -mid-1990s and released with `gawk' 3.1 in the late 1990s. The basic -mechanisms and design remained unchanged for close to 15 years, until -2012. 
- - The old extension mechanism used data types and functions from -`gawk' itself, with a "clever hack" to install extension functions. - - `gawk' included some sample extensions, of which a few were really -useful. However, it was clear from the outset that the extension -mechanism was bolted onto the side and was not really thought out. - -* Menu: - -* Old Extension Problems:: Problems with the old mechanism. -* Extension New Mechanism Goals:: Goals for the new mechanism. -* Extension Other Design Decisions:: Some other design decisions. -* Extension Mechanism Outline:: An outline of how it works. -* Extension Future Growth:: Some room for future growth. - - -File: gawk.info, Node: Old Extension Problems, Next: Extension New Mechanism Goals, Up: Extension Design - -16.3.1 Problems With The Old Mechanism --------------------------------------- - -The old extension mechanism had several problems: - - * It depended heavily upon `gawk' internals. Any time the `NODE' - structure(1) changed, an extension would have to be recompiled. - Furthermore, to really write extensions required understanding - something about `gawk''s internal functions. There was some - documentation in this Info file, but it was quite minimal. - - * Being able to call into `gawk' from an extension required linker - facilities that are common on Unix-derived systems but that did - not work on Windows systems; users wanting extensions on Windows - had to statically link them into `gawk', even though Windows - supports dynamic loading of shared objects. - - * The API would change occasionally as `gawk' changed; no - compatibility between versions was ever offered or planned for. - - Despite the drawbacks, the `xgawk' project developers forked `gawk' -and developed several significant extensions. They also enhanced -`gawk''s facilities relating to file inclusion and shared object access. 
- - A new API was desired for a long time, but only in 2012 did the -`gawk' maintainer and the `xgawk' developers finally start working on -it together. More information about the `xgawk' project is provided in -*note gawkextlib::. - - ---------- Footnotes ---------- - - (1) A critical central data structure inside `gawk'. - - -File: gawk.info, Node: Extension New Mechanism Goals, Next: Extension Other Design Decisions, Prev: Old Extension Problems, Up: Extension Design - -16.3.2 Goals For A New Mechanism --------------------------------- - -Some goals for the new API were: - - * The API should be independent of `gawk' internals. Changes in - `gawk' internals should not be visible to the writer of an - extension function. - - * The API should provide _binary_ compatibility across `gawk' - releases as long as the API itself does not change. - - * The API should enable extensions written in C to have roughly the - same "appearance" to `awk'-level code as `awk' functions do. This - means that extensions should have: - - - The ability to access function parameters. - - - The ability to turn an undefined parameter into an array - (call by reference). - - - The ability to create, access and update global variables. - - - Easy access to all the elements of an array at once ("array - flattening") in order to loop over all the element in an easy - fashion for C code. - - - The ability to create arrays (including `gawk''s true - multi-dimensional arrays). - - Some additional important goals were: - - * The API should use only features in ISO C 90, so that extensions - can be written using the widest range of C and C++ compilers. The - header should include the appropriate `#ifdef __cplusplus' and - `extern "C"' magic so that a C++ compiler could be used. (If - using C++, the runtime system has to be smart enough to call any - constructors and destructors, as `gawk' is a C program. As of this - writing, this has not been tested.) 
- - * The API mechanism should not require access to `gawk''s symbols(1) - by the compile-time or dynamic linker, in order to enable creation - of extensions that also work on Windows. - - During development, it became clear that there were other features -that should be available to extensions, which were also subsequently -provided: - - * Extensions should have the ability to hook into `gawk''s I/O - redirection mechanism. In particular, the `xgawk' developers - provided a so-called "open hook" to take over reading records. - During development, this was generalized to allow extensions to - hook into input processing, output processing, and two-way I/O. - - * An extension should be able to provide a "call back" function to - perform clean up actions when `gawk' exits. - - * An extension should be able to provide a version string so that - `gawk''s `--version' option can provide information about - extensions as well. - - ---------- Footnotes ---------- - - (1) The "symbols" are the variables and functions defined inside -`gawk'. Access to these symbols by code external to `gawk' loaded -dynamically at runtime is problematic on Windows. - - -File: gawk.info, Node: Extension Other Design Decisions, Next: Extension Mechanism Outline, Prev: Extension New Mechanism Goals, Up: Extension Design - -16.3.3 Other Design Decisions ------------------------------ - -As an arbitrary design decision, extensions can read the values of -built-in variables and arrays (such as `ARGV' and `FS'), but cannot -change them, with the exception of `PROCINFO'. +File: gawk.info, Node: Extension Mechanism Outline, Next: Extension API Description, Prev: Plugin License, Up: Dynamic Extensions - The reason for this is to prevent an extension function from -affecting the flow of an `awk' program outside its control. While a -real `awk' function can do what it likes, that is at the discretion of -the programmer. 
An extension function should provide a service or make -a C API available for use within `awk', and not mess with `FS' or -`ARGC' and `ARGV'. - - In addition, it becomes easy to start down a slippery slope. How -much access to `gawk' facilities do extensions need? Do they need -`getline'? What about calling `gsub()' or compiling regular -expressions? What about calling into `awk' functions? (_That_ would be -messy.) - - In order to avoid these issues, the `gawk' developers chose to start -with the simplest, most basic features that are still truly useful. - - Another decision is that although `gawk' provides nice things like -MPFR, and arrays indexed internally by integers, these features are not -being brought out to the API in order to keep things simple and close to -traditional `awk' semantics. (In fact, arrays indexed internally by -integers are so transparent that they aren't even documented!) - - Additionally, all functions in the API check that their pointer -input parameters are not `NULL'. If they are, they return an error. -(It is a good idea for extension code to verify that pointers received -from `gawk' are not `NULL'. Such a thing should not happen, but the -`gawk' developers are only human, and they have been known to -occasionally make mistakes.) - - With time, the API will undoubtedly evolve; the `gawk' developers -expect this to be driven by user needs. For now, the current API seems -to provide a minimal yet powerful set of features for creating -extensions. - - -File: gawk.info, Node: Extension Mechanism Outline, Next: Extension Future Growth, Prev: Extension Other Design Decisions, Up: Extension Design - -16.3.4 At A High Level How It Works ------------------------------------ - -The requirement to avoid access to `gawk''s symbols is, at first -glance, a difficult one to meet. 
- - One design, apparently used by Perl and Ruby and maybe others, would -be to make the mainline `gawk' code into a library, with the `gawk' -utility a small C `main()' function linked against the library. - - This seemed like the tail wagging the dog, complicating build and -installation and making a simple copy of the `gawk' executable from one -system to another (or one place to another on the same system!) into a -chancy operation. +16.3 At A High Level How It Works +================================= - Pat Rankin suggested the solution that was adopted. Communication -between `gawk' and an extension is two-way. First, when an extension -is loaded, it is passed a pointer to a `struct' whose fields are -function pointers. This is shown in *note load-extension::. +Communication between `gawk' and an extension is two-way. First, when +an extension is loaded, it is passed a pointer to a `struct' whose +fields are function pointers. This is shown in *note load-extension::. API Struct @@ -21723,28 +21534,7 @@ Example::) and also the `testext.c' code for testing the APIs. Versioning::, for details. -File: gawk.info, Node: Extension Future Growth, Prev: Extension Mechanism Outline, Up: Extension Design - -16.3.5 Room For Future Growth ------------------------------ - -The API can later be expanded, in two ways: - - * `gawk' passes an "extension id" into the extension when it first - loads the extension. The extension then passes this id back to - `gawk' with each function call. This mechanism allows `gawk' to - identify the extension calling into it, should it need to know. - - * Similarly, the extension passes a "name space" into `gawk' when it - registers each extension function. This allows a future mechanism - for grouping extension functions and possibly avoiding name - conflicts. - - Of course, as of this writing, no decisions have been made with -respect to any of the above. 
- - -File: gawk.info, Node: Extension API Description, Next: Extension Example, Prev: Extension Design, Up: Dynamic Extensions +File: gawk.info, Node: Extension API Description, Next: Extension Example, Prev: Extension Mechanism Outline, Up: Dynamic Extensions 16.4 API Description ==================== @@ -23160,12 +22950,12 @@ as a nice example to show how to use the APIs. } This code creates an array with `split()' (*note String Functions::) -and then calls `dump_and_delete()'. That function looks up the array -whose name is passed as the first argument, and deletes the element at -the index passed in the second argument. It then prints the return -value and checks if the element was indeed deleted. Here is the C code -that implements `dump_array_and_delete()'. It has been edited slightly -for presentation. +and then calls `dump_array_and_delete()'. That function looks up the +array whose name is passed as the first argument, and deletes the +element at the index passed in the second argument. It then prints the +return value and checks if the element was indeed deleted. Here is the +C code that implements `dump_array_and_delete()'. It has been edited +slightly for presentation. The first part declares variables, sets up the default return value in `result', and checks that the function was called with the correct @@ -26498,6 +26288,7 @@ and maintainers of `gawk'. Everything in it applies specifically to * Future Extensions:: New features that may be implemented one day. * Implementation Limitations:: Some limitations of the implementation. * Old Extension Mechansim:: Some compatibility for old extensions. +* Extension Design:: Design notes about the extension API. File: gawk.info, Node: Compatibility Mode, Next: Additions, Up: Notes @@ -26887,7 +26678,7 @@ critical, that for any given branch, the above incantation _just works_. A. Installing from source is quite easy. It's how the maintainer worked for years under Fedora. 
He had `/usr/local/bin' at - the front of hs `PATH' and just did: + the front of his `PATH' and just did: wget http://ftp.gnu.org/gnu/PACKAGE/PACKAGE-X.Y.Z.tar.gz tar -xpzvf PACKAGE-X.Y.Z.tar.gz @@ -26895,9 +26686,9 @@ critical, that for any given branch, the above incantation _just works_. ./configure && make && make check make install # as root - B. These days the maintainer uses Ubuntu 10.11 which is medium - current, but he is already doing the above for `autoconf' and - `bison'. + B. These days the maintainer uses Ubuntu 12.04 which is medium + current, but he is already doing the above for `autoconf', + `automake' and `bison'. @@ -26992,7 +26783,7 @@ Size of a literal string `MAX_INT ' Size of a printf string `MAX_INT ' -File: gawk.info, Node: Old Extension Mechansim, Prev: Implementation Limitations, Up: Notes +File: gawk.info, Node: Old Extension Mechansim, Next: Extension Design, Prev: Implementation Limitations, Up: Notes C.5 Compatibility For Old Extensions ==================================== @@ -27031,6 +26822,220 @@ old extensions that you may have to use the new API described in *note Dynamic Extensions::. +File: gawk.info, Node: Extension Design, Prev: Old Extension Mechansim, Up: Notes + +C.6 Extension API Design +======================== + +This minor node documents the design of the extension API, including a +discussion of some of the history and problems that needed to be solved. + + The first version of extensions for `gawk' was developed in the +mid-1990s and released with `gawk' 3.1 in the late 1990s. The basic +mechanisms and design remained unchanged for close to 15 years, until +2012. + + The old extension mechanism used data types and functions from +`gawk' itself, with a "clever hack" to install extension functions. + + `gawk' included some sample extensions, of which a few were really +useful. However, it was clear from the outset that the extension +mechanism was bolted onto the side and was not really thought out. 
+ +* Menu: + +* Old Extension Problems:: Problems with the old mechanism. +* Extension New Mechanism Goals:: Goals for the new mechanism. +* Extension Other Design Decisions:: Some other design decisions. +* Extension Future Growth:: Some room for future growth. + + +File: gawk.info, Node: Old Extension Problems, Next: Extension New Mechanism Goals, Up: Extension Design + +C.6.1 Problems With The Old Mechanism +------------------------------------- + +The old extension mechanism had several problems: + + * It depended heavily upon `gawk' internals. Any time the `NODE' + structure(1) changed, an extension would have to be recompiled. + Furthermore, to really write extensions required understanding + something about `gawk''s internal functions. There was some + documentation in this Info file, but it was quite minimal. + + * Being able to call into `gawk' from an extension required linker + facilities that are common on Unix-derived systems but that did + not work on Windows systems; users wanting extensions on Windows + had to statically link them into `gawk', even though Windows + supports dynamic loading of shared objects. + + * The API would change occasionally as `gawk' changed; no + compatibility between versions was ever offered or planned for. + + Despite the drawbacks, the `xgawk' project developers forked `gawk' +and developed several significant extensions. They also enhanced +`gawk''s facilities relating to file inclusion and shared object access. + + A new API was desired for a long time, but only in 2012 did the +`gawk' maintainer and the `xgawk' developers finally start working on +it together. More information about the `xgawk' project is provided in +*note gawkextlib::. + + ---------- Footnotes ---------- + + (1) A critical central data structure inside `gawk'. 
+ + +File: gawk.info, Node: Extension New Mechanism Goals, Next: Extension Other Design Decisions, Prev: Old Extension Problems, Up: Extension Design + +C.6.2 Goals For A New Mechanism +------------------------------- + +Some goals for the new API were: + + * The API should be independent of `gawk' internals. Changes in + `gawk' internals should not be visible to the writer of an + extension function. + + * The API should provide _binary_ compatibility across `gawk' + releases as long as the API itself does not change. + + * The API should enable extensions written in C to have roughly the + same "appearance" to `awk'-level code as `awk' functions do. This + means that extensions should have: + + - The ability to access function parameters. + + - The ability to turn an undefined parameter into an array + (call by reference). + + - The ability to create, access and update global variables. + + - Easy access to all the elements of an array at once ("array + flattening") in order to loop over all the element in an easy + fashion for C code. + + - The ability to create arrays (including `gawk''s true + multi-dimensional arrays). + + Some additional important goals were: + + * The API should use only features in ISO C 90, so that extensions + can be written using the widest range of C and C++ compilers. The + header should include the appropriate `#ifdef __cplusplus' and + `extern "C"' magic so that a C++ compiler could be used. (If + using C++, the runtime system has to be smart enough to call any + constructors and destructors, as `gawk' is a C program. As of this + writing, this has not been tested.) + + * The API mechanism should not require access to `gawk''s symbols(1) + by the compile-time or dynamic linker, in order to enable creation + of extensions that also work on Windows. 
+ + During development, it became clear that there were other features +that should be available to extensions, which were also subsequently +provided: + + * Extensions should have the ability to hook into `gawk''s I/O + redirection mechanism. In particular, the `xgawk' developers + provided a so-called "open hook" to take over reading records. + During development, this was generalized to allow extensions to + hook into input processing, output processing, and two-way I/O. + + * An extension should be able to provide a "call back" function to + perform clean up actions when `gawk' exits. + + * An extension should be able to provide a version string so that + `gawk''s `--version' option can provide information about + extensions as well. + + The requirement to avoid access to `gawk''s symbols is, at first +glance, a difficult one to meet. + + One design, apparently used by Perl and Ruby and maybe others, would +be to make the mainline `gawk' code into a library, with the `gawk' +utility a small C `main()' function linked against the library. + + This seemed like the tail wagging the dog, complicating build and +installation and making a simple copy of the `gawk' executable from one +system to another (or one place to another on the same system!) into a +chancy operation. + + Pat Rankin suggested the solution that was adopted. *Note Extension +Mechanism Outline::, for the details. + + ---------- Footnotes ---------- + + (1) The "symbols" are the variables and functions defined inside +`gawk'. Access to these symbols by code external to `gawk' loaded +dynamically at runtime is problematic on Windows. 
+ + +File: gawk.info, Node: Extension Other Design Decisions, Next: Extension Future Growth, Prev: Extension New Mechanism Goals, Up: Extension Design + +C.6.3 Other Design Decisions +---------------------------- + +As an arbitrary design decision, extensions can read the values of +built-in variables and arrays (such as `ARGV' and `FS'), but cannot +change them, with the exception of `PROCINFO'. + + The reason for this is to prevent an extension function from +affecting the flow of an `awk' program outside its control. While a +real `awk' function can do what it likes, that is at the discretion of +the programmer. An extension function should provide a service or make +a C API available for use within `awk', and not mess with `FS' or +`ARGC' and `ARGV'. + + In addition, it becomes easy to start down a slippery slope. How +much access to `gawk' facilities do extensions need? Do they need +`getline'? What about calling `gsub()' or compiling regular +expressions? What about calling into `awk' functions? (_That_ would be +messy.) + + In order to avoid these issues, the `gawk' developers chose to start +with the simplest, most basic features that are still truly useful. + + Another decision is that although `gawk' provides nice things like +MPFR, and arrays indexed internally by integers, these features are not +being brought out to the API in order to keep things simple and close to +traditional `awk' semantics. (In fact, arrays indexed internally by +integers are so transparent that they aren't even documented!) + + Additionally, all functions in the API check that their pointer +input parameters are not `NULL'. If they are, they return an error. +(It is a good idea for extension code to verify that pointers received +from `gawk' are not `NULL'. Such a thing should not happen, but the +`gawk' developers are only human, and they have been known to +occasionally make mistakes.) 
+ + With time, the API will undoubtedly evolve; the `gawk' developers +expect this to be driven by user needs. For now, the current API seems +to provide a minimal yet powerful set of features for creating +extensions. + + +File: gawk.info, Node: Extension Future Growth, Prev: Extension Other Design Decisions, Up: Extension Design + +C.6.4 Room For Future Growth +---------------------------- + +The API can later be expanded, in two ways: + + * `gawk' passes an "extension id" into the extension when it first + loads the extension. The extension then passes this id back to + `gawk' with each function call. This mechanism allows `gawk' to + identify the extension calling into it, should it need to know. + + * Similarly, the extension passes a "name space" into `gawk' when it + registers each extension function. This allows a future mechanism + for grouping extension functions and possibly avoiding name + conflicts. + + Of course, as of this writing, no decisions have been made with +respect to any of the above. 
+ + File: gawk.info, Node: Basic Concepts, Next: Glossary, Prev: Notes, Up: Top Appendix D Basic Programming Concepts @@ -32203,134 +32208,134 @@ Node: Exact Arithmetic856101 Node: Arbitrary Precision Integers859209 Ref: Arbitrary Precision Integers-Footnote-1862209 Node: Dynamic Extensions862356 -Node: Extension Intro863742 -Node: Plugin License864950 -Node: Extension Design865624 -Node: Old Extension Problems866695 -Ref: Old Extension Problems-Footnote-1868205 -Node: Extension New Mechanism Goals868262 -Ref: Extension New Mechanism Goals-Footnote-1870974 -Node: Extension Other Design Decisions871160 -Node: Extension Mechanism Outline873272 -Ref: load-extension874297 -Ref: load-new-function875775 -Ref: call-new-function876756 -Node: Extension Future Growth878750 -Node: Extension API Description879568 -Node: Extension API Functions Introduction880896 -Node: General Data Types885674 -Ref: General Data Types-Footnote-1891276 -Node: Requesting Values891575 -Ref: table-value-types-returned892306 -Node: Constructor Functions893260 -Node: Registration Functions896256 -Node: Extension Functions896941 -Node: Exit Callback Functions899115 -Node: Extension Version String900358 -Node: Input Parsers901008 -Node: Output Wrappers909595 -Node: Two-way processors914011 -Node: Printing Messages916141 -Ref: Printing Messages-Footnote-1917218 -Node: Updating `ERRNO'917370 -Node: Accessing Parameters918109 -Node: Symbol Table Access919339 -Node: Symbol table by name919851 -Ref: Symbol table by name-Footnote-1922021 -Node: Symbol table by cookie922101 -Ref: Symbol table by cookie-Footnote-1926230 -Node: Cached values926293 -Ref: Cached values-Footnote-1929736 -Node: Array Manipulation929827 -Ref: Array Manipulation-Footnote-1930925 -Node: Array Data Types930964 -Ref: Array Data Types-Footnote-1933667 -Node: Array Functions933759 -Node: Flattening Arrays937525 -Node: Creating Arrays944358 -Node: Extension API Variables949153 -Node: Extension Versioning949789 -Node: Extension API 
Informational Variables951690 -Node: Extension API Boilerplate952776 -Node: Finding Extensions956607 -Node: Extension Example957154 -Node: Internal File Description957892 -Node: Internal File Ops961580 -Ref: Internal File Ops-Footnote-1973027 -Node: Using Internal File Ops973167 -Ref: Using Internal File Ops-Footnote-1975520 -Node: Extension Samples975786 -Node: Extension Sample File Functions977229 -Node: Extension Sample Fnmatch985702 -Node: Extension Sample Fork987428 -Node: Extension Sample Ord988642 -Node: Extension Sample Readdir989418 -Node: Extension Sample Revout990922 -Node: Extension Sample Rev2way991515 -Node: Extension Sample Read write array992205 -Node: Extension Sample Readfile994088 -Node: Extension Sample API Tests994843 -Node: Extension Sample Time995368 -Node: gawkextlib996675 -Node: Language History999056 -Node: V7/SVR3.11000578 -Node: SVR41002899 -Node: POSIX1004341 -Node: BTL1005349 -Node: POSIX/GNU1006154 -Node: Common Extensions1011689 -Node: Ranges and Locales1012748 -Ref: Ranges and Locales-Footnote-11017366 -Ref: Ranges and Locales-Footnote-21017393 -Ref: Ranges and Locales-Footnote-31017653 -Node: Contributors1017874 -Node: Installation1022170 -Node: Gawk Distribution1023064 -Node: Getting1023548 -Node: Extracting1024374 -Node: Distribution contents1026066 -Node: Unix Installation1031327 -Node: Quick Installation1031944 -Node: Additional Configuration Options1033906 -Node: Configuration Philosophy1035383 -Node: Non-Unix Installation1037725 -Node: PC Installation1038183 -Node: PC Binary Installation1039482 -Node: PC Compiling1041330 -Node: PC Testing1044274 -Node: PC Using1045450 -Node: Cygwin1049635 -Node: MSYS1050635 -Node: VMS Installation1051149 -Node: VMS Compilation1051752 -Ref: VMS Compilation-Footnote-11052759 -Node: VMS Installation Details1052817 -Node: VMS Running1054452 -Node: VMS Old Gawk1056059 -Node: Bugs1056533 -Node: Other Versions1060385 -Node: Notes1065700 -Node: Compatibility Mode1066430 -Node: Additions1067213 -Node: 
Accessing The Source1068140 -Node: Adding Code1069743 -Node: New Ports1075785 -Node: Derived Files1079920 -Ref: Derived Files-Footnote-11085228 -Ref: Derived Files-Footnote-21085262 -Ref: Derived Files-Footnote-31085862 -Node: Future Extensions1085960 -Node: Implementation Limitations1086541 -Node: Old Extension Mechansim1087800 -Node: Basic Concepts1089567 -Node: Basic High Level1090248 -Ref: figure-general-flow1090519 -Ref: figure-process-flow1091118 -Ref: Basic High Level-Footnote-11094347 -Node: Basic Data Typing1094532 -Node: Glossary1097887 -Node: Copying1123198 -Node: GNU Free Documentation License1160755 -Node: Index1185892 +Node: Extension Intro863733 +Node: Plugin License864941 +Node: Extension Mechanism Outline865626 +Ref: load-extension866043 +Ref: load-new-function867521 +Ref: call-new-function868502 +Node: Extension API Description870496 +Node: Extension API Functions Introduction871835 +Node: General Data Types876613 +Ref: General Data Types-Footnote-1882215 +Node: Requesting Values882514 +Ref: table-value-types-returned883245 +Node: Constructor Functions884199 +Node: Registration Functions887195 +Node: Extension Functions887880 +Node: Exit Callback Functions890054 +Node: Extension Version String891297 +Node: Input Parsers891947 +Node: Output Wrappers900534 +Node: Two-way processors904950 +Node: Printing Messages907080 +Ref: Printing Messages-Footnote-1908157 +Node: Updating `ERRNO'908309 +Node: Accessing Parameters909048 +Node: Symbol Table Access910278 +Node: Symbol table by name910790 +Ref: Symbol table by name-Footnote-1912960 +Node: Symbol table by cookie913040 +Ref: Symbol table by cookie-Footnote-1917169 +Node: Cached values917232 +Ref: Cached values-Footnote-1920675 +Node: Array Manipulation920766 +Ref: Array Manipulation-Footnote-1921864 +Node: Array Data Types921903 +Ref: Array Data Types-Footnote-1924606 +Node: Array Functions924698 +Node: Flattening Arrays928464 +Node: Creating Arrays935303 +Node: Extension API Variables940098 +Node: 
Extension Versioning940734 +Node: Extension API Informational Variables942635 +Node: Extension API Boilerplate943721 +Node: Finding Extensions947552 +Node: Extension Example948099 +Node: Internal File Description948837 +Node: Internal File Ops952525 +Ref: Internal File Ops-Footnote-1963972 +Node: Using Internal File Ops964112 +Ref: Using Internal File Ops-Footnote-1966465 +Node: Extension Samples966731 +Node: Extension Sample File Functions968174 +Node: Extension Sample Fnmatch976647 +Node: Extension Sample Fork978373 +Node: Extension Sample Ord979587 +Node: Extension Sample Readdir980363 +Node: Extension Sample Revout981867 +Node: Extension Sample Rev2way982460 +Node: Extension Sample Read write array983150 +Node: Extension Sample Readfile985033 +Node: Extension Sample API Tests985788 +Node: Extension Sample Time986313 +Node: gawkextlib987620 +Node: Language History990001 +Node: V7/SVR3.1991523 +Node: SVR4993844 +Node: POSIX995286 +Node: BTL996294 +Node: POSIX/GNU997099 +Node: Common Extensions1002634 +Node: Ranges and Locales1003693 +Ref: Ranges and Locales-Footnote-11008311 +Ref: Ranges and Locales-Footnote-21008338 +Ref: Ranges and Locales-Footnote-31008598 +Node: Contributors1008819 +Node: Installation1013115 +Node: Gawk Distribution1014009 +Node: Getting1014493 +Node: Extracting1015319 +Node: Distribution contents1017011 +Node: Unix Installation1022272 +Node: Quick Installation1022889 +Node: Additional Configuration Options1024851 +Node: Configuration Philosophy1026328 +Node: Non-Unix Installation1028670 +Node: PC Installation1029128 +Node: PC Binary Installation1030427 +Node: PC Compiling1032275 +Node: PC Testing1035219 +Node: PC Using1036395 +Node: Cygwin1040580 +Node: MSYS1041580 +Node: VMS Installation1042094 +Node: VMS Compilation1042697 +Ref: VMS Compilation-Footnote-11043704 +Node: VMS Installation Details1043762 +Node: VMS Running1045397 +Node: VMS Old Gawk1047004 +Node: Bugs1047478 +Node: Other Versions1051330 +Node: Notes1056645 +Node: Compatibility 
Mode1057445 +Node: Additions1058228 +Node: Accessing The Source1059155 +Node: Adding Code1060758 +Node: New Ports1066800 +Node: Derived Files1070935 +Ref: Derived Files-Footnote-11076256 +Ref: Derived Files-Footnote-21076290 +Ref: Derived Files-Footnote-31076890 +Node: Future Extensions1076988 +Node: Implementation Limitations1077569 +Node: Old Extension Mechansim1078828 +Node: Extension Design1080620 +Node: Old Extension Problems1081734 +Ref: Old Extension Problems-Footnote-11083242 +Node: Extension New Mechanism Goals1083299 +Ref: Extension New Mechanism Goals-Footnote-11086658 +Node: Extension Other Design Decisions1086844 +Node: Extension Future Growth1088950 +Node: Basic Concepts1089771 +Node: Basic High Level1090452 +Ref: figure-general-flow1090723 +Ref: figure-process-flow1091322 +Ref: Basic High Level-Footnote-11094551 +Node: Basic Data Typing1094736 +Node: Glossary1098091 +Node: Copying1123402 +Node: GNU Free Documentation License1160959 +Node: Index1186096 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index fca7cebb..15b43038 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -721,12 +721,7 @@ particular records in a file and perform operations upon them. with @command{gawk}. * Extension Intro:: What is an extension. * Plugin License:: A note about licensing. -* Extension Design:: Design notes about the extension API. -* Old Extension Problems:: Problems with the old mechanism. -* Extension New Mechanism Goals:: Goals for the new mechanism. -* Extension Other Design Decisions:: Some other design decisions. * Extension Mechanism Outline:: An outline of how it works. -* Extension Future Growth:: Some room for future growth. * Extension API Description:: A full description of the API. * Extension API Functions Introduction:: Introduction to the API functions. * General Data Types:: The data types. @@ -846,6 +841,11 @@ particular records in a file and perform operations upon them. one day. 
* Implementation Limitations:: Some limitations of the implementation. * Old Extension Mechansim:: Some compatibility for old extensions. +* Extension Design:: Design notes about the extension API. +* Old Extension Problems:: Problems with the old mechanism. +* Extension New Mechanism Goals:: Goals for the new mechanism. +* Extension Other Design Decisions:: Some other design decisions. +* Extension Future Growth:: Some room for future growth. * Basic High Level:: The high level view. * Basic Data Typing:: A very quick intro to data types. @end detailmenu @@ -28263,7 +28263,7 @@ using code written in C or C++. If you don't know anything about C programming, you can safely skip this @value{CHAPTER}, although you may wish to review the documentation on the extensions that come with @command{gawk} (@pxref{Extension Samples}), -and the @value{SECTION} on the @code{gawkextlib} project (@pxref{gawkextlib}). +and the information on the @code{gawkextlib} project (@pxref{gawkextlib}). The sample extensions are automatically built and installed when @command{gawk} is. @@ -28275,7 +28275,7 @@ When @option{--sandbox} is specified, extensions are disabled @menu * Extension Intro:: What is an extension. * Plugin License:: A note about licensing. -* Extension Design:: Design notes about the extension API. +* Extension Mechanism Outline:: An outline of how it works. * Extension API Description:: A full description of the API. * Extension Example:: Example C code for an extension. * Extension Samples:: The sample extensions that ship with @@ -28322,208 +28322,10 @@ the symbol exists in the global scope. Something like this is enough: int plugin_is_GPL_compatible; @end example -@node Extension Design -@section Extension API Design - -The first version of extensions for @command{gawk} was developed in -the mid-1990s and released with @command{gawk} 3.1 in the late 1990s. -The basic mechanisms and design remained unchanged for close to 15 years, -until 2012. 
- -The old extension mechanism used data types and functions from -@command{gawk} itself, with a ``clever hack'' to install extension -functions. - -@command{gawk} included some sample extensions, of which a few were -really useful. However, it was clear from the outset that the extension -mechanism was bolted onto the side and was not really thought out. - -@menu -* Old Extension Problems:: Problems with the old mechanism. -* Extension New Mechanism Goals:: Goals for the new mechanism. -* Extension Other Design Decisions:: Some other design decisions. -* Extension Mechanism Outline:: An outline of how it works. -* Extension Future Growth:: Some room for future growth. -@end menu - -@node Old Extension Problems -@subsection Problems With The Old Mechanism - -The old extension mechanism had several problems: - -@itemize @bullet -@item -It depended heavily upon @command{gawk} internals. Any time the -@code{NODE} structure@footnote{A critical central data structure -inside @command{gawk}.} changed, an extension would have to be -recompiled. Furthermore, to really write extensions required understanding -something about @command{gawk}'s internal functions. There was some -documentation in this @value{DOCUMENT}, but it was quite minimal. - -@item -Being able to call into @command{gawk} from an extension required linker -facilities that are common on Unix-derived systems but that did -not work on Windows systems; users wanting extensions on Windows -had to statically link them into @command{gawk}, even though Windows supports -dynamic loading of shared objects. - -@item -The API would change occasionally as @command{gawk} changed; no compatibility -between versions was ever offered or planned for. -@end itemize - -Despite the drawbacks, the @command{xgawk} project developers forked -@command{gawk} and developed several significant extensions. They also -enhanced @command{gawk}'s facilities relating to file inclusion and -shared object access. 
- -A new API was desired for a long time, but only in 2012 did the -@command{gawk} maintainer and the @command{xgawk} developers finally -start working on it together. More information about the @command{xgawk} -project is provided in @ref{gawkextlib}. - -@node Extension New Mechanism Goals -@subsection Goals For A New Mechanism - -Some goals for the new API were: - -@itemize @bullet -@item -The API should be independent of @command{gawk} internals. Changes in -@command{gawk} internals should not be visible to the writer of an -extension function. - -@item -The API should provide @emph{binary} compatibility across @command{gawk} -releases as long as the API itself does not change. - -@item -The API should enable extensions written in C to have roughly the -same ``appearance'' to @command{awk}-level code as @command{awk} -functions do. This means that extensions should have: - -@itemize @minus -@item -The ability to access function parameters. - -@item -The ability to turn an undefined parameter into an array (call by reference). - -@item -The ability to create, access and update global variables. - -@item -Easy access to all the elements of an array at once (``array flattening'') -in order to loop over all the element in an easy fashion for C code. - -@item -The ability to create arrays (including @command{gawk}'s true -multi-dimensional arrays). -@end itemize -@end itemize - -Some additional important goals were: - -@itemize @bullet -@item -The API should use only features in ISO C 90, so that extensions -can be written using the widest range of C and C++ compilers. The header -should include the appropriate @samp{#ifdef __cplusplus} and @samp{extern "C"} -magic so that a C++ compiler could be used. (If using C++, the runtime -system has to be smart enough to call any constructors and destructors, -as @command{gawk} is a C program. As of this writing, this has not been -tested.) 
- -@item -The API mechanism should not require access to @command{gawk}'s -symbols@footnote{The @dfn{symbols} are the variables and functions -defined inside @command{gawk}. Access to these symbols by code -external to @command{gawk} loaded dynamically at runtime is -problematic on Windows.} by the compile-time or dynamic linker, -in order to enable creation of extensions that also work on Windows. -@end itemize - -During development, it became clear that there were other features -that should be available to extensions, which were also subsequently -provided: - -@itemize @bullet -@item -Extensions should have the ability to hook into @command{gawk}'s -I/O redirection mechanism. In particular, the @command{xgawk} -developers provided a so-called ``open hook'' to take over reading -records. During development, this was generalized to allow -extensions to hook into input processing, output processing, and -two-way I/O. - -@item -An extension should be able to provide a ``call back'' function -to perform clean up actions when @command{gawk} exits. - -@item -An extension should be able to provide a version string so that -@command{gawk}'s @option{--version} option can provide information -about extensions as well. -@end itemize - -@node Extension Other Design Decisions -@subsection Other Design Decisions - -As an arbitrary design decision, extensions can read the values of -built-in variables and arrays (such as @code{ARGV} and @code{FS}), but cannot -change them, with the exception of @code{PROCINFO}. - -The reason for this is to prevent an extension function from affecting -the flow of an @command{awk} program outside its control. While a real -@command{awk} function can do what it likes, that is at the discretion -of the programmer. An extension function should provide a service or -make a C API available for use within @command{awk}, and not mess with -@code{FS} or @code{ARGC} and @code{ARGV}. - -In addition, it becomes easy to start down a slippery slope. 
How -much access to @command{gawk} facilities do extensions need? -Do they need @code{getline}? What about calling @code{gsub()} or -compiling regular expressions? What about calling into @command{awk} -functions? (@emph{That} would be messy.) - -In order to avoid these issues, the @command{gawk} developers chose -to start with the simplest, most basic features that are still truly useful. - -Another decision is that although @command{gawk} provides nice things like -MPFR, and arrays indexed internally by integers, these features are not -being brought out to the API in order to keep things simple and close to -traditional @command{awk} semantics. (In fact, arrays indexed internally -by integers are so transparent that they aren't even documented!) - -Additionally, all functions in the API check that their pointer -input parameters are not @code{NULL}. If they are, they return an error. -(It is a good idea for extension code to verify that -pointers received from @command{gawk} are not @code{NULL}. -Such a thing should not happen, but the @command{gawk} developers -are only human, and they have been known to occasionally make -mistakes.) - -With time, the API will undoubtedly evolve; the @command{gawk} developers -expect this to be driven by user needs. For now, the current API seems -to provide a minimal yet powerful set of features for creating extensions. - @node Extension Mechanism Outline -@subsection At A High Level How It Works +@section At A High Level How It Works -The requirement to avoid access to @command{gawk}'s symbols is, at first -glance, a difficult one to meet. - -One design, apparently used by Perl and Ruby and maybe others, would -be to make the mainline @command{gawk} code into a library, with the -@command{gawk} utility a small C @code{main()} function linked against -the library. 
- -This seemed like the tail wagging the dog, complicating build and -installation and making a simple copy of the @command{gawk} executable -from one system to another (or one place to another on the same -system!) into a chancy operation. - -Pat Rankin suggested the solution that was adopted. Communication between +Communication between @command{gawk} and an extension is two-way. First, when an extension is loaded, it is passed a pointer to a @code{struct} whose fields are function pointers. @@ -28604,28 +28406,6 @@ happen, but we all know how @emph{that} goes.) @xref{Extension Versioning}, for details. @end itemize -@node Extension Future Growth -@subsection Room For Future Growth - -The API can later be expanded, in two ways: - -@itemize @bullet -@item -@command{gawk} passes an ``extension id'' into the extension when it -first loads the extension. The extension then passes this id back -to @command{gawk} with each function call. This mechanism allows -@command{gawk} to identify the extension calling into it, should it need -to know. - -@item -Similarly, the extension passes a ``name space'' into @command{gawk} -when it registers each extension function. This allows a future -mechanism for grouping extension functions and possibly avoiding name -conflicts. -@end itemize - -Of course, as of this writing, no decisions have been made with respect -to any of the above. @node Extension API Description @section API Description @@ -30114,7 +29894,7 @@ BEGIN @{ @noindent This code creates an array with @code{split()} (@pxref{String Functions}) -and then calls @code{dump_and_delete()}. That function looks up +and then calls @code{dump_array_and_delete()}. That function looks up the array whose name is passed as the first argument, and deletes the element at the index passed in the second argument. It then prints the return value and checks if the element @@ -34305,6 +34085,7 @@ maintainers of @command{gawk}. 
Everything in it applies specifically to * Future Extensions:: New features that may be implemented one day. * Implementation Limitations:: Some limitations of the implementation. * Old Extension Mechansim:: Some compatibility for old extensions. +* Extension Design:: Design notes about the extension API. @end menu @node Compatibility Mode @@ -34797,7 +34578,7 @@ dorking with the configuration machinery. @item Installing from source is quite easy. It's how the maintainer worked for years under Fedora. -He had @file{/usr/local/bin} at the front of hs @env{PATH} and just did: +He had @file{/usr/local/bin} at the front of his @env{PATH} and just did: @example wget http://ftp.gnu.org/gnu/@var{package}/@var{package}-@var{x}.@var{y}.@var{z}.tar.gz @@ -34808,8 +34589,9 @@ make install # as root @end example @item -These days the maintainer uses Ubuntu 10.11 which is medium current, but -he is already doing the above for @command{autoconf} and @command{bison}. +These days the maintainer uses Ubuntu 12.04 which is medium current, but +he is already doing the above for @command{autoconf}, @command{automake} +and @command{bison}. @ignore (C. Rant: Recent Linux versions with GNOME 3 really suck. What @@ -34917,7 +34699,6 @@ This following table describes limits of @command{gawk} on a Unix-like system (although it is variable even then). Other systems may have different limits. -@c @multitable {Number of file redirections} {min(number of processes per user, number of open files)} @multitable @columnfractions .40 .60 @headitem Item @tab Limit @item Characters in a character class @tab 2^(number of bits per byte) @@ -34971,6 +34752,233 @@ The @command{gawk} development team strongly recommends that you convert any old extensions that you may have to use the new API described in @ref{Dynamic Extensions}. 
+@node Extension Design +@appendixsec Extension API Design + +This @value{SECTION} documents the design of the extension API, +including a discussion of some of the history and problems that needed +to be solved. + +The first version of extensions for @command{gawk} was developed in +the mid-1990s and released with @command{gawk} 3.1 in the late 1990s. +The basic mechanisms and design remained unchanged for close to 15 years, +until 2012. + +The old extension mechanism used data types and functions from +@command{gawk} itself, with a ``clever hack'' to install extension +functions. + +@command{gawk} included some sample extensions, of which a few were +really useful. However, it was clear from the outset that the extension +mechanism was bolted onto the side and was not really thought out. + +@menu +* Old Extension Problems:: Problems with the old mechanism. +* Extension New Mechanism Goals:: Goals for the new mechanism. +* Extension Other Design Decisions:: Some other design decisions. +* Extension Future Growth:: Some room for future growth. +@end menu + +@node Old Extension Problems +@appendixsubsec Problems With The Old Mechanism + +The old extension mechanism had several problems: + +@itemize @bullet +@item +It depended heavily upon @command{gawk} internals. Any time the +@code{NODE} structure@footnote{A critical central data structure +inside @command{gawk}.} changed, an extension would have to be +recompiled. Furthermore, to really write extensions required understanding +something about @command{gawk}'s internal functions. There was some +documentation in this @value{DOCUMENT}, but it was quite minimal. + +@item +Being able to call into @command{gawk} from an extension required linker +facilities that are common on Unix-derived systems but that did +not work on Windows systems; users wanting extensions on Windows +had to statically link them into @command{gawk}, even though Windows supports +dynamic loading of shared objects. 
+ +@item +The API would change occasionally as @command{gawk} changed; no compatibility +between versions was ever offered or planned for. +@end itemize + +Despite the drawbacks, the @command{xgawk} project developers forked +@command{gawk} and developed several significant extensions. They also +enhanced @command{gawk}'s facilities relating to file inclusion and +shared object access. + +A new API was desired for a long time, but only in 2012 did the +@command{gawk} maintainer and the @command{xgawk} developers finally +start working on it together. More information about the @command{xgawk} +project is provided in @ref{gawkextlib}. + +@node Extension New Mechanism Goals +@appendixsubsec Goals For A New Mechanism + +Some goals for the new API were: + +@itemize @bullet +@item +The API should be independent of @command{gawk} internals. Changes in +@command{gawk} internals should not be visible to the writer of an +extension function. + +@item +The API should provide @emph{binary} compatibility across @command{gawk} +releases as long as the API itself does not change. + +@item +The API should enable extensions written in C to have roughly the +same ``appearance'' to @command{awk}-level code as @command{awk} +functions do. This means that extensions should have: + +@itemize @minus +@item +The ability to access function parameters. + +@item +The ability to turn an undefined parameter into an array (call by reference). + +@item +The ability to create, access and update global variables. + +@item +Easy access to all the elements of an array at once (``array flattening'') +in order to loop over all the element in an easy fashion for C code. + +@item +The ability to create arrays (including @command{gawk}'s true +multi-dimensional arrays). +@end itemize +@end itemize + +Some additional important goals were: + +@itemize @bullet +@item +The API should use only features in ISO C 90, so that extensions +can be written using the widest range of C and C++ compilers. 
The header +should include the appropriate @samp{#ifdef __cplusplus} and @samp{extern "C"} +magic so that a C++ compiler could be used. (If using C++, the runtime +system has to be smart enough to call any constructors and destructors, +as @command{gawk} is a C program. As of this writing, this has not been +tested.) + +@item +The API mechanism should not require access to @command{gawk}'s +symbols@footnote{The @dfn{symbols} are the variables and functions +defined inside @command{gawk}. Access to these symbols by code +external to @command{gawk} loaded dynamically at runtime is +problematic on Windows.} by the compile-time or dynamic linker, +in order to enable creation of extensions that also work on Windows. +@end itemize + +During development, it became clear that there were other features +that should be available to extensions, which were also subsequently +provided: + +@itemize @bullet +@item +Extensions should have the ability to hook into @command{gawk}'s +I/O redirection mechanism. In particular, the @command{xgawk} +developers provided a so-called ``open hook'' to take over reading +records. During development, this was generalized to allow +extensions to hook into input processing, output processing, and +two-way I/O. + +@item +An extension should be able to provide a ``call back'' function +to perform clean up actions when @command{gawk} exits. + +@item +An extension should be able to provide a version string so that +@command{gawk}'s @option{--version} option can provide information +about extensions as well. +@end itemize + +The requirement to avoid access to @command{gawk}'s symbols is, at first +glance, a difficult one to meet. + +One design, apparently used by Perl and Ruby and maybe others, would +be to make the mainline @command{gawk} code into a library, with the +@command{gawk} utility a small C @code{main()} function linked against +the library. 
+ +This seemed like the tail wagging the dog, complicating build and +installation and making a simple copy of the @command{gawk} executable +from one system to another (or one place to another on the same +system!) into a chancy operation. + +Pat Rankin suggested the solution that was adopted. +@xref{Extension Mechanism Outline}, for the details. + +@node Extension Other Design Decisions +@appendixsubsec Other Design Decisions + +As an arbitrary design decision, extensions can read the values of +built-in variables and arrays (such as @code{ARGV} and @code{FS}), but cannot +change them, with the exception of @code{PROCINFO}. + +The reason for this is to prevent an extension function from affecting +the flow of an @command{awk} program outside its control. While a real +@command{awk} function can do what it likes, that is at the discretion +of the programmer. An extension function should provide a service or +make a C API available for use within @command{awk}, and not mess with +@code{FS} or @code{ARGC} and @code{ARGV}. + +In addition, it becomes easy to start down a slippery slope. How +much access to @command{gawk} facilities do extensions need? +Do they need @code{getline}? What about calling @code{gsub()} or +compiling regular expressions? What about calling into @command{awk} +functions? (@emph{That} would be messy.) + +In order to avoid these issues, the @command{gawk} developers chose +to start with the simplest, most basic features that are still truly useful. + +Another decision is that although @command{gawk} provides nice things like +MPFR, and arrays indexed internally by integers, these features are not +being brought out to the API in order to keep things simple and close to +traditional @command{awk} semantics. (In fact, arrays indexed internally +by integers are so transparent that they aren't even documented!) + +Additionally, all functions in the API check that their pointer +input parameters are not @code{NULL}. 
If they are, they return an error. +(It is a good idea for extension code to verify that +pointers received from @command{gawk} are not @code{NULL}. +Such a thing should not happen, but the @command{gawk} developers +are only human, and they have been known to occasionally make +mistakes.) + +With time, the API will undoubtedly evolve; the @command{gawk} developers +expect this to be driven by user needs. For now, the current API seems +to provide a minimal yet powerful set of features for creating extensions. + +@node Extension Future Growth +@appendixsubsec Room For Future Growth + +The API can later be expanded, in two ways: + +@itemize @bullet +@item +@command{gawk} passes an ``extension id'' into the extension when it +first loads the extension. The extension then passes this id back +to @command{gawk} with each function call. This mechanism allows +@command{gawk} to identify the extension calling into it, should it need +to know. + +@item +Similarly, the extension passes a ``name space'' into @command{gawk} +when it registers each extension function. This allows a future +mechanism for grouping extension functions and possibly avoiding name +conflicts. +@end itemize + +Of course, as of this writing, no decisions have been made with respect +to any of the above. + @c ENDOFRANGE impis @c ENDOFRANGE gawii |
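Note on the "Extension Mechanism Outline" text kept by this diff: it says that when an extension is loaded, it is passed a pointer to a struct whose fields are function pointers, and that the extension passes an "extension id" back on each call. A minimal sketch of that pattern in C follows. Every name in it (host_api, add_function, plugin_load, demo_host_run, double_it) is hypothetical and chosen only for illustration. None of it is taken from the actual gawk header, and both the host and the extension are simulated in one file, so no dynamic loading or linker access to host symbols is involved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature for an extension function (not the gawk API). */
typedef int (*ext_func_t)(int);

/* Hypothetical table of function pointers that the host hands to the
 * extension at load time. The extension never links against host symbols;
 * it only calls through this table. */
typedef struct host_api {
    int api_version;
    int (*add_function)(void *ext_id, const char *name, ext_func_t fn);
} host_api;

/* ---- host side ---- */
#define MAX_FUNCS 8
static struct {
    const char *name;
    ext_func_t fn;
} registry[MAX_FUNCS];
static size_t n_registered = 0;

/* Register an extension function; returns 1 on success, 0 on error.
 * Pointer parameters are checked for NULL, as the design notes suggest. */
static int host_add_function(void *ext_id, const char *name, ext_func_t fn)
{
    (void) ext_id; /* a real host could use this to identify the caller */
    if (name == NULL || fn == NULL || n_registered >= MAX_FUNCS)
        return 0;
    registry[n_registered].name = name;
    registry[n_registered].fn = fn;
    n_registered++;
    return 1;
}

/* Look up a registered function by name and call it. */
static int host_call(const char *name, int arg, int *result)
{
    size_t i;

    if (name == NULL || result == NULL)
        return 0;
    for (i = 0; i < n_registered; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            *result = registry[i].fn(arg);
            return 1;
        }
    }
    return 0;
}

/* ---- extension side ---- */
static const host_api *api; /* saved table pointer */
static void *my_ext_id;     /* saved id, passed back on every call */

static int do_double(int x)
{
    return 2 * x;
}

/* Entry point the host calls once, right after loading the extension. */
int plugin_load(const host_api *a, void *ext_id)
{
    if (a == NULL || a->api_version != 1)
        return 0; /* refuse a missing or mismatched API */
    api = a;
    my_ext_id = ext_id;
    return api->add_function(my_ext_id, "double_it", do_double);
}

/* ---- simulated session: load the extension, then call its function ---- */
int demo_host_run(int arg)
{
    static const host_api table = { 1, host_add_function };
    int id_token = 0; /* stands in for an opaque extension id */
    int result = 0;

    n_registered = 0; /* start from an empty registry on each run */
    if (!plugin_load(&table, &id_token))
        return -1;
    if (!host_call("double_it", arg, &result))
        return -1;
    return result;
}
```

Calling demo_host_run(21) goes through the whole round trip: the host passes the table in, the extension registers double_it through it, and the host later calls the registered function by name. The version check in plugin_load is a stand-in for the compatibility check the manual describes under "Extension Versioning".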