Diffstat (limited to 'gawk.info-7')
-rw-r--r-- | gawk.info-7 | 1265 |
1 files changed, 1265 insertions, 0 deletions
diff --git a/gawk.info-7 b/gawk.info-7 new file mode 100644 index 00000000..b3ac7254 --- /dev/null +++ b/gawk.info-7 @@ -0,0 +1,1265 @@ +This is Info file gawk.info, produced by Makeinfo-1.54 from the input +file gawk.texi. + + This file documents `awk', a program that you can use to select +particular records in a file and perform operations upon them. + + This is Edition 0.15 of `The GAWK Manual', +for the 2.15 version of the GNU implementation +of AWK. + + Copyright (C) 1989, 1991, 1992, 1993 Free Software Foundation, Inc. + + Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + + Permission is granted to copy and distribute modified versions of +this manual under the conditions for verbatim copying, provided that +the entire resulting derived work is distributed under the terms of a +permission notice identical to this one. + + Permission is granted to copy and distribute translations of this +manual into another language, under the above conditions for modified +versions, except that this permission notice may be stated in a +translation approved by the Foundation. + + +File: gawk.info, Node: V7/S5R3.1, Next: S5R4, Prev: Language History, Up: Language History + +Major Changes between V7 and S5R3.1 +=================================== + + The `awk' language evolved considerably between the release of +Version 7 Unix (1978) and the new version first made widely available in +System V Release 3.1 (1987). This section summarizes the changes, with +cross-references to further details. + + * The requirement for `;' to separate rules on a line (*note `awk' + Statements versus Lines: Statements/Lines.). + + * User-defined functions, and the `return' statement (*note + User-defined Functions: User-defined.). + + * The `delete' statement (*note The `delete' Statement: Delete.). 
+ + * The `do'-`while' statement (*note The `do'-`while' Statement: Do + Statement.). + + * The built-in functions `atan2', `cos', `sin', `rand' and `srand' + (*note Numeric Built-in Functions: Numeric Functions.). + + * The built-in functions `gsub', `sub', and `match' (*note Built-in + Functions for String Manipulation: String Functions.). + + * The built-in functions `close', which closes an open file, and + `system', which allows the user to execute operating system + commands (*note Built-in Functions for Input/Output: I/O + Functions.). + + * The `ARGC', `ARGV', `FNR', `RLENGTH', `RSTART', and `SUBSEP' + built-in variables (*note Built-in Variables::.). + + * The conditional expression using the operators `?' and `:' (*note + Conditional Expressions: Conditional Exp.). + + * The exponentiation operator `^' (*note Arithmetic Operators: + Arithmetic Ops.) and its assignment operator form `^=' (*note + Assignment Expressions: Assignment Ops.). + + * C-compatible operator precedence, which breaks some old `awk' + programs (*note Operator Precedence (How Operators Nest): + Precedence.). + + * Regexps as the value of `FS' (*note Specifying how Fields are + Separated: Field Separators.), and as the third argument to the + `split' function (*note Built-in Functions for String + Manipulation: String Functions.). + + * Dynamic regexps as operands of the `~' and `!~' operators (*note + How to Use Regular Expressions: Regexp Usage.). + + * Escape sequences (*note Constant Expressions: Constants.) in + regexps. + + * The escape sequences `\b', `\f', and `\r' (*note Constant + Expressions: Constants.). + + * Redirection of input for the `getline' function (*note Explicit + Input with `getline': Getline.). + + * Multiple `BEGIN' and `END' rules (*note `BEGIN' and `END' Special + Patterns: BEGIN/END.). + + * Simulated multi-dimensional arrays (*note Multi-dimensional + Arrays: Multi-dimensional.). 
+ + +File: gawk.info, Node: S5R4, Next: POSIX, Prev: V7/S5R3.1, Up: Language History + +Changes between S5R3.1 and S5R4 +=============================== + + The System V Release 4 version of Unix `awk' added these features +(some of which originated in `gawk'): + + * The `ENVIRON' variable (*note Built-in Variables::.). + + * Multiple `-f' options on the command line (*note Invoking `awk': + Command Line.). + + * The `-v' option for assigning variables before program execution + begins (*note Invoking `awk': Command Line.). + + * The `--' option for terminating command line options. + + * The `\a', `\v', and `\x' escape sequences (*note Constant + Expressions: Constants.). + + * A defined return value for the `srand' built-in function (*note + Numeric Built-in Functions: Numeric Functions.). + + * The `toupper' and `tolower' built-in string functions for case + translation (*note Built-in Functions for String Manipulation: + String Functions.). + + * A cleaner specification for the `%c' format-control letter in the + `printf' function (*note Using `printf' Statements for Fancier + Printing: Printf.). + + * The ability to dynamically pass the field width and precision + (`"%*.*d"') in the argument list of the `printf' function (*note + Using `printf' Statements for Fancier Printing: Printf.). + + * The use of constant regexps such as `/foo/' as expressions, where + they are equivalent to use of the matching operator, as in `$0 ~ + /foo/' (*note Constant Expressions: Constants.). + + +File: gawk.info, Node: POSIX, Next: POSIX/GNU, Prev: S5R4, Up: Language History + +Changes between S5R4 and POSIX `awk' +==================================== + + The POSIX Command Language and Utilities standard for `awk' +introduced the following changes into the language: + + * The use of `-W' for implementation-specific options. + + * The use of `CONVFMT' for controlling the conversion of numbers to + strings (*note Conversion of Strings and Numbers: Conversion.). 
+ + * The concept of a numeric string, and tighter comparison rules to go + with it (*note Comparison Expressions: Comparison Ops.). + + * More complete documentation of many of the previously undocumented + features of the language. + + +File: gawk.info, Node: POSIX/GNU, Prev: POSIX, Up: Language History + +Extensions in `gawk' not in POSIX `awk' +======================================= + + The GNU implementation, `gawk', adds these features: + + * The `AWKPATH' environment variable for specifying a path search for + the `-f' command line option (*note Invoking `awk': Command Line.). + + * The various `gawk' specific features available via the `-W' + command line option (*note Invoking `awk': Command Line.). + + * The `ARGIND' variable, that tracks the movement of `FILENAME' + through `ARGV'. (*note Built-in Variables::.). + + * The `ERRNO' variable, that contains the system error message when + `getline' returns -1, or when `close' fails. (*note Built-in + Variables::.). + + * The `IGNORECASE' variable and its effects (*note Case-sensitivity + in Matching: Case-sensitivity.). + + * The `FIELDWIDTHS' variable and its effects (*note Reading + Fixed-width Data: Constant Size.). + + * The `next file' statement for skipping to the next data file + (*note The `next file' Statement: Next File Statement.). + + * The `systime' and `strftime' built-in functions for obtaining and + printing time stamps (*note Functions for Dealing with Time + Stamps: Time Functions.). + + * The `/dev/stdin', `/dev/stdout', `/dev/stderr', and `/dev/fd/N' + file name interpretation (*note Standard I/O Streams: Special + Files.). + + * The `-W compat' option to turn off these extensions (*note + Invoking `awk': Command Line.). + + * The `-W posix' option for full POSIX compliance (*note Invoking + `awk': Command Line.). 
+ + +File: gawk.info, Node: Installation, Next: Gawk Summary, Prev: Language History, Up: Top + +Installing `gawk' +***************** + + This chapter provides instructions for installing `gawk' on the +various platforms that are supported by the developers. The primary +developers support Unix (and one day, GNU), while the other ports were +contributed. The file `ACKNOWLEDGMENT' in the `gawk' distribution +lists the electronic mail addresses of the people who did the +respective ports. + +* Menu: + +* Gawk Distribution:: What is in the `gawk' distribution. +* Unix Installation:: Installing `gawk' under various versions + of Unix. +* VMS Installation:: Installing `gawk' on VMS. +* MS-DOS Installation:: Installing `gawk' on MS-DOS. +* Atari Installation:: Installing `gawk' on the Atari ST. + + +File: gawk.info, Node: Gawk Distribution, Next: Unix Installation, Prev: Installation, Up: Installation + +The `gawk' Distribution +======================= + + This section first describes how to get and extract the `gawk' +distribution, and then discusses what is in the various files and +subdirectories. + +* Menu: + +* Extracting:: How to get and extract the distribution. +* Distribution contents:: What is in the distribution. + + +File: gawk.info, Node: Extracting, Next: Distribution contents, Prev: Gawk Distribution, Up: Gawk Distribution + +Getting the `gawk' Distribution +------------------------------- + + `gawk' is distributed as a `tar' file compressed with the GNU Zip +program, `gzip'. You can get it via anonymous `ftp' to the Internet +host `prep.ai.mit.edu'. Like all GNU software, it will be archived at +other well known systems, from which it will be possible to use some +sort of anonymous `uucp' to obtain the distribution as well. You can +also order `gawk' on tape or CD-ROM directly from the Free Software +Foundation. (The address is on the copyright page.) 
Doing so directly +contributes to the support of the foundation and to the production of +more free software. + + Once you have the distribution (for example, `gawk-2.15.0.tar.z'), +first use `gzip' to expand the file, and then use `tar' to extract it. +You can use the following pipeline to extract the `gawk' distribution: + + # Under System V, add 'o' to the tar flags + gzip -d -c gawk-2.15.0.tar.z | tar -xvpf - + +This will create a directory named `gawk-2.15' in the current directory. + + The distribution file name is of the form `gawk-2.15.N.tar.z'. The +N represents a "patchlevel", meaning that minor bugs have been fixed in +the major release. The current patchlevel is 0, but when retrieving +distributions, you should get the version with the highest patchlevel. + + If you are not on a Unix system, you will need to make other +arrangements for getting and extracting the `gawk' distribution. You +should consult a local expert. + + +File: gawk.info, Node: Distribution contents, Prev: Extracting, Up: Gawk Distribution + +Contents of the `gawk' Distribution +----------------------------------- + + `gawk' has a number of C source files, documentation files, +subdirectories and files related to the configuration process (*note +Compiling and Installing `gawk' on Unix: Unix Installation.), and +several subdirectories related to different, non-Unix, operating +systems. + +various `.c', `.y', and `.h' files + The C and YACC source files are the actual `gawk' source code. + +`README' +`README.VMS' +`README.dos' +`README.rs6000' +`README.ultrix' + Descriptive files: `README' for `gawk' under Unix, and the rest + for the various hardware and software combinations. + +`PORTS' + A list of systems to which `gawk' has been ported, and which have + successfully run the test suite. + +`ACKNOWLEDGMENT' + A list of the people who contributed major parts of the code or + documentation. + +`NEWS' + A list of changes to `gawk' since the last release or patch. 
+ +`COPYING' + The GNU General Public License. + +`FUTURES' + A brief list of features and/or changes being contemplated for + future releases, with some indication of the time frame for the + feature, based on its difficulty. + +`LIMITATIONS' + A list of those factors that limit `gawk''s performance. Most of + these depend on the hardware or operating system software, and are + not limits in `gawk' itself. + +`PROBLEMS' + A file describing known problems with the current release. + +`gawk.1' + The `troff' source for a manual page describing `gawk'. + +`gawk.texinfo' + The `texinfo' source file for this Info file. It should be + processed with TeX to produce a printed manual, and with + `makeinfo' to produce the Info file. + +`Makefile.in' +`config' +`config.in' +`configure' +`missing' +`mungeconf' + These files and subdirectories are used when configuring `gawk' + for various Unix systems. They are explained in detail in *Note + Compiling and Installing `gawk' on Unix: Unix Installation. + +`atari' + Files needed for building `gawk' on an Atari ST. *Note Installing + `gawk' on the Atari ST: Atari Installation, for details. + +`pc' + Files needed for building `gawk' under MS-DOS. *Note Installing + `gawk' on MS-DOS: MS-DOS Installation, for details. + +`vms' + Files needed for building `gawk' under VMS. *Note Compiling + Installing and Running `gawk' on VMS: VMS Installation, for + details. + +`test' + Many interesting `awk' programs, provided as a test suite for + `gawk'. You can use `make test' from the top level `gawk' + directory to run your version of `gawk' against the test suite. + If `gawk' successfully passes `make test' then you can be + confident of a successful port. + + +File: gawk.info, Node: Unix Installation, Next: VMS Installation, Prev: Gawk Distribution, Up: Installation + +Compiling and Installing `gawk' on Unix +======================================= + + Often, you can compile and install `gawk' by typing only two +commands. 
However, if you do not use a supported system, you may need +to configure `gawk' for your system yourself. + +* Menu: + +* Quick Installation:: Compiling `gawk' on a + supported Unix version. +* Configuration Philosophy:: How it's all supposed to work. +* New Configurations:: What to do if there is no supplied + configuration for your system. + + +File: gawk.info, Node: Quick Installation, Next: Configuration Philosophy, Prev: Unix Installation, Up: Unix Installation + +Compiling `gawk' for a Supported Unix Version +--------------------------------------------- + + After you have extracted the `gawk' distribution, `cd' to +`gawk-2.15'. Look in the `config' subdirectory for a file that matches +your hardware/software combination. In general, only the software is +relevant; for example `sunos41' is used for SunOS 4.1, on both Sun 3 +and Sun 4 hardware. + + If you find such a file, run the command: + + # assume you have SunOS 4.1 + ./configure sunos41 + + This produces a `Makefile' and `config.h' tailored to your system. +You may wish to edit the `Makefile' to use a different C compiler, such +as `gcc', the GNU C compiler, if you have it. You may also wish to +change the `CFLAGS' variable, which controls the command line options +that are passed to the C compiler (such as optimization levels, or +compiling for debugging). + + After you have configured `Makefile' and `config.h', type: + + make + +and shortly thereafter, you should have an executable version of `gawk'. +That's all there is to it! + + +File: gawk.info, Node: Configuration Philosophy, Next: New Configurations, Prev: Quick Installation, Up: Unix Installation + +The Configuration Process +------------------------- + + (This section is of interest only if you know something about using +the C language and the Unix operating system.) + + The source code for `gawk' generally attempts to adhere to industry +standards wherever possible. 
This means that `gawk' uses library +routines that are specified by the ANSI C standard and by the POSIX +operating system interface standard. When using an ANSI C compiler, +function prototypes are provided to help improve the compile-time +checking. + + Many older Unix systems do not support all of either the ANSI or the +POSIX standards. The `missing' subdirectory in the `gawk' distribution +contains replacement versions of those subroutines that are most likely +to be missing. + + The `config.h' file that is created by the `configure' program +contains definitions that describe features of the particular operating +system where you are attempting to compile `gawk'. For the most part, +it lists which standard subroutines are *not* available. For example, +if your system lacks the `getopt' routine, then `GETOPT_MISSING' would +be defined. + + `config.h' also defines constants that describe facts about your +variant of Unix. For example, there may not be an `st_blksize' element +in the `stat' structure. In this case `BLKSIZE_MISSING' would be +defined. + + Based on the list in `config.h' of standard subroutines that are +missing, `missing.c' will do a `#include' of the appropriate file(s) +from the `missing' subdirectory. + + Conditionally compiled code in the other source files relies on the +other definitions in the `config.h' file. + + Besides creating `config.h', `configure' produces a `Makefile' from +`Makefile.in'. There are a number of lines in `Makefile.in' that are +system or feature specific. For example, there is a line that begins +with `##MAKE_ALLOCA_C##'. This is normally a comment line, since it +starts with `#'. If a configuration file has `MAKE_ALLOCA_C' in it, +then `configure' will delete the `##MAKE_ALLOCA_C##' from the beginning +of the line. This will enable the rules in the `Makefile' that use a C +version of `alloca'. There are several similar features that work in +this fashion. 
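The `##MAKE_ALLOCA_C##' mechanism described above can be illustrated with a short Python sketch. This is not the `configure' script itself (which is a shell script in the distribution); it only mimics the described behaviour of deleting a `##NAME##' prefix from lines whose NAME is enabled. The function name `enable_features', the sample `Makefile.in' lines, and the `MAKE_OTHER' feature name are assumptions made up for illustration.

```python
import re

def enable_features(makefile_in_text, enabled):
    """Return the text with the leading ##NAME## removed from every
    line whose NAME is in `enabled`; other lines are left untouched
    (so they remain comments, since they start with '#')."""
    out = []
    for line in makefile_in_text.splitlines():
        m = re.match(r"##([A-Z0-9_]+)##", line)
        if m and m.group(1) in enabled:
            line = line[m.end():]   # drop the ##NAME## prefix
        out.append(line)
    return "\n".join(out)

# Hypothetical template lines, not copied from the real Makefile.in.
template = (
    "CC = cc\n"
    "##MAKE_ALLOCA_C##ALLOCA = alloca.o\n"
    "##MAKE_OTHER##OTHER = other.o\n"
)

result = enable_features(template, {"MAKE_ALLOCA_C"})
print(result)
```

Only the line tagged `MAKE_ALLOCA_C' becomes active; the `MAKE_OTHER' line keeps its prefix and therefore stays a comment.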
+ + +File: gawk.info, Node: New Configurations, Prev: Configuration Philosophy, Up: Unix Installation + +Configuring `gawk' for a New System +----------------------------------- + + (This section is of interest only if you know something about using +the C language and the Unix operating system, and if you have to install +`gawk' on a system that is not supported by the `gawk' distribution. +If you are a C or Unix novice, get help from a local expert.) + + If you need to configure `gawk' for a Unix system that is not +supported in the distribution, first see *Note The Configuration +Process: Configuration Philosophy. Then, copy `config.in' to +`config.h', and copy `Makefile.in' to `Makefile'. + + Next, edit both files. Both files are liberally commented, and the +necessary changes should be straightforward. + + While editing `config.h', you need to determine what library +routines you do or do not have by consulting your system documentation, +or by perusing your actual libraries using the `ar' or `nm' utilities. +In the worst case, simply do not define *any* of the macros for missing +subroutines. When you compile `gawk', the final link-editing step will +fail. The link editor will provide you with a list of unresolved +external references--these are the missing subroutines. Edit +`config.h' again and recompile, and you should be set. + + Editing the `Makefile' should also be straightforward. Enable or +disable the lines that begin with `##MAKE_WHATEVER##', as appropriate. +Select the correct C compiler and `CFLAGS' for it. Then run `make'. + + Getting a correct configuration is likely to be an iterative process. +Do not be discouraged if it takes you several tries. If you have no +luck whatsoever, please report your system type, and the steps you took. +Once you do have a working configuration, please send it to the +maintainers so that support for your system can be added to the +official release. 
+ + *Note Reporting Problems and Bugs: Bugs, for information on how to +report problems in configuring `gawk'. You may also use the same +mechanisms for sending in new configurations. + + +File: gawk.info, Node: VMS Installation, Next: MS-DOS Installation, Prev: Unix Installation, Up: Installation + +Compiling, Installing, and Running `gawk' on VMS +================================================ + + This section describes how to compile and install `gawk' under VMS. + +* Menu: + +* VMS Compilation:: How to compile `gawk' under VMS. +* VMS Installation Details:: How to install `gawk' under VMS. +* VMS Running:: How to run `gawk' under VMS. +* VMS POSIX:: Alternate instructions for VMS POSIX. + + +File: gawk.info, Node: VMS Compilation, Next: VMS Installation Details, Prev: VMS Installation, Up: VMS Installation + +Compiling `gawk' under VMS +-------------------------- + + To compile `gawk' under VMS, there is a `DCL' command procedure that +will issue all the necessary `CC' and `LINK' commands, and there is +also a `Makefile' for use with the `MMS' utility. From the source +directory, use either + + $ @[.VMS]VMSBUILD.COM + +or + + $ MMS/DESCRIPTION=[.VMS]DESCRIP.MMS GAWK + + Depending upon which C compiler you are using, follow one of the sets +of instructions in this table: + +VAX C V3.x + Use either `vmsbuild.com' or `descrip.mms' as is. These use + `CC/OPTIMIZE=NOLINE', which is essential for Version 3.0. + +VAX C V2.x + You must have Version 2.3 or 2.4; older ones won't work. Edit + either `vmsbuild.com' or `descrip.mms' according to the comments + in them. For `vmsbuild.com', this just entails removing two `!' + delimiters. Also edit `config.h' (which is a copy of file + `[.config]vms-conf.h') and comment out or delete the two lines + `#define __STDC__ 0' and `#define VAXC_BUILTINS' near the end. + +GNU C + Edit `vmsbuild.com' or `descrip.mms'; the changes are different + from those for VAX C V2.x, but equally straightforward. 
No + changes to `config.h' should be needed. + +DEC C + Edit `vmsbuild.com' or `descrip.mms' according to their comments. + No changes to `config.h' should be needed. + + `gawk' 2.15 has been tested under VAX/VMS 5.5-1 using VAX C V3.2, +GNU C 1.40 and 2.3. It should work without modifications for VMS V4.6 +and up. + + +File: gawk.info, Node: VMS Installation Details, Next: VMS Running, Prev: VMS Compilation, Up: VMS Installation + +Installing `gawk' on VMS +------------------------ + + To install `gawk', all you need is a "foreign" command, which is a +`DCL' symbol whose value begins with a dollar sign. + + $ GAWK :== $device:[directory]GAWK + +(Substitute the actual location of `gawk.exe' for +`device:[directory]'.) The symbol should be placed in the `login.com' +of any user who wishes to run `gawk', so that it will be defined every +time the user logs on. Alternatively, the symbol may be placed in the +system-wide `sylogin.com' procedure, which will allow all users to run +`gawk'. + + Optionally, the help entry can be loaded into a VMS help library: + + $ LIBRARY/HELP SYS$HELP:HELPLIB [.VMS]GAWK.HLP + +(You may want to substitute a site-specific help library rather than +the standard VMS library `HELPLIB'.) After loading the help text, + + $ HELP GAWK + +will provide information about both the `gawk' implementation and the +`awk' programming language. + + The logical name `AWK_LIBRARY' can designate a default location for +`awk' program files. For the `-f' option, if the specified filename +has no device or directory path information in it, `gawk' will look in +the current directory first, then in the directory specified by the +translation of `AWK_LIBRARY' if the file was not found. If after +searching in both directories, the file still is not found, then `gawk' +appends the suffix `.awk' to the filename and the file search will be +re-tried. If `AWK_LIBRARY' is not defined, that portion of the file +search will fail benignly. 
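The `-f' file search order described above for VMS can be sketched in Python. This is only an illustration of the order in which names are tried, assuming a bare filename with no device or directory information; it is not taken from the `gawk' source. The function name `candidate_paths' and the sample directory `DISK$A:[AWKLIB]' are hypothetical.

```python
def candidate_paths(filename, awk_library=None):
    """List the names that would be tried, in order, for a bare filename.

    First pass: current directory, then the AWK_LIBRARY directory.
    Second pass: the same directories with '.awk' appended to the name.
    If AWK_LIBRARY is undefined (None), that part is simply skipped.
    """
    dirs = [""]                      # "" stands for the current directory
    if awk_library:
        dirs.append(awk_library)     # translation of AWK_LIBRARY
    names = [filename, filename + ".awk"]
    return [d + n for n in names for d in dirs]

with_library = candidate_paths("report", "DISK$A:[AWKLIB]")
without_library = candidate_paths("report")
print(with_library)
print(without_library)
```

The first candidate that exists would be the one used; when `AWK_LIBRARY' is not defined, only the current directory appears in the list, which matches the "fail benignly" behaviour described in the text.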
+ + +File: gawk.info, Node: VMS Running, Next: VMS POSIX, Prev: VMS Installation Details, Up: VMS Installation + +Running `gawk' on VMS +--------------------- + + Command line parsing and quoting conventions are significantly +different on VMS, so examples in this manual or from other sources +often need minor changes. They *are* minor though, and all `awk' +programs should run correctly. + + Here are a couple of trivial tests: + + $ gawk -- "BEGIN {print ""Hello, World!""}" + $ gawk -"W" version ! could also be -"W version" or "-W version" + +Note that upper-case and mixed-case text must be quoted. + + The VMS port of `gawk' includes a `DCL'-style interface in addition +to the original shell-style interface (see the help entry for details). +One side-effect of dual command line parsing is that if there is only a +single parameter (as in the quoted string program above), the command +becomes ambiguous. To work around this, the normally optional `--' +flag is required to force Unix style rather than `DCL' parsing. If any +other dash-type options (or multiple parameters such as data files to be +processed) are present, there is no ambiguity and `--' can be omitted. + + The default search path when looking for `awk' program files +specified by the `-f' option is `"SYS$DISK:[],AWK_LIBRARY:"'. The +logical name `AWKPATH' can be used to override this default. The format +of `AWKPATH' is a comma-separated list of directory specifications. +When defining it, the value should be quoted so that it retains a single +translation, and not a multi-translation `RMS' searchlist. + + +File: gawk.info, Node: VMS POSIX, Prev: VMS Running, Up: VMS Installation + +Building and using `gawk' under VMS POSIX +----------------------------------------- + + Ignore the instructions above, although `vms/gawk.hlp' should still +be made available in a help library. Make sure that the two scripts, +`configure' and `mungeconf', are executable; use `chmod +x' on them if +necessary. 
Then execute the following commands: + + $ POSIX + psx> configure vms-posix + psx> make awktab.c gawk + +The first command will construct files `config.h' and `Makefile' out of +templates. The second command will compile and link `gawk'. Due to a +`make' bug in VMS POSIX V1.0 and V1.1, the file `awktab.c' must be +given as an explicit target or it will not be built and the final link +step will fail. Ignore the warning `"Could not find lib m in lib +list"'; it is harmless, caused by the explicit use of `-lm' as a linker +option which is not needed under VMS POSIX. Under V1.1 (but not V1.0) +a problem with the `yacc' skeleton `/etc/yyparse.c' will cause a +compiler warning for `awktab.c', followed by a linker warning about +compilation warnings in the resulting object module. These warnings +can be ignored. + + Once built, `gawk' will work like any other shell utility. Unlike +the normal VMS port of `gawk', no special command line manipulation is +needed in the VMS POSIX environment. + + +File: gawk.info, Node: MS-DOS Installation, Next: Atari Installation, Prev: VMS Installation, Up: Installation + +Installing `gawk' on MS-DOS +=========================== + + The first step is to get all the files in the `gawk' distribution +onto your PC. Move all the files from the `pc' directory into the main +directory where the other files are. Edit the file `make.bat' so that +it will be an acceptable MS-DOS batch file. This means making sure +that all lines are terminated with the ASCII carriage return and line +feed characters. + + `gawk' has only been compiled with version 5.1 of the Microsoft C +compiler. The file `make.bat' from the `pc' directory assumes that you +have this compiler. + + Copy the file `setargv.obj' from the library directory where it +resides to the `gawk' source code directory. + + Run `make.bat'. This will compile `gawk' for you, and link it. +That's all there is to it! 
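The MS-DOS instructions above require every line of `make.bat' to end with a carriage return and line feed. A minimal Python sketch of that conversion follows, assuming the file was unpacked on a Unix system with bare line feeds. The function name `to_crlf' and the sample batch content are illustrative assumptions, not part of the `gawk' distribution.

```python
def to_crlf(data: bytes) -> bytes:
    """Return `data` with every line terminated by CR LF.

    Existing CR LF pairs are first collapsed to LF so that they are
    not doubled; lone CR characters are left as they are.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

# Hypothetical batch file content with mixed line endings.
sample = b"echo building\ncl -c main.c\r\n"
converted = to_crlf(sample)
print(converted)
```

To apply it to a real file, read and write `make.bat' in binary mode so that the host system does not translate line endings a second time.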
+ + +File: gawk.info, Node: Atari Installation, Prev: MS-DOS Installation, Up: Installation + +Installing `gawk' on the Atari ST +================================= + + This section assumes that you are running TOS. It applies to other +Atari models (STe, TT) as well. + + In order to use `gawk', you need to have a shell, either text or +graphics, that does not map all the characters of a command line to +upper case. Maintaining case distinction in option flags is very +important (*note Invoking `awk': Command Line.). Popular shells like +`gulam' or `gemini' will work, as will newer versions of `desktop'. +Support for I/O redirection is necessary to make it easy to import +`awk' programs from other environments. Pipes are nice to have, but +not vital. + + If you have received an executable version of `gawk', place it, as +usual, anywhere in your `PATH' where your shell will find it. + + While executing, `gawk' creates a number of temporary files. `gawk' +looks for either of the environment variables `TEMP' or `TMPDIR', in +that order. If either one is found, its value is assumed to be a +directory for temporary files. This directory must exist, and if you +can spare the memory, it is a good idea to put it on a RAM drive. If +neither `TEMP' nor `TMPDIR' is found, then `gawk' uses the current +directory for its temporary files. + + The ST version of `gawk' searches for its program files as described +in *Note The `AWKPATH' Environment Variable: AWKPATH Variable. On the +ST, the default value for the `AWKPATH' variable is +`".,c:\lib\awk,c:\gnu\lib\awk"'. The search path can be modified by +explicitly setting `AWKPATH' to whatever you wish. Note that colons +cannot be used on the ST to separate elements in the `AWKPATH' +variable, since they have another, reserved, meaning. Instead, you +must use a comma to separate elements in the path. 
If you are +recompiling `gawk' on the ST, then you can choose a new default search +path, by setting the value of `DEFPATH' in the file `...\config\atari'. +You may choose a different separator character by setting the value of +`ENVSEP' in the same file. The new values will be used when creating +the header file `config.h'. + + Although `awk' allows great flexibility in doing I/O redirections +from within a program, this facility should be used with care on the ST. +In some circumstances the OS routines for file handle pool processing +lose track of certain events, causing the computer to crash, and +requiring a reboot. Often a warm reboot is sufficient. Fortunately, +this happens infrequently, and in rather esoteric situations. In +particular, avoid having one part of an `awk' program using `print' +statements explicitly redirected to `"/dev/stdout"', while other +`print' statements use the default standard output, and a calling shell +has redirected standard output to a file. + + When `gawk' is compiled with the ST version of `gcc' and its usual +libraries, it will accept both `/' and `\' as path separators. While +this is convenient, it should be remembered that this removes one, +technically legal, character (`/') from your file names, and that it +may create problems for external programs, called via the `system()' +function, which may not support this convention. Whenever it is +possible that a file created by `gawk' will be used by some other +program, use only backslashes. Also remember that in `awk', +backslashes in strings have to be doubled in order to get literal +backslashes. + + The initial port of `gawk' to the ST was done with `gcc'. If you +wish to recompile `gawk' from scratch, you will need to use a compiler +that accepts ANSI standard C (such as `gcc', Turbo C, or Prospero C). 
+If `sizeof(int) != sizeof(int *)', the correctness of the generated +code depends heavily on the fact that all function calls have function +prototypes in the current scope. If your compiler does not accept +function prototypes, you will probably have to add a number of casts to +the code. + + If you are using `gcc', make sure that you have up-to-date libraries. +Older versions have problems with some library functions (`atan2()', +`strftime()', the `%g' conversion in `sprintf()') which may affect the +operation of `gawk'. + + In the `atari' subdirectory of the `gawk' distribution is a version +of the `system()' function that has been tested with `gulam' and `msh'; +it should work with other shells as well. With `gulam', it passes the +string to be executed without spawning an extra copy of a shell. It is +possible to replace this version of `system()' with a similar function +from a library or from some other source if that version would be a +better choice for the shell you prefer. + + The files needed to recompile `gawk' on the ST can be found in the +`atari' directory. The provided files and instructions below assume +that you have the GNU C compiler (`gcc'), the `gulam' shell, and an ST +version of `sed'. The `Makefile' is set up to use `byacc' as a `yacc' +replacement. With a different set of tools some adjustments and/or +editing will be needed. + + `cd' to the `atari' directory. Copy `Makefile.st' to `makefile' in +the source (parent) directory. Possibly adjust `../config/atari' to +suit your system. Execute the script `mkconf.g' which will create the +header file `../config.h'. Go back to the source directory. If you +are not using `gcc', check the file `missing.c'. It may be necessary +to change forward slashes in the references to files from the `atari' +subdirectory into backslashes. Type `make' and enjoy. + + Compilation with `gcc' of some of the bigger modules, like +`awk_tab.c', may require a full four megabytes of memory. 
On smaller +machines you would need to cut down on optimizations, or you would have +to switch to another, less memory-hungry, compiler. + + +File: gawk.info, Node: Gawk Summary, Next: Sample Program, Prev: Installation, Up: Top + +`gawk' Summary +************** + + This appendix provides a brief summary of the `gawk' command line +and the `awk' language. It is designed to serve as a "quick reference." +It is therefore terse, but complete. + +* Menu: + +* Command Line Summary:: Recapitulation of the command line. +* Language Summary:: A terse review of the language. +* Variables/Fields:: Variables, fields, and arrays. +* Rules Summary:: Patterns and Actions, and their + component parts. +* Functions Summary:: Defining and calling functions. +* Historical Features:: Some undocumented but supported "features". + + +File: gawk.info, Node: Command Line Summary, Next: Language Summary, Prev: Gawk Summary, Up: Gawk Summary + +Command Line Options Summary +============================ + + The command line consists of options to `gawk' itself, the `awk' +program text (if not supplied via the `-f' option), and values to be +made available in the `ARGC' and `ARGV' predefined `awk' variables: + + awk [POSIX OR GNU STYLE OPTIONS] -f source-file [`--'] FILE ... + awk [POSIX OR GNU STYLE OPTIONS] [`--'] 'PROGRAM' FILE ... + + The options that `gawk' accepts are: + +`-F FS' +`--field-separator=FS' + Use FS for the input field separator (the value of the `FS' + predefined variable). + +`-f PROGRAM-FILE' +`--file=PROGRAM-FILE' + Read the `awk' program source from the file PROGRAM-FILE, instead + of from the first command line argument. + +`-v VAR=VAL' +`--assign=VAR=VAL' + Assign the variable VAR the value VAL before program execution + begins. + +`-W compat' +`--compat' + Specifies compatibility mode, in which `gawk' extensions are turned + off. 
+ +`-W copyleft' +`-W copyright' +`--copyleft' +`--copyright' + Print the short version of the General Public License on the error + output. This option may disappear in a future version of `gawk'. + +`-W help' +`-W usage' +`--help' +`--usage' + Print a relatively short summary of the available options on the + error output. + +`-W lint' +`--lint' + Give warnings about dubious or non-portable `awk' constructs. + +`-W posix' +`--posix' + Specifies POSIX compatibility mode, in which `gawk' extensions are + turned off and additional restrictions apply. + +`-W source=PROGRAM-TEXT' +`--source=PROGRAM-TEXT' + Use PROGRAM-TEXT as `awk' program source code. This option allows + mixing command line source code with source code from files, and is + particularly useful for mixing command line programs with library + functions. + +`-W version' +`--version' + Print version information for this particular copy of `gawk' on + the error output. This option may disappear in a future version + of `gawk'. + +`--' + Signal the end of options. This is useful to allow further + arguments to the `awk' program itself to start with a `-'. This + is mainly for consistency with the argument parsing conventions of + POSIX. + + Any other options are flagged as invalid, but are otherwise ignored. +*Note Invoking `awk': Command Line, for more details. + + +File: gawk.info, Node: Language Summary, Next: Variables/Fields, Prev: Command Line Summary, Up: Gawk Summary + +Language Summary +================ + + An `awk' program consists of a sequence of pattern-action statements +and optional function definitions. + + PATTERN { ACTION STATEMENTS } + + function NAME(PARAMETER LIST) { ACTION STATEMENTS } + + `gawk' first reads the program source from the PROGRAM-FILE(s) if +specified, or from the first non-option argument on the command line. +The `-f' option may be used multiple times on the command line. 
`gawk' +reads the program text from all the PROGRAM-FILE files, effectively +concatenating them in the order they are specified. This is useful for +building libraries of `awk' functions, without having to include them +in each new `awk' program that uses them. To use a library function in +a file from a program typed in on the command line, specify `-f +/dev/tty'; then type your program, and end it with a `Control-d'. +*Note Invoking `awk': Command Line. + + The environment variable `AWKPATH' specifies a search path to use +when finding source files named with the `-f' option. The default +path, which is `.:/usr/lib/awk:/usr/local/lib/awk', is used if `AWKPATH' +is not set. If a file name given to the `-f' option contains a `/' +character, no path search is performed. *Note The `AWKPATH' +Environment Variable: AWKPATH Variable, for a full description of the +`AWKPATH' environment variable. + + `gawk' compiles the program into an internal form, and then proceeds +to read each file named in the `ARGV' array. If there are no files +named on the command line, `gawk' reads the standard input. + + If a "file" named on the command line has the form `VAR=VAL', it is +treated as a variable assignment: the variable VAR is assigned the +value VAL. If any file argument is the null string, +that element in the list is skipped. + + For each line in the input, `gawk' tests to see if it matches any +PATTERN in the `awk' program. For each pattern that the line matches, +the associated ACTION is executed. + + +File: gawk.info, Node: Variables/Fields, Next: Rules Summary, Prev: Language Summary, Up: Gawk Summary + +Variables and Fields +==================== + + `awk' variables are dynamic; they come into existence when they are +first used. Their values are either floating-point numbers or strings. +`awk' also has one-dimensional arrays; multi-dimensional arrays may be +simulated. 
There are several predefined variables that `awk' sets as a +program runs; these are summarized below. + +* Menu: + +* Fields Summary:: Input field splitting. +* Built-in Summary:: `awk''s built-in variables. +* Arrays Summary:: Using arrays. +* Data Type Summary:: Values in `awk' are numbers or strings. + + +File: gawk.info, Node: Fields Summary, Next: Built-in Summary, Prev: Variables/Fields, Up: Variables/Fields + +Fields +------ + + As each input line is read, `gawk' splits the line into FIELDS, +using the value of the `FS' variable as the field separator. If `FS' +is a single character, fields are separated by that character. +Otherwise, `FS' is expected to be a full regular expression. In the +special case that `FS' is a single blank, fields are separated by runs +of blanks and/or tabs. Note that the value of `IGNORECASE' (*note +Case-sensitivity in Matching: Case-sensitivity.) also affects how +fields are split when `FS' is a regular expression. + + Each field in the input line may be referenced by its position, `$1', +`$2', and so on. `$0' is the whole line. A field may be +assigned a new value as well. Field numbers need not be constants: + + n = 5 + print $n + +prints the fifth field in the input line. The variable `NF' is set to +the total number of fields in the input line. + + References to nonexistent fields (i.e., fields after `$NF') return +the null string. However, assigning to a nonexistent field (e.g., +`$(NF+2) = 5') increases the value of `NF', creates any intervening +fields with the null string as their value, and causes the value of +`$0' to be recomputed, with the fields being separated by the value of +`OFS'. + + *Note Reading Input Files: Reading Files, for a full description of +the way `awk' defines and uses fields. 
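   The field rules summarized above can be sketched with a short shell function. This is only an illustration: the helper name `fields_demo' and the sample record are assumptions made for the example and are not part of `gawk'. It shows a field number taken from a variable and the effect of assigning past `$NF'.

```shell
# Hypothetical helper; the name and the sample record are illustrative only.
fields_demo() {
    printf 'a b c\n' | awk '{
        n = 2
        print $n            # field number taken from a variable: prints "b"
        $(NF + 2) = 5       # assign two fields past the end of the record
        print NF            # NF grows from 3 to 5
        print $0            # $0 is rebuilt with OFS; $4 is the null string
    }'
}

fields_demo
```

   The last line of output contains two adjacent blanks, one on each side of the empty fourth field.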
+ + +File: gawk.info, Node: Built-in Summary, Next: Arrays Summary, Prev: Fields Summary, Up: Variables/Fields + +Built-in Variables +------------------ + + `awk''s built-in variables are: + +`ARGC' + The number of command line arguments (not including options or the + `awk' program itself). + +`ARGIND' + The index in `ARGV' of the current file being processed. It is + always true that `FILENAME == ARGV[ARGIND]'. + +`ARGV' + The array of command line arguments. The array is indexed from 0 + to `ARGC' - 1. Dynamically changing the contents of `ARGV' can + control the files used for data. + +`CONVFMT' + The conversion format to use when converting numbers to strings. + +`FIELDWIDTHS' + A space separated list of numbers describing the fixed-width input + data. + +`ENVIRON' + An array containing the values of the environment variables. The + array is indexed by variable name, each element being the value of + that variable. Thus, the environment variable `HOME' would be in + `ENVIRON["HOME"]'. Its value might be `/u/close'. + + Changing this array does not affect the environment seen by + programs which `gawk' spawns via redirection or the `system' + function. (This may change in a future version of `gawk'.) + + Some operating systems do not have environment variables. The + array `ENVIRON' is empty when running on these systems. + +`ERRNO' + The system error message when an error occurs using `getline' or + `close'. + +`FILENAME' + The name of the current input file. If no files are specified on + the command line, the value of `FILENAME' is `-'. + +`FNR' + The input record number in the current input file. + +`FS' + The input field separator, a blank by default. + +`IGNORECASE' + The case-sensitivity flag for regular expression operations. 
If + `IGNORECASE' has a nonzero value, then pattern matching in rules, + field splitting with `FS', regular expression matching with `~' + and `!~', and the `gsub', `index', `match', `split' and `sub' + predefined functions all ignore case when doing regular expression + operations. + +`NF' + The number of fields in the current input record. + +`NR' + The total number of input records seen so far. + +`OFMT' + The output format for numbers for the `print' statement, `"%.6g"' + by default. + +`OFS' + The output field separator, a blank by default. + +`ORS' + The output record separator, by default a newline. + +`RS' + The input record separator, by default a newline. `RS' is + exceptional in that only the first character of its string value + is used for separating records. If `RS' is set to the null + string, then records are separated by blank lines. When `RS' is + set to the null string, then the newline character always acts as + a field separator, in addition to whatever value `FS' may have. + +`RSTART' + The index of the first character matched by `match'; 0 if no match. + +`RLENGTH' + The length of the string matched by `match'; -1 if no match. + +`SUBSEP' + The string used to separate multiple subscripts in array elements, + by default `"\034"'. + + *Note Built-in Variables::, for more information. + + +File: gawk.info, Node: Arrays Summary, Next: Data Type Summary, Prev: Built-in Summary, Up: Variables/Fields + +Arrays +------ + + Arrays are subscripted with an expression between square brackets +(`[' and `]'). Array subscripts are *always* strings; numbers are +converted to strings as necessary, following the standard conversion +rules (*note Conversion of Strings and Numbers: Conversion.). 
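   A small sketch may make the "subscripts are always strings" rule concrete. The function name `subscript_demo' is a hypothetical helper chosen for this example. A numeric subscript and the string it converts to name the same element, while a differently spelled string names another one.

```shell
# Hypothetical helper; the name is illustrative only.
subscript_demo() {
    awk 'BEGIN {
        a[1] = "one"            # numeric subscript, stored under the string "1"
        print a["1"]            # the same element, so this prints "one"
        a["01"] = "zero-one"    # a different string, hence a different element
        n = 0
        for (k in a)            # count the elements actually present
            n++
        print n
    }'
}

subscript_demo
```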
+ + If you use multiple expressions separated by commas inside the square +brackets, then the array subscript is a string consisting of the +concatenation of the individual subscript values, converted to strings, +separated by the subscript separator (the value of `SUBSEP'). + + The special operator `in' may be used in an `if' or `while' +statement to see if an array has an index consisting of a particular +value. + + if (val in array) + print array[val] + + If the array has multiple subscripts, use `(i, j, ...) in array' to +test for existence of an element. + + The `in' construct may also be used in a `for' loop to iterate over +all the elements of an array. *Note Scanning all Elements of an Array: +Scanning an Array. + + An element may be deleted from an array using the `delete' statement. + + *Note Arrays in `awk': Arrays, for more detailed information. + + +File: gawk.info, Node: Data Type Summary, Prev: Arrays Summary, Up: Variables/Fields + +Data Types +---------- + + The value of an `awk' expression is always either a number or a +string. + + Certain contexts (such as arithmetic operators) require numeric +values. They convert strings to numbers by interpreting the text of +the string as a numeral. If the string does not look like a numeral, +it converts to 0. + + Certain contexts (such as concatenation) require string values. +They convert numbers to strings by effectively printing them with +`sprintf'. *Note Conversion of Strings and Numbers: Conversion, for +the details. + + To force conversion of a string value to a number, simply add 0 to +it. If the value you start with is already a number, this does not +change it. + + To force conversion of a numeric value to a string, concatenate it +with the null string. + + The `awk' language defines comparisons as being done numerically if +both operands are numeric, or if one is numeric and the other is a +numeric string. 
Otherwise, one or both operands are converted to +strings and a string comparison is performed. + + Uninitialized variables have the string value `""' (the null, or +empty, string). In contexts where a number is required, this is +equivalent to 0. + + *Note Variables::, for more information on variable naming and +initialization; *note Conversion of Strings and Numbers: Conversion., +for more information on how variable values are interpreted. + + +File: gawk.info, Node: Rules Summary, Next: Functions Summary, Prev: Variables/Fields, Up: Gawk Summary + +Patterns and Actions +==================== + +* Menu: + +* Pattern Summary:: Quick overview of patterns. +* Regexp Summary:: Quick overview of regular expressions. +* Actions Summary:: Quick overview of actions. + + An `awk' program is mostly composed of rules, each consisting of a +pattern followed by an action. The action is enclosed in `{' and `}'. +Either the pattern may be missing, or the action may be missing, but, +of course, not both. If the pattern is missing, the action is executed +for every single line of input. A missing action is equivalent to this +action, + + { print } + +which prints the entire line. + + Comments begin with the `#' character, and continue until the end of +the line. Blank lines may be used to separate statements. Normally, a +statement ends with a newline; however, this is not the case for lines +ending in a `,', `{', `?', `:', `&&', or `||'. Lines ending in `do' or +`else' also have their statements automatically continued on the +following line. In other cases, a line can be continued by ending it +with a `\', in which case the newline is ignored. + + Multiple statements may be put on one line by separating them with a +`;'. This applies to both the statements within the action part of a +rule (the usual case), and to the rules themselves. 
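   The layout rules above can be tried out with a sketch like the following. The helper name `continuation_demo' and its input are assumptions made for illustration. The pattern is continued onto a second line after `&&', a comment occupies a line of its own, and several statements share one line separated by `;'.

```shell
# Hypothetical helper; the name and the sample input are illustrative only.
continuation_demo() {
    printf '1 2\n3 4\n' | awk '
    # A comment runs from the "#" to the end of the line.
    $1 > 0 &&
    $2 > 3 { x = $1; y = $2; print x + y }
    '
}

continuation_demo
```

   Only the second input line satisfies both halves of the pattern, so the action runs once.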
+ + *Note Comments in `awk' Programs: Comments, for information on +`awk''s commenting convention; *note `awk' Statements versus Lines: +Statements/Lines., for a description of the line continuation mechanism +in `awk'. + + +File: gawk.info, Node: Pattern Summary, Next: Regexp Summary, Prev: Rules Summary, Up: Rules Summary + +Patterns +-------- + + `awk' patterns may be one of the following: + + /REGULAR EXPRESSION/ + RELATIONAL EXPRESSION + PATTERN && PATTERN + PATTERN || PATTERN + PATTERN ? PATTERN : PATTERN + (PATTERN) + ! PATTERN + PATTERN1, PATTERN2 + BEGIN + END + + `BEGIN' and `END' are two special kinds of patterns that are not +tested against the input. The action parts of all `BEGIN' rules are +merged as if all the statements had been written in a single `BEGIN' +rule. They are executed before any of the input is read. Similarly, +all the `END' rules are merged, and executed when all the input is +exhausted (or when an `exit' statement is executed). `BEGIN' and `END' +patterns cannot be combined with other patterns in pattern expressions. +`BEGIN' and `END' rules cannot have missing action parts. + + For `/REGULAR-EXPRESSION/' patterns, the associated statement is +executed for each input line that matches the regular expression. +Regular expressions are extensions of those in `egrep', and are +summarized below. + + A RELATIONAL EXPRESSION may use any of the operators defined below in +the section on actions. These generally test whether certain fields +match certain regular expressions. + + The `&&', `||', and `!' operators are logical "and," logical "or," +and logical "not," respectively, as in C. They do short-circuit +evaluation, also as in C, and are used for combining more primitive +pattern expressions. As in most languages, parentheses may be used to +change the order of evaluation. + + The `?:' operator is like the same operator in C. 
If the first +pattern matches, then the second pattern is matched against the input +record; otherwise, the third is matched. Only one of the second and +third patterns is matched. + + The `PATTERN1, PATTERN2' form of a pattern is called a range +pattern. It matches all input lines starting with a line that matches +PATTERN1, and continuing until a line that matches PATTERN2, inclusive. +A range pattern cannot be used as an operand to any of the pattern +operators. + + *Note Patterns::, for a full description of the pattern part of `awk' +rules. +
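   As a rough illustration of the pattern forms listed in this node, the sketch below combines `BEGIN', `END', and a range pattern. The function name `range_demo' and the sample input are hypothetical. The range rule fires for every line from the first match of `/start/' through the next match of `/stop/', inclusive.

```shell
# Hypothetical helper; the name and the sample input are illustrative only.
range_demo() {
    printf 'a\nstart\nb\nstop\nc\n' | awk '
    BEGIN           { print "begin" }           # runs before any input is read
    /start/, /stop/ { print "in range: " $0 }   # range pattern, inclusive
    END             { print "end" }             # runs after all input is read
    '
}

range_demo
```

   The lines `a' and `c' fall outside the range and are not printed.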