Diffstat (limited to 'doc/gawk.1')
-rw-r--r-- | doc/gawk.1 | 502 |
1 files changed, 282 insertions, 220 deletions
@@ -1,6 +1,5 @@ .ds PX \s-1POSIX\s+1 .ds UX \s-1UNIX\s+1 -.ds AN \s-1ANSI\s+1 .ds GN \s-1GNU\s+1 .ds AK \s-1AWK\s+1 .ds EP \fIGAWK: Effective AWK Programming\fP @@ -14,7 +13,7 @@ . if \w'\(rq' .ds rq "\(rq . \} .\} -.TH GAWK 1 "Dec 07 2012" "Free Software Foundation" "Utility Commands" +.TH GAWK 1 "Apr 24 2013" "Free Software Foundation" "Utility Commands" .SH NAME gawk \- pattern scanning and processing language .SH SYNOPSIS @@ -43,7 +42,7 @@ This version in turn is based on the description in by Aho, Kernighan, and Weinberger. .I Gawk provides the additional features found in the current version -of \*(UX +of Brian Kernighan's .I awk and a number of \*(GN-specific extensions. .PP @@ -60,7 +59,7 @@ and .B ARGV pre-defined \*(AK variables. .PP -When +When .I gawk is invoked with the .B \-\^\-profile @@ -107,7 +106,7 @@ next command line argument. Long options may be abbreviated, as long as the abbreviation remains unique. .PP -Additionally, each long option has a corresponding short +Additionally, every long option has a corresponding short option, so that the option's functionality may be used from within .B #! @@ -158,7 +157,7 @@ to the variable before execution of the program begins. Such variable values are available to the .B BEGIN -block of an \*(AK program. +rule of an \*(AK program. .TP .PD 0 .B \-b @@ -171,6 +170,7 @@ process strings as multibyte characters. The .B "\-\^\-posix" option overrides this one. +.bp .TP .PD 0 .B \-c @@ -181,7 +181,7 @@ Run in .I compatibility mode. In compatibility mode, .I gawk -behaves identically to \*(UX +behaves identically to Brian Kernighan's .IR awk ; none of the \*(GN-specific extensions are recognized. .\" The use of @@ -234,7 +234,7 @@ Enable debugging of \*(AK programs. By default, the debugger reads commands interactively from the terminal. The optional .IR file -argument can be used to specify a file with a list +argument specifies a file with a list of commands for the debugger to execute non-interactively. 
.TP .PD 0 @@ -304,8 +304,10 @@ Load an awk source library. This searches for the library using the .B AWKPATH environment variable. If the initial search fails, another attempt will -be made after appending the ".awk" suffix. The file will be loaded only -once (i.e. duplicates are eliminated), and the code does not constitute +be made after appending the +.B \&.awk +suffix. The file will be loaded only +once (i.e., duplicates are eliminated), and the code does not constitute the main program source. .TP .PD 0 @@ -347,7 +349,7 @@ actually invalid are issued. (This is not fully implemented yet.) Force arbitrary precision arithmetic on numbers. This option has no effect if .I gawk -is not compiled to use the GNU MPFR and MP libraries. +is not compiled to use the GNU MPFR and MP libraries. .TP .PD 0 .B \-n @@ -415,12 +417,12 @@ elimination for recursive functions. The maintainer hopes to add additional optimizations over time. .TP .PD 0 -\fB\-p\fR[\fIprof_file\fR] +\fB\-p\fR[\fIprof-file\fR] .TP .PD -\fB\-\^\-profile\fR[\fB=\fIprof_file\fR] +\fB\-\^\-profile\fR[\fB=\fIprof-file\fR] Start a profiling session, and send the profiling data to -.IR prof_file . +.IR prof-file . The default is .BR awkprof.out . The profile contains execution counts of each statement in the program @@ -487,7 +489,7 @@ and .I egrep consistent with each other. They are enabled by default, but this option remains for use with -.BR \-\^-traditional . +.BR \-\^\-traditional . .TP .PD 0 .BI \-S @@ -500,7 +502,7 @@ in sandbox mode, disabling the .B system() function, input redirection with .BR getline , -output redirection with +output redirection with .BR print " and " printf , and loading dynamic extensions. Command execution (through pipelines) is also disabled. 
@@ -513,7 +515,7 @@ This effectively blocks a script from accessing local resources .PD .B \-\^\-lint\-old Provide warnings about constructs that are -not portable to the original version of Unix +not portable to the original version of \*(UX .IR awk . .TP .PD 0 @@ -547,6 +549,10 @@ options are passed on to the \*(AK program in the .B ARGV array for processing. This is particularly useful for running \*(AK programs via the \*(lq#!\*(rq executable interpreter mechanism. +.PP +For \*(PX compatibility, the +.B \-W +option may be used, followed by the name of a long option. .SH AWK PROGRAM EXECUTION .PP An \*(AK program consists of a sequence of pattern-action statements @@ -586,13 +592,16 @@ functions with command line programs. In addition, lines beginning with .B @include may be used to include other source files into your program, -making library use even easier. +making library use even easier. This is equivalent +to using the +.B \-i +option. .PP Lines beginning with .B @load may be used to load shared libraries into your program. This is equivalent to using the -.B \-l +.B \-l option. .PP The environment variable @@ -611,6 +620,17 @@ If a file name given to the .B \-f option contains a \*(lq/\*(rq character, no path search is performed. .PP +The environment variable +.B AWKLIBPATH +specifies a search path to use when finding source files named with +the +.B \-l +option. If this variable does not exist, the default path is +\fB".:/usr/local/lib/gawk"\fR. +(The actual directory may vary, depending upon how +.I gawk +was built and installed.) +.PP .I Gawk executes \*(AK programs in the following order. First, @@ -624,7 +644,7 @@ Then, .I gawk executes the code in the .B BEGIN -block(s) (if any), +rule(s) (if any), and then proceeds to read each file named in the .B ARGV @@ -642,7 +662,7 @@ will be assigned the value .IR val . (This happens after any .B BEGIN -block(s) have been run.) +rule(s) have been run.) 
Command line variable assignment is most useful for dynamically assigning values to the variables \*(AK uses to control how input is broken into fields and records. @@ -673,16 +693,17 @@ For each record in the input, tests to see if it matches any .I pattern in the \*(AK program. -For each pattern that the record matches, the associated -.I action -is executed. +For each pattern that the record matches, +.I gawk +executes the associated +.IR action . The patterns are tested in the order they occur in the program. .PP Finally, after all the input is exhausted, .I gawk executes the code in the .B END -block(s) (if any). +rule(s) (if any). .SS Command Line Directories .PP According to POSIX, files named on the @@ -710,6 +731,10 @@ first used. Their values are either floating-point numbers or strings, or both, depending upon how they are used. \*(AK also has one dimensional arrays; arrays with multiple dimensions may be simulated. +.I Gawk +provides true arrays of arrays; see +.BR Arrays , +below. Several pre-defined variables are set as a program runs; these are described as needed and summarized below. .SS Records @@ -799,7 +824,7 @@ or overrides the use of .BR FPAT . .PP -Each field in the input record may be referenced by its position, +Each field in the input record may be referenced by its position: .BR $1 , .BR $2 , and so on. @@ -821,14 +846,14 @@ The variable .B NF is set to the total number of fields in the input record. .PP -References to non-existent fields (i.e. fields after +References to non-existent fields (i.e., fields after .BR $NF ) produce the null-string. 
However, assigning to a non-existent field (e.g., .BR "$(NF+2) = 5" ) increases the value of .BR NF , -creates any intervening fields with the null string as their value, and +creates any intervening fields with the null string as their values, and causes the value of .B $0 to be recomputed, with the fields being separated by the value of @@ -891,7 +916,7 @@ The conversion format for numbers, \fB"%.6g"\fR, by default. An array containing the values of the current environment. The array is indexed by the environment variables, each element being the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be -.BR /home/arnold ). +\fB"/home/arnold"\fR). Changing this array does not affect the environment seen by programs which .I gawk spawns via redirection or the @@ -931,7 +956,7 @@ However, .B FILENAME is undefined inside the .B BEGIN -block +rule (unless set by .BR getline ). .TP @@ -958,13 +983,13 @@ The input field separator, a space by default. See above. .TP .B FUNCTAB -An array whose indices are the names of all the user-defined +An array whose indices and corresponding values +are the names of all the user-defined or extension functions in the program. .BR NOTE : -The array values cannot currently be used. -Also, you may not use the +You may not use the .B delete -statment with the +statement with the .B FUNCTAB array. .TP @@ -1063,7 +1088,7 @@ The following elements are guaranteed to be available: .RS .TP \w'\fBPROCINFO["version"]\fR'u+1n \fBPROCINFO["egid"]\fP -the value of the +The value of the .IR getegid (2) system call. .TP @@ -1072,7 +1097,7 @@ The default time format string for .BR strftime() . .TP \fBPROCINFO["euid"]\fP -the value of the +The value of the .IR geteuid (2) system call. .TP @@ -1089,7 +1114,13 @@ is in effect. .TP \fBPROCINFO["identifiers"]\fP A subarray, indexed by the names of all identifiers used in the -text of the AWK program. For each identifier, the value of the element is one of the following: +text of the AWK program. 
+The values indicate what +.I gawk +knows about the identifiers after it has finished parsing the program; they are +.I not +updated while the program runs. +For each identifier, the value of the element is one of the following: .RS .TP \fB"array"\fR @@ -1110,28 +1141,23 @@ doesn't know yet). \fB"user"\fR The identifier is a user-defined function. .RE -The values indicate what -.I gawk -knows about the identifiers after it has finished parsing the program; they are -.I not -updated while the program runs. .TP \fBPROCINFO["gid"]\fP -the value of the +The value of the .IR getgid (2) system call. .TP \fBPROCINFO["pgrpid"]\fP -the process group ID of the current process. +The process group ID of the current process. .TP \fBPROCINFO["pid"]\fP -the process ID of the current process. +The process ID of the current process. .TP \fBPROCINFO["ppid"]\fP -the parent process ID of the current process. +The parent process ID of the current process. .TP \fBPROCINFO["uid"]\fP -the value of the +The value of the .IR getuid (2) system call. .TP @@ -1157,11 +1183,11 @@ and \fB"@unsorted"\fR. The value can also be the name of any comparison function defined as follows: -.PP -.RS +.sp +.in +5m \fBfunction cmp_func(i1, v1, i2, v2)\fR -.RE -.PP +.in -5m +.sp where .I i1 and @@ -1176,7 +1202,7 @@ It should return a number less than, equal to, or greater than 0, depending on how the elements of the array are to be ordered. .TP \fBPROCINFO["input", "READ_TIMEOUT"]\fP -specifies the timeout in milliseconds for reading data from +The timeout in milliseconds for reading data from .IR input , where .I input @@ -1184,22 +1210,38 @@ is a redirection string or a filename. A value of zero or less than zero means no timeout. .TP \fBPROCINFO["mpfr_version"]\fP -the version of the GNU MPFR library used for arbitrary precision +The version of the GNU MPFR library used for arbitrary precision number support in .IR gawk . +This entry is not present if MPFR support is not compiled into +.IR gawk . 
.TP \fBPROCINFO["gmp_version"]\fP -the version of the GNU MP library used for arbitrary precision +The version of the GNU MP library used for arbitrary precision number support in .IR gawk . +This entry is not present if MPFR support is not compiled into +.IR gawk . .TP \fBPROCINFO["prec_max"]\fP -the maximum precision supported by the GNU MPFR library for +The maximum precision supported by the GNU MPFR library for arbitrary precision floating-point numbers. +This entry is not present if MPFR support is not compiled into +.IR gawk . .TP \fBPROCINFO["prec_min"]\fP -the minimum precision allowed by the GNU MPFR library for +The minimum precision allowed by the GNU MPFR library for arbitrary precision floating-point numbers. +This entry is not present if MPFR support is not compiled into +.IR gawk . +.TP +\fBPROCINFO["api_major"]\fP +The major version of the extension API. +This entry is not present if loading dynamic extensions is not available. +.TP +\fBPROCINFO["api_minor"]\fP +The minor version of the extension API. +This entry is not present if loading dynamic extensions is not available. .TP \fBPROCINFO["version"]\fP the version of @@ -1248,15 +1290,17 @@ elements, by default \fB"\e034"\fR. An array whose indices are the names of all currently defined global variables and arrays in the program. The array may be used for indirect access to read or write the value of a variable: -.PP -.RS +.sp .ft B +.nf +.in +5m foo = 5 SYMTAB["foo"] = 4 print foo # prints 4 +.fi .ft R -.RE -.PP +.in -5m +.sp The .B isarray() function may be used to test if an element in @@ -1264,7 +1308,7 @@ function may be used to test if an element in is an array. You may not use the .B delete -statment with the +statement with the .B SYMTAB array. .TP @@ -1296,7 +1340,7 @@ x[i, j, k] = "hello, world\en" assigns the string \fB"hello, world\en"\fR to the element of the array .B x which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in \*(AK -are associative, i.e. 
indexed by string values. +are associative, i.e., indexed by string values. .PP The special operator .B in @@ -1333,6 +1377,7 @@ just by specifying the array name without a subscript. supports true multidimensional arrays. It does not require that such arrays be ``rectangular'' as in C or C++. For example: +.sp .RS .ft B .nf @@ -1342,6 +1387,18 @@ a[2][2] = 7 .fi .ft .RE +.PP +.BR NOTE : +You may need to tell +.I gawk +that an array element is really a subarray in order to use it where +.I gawk +expects an array (such as in the second argument to +.BR split() ). +You can do this by creating an element in the subarray and then +deleting it with the +.B delete +statement. .SS Variable Typing And Conversion .PP Variables and fields @@ -1353,6 +1410,9 @@ it will be treated as a string. To force a variable to be treated as a number, add 0 to it; to force it to be treated as a string, concatenate it with the null string. .PP +Uninitialized variables have the numeric value 0 and the string value "" +(the null, or empty, string). +.PP When a string must be converted to a number, the conversion is accomplished using .IR strtod (3). @@ -1383,7 +1443,7 @@ has a string value of \fB"12"\fR and not \fB"12.00"\fR. .BR NOTE : When operating in POSIX mode (such as with the .B \-\^\-posix -command line option), +option), beware that locale settings may interfere with the way decimal numbers are treated: the decimal separator of the numbers you are feeding to @@ -1420,9 +1480,6 @@ The basic idea is that .IR "user input" , and only user input, that looks numeric, should be treated that way. -.PP -Uninitialized variables have the numeric value 0 and the string value "" -(the null, or empty, string). .SS Octal and Hexadecimal Constants You may use C-style octal and hexadecimal constants in your AWK program source code. @@ -1448,28 +1505,28 @@ A literal backslash. The \*(lqalert\*(rq character; usually the \s-1ASCII\s+1 \s-1BEL\s+1 character. .TP .B \eb -backspace. +Backspace. 
.TP .B \ef -form-feed. +Form-feed. .TP .B \en -newline. +Newline. .TP .B \er -carriage return. +Carriage return. .TP .B \et -horizontal tab. +Horizontal tab. .TP .B \ev -vertical tab. +Vertical tab. .TP .BI \ex "\^hex digits" The character represented by the string of hexadecimal digits following the .BR \ex . -As in \*(AN C, all following hexadecimal digits are considered part of +As in ISO C, all following hexadecimal digits are considered part of the escape sequence. (This feature should tell us something about language design by committee.) E.g., \fB"\ex1B"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character. @@ -1568,10 +1625,10 @@ The action parts of all patterns are merged as if all the statements had been written in a single .B BEGIN -block. They are executed before any +rule. They are executed before any of the input is read. Similarly, all the .B END -blocks are merged, +rules are merged, and executed when all the input is exhausted (or when an .B exit statement is executed). @@ -1594,7 +1651,7 @@ Inside the .B BEGINFILE rule, the value of .B ERRNO -will be the empty string if the file could be opened successfully. +will be the empty string if the file was opened successfully. Otherwise, there is some problem with the file and the code should use .B nextfile @@ -1646,58 +1703,59 @@ Regular expressions are the extended kind found in They are composed of characters as follows: .TP "\w'\fB[^\fIabc.\|.\|.\fB]\fR'u+2n" .I c -matches the non-metacharacter +Matches the non-metacharacter .IR c . .TP .I \ec -matches the literal character +Matches the literal character .IR c . .TP .B . -matches any character +Matches any character .I including newline. .TP .B ^ -matches the beginning of a string. +Matches the beginning of a string. .TP .B $ -matches the end of a string. +Matches the end of a string. .TP .BI [ abc.\|.\|. ] -character list, matches any of the characters +A character list: matches any of the characters .IR abc.\|.\|. . 
+You may include a range of characters by separating them with a dash. .TP \fB[^\fIabc.\|.\|.\fB]\fR -negated character list, matches any character except +A negated character list: matches any character except .IR abc.\|.\|. . .TP .IB r1 | r2 -alternation: matches either +Alternation: matches either .I r1 or .IR r2 . .TP .I r1r2 -concatenation: matches +Concatenation: matches .IR r1 , and then .IR r2 . .TP .IB r\^ + -matches one or more +Matches one or more .IR r\^ "'s." .TP .IB r * -matches zero or more +Matches zero or more .IR r\^ "'s." .TP .IB r\^ ? -matches zero or one +Matches zero or one .IR r\^ "'s." .TP .BI ( r ) -grouping: matches +Grouping: matches .IR r . .TP .PD 0 @@ -1728,37 +1786,38 @@ is repeated at least times. .TP .B \ey -matches the empty string at either the beginning or the +Matches the empty string at either the beginning or the end of a word. .TP .B \eB -matches the empty string within a word. +Matches the empty string within a word. .TP .B \e< -matches the empty string at the beginning of a word. +Matches the empty string at the beginning of a word. .TP .B \e> -matches the empty string at the end of a word. +Matches the empty string at the end of a word. .TP .B \es -matches any whitespace character. +Matches any whitespace character. .TP .B \eS -matches any nonwhitespace character. +Matches any nonwhitespace character. .TP .B \ew -matches any word-constituent character (letter, digit, or underscore). +Matches any word-constituent character (letter, digit, or underscore). .TP .B \eW -matches any character that is not word-constituent. +Matches any character that is not word-constituent. .TP .B \e` -matches the empty string at the beginning of a buffer (string). +Matches the empty string at the beginning of a buffer (string). .TP .B \e' -matches the empty string at the end of a buffer. +Matches the empty string at the end of a buffer. 
.PP -The escape sequences that are valid in string constants (see below) +The escape sequences that are valid in string constants (see +.BR "String Constants" ) are also valid in regular expressions. .PP .I "Character classes" @@ -1907,7 +1966,7 @@ interprets characters in regular expressions. No options In the default case, .I gawk -provide all the facilities of +provides all the facilities of \*(PX regular expressions and the \*(GN regular expression operators described above. .TP .B \-\^\-posix @@ -1918,7 +1977,7 @@ matches a literal .BR w ). .TP .B \-\^\-traditional -Traditional Unix +Traditional \*(UX .I awk regular expressions are matched. The \*(GN operators are not special, and interval expressions are not available. @@ -1940,7 +1999,7 @@ and input/output statements available are patterned after those in C. .SS Operators .PP -The operators in \*(AK, in order of decreasing precedence, are +The operators in \*(AK, in order of decreasing precedence, are: .PP .TP "\w'\fB*= /= %= ^=\fR'u+1n" .BR ( \&.\|.\|. ) @@ -1992,7 +2051,7 @@ Only use one on the right-hand side. The expression has the same meaning as \fB(($0 ~ /foo/) ~ \fIexp\fB)\fR. This is usually .I not -what was intended. +what you want. .TP .B in Array membership. @@ -2068,7 +2127,8 @@ Set from next input record; set .BR NF , .BR NR , -.BR FNR . +.BR FNR , +.BR RT . .TP .BI "getline <" file Set @@ -2076,20 +2136,23 @@ Set from next record of .IR file ; set -.BR NF . +.BR NF , +.BR RT . .TP .BI getline " var" Set .I var from next input record; set .BR NR , -.BR FNR . +.BR FNR , +.BR RT . .TP .BI getline " var" " <" file Set .I var from next record of -.IR file . +.IR file , +.BR RT . .TP \fIcommand\fB | getline \fR[\fIvar\fR] Run @@ -2098,7 +2161,8 @@ piping the output either into .B $0 or .IR var , -as above. +as above, and +.BR RT . .TP \fIcommand\fB |& getline \fR[\fIvar\fR] Run @@ -2108,7 +2172,8 @@ piping the output either into .B $0 or .IR var , -as above. +as above, and +.BR RT . 
Co-processes are a .I gawk extension. @@ -2120,9 +2185,12 @@ below.) .B next Stop processing the current input record. The next input record is read and processing starts over with the first pattern in the -\*(AK program. If the end of the input data is reached, the +\*(AK program. +Upon reaching the end of the input data, +.I gawk +executes any .B END -block(s), if any, are executed. +rule(s). .TP .B "nextfile" Stop processing the current input file. The next input record read @@ -2133,33 +2201,32 @@ and are updated, .B FNR is reset to 1, and processing starts over with the first pattern in the -\*(AK program. If the end of the input data is reached, the +\*(AK program. +Upon reaching the end of the input data, +.I gawk +executes any .B END -block(s), if any, are executed. +rule(s). .TP .B print Print the current record. -The output record is terminated with the value of the -.B ORS -variable. +The output record is terminated with the value of +.BR ORS . .TP .BI print " expr-list" Print expressions. -Each expression is separated by the value of the -.B OFS -variable. -The output record is terminated with the value of the -.B ORS -variable. +Each expression is separated by the value of +.BR OFS . +The output record is terminated with the value of +.BR ORS . .TP .BI print " expr-list" " >" file Print expressions on .IR file . -Each expression is separated by the value of the -.B OFS -variable. The output record is terminated with the value of the -.B ORS -variable. +Each expression is separated by the value of +.BR OFS . +The output record is terminated with the value of +.BR ORS . .TP .BI printf " fmt, expr-list" Format and print. @@ -2207,10 +2274,10 @@ The command returns 1 on success, 0 on end of file, and \-1 on an error. Upon an error, .B ERRNO -contains a string describing the problem. +is set to a string describing the problem. 
.PP .BR NOTE : -Failure in opening a two-way socket will result in a non-fatal error being +Failure in opening a two-way socket results in a non-fatal error being returned to the calling function. If using a pipe, co-process, or socket to .BR getline , or from @@ -2247,7 +2314,7 @@ A decimal number (the integer part). .TP .BR %e , " %E" A floating point number of the form -.BR [\-]d.dddddde[+\^\-]dd . +[\fB\-\fP]\fId\fB.\fIdddddd\^\fBe\fR[\fB+\-\fR]\fIdd\fR. The .B %E format uses @@ -2257,7 +2324,7 @@ instead of .TP .BR %f , " %F" A floating point number of the form -.BR [\-]ddd.dddddd . +[\fB\-\fP]\fIddd\fB.\fIdddddd\fR. If the system library supports it, .B %F is available as well. This is like @@ -2378,9 +2445,9 @@ value to be printed. .TP .I width The field should be padded to this width. The field is normally padded -with spaces. If the +with spaces. With the .B 0 -flag has been used, it is padded with zeroes. +flag, it is padded with zeroes. .TP .BI \&. prec A number that specifies the precision to use when printing. @@ -2415,15 +2482,15 @@ The dynamic .I width and .I prec -capabilities of the \*(AN C +capabilities of the ISO C .B printf() routines are supported. A .B * in place of either the -.B width +.I width or -.B prec +.I prec specifications causes their values to be taken from the argument list to .B printf @@ -2454,6 +2521,9 @@ parent process (usually the shell). These file names may also be used on the command line to name data files. The filenames are: .TP "\w'\fB/dev/stdout\fR'u+1n" +.B \- +The standard input. +.TP .B /dev/stdin The standard input. .TP @@ -2560,7 +2630,8 @@ Return the sine of which is in radians. .TP .BI sqrt( expr ) -The square root function. +Return the square root of +.IR expr . .TP \&\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR Use @@ -2568,7 +2639,7 @@ Use as the new seed for the random number generator. If no .I expr is provided, use the time of day. 
-The return value is the previous seed for the random +Return the previous seed for the random number generator. .SS String Functions .PP @@ -2593,7 +2664,7 @@ with sequential integers starting with 1. If the optional destination array .I d -is specified, then +is specified, first duplicate .I s into @@ -2789,11 +2860,11 @@ Element values are the portions of that matched .IR r . The value of -.I seps[i] +.BI seps[ i ] is the separator that appeared in front of -.IR a[i+1] . -If +.BI a[ i +1]\fR. +\&\fRIf .I r is omitted, .B FPAT @@ -2826,33 +2897,33 @@ The arrays and .I seps are cleared first. -.I seps[i] +.BI seps[ i ] is the field separator matched by .I r between -.I a[i] +.BI a[ i ] and -.IR a[i+1] . -If +.BI a[ i +1]\fR. +\&\fRIf .I r is a single space, then leading whitespace in .I s goes into the extra array element -.I seps[0] +.B seps[0] and trailing whitespace goes into the extra array element -.IR seps[n] , +.BI seps[ n ]\fR, where .I n -is the return value of -.IR "split(s, a, r, seps)" . +is the return value of +.BI split( s ", " a ", " r ", " seps )\fR. Splitting behaves identically to field splitting, described above. .TP .BI sprintf( fmt , " expr-list" ) -Prints +Print .I expr-list according to .IR fmt , -and returns the resulting string. +and return the resulting string. .TP .BI strtonum( str ) Examine @@ -2863,10 +2934,8 @@ If begins with a leading .BR 0 , -.B strtonum() -assumes that -.I str -is an octal number. +treat it +as an octal number. If .I str begins @@ -2874,11 +2943,9 @@ with a leading .B 0x or .BR 0X , -.B strtonum() -assumes that -.I str -is a hexadecimal number. -Otherwise, decimal is assumed. +treat it +as a hexadecimal number. +Otherwise, assume it is a decimal number. .TP \fBsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR Just like @@ -2991,7 +3058,7 @@ The default format is available in .BR PROCINFO["strftime"] . 
See the specification for the .B strftime() -function in \*(AN C for the format conversions that are +function in ISO C for the format conversions that are guaranteed to be available. .TP .B systime() @@ -3053,7 +3120,7 @@ For full details, see \*(EP. Specify the directory where .I gawk looks for the -.B \&.mo +.B \&.gmo files, in case they will not or cannot be placed in the ``standard'' locations (e.g., during testing). @@ -3097,7 +3164,7 @@ You must also supply a text domain. Use .B TEXTDOMAIN if you want to use the current domain. .TP -\fBdcngettext(\fIstring1 \fR, \fIstring2 \fR, \fInumber \fR[\fB, \fIdomain \fR[\fB, \fIcategory\fR]]\fB)\fR +\fBdcngettext(\fIstring1\fB, \fIstring2\fB, \fInumber \fR[\fB, \fIdomain \fR[\fB, \fIcategory\fR]]\fB)\fR Return the plural form used for .I number of the translation of @@ -3207,7 +3274,8 @@ Calling an undefined function at run time is a fatal error. The word .B func may be used in place of -.BR function . +.BR function , +although this is deprecated. .SH DYNAMICALLY LOADING NEW FUNCTIONS You can dynamically add new built-in functions to the running .I gawk @@ -3269,16 +3337,16 @@ action to assign a value to the .B TEXTDOMAIN variable to set the text domain to a name associated with your program: .sp -.RS +.in +5m .ft B BEGIN { TEXTDOMAIN = "myprog" } .ft R -.RE +.in -5m .sp This allows .I gawk to find the -.B \&.mo +.B \&.gmo file associated with your program. Without this step, .I gawk @@ -3301,12 +3369,12 @@ functions in your program, as appropriate. Run .B "gawk \-\^\-gen\-pot \-f myprog.awk > myprog.pot" to generate a -.B \&.po +.B \&.pot file for your program. .TP 5. Provide appropriate translations, and build and install the corresponding -.B \&.mo +.B \&.gmo files. .PP The internationalization features are described in full detail in \*(EP. @@ -3314,13 +3382,13 @@ The internationalization features are described in full detail in \*(EP. 
A primary goal for .I gawk is compatibility with the \*(PX standard, as well as with the -latest version of \*(UX +latest version of Brian Kernighan's .IR awk . To this end, .I gawk incorporates the following user visible features which are not described in the \*(AK book, -but are part of the Bell Laboratories version of +but are part of the Brian Kernighan's version of .IR awk , and are in the \*(PX standard. .PP @@ -3328,19 +3396,20 @@ The book indicates that command line variable assignment happens when .I awk would otherwise open the argument as a file, which is after the .B BEGIN -block is executed. However, in earlier implementations, when such an +rule is executed. However, in earlier implementations, when such an assignment appeared before any file names, the assignment would happen .I before the .B BEGIN -block was run. Applications came to depend on this \*(lqfeature.\*(rq +rule was run. Applications came to depend on this \*(lqfeature.\*(rq When .I awk was changed to match its documentation, the .B \-v option for assigning variables before program execution was added to accommodate applications that depended upon the old behavior. -(This feature was agreed upon by both the Bell Laboratories and the \*(GN developers.) +(This feature was agreed upon by both the Bell Laboratories +and the \*(GN developers.) .PP When processing arguments, .I gawk @@ -3378,7 +3447,7 @@ and fed back into the Bell Laboratories version); the .B tolower() and .B toupper() -built-in functions (from the Bell Laboratories version); and the \*(AN C conversion specifications in +built-in functions (from the Bell Laboratories version); and the ISO C conversion specifications in .B printf (done first in the Bell Laboratories version). .SH HISTORICAL FEATURES @@ -3413,7 +3482,7 @@ issues a warning about its use if is specified on the command line. .SH GNU EXTENSIONS .I Gawk -has a number of extensions to \*(PX +has a too-large number of extensions to \*(PX .IR awk . 
They are described in this section. All the extensions described here can be disabled by @@ -3441,12 +3510,19 @@ environment variable is not special. .\" POSIX and language recognition issues .TP \(bu -There is no facility for doing file inclusion +There is no facility for doing file inclusion .RI ( gawk 's .B @include mechanism). .TP \(bu +There is no facility for dynamically adding new functions +written in C +.RI ( gawk 's +.B @load +mechanism). +.TP +\(bu The .B \ex escape sequence. @@ -3550,16 +3626,17 @@ and The ability to pass an array to .BR length() . .\" New keywords or changes to keywords -.TP -\(bu -The use of -.BI delete " array" -to delete the entire contents of an array. -.TP -\(bu -The use of -.B "nextfile" -to abandon processing of the current input file. +.\" (As of 2012, these are in POSIX) +.\" .TP +.\" \(bu +.\" The use of +.\" .BI delete " array" +.\" to delete the entire contents of an array. +.\" .TP +.\" \(bu +.\" The use of +.\" .B "nextfile" +.\" to abandon processing of the current input file. .\" New functions .TP \(bu @@ -3587,12 +3664,6 @@ functions. .TP \(bu Localizable strings. -.\" Extending gawk -.TP -\(bu -Adding new built-in functions dynamically with the -.B extension() -function. .PP The \*(AK book does not define the return value of the .B close() @@ -3661,15 +3732,15 @@ The environment variable can be used to provide a list of directories that .I gawk searches when looking for files named via the -.B \-f -, -.B \-\^\-file -, +.BR \-f , +.RB \-\^\-file , .B \-i and .B \-\^\-include options. If the initial search fails, the path is searched again after -appending ".awk" to the filename. +appending +.B \&.awk +to the filename. .PP The .B AWKLIBPATH @@ -3687,10 +3758,11 @@ environment variable can be used to specify a timeout in milliseconds for reading input from a terminal, pipe or two-way communication including sockets. 
.PP -For socket communication, two special environment variables can be used to control the number of retries -.RB ( GAWK_SOCK_RETRIES ), -and the interval between retries -.RB ( GAWK_MSEC_SLEEP ). +For connection to a remote host via socket, +.B GAWK_SOCK_RETRIES +controls the number of retries, and +.B GAWK_MSEC_SLEEP +and the interval between retries. The interval is in milliseconds. On systems that do not support .IR usleep (3), the value is rounded up to an integral number of seconds. @@ -3759,22 +3831,12 @@ compatible with the new version of \*(UX .IR awk . Arnold Robbins is the current maintainer. .PP -The initial DOS port was done by Conrad Kwok and Scott Garfinkle. -Scott Deifik maintains the port to MS-DOS using DJGPP. -Eli Zaretskii maintains the port to MS-Windows using MinGW. -Pat Rankin did the -port to VMS, and Michal Jaegermann did the port to the Atari ST. -The port to OS/2 was done by Kai Uwe Rommel, with contributions and -help from Darrel Hankerson. -Andreas Buening now maintains the OS/2 port. -The late Fred Fish supplied support for the Amiga, -and Martin Brown provided the BeOS port. -Stephen Davies provided the original Tandem port, and -Matthew Woehlke provided changes for Tandem's POSIX-compliant systems. -Dave Pitts provided the port to z/OS. +See \*(EP for a full list of the contributors to +.I gawk +and its documentation. .PP See the -.I README +.B README file in the .I gawk distribution for up-to-date information about maintainers @@ -3892,13 +3954,13 @@ Run an external command for particular lines of data: .ft R .fi .SH ACKNOWLEDGEMENTS -Brian Kernighan of Bell Laboratories +Brian Kernighan provided valuable assistance during testing and debugging. We thank him. .SH COPYING PERMISSIONS Copyright \(co 1989, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002, 2003, 2004, 2005, 2007, 2009, -2010, 2011, 2012 +2010, 2011, 2012, 2013 Free Software Foundation, Inc. 
.PP Permission is granted to make and distribute verbatim copies of |