Diffstat (limited to 'doc/gawktexi.in')
-rw-r--r-- | doc/gawktexi.in | 310 |
1 files changed, 281 insertions, 29 deletions
diff --git a/doc/gawktexi.in b/doc/gawktexi.in index 08a63960..7596bd95 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -1494,9 +1494,11 @@ default @command{awk} utility. A more modern @command{awk} lives in if you try the test program: @example +@group $ @kbd{awk 1 /dev/null} @error{} awk: syntax error near line 1 @error{} awk: bailing out near line 1 +@end group @end example @noindent @@ -2901,10 +2903,12 @@ for the single- and double-quote characters, like so: @example +@group $ @kbd{awk 'BEGIN @{ print "Here is a single quote <\47>" @}'} @print{} Here is a single quote <'> $ @kbd{awk 'BEGIN @{ print "Here is a double quote <\42>" @}'} @print{} Here is a double quote <"> +@end group @end example @noindent @@ -3176,8 +3180,10 @@ action---so it uses the default action, printing the record. Print the length of the longest input line: @example +@group awk '@{ if (length($0) > max) max = length($0) @} END @{ print max @}' data +@end group @end example The code associated with @code{END} executes after all @@ -3494,11 +3500,13 @@ starts a comment, it ignores @emph{everything} on the rest of the line. For example: @example +@group $ @kbd{gawk 'BEGIN @{ print "dont panic" # a friendly \} > @kbd{ BEGIN rule} > @kbd{@}'} @error{} gawk: cmd. line:2: BEGIN rule @error{} gawk: cmd. line:2: ^ syntax error +@end group @end example @noindent @@ -4695,10 +4703,12 @@ The files to be included may be nested; e.g., given a third script, namely @file{test3}: @example +@group @@include "test2" BEGIN @{ print "This is script test3." @} +@end group @end example @noindent @@ -4785,8 +4795,10 @@ $ @kbd{gawk '@@load "ordchr"; BEGIN @{print chr(65)@}'} This is equivalent to the following example: @example +@group $ @kbd{gawk -lordchr 'BEGIN @{print chr(65)@}'} @print{} A +@end group @end example @noindent @@ -6282,8 +6294,10 @@ with each @samp{u} changed to a newline. 
Here are the results of running the program on @file{mail-list}: @example +@group $ @kbd{awk 'BEGIN @{ RS = "u" @}} > @kbd{@{ print $0 @}' mail-list} +@end group @print{} Amelia 555-5553 amelia.zodiac @print{} sq @print{} e@@gmail.com F @@ -6440,9 +6454,11 @@ matches either a newline or a series of one or more uppercase letters with optional leading and/or trailing whitespace: @example +@group $ @kbd{echo record 1 AAAA record 2 BBBB record 3 |} > @kbd{gawk 'BEGIN @{ RS = "\n|( *[[:upper:]]+ *)" @}} > @kbd{@{ print "Record =", $0,"and RT = [" RT "]" @}'} +@end group @print{} Record = record 1 and RT = [ AAAA ] @print{} Record = record 2 and RT = [ BBBB ] @print{} Record = record 3 and RT = [ @@ -6826,8 +6842,10 @@ values of the fields and @code{OFS}. To do this, use the seemingly innocuous assignment: @example +@group $1 = $1 # force record to be reconstituted print $0 # or whatever else with $0 +@end group @end example @noindent @@ -7596,16 +7614,20 @@ Putting this to use, here is a simple program to parse the data: @example @c file eg/misc/simple-csv.awk +@group BEGIN @{ FPAT = "([^,]+)|(\"[^\"]+\")" @} +@end group +@group @{ print "NF = ", NF for (i = 1; i <= NF; i++) @{ printf("$%d = <%s>\n", i, $i) @} @} +@end group @c endfile @end example @@ -8046,6 +8068,7 @@ read-a-line-and-check-each-rule loop of @command{awk} never sees it. 
The following example swaps every two lines of input: @example +@group @{ if ((getline tmp) > 0) @{ print tmp @@ -8053,6 +8076,7 @@ The following example swaps every two lines of input: @} else print $0 @} +@end group @end example @noindent @@ -8195,6 +8219,7 @@ lines that begin with @samp{@@execute}, which are replaced by the output produced by running the rest of the line as a shell command: @example +@group @{ if ($1 == "@@execute") @{ tmp = substr($0, 10) # Remove "@@execute" @@ -8204,6 +8229,7 @@ produced by running the rest of the line as a shell command: @} else print @} +@end group @end example @noindent @@ -8507,12 +8533,14 @@ For example, a TCP client can decide to give up on receiving any response from the server after a certain amount of time: @example +@group Service = "/inet/tcp/0/localhost/daytime" PROCINFO[Service, "READ_TIMEOUT"] = 100 if ((Service |& getline) > 0) print $0 else if (ERRNO != "") print ERRNO +@end group @end example Here is how to read interactively from the user@footnote{This assumes @@ -8854,10 +8882,12 @@ newlines: @end ifnotinfo @example +@group $ @kbd{awk 'BEGIN @{ print "line one\nline two\nline three" @}'} @print{} line one @print{} line two @print{} line three +@end group @end example @cindex fields, printing @@ -9109,12 +9139,14 @@ The output separator variables @code{OFS} and @code{ORS} have no effect on @code{printf} statements. For example: @example +@group $ @kbd{awk 'BEGIN @{} > @kbd{ORS = "\nOUCH!\n"; OFS = "+"} > @kbd{msg = "Don\47t Panic!"} > @kbd{printf "%s\n", msg} > @kbd{@}'} @print{} Don't Panic! +@end group @end example @noindent @@ -9637,9 +9669,11 @@ alone for now and let's hope no-one notices. 
@end ignore @example +@group awk '@{ print $1 > "names.unsorted" command = "sort -r > names.sorted" print $1 | command @}' mail-list +@end group @end example The unsorted list is written with an ordinary redirection, while @@ -9933,7 +9967,7 @@ The @var{protocol} is one of @samp{tcp} or @samp{udp}, and the other fields represent the other essential pieces of information for making a networking connection. These @value{FN}s are used with the @samp{|&} operator for communicating -with a coprocess +with @w{a coprocess} (@pxref{Two-way I/O}). This is an advanced feature, mentioned here only for completeness. Full discussion is delayed until @@ -10032,10 +10066,14 @@ it is good practice to use a variable to store the @value{FN} or command. The previous example becomes the following: @example +@group sortcom = "sort -r names" sortcom | getline foo +@end group +@group @dots{} close(sortcom) +@end group @end example @noindent @@ -10178,7 +10216,7 @@ if it fails. @float Table,table-close-pipe-return-values @caption{Return values from @code{close()} of a pipe} -@multitable @columnfractions .40 .60 +@multitable @columnfractions .50 .50 @headitem Situation @tab Return value from @code{close()} @item Normal exit of command @tab Command's exit status @item Death by signal of command @tab 256 + number of murderous signal @@ -10207,7 +10245,8 @@ disk) is a fatal error. @example $ @kbd{gawk 'BEGIN @{ print "hi" > "/no/such/file" @}'} -@error{} gawk: cmd. line:1: fatal: can't redirect to `/no/such/file' (No such file or directory) +@error{} gawk: cmd. line:1: fatal: can't redirect to `/no/such/file' (No +@error{} such file or directory) @end example @command{gawk} makes it possible to detect that an error has @@ -10638,6 +10677,7 @@ confusion can arise when attempting to use regexp constants as arguments to user-defined functions (@pxref{User-defined}). 
For example: @example +@group function mysub(pat, repl, str, global) @{ if (global) @@ -10646,13 +10686,16 @@ function mysub(pat, repl, str, global) sub(pat, repl, str) return str @} +@end group +@group @{ @dots{} text = "hi! hi yourself!" mysub(/hi/, "howdy", text, 1) @dots{} @} +@end group @end example @c @cindex automatic warnings @@ -10900,8 +10943,10 @@ is performed. If numeric values appear in string concatenation, they are converted to strings. Consider the following: @example +@group two = 2; three = 3 print (two three) + 4 +@end group @end example @noindent @@ -11374,10 +11419,14 @@ to it. In the following program fragment, the variable @code{foo} has a numeric value at first, and a string value later on: @example +@group foo = 1 print foo +@end group +@group foo = "bar" print foo +@end group @end example @noindent @@ -11449,16 +11498,20 @@ righthand expression. For example: @cindex Rankin, Pat @example +@group # Thanks to Pat Rankin for this example BEGIN @{ foo[rand()] += 5 for (x in foo) print x, foo[x] +@end group +@group bar[rand()] = bar[rand()] + 5 for (x in bar) print x, bar[x] @} +@end group @end example @cindex operators, assignment, evaluation order @@ -12134,10 +12187,12 @@ leave off one of the @samp{=} characters. The result is still valid @command{awk} code, but the program does not do what is intended: @example +@group if (a = b) # oops! should be a == b @dots{} else @dots{} +@end group @end example @noindent @@ -13089,8 +13144,10 @@ $ @kbd{awk '! /li/' mail-list} @print{} Bill 555-1675 bill.drowning@@hotmail.com A @print{} Camilla 555-2912 camilla.infusarum@@skynet.be R @print{} Fabius 555-1234 fabius.undevicesimus@@ucb.edu F +@group @print{} Martin 555-6480 martin.codicibus@@hotmail.com A @print{} Jean-Paul 555-2127 jeanpaul.campanorum@@nyu.edu R +@end group @end example @cindex @code{BEGIN} pattern, Boolean patterns and @@ -13481,10 +13538,12 @@ the variable's value into the program inside the script. 
For example, consider the following program: @example +@group printf "Enter search pattern: " read pattern awk "/$pattern/ "'@{ nmatches++ @} END @{ print nmatches, "found" @}' /path/to/data +@end group @end example @noindent @@ -13673,10 +13732,12 @@ the null string; otherwise, the condition is true. Refer to the following: @example +@group if (x % 2 == 0) print "x is even" else print "x is odd" +@end group @end example In this example, if the expression @samp{x % 2 == 0} is true (i.e., @@ -13998,6 +14059,7 @@ finds the smallest divisor of any integer, and also identifies prime numbers: @example +@group # find smallest divisor of num @{ num = $1 @@ -14005,11 +14067,14 @@ numbers: if (num % divisor == 0) break @} +@end group +@group if (num % divisor == 0) printf "Smallest divisor of %d is %d\n", num, divisor else printf "%d is prime\n", num @} +@end group @end example When the remainder is zero in the first @code{if} statement, @command{awk} @@ -14317,14 +14382,18 @@ using an @code{exit} statement with a nonzero argument, as shown in the following example: @example +@group BEGIN @{ if (("date" | getline date_now) <= 0) @{ print "Can't get system date" > "/dev/stderr" exit 1 @} +@end group +@group print "current date is", date_now close("date") @} +@end group @end example @quotation NOTE @@ -14622,6 +14691,7 @@ Unlike most @command{awk} arrays, In the following example: @example +@group $ @kbd{awk 'BEGIN @{} > @kbd{for (i = 0; i < ARGC; i++)} > @kbd{print ARGV[i]} @@ -14629,6 +14699,7 @@ $ @kbd{awk 'BEGIN @{} @print{} awk @print{} inventory-shipped @print{} mail-list +@end group @end example @noindent @@ -15053,12 +15124,14 @@ points out that it effectively gives @command{awk} data pointers. 
Consider his example: @example +@group # Indirect multiply of any variable by amount, return result function multiply(variable, amount) @{ return SYMTAB[variable] *= amount @} +@end group @end example @noindent @@ -15130,6 +15203,7 @@ presented the following program describing the information contained in @code{AR and @code{ARGV}: @example +@group $ @kbd{awk 'BEGIN @{} > @kbd{for (i = 0; i < ARGC; i++)} > @kbd{print ARGV[i]} @@ -15137,6 +15211,7 @@ $ @kbd{awk 'BEGIN @{} @print{} awk @print{} inventory-shipped @print{} mail-list +@end group @end example @noindent @@ -15733,8 +15808,10 @@ For example, this statement tests whether the array @code{frequencies} contains the index @samp{2}: @example +@group if (2 in frequencies) print "Subscript 2 is present." +@end group @end example Note that this is @emph{not} a test of whether the array @@ -15744,8 +15821,10 @@ There is no way to do that except to scan all the elements. Also, this (incorrect) alternative does: @example +@group if (frequencies[2] != "") print "Subscript 2 is present." +@end group @end example @node Assigning Elements @@ -15802,6 +15881,7 @@ all the lines. When this program is run with the following input: @example +@group @c file eg/misc/arraymax.data 5 I am the Five man 2 Who are you? The new number two! @@ -15809,17 +15889,20 @@ When this program is run with the following input: 1 Who is number one? 3 I three you. @c endfile +@end group @end example @noindent Its output is: @example +@group 1 Who is number one? 2 Who are you? The new number two! 3 I three you. 4 . . . 
And four on the floor 5 I am the Five man +@end group @end example If a line number is repeated, the last line with a given number overrides @@ -15828,11 +15911,13 @@ Gaps in the line numbers can be handled with an easy improvement to the program's @code{END} rule, as follows: @example +@group END @{ for (x = 1; x <= max; x++) if (x in arr) print arr[x] @} +@end group @end example @node Scanning an Array @@ -15852,8 +15937,10 @@ So @command{awk} has a special kind of @code{for} statement for scanning an array: @example +@group for (@var{var} in @var{array}) @var{body} +@end group @end example @noindent @@ -15874,12 +15961,15 @@ such words. for more information on the built-in function @code{length()}. @example +@group # Record a 1 for each word that is used at least once @{ for (i = 1; i <= NF; i++) used[$i] = 1 @} +@end group +@group # Find number of distinct words more than 10 characters long END @{ for (x in used) @{ @@ -15890,6 +15980,7 @@ END @{ @} print num_long_words, "words longer than 10 characters" @} +@end group @end example @noindent @@ -16277,9 +16368,11 @@ same as assigning it a null value (the empty string, @code{""}). For example: @example +@group foo[4] = "" if (4 in foo) print "This is printed, even though foo[4] is empty" +@end group @end example @cindex lint checking, array elements @@ -16434,22 +16527,26 @@ END @{ When given the input: @example +@group 1 2 3 4 5 6 2 3 4 5 6 1 3 4 5 6 1 2 4 5 6 1 2 3 +@end group @end example @noindent the program produces the following output: @example +@group 4 3 2 1 5 4 3 2 6 5 4 3 1 6 5 4 2 1 6 5 3 2 1 6 +@end group @end example @node Multiscanning @@ -16629,15 +16726,19 @@ you can often devise workarounds using control statements. 
For example, the following code prints the elements of our main array @code{a}: @example +@group for (i in a) @{ for (j in a[i]) @{ if (j == 3) @{ for (k in a[i][j]) print a[i][j][k] +@end group +@group @} else print a[i][j] @} @} +@end group @end example @noindent @@ -17087,9 +17188,11 @@ asort(a) results in the following contents of @code{a}: @example +@group a[1] = "cul" a[2] = "de" a[3] = "sac" +@end group @end example The @code{asorti()} function works similarly to @code{asort()}; however, @@ -18145,6 +18248,9 @@ a file or pipe that was opened for reading (such as with @code{getline}), or if @var{filename} is not an open file, pipe, or coprocess. In such a case, @code{fflush()} returns @minus{}1, as well. +@c end the table to let the sidebar take up the full width of the page. +@end table + @sidebar Interactive Versus Noninteractive Buffering @cindex buffering, interactive vs.@: noninteractive @@ -18188,6 +18294,7 @@ Here, no output is printed until after the @kbd{Ctrl-d} is typed, because it is all buffered and sent down the pipe to @command{cat} in one shot. 
@end sidebar +@table @asis @item @code{system(@var{command})} @cindexawkfunc{system} @cindex invoke shell command @@ -18909,7 +19016,7 @@ that illustrates the use of these functions: @example @group @c file eg/lib/bits2str.awk -# bits2str --- turn a byte into readable ones and zeros +# bits2str --- turn an integer into readable ones and zeros function bits2str(bits, data, mask) @{ @@ -18931,7 +19038,7 @@ function bits2str(bits, data, mask) @c this is a hack to make testbits.awk self-contained @ignore @c file eg/prog/testbits.awk -# bits2str --- turn a byte into readable 1's and 0's +# bits2str --- turn an integer into readable ones and zeros function bits2str(bits, data, mask) @{ @@ -18972,7 +19079,8 @@ $ @kbd{gawk -f testbits.awk} @print{} 123 = 01111011 @print{} 0123 = 01010011 @print{} 0x99 = 10011001 -@print{} compl(0x99) = 0x3fffffffffff66 = 00111111111111111111111111111111111111111111111101100110 +@print{} compl(0x99) = 0x3fffffffffff66 = +@print{} 00111111111111111111111111111111111111111111111101100110 @print{} lshift(0x99, 2) = 0x264 = 0000001001100100 @print{} rshift(0x99, 2) = 0x26 = 00100110 @end example @@ -19243,10 +19351,12 @@ entire program before starting to execute any of it. The definition of a function named @var{name} looks like this: @display +@group @code{function} @var{name}@code{(}[@var{parameter-list}]@code{)} @code{@{} @var{body-of-function} @code{@}} +@end group @end display @cindex names, functions @@ -19414,11 +19524,13 @@ This function deletes all the elements in an array (recall that the extra whitespace signifies the start of the local variable list): @example +@group function delarray(a, i) @{ for (i in a) delete a[i] @} +@end group @end example When working with arrays, it is often necessary to delete all the elements @@ -19625,10 +19737,12 @@ In addition, recursive calls create new arrays. 
Consider this example: @example +@group function some_func(p1, a) @{ if (p1++ > 3) return +@end group a[p1] = p1 @@ -19692,12 +19806,14 @@ this has no effect on any other variables. Thus, if @code{myfunc()} does this: @example +@group function myfunc(str) @{ print str str = "zzz" print str @} +@end group @end example @noindent @@ -19853,11 +19969,13 @@ function maxelt(vec, i, ret) return ret @} +@group # Load all fields of each record into nums. @{ for(i = 1; i <= NF; i++) nums[NR, i] = $i @} +@end group END @{ print maxelt(nums) @@ -20151,12 +20269,14 @@ first thing to do is write some comparison functions: @example @c file eg/prog/indirectcall.awk +@group # num_lt --- do a numeric less than comparison function num_lt(left, right) @{ return ((left + 0) < (right + 0)) @} +@end group # num_ge --- do a numeric greater than or equal to comparison @@ -20205,19 +20325,23 @@ names of the two comparison functions: @example @c file eg/prog/indirectcall.awk +@group # sort --- sort the data in ascending order and return it as a string function sort(first, last) @{ return do_sort(first, last, "num_lt") @} +@end group +@group # rsort --- sort the data in descending order and return it as a string function rsort(first, last) @{ return do_sort(first, last, "num_ge") @} +@end group @c endfile @end example @@ -20717,6 +20841,7 @@ been true but was not, and then it kills the program. 
In C, using @code{assert()} looks this: @example +@group #include <assert.h> int myfunc(int a, double b) @@ -20724,6 +20849,7 @@ int myfunc(int a, double b) assert(a <= 5 && b >= 17.1); @dots{} @} +@end group @end example If the assertion fails, the program prints a message similar to this: @@ -20881,9 +21007,10 @@ function round(x, ival, aval, fraction) @} @c endfile @c don't include test harness in the file that gets installed - +@group # test harness # @{ print $0, round($0) @} +@end group @end example @node Cliff Random Function @@ -21289,7 +21416,7 @@ if (length(contents) == 0) @end example This tests the result to see if it is empty or not. An equivalent -test would be @samp{contents == ""}. +test would be @samp{@w{contents == ""}}. @xref{Extension Sample Readfile} for an extension function that also reads an entire file into memory. @@ -21590,8 +21717,10 @@ $ @kbd{gawk -f rewind.awk -f test.awk data } @print{} data 1 a @print{} data 2 b @print{} data 3 c +@group @print{} data 4 d @print{} data 5 e +@end group @end example @node File Checking @@ -22806,8 +22935,10 @@ function getgrent() _gr_init() if (++_gr_count in _gr_bycount) return _gr_bycount[_gr_count] +@group return "" @} +@end group @c endfile @end example @@ -23337,10 +23468,12 @@ list of fields or characters: if (by_fields == 0 && by_chars == 0) by_fields = 1 # default +@group if (fieldlist == "") @{ print "cut: needs list for -c or -f" > "/dev/stderr" exit 1 @} +@end group if (by_fields) set_fieldlist() @@ -23681,8 +23814,10 @@ function endfile(file) print fcount @} +@group total += fcount @} +@end group @c endfile @end example @@ -23839,11 +23974,15 @@ BEGIN @{ pw = getpwuid(uid) pr_first_field(pw) +@group if (euid != uid) @{ printf(" euid=%d", euid) pw = getpwuid(euid) +@end group +@group pr_first_field(pw) @} +@end group printf(" gid=%d", gid) pw = getgrgid(gid) @@ -23971,14 +24110,17 @@ BEGIN @{ # test argv in case reading from stdin instead of file if (i in ARGV) i++ # skip datafile name 
+@group if (i in ARGV) @{ outfile = ARGV[i] ARGV[i] = "" @} - +@end group +@group s1 = s2 = "a" out = (outfile s1 s2) @} +@end group @c endfile @end example @@ -24134,11 +24276,15 @@ line into each file on the command line, and then to the standard output: It is also possible to write the loop this way: @example +@group for (i in copy) if (append) print >> copy[i] +@end group +@group else print > copy[i] +@end group @end example @noindent @@ -24289,10 +24435,12 @@ BEGIN @{ usage() @} +@group if (ARGV[Optind] ~ /^\+[[:digit:]]+$/) @{ charcount = substr(ARGV[Optind], 2) + 0 Optind++ @} +@end group for (i = 1; i < Optind; i++) ARGV[i] = "" @@ -24326,10 +24474,12 @@ strings are then compared and @code{are_equal()} returns the result: @example @c file eg/prog/uniq.awk +@group function are_equal( n, m, clast, cline, alast, aline) @{ if (fcount == 0 && charcount == 0) return (last == $0) +@end group if (fcount > 0) @{ n = split(last, alast) @@ -24344,9 +24494,11 @@ function are_equal( n, m, clast, cline, alast, aline) clast = substr(clast, charcount + 1) cline = substr(cline, charcount + 1) @} +@group return (clast == cline) @} +@end group @c endfile @end example @@ -24405,11 +24557,13 @@ NR == 1 @{ END @{ if (do_count) printf("%4d %s\n", count, last) > outputfile +@group else if ((repeated_only && count > 1) || (non_repeated_only && count == 1)) print last > outputfile close(outputfile) @} +@end group @c endfile @end example @@ -25204,10 +25358,12 @@ At first glance, a program like this would seem to do the job: freq[$i]++ @} +@group END @{ for (word in freq) printf "%s\t%d\n", word, freq[word] @} +@end group @end example The program relies on @command{awk}'s default field-splitting @@ -25597,9 +25753,11 @@ line. 
That line is then printed to the output file: i++ @} @} +@group print join(a, 1, n, SUBSEP) > curfile @} @} +@end group @c endfile @end example @@ -25685,10 +25843,12 @@ function usage() exit 1 @} +@group BEGIN @{ # validate arguments if (ARGC < 3) usage() +@end group RS = ARGV[1] ORS = ARGV[2] @@ -26082,13 +26242,11 @@ the program is done: continue @} fpath = pathto($2) -@group if (fpath == "") @{ printf("igawk: %s:%d: cannot find %s\n", input[stackptr], FNR, $2) > "/dev/stderr" continue @} -@end group if (! (fpath in processed)) @{ processed[fpath] = input[stackptr] input[++stackptr] = fpath # push onto stack @@ -26345,10 +26503,12 @@ notice and this notice are preserved. Here is the program: @example +@group awk 'BEGIN@{O="~"~"~";o="=="=="==";o+=+o;x=O""O;while(X++<=x+o+o)c=c"%c"; printf c,(x-O)*(x-O),x*(x-o)-o,x*(x-O)+x-O-o,+x*(x-O)-x+o,X*(o*o+O)+x-O, X*(X-x)-o*o,(x+X)*o*o+o,x*(X-x)-O-O,x-O+(O+o+X+x)*(o+O),X*X-X*(x-O)-x+O, O+X*(o*(o+O)+O),+x+O+X*o,x*(x-o),(o+X+x)*o*o-(x-O-O),O+(X-x)*(X+O),x-O@}' +@end group @end example @cindex Johansen, Chris @@ -26836,11 +26996,13 @@ Our first comparison function can be used to scan an array in numerical order of the indices: @example +@group function cmp_num_idx(i1, v1, i2, v2) @{ # numerical index comparison, ascending order return (i1 - i2) @} +@end group @end example Our second function traverses an array based on the string order of @@ -26945,10 +27107,13 @@ function cmp_field(i1, v1, i2, v2) a[NR][i] = $i @} +@group END @{ PROCINFO["sorted_in"] = "cmp_field" +@end group if (POS < 1 || POS > NF) POS = 1 + for (i in a) @{ for (j = 1; j <= NF; j++) printf("%s%c", a[i][j], j < NF ? ":" : "") @@ -27005,6 +27170,7 @@ function cmp_numeric(i1, v1, i2, v2) return (v1 != v2) ? (v2 - v1) : (i2 - i1) @} +@group function cmp_string(i1, v1, i2, v2) @{ # string value (and index) comparison, descending order @@ -27012,6 +27178,7 @@ function cmp_string(i1, v1, i2, v2) v2 = v2 i2 return (v1 > v2) ? 
-1 : (v1 != v2) @} +@end group @end example @c Avoid using the term ``stable'' when describing the unpredictable behavior @@ -27165,11 +27332,13 @@ The following example demonstrates the use of a comparison function with both values to lowercase in order to compare them ignoring case. @example +@group # case_fold_compare --- compare as strings, ignoring case function case_fold_compare(i1, v1, i2, v2, l, r) @{ l = tolower(v1) +@end group r = tolower(v2) if (l < r) @@ -28526,8 +28695,10 @@ This is somewhat counterintuitive. and those with positional specifiers in the same string: @example +@group $ @kbd{gawk 'BEGIN @{ printf "%d %3$s\n", 1, 2, "hi" @}'} @error{} gawk: cmd. line:1: fatal: must use `count$' on all formats or none +@end group @end example @quotation NOTE @@ -29152,8 +29323,10 @@ be inside this function. To investigate further, we must begin @samp{n} (for ``next''): @example +@group gawk> @kbd{n} @print{} 66 if (fcount > 0) @{ +@end group @end example This tells us that @command{gawk} is now ready to execute line 66, which @@ -29922,10 +30095,12 @@ partial dump of Davide Brini's obfuscated code @c FIXME: This will need updating if num-handler branch is ever merged in. 
@smallexample +@group gawk> @kbd{dump} @print{} # BEGIN @print{} @print{} [ 1:0xfcd340] Op_rule : [in_rule = BEGIN] [source_file = brini.awk] +@end group @print{} [ 1:0xfcc240] Op_push_i : "~" [MALLOC|STRING|STRCUR] @print{} [ 1:0xfcc2a0] Op_push_i : "~" [MALLOC|STRING|STRCUR] @print{} [ 1:0xfcc280] Op_match : @@ -29958,18 +30133,18 @@ gawk> @kbd{dump} @print{} [ :0xfcc660] Op_no_op : @print{} [ 1:0xfcc520] Op_assign_concat : c @print{} [ :0xfcc620] Op_jmp : [target_jmp = 0xfcc440] -@print{} @dots{} -@print{} @print{} [ 2:0xfcc5a0] Op_K_printf : [expr_count = 17] [redir_type = ""] @print{} [ :0xfcc140] Op_no_op : @print{} [ :0xfcc1c0] Op_atexit : @print{} [ :0xfcc640] Op_stop : @print{} [ :0xfcc180] Op_no_op : @print{} [ :0xfcd150] Op_after_beginfile : +@group @print{} [ :0xfcc160] Op_no_op : @print{} [ :0xfcc1a0] Op_after_endfile : gawk> +@end group @end smallexample @cindex @code{exit} debugger command @@ -30324,6 +30499,7 @@ In computer systems, integer arithmetic is exact, but the possible range of values is limited. Integer arithmetic is generally faster than floating-point arithmetic. +@cindex floating-point, numbers @item Floating-point arithmetic Floating-point numbers represent what were called in school ``real'' numbers (i.e., those that have a fractional part, such as 3.1415927). @@ -30335,6 +30511,12 @@ Modern systems support floating-point arithmetic in hardware, with a limited range of values. There are software libraries that allow the use of arbitrary-precision floating-point calculations. +@cindex floating-point, numbers@comma{} single-precision +@cindex floating-point, numbers@comma{} double-precision +@cindex floating-point, numbers@comma{} arbitrary-precision +@cindex single-precision +@cindex double-precision +@cindex arbitrary-precision POSIX @command{awk} uses @dfn{double-precision} floating-point numbers, which can hold more digits than @dfn{single-precision} floating-point numbers. 
@command{gawk} has facilities for performing arbitrary-precision @@ -30344,29 +30526,48 @@ floating-point arithmetic, which we describe in more detail shortly. Computers work with integer and floating-point values of different ranges. Integer values are usually either 32 or 64 bits in size. Single-precision floating-point values occupy 32 bits, whereas double-precision -floating-point values occupy 64 bits. Floating-point values are always -signed. The possible ranges of values are shown in @ref{table-numeric-ranges}. +floating-point values occupy 64 bits. +(Quadruple-precision floating point values also exist. They occupy 128 bits, +but such numbers are not available in @command{awk}.) +Floating-point values are always +signed. The possible ranges of values are shown in @ref{table-numeric-ranges} +and @ref{table-floating-point-ranges}. @float Table,table-numeric-ranges -@caption{Value ranges for different numeric representations} +@caption{Value ranges for integer representations} @multitable @columnfractions .34 .33 .33 -@headitem Numeric representation @tab Minimum value @tab Maximum value +@headitem Representation @tab Minimum value @tab Maximum value @item 32-bit signed integer @tab @minus{}2,147,483,648 @tab 2,147,483,647 @item 32-bit unsigned integer @tab 0 @tab 4,294,967,295 @item 64-bit signed integer @tab @minus{}9,223,372,036,854,775,808 @tab 9,223,372,036,854,775,807 @item 64-bit unsigned integer @tab 0 @tab 18,446,744,073,709,551,615 +@end multitable +@end float + +@float Table,table-floating-point-ranges +@caption{Approximate value ranges for floating-point number representations} +@multitable @columnfractions .38 .22 .22 .23 @iftex -@item Single-precision floating point (approximate) @tab @math{1.175494^{-38}} @tab @math{3.402823^{38}} -@item Double-precision floating point (approximate) @tab @math{2.225074^{-308}} @tab @math{1.797693^{308}} +@headitem Representation @tab @w{Minimum positive} @w{nonzero value} @tab Minimum @w{finite value} @tab 
Maximum @w{finite value} +@end iftex +@ifnottex +@headitem Representation @tab Minimum positive nonzero value @tab Minimum finite value @tab Maximum finite value +@end ifnottex +@iftex +@item @w{Single-precision floating-point} @tab @math{1.175494 @cdot 10^{-38}} @tab @math{-3.402823 @cdot 10^{38}} @tab @math{3.402823 @cdot 10^{38}} +@item @w{Double-precision floating-point} @tab @math{2.225074 @cdot 10^{-308}} @tab @math{-1.797693 @cdot 10^{308}} @tab @math{1.797693 @cdot 10^{308}} +@item @w{Quadruple-precision floating-point} @tab @math{3.362103 @cdot 10^{-4932}} @tab @math{-1.189731 @cdot 10^{4932}} @tab @math{1.189731 @cdot 10^{4932}} @end iftex @ifinfo -@item Single-precision floating point (approximate) @tab 1.175494e-38 @tab 3.402823e38 -@item Double-precision floating point (approximate) @tab 2.225074e-308 @tab 1.797693e308 +@item Single-precision floating-point @tab 1.175494e-38 @tab -3.402823e+38 @tab 3.402823e+38 +@item Double-precision floating-point @tab 2.225074e-308 @tab -1.797693e+308 @tab 1.797693e+308 +@item Quadruple-precision floating-point @tab 3.362103e-4932 @tab -1.189731e+4932 @tab 1.189731e+4932 @end ifinfo @ifnottex @ifnotinfo -@item Single-precision floating point (approximate) @tab 1.175494@sup{-38} @tab 3.402823@sup{38} -@item Double-precision floating point (approximate) @tab 2.225074@sup{-308} @tab 1.797693@sup{308} +@item Single-precision floating-point @tab 1.175494*10@sup{-38} @tab -3.402823*10@sup{38} @tab 3.402823*10@sup{38} +@item Double-precision floating-point @tab 2.225074*10@sup{-308} @tab -1.797693*10@sup{308} @tab 1.797693*10@sup{308} +@item Quadruple-precision floating-point @tab 3.362103*10@sup{-4932} @tab -1.189731*10@sup{4932} @tab 1.189731*10@sup{4932} @end ifnotinfo @end ifnottex @end multitable @@ -30635,12 +30836,14 @@ You have to decide how small a delta is important to you. 
Code to do this looks something like the following: @example +@group delta = 0.00001 # for example difference = abs(a) - abs(b) # subtract the two values if (difference < delta) # all ok else # not ok +@end group @end example @noindent @@ -31110,6 +31313,7 @@ choose to set: @example @c file eg/prog/pi.awk +@group # pi.awk --- compute the digits of pi @c endfile @c endfile @@ -31125,6 +31329,7 @@ choose to set: BEGIN @{ digits = 100000 two = 2 * 10 ^ digits +@end group pi = two for (m = digits * 4; m > 0; --m) @{ d = m * 2 + 1 @@ -32091,6 +32296,7 @@ of the function using the macro. For example, you might allocate a string value like so: @example +@group awk_value_t result; char *message; const char greet[] = "Don't Panic!"; @@ -32098,8 +32304,10 @@ const char greet[] = "Don't Panic!"; emalloc(message, char *, sizeof(greet), "myfunc"); strcpy(message, greet); make_malloced_string(message, strlen(message), & result); +@end group @end example +@sp 2 @item #define ezalloc(pointer, type, size, message) @dots{} This is like @code{emalloc()}, but it calls @code{gawk_calloc()} instead of @code{gawk_malloc()}. @@ -32235,6 +32443,7 @@ registering parts of your extension with @command{gawk}. 
 Extension functions are described by the following record:

 @example
+@group
 typedef struct awk_ext_func @{
 @ @ @ @ const char *name;
 @ @ @ @ awk_value_t *(*const function)(int num_actual_args,
@@ -32245,6 +32454,7 @@ typedef struct awk_ext_func @{
 @ @ @ @ awk_bool_t suppress_lint;
 @ @ @ @ void *data; /* opaque pointer to any extra state */
 @} awk_ext_func_t;
+@end group
 @end example

 The fields are:
@@ -32440,12 +32650,14 @@ Your extension should package these functions inside an
 @code{awk_input_parser_t}, which looks like this:

 @example
+@group
 typedef struct awk_input_parser @{
     const char *name; /* name of parser */
     awk_bool_t (*can_take_file)(const awk_input_buf_t *iobuf);
     awk_bool_t (*take_control_of)(awk_input_buf_t *iobuf);
     awk_const struct awk_input_parser *awk_const next; /* for gawk */
 @} awk_input_parser_t;
+@end group
 @end example

 The fields are:
@@ -33198,6 +33410,7 @@ to a global variable or array. It is an optimization that avoids looking up
 variables in @command{gawk}'s symbol table every time access is needed. This
 was discussed earlier, in @ref{General Data Types}.
+@need 1500
 The following functions let you work with scalar cookies:

 @table @code
@@ -33260,12 +33473,14 @@ your extension's variable in @command{gawk}'s symbol table using
 using @code{sym_lookup()}:

 @example
+@group
 static awk_scalar_t magic_var_cookie; /* cookie for MAGIC_VAR */

 static void
 my_extension_init()
 @{
     awk_value_t value;
+@end group

     /* install initial value */
     sym_update("MAGIC_VAR", make_number(42.0, & value));
@@ -33769,10 +33984,12 @@ Finally, because everything was successful, the function sets
 the return value to success, and returns:

 @example
+@group
     make_number(1.0, result);
 out:
     return result;
 @}
+@end group
 @end example

 Here is the output from running this part of the test:
@@ -33984,7 +34201,7 @@ BEGIN @{
 Here is the result of running the script:

 @example
-$ @kbd{AWKLIBPATH=$PWD ./gawk -f subarray.awk}
+$ @kbd{AWKLIBPATH=$PWD gawk -f subarray.awk}
 @print{} new_array["subarray"]["foo"] = bar
 @print{} new_array["hello"] = world
 @print{} new_array["answer"] = 42
@@ -34123,7 +34340,7 @@ It is up to the extension to decide if there are API incompatibilities.
 Typically, a check like this is enough:

 @example
-if (api->major_version != GAWK_API_MAJOR_VERSION
+if ( api->major_version != GAWK_API_MAJOR_VERSION
     || api->minor_version < GAWK_API_MINOR_VERSION) @{
     fprintf(stderr, "foo_extension: version mismatch with gawk!\n");
     fprintf(stderr, "\tmy version (%d, %d), gawk version (%d, %d)\n",
@@ -34224,10 +34441,12 @@ as described here.
 The boilerplate needed is also provided in comments in the @file{gawkapi.h} header file:

 @example
+@group
 /* Boilerplate code: */
 int plugin_is_GPL_compatible;

 static gawk_api_t *const api;
+@end group
 static awk_ext_id_t ext_id;
 static const char *ext_version = NULL; /* or @dots{} = "some string" */
@@ -34628,10 +34847,12 @@ The second is a pointer to an @code{awk_value_t} structure, usually named
 @code{result}:

 @example
+@group
 /* do_chdir --- provide dynamically loaded chdir() function for gawk */

 static awk_value_t *
 do_chdir(int nargs, awk_value_t *result, struct awk_ext_func *unused)
+@end group
 @{
     awk_value_t newdir;
     int ret = -1;
@@ -34758,7 +34979,7 @@ fill_stat_array(const char *name, awk_array_t array, struct stat *sbuf)
 #endif
 #ifdef S_IFDOOR /* Solaris weirdness */
         @{ S_IFDOOR, "door" @},
-#endif /* S_IFDOOR */
+#endif
     @};
     int j, k;
 @end example
@@ -34801,9 +35022,11 @@ certain members and/or the type of the file.
 It then returns zero, for success:

 @example
+@group
 #ifdef HAVE_STRUCT_STAT_ST_BLKSIZE
     array_set_numeric(array, "blksize", sbuf->st_blksize);
-#endif /* HAVE_STRUCT_STAT_ST_BLKSIZE */
+#endif
+@end group

     pmode = format_mode(sbuf->st_mode);
     array_set(array, "pmode", make_const_string(pmode, strlen(pmode),
@@ -34892,20 +35115,24 @@ Next, it gets the information for the file. If the called function

     /* stat the file; if error, set ERRNO and return */
     ret = statfunc(name, & sbuf);
+@group
     if (ret < 0) @{
         update_ERRNO_int(errno);
         return make_number(ret, result);
     @}
+@end group
 @end example

 The tedious work is done by @code{fill_stat_array()}, shown earlier. When done, the function returns the result from @code{fill_stat_array()}:

 @example
+@group
     ret = fill_stat_array(name, array, & sbuf);

     return make_number(ret, result);
 @}
+@end group
 @end example

 Finally, it's necessary to provide the ``glue'' that loads the
@@ -40599,14 +40826,24 @@ like this: @code{""}.
 Humans are used to working in decimal; i.e., base 10.
 In base 10, numbers go from 0 to 9, and then ``roll over'' into the next
+@iftex
+column. (Remember grade school? @math{42 = 4\times 10 + 2}.)
+@end iftex
+@ifnottex
 column. (Remember grade school? 42 = 4 x 10 + 2.)
+@end ifnottex
 There are other number bases though. Computers commonly use base 2 or @dfn{binary}, base 8 or @dfn{octal}, and base 16 or @dfn{hexadecimal}. In binary, each column represents two times the value in the column to its right. Each column may contain either a 0 or a 1.
+@iftex
+Thus, binary 1010 represents @math{(1\times 8) + (0\times 4) + (1\times 2) + (0\times 1)}, or decimal 10.
+@end iftex
+@ifnottex
 Thus, binary 1010 represents (1 x 8) + (0 x 4) + (1 x 2) + (0 x 1), or decimal 10.
+@end ifnottex
 Octal and hexadecimal are discussed more in @ref{Nondecimal-numbers}.
@@ -40746,7 +40983,12 @@ electronic circuitry works ``naturally'' in base 2
 (just think of Off/On), everything inside a computer is calculated using base 2. Each digit represents the presence (or absence) of a power of 2 and is called a @dfn{bit}. So, for example, the base-two number @code{10101} is
+@iftex
+the same as decimal 21, (@math{(1\times 16) + (1\times 4) + (1\times 1)}).
+@end iftex
+@ifnottex
 the same as decimal 21, ((1 x 16) + (1 x 4) + (1 x 1)).
+@end ifnottex

 Since base-two numbers quickly become very long to read and write, they are usually grouped by 3 (i.e., they are
@@ -40917,7 +41159,7 @@ See also ``Interpreter.''
 @item Complemented Bracket Expression
 The negation of a @dfn{bracket expression}. All that is @emph{not} described by a given bracket expression. The symbol @samp{^} precedes
-the negated bracket expression. E.g.: @samp{[[^:digit:]}
+the negated bracket expression. E.g.: @samp{[^[:digit:]]}
 designates whatever character is not a digit. @samp{[^bad]} designates whatever character is not one of the letters @samp{b}, @samp{a}, or @samp{d}.
@@ -41186,7 +41428,12 @@ Base 16 notation, where the digits are @code{0}--@code{9} and
 @code{A}--@code{F}, with @samp{A} representing 10, @samp{B} representing 11, and so on, up to @samp{F} for 15. Hexadecimal numbers are written in C using a leading @samp{0x},
+@iftex
+to indicate their base. Thus, @code{0x12} is 18 (@math{(1\times 16) + 2}).
+@end iftex
+@ifnottex
 to indicate their base. Thus, @code{0x12} is 18 ((1 x 16) + 2).
+@end ifnottex
 @xref{Nondecimal-numbers}.

 @item I/O
@@ -41250,7 +41497,7 @@ meaning. Keywords are reserved and may not be used as variable names.
 @code{break},
 @code{case},
 @code{continue},
-@code{default}
+@code{default},
 @code{delete},
 @code{do@dots{}while},
 @code{else},
@@ -41336,7 +41583,12 @@ Ancient @command{awk} implementations used single precision floating-point.
 @item Octal
 Base-eight notation, where the digits are @code{0}--@code{7}.
 Octal numbers are written in C using a leading @samp{0},
+@iftex
+to indicate their base. Thus, @code{013} is 11 (@math{(1\times 8) + 3}).
+@end iftex
+@ifnottex
 to indicate their base. Thus, @code{013} is 11 ((1 x 8) + 3).
+@end ifnottex
 @xref{Nondecimal-numbers}.

 @item Output Record |