Diffstat (limited to 'doc/gawktexi.in')
-rw-r--r-- doc/gawktexi.in 116
1 files changed, 59 insertions, 57 deletions
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index 8fd84288..7fd947a5 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -467,7 +467,7 @@ particular records in a file and perform operations upon them.
@command{gawk}.
* Internationalization:: Getting @command{gawk} to speak your
language.
-* Debugger:: The @code{gawk} debugger.
+* Debugger:: The @command{gawk} debugger.
* Arbitrary Precision Arithmetic:: Arbitrary precision arithmetic with
@command{gawk}.
* Dynamic Extensions:: Adding new built-in functions to
@@ -950,7 +950,7 @@ particular records in a file and perform operations upon them.
* Internal File Ops:: The code for internal file operations.
* Using Internal File Ops:: How to use an external extension.
* Extension Samples:: The sample extensions that ship with
- @code{gawk}.
+ @command{gawk}.
* Extension Sample File Functions:: The file functions sample.
* Extension Sample Fnmatch:: An interface to @code{fnmatch()}.
* Extension Sample Fork:: An interface to @code{fork()} and
@@ -4591,7 +4591,7 @@ $ @kbd{gawk -f test2}
@print{} This is script test2.
@end example
-@code{gawk} runs the @file{test2} script, which includes @file{test1}
+@command{gawk} runs the @file{test2} script, which includes @file{test1}
using the @code{@@include}
keyword. So, to include external @command{awk} source files, you just
use @code{@@include} followed by the name of the file to be included,
@@ -4800,7 +4800,7 @@ This seems to have been a long-undocumented feature in Unix @command{awk}.
Similarly, you may use @code{print} or @code{printf} statements in the
@var{init} and @var{increment} parts of a @code{for} loop. This is another
-long-undocumented ``feature'' of Unix @code{awk}.
+long-undocumented ``feature'' of Unix @command{awk}.
@end ignore
@@ -16100,6 +16100,9 @@ Besides the built-in functions, @command{awk} has provisions for
writing new functions that the rest of a program can use.
The second half of this @value{CHAPTER} describes these
@dfn{user-defined} functions.
+Finally, we explore indirect function calls, a @command{gawk}-specific
+extension that lets you determine at runtime what function is to
+be called.
@menu
* Built-in:: Summarizes the built-in functions.
@@ -16109,7 +16112,7 @@ The second half of this @value{CHAPTER} describes these
@end menu
@node Built-in
-@section Built-In Functions
+@section Built-in Functions
@dfn{Built-in} functions are always available for
your @command{awk} program to call. This @value{SECTION} defines all
@@ -16132,7 +16135,7 @@ but are summarized here for your convenience.
@end menu
@node Calling Built-in
-@subsection Calling Built-In Functions
+@subsection Calling Built-in Functions
To call one of @command{awk}'s built-in functions, write the name of
the function followed
@@ -16183,7 +16186,7 @@ j = atan2(++i, i *= 2)
@end example
If the order of evaluation is left to right, then @code{i} first becomes
-6, and then 12, and @code{atan2()} is called with the two arguments 6
+six, and then 12, and @code{atan2()} is called with the two arguments six
and 12. But if the order of evaluation is right to left, @code{i}
first becomes 10, then 11, and @code{atan2()} is called with the
two arguments 11 and 10.
@@ -16247,7 +16250,7 @@ In fact, @command{gawk} uses the BSD @code{random()} function, which is
considerably better than @code{rand()}, to produce random numbers.}
Often random integers are needed instead. Following is a user-defined function
-that can be used to obtain a random non-negative integer less than @var{n}:
+that can be used to obtain a random nonnegative integer less than @var{n}:
@example
function randint(n)
@@ -16342,7 +16345,7 @@ implementations.
The functions in this @value{SECTION} look at or change the text of one
or more strings.
-@code{gawk} understands locales (@pxref{Locales}), and does all
+@command{gawk} understands locales (@pxref{Locales}) and does all
string processing in terms of @emph{characters}, not @emph{bytes}.
This distinction is particularly important to understand for locales
where one character may be represented by multiple bytes. Thus, for
@@ -16431,7 +16434,7 @@ a[2] = "de"
a[3] = "sac"
@end example
-The @code{asorti()} function works similarly to @code{asort()}, however,
+The @code{asorti()} function works similarly to @code{asort()}; however,
the @emph{indices} are sorted, instead of the values. Thus, in the
previous example, starting with the same initial set of indices and
values in @code{a}, calling @samp{asorti(a)} would yield:
@@ -16546,7 +16549,7 @@ If @var{find} is not found, @code{index()} returns zero.
With BWK @command{awk} and @command{gawk},
it is a fatal error to use a regexp constant for @var{find}.
Other implementations allow it, simply treating the regexp
-constant as an expression meaning @samp{$0 ~ /regexp/}. @value{DARKCORNER}.
+constant as an expression meaning @samp{$0 ~ /regexp/}. @value{DARKCORNER}
@item @code{length(}[@var{string}]@code{)}
@cindexawkfunc{length}
@@ -16629,7 +16632,7 @@ If @option{--posix} is supplied, using an array argument is a fatal error
@cindex string, regular expression match
@cindex match regexp in string
Search @var{string} for the
-longest, leftmost substring matched by the regular expression,
+longest, leftmost substring matched by the regular expression
@var{regexp} and return the character position (index)
at which that substring begins (one, if it starts at the beginning of
@var{string}). If no match is found, return zero.
@@ -16641,7 +16644,7 @@ In the latter case, the string is treated as a regexp to be matched.
discussion of the difference between the two forms, and the
implications for writing your program correctly.
-The order of the first two arguments is backwards from most other string
+The order of the first two arguments is the opposite of most other string
functions that work with regular expressions, such as
@code{sub()} and @code{gsub()}. It might help to remember that
for @code{match()}, the order is the same as for the @samp{~} operator:
@@ -16730,7 +16733,7 @@ $ @kbd{echo foooobazbarrrrr |}
@end example
There may not be subscripts for the start and index for every parenthesized
-subexpression, because they may not all have matched text; thus they
+subexpression, because they may not all have matched text; thus, they
should be tested for with the @code{in} operator
(@pxref{Reference to Elements}).
@@ -16777,13 +16780,13 @@ a regexp describing where to split @var{string} (much as @code{FS} can
be a regexp describing where to split input records).
If @var{fieldsep} is omitted, the value of @code{FS} is used.
@code{split()} returns the number of elements created.
-@var{seps} is a @command{gawk} extension with @code{@var{seps}[@var{i}]}
+@var{seps} is a @command{gawk} extension, with @code{@var{seps}[@var{i}]}
being the separator string
between @code{@var{array}[@var{i}]} and @code{@var{array}[@var{i}+1]}.
If @var{fieldsep} is a single
-space then any leading whitespace goes into @code{@var{seps}[0]} and
+space, then any leading whitespace goes into @code{@var{seps}[0]} and
any trailing
-whitespace goes into @code{@var{seps}[@var{n}]} where @var{n} is the
+whitespace goes into @code{@var{seps}[@var{n}]}, where @var{n} is the
return value of
@code{split()} (i.e., the number of elements in @var{array}).
@@ -16821,19 +16824,18 @@ As with input field-splitting, when the value of @var{fieldsep} is
the elements of
@var{array} but not in @var{seps}, and the elements
are separated by runs of whitespace.
-Also, as with input field-splitting, if @var{fieldsep} is the null string, each
+Also, as with input field splitting, if @var{fieldsep} is the null string, each
individual character in the string is split into its own array element.
@value{COMMONEXT}
Note, however, that @code{RS} has no effect on the way @code{split()}
-works. Even though @samp{RS = ""} causes newline to also be an input
+works. Even though @samp{RS = ""} causes the newline character to also be an input
field separator, this does not affect how @code{split()} splits strings.
@cindex dark corner, @code{split()} function
Modern implementations of @command{awk}, including @command{gawk}, allow
-the third argument to be a regexp constant (@code{/abc/}) as well as a
-string.
-@value{DARKCORNER}
+the third argument to be a regexp constant (@w{@code{/}@dots{}@code{/}})
+as well as a string. @value{DARKCORNER}
The POSIX standard allows this as well.
@DBXREF{Computed Regexps} for a
discussion of the difference between using a string constant or a regexp constant,
@@ -16970,7 +16972,7 @@ an @samp{&}:
@cindex @code{sub()} function, arguments of
@cindex @code{gsub()} function, arguments of
As mentioned, the third argument to @code{sub()} must
-be a variable, field or array element.
+be a variable, field, or array element.
Some versions of @command{awk} allow the third argument to
be an expression that is not an lvalue. In such a case, @code{sub()}
still searches for the pattern and returns zero or one, but the result of
@@ -17129,8 +17131,8 @@ example, @code{"a\qb"} is treated as @code{"aqb"}.
At the runtime level, the various functions handle sequences of
@samp{\} and @samp{&} differently. The situation is (sadly) somewhat complex.
-Historically, the @code{sub()} and @code{gsub()} functions treated the two
-character sequence @samp{\&} specially; this sequence was replaced in
+Historically, the @code{sub()} and @code{gsub()} functions treated the
+two-character sequence @samp{\&} specially; this sequence was replaced in
the generated text with a single @samp{&}. Any other @samp{\} within
the @var{replacement} string that did not precede an @samp{&} was passed
through unchanged. This is illustrated in @ref{table-sub-escapes}.
@@ -17188,7 +17190,7 @@ _bigskip}
@end float
@noindent
-This table shows both the lexical-level processing, where
+This table shows the lexical-level processing, where
an odd number of backslashes becomes an even number at the runtime level,
as well as the runtime processing done by @code{sub()}.
(For the sake of simplicity, the rest of the following tables only show the
@@ -17209,7 +17211,7 @@ This is shown in
@ref{table-sub-proposed}.
@float Table,table-sub-proposed
-@caption{GNU @command{awk} rules for @code{sub()} and backslash}
+@caption{@command{gawk} rules for @code{sub()} and backslash}
@tex
\vbox{\bigskip
% We need more characters for escape and tab ...
@@ -17254,7 +17256,7 @@ _bigskip}
@end float
In a nutshell, at the runtime level, there are now three special sequences
-of characters (@samp{\\\&}, @samp{\\&} and @samp{\&}) whereas historically
+of characters (@samp{\\\&}, @samp{\\&}, and @samp{\&}) whereas historically
there was only one. However, as in the historical case, any @samp{\} that
is not part of one of these three sequences is not special and appears
in the output literally.
@@ -17320,7 +17322,7 @@ The only case where the difference is noticeable is the last one: @samp{\\\\}
is seen as @samp{\\} and produces @samp{\} instead of @samp{\\}.
Starting with @value{PVERSION} 3.1.4, @command{gawk} followed the POSIX rules
-when @option{--posix} is specified (@pxref{Options}). Otherwise,
+when @option{--posix} was specified (@pxref{Options}). Otherwise,
it continued to follow the proposed rules, as
that had been its behavior for many years.
@@ -17388,7 +17390,7 @@ _bigskip}
@end ifnottex
@end float
-Because of the complexity of the lexical and runtime level processing
+Because of the complexity of the lexical- and runtime-level processing
and the special cases for @code{sub()} and @code{gsub()},
we recommend the use of @command{gawk} and @code{gensub()} when you have
to do substitutions.
@@ -17414,6 +17416,7 @@ for more information.
When closing a coprocess, it is occasionally useful to first close
one end of the two-way pipe and then to close the other. This is done
by providing a second argument to @code{close()}. This second argument
+(@var{how})
should be one of the two string values @code{"to"} or @code{"from"},
indicating which end of the pipe to close. Case in the string does
not matter.
@@ -17440,7 +17443,7 @@ every little bit of information as soon as it is ready. However, sometimes
it is necessary to force a program to @dfn{flush} its buffers (i.e.,
write the information to its destination, even if a buffer is not full).
This is the purpose of the @code{fflush()} function---@command{gawk} also
-buffers its output and the @code{fflush()} function forces
+buffers its output, and the @code{fflush()} function forces
@command{gawk} to flush its buffers.
@cindex extensions, common@comma{} @code{fflush()} function
@@ -17461,7 +17464,7 @@ would flush only the standard output if there was no argument,
and flush all output files and pipes if the argument was the null
string. This was changed in order to be compatible with Brian
Kernighan's @command{awk}, in the hope that standardizing this
-feature in POSIX would then be easier (which indeed helped).
+feature in POSIX would then be easier (which indeed proved to be the case).
With @command{gawk},
you can use @samp{fflush("/dev/stdout")} if you wish to flush
@@ -17472,7 +17475,7 @@ only the standard output.
@c @cindex warnings, automatic
@cindex troubleshooting, @code{fflush()} function
@code{fflush()} returns zero if the buffer is successfully flushed;
-otherwise, it returns non-zero. (@command{gawk} returns @minus{}1.)
+otherwise, it returns a nonzero value. (@command{gawk} returns @minus{}1.)
In the case where all buffers are flushed, the return value is zero
only if all buffers were flushed successfully. Otherwise, it is
@minus{}1, and @command{gawk} warns about the problem @var{filename}.
@@ -17485,8 +17488,8 @@ In such a case, @code{fflush()} returns @minus{}1, as well.
@sidebar Interactive Versus Noninteractive Buffering
@cindex buffering, interactive vs.@: noninteractive
-As a side point, buffering issues can be even more confusing, depending
-upon whether your program is @dfn{interactive} (i.e., communicating
+As a side point, buffering issues can be even more confusing if
+your program is @dfn{interactive} (i.e., communicating
with a user sitting at a keyboard).@footnote{A program is interactive
if the standard output is connected to a terminal device. On modern
systems, this means your keyboard and screen.}
@@ -17529,7 +17532,7 @@ it is all buffered and sent down the pipe to @command{cat} in one shot.
@cindexawkfunc{system}
@cindex invoke shell command
@cindex interacting with other programs
-Execute the operating-system
+Execute the operating system
command @var{command} and then return to the @command{awk} program.
Return @var{command}'s exit status.
@@ -17638,9 +17641,9 @@ you would see the latter (undesirable) output.
@cindex files, log@comma{} timestamps in
@cindex @command{gawk}, timestamps
@cindex POSIX @command{awk}, timestamps and
-@code{awk} programs are commonly used to process log files
+@command{awk} programs are commonly used to process log files
containing timestamp information, indicating when a
-particular log record was written. Many programs log their timestamp
+particular log record was written. Many programs log their timestamps
in the form returned by the @code{time()} system call, which is the
number of seconds since a particular epoch. On POSIX-compliant systems,
it is the number of seconds since
@@ -17701,7 +17704,7 @@ The values of these numbers need not be within the ranges specified;
for example, an hour of @minus{}1 means 1 hour before midnight.
The origin-zero Gregorian calendar is assumed, with year 0 preceding
year 1 and year @minus{}1 preceding year 0.
-The time is assumed to be in the local timezone.
+The time is assumed to be in the local time zone.
If the daylight-savings flag is positive, the time is assumed to be
daylight savings time; if zero, the time is assumed to be standard
time; and if negative (the default), @code{mktime()} attempts to determine
@@ -17861,12 +17864,12 @@ Equivalent to specifying @samp{%H:%M:%S}.
The weekday as a decimal number (1--7). Monday is day one.
@item %U
-The week number of the year (the first Sunday as the first day of week one)
+The week number of the year (with the first Sunday as the first day of week one)
as a decimal number (00--53).
@c @cindex ISO 8601
@item %V
-The week number of the year (the first Monday as the first
+The week number of the year (with the first Monday as the first
day of week one) as a decimal number (01--53).
The method for determining the week number is as specified by ISO 8601.
(To wit: if the week containing January 1 has four or more days in the
@@ -17877,7 +17880,7 @@ and the next week is week one.)
The weekday as a decimal number (0--6). Sunday is day zero.
@item %W
-The week number of the year (the first Monday as the first day of week one)
+The week number of the year (with the first Monday as the first day of week one)
as a decimal number (00--53).
@item %x
@@ -17897,8 +17900,8 @@ The full year as a decimal number (e.g., 2015).
@c @cindex RFC 822
@c @cindex RFC 1036
@item %z
-The timezone offset in a +HHMM format (e.g., the format necessary to
-produce RFC 822/RFC 1036 date headers).
+The time zone offset in a @samp{+@var{HHMM}} format (e.g., the format
+necessary to produce RFC 822/RFC 1036 date headers).
@item %Z
The time zone name or abbreviation; no characters if
@@ -18038,7 +18041,7 @@ The operations are described in @ref{table-bitwise-ops}.
@ifnottex
@ifnotdocbook
@display
- Bit Operator
+ Bit operator
| AND | OR | XOR
|---+---+---+---+---+---
Operands | 0 | 1 | 0 | 1 | 0 | 1
@@ -18096,7 +18099,7 @@ Operands | 0 | 1 | 0 | 1 | 0 | 1
<tbody>
<row>
<entry colsep="0"></entry>
-<entry spanname="optitle"><emphasis role="bold">Bit Operator</emphasis></entry>
+<entry spanname="optitle"><emphasis role="bold">Bit operator</emphasis></entry>
</row>
<row rowsep="1">
@@ -18160,10 +18163,9 @@ of a given value.
Finally, two other common operations are to shift the bits left or right.
For example, if you have a bit string @samp{10111001} and you shift it
right by three bits, you end up with @samp{00010111}.@footnote{This example
-shows that 0's come in on the left side. For @command{gawk}, this is
+shows that zeros come in on the left side. For @command{gawk}, this is
always true, but in some languages, it's possible to have the left side
-fill with 1's.}
-@c Purposely decided to use 0's and 1's here. 2/2001.
+fill with ones.}
If you start over again with @samp{10111001} and shift it left by three
bits, you end up with @samp{11001000}. The following list describes
@command{gawk}'s built-in functions that implement the bitwise operations.
@@ -18217,7 +18219,7 @@ that illustrates the use of these functions:
@example
@group
@c file eg/lib/bits2str.awk
-# bits2str --- turn a byte into readable 1's and 0's
+# bits2str --- turn a byte into readable ones and zeros
function bits2str(bits, data, mask)
@{
@@ -19511,7 +19513,7 @@ for (i = 1; i <= n; i++)
@end example
@noindent
-@code{gawk} looks up the actual function to call only once.
+@command{gawk} looks up the actual function to call only once.
@node Functions Summary
@section Summary
@@ -30009,7 +30011,7 @@ Allowing completely alphabetic strings to have valid numeric
values is also a very severe departure from historical practice.
@end itemize
-The second problem is that the @code{gawk} maintainer feels that this
+The second problem is that the @command{gawk} maintainer feels that this
interpretation of the standard, which requires a certain amount of
``language lawyering'' to arrive at in the first place, was not even
intended by the standard developers. In other words, ``we see how you
@@ -30168,7 +30170,7 @@ When @option{--sandbox} is specified, extensions are disabled
* Finding Extensions:: How @command{gawk} finds compiled extensions.
* Extension Example:: Example C code for an extension.
* Extension Samples:: The sample extensions that ship with
- @code{gawk}.
+ @command{gawk}.
* gawkextlib:: The @code{gawkextlib} project.
* Extension summary:: Extension summary.
* Extension Exercises:: Exercises.
@@ -31132,7 +31134,7 @@ If the concept of a ``record terminator'' makes sense, then
@code{*rt_start} should be set to point to the data to be used for
@code{RT}, and @code{*rt_len} should be set to the length of the
data. Otherwise, @code{*rt_len} should be set to zero.
-@code{gawk} makes its own copy of this data, so the
+@command{gawk} makes its own copy of this data, so the
extension must manage this storage.
@end table
@@ -31178,7 +31180,7 @@ When writing an input parser, you should think about (and document)
how it is expected to interact with @command{awk} code. You may want
it to always be called, and take effect as appropriate (as the
@code{readdir} extension does). Or you may want it to take effect
-based upon the value of an @code{awk} variable, as the XML extension
+based upon the value of an @command{awk} variable, as the XML extension
from the @code{gawkextlib} project does (@pxref{gawkextlib}).
In the latter case, code in a @code{BEGINFILE} section
can look at @code{FILENAME} and @code{ERRNO} to decide whether or
@@ -31961,7 +31963,7 @@ converts it to a string. Using non-integral values is possible, but
requires that you understand how such values are converted to strings
(@pxref{Conversion}); thus using integral values is safest.
-As with @emph{all} strings passed into @code{gawk} from an extension,
+As with @emph{all} strings passed into @command{gawk} from an extension,
the string value of @code{index} must come from @code{gawk_malloc()},
@code{gawk_calloc()} or @code{gawk_realloc()}, and
@command{gawk} releases the storage.
@@ -36265,7 +36267,7 @@ can be configured and compiled.
@cindex @option{--disable-lint} configuration option
@cindex configuration option, @code{--disable-lint}
@item --disable-lint
-Disable all lint checking within @code{gawk}. The
+Disable all lint checking within @command{gawk}. The
@option{--lint} and @option{--lint-old} options
(@pxref{Options})
are accepted, but silently do nothing.