diff options
author | Arnold D. Robbins <arnold@skeeve.com> | 2015-01-21 08:49:49 +0200 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2015-01-21 08:49:49 +0200 |
commit | 902fcb22d611b7f9e99369ecab223c00c877b82c (patch) | |
tree | 5846d6638beb88a7f5d0ee6f69fd62c3ccafb953 /doc/gawk.texi | |
parent | 4c01db1833a02173c910d463eaed77ad6ed66195 (diff) | |
parent | 8e0e08c84626633e1d4b7b431576d4ec7d8f10c4 (diff) | |
download | egawk-902fcb22d611b7f9e99369ecab223c00c877b82c.tar.gz egawk-902fcb22d611b7f9e99369ecab223c00c877b82c.tar.bz2 egawk-902fcb22d611b7f9e99369ecab223c00c877b82c.zip |
Merge branch 'gawk-4.1-stable'
Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- | doc/gawk.texi | 465 |
1 files changed, 53 insertions, 412 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi index 035d1476..a4567760 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -2597,9 +2597,7 @@ for programs that are provided on the @command{awk} command line. (Also, placing the program in a file allows us to use a literal single quote in the program text, instead of the magic @samp{\47}.) -@c STARTOFRANGE sq1x @cindex single quote (@code{'}) in @command{gawk} command lines -@c STARTOFRANGE qs2x @cindex @code{'} (single quote) in @command{gawk} command lines If you want to clearly identify an @command{awk} program file as such, you can add the extension @file{.awk} to the @value{FN}. This doesn't @@ -2973,8 +2971,6 @@ $ @kbd{awk "BEGIN @{ print \"Here is a single quote <'>\" @}"} @end example @noindent -@c ENDOFRANGE sq1x -@c ENDOFRANGE qs2x This option is also painful, because double quotes, backslashes, and dollar signs are very common in more advanced @command{awk} programs. @@ -3310,8 +3306,13 @@ no actions run. After processing all the rules that match the line (and perhaps there are none), @command{awk} reads the next line. (However, -@pxref{Next Statement}, +@DBPXREF{Next Statement} +@ifdocbook +and @DBREF{Nextfile Statement}.) +@end ifdocbook +@ifnotdocbook and also @pxref{Nextfile Statement}.) +@end ifnotdocbook This continues until the program reaches the end of the file. For example, the following @command{awk} program contains two rules: @@ -3576,7 +3577,7 @@ performing bit manipulation, for runtime string translation (internationalizatio determining the type of a variable, and array sorting. -As we develop our presentation of the @command{awk} language, we introduce +As we develop our presentation of the @command{awk} language, we will introduce most of the variables and many of the functions. They are described systematically in @DBREF{Built-in Variables} and in @ref{Built-in}. @@ -3630,7 +3631,7 @@ and Perl.} @c FIXME: Review this chapter for summary of builtin functions called. @itemize @value{BULLET} @item -Programs in @command{awk} consist of @var{pattern}-@var{action} pairs. +Programs in @command{awk} consist of @var{pattern}--@var{action} pairs. @item An @var{action} without a @var{pattern} always runs. The default @@ -3659,7 +3660,7 @@ part of a larger shell script (or MS-Windows batch file). You may use backslash continuation to continue a source line. Lines are automatically continued after a comma, open brace, question mark, colon, -@samp{||}, @samp{&&}, @code{do} and @code{else}. +@samp{||}, @samp{&&}, @code{do}, and @code{else}. @end itemize @node Invoking Gawk @@ -3734,20 +3735,16 @@ warning that the program is empty. @node Options @section Command-Line Options -@c STARTOFRANGE ocl @cindex options, command-line -@c STARTOFRANGE clo @cindex command line, options -@c STARTOFRANGE gnulo @cindex GNU long options -@c STARTOFRANGE longo @cindex options, long Options begin with a dash and consist of a single character. GNU-style long options consist of two dashes and a keyword. The keyword can be abbreviated, as long as the abbreviation allows the option -to be uniquely identified. If the option takes an argument, then the -keyword is either immediately followed by an equals sign (@samp{=}) and the +to be uniquely identified. If the option takes an argument, either the +keyword is immediately followed by an equals sign (@samp{=}) and the argument's value, or the keyword and the argument's value are separated by whitespace. If a particular option with a value is given more than once, it is the @@ -3774,7 +3771,7 @@ Set the @code{FS} variable to @var{fs} @cindex @option{-f} option @cindex @option{--file} option @cindex @command{awk} programs, location of -Read @command{awk} program source from @var{source-file} +Read the @command{awk} program source from @var{source-file} instead of in the first nonoption argument. This option may be given multiple times; the @command{awk} program consists of the concatenation of the contents of @@ -3829,8 +3826,6 @@ by the user that could start with @samp{-}. It is also useful for passing options on to the @command{awk} program; see @ref{Getopt Function}. @end table -@c ENDOFRANGE gnulo -@c ENDOFRANGE longo The following list describes @command{gawk}-specific options: @@ -3842,14 +3837,14 @@ The following list describes @command{gawk}-specific options: @cindex @option{--characters-as-bytes} option Cause @command{gawk} to treat all input data as single-byte characters. In addition, all output written with @code{print} or @code{printf} -are treated as single-byte characters. +is treated as single-byte characters. Normally, @command{gawk} follows the POSIX standard and attempts to process its input data according to the current locale (@pxref{Locales}). This can often involve converting multibyte characters into wide characters (internally), and can lead to problems or confusion if the input data does not contain valid -multibyte characters. This option is an easy way to tell @command{gawk}: -``hands off my data!''. +multibyte characters. This option is an easy way to tell @command{gawk}, +``Hands off my data!'' @item @option{-c} @itemx @option{--traditional} @@ -3906,7 +3901,7 @@ Enable debugging of @command{awk} programs By default, the debugger reads commands interactively from the keyboard (standard input). The optional @var{file} argument allows you to specify a file with a list -of commands for the debugger to execute non-interactively. +of commands for the debugger to execute noninteractively. No space is allowed between the @option{-D} and @var{file}, if @var{file} is supplied. @@ -3966,7 +3961,7 @@ with @samp{#!} scripts (@pxref{Executable Scripts}), like so: @cindex portable object files, generating @cindex files, portable object, generating Analyze the source program and -generate a GNU @command{gettext} Portable Object Template file on standard +generate a GNU @command{gettext} portable object template file on standard output for all string constants that have been marked for translation. @xref{Internationalization}, for information about this option. @@ -3978,7 +3973,7 @@ for information about this option. @cindex GNU long options, printing list of @cindex options, printing list of @cindex printing, list of options -Print a ``usage'' message summarizing the short and long style options +Print a ``usage'' message summarizing the short- and long-style options that @command{gawk} accepts and then exit. @item @option{-i} @var{source-file} @@ -3988,7 +3983,7 @@ that @command{gawk} accepts and then exit. @cindex @command{awk} programs, location of Read an @command{awk} source library from @var{source-file}. This option is completely equivalent to using the @code{@@include} directive inside -your program. This option is very similar to the @option{-f} option, +your program. It is very similar to the @option{-f} option, but there are two important differences. First, when @option{-i} is used, the program source is not loaded if it has been previously loaded, whereas with @option{-f}, @command{gawk} always loads the file. @@ -4073,7 +4068,7 @@ when parsing numeric input data (@pxref{Locales}). @cindex @option{-o} option @cindex @option{--pretty-print} option Enable pretty-printing of @command{awk} programs. -By default, output program is created in a file named @file{awkprof.out} +By default, the output program is created in a file named @file{awkprof.out} (@pxref{Profiling}). The optional @var{file} argument allows you to specify a different @value{FN} for the output. @@ -4117,7 +4112,7 @@ in the left margin, and function call counts for each function. Operate in strict POSIX mode. This disables all @command{gawk} extensions (just like @option{--traditional}) and disables all extensions not allowed by POSIX. -@xref{Common Extensions}, for a summary of the extensions +@DBXREF{Common Extensions} for a summary of the extensions in @command{gawk} that are disabled by this option. Also, the following additional @@ -4238,7 +4233,7 @@ source of data.) Because it is clumsy using the standard @command{awk} mechanisms to mix source file and command-line @command{awk} programs, @command{gawk} provides the @option{-e} option. This does not require you to -pre-empt the standard input for your source code; it allows you to easily +preempt the standard input for your source code; it allows you to easily mix command-line and library source code (@pxref{AWKPATH Variable}). As with @option{-f}, the @option{-e} and @option{-i} options may also be used multiple times on the command line. @@ -4284,8 +4279,6 @@ setenv POSIXLY_CORRECT true Having @env{POSIXLY_CORRECT} set is not recommended for daily use, but it is good for testing the portability of your programs to other environments. -@c ENDOFRANGE ocl -@c ENDOFRANGE clo @node Other Arguments @section Other Command-Line Arguments @@ -4428,7 +4421,7 @@ file, unless the file is in the current directory. But with @command{gawk}, if the @value{FN} supplied to the @option{-f} or @option{-i} options does not contain a directory separator @samp{/}, then @command{gawk} searches a list of -directories (called the @dfn{search path}), one by one, looking for a +directories (called the @dfn{search path}) one by one, looking for a file with the specified name. The search path is a string consisting of directory names @@ -4469,9 +4462,9 @@ as an entry in the path or write a null entry in the path. Different past versions of @command{gawk} would also look explicitly in the current directory, either before or after the path search. As of -@value{PVERSION} 4.1.2, this no longer happens, and if you wish to look +@value{PVERSION} 4.1.2, this no longer happens; if you wish to look in the current directory, you must include @file{.} either as a separate -entry, or as a null entry in the search path. +entry or as a null entry in the search path. @end quotation The default value for @env{AWKPATH} is @@ -4587,7 +4580,7 @@ If this variable exists, @command{gawk} includes the @value{FN} and line number within the @command{gawk} source code from which warning and/or fatal messages are generated. Its purpose is to help isolate the source of a -message, as there are multiple places which produce the +message, as there are multiple places that produce the same warning or error message. @item GAWK_NO_DFA @@ -4603,16 +4596,16 @@ This specifies the amount by which @command{gawk} should grow its internal evaluation stack, when needed. @item INT_CHAIN_MAX -The intended maximum number of items @command{gawk} will maintain on a +This specifies intended maximum number of items @command{gawk} will maintain on a hash chain for managing arrays indexed by integers. @item STR_CHAIN_MAX -The intended maximum number of items @command{gawk} will maintain on a +This specifies intended maximum number of items @command{gawk} will maintain on a hash chain for managing arrays indexed by strings. @item TIDYMEM If this variable exists, @command{gawk} uses the @code{mtrace()} library -calls from GNU LIBC to help track down possible memory leaks. +calls from the GNU C library to help track down possible memory leaks. @end table @node Exit Status @@ -4649,7 +4642,7 @@ The @code{@@include} keyword can be used to read external @command{awk} source files. This gives you the ability to split large @command{awk} source files into smaller, more manageable pieces, and also lets you reuse common @command{awk} code from various @command{awk} scripts. In other words, you can group -together @command{awk} functions, used to carry out specific tasks, +together @command{awk} functions used to carry out specific tasks into external files. These files can be used just like function libraries, using the @code{@@include} keyword in conjunction with the @env{AWKPATH} environment variable. Note that source files may also be included @@ -4739,11 +4732,12 @@ of the @env{AWKPATH} variable in command-line file searches This is very helpful in constructing @command{gawk} function libraries. If you have a large script with useful, general-purpose @command{awk} functions, you can break it down into library files and put those files -in a special directory. You can then include those ``libraries,'' using -either the full pathnames of the files, or by setting the @env{AWKPATH} +in a special directory. You can then include those ``libraries,'' +either by using the full pathnames of the files, or by setting the @env{AWKPATH} environment variable accordingly and then using @code{@@include} with -just the file part of the full pathname. Of course, you can have more -than one directory to keep library files; the more complex the working +just the file part of the full pathname. Of course, +you can keep library files in more than one directory; +the more complex the working environment is, the more directories you may need to organize the files to be included. @@ -4756,8 +4750,8 @@ In particular, @code{@@include} is very useful for writing CGI scripts to be run from web pages. As mentioned in @ref{AWKPATH Variable}, the current directory is always -searched first for source files, before searching in @env{AWKPATH}, -and this also applies to files named with @code{@@include}. +searched first for source files, before searching in @env{AWKPATH}; +this also applies to files named with @code{@@include}. @node Loading Shared Libraries @section Loading Dynamic Extensions into Your Program @@ -4811,8 +4805,8 @@ It also describes the @code{ordchr} extension. @cindex features, deprecated @cindex obsolete features This @value{SECTION} describes features and/or command-line options from -previous releases of @command{gawk} that are either not available in the -current version or that are still supported but deprecated (meaning that +previous releases of @command{gawk} that either are not available in the +current version or are still supported but deprecated (meaning that they will @emph{not} be in the next release). The process-related special files @file{/dev/pid}, @file{/dev/ppid}, @@ -4909,7 +4903,7 @@ to run @command{awk}. @item The three standard options for all versions of @command{awk} are -@option{-f}, @option{-F} and @option{-v}. @command{gawk} supplies these +@option{-f}, @option{-F}, and @option{-v}. @command{gawk} supplies these and many others, as well as corresponding GNU-style long options. @item @@ -4946,13 +4940,12 @@ and @option{-f} command-line options. @item @command{gawk} allows you to load additional functions written in C or C++ using the @code{@@load} statement and/or the @option{-l} option. -(This advanced feature is described later on in @ref{Dynamic Extensions}.) +(This advanced feature is described later, in @ref{Dynamic Extensions}.) @end itemize @node Regexp @chapter Regular Expressions @cindex regexp -@c STARTOFRANGE regexp @cindex regular expressions A @dfn{regular expression}, or @dfn{regexp}, is a way of describing a @@ -5159,7 +5152,7 @@ Horizontal TAB, @kbd{Ctrl-i}, ASCII code 9 (HT). @cindex @code{\} (backslash), @code{\v} escape sequence @cindex backslash (@code{\}), @code{\v} escape sequence @item \v -Vertical tab, @kbd{Ctrl-k}, ASCII code 11 (VT). +Vertical TAB, @kbd{Ctrl-k}, ASCII code 11 (VT). @cindex @code{\} (backslash), @code{\}@var{nnn} escape sequence @cindex backslash (@code{\}), @code{\}@var{nnn} escape sequence @@ -5234,7 +5227,7 @@ characters @samp{a+b}. @cindex @code{\} (backslash), in escape sequences @cindex portability For complete portability, do not use a backslash before any character not -shown in the previous list and that is not an operator. +shown in the previous list or that is not an operator. @c 11/2014: Moved so as to not stack sidebars @cindex sidebar, Backslash Before Regular Characters @@ -5396,7 +5389,6 @@ escape sequences literally when used in regexp constants. Thus, @node Regexp Operators @section Regular Expression Operators -@c STARTOFRANGE regexpo @cindex regular expressions, operators @cindex metacharacters in regular expressions @@ -5414,7 +5406,7 @@ are recognized and converted into corresponding real characters as the very first step in processing regexps. Here is a list of metacharacters. All characters that are not escape -sequences and that are not listed in the following stand for themselves: +sequences and that are not listed here stand for themselves: @c Use @asis so the docbook comes out ok. Sigh. @table @asis @@ -5537,7 +5529,7 @@ just @samp{p} if no @samp{h}s are present. There are two subtle points to understand about how @samp{*} works. First, the @samp{*} applies only to the single preceding regular expression component (e.g., in @samp{ph*}, it applies just to the @samp{h}). -To cause @samp{*} to apply to a larger sub-expression, use parentheses: +To cause @samp{*} to apply to a larger subexpression, use parentheses: @samp{(ph)*} matches @samp{ph}, @samp{phph}, @samp{phphph}, and so on. Second, @samp{*} finds as many repetitions as possible. If the text @@ -5576,10 +5568,10 @@ is repeated at least @var{n} times: Matches @samp{whhhy}, but not @samp{why} or @samp{whhhhy}. @item wh@{3,5@}y -Matches @samp{whhhy}, @samp{whhhhy}, or @samp{whhhhhy}, only. +Matches @samp{whhhy}, @samp{whhhhy}, or @samp{whhhhhy} only. @item wh@{2,@}y -Matches @samp{whhy} or @samp{whhhy}, and so on. +Matches @samp{whhy}, @samp{whhhy}, and so on. @end table @cindex POSIX @command{awk}, interval expressions in @@ -5628,11 +5620,9 @@ usage as a syntax error. If @command{gawk} is in compatibility mode (@pxref{Options}), interval expressions are not available in regular expressions. -@c ENDOFRANGE regexpo @node Bracket Expressions @section Using Bracket Expressions -@c STARTOFRANGE charlist @cindex bracket expressions @cindex bracket expressions, range expressions @cindex range expressions (regexps) @@ -5708,7 +5698,7 @@ POSIX standard. (a space is printable but not visible, whereas an @samp{a} is both) @item @code{[:lower:]} @tab Lowercase alphabetic characters @item @code{[:print:]} @tab Printable characters (characters that are not control characters) -@item @code{[:punct:]} @tab Punctuation characters (characters that are not letters, digits +@item @code{[:punct:]} @tab Punctuation characters (characters that are not letters, digits, control characters, or space characters) @item @code{[:space:]} @tab Space characters (such as space, TAB, and formfeed, to name a few) @item @code{[:upper:]} @tab Uppercase alphabetic characters @@ -5776,7 +5766,6 @@ expression matching currently recognize only POSIX character classes; they do not recognize collating symbols or equivalence classes. @end quotation @c maybe one day ... -@c ENDOFRANGE charlist @node Leftmost Longest @section How Much Text Matches? @@ -5820,9 +5809,7 @@ and also @pxref{Field Separators}). @node Computed Regexps @section Using Dynamic Regexps -@c STARTOFRANGE dregexp @cindex regular expressions, computed -@c STARTOFRANGE regexpd @cindex regular expressions, dynamic @cindex @code{~} (tilde), @code{~} operator @cindex tilde (@code{~}), @code{~} operator @@ -5973,17 +5960,13 @@ $ @kbd{awk '$0 ~ /[ \t\n]/'} occur often in practice, but it's worth noting for future reference. @end cartouche @end ifnotdocbook -@c ENDOFRANGE dregexp -@c ENDOFRANGE regexpd @node GNU Regexp Operators @section @command{gawk}-Specific Regexp Operators @c This section adapted (long ago) from the regex-0.12 manual -@c STARTOFRANGE regexpg @cindex regular expressions, operators, @command{gawk} -@c STARTOFRANGE gregexp @cindex @command{gawk}, regular expressions, operators @cindex operators, GNU-specific @cindex regular expressions, operators, for words @@ -6148,15 +6131,11 @@ Allow interval expressions in regexps, if @option{--traditional} has been provided. Otherwise, interval expressions are available by default. @end table -@c ENDOFRANGE gregexp -@c ENDOFRANGE regexpg @node Case-sensitivity @section Case Sensitivity in Matching -@c STARTOFRANGE regexpcs @cindex regular expressions, case sensitivity -@c STARTOFRANGE csregexp @cindex case sensitivity, regexps and Case is normally significant in regular expressions, both when matching ordinary characters (i.e., not metacharacters) and inside bracket @@ -6248,8 +6227,6 @@ the right thing.} The value of @code{IGNORECASE} has no effect if @command{gawk} is in compatibility mode (@pxref{Options}). Case is always significant in compatibility mode. -@c ENDOFRANGE csregexp -@c ENDOFRANGE regexpcs @node Regexp Summary @section Summary @@ -6296,12 +6273,10 @@ versions, use @code{tolower()} or @code{toupper()}. @end itemize -@c ENDOFRANGE regexp @node Reading Files @chapter Reading Input Files -@c STARTOFRANGE infir @cindex reading input files @cindex input files, reading @cindex input files @@ -6352,9 +6327,7 @@ used with it do not have to be named on the @command{awk} command line @node Records @section How Input Is Split into Records -@c STARTOFRANGE inspl @cindex input, splitting into records -@c STARTOFRANGE recspl @cindex records, splitting input into @cindex @code{NR} variable @cindex @code{FNR} variable @@ -6711,8 +6684,6 @@ whole files. If you are using @command{gawk}, see @DBREF{Extension Sample Readfile} for another option. @end cartouche @end ifnotdocbook -@c ENDOFRANGE inspl -@c ENDOFRANGE recspl @node Fields @section Examining Fields @@ -6720,7 +6691,6 @@ Readfile} for another option. @cindex examining fields @cindex fields @cindex accessing fields -@c STARTOFRANGE fiex @cindex fields, examining @cindex POSIX @command{awk}, field separators and @cindex field separators, POSIX and @@ -6801,7 +6771,6 @@ $ @kbd{awk '/li/ @{ print $1, $NF @}' mail-list} @print{} Julie F @print{} Samuel A @end example -@c ENDOFRANGE fiex @node Nonconstant Fields @section Nonconstant Field Numbers @@ -6862,7 +6831,6 @@ evaluating @code{NF} and using its value as a field number. @node Changing Fields @section Changing the Contents of a Field -@c STARTOFRANGE ficon @cindex fields, changing contents of The contents of a field, as seen by @command{awk}, can be changed within an @command{awk} program; this changes what @command{awk} perceives as the @@ -7085,7 +7053,6 @@ with a statement such as @samp{$1 = $1}, as described earlier. @end cartouche @end ifnotdocbook -@c ENDOFRANGE ficon @node Field Separators @section Specifying How Fields Are Separated @@ -7101,9 +7068,7 @@ with a statement such as @samp{$1 = $1}, as described earlier. @cindex @code{FS} variable @cindex fields, separating -@c STARTOFRANGE fisepr @cindex field separators -@c STARTOFRANGE fisepg @cindex fields, separating The @dfn{field separator}, which is either a single character or a regular expression, controls the way @command{awk} splits an input record into fields. @@ -7203,9 +7168,7 @@ rules. @node Regexp Field Splitting @subsection Using Regular Expressions to Separate Fields -@c STARTOFRANGE regexpfs @cindex regular expressions, as field separators -@c STARTOFRANGE fsregexp @cindex field separators, regular expressions as The previous @value{SUBSECTION} discussed the use of single characters or simple strings as the @@ -7309,8 +7272,6 @@ $ @kbd{echo 'xxAA xxBxx C' |} @print{} -->xxBxx<-- @print{} -->C<-- @end example -@c ENDOFRANGE regexpfs -@c ENDOFRANGE fsregexp @node Single Character Fields @subsection Making Each Character a Separate Field @@ -7666,8 +7627,6 @@ will take effect. @end cartouche @end ifnotdocbook -@c ENDOFRANGE fisepr -@c ENDOFRANGE fisepg @node Constant Size @section Reading Fixed-Width Data @@ -7931,11 +7890,8 @@ last assigned to. @section Multiple-Line Records @cindex multiple-line records -@c STARTOFRANGE recm @cindex records, multiline -@c STARTOFRANGE imr @cindex input, multiline records -@c STARTOFRANGE frm @cindex files, reading, multiline records @cindex input, files, See input files In some databases, a single line cannot conveniently hold all the @@ -8102,16 +8058,11 @@ If not in compatibility mode (@pxref{Options}), @command{gawk} sets @code{RT} to the input text that matched the value specified by @code{RS}. But if the input file ended without any text that matches @code{RS}, then @command{gawk} sets @code{RT} to the null string. -@c ENDOFRANGE recm -@c ENDOFRANGE imr -@c ENDOFRANGE frm @node Getline @section Explicit Input with @code{getline} -@c STARTOFRANGE getl @cindex @code{getline} command, explicit input with -@c STARTOFRANGE inex @cindex input, explicit So far we have been getting our input data from @command{awk}'s main input stream---either the standard input (usually your keyboard, sometimes @@ -8701,9 +8652,6 @@ Note: for each variant, @command{gawk} sets the @code{RT} predefined variable. @item @var{command} @code{|& getline} @var{var} @tab Sets @var{var} and @code{RT} @tab @command{gawk} @end multitable @end float -@c ENDOFRANGE getl -@c ENDOFRANGE inex -@c ENDOFRANGE infir @node Read Timeout @section Reading Input with a Timeout @@ -8938,7 +8886,6 @@ That can be fixed by making one simple change. What is it? @node Printing @chapter Printing Output -@c STARTOFRANGE prnt @cindex printing @cindex output, printing, See printing One of the most common programming actions is to @dfn{print}, or output, @@ -8954,7 +8901,6 @@ columns, whether to use exponential notation or not, and so on. For printing with specifications, you need the @code{printf} statement (@pxref{Printf}). -@c STARTOFRANGE prnts @cindex @code{print} statement @cindex @code{printf} statement Besides basic and formatted printing, this @value{CHAPTER} @@ -9134,7 +9080,6 @@ You can continue either a @code{print} or @code{printf} statement simply by putting a newline after any comma (@pxref{Statements/Lines}). @end quotation -@c ENDOFRANGE prnts @node Output Separators @section Output Separators @@ -9247,7 +9192,6 @@ if @code{OFMT} contains anything but a floating-point conversion specification. @node Printf @section Using @code{printf} Statements for Fancier Printing -@c STARTOFRANGE printfs @cindex @code{printf} statement @cindex output, formatted @cindex formatting output @@ -9445,7 +9389,6 @@ values or do something else entirely. @node Format Modifiers @subsection Modifiers for @code{printf} Formats -@c STARTOFRANGE pfm @cindex @code{printf} statement, modifiers @cindex modifiers@comma{} in format specifiers A format specification can also include @dfn{modifiers} that can control @@ -9651,7 +9594,6 @@ format strings. These are not valid in @command{awk}. Most @command{awk} implementations silently ignore them. If @option{--lint} is provided on the command line (@pxref{Options}), @command{gawk} warns about their use. If @option{--posix} is supplied, their use is a fatal error. -@c ENDOFRANGE pfm @node Printf Examples @subsection Examples Using @code{printf} @@ -9732,14 +9674,11 @@ awk 'BEGIN @{ format = "%-10s %s\n" @{ printf format, $1, $2 @}' mail-list @end example -@c ENDOFRANGE printfs @node Redirection @section Redirecting Output of @code{print} and @code{printf} -@c STARTOFRANGE outre @cindex output redirection -@c STARTOFRANGE reout @cindex redirection of output @cindex @option{--sandbox} option, output redirection with @code{print}, @code{printf} So far, the output from @code{print} and @code{printf} has gone @@ -9997,8 +9936,6 @@ It then sends the list to the shell for execution. command lines to be fed to the shell. @end cartouche @end ifnotdocbook -@c ENDOFRANGE outre -@c ENDOFRANGE reout @node Special FD @section Special Files for Standard Pre-Opened Data Streams @@ -10108,7 +10045,6 @@ invoked with the @option{--traditional} option (@pxref{Options}). @node Special Files @section Special @value{FFN}s in @command{gawk} -@c STARTOFRANGE gfn @cindex @command{gawk}, file names in Besides access to standard input, standard output, and standard error, @@ -10199,18 +10135,13 @@ the time this does not matter; however, it is important to @emph{not} close any of the files related to file descriptors 0, 1, and 2. Doing so results in unpredictable behavior. @end itemize -@c ENDOFRANGE gfn @node Close Files And Pipes @section Closing Input and Output Redirections @cindex files, output, See output files -@c STARTOFRANGE ifc @cindex input files, closing -@c STARTOFRANGE ofc @cindex output, files@comma{} closing -@c STARTOFRANGE pc @cindex pipe, closing -@c STARTOFRANGE cc @cindex coprocesses, closing @cindex @code{getline} command, coprocesses@comma{} using from @@ -10487,10 +10418,6 @@ when closing a pipe. @end cartouche @end ifnotdocbook -@c ENDOFRANGE ifc -@c ENDOFRANGE ofc -@c ENDOFRANGE pc -@c ENDOFRANGE cc @node Output Summary @section Summary @@ -10554,11 +10481,9 @@ BEGIN @{ print "Serious error detected!" > /dev/stderr @} @end enumerate @c EXCLUDE END -@c ENDOFRANGE prnt @node Expressions @chapter Expressions -@c STARTOFRANGE exps @cindex expressions Expressions are the basic building blocks of @command{awk} patterns @@ -10601,7 +10526,6 @@ which provide the values used in expressions. @node Constants @subsection Constant Expressions -@c STARTOFRANGE cnst @cindex constants, types of The simplest type of expression is the @dfn{constant}, which always has @@ -10787,7 +10711,6 @@ $ @kbd{gawk 'BEGIN @{ printf "0x11 is <%s>\n", 0x11 @}'} @node Regexp Constants @subsubsection Regular Expression Constants -@c STARTOFRANGE rec @cindex regexp constants @cindex @code{~} (tilde), @code{~} operator @cindex tilde (@code{~}), @code{~} operator @@ -10799,7 +10722,6 @@ slashes, such as @code{@w{/^beginning and end$/}}. Most regexps used in matching operators can also match computed or dynamic regexps (which are typically just ordinary strings or variables that contain a regexp, but could be a more complex expression). -@c ENDOFRANGE cnst @node Using Constant Regexps @subsection Using Regular Expression Constants @@ -10910,7 +10832,6 @@ or not @code{$0} matches @code{/hi/}. @command{gawk} issues a warning when it sees a regexp constant used as a parameter to a user-defined function, because passing a truth value in this way is probably not what was intended. -@c ENDOFRANGE rec @node Variables @subsection Variables @@ -11505,11 +11426,8 @@ you're never quite sure what you'll get. @node Assignment Ops @subsection Assignment Expressions -@c STARTOFRANGE asop @cindex assignment operators -@c STARTOFRANGE opas @cindex operators, assignment -@c STARTOFRANGE exas @cindex expressions, assignment @cindex @code{=} (equals sign), @code{=} operator @cindex equals sign (@code{=}), @code{=} operator @@ -11815,16 +11733,11 @@ awk '/[=]=/' /dev/null and @command{mawk} also do not. @end cartouche @end ifnotdocbook -@c ENDOFRANGE exas -@c ENDOFRANGE opas -@c ENDOFRANGE asop @node Increment Ops @subsection Increment and Decrement Operators -@c STARTOFRANGE inop @cindex increment operators -@c STARTOFRANGE opde @cindex operators, decrement/increment @dfn{Increment} and @dfn{decrement operators} increase or decrease the value of a variable by one. An assignment operator can do the same thing, so @@ -11872,7 +11785,6 @@ just like variables. (Use @samp{$(i++)} when you want to do a field reference and a variable increment at the same time. The parentheses are necessary because of the precedence of the field reference operator @samp{$}.) -@c STARTOFRANGE deop @cindex decrement operators The decrement operator @samp{--} works just like @samp{++}, except that it subtracts one instead of adding it. As with @samp{++}, it can be used before @@ -12006,9 +11918,6 @@ You should avoid such things in your own programs. @c in the mirror in the morning. @end cartouche @end ifnotdocbook -@c ENDOFRANGE inop -@c ENDOFRANGE opde -@c ENDOFRANGE deop @node Truth Values and Conditions @section Truth Values and Conditions @@ -12073,17 +11982,13 @@ the string constant @code{"0"} is actually true, because it is non-null. @author Douglas Adams, @cite{The Hitchhiker's Guide to the Galaxy} @end quotation -@c STARTOFRANGE comex @cindex comparison expressions -@c STARTOFRANGE excom @cindex expressions, comparison @cindex expressions, matching, See comparison expressions @cindex matching, expressions, See comparison expressions @cindex relational operators, See comparison operators @cindex operators, relational, See operators@comma{} comparison -@c STARTOFRANGE varting @cindex variable typing -@c STARTOFRANGE vartypc @cindex variables, types of, comparison expressions and Unlike other programming languages, @command{awk} variables do not have a fixed type. Instead, they can be either a number or a string, depending @@ -12483,19 +12388,13 @@ $ @kbd{gawk --posix 'BEGIN @{ printf("ABC < abc = %s\n",} @print{} ABC < abc = FALSE @end example -@c ENDOFRANGE comex -@c ENDOFRANGE excom -@c ENDOFRANGE vartypc -@c ENDOFRANGE varting @node Boolean Ops @subsection Boolean Expressions @cindex and Boolean-logic operator @cindex or Boolean-logic operator @cindex not Boolean-logic operator -@c STARTOFRANGE exbo @cindex expressions, Boolean -@c STARTOFRANGE boex @cindex Boolean expressions @cindex operators, Boolean, See Boolean expressions @cindex Boolean operators, See Boolean expressions @@ -12641,8 +12540,6 @@ next record, and start processing the rules over again at the top. The reason it's there is to avoid printing the bracketing @samp{START} and @samp{END} lines. @end quotation -@c ENDOFRANGE exbo -@c ENDOFRANGE boex @node Conditional Exp @subsection Conditional Expressions @@ -12821,9 +12718,7 @@ $ @kbd{awk -f matchit.awk} @node Precedence @section Operator Precedence (How Operators Nest) -@c STARTOFRANGE prec @cindex precedence -@c STARTOFRANGE oppr @cindex operators, precedence @dfn{Operator precedence} determines how operators are grouped when @@ -13008,8 +12903,6 @@ Assignment. These operators group right-to-left. The @samp{|&}, @samp{**}, and @samp{**=} operators are not specified by POSIX. For maximum portability, do not use them. @end quotation -@c ENDOFRANGE prec -@c ENDOFRANGE oppr @node Locales @section Where You Are Makes a Difference @@ -13113,11 +13006,9 @@ program, and occasionally the format for data read as input. @end itemize -@c ENDOFRANGE exps @node Patterns and Actions @chapter Patterns, Actions, and Variables -@c STARTOFRANGE pat @cindex patterns As you have already seen, each @command{awk} statement consists of @@ -13413,9 +13304,7 @@ a range pattern. @value{DARKCORNER} @node BEGIN/END @subsection The @code{BEGIN} and @code{END} Special Patterns -@c STARTOFRANGE beg @cindex @code{BEGIN} pattern -@c STARTOFRANGE end @cindex @code{END} pattern All the patterns described so far are for matching input records. The @code{BEGIN} and @code{END} special patterns are different. @@ -13553,8 +13442,6 @@ are not valid in an @code{END} rule, because all the input has been read. @ifdocbook @DBREF{Nextfile Statement}.) @end ifdocbook -@c ENDOFRANGE beg -@c ENDOFRANGE end @node BEGINFILE/ENDFILE @subsection The @code{BEGINFILE} and @code{ENDFILE} Special Patterns @@ -13675,7 +13562,6 @@ awk '@{ print $1 @}' mail-list @noindent prints the first field of every record. -@c ENDOFRANGE pat @node Using Shell Variables @section Using Shell Variables in Programs @@ -13824,11 +13710,8 @@ For deleting array elements. @node Statements @section Control Statements in Actions -@c STARTOFRANGE csta @cindex control statements -@c STARTOFRANGE acs @cindex statements, control, in actions -@c STARTOFRANGE accs @cindex actions, control statements in @dfn{Control statements}, such as @code{if}, @code{while}, and so on, @@ -14546,15 +14429,10 @@ Negative values, and values of 127 or greater, may not produce consistent results across different operating systems. @end quotation -@c ENDOFRANGE csta -@c ENDOFRANGE acs -@c ENDOFRANGE accs @node Built-in Variables @section Predefined Variables -@c STARTOFRANGE bvar @cindex predefined variables -@c STARTOFRANGE varb @cindex variables, predefined Most @command{awk} variables are available to use for your own @@ -14581,9 +14459,7 @@ their areas of activity. @node User-modified @subsection Built-In Variables That Control @command{awk} -@c STARTOFRANGE bvaru @cindex predefined variables, user-modifiable -@c STARTOFRANGE nmbv @cindex user-modifiable variables The following is an alphabetical list of variables that you can change to @@ -14810,17 +14686,11 @@ marked string constants in the source text, as well as for the (@pxref{Internationalization}). The default value of @code{TEXTDOMAIN} is @code{"messages"}. @end table -@c ENDOFRANGE bvar -@c ENDOFRANGE varb -@c ENDOFRANGE bvaru -@c ENDOFRANGE nmbv @node Auto-set @subsection Built-In Variables That Convey Information -@c STARTOFRANGE bvconi @cindex predefined variables, conveying information -@c STARTOFRANGE vbconi @cindex variables, predefined conveying information The following is an alphabetical list of variables that @command{awk} sets automatically on certain occasions in order to provide @@ -15242,8 +15112,6 @@ implementation issues.} neither @code{FUNCTAB} nor @code{SYMTAB} are available as elements within the @code{SYMTAB} array. @end quotation @end table -@c ENDOFRANGE bvconi -@c ENDOFRANGE vbconi @cindex sidebar, Changing @code{NR} and @code{FNR} @ifdocbook @@ -15536,7 +15404,6 @@ control how @command{awk} will process the provided @value{DF}s. @node Arrays @chapter Arrays in @command{awk} -@c STARTOFRANGE arrs @cindex arrays An @dfn{array} is a table of values called @dfn{elements}. The @@ -15658,9 +15525,7 @@ Only the values are stored; the indices are implicit from the order of the values. Here, 8 is the value at index zero, because 8 appears in the position with zero elements before it. -@c STARTOFRANGE arrin @cindex arrays, indexing -@c STARTOFRANGE inarr @cindex indexing arrays @cindex associative arrays @cindex arrays, associative @@ -15863,8 +15728,6 @@ that array's indices are consecutive integers starting at one. @command{awk}'s arrays are efficient---the time to access an element is independent of the number of elements in the array. -@c ENDOFRANGE arrin -@c ENDOFRANGE inarr @node Reference to Elements @subsection Referring to an Array Element @@ -16917,14 +16780,11 @@ element is itself a subarray. @end itemize -@c ENDOFRANGE arrs @node Functions @chapter Functions -@c STARTOFRANGE funcbi @cindex functions, built-in -@c STARTOFRANGE bifunc @cindex built-in functions This @value{CHAPTER} describes @command{awk}'s built-in functions, which fall into three categories: numeric, string, and I/O. @@ -18648,13 +18508,9 @@ you would see the latter (undesirable) output. @subsection Time Functions @cindex time functions -@c STARTOFRANGE tst @cindex timestamps -@c STARTOFRANGE logftst @cindex log files, timestamps in -@c STARTOFRANGE filogtst @cindex files, log@comma{} timestamps in -@c STARTOFRANGE gawtst @cindex @command{gawk}, timestamps @cindex POSIX @command{awk}, timestamps and @code{awk} programs are commonly used to process log files @@ -18732,7 +18588,6 @@ is out of range, @code{mktime()} returns @minus{}1. @cindex @command{gawk}, @code{PROCINFO} array in @cindex @code{PROCINFO} array @item @code{strftime(}[@var{format} [@code{,} @var{timestamp} [@code{,} @var{utc-flag}] ] ]@code{)} -@c STARTOFRANGE strf @cindexgawkfunc{strftime} @cindex format time string Format the time specified by @var{timestamp} @@ -18981,7 +18836,6 @@ The time as a decimal timestamp in seconds since the epoch. The date in VMS format (e.g., @samp{20-JUN-1991}). @end ignore @end table -@c ENDOFRANGE strf Additionally, the alternative representations are recognized but their normal representations are used. @@ -19032,23 +18886,14 @@ gawk 'BEGIN @{ exit exitval @}' "$@@" @end example -@c ENDOFRANGE tst -@c ENDOFRANGE logftst -@c ENDOFRANGE filogtst -@c ENDOFRANGE gawtst @node Bitwise Functions @subsection Bit-Manipulation Functions @cindex bit-manipulation functions -@c STARTOFRANGE bit @cindex bitwise, operations -@c STARTOFRANGE and @cindex AND bitwise operation -@c STARTOFRANGE oro @cindex OR bitwise operation -@c STARTOFRANGE xor @cindex XOR bitwise operation -@c STARTOFRANGE opbit @cindex operations, bitwise @quotation @i{I can explain it for you, but I can't understand it for you.} @@ -19340,11 +19185,6 @@ decimal and octal values for the same numbers (@pxref{Nondecimal-numbers}), and then demonstrates the results of the @code{compl()}, @code{lshift()}, and @code{rshift()} functions. -@c ENDOFRANGE bit -@c ENDOFRANGE and -@c ENDOFRANGE oro -@c ENDOFRANGE xor -@c ENDOFRANGE opbit @node Type Functions @subsection Getting Type Information @@ -19424,15 +19264,11 @@ variant of the same message. The default value for @var{domain} is the current value of @code{TEXTDOMAIN}. The default value for @var{category} is @code{"LC_MESSAGES"}. @end table -@c ENDOFRANGE funcbi -@c ENDOFRANGE bifunc @node User-defined @section User-Defined Functions -@c STARTOFRANGE udfunc @cindex user-defined functions -@c STARTOFRANGE funcud @cindex functions, user-defined Complicated @command{awk} programs can often be simplified by defining your own functions. User-defined functions can be called just like @@ -19457,7 +19293,6 @@ variable definitions is appallingly awful.} @author Brian Kernighan @end quotation -@c STARTOFRANGE fdef @cindex functions, defining Definitions of functions can appear anywhere between the rules of an @command{awk} program. Thus, the general form of an @command{awk} program is @@ -19704,12 +19539,10 @@ You might think that @code{ctime()} could use @code{PROCINFO["strftime"]} for its format string. That would be a mistake, because @code{ctime()} is supposed to return the time formatted in a standard fashion, and user-level code could have changed @code{PROCINFO["strftime"]}. -@c ENDOFRANGE fdef @node Function Caveats @subsection Calling User-Defined Functions -@c STARTOFRANGE fudc @cindex functions, user-defined, calling @dfn{Calling a function} means causing the function to run and do its job. A function call is an expression and its value is the value returned by @@ -20001,7 +19834,6 @@ or the @code{nextfile} statement @end ifnotdocbook inside a user-defined function. @command{gawk} does not have this limitation. -@c ENDOFRANGE fudc @node Return Statement @subsection The @code{return} Statement @@ -20129,7 +19961,6 @@ does report the second error. Usually, such things aren't a big issue, but it's worth being aware of them. -@c ENDOFRANGE udfunc @node Indirect Calls @section Indirect Function Calls @@ -20622,7 +20453,6 @@ program. This is equivalent to function pointers in C and C++. @end itemize -@c ENDOFRANGE funcud @ifnotinfo @part @value{PART2}Problem Solving with @command{awk} @@ -20644,11 +20474,8 @@ It contains the following chapters: @node Library Functions @chapter A Library of @command{awk} Functions -@c STARTOFRANGE libf @cindex libraries of @command{awk} functions -@c STARTOFRANGE flib @cindex functions, library -@c STARTOFRANGE fudlib @cindex functions, user-defined, library of @DBREF{User-defined} describes how to write @@ -20971,13 +20798,9 @@ be tested with @command{gawk} and the results compared to the built-in @node Assert Function @subsection Assertions -@c STARTOFRANGE asse @cindex assertions -@c STARTOFRANGE assef @cindex @code{assert()} function (C library) -@c STARTOFRANGE libfass @cindex libraries of @command{awk} functions, assertions -@c STARTOFRANGE flibass @cindex functions, library, assertions @cindex @command{awk} programs, lengthy, assertions When writing large programs, it is often useful to know @@ -21093,10 +20916,6 @@ most likely causing the program to hang as it waits for input. There is a simple workaround to this: make sure that such a @code{BEGIN} rule always ends with an @code{exit} statement. -@c ENDOFRANGE asse -@c ENDOFRANGE assef -@c ENDOFRANGE flibass -@c ENDOFRANGE libfass @node Round Function @subsection Rounding Numbers @@ -21654,11 +21473,8 @@ function shell_quote(s, # parameter @node Data File Management @section @value{DDF} Management -@c STARTOFRANGE dataf @cindex files, managing -@c STARTOFRANGE libfdataf @cindex libraries of @command{awk} functions, managing, data files -@c STARTOFRANGE flibdataf @cindex functions, library, managing data files This @value{SECTION} presents functions that are useful for managing command-line @value{DF}s. @@ -22050,22 +21866,14 @@ The use of @code{No_command_assign} allows you to disable command-line assignments at invocation time, by giving the variable a true value. When not set, it is initially zero (i.e., false), so the command-line arguments are left alone. -@c ENDOFRANGE dataf -@c ENDOFRANGE flibdataf -@c ENDOFRANGE libfdataf @node Getopt Function @section Processing Command-Line Options -@c STARTOFRANGE libfclo @cindex libraries of @command{awk} functions, command-line options -@c STARTOFRANGE flibclo @cindex functions, library, command-line options -@c STARTOFRANGE clop @cindex command-line options, processing -@c STARTOFRANGE oclp @cindex options, command-line, processing -@c STARTOFRANGE clibf @cindex functions, library, C library @cindex arguments, processing Most utilities on POSIX-compatible systems take options on @@ -22417,21 +22225,13 @@ further options Several of the sample programs presented in @ref{Sample Programs}, use @code{getopt()} to process their arguments. -@c ENDOFRANGE libfclo -@c ENDOFRANGE flibclo -@c ENDOFRANGE clop -@c ENDOFRANGE oclp @node Passwd Functions @section Reading the User Database -@c STARTOFRANGE libfudata @cindex libraries of @command{awk} functions, user database, reading -@c STARTOFRANGE flibudata @cindex functions, library, user database@comma{} reading -@c STARTOFRANGE udatar @cindex user database@comma{} reading -@c STARTOFRANGE dataur @cindex database, users@comma{} reading @cindex @code{PROCINFO} array The @code{PROCINFO} array @@ -22778,21 +22578,13 @@ and such a change would clutter up the code. The @command{id} program in @DBREF{Id Program} uses these functions. -@c ENDOFRANGE libfudata -@c ENDOFRANGE flibudata -@c ENDOFRANGE udatar -@c ENDOFRANGE dataur @node Group Functions @section Reading the Group Database -@c STARTOFRANGE libfgdata @cindex libraries of @command{awk} functions, group database, reading -@c STARTOFRANGE flibgdata @cindex functions, library, group database@comma{} reading -@c STARTOFRANGE gdatar @cindex group database, reading -@c STARTOFRANGE datagr @cindex database, group, reading @cindex @code{PROCINFO} array, and group membership @cindex @code{getgrent()} function (C library) @@ -23115,7 +22907,6 @@ function getgrent() @} @c endfile @end example -@c ENDOFRANGE clibf @cindex @code{endgrent()} function (C library) The @code{endgrent()} function resets @code{_gr_count} to zero so that @code{getgrent()} can @@ -23204,10 +22995,6 @@ $ @kbd{gawk -f walk_array.awk} @print{} a[4][2] = 42 @end example -@c ENDOFRANGE libfgdata -@c ENDOFRANGE flibgdata -@c ENDOFRANGE gdatar -@c ENDOFRANGE libf @node Library Functions Summary @section Summary @@ -23321,13 +23108,9 @@ output identical to that of the original version. @end enumerate @c EXCLUDE END -@c ENDOFRANGE flib -@c ENDOFRANGE fudlib -@c ENDOFRANGE datagr @node Sample Programs @chapter Practical @command{awk} Programs -@c STARTOFRANGE awkpex @cindex @command{awk} programs, examples of @c FULLXREF ON @@ -23397,7 +23180,6 @@ cut.awk -- -c1-8 myfiles > results @node Clones @section Reinventing Wheels for Fun and Profit -@c STARTOFRANGE posimawk @cindex POSIX, programs@comma{} implementing in @command{awk} This @value{SECTION} presents a number of POSIX utilities implemented in @@ -23428,11 +23210,8 @@ The programs are presented in alphabetical order. @subsection Cutting Out Fields and Columns @cindex @command{cut} utility -@c STARTOFRANGE cut @cindex @command{cut} utility -@c STARTOFRANGE ficut @cindex fields, cutting -@c STARTOFRANGE colcut @cindex columns, cutting The @command{cut} utility selects, or ``cuts,'' characters or fields from its standard input and sends them to its standard output. @@ -23740,21 +23519,14 @@ other @command{awk} implementations to use @code{substr()} it is also extremely painful. The @code{FIELDWIDTHS} variable supplies an elegant solution to the problem of picking the input line apart by characters. -@c ENDOFRANGE cut -@c ENDOFRANGE ficut -@c ENDOFRANGE colcut @node Egrep Program @subsection Searching for Regular Expressions in Files -@c STARTOFRANGE regexps @cindex regular expressions, searching for -@c STARTOFRANGE sfregexp @cindex searching, files for regular expressions -@c STARTOFRANGE fsregexp @cindex files, searching for regular expressions -@c STARTOFRANGE egrep @cindex @command{egrep} utility The @command{egrep} utility searches files for patterns. It uses regular expressions that are almost identical to those available in @command{awk} @@ -24022,17 +23794,12 @@ function usage() @c endfile @end example -@c ENDOFRANGE regexps -@c ENDOFRANGE sfregexp -@c ENDOFRANGE fsregexp -@c ENDOFRANGE egrep @node Id Program @subsection Printing Out User Information @cindex printing, user information @cindex users, information about, printing -@c STARTOFRANGE id @cindex @command{id} utility The @command{id} utility lists a user's real and effective user ID numbers, real and effective group ID numbers, and the user's group set, if any. @@ -24161,16 +23928,13 @@ code that is used repeatedly, making the whole program shorter and cleaner. In particular, moving the check for the empty string into this function saves several lines of code. -@c ENDOFRANGE id @node Split Program @subsection Splitting a Large File into Pieces @c FIXME: One day, update to current POSIX version of split -@c STARTOFRANGE filspl @cindex files, splitting -@c STARTOFRANGE split @cindex @code{split} utility The @command{split} program splits large text files into smaller pieces. Usage is as follows:@footnote{This is the traditional usage. The @@ -24305,15 +24069,12 @@ You might want to consider how to eliminate the use of way as to solve the EBCDIC issue as well. @end ifset -@c ENDOFRANGE filspl -@c ENDOFRANGE split @node Tee Program @subsection Duplicating Output into Multiple Files @cindex files, multiple@comma{} duplicating output into @cindex output, duplicating into files -@c STARTOFRANGE tee @cindex @code{tee} utility The @code{tee} program is known as a ``pipe fitting.'' @code{tee} copies its standard input to its standard output and also duplicates it to the @@ -24426,18 +24187,14 @@ END @{ @} @c endfile @end example -@c ENDOFRANGE tee @node Uniq Program @subsection Printing Nonduplicated Lines of Text @c FIXME: One day, update to current POSIX version of uniq -@c STARTOFRANGE prunt @cindex printing, unduplicated lines of text -@c STARTOFRANGE tpul @cindex text@comma{} printing, unduplicated lines of -@c STARTOFRANGE uniq @cindex @command{uniq} utility The @command{uniq} utility reads sorted lines of data on its standard input, and by default removes duplicate lines. In other words, it only @@ -24706,26 +24463,17 @@ suggestion. @end ifset -@c ENDOFRANGE prunt -@c ENDOFRANGE tpul -@c ENDOFRANGE uniq @node Wc Program @subsection Counting Things @c FIXME: One day, update to current POSIX version of wc -@c STARTOFRANGE count @cindex counting -@c STARTOFRANGE infco @cindex input files, counting elements in -@c STARTOFRANGE woco @cindex words, counting -@c STARTOFRANGE chco @cindex characters, counting -@c STARTOFRANGE lico @cindex lines, counting -@c STARTOFRANGE wc @cindex @command{wc} utility The @command{wc} (word count) utility counts lines, words, and characters in one or more input files. Its usage is as follows: @@ -24895,13 +24643,6 @@ END @{ @} @c endfile @end example -@c ENDOFRANGE count -@c ENDOFRANGE infco -@c ENDOFRANGE lico -@c ENDOFRANGE woco -@c ENDOFRANGE chco -@c ENDOFRANGE wc -@c ENDOFRANGE posimawk @node Miscellaneous Programs @section A Grab Bag of @command{awk} Programs @@ -25032,9 +24773,7 @@ Aharon Robbins <arnold@skeeve.com> wrote: @author Erik Quanstrom @end quotation -@c STARTOFRANGE tialarm @cindex time, alarm clock example program -@c STARTOFRANGE alaex @cindex alarm clock example program The following program is a simple ``alarm clock'' program. You give it a time of day and an optional message. At the specified time, @@ -25186,15 +24925,11 @@ seconds are necessary: @} @c endfile @end example -@c ENDOFRANGE tialarm -@c ENDOFRANGE alaex @node Translate Program @subsection Transliterating Characters -@c STARTOFRANGE chtra @cindex characters, transliterating -@c STARTOFRANGE tr @cindex @command{tr} utility The system @command{tr} utility transliterates characters. For example, it is often used to map uppercase letters into lowercase for further processing: @@ -25342,15 +25077,11 @@ such as @samp{a-z}, as allowed by the @command{tr} utility. Look at the code for @file{cut.awk} (@pxref{Cut Program}) for inspiration. -@c ENDOFRANGE chtra -@c ENDOFRANGE tr @node Labels Program @subsection Printing Mailing Labels -@c STARTOFRANGE prml @cindex printing, mailing labels -@c STARTOFRANGE mlprint @cindex mailing labels@comma{} printing Here is a ``real world''@footnote{``Real world'' is defined as ``a program actually used to get something done.''} @@ -25414,7 +25145,6 @@ that there are two blank lines at the top and two blank lines at the bottom. The @code{END} rule arranges to flush the final page of labels; there may not have been an even multiple of 20 labels in the data: -@c STARTOFRANGE labels @cindex @code{labels.awk} program @example @c file eg/prog/labels.awk @@ -25479,14 +25209,10 @@ END @{ @} @c endfile @end example -@c ENDOFRANGE prml -@c ENDOFRANGE mlprint -@c ENDOFRANGE labels @node Word Sorting @subsection Generating Word-Usage Counts -@c STARTOFRANGE worus @cindex words, usage counts@comma{} generating When working with large amounts of text, it can be interesting to know @@ -25548,7 +25274,6 @@ to remove punctuation characters. Finally, we solve the third problem by using the system @command{sort} utility to process the output of the @command{awk} script. Here is the new version of the program: -@c STARTOFRANGE wordfreq @cindex @code{wordfreq.awk} program @example @c file eg/prog/wordfreq.awk @@ -25613,13 +25338,10 @@ This way of sorting must be used on systems that do not have true pipes at the command-line (or batch-file) level. See the general operating system documentation for more information on how to use the @command{sort} program. -@c ENDOFRANGE worus -@c ENDOFRANGE wordfreq @node History Sorting @subsection Removing Duplicates from Unsorted Text -@c STARTOFRANGE lidu @cindex lines, duplicate@comma{} removing The @command{uniq} program (@pxref{Uniq Program}), @@ -25644,7 +25366,6 @@ Each element of @code{lines} is a unique command, and the indices of The @code{END} rule simply prints out the lines, in order: @cindex Rakitzis, Byron -@c STARTOFRANGE histsort @cindex @code{histsort.awk} program @example @c file eg/prog/histsort.awk @@ -25687,15 +25408,11 @@ print data[lines[i]], lines[i] @noindent This works because @code{data[$0]} is incremented each time a line is seen. -@c ENDOFRANGE lidu -@c ENDOFRANGE histsort @node Extract Program @subsection Extracting Programs from Texinfo Source Files -@c STARTOFRANGE texse @cindex Texinfo, extracting programs from source files -@c STARTOFRANGE fitex @cindex files, Texinfo@comma{} extracting programs from @ifnotinfo Both this chapter and the previous chapter @@ -25799,7 +25516,6 @@ The first rule handles calling @code{system()}, checking that a command is given (@code{NF} is at least three) and also checking that the command exits with a zero exit status, signifying OK: -@c STARTOFRANGE extract @cindex @code{extract.awk} program @example @c file eg/prog/extract.awk @@ -25945,9 +25661,6 @@ END @{ @} @c endfile @end example -@c ENDOFRANGE texse -@c ENDOFRANGE fitex -@c ENDOFRANGE extract @node Simple Sed @subsection A Simple Stream Editor @@ -25977,7 +25690,6 @@ additional arguments are treated as @value{DF} names to process. If none are provided, the standard input is used: @cindex Brennan, Michael -@c STARTOFRANGE awksed @cindex @command{awksed.awk} program @c @cindex simple stream editor @c @cindex stream editor, simple @@ -26054,14 +25766,11 @@ The @code{usage()} function prints an error message and exits. Finally, the single rule handles the printing scheme outlined earlier, using @code{print} or @code{printf} as appropriate, depending upon the value of @code{RT}. -@c ENDOFRANGE awksed @node Igawk Program @subsection An Easy Way to Use Library Functions -@c STARTOFRANGE libfex @cindex libraries of @command{awk} functions, example program for using -@c STARTOFRANGE flibex @cindex functions, library, example program for using In @ref{Include Files}, we saw how @command{gawk} provides a built-in file-inclusion capability. However, this is a @command{gawk} extension. @@ -26200,7 +25909,6 @@ program. The program is as follows: -@c STARTOFRANGE igawk @cindex @code{igawk.sh} program @example @c file eg/prog/igawk.sh @@ -26525,10 +26233,6 @@ features to a program; they can often be layered on top.@footnote{@command{gawk} does @code{@@include} processing itself in order to support the use of @command{awk} programs as Web CGI scripts.} -@c ENDOFRANGE libfex -@c ENDOFRANGE flibex -@c ENDOFRANGE awkpex -@c ENDOFRANGE igawk @node Anagram Program @subsection Finding Anagrams from a Dictionary @@ -26552,7 +26256,6 @@ The following program uses arrays of arrays to bring together words with the same signature and array sorting to print the words in sorted order: -@c STARTOFRANGE anagram @cindex @code{anagram.awk} program @example @c file eg/prog/anagram.awk @@ -26661,7 +26364,6 @@ babery yabber @dots{} @end example -@c ENDOFRANGE anagram @node Signature Program @subsection And Now for Something Completely Different @@ -26981,9 +26683,7 @@ It contains the following chapters: @node Advanced Features @chapter Advanced Features of @command{gawk} -@c STARTOFRANGE gawadv @cindex @command{gawk}, features, advanced -@c STARTOFRANGE advgaw @cindex advanced features, @command{gawk} @ignore Contributed by: Peter Langston <pud!psl@bellcore.bellcore.com> @@ -27693,7 +27393,6 @@ using regular pipes. @section Using @command{gawk} for Network Programming @cindex advanced features, network programming @cindex networks, programming -@c STARTOFRANGE tcpip @cindex TCP/IP @cindex @code{/inet/@dots{}} special files (@command{gawk}) @cindex files, @code{/inet/@dots{}} (@command{gawk}) @@ -27810,13 +27509,10 @@ which comes as part of the @command{gawk} distribution, for a much more complete introduction and discussion, as well as extensive examples. -@c ENDOFRANGE tcpip @node Profiling @section Profiling Your @command{awk} Programs -@c STARTOFRANGE awkp @cindex @command{awk} programs, profiling -@c STARTOFRANGE proawk @cindex profiling @command{awk} programs @cindex @code{awkprof.out} file @cindex files, @code{awkprof.out} @@ -28143,9 +27839,6 @@ that the profiling output does. This makes it easy to pretty-print your code once development is completed, and then use the result as the final version of your program. -@c ENDOFRANGE awkp -@c ENDOFRANGE proawk - @node Advanced Features Summary @section Summary @@ -28191,8 +27884,6 @@ the program, but that will change in the next major release. @end itemize -@c ENDOFRANGE advgaw -@c ENDOFRANGE gawadv @node Internationalization @chapter Internationalization with @command{gawk} @@ -28205,7 +27896,6 @@ countries, they were able to sell more systems. As a result, internationalization and localization of programs and software systems became a common practice. -@c STARTOFRANGE inloc @cindex internationalization, localization @cindex @command{gawk}, internationalization and, See internationalization @cindex internationalization, localization, @command{gawk} and @@ -28250,7 +27940,6 @@ monetary values are printed and read. @section GNU @command{gettext} @cindex internationalizing a program -@c STARTOFRANGE gettex @cindex @command{gettext} library @command{gawk} uses GNU @command{gettext} to provide its internationalization features. @@ -28302,7 +27991,6 @@ lookup of the translations. @cindex @code{.po} files @cindex files, @code{.po} -@c STARTOFRANGE portobfi @cindex portable object files @cindex files, portable object @item @@ -28314,7 +28002,6 @@ For example, there might be a @file{fr.po} for a French translation. @cindex @code{.gmo} files @cindex files, @code{.gmo} @cindex message object files -@c STARTOFRANGE portmsgfi @cindex files, message object @item Each language's @file{.po} file is converted into a binary @@ -28442,11 +28129,9 @@ before or after the day in a date, local month abbreviations, and so on. @item LC_ALL All of the above. (Not too useful in the context of @command{gettext}.) @end table -@c ENDOFRANGE gettex @node Programmer i18n @section Internationalizing @command{awk} Programs -@c STARTOFRANGE inap @cindex @command{awk} programs, internationalizing @command{gawk} provides the following variables and functions for @@ -28679,8 +28364,6 @@ to provide you translations that you can also then distribute. @DBXREF{I18N Example} for the full list of steps to go through to create and test translations for @command{guide}. -@c ENDOFRANGE portobfi -@c ENDOFRANGE portmsgfi @node Printf Ordering @subsection Rearranging @code{printf} Arguments @@ -28856,7 +28539,6 @@ However, because the positional specifications are primarily for use in @emph{translated} format strings, and because non-GNU @command{awk}s never retrieve the translated string, this should not be a problem in practice. @end itemize -@c ENDOFRANGE inap @node I18N Example @section A Simple Internationalization Example @@ -29007,8 +28689,8 @@ complete detail in @cite{GNU gettext tools}}.) @end ifnotinfo As of this writing, the latest version of GNU @command{gettext} is -@uref{ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.3.tar.gz, -@value{PVERSION} 0.19.3}. +@uref{ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.4.tar.gz, +@value{PVERSION} 0.19.4}. If a translation of @command{gawk}'s messages exists, then @command{gawk} produces usage messages, warnings, @@ -29052,7 +28734,6 @@ a number of translations for its messages. @end itemize -@c ENDOFRANGE inloc @node Debugger @chapter Debugging @command{awk} Programs @@ -30656,7 +30337,7 @@ is available like so: @example $ @kbd{gawk --version} @print{} GNU Awk 4.1.2, API: 1.1 (GNU MPFR 3.1.0-p3, GNU MP 5.0.2) -@print{} Copyright (C) 1989, 1991-2014 Free Software Foundation. +@print{} Copyright (C) 1989, 1991-2015 Free Software Foundation. @dots{} @end example @@ -35598,9 +35279,7 @@ online documentation}. @node V7/SVR3.1 @appendixsec Major Changes Between V7 and SVR3.1 -@c STARTOFRANGE gawkv @cindex @command{awk}, versions of -@c STARTOFRANGE gawkv1 @cindex @command{awk}, versions of, changes between V7 and SVR3.1 The @command{awk} language evolved considerably between the release of @@ -35687,7 +35366,6 @@ Multiple @code{BEGIN} and @code{END} rules Multidimensional arrays (@pxref{Multidimensional}). @end itemize -@c ENDOFRANGE gawkv1 @node SVR4 @appendixsec Changes Between SVR3.1 and SVR4 @@ -35802,7 +35480,6 @@ not permitted by the POSIX standard. The 2008 POSIX standard can be found online at @url{http://www.opengroup.org/onlinepubs/9699919799/}. -@c ENDOFRANGE gawkv @node BTL @appendixsec Extensions in Brian Kernighan's @command{awk} @@ -35848,11 +35525,8 @@ available in his @command{awk}. @node POSIX/GNU @appendixsec Extensions in @command{gawk} Not in POSIX @command{awk} -@c STARTOFRANGE fripls @cindex compatibility mode (@command{gawk}), extensions -@c STARTOFRANGE exgnot @cindex extensions, in @command{gawk}, not in POSIX @command{awk} -@c STARTOFRANGE posnot @cindex POSIX, @command{gawk} extensions not included in The GNU implementation, @command{gawk}, adds a large number of features. They can all be disabled with either the @option{--traditional} or @@ -36166,9 +35840,6 @@ Support for MirBSD was removed at @command{gawk} @value{PVERSION} 4.2. @c XXX ADD MORE STUFF HERE -@c ENDOFRANGE fripls -@c ENDOFRANGE exgnot -@c ENDOFRANGE posnot @c This does not need to be in the formal book. @ifclear FOR_PRINT @@ -37217,9 +36888,7 @@ the appropriate credit where credit is due. @c last two commas are part of see also @cindex operating systems, See Also GNU/Linux@comma{} PC operating systems@comma{} Unix -@c STARTOFRANGE gligawk @cindex @command{gawk}, installing -@c STARTOFRANGE ingawk @cindex installing @command{gawk} This appendix provides instructions for installing @command{gawk} on the various platforms that are supported by the developers. The primary @@ -37329,7 +36998,6 @@ a local expert. @node Distribution contents @appendixsubsec Contents of the @command{gawk} Distribution -@c STARTOFRANGE gawdis @cindex @command{gawk}, distribution The @command{gawk} distribution has a number of C source files, @@ -37528,7 +37196,6 @@ directory to run your version of @command{gawk} against the test suite. If @command{gawk} successfully passes @samp{make check}, then you can be confident of a successful port. @end table -@c ENDOFRANGE gawdis @node Unix Installation @appendixsec Compiling and Installing @command{gawk} on Unix-Like Systems @@ -37993,9 +37660,7 @@ multibyte functionality is not available. @node PC Using @appendixsubsubsec Using @command{gawk} on PC Operating Systems -@c STARTOFRANGE opgawx @cindex operating systems, PC, @command{gawk} on -@c STARTOFRANGE pcgawon @cindex PC operating systems, @command{gawk} on Under MS-DOS and MS-Windows, the Cygwin and MinGW environments support @@ -38503,8 +38168,6 @@ $ @kbd{gawk :== $sys$common:[syshlp.examples.tcpip.snmp]gawk.exe} This is apparently @value{PVERSION} 2.15.6, which is extremely old. We recommend compiling and using the current version. -@c ENDOFRANGE opgawx -@c ENDOFRANGE pcgawon @node Bugs @appendixsec Reporting Problems and Bugs @@ -38515,9 +38178,7 @@ recommend compiling and using the current version. @end quotation @c the radio show, not the book. :-) -@c STARTOFRANGE dbugg @cindex debugging @command{gawk}, bug reports -@c STARTOFRANGE tblgawb @cindex troubleshooting, @command{gawk}, bug reports If you have problems with @command{gawk} or think that you have found a bug, report it to the developers; we cannot promise to do anything @@ -38614,12 +38275,9 @@ The people maintaining the various @command{gawk} ports are: If your bug is also reproducible under Unix, send a copy of your report to the @EMAIL{bug-gawk@@gnu.org,bug-gawk at gnu dot org} email list as well. -@c ENDOFRANGE dbugg -@c ENDOFRANGE tblgawb @node Other Versions @appendixsec Other Freely Available @command{awk} Implementations -@c STARTOFRANGE awkim @cindex @command{awk}, implementations @ignore From: emory!amc.com!brennan (Michael Brennan) @@ -38679,7 +38337,7 @@ git clone git://github.com/onetrueawk/awk bwkawk @end example @noindent -This command creates a copy of the @uref{http://www.git-scm.com, Git} +This command creates a copy of the @uref{http://git-scm.com, Git} repository in a directory named @file{bwkawk}. If you leave that argument off the @command{git} command line, the repository copy is created in a directory named @file{awk}. @@ -38840,7 +38498,6 @@ See also the ``Versions and implementations'' section of the Wikipedia article} for information on additional versions. @end table -@c ENDOFRANGE awkim @node Installation summary @appendixsec Summary @@ -38878,15 +38535,11 @@ implementations. Many are POSIX compliant; others are less so. @end itemize -@c ENDOFRANGE gligawk -@c ENDOFRANGE ingawk @ifclear FOR_PRINT @node Notes @appendix Implementation Notes -@c STARTOFRANGE gawii @cindex @command{gawk}, implementation issues -@c STARTOFRANGE impis @cindex implementation issues, @command{gawk} This appendix contains information mainly of interest to implementers and @@ -38962,7 +38615,7 @@ However, if you want to modify @command{gawk} and contribute back your changes, you will probably wish to work with the development version. To do so, you will need to access the @command{gawk} source code repository. The code is maintained using the -@uref{http://git-scm.com/, Git distributed version control system}. +@uref{http://git-scm.com, Git distributed version control system}. You will need to install it if your system doesn't have it. Once you have done so, use the command: @@ -38991,11 +38644,8 @@ that has a Git plug-in for working with Git repositories. @node Adding Code @appendixsubsec Adding New Features -@c STARTOFRANGE adfgaw @cindex adding, features to @command{gawk} -@c STARTOFRANGE fadgaw @cindex features, adding to @command{gawk} -@c STARTOFRANGE gawadf @cindex @command{gawk}, features, adding You are free to add any new features you like to @command{gawk}. However, if you want your changes to be incorporated into the @command{gawk} @@ -39162,9 +38812,6 @@ Although this sounds like a lot of work, please remember that while you may write the new code, I have to maintain it and support it. If it isn't possible for me to do that with a minimum of extra work, then I probably will not. -@c ENDOFRANGE adfgaw -@c ENDOFRANGE gawadf -@c ENDOFRANGE fadgaw @node New Ports @appendixsubsec Porting @command{gawk} to a New Operating System @@ -39298,7 +38945,6 @@ coding style and brace layout that suits your taste. @node Derived Files @appendixsubsec Why Generated Files Are Kept In Git -@c STARTOFRANGE gawkgit @cindex Git, use of for @command{gawk} source code @c From emails written March 22, 2012, to the gawk developers list. @@ -39487,7 +39133,6 @@ wget http://git.savannah.gnu.org/cgit/gawk.git/snapshot/gawk-@var{branchname}.ta @noindent to retrieve a snapshot of the given branch. -@c ENDOFRANGE gawkgit @node Future Extensions @appendixsec Probable Future Extensions @@ -39868,13 +39513,10 @@ of @command{gawk}, but it @emph{will} be removed in the next major release. @end itemize -@c ENDOFRANGE impis -@c ENDOFRANGE gawii @node Basic Concepts @appendix Basic Programming Concepts @cindex programming, concepts -@c STARTOFRANGE procon @cindex programming, concepts This @value{APPENDIX} attempts to define some of the basic concepts @@ -40112,7 +39754,6 @@ standard for C. This standard became an ISO standard in 1990. In 1999, a revised ISO C standard was approved and released. Where it makes sense, POSIX @command{awk} is compatible with 1999 ISO C. -@c ENDOFRANGE procon @node Glossary @unnumbered Glossary |