Diffstat (limited to 'gawk.info-4')
-rw-r--r-- gawk.info-4 1305
1 files changed, 1305 insertions, 0 deletions
diff --git a/gawk.info-4 b/gawk.info-4
new file mode 100644
index 00000000..100e1d25
--- /dev/null
+++ b/gawk.info-4
@@ -0,0 +1,1305 @@
+This is Info file gawk.info, produced by Makeinfo-1.54 from the input
+file gawk.texi.
+
+ This file documents `awk', a program that you can use to select
+particular records in a file and perform operations upon them.
+
+ This is Edition 0.15 of `The GAWK Manual',
+for the 2.15 version of the GNU implementation
+of AWK.
+
+ Copyright (C) 1989, 1991, 1992, 1993 Free Software Foundation, Inc.
+
+ Permission is granted to make and distribute verbatim copies of this
+manual provided the copyright notice and this permission notice are
+preserved on all copies.
+
+ Permission is granted to copy and distribute modified versions of
+this manual under the conditions for verbatim copying, provided that
+the entire resulting derived work is distributed under the terms of a
+permission notice identical to this one.
+
+ Permission is granted to copy and distribute translations of this
+manual into another language, under the above conditions for modified
+versions, except that this permission notice may be stated in a
+translation approved by the Foundation.
+
+
+File: gawk.info, Node: Actions, Next: Expressions, Prev: Patterns, Up: Top
+
+Overview of Actions
+*******************
+
+ An `awk' program or script consists of a series of rules and
+function definitions, interspersed. (Functions are described later.
+*Note User-defined Functions: User-defined.)
+
+ A rule contains a pattern and an action, either of which may be
+omitted. The purpose of the "action" is to tell `awk' what to do once
+a match for the pattern is found. Thus, the entire program looks
+somewhat like this:
+
+ [PATTERN] [{ ACTION }]
+ [PATTERN] [{ ACTION }]
+ ...
+ function NAME (ARGS) { ... }
+ ...
+
+ An action consists of one or more `awk' "statements", enclosed in
+curly braces (`{' and `}'). Each statement specifies one thing to be
+done. The statements are separated by newlines or semicolons.
+
+ The curly braces around an action must be used even if the action
+contains only one statement, or even if it contains no statements at
+all. However, if you omit the action entirely, omit the curly braces as
+well. (An omitted action is equivalent to `{ print $0 }'.)
+
+ Here are the kinds of statements supported in `awk':
+
+ * Expressions, which can call functions or assign values to variables
+ (*note Expressions as Action Statements: Expressions.). Executing
+ this kind of statement simply computes the value of the expression
+ and then ignores it. This is useful when the expression has side
+ effects (*note Assignment Expressions: Assignment Ops.).
+
+ * Control statements, which specify the control flow of `awk'
+ programs. The `awk' language gives you C-like constructs (`if',
+ `for', `while', and so on) as well as a few special ones (*note
+ Control Statements in Actions: Statements.).
+
+ * Compound statements, which consist of one or more statements
+ enclosed in curly braces. A compound statement is used in order
+ to put several statements together in the body of an `if',
+ `while', `do' or `for' statement.
+
+ * Input control, using the `getline' command (*note Explicit Input
+ with `getline': Getline.), and the `next' statement (*note The
+ `next' Statement: Next Statement.).
+
+ * Output statements, `print' and `printf'. *Note Printing Output:
+ Printing.
+
+ * Deletion statements, for deleting array elements. *Note The
+ `delete' Statement: Delete.
+
+
+File: gawk.info, Node: Expressions, Next: Statements, Prev: Actions, Up: Top
+
+Expressions as Action Statements
+********************************
+
+ Expressions are the basic building blocks of `awk' actions. An
+expression evaluates to a value, which you can print, test, store in a
+variable or pass to a function. But beyond that, an expression can
+assign a new value to a variable or a field, with an assignment
+operator.
+
+ An expression can serve as a statement on its own. Most other kinds
+of statements contain one or more expressions which specify data to be
+operated on. As in other languages, expressions in `awk' include
+variables, array references, constants, and function calls, as well as
+combinations of these with various operators.
+
+* Menu:
+
+* Constants:: String, numeric, and regexp constants.
+* Variables:: Variables give names to values for later use.
+* Arithmetic Ops:: Arithmetic operations (`+', `-', etc.)
+* Concatenation:: Concatenating strings.
+* Comparison Ops:: Comparison of numbers and strings
+ with `<', etc.
+* Boolean Ops:: Combining comparison expressions
+ using boolean operators
+ `||' ("or"), `&&' ("and") and `!' ("not").
+
+* Assignment Ops:: Changing the value of a variable or a field.
+* Increment Ops:: Incrementing the numeric value of a variable.
+
+* Conversion:: The conversion of strings to numbers
+ and vice versa.
+* Values:: The whole truth about numbers and strings.
+* Conditional Exp:: Conditional expressions select
+ between two subexpressions under control
+ of a third subexpression.
+* Function Calls:: A function call is an expression.
+* Precedence:: How various operators nest.
+
+
+File: gawk.info, Node: Constants, Next: Variables, Prev: Expressions, Up: Expressions
+
+Constant Expressions
+====================
+
+ The simplest type of expression is the "constant", which always has
+the same value. There are three types of constants: numeric constants,
+string constants, and regular expression constants.
+
+ A "numeric constant" stands for a number. This number can be an
+integer, a decimal fraction, or a number in scientific (exponential)
+notation. Note that all numeric values are represented within `awk' in
+double-precision floating point. Here are some examples of numeric
+constants, which all have the same value:
+
+ 105
+ 1.05e+2
+ 1050e-1
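
A quick check of the claim above, as a minimal sketch that assumes a
POSIX-compatible `awk' is on the PATH. The shell variable `same' is
illustrative and not part of the manual.

```shell
# Compare the three spellings of the same number; each test prints 1 if equal.
same=$(awk 'BEGIN { print (105 == 1.05e+2), (105 == 1050e-1) }')
printf '%s\n' "$same"    # expected: 1 1
```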
+
+ A string constant consists of a sequence of characters enclosed in
+double-quote marks. For example:
+
+ "parrot"
+
+represents the string whose contents are `parrot'. Strings in `gawk'
+can be of any length and they can contain all the possible 8-bit
+characters, including ASCII NUL. Other `awk' implementations may have
+difficulty with some character codes.
+
+ Some characters cannot be included literally in a string constant.
+You represent them instead with "escape sequences", which are character
+sequences beginning with a backslash (`\').
+
+ One use of an escape sequence is to include a double-quote character
+in a string constant. Since a plain double-quote would end the string,
+you must use `\"' to represent a single double-quote character as a
+part of the string. The backslash character itself is another
+character that cannot be included normally; you write `\\' to put one
+backslash in the string. Thus, the string whose contents are the two
+characters `"\' must be written `"\"\\"'.
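
The quoting rule can be tried out directly. This is a minimal sketch,
assuming a POSIX-compatible `awk'; the program is wrapped in single
quotes so that the shell passes the backslashes through untouched, and
the shell variable `s' is illustrative.

```shell
# awk sees the literal text "\"\\" and builds a two-character string from it.
s=$(awk 'BEGIN { s = "\"\\"; print length(s), s }')
printf '%s\n' "$s"    # expected: 2 "\
```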
+
+ Another use of backslash is to represent unprintable characters such
+as newline. While there is nothing to stop you from writing most of
+these characters directly in a string constant, they may look ugly.
+
+ Here is a table of all the escape sequences used in `awk':
+
+`\\'
+ Represents a literal backslash, `\'.
+
+`\a'
+ Represents the "alert" character, control-g, ASCII code 7.
+
+`\b'
+ Represents a backspace, control-h, ASCII code 8.
+
+`\f'
+ Represents a formfeed, control-l, ASCII code 12.
+
+`\n'
+ Represents a newline, control-j, ASCII code 10.
+
+`\r'
+ Represents a carriage return, control-m, ASCII code 13.
+
+`\t'
+ Represents a horizontal tab, control-i, ASCII code 9.
+
+`\v'
+ Represents a vertical tab, control-k, ASCII code 11.
+
+`\NNN'
+ Represents the octal value NNN, where NNN are one to three digits
+ between 0 and 7. For example, the code for the ASCII ESC (escape)
+ character is `\033'.
+
+`\xHH...'
+ Represents the hexadecimal value HH, where HH are hexadecimal
+ digits (`0' through `9' and either `A' through `F' or `a' through
+ `f'). Like the same construct in ANSI C, the escape sequence
+ continues until the first non-hexadecimal digit is seen. However,
+ using more than two hexadecimal digits produces undefined results.
+ (The `\x' escape sequence is not allowed in POSIX `awk'.)
+
+ A "constant regexp" is a regular expression description enclosed in
+slashes, such as `/^beginning and end$/'. Most regexps used in `awk'
+programs are constant, but the `~' and `!~' operators can also match
+computed or "dynamic" regexps (*note How to Use Regular Expressions:
+Regexp Usage.).
+
+ Constant regexps may be used like simple expressions. When a
+constant regexp is not on the right hand side of the `~' or `!~'
+operators, it has the same meaning as if it appeared in a pattern, i.e.
+`($0 ~ /foo/)' (*note Expressions as Patterns: Expression Patterns.).
+This means that the two code segments,
+
+ if ($0 ~ /barfly/ || $0 ~ /camelot/)
+ print "found"
+
+and
+
+ if (/barfly/ || /camelot/)
+ print "found"
+
+are exactly equivalent. One rather bizarre consequence of this rule is
+that the following boolean expression is legal, but does not do what
+the user intended:
+
+ if (/foo/ ~ $1) print "found foo"
+
+ This code is "obviously" testing `$1' for a match against the regexp
+`/foo/'. But in fact, the expression `(/foo/ ~ $1)' actually means
+`(($0 ~ /foo/) ~ $1)'. In other words, first match the input record
+against the regexp `/foo/'. The result will be either a 1 or a 0,
+depending upon the success or failure of the match. Then match that
+result against the first field in the record.
+
+ Since it is unlikely that you would ever really wish to make this
+kind of test, `gawk' will issue a warning when it sees this construct in
+a program.
+
+ Another consequence of this rule is that the assignment statement
+
+ matches = /foo/
+
+will assign either 0 or 1 to the variable `matches', depending upon the
+contents of the current input record.
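
As an illustration of this assignment, here is a minimal sketch that
assumes a POSIX-compatible `awk'. The two input records and the shell
variable `flags' are made up for the example.

```shell
# The first record contains "foo"; the second does not.
flags=$(printf 'foo bar\nbaz\n' | awk '{ matches = /foo/; print matches }')
printf '%s\n' "$flags"    # expected: 1, then 0
```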
+
+ Constant regular expressions are also used as the first argument for
+the `sub' and `gsub' functions (*note Built-in Functions for String
+Manipulation: String Functions.).
+
+ This feature of the language was never well documented until the
+POSIX specification.
+
+ You may be wondering, when is
+
+ $1 ~ /foo/ { ... }
+
+preferable to
+
+ $1 ~ "foo" { ... }
+
+ Since the right-hand sides of both `~' operators are constants, it
+is more efficient to use the `/foo/' form: `awk' can note that you have
+supplied a regexp and store it internally in a form that makes pattern
+matching more efficient. In the second form, `awk' must first convert
+the string into this internal form, and then perform the pattern
+matching. The first form is also better style; it shows clearly that
+you intend a regexp match.
+
+
+File: gawk.info, Node: Variables, Next: Arithmetic Ops, Prev: Constants, Up: Expressions
+
+Variables
+=========
+
+ Variables let you give names to values and refer to them later. You
+have already seen variables in many of the examples. The name of a
+variable must be a sequence of letters, digits and underscores, but it
+may not begin with a digit. Case is significant in variable names; `a'
+and `A' are distinct variables.
+
+ A variable name is a valid expression by itself; it represents the
+variable's current value. Variables are given new values with
+"assignment operators" and "increment operators". *Note Assignment
+Expressions: Assignment Ops.
+
+ A few variables have special built-in meanings, such as `FS', the
+field separator, and `NF', the number of fields in the current input
+record. *Note Built-in Variables::, for a list of them. These
+built-in variables can be used and assigned just like all other
+variables, but their values are also used or changed automatically by
+`awk'. Each built-in variable's name is made entirely of upper case
+letters.
+
+ Variables in `awk' can be assigned either numeric or string values.
+By default, variables are initialized to the null string, which is
+effectively zero if converted to a number. There is no need to
+"initialize" each variable explicitly in `awk', the way you would in C
+or most other traditional languages.
+
+* Menu:
+
+* Assignment Options:: Setting variables on the command line
+ and a summary of command line syntax.
+ This is an advanced method of input.
+
+
+File: gawk.info, Node: Assignment Options, Prev: Variables, Up: Variables
+
+Assigning Variables on the Command Line
+---------------------------------------
+
+ You can set any `awk' variable by including a "variable assignment"
+among the arguments on the command line when you invoke `awk' (*note
+Invoking `awk': Command Line.). Such an assignment has this form:
+
+ VARIABLE=TEXT
+
+With it, you can set a variable either at the beginning of the `awk'
+run or in between input files.
+
+ If you precede the assignment with the `-v' option, like this:
+
+ -v VARIABLE=TEXT
+
+then the variable is set at the very beginning, before even the `BEGIN'
+rules are run. The `-v' option and its assignment must precede all the
+file name arguments, as well as the program text.
+
+ Otherwise, the variable assignment is performed at a time determined
+by its position among the input file arguments: after the processing of
+the preceding input file argument. For example:
+
+ awk '{ print $n }' n=4 inventory-shipped n=2 BBS-list
+
+prints the value of field number `n' for all input records. Before the
+first file is read, the command line sets the variable `n' equal to 4.
+This causes the fourth field to be printed in lines from the file
+`inventory-shipped'. After the first file has finished, but before the
+second file is started, `n' is set to 2, so that the second field is
+printed in lines from `BBS-list'.
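
The same behavior can be reproduced without the manual's sample data.
This is a minimal sketch, assuming a POSIX-compatible `awk' and the
`mktemp' utility; the files `first' and `second' are hypothetical
stand-ins for `inventory-shipped' and `BBS-list'.

```shell
tmpdir=$(mktemp -d)
printf 'a b c d\n' > "$tmpdir/first"
printf 'w x y z\n' > "$tmpdir/second"

# n=4 takes effect before the first file, n=2 before the second.
out=$(awk '{ print $n }' n=4 "$tmpdir/first" n=2 "$tmpdir/second")
printf '%s\n' "$out"      # expected: d, then x

# With -v the value is already set when the BEGIN rule runs.
early=$(awk -v n=3 'BEGIN { print n }')
printf '%s\n' "$early"    # expected: 3

rm -rf "$tmpdir"
```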
+
+ Command line arguments are made available for explicit examination by
+the `awk' program in an array named `ARGV' (*note Built-in
+Variables::.).
+
+ `awk' processes the values of command line assignments for escape
+sequences (*note Constant Expressions: Constants.).
+
+
+File: gawk.info, Node: Arithmetic Ops, Next: Concatenation, Prev: Variables, Up: Expressions
+
+Arithmetic Operators
+====================
+
+ The `awk' language uses the common arithmetic operators when
+evaluating expressions. All of these arithmetic operators follow normal
+precedence rules, and work as you would expect them to. This example
+divides field three by field four, adds field two, stores the result
+into field one, and prints the resulting altered input record:
+
+ awk '{ $1 = $2 + $3 / $4; print }' inventory-shipped
+
+ The arithmetic operators in `awk' are:
+
+`X + Y'
+ Addition.
+
+`X - Y'
+ Subtraction.
+
+`- X'
+ Negation.
+
+`+ X'
+ Unary plus. No real effect on the expression.
+
+`X * Y'
+ Multiplication.
+
+`X / Y'
+ Division. Since all numbers in `awk' are double-precision
+ floating point, the result is not rounded to an integer: `3 / 4'
+ has the value 0.75.
+
+`X % Y'
+ Remainder. The quotient is rounded toward zero to an integer,
+ multiplied by Y and this result is subtracted from X. This
+ operation is sometimes known as "trunc-mod." The following
+ relation always holds:
+
+ b * int(a / b) + (a % b) == a
+
+ One possibly undesirable effect of this definition of remainder is
+ that `X % Y' is negative if X is negative. Thus,
+
+ -17 % 8 = -1
+
+ In other `awk' implementations, the signedness of the remainder
+ may be machine dependent.
+
+`X ^ Y'
+`X ** Y'
+ Exponentiation: X raised to the Y power. `2 ^ 3' has the value 8.
+ The character sequence `**' is equivalent to `^'. (The POSIX
+ standard only specifies the use of `^' for exponentiation.)
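
The table entries above can be checked with a short program. This is a
minimal sketch, assuming an `awk' that truncates toward zero for `%' as
described here (the remainder's sign may differ in other
implementations); the variable names are illustrative.

```shell
arith=$(awk 'BEGIN {
    a = -17; b = 8
    print 3 / 4                      # division is not rounded to an integer
    print a % b                      # remainder takes the sign of a
    print b * int(a / b) + (a % b)   # the relation reconstructs a
    print 2 ^ 3                      # exponentiation
}')
printf '%s\n' "$arith"    # expected: 0.75, -1, -17, 8 (one per line)
```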
+
+
+File: gawk.info, Node: Concatenation, Next: Comparison Ops, Prev: Arithmetic Ops, Up: Expressions
+
+String Concatenation
+====================
+
+ There is only one string operation: concatenation. It does not have
+a specific operator to represent it. Instead, concatenation is
+performed by writing expressions next to one another, with no operator.
+For example:
+
+ awk '{ print "Field number one: " $1 }' BBS-list
+
+produces, for the first record in `BBS-list':
+
+ Field number one: aardvark
+
+ Without the space in the string constant after the `:', the line
+would run together. For example:
+
+ awk '{ print "Field number one:" $1 }' BBS-list
+
+produces, for the first record in `BBS-list':
+
+ Field number one:aardvark
+
+ Since string concatenation does not have an explicit operator, it is
+often necessary to ensure that it happens where you want it to by
+enclosing the items to be concatenated in parentheses. For example, the
+following code fragment does not concatenate `file' and `name' as you
+might expect:
+
+ file = "file"
+ name = "name"
+ print "something meaningful" > file name
+
+It is necessary to use the following:
+
+ print "something meaningful" > (file name)
+
+ We recommend you use parentheses around concatenation in all but the
+most common contexts (such as in the right-hand operand of `=').
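
The parenthesized form can be exercised as follows. This is a minimal
sketch, assuming a POSIX-compatible `awk' and the `mktemp' utility; the
variable `dir' and the temporary directory are additions for the
example, so that the output file lands somewhere disposable.

```shell
tmpdir=$(mktemp -d)
awk -v dir="$tmpdir" 'BEGIN {
    file = dir "/file"
    name = "name"
    # Parentheses force the concatenation to happen before the redirection.
    print "something meaningful" > (file name)
    close(file name)
}'
written=$(cat "$tmpdir/filename")
printf '%s\n' "$written"    # expected: something meaningful
rm -rf "$tmpdir"
```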
+
+
+File: gawk.info, Node: Comparison Ops, Next: Boolean Ops, Prev: Concatenation, Up: Expressions
+
+Comparison Expressions
+======================
+
+ "Comparison expressions" compare strings or numbers for
+relationships such as equality. They are written using "relational
+operators", which are a superset of those in C. Here is a table of
+them:
+
+`X < Y'
+ True if X is less than Y.
+
+`X <= Y'
+ True if X is less than or equal to Y.
+
+`X > Y'
+ True if X is greater than Y.
+
+`X >= Y'
+ True if X is greater than or equal to Y.
+
+`X == Y'
+ True if X is equal to Y.
+
+`X != Y'
+ True if X is not equal to Y.
+
+`X ~ Y'
+ True if the string X matches the regexp denoted by Y.
+
+`X !~ Y'
+ True if the string X does not match the regexp denoted by Y.
+
+`SUBSCRIPT in ARRAY'
+ True if array ARRAY has an element with the subscript SUBSCRIPT.
+
+ Comparison expressions have the value 1 if true and 0 if false.
+
+ The rules `gawk' uses for performing comparisons are based on those
+in draft 11.2 of the POSIX standard. The POSIX standard introduced the
+concept of a "numeric string", which is simply a string that looks like
+a number, for example, `" +2"'.
+
+ When performing a relational operation, `gawk' considers the type of
+an operand to be the type it received on its last *assignment*, rather
+than the type of its last *use* (*note Numeric and String Values:
+Values.). This type is *unknown* when the operand is from an
+"external" source: field variables, command line arguments, array
+elements resulting from a `split' operation, and the value of an
+`ENVIRON' element. In this case only, if the operand is a numeric
+string, then it is considered to be of both string type and numeric
+type. If at least one operand of a comparison is of string type only,
+then a string comparison is performed. Any numeric operand will be
+converted to a string using the value of `CONVFMT' (*note Conversion of
+Strings and Numbers: Conversion.). If one operand of a comparison is
+numeric, and the other operand is either numeric or both numeric and
+string, then `gawk' does a numeric comparison. If both operands have
+both types, then the comparison is numeric. Strings are compared by
+comparing the first character of each, then the second character of
+each, and so on. Thus `"10"' is less than `"9"'. If there are two
+strings where one is a prefix of the other, the shorter string is less
+than the longer one. Thus `"abc"' is less than `"abcd"'.
+
+ Here are some sample expressions, how `gawk' compares them, and what
+the result of the comparison is.
+
+`1.5 <= 2.0'
+ numeric comparison (true)
+
+`"abc" >= "xyz"'
+ string comparison (false)
+
+`1.5 != " +2"'
+ string comparison (true)
+
+`"1e2" < "3"'
+ string comparison (true)
+
+`a = 2; b = "2"'
+`a == b'
+ string comparison (true)
+
+ echo 1e2 3 | awk '{ print ($1 < $2) ? "true" : "false" }'
+
+prints `false': both `$1' and `$2' are numeric strings, so each has
+both string and numeric types, which dictates a numeric comparison.
+
+ The purpose of the comparison rules and the use of numeric strings is
+to attempt to produce the behavior that is "least surprising," while
+still "doing the right thing."
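
The difference between field values and string constants can be seen
side by side. This is a minimal sketch, assuming an `awk' that follows
the POSIX comparison rules described above; the shell variables
`fields' and `consts' are illustrative.

```shell
# Fields come from an "external" source, so numeric-looking fields
# are compared as numbers: 100 < 3 is false.
fields=$(echo 1e2 3 | awk '{ print (($1 < $2) ? "true" : "false") }')

# String constants have string type only, so the comparison is done
# character by character: "1" sorts before "3".
consts=$(awk 'BEGIN { print (("1e2" < "3") ? "true" : "false") }')

printf '%s %s\n' "$fields" "$consts"    # expected: false true
```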
+
+ String comparisons and regular expression comparisons are very
+different. For example,
+
+ $1 == "foo"
+
+has the value of 1, or is true, if the first field of the current input
+record is precisely `foo'. By contrast,
+
+ $1 ~ /foo/
+
+has the value 1 if the first field contains `foo', such as `foobar'.
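
Both tests can be run against the same input to show the contrast.
This is a minimal sketch, assuming a POSIX-compatible `awk'; the two
input records and the shell variable `cmp' are made up for the example.

```shell
# Column 1: exact string equality.  Column 2: regexp match anywhere in $1.
cmp=$(printf 'foo\nfoobar\n' | awk '{ print ($1 == "foo"), ($1 ~ /foo/) }')
printf '%s\n' "$cmp"    # expected: "1 1" for foo, "0 1" for foobar
```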
+
+ The right hand operand of the `~' and `!~' operators may be either a
+constant regexp (`/.../'), or it may be an ordinary expression, in
+which case the value of the expression as a string is a dynamic regexp
+(*note How to Use Regular Expressions: Regexp Usage.).
+
+ In very recent implementations of `awk', a constant regular
+expression in slashes by itself is also an expression. The regexp
+`/REGEXP/' is an abbreviation for this comparison expression:
+
+ $0 ~ /REGEXP/
+
+ In some contexts it may be necessary to write parentheses around the
+regexp to avoid confusing the `gawk' parser. For example, `(/x/ - /y/)
+> threshold' is not allowed, but `((/x/) - (/y/)) > threshold' parses
+properly.
+
+ One special place where `/foo/' is *not* an abbreviation for `$0 ~
+/foo/' is when it is the right-hand operand of `~' or `!~'! *Note
+Constant Expressions: Constants, where this is discussed in more detail.
+
+
+File: gawk.info, Node: Boolean Ops, Next: Assignment Ops, Prev: Comparison Ops, Up: Expressions
+
+Boolean Expressions
+===================
+
+ A "boolean expression" is a combination of comparison expressions or
+matching expressions, using the boolean operators "or" (`||'), "and"
+(`&&'), and "not" (`!'), along with parentheses to control nesting.
+The truth of the boolean expression is computed by combining the truth
+values of the component expressions.
+
+ Boolean expressions can be used wherever comparison and matching
+expressions can be used. They can be used in `if', `while', `do' and
+`for' statements. They have numeric values (1 if true, 0 if false),
+which come into play if the result of the boolean expression is stored
+in a variable, or used in arithmetic.
+
+ In addition, every boolean expression is also a valid boolean
+pattern, so you can use it as a pattern to control the execution of
+rules.
+
+ Here are descriptions of the three boolean operators, with an
+example of each. It may be instructive to compare these examples with
+the analogous examples of boolean patterns (*note Boolean Operators and
+Patterns: Boolean Patterns.), which use the same boolean operators in
+patterns instead of expressions.
+
+`BOOLEAN1 && BOOLEAN2'
+ True if both BOOLEAN1 and BOOLEAN2 are true. For example, the
+ following statement prints the current input record if it contains
+ both `2400' and `foo'.
+
+ if ($0 ~ /2400/ && $0 ~ /foo/) print
+
+ The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is true.
+ This can make a difference when BOOLEAN2 contains expressions that
+ have side effects: in the case of `$0 ~ /foo/ && ($2 == bar++)',
+ the variable `bar' is not incremented if there is no `foo' in the
+ record.
+
+`BOOLEAN1 || BOOLEAN2'
+ True if at least one of BOOLEAN1 or BOOLEAN2 is true. For
+ example, the following command prints all records in the input
+ file `BBS-list' that contain *either* `2400' or `foo', or both.
+
+ awk '{ if ($0 ~ /2400/ || $0 ~ /foo/) print }' BBS-list
+
+ The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is false.
+ This can make a difference when BOOLEAN2 contains expressions
+ that have side effects.
+
+`!BOOLEAN'
+ True if BOOLEAN is false. For example, the following program
+ prints all records in the input file `BBS-list' that do *not*
+ contain the string `foo'.
+
+ awk '{ if (! ($0 ~ /foo/)) print }' BBS-list
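
The short-circuit behavior of `&&' and `||' described above can be
observed through a side effect. This is a minimal sketch, assuming a
POSIX-compatible `awk'; the variable names other than `bar' are
illustrative.

```shell
counts=$(awk 'BEGIN {
    bar = 0
    r1 = (0 && bar++)    # left side false: bar++ is never evaluated
    r2 = (1 || bar++)    # left side true:  bar++ is never evaluated
    print bar
    r3 = (1 && bar++)    # left side true:  bar++ is evaluated
    print bar
}')
printf '%s\n' "$counts"    # expected: 0, then 1
```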
+
+
+File: gawk.info, Node: Assignment Ops, Next: Increment Ops, Prev: Boolean Ops, Up: Expressions
+
+Assignment Expressions
+======================
+
+ An "assignment" is an expression that stores a new value into a
+variable. For example, let's assign the value 1 to the variable `z':
+
+ z = 1
+
+ After this expression is executed, the variable `z' has the value 1.
+Whatever old value `z' had before the assignment is forgotten.
+
+ Assignments can store string values also. For example, this would
+store the value `"this food is good"' in the variable `message':
+
+ thing = "food"
+ predicate = "good"
+ message = "this " thing " is " predicate
+
+(This also illustrates concatenation of strings.)
+
+ The `=' sign is called an "assignment operator". It is the simplest
+assignment operator because the value of the right-hand operand is
+stored unchanged.
+
+ Most operators (addition, concatenation, and so on) have no effect
+except to compute a value. If you ignore the value, you might as well
+not use the operator. An assignment operator is different; it does
+produce a value, but even if you ignore the value, the assignment still
+makes itself felt through the alteration of the variable. We call this
+a "side effect".
+
+ The left-hand operand of an assignment need not be a variable (*note
+Variables::.); it can also be a field (*note Changing the Contents of a
+Field: Changing Fields.) or an array element (*note Arrays in `awk':
+Arrays.). These are all called "lvalues", which means they can appear
+on the left-hand side of an assignment operator. The right-hand
+operand may be any expression; it produces the new value which the
+assignment stores in the specified variable, field or array element.
+
+ It is important to note that variables do *not* have permanent types.
+The type of a variable is simply the type of whatever value it happens
+to hold at the moment. In the following program fragment, the variable
+`foo' has a numeric value at first, and a string value later on:
+
+ foo = 1
+ print foo
+ foo = "bar"
+ print foo
+
+When the second assignment gives `foo' a string value, the fact that it
+previously had a numeric value is forgotten.
+
+ An assignment is an expression, so it has a value: the same value
+that is assigned. Thus, `z = 1' as an expression has the value 1. One
+consequence of this is that you can write multiple assignments together:
+
+ x = y = z = 0
+
+stores the value 0 in all three variables. It does this because the
+value of `z = 0', which is 0, is stored into `y', and then the value of
+`y = z = 0', which is 0, is stored into `x'.
+
+ You can use an assignment anywhere an expression is called for. For
+example, it is valid to write `x != (y = 1)' to set `y' to 1 and then
+test whether `x' differs from 1. But this style tends to make programs hard
+to read; except in a one-shot program, you should rewrite it to get rid
+of such nesting of assignments. This is never very hard.
+
+ Aside from `=', there are several other assignment operators that do
+arithmetic with the old value of the variable. For example, the
+operator `+=' computes a new value by adding the right-hand value to
+the old value of the variable. Thus, the following assignment adds 5
+to the value of `foo':
+
+ foo += 5
+
+This is precisely equivalent to the following:
+
+ foo = foo + 5
+
+Use whichever one makes the meaning of your program clearer.
+
+ Here is a table of the arithmetic assignment operators. In each
+case, the right-hand operand is an expression whose value is converted
+to a number.
+
+`LVALUE += INCREMENT'
+ Adds INCREMENT to the value of LVALUE to make the new value of
+ LVALUE.
+
+`LVALUE -= DECREMENT'
+ Subtracts DECREMENT from the value of LVALUE.
+
+`LVALUE *= COEFFICIENT'
+ Multiplies the value of LVALUE by COEFFICIENT.
+
+`LVALUE /= DIVISOR'
+ Divides the value of LVALUE by DIVISOR.
+
+`LVALUE %= MODULUS'
+ Sets LVALUE to its remainder by MODULUS.
+
+`LVALUE ^= POWER'
+`LVALUE **= POWER'
+ Raises LVALUE to the power POWER. (Only the `^=' operator is
+ specified by POSIX.)
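
The operators in this table can be chained on one variable to see each
step. This is a minimal sketch, assuming a POSIX-compatible `awk'; it
uses only `^=', since `**=' is not specified by POSIX.

```shell
final=$(awk 'BEGIN {
    foo = 10
    foo += 5    # 15
    foo -= 3    # 12
    foo *= 2    # 24
    foo /= 4    # 6
    foo %= 4    # 2
    foo ^= 3    # 8
    print foo
}')
printf '%s\n' "$final"    # expected: 8
```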
+
+
+File: gawk.info, Node: Increment Ops, Next: Conversion, Prev: Assignment Ops, Up: Expressions
+
+Increment Operators
+===================
+
+ "Increment operators" increase or decrease the value of a variable
+by 1. You could do the same thing with an assignment operator, so the
+increment operators add no power to the `awk' language; but they are
+convenient abbreviations for something very common.
+
+ The operator to add 1 is written `++'. It can be used to increment
+a variable either before or after taking its value.
+
+ To pre-increment a variable V, write `++V'. This adds 1 to the
+value of V and that new value is also the value of this expression.
+The assignment expression `V += 1' is completely equivalent.
+
+ Writing the `++' after the variable specifies post-increment. This
+increments the variable value just the same; the difference is that the
+value of the increment expression itself is the variable's *old* value.
+Thus, if `foo' has the value 4, then the expression `foo++' has the
+value 4, but it changes the value of `foo' to 5.
+
+ The post-increment `foo++' is nearly equivalent to writing `(foo +=
+1) - 1'. It is not perfectly equivalent because all numbers in `awk'
+are floating point: in floating point, `foo + 1 - 1' does not
+necessarily equal `foo'. But the difference is minute as long as you
+stick to numbers that are fairly small (less than a trillion).
+
+ Any lvalue can be incremented. Fields and array elements are
+incremented just like variables. (Use `$(i++)' when you wish to do a
+field reference and a variable increment at the same time. The
+parentheses are necessary because of the precedence of the field
+reference operator, `$'.)
+
+ The decrement operator `--' works just like `++' except that it
+subtracts 1 instead of adding. Like `++', it can be used before the
+lvalue to pre-decrement or after it to post-decrement.
+
+ Here is a summary of increment and decrement expressions.
+
+`++LVALUE'
+ This expression increments LVALUE and the new value becomes the
+ value of this expression.
+
+`LVALUE++'
+ This expression causes the contents of LVALUE to be incremented.
+ The value of the expression is the *old* value of LVALUE.
+
+`--LVALUE'
+ Like `++LVALUE', but instead of adding, it subtracts. It
+ decrements LVALUE and delivers the value that results.
+
+`LVALUE--'
+ Like `LVALUE++', but instead of adding, it subtracts. It
+ decrements LVALUE. The value of the expression is the *old* value
+ of LVALUE.
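
The four forms in the summary can be compared in one run. This is a
minimal sketch, assuming a POSIX-compatible `awk'; the variables `a'
through `d' are illustrative and simply capture the value of each
expression.

```shell
steps=$(awk 'BEGIN {
    foo = 4
    a = foo++    # a gets the old value 4; foo becomes 5
    b = ++foo    # foo becomes 6; b gets the new value 6
    c = foo--    # c gets the old value 6; foo becomes 5
    d = --foo    # foo becomes 4; d gets the new value 4
    print a, b, c, d, foo
}')
printf '%s\n' "$steps"    # expected: 4 6 6 4 4
```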
+
+
+File: gawk.info, Node: Conversion, Next: Values, Prev: Increment Ops, Up: Expressions
+
+Conversion of Strings and Numbers
+=================================
+
+ Strings are converted to numbers, and numbers to strings, if the
+context of the `awk' program demands it. For example, if the value of
+either `foo' or `bar' in the expression `foo + bar' happens to be a
+string, it is converted to a number before the addition is performed.
+If numeric values appear in string concatenation, they are converted to
+strings. Consider this:
+
+ two = 2; three = 3
+ print (two three) + 4
+
+This eventually prints the (numeric) value 27. The numeric values of
+the variables `two' and `three' are converted to strings and
+concatenated together, and the resulting string is converted back to the
+number 23, to which 4 is then added.
+
+ If, for some reason, you need to force a number to be converted to a
+string, concatenate the null string with that number. To force a string
+to be converted to a number, add zero to that string.
+
+ A string is converted to a number by interpreting a numeric prefix
+of the string as numerals: `"2.5"' converts to 2.5, `"1e3"' converts to
+1000, and `"25fix"' has a numeric value of 25. Strings that can't be
+interpreted as valid numbers are converted to zero.
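
The conversions described above can be tried together. This is a
minimal sketch, assuming a POSIX-compatible `awk'; the variable
`joined' and the string `"fix"' are additions for the example.

```shell
conv=$(awk 'BEGIN {
    two = 2; three = 3
    joined = (two three) + 4    # "23" converted back to 23, plus 4
    print joined
    # Adding zero forces each string to be converted to a number.
    print "2.5" + 0, "1e3" + 0, "25fix" + 0, "fix" + 0
}')
printf '%s\n' "$conv"    # expected: 27, then 2.5 1000 25 0
```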
+
+ The exact manner in which numbers are converted into strings is
+controlled by the `awk' built-in variable `CONVFMT' (*note Built-in
+Variables::.). Numbers are converted using a special version of the
+`sprintf' function (*note Built-in Functions: Built-in.) with `CONVFMT'
+as the format specifier.
+
+ `CONVFMT''s default value is `"%.6g"', which prints a value with at
+most six significant digits. For some applications you will want to
+change it to specify more precision. Double precision on most modern
+machines gives you 16 or 17 decimal digits of precision.
+
+ Strange results can happen if you set `CONVFMT' to a string that
+doesn't tell `sprintf' how to format floating point numbers in a useful
+way. For example, if you forget the `%' in the format, all numbers
+will be converted to the same constant string.
+
+ As a special case, if a number is an integer, then the result of
+converting it to a string is *always* an integer, no matter what the
+value of `CONVFMT' may be. Given the following code fragment:
+
+ CONVFMT = "%2.2f"
+ a = 12
+ b = a ""
+
+`b' has the value `"12"', not `"12.00"'.
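+
+ The following sketch puts the integer special case next to a
+non-integer value. The variables `c' and `d', the value 3.14159, and
+the shell variable `conv' are illustrative assumptions, not part of the
+fragment above:
+
```shell
conv=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12          # integer value: CONVFMT is not applied
    c = 3.14159     # non-integer value: formatted with CONVFMT
    b = a ""
    d = c ""
    print b, d
}')
echo "$conv"    # prints: 12 3.14
```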
+
+ Prior to the POSIX standard, `awk' specified that the value of
+`OFMT' was used for converting numbers to strings. `OFMT' specifies
+the output format to use when printing numbers with `print'. `CONVFMT'
+was introduced in order to separate the semantics of conversions from
+the semantics of printing. Both `CONVFMT' and `OFMT' have the same
+default value: `"%.6g"'. In the vast majority of cases, old `awk'
+programs will not change their behavior. However, this use of `OFMT'
+is something to keep in mind if you must port your program to other
+implementations of `awk'; we recommend that instead of changing your
+programs, you just port `gawk' itself!
+
+
+File: gawk.info, Node: Values, Next: Conditional Exp, Prev: Conversion, Up: Expressions
+
+Numeric and String Values
+=========================
+
+ Through most of this manual, we present `awk' values (such as
+constants, fields, or variables) as *either* numbers *or* strings.
+This is a convenient way to think about them, since typically they are
+used in only one way, or the other.
+
+ In truth though, `awk' values can be *both* string and numeric, at
+the same time. Internally, `awk' represents values with a string, a
+(floating point) number, and an indication that one, the other, or both
+representations of the value are valid.
+
+ Keeping track of both kinds of values is important for execution
+efficiency: a variable can acquire a string value the first time it is
+used as a string, and then that string value can be used until the
+variable is assigned a new value. Thus, if a variable with only a
+numeric value is used in several concatenations in a row, it only has
+to be given a string representation once. The numeric value remains
+valid, so that no conversion back to a number is necessary if the
+variable is later used in an arithmetic expression.
+
+ Tracking both kinds of values is also important for precise numerical
+calculations. Consider the following:
+
+ a = 123.321
+ CONVFMT = "%3.1f"
+ b = a " is a number"
+ c = a + 1.654
+
+The variable `a' receives a string value in the concatenation and
+assignment to `b'. The string value of `a' is `"123.3"'. If the
+numeric value was lost when it was converted to a string, then the
+numeric use of `a' in the last statement would lose information. `c'
+would be assigned the value 124.954 instead of 124.975. Such errors
+accumulate rapidly, and very adversely affect numeric computations.
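+
+ Running the fragment confirms this. The sketch below only adds two
+`print' statements and a shell variable, `kept', whose name is an
+assumption made for illustration:
+
```shell
kept=$(awk 'BEGIN {
    a = 123.321
    CONVFMT = "%3.1f"
    b = a " is a number"   # a is converted to a string here
    c = a + 1.654          # the numeric value of a is still 123.321
    print b
    print c
}')
echo "$kept"
# prints:
#   123.3 is a number
#   124.975
```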
+
+ Once a numeric value acquires a corresponding string value, it stays
+valid until a new assignment is made. If `CONVFMT' (*note Conversion
+of Strings and Numbers: Conversion.) changes in the meantime, the old
+string value will still be used. For example:
+
+ BEGIN {
+     CONVFMT = "%2.2f"
+     a = 123.456
+     b = a ""         # force `a' to have string value too
+     printf "a = %s\n", a
+     CONVFMT = "%.6g"
+     printf "a = %s\n", a
+     a += 0           # make `a' numeric only again
+     printf "a = %s\n", a    # use `a' as string
+ }
+
+This program prints `a = 123.46' twice, and then prints `a = 123.456'.
+
+ *Note Conversion of Strings and Numbers: Conversion, for the rules
+that specify how string values are made from numeric values.
+
+
+File: gawk.info, Node: Conditional Exp, Next: Function Calls, Prev: Values, Up: Expressions
+
+Conditional Expressions
+=======================
+
+ A "conditional expression" is a special kind of expression with
+three operands. It allows you to use one expression's value to select
+one of two other expressions.
+
+ The conditional expression looks the same as in the C language:
+
+ SELECTOR ? IF-TRUE-EXP : IF-FALSE-EXP
+
+There are three subexpressions. The first, SELECTOR, is always
+computed first. If it is "true" (not zero and not null) then
+IF-TRUE-EXP is computed next and its value becomes the value of the
+whole expression. Otherwise, IF-FALSE-EXP is computed next and its
+value becomes the value of the whole expression.
+
+ For example, this expression produces the absolute value of `x':
+
+ x > 0 ? x : -x
+
+ Each time the conditional expression is computed, exactly one of
+IF-TRUE-EXP and IF-FALSE-EXP is computed; the other is ignored. This
+is important when the expressions contain side effects. For example,
+this conditional expression examines element `i' of either array `a' or
+array `b', and increments `i'.
+
+ x == y ? a[i++] : b[i++]
+
+This is guaranteed to increment `i' exactly once, because each time one
+or the other of the two increment expressions is executed, and the
+other is not.
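+
+ Both examples can be tried in a short program. This is only a
+sketch: the values of `x' and `y', the array contents, and the shell
+variable `cond' are assumptions made for illustration. The conditional
+given to `print' is wrapped in parentheses so that `>' is read as a
+comparison rather than as an output redirection.
+
```shell
cond=$(awk 'BEGIN {
    x = -5; y = 0
    print (x > 0 ? x : -x)          # absolute value of x
    i = 1; a[1] = "from a"; b[1] = "from b"
    v = (x == y ? a[i++] : b[i++])  # only one of the two i++ runs
    print v, i
}')
echo "$cond"
# prints:
#   5
#   from b 2
```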
+
+
+File: gawk.info, Node: Function Calls, Next: Precedence, Prev: Conditional Exp, Up: Expressions
+
+Function Calls
+==============
+
+ A "function" is a name for a particular calculation. Because it has
+a name, you can ask for it by name at any point in the program. For
+example, the function `sqrt' computes the square root of a number.
+
+ A fixed set of functions is "built-in", which means they are
+available in every `awk' program. The `sqrt' function is one of these.
+*Note Built-in Functions: Built-in, for a list of built-in functions
+and their descriptions. In addition, you can define your own functions
+in the program for use elsewhere in the same program. *Note
+User-defined Functions: User-defined, for how to do this.
+
+ The way to use a function is with a "function call" expression,
+which consists of the function name followed by a list of "arguments"
+in parentheses. The arguments are expressions which give the raw
+materials for the calculation that the function will do. When there is
+more than one argument, they are separated by commas. If there are no
+arguments, write just `()' after the function name. Here are some
+examples:
+
+ sqrt(x^2 + y^2) # One argument
+ atan2(y, x) # Two arguments
+ rand() # No arguments
+
+ *Do not put any space between the function name and the
+open-parenthesis!* A user-defined function name looks just like the
+name of a variable, and space would make the expression look like
+concatenation of a variable with an expression inside parentheses.
+Space before the parenthesis is harmless with built-in functions, but
+to avoid mistakes with user-defined functions it is best not to get
+into the habit of using it.
+
+ Each function expects a particular number of arguments. For
+example, the `sqrt' function must be called with a single argument, the
+number to take the square root of:
+
+ sqrt(ARGUMENT)
+
+ Some of the built-in functions allow you to omit the final argument.
+If you do so, they use a reasonable default. *Note Built-in Functions:
+Built-in, for full details. If arguments are omitted in calls to
+user-defined functions, then those arguments are treated as local
+variables, initialized to the null string (*note User-defined
+Functions: User-defined.).
+
+ Like every other expression, the function call has a value, which is
+computed by the function based on the arguments you give it. In this
+example, the value of `sqrt(ARGUMENT)' is the square root of the
+argument. A function can also have side effects, such as assigning
+values to certain variables or doing I/O.
+
+ Here is a command to read numbers, one number per line, and print the
+square root of each one:
+
+ awk '{ print "The square root of", $1, "is", sqrt($1) }'
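+
+ A further sketch shows calls to user-defined functions. The
+functions `dist' and `tag' and the shell variable `calls' are
+hypothetical names invented for this example. `dist' is called with no
+space before the open-parenthesis, and `tag' is called with its final
+argument omitted, so that argument starts out as the null string:
+
```shell
calls=$(awk '
# dist and tag are hypothetical user-defined functions.
function dist(x, y) { return sqrt(x^2 + y^2) }
function tag(s, suffix) { return "[" s suffix "]" }
BEGIN {
    print dist(3, 4)    # two arguments, separated by a comma
    print tag("x")      # omitted suffix acts as a null string
}')
echo "$calls"
# prints:
#   5
#   [x]
```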
+
+
+File: gawk.info, Node: Precedence, Prev: Function Calls, Up: Expressions
+
+Operator Precedence (How Operators Nest)
+========================================
+
+ "Operator precedence" determines how operators are grouped, when
+different operators appear close by in one expression. For example,
+`*' has higher precedence than `+'; thus, `a + b * c' means to multiply
+`b' and `c', and then add `a' to the product (i.e., `a + (b * c)').
+
+ You can overrule the precedence of the operators by using
+parentheses. You can think of the precedence rules as saying where the
+parentheses are assumed if you do not write parentheses yourself. In
+fact, it is wise to always use parentheses whenever you have an unusual
+combination of operators, because other people who read the program may
+not remember what the precedence is in this case. You might forget,
+too; then you could make a mistake. Explicit parentheses will help
+prevent any such mistake.
+
+ When operators of equal precedence are used together, the leftmost
+operator groups first, except for the assignment, conditional and
+exponentiation operators, which group in the opposite order. Thus, `a
+- b + c' groups as `(a - b) + c'; `a = b = c' groups as `a = (b = c)'.
+
+ The precedence of prefix unary operators does not matter as long as
+only unary operators are involved, because there is only one way to
+parse them--innermost first. Thus, `$++i' means `$(++i)' and `++$x'
+means `++($x)'. However, when another operator follows the operand,
+then the precedence of the unary operators can matter. Thus, `$x^2'
+means `($x)^2', but `-x^2' means `-(x^2)', because `-' has lower
+precedence than `^' while `$' has higher precedence.
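+
+ These grouping rules can be checked directly. In this sketch the
+input value 4, the variables `x' and `y', and the shell variable `prec'
+are assumptions chosen for illustration:
+
```shell
prec=$(echo 4 | awk '{
    x = 1; y = 3
    # $x^2     is ($x)^2, that is, 4^2
    # -y^2     is -(y^2)
    # 2^3^2    is 2^(3^2), since ^ groups right-to-left
    # 10-4+3   is (10-4)+3, since - and + group left-to-right
    print $x^2, -y^2, (-y)^2, 2^3^2, 10 - 4 + 3
}')
echo "$prec"    # prints: 16 -9 9 512 9
```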
+
+ Here is a table of the operators of `awk', in order of increasing
+precedence:
+
+assignment
+ `=', `+=', `-=', `*=', `/=', `%=', `^=', `**='. These operators
+ group right-to-left. (The `**=' operator is not specified by
+ POSIX.)
+
+conditional
+ `?:'. This operator groups right-to-left.
+
+logical "or"
+ `||'.
+
+logical "and"
+ `&&'.
+
+array membership
+ `in'.
+
+matching
+ `~', `!~'.
+
+relational, and redirection
+ The relational operators and the redirections have the same
+ precedence level. Characters such as `>' serve both as
+ relationals and as redirections; the context distinguishes between
+ the two meanings.
+
+ The relational operators are `<', `<=', `==', `!=', `>=' and `>'.
+
+ The I/O redirection operators are `<', `>', `>>' and `|'.
+
+ Note that I/O redirection operators in `print' and `printf'
+ statements belong to the statement level, not to expressions. The
+ redirection does not produce an expression which could be the
+ operand of another operator. As a result, it does not make sense
+ to use a redirection operator near another operator of lower
+ precedence, without parentheses. Such combinations, for example
+ `print foo > a ? b : c', result in syntax errors.
+
+concatenation
+ No special token is used to indicate concatenation. The operands
+ are simply written side by side.
+
+add, subtract
+ `+', `-'.
+
+multiply, divide, mod
+ `*', `/', `%'.
+
+unary plus, minus, "not"
+ `+', `-', `!'.
+
+exponentiation
+ `^', `**'. These operators group right-to-left. (The `**'
+ operator is not specified by POSIX.)
+
+increment, decrement
+ `++', `--'.
+
+field
+ `$'.
+
+
+File: gawk.info, Node: Statements, Next: Arrays, Prev: Expressions, Up: Top
+
+Control Statements in Actions
+*****************************
+
+ "Control statements" such as `if', `while', and so on control the
+flow of execution in `awk' programs. Most of the control statements in
+`awk' are patterned on similar statements in C.
+
+ All the control statements start with special keywords such as `if'
+and `while', to distinguish them from simple expressions.
+
+ Many control statements contain other statements; for example, the
+`if' statement contains another statement which may or may not be
+executed. The contained statement is called the "body". If you want
+to include more than one statement in the body, group them into a
+single compound statement with curly braces, separating them with
+newlines or semicolons.
+
+* Menu:
+
+* If Statement:: Conditionally execute
+ some `awk' statements.
+* While Statement:: Loop until some condition is satisfied.
+* Do Statement:: Do specified action while looping until some
+ condition is satisfied.
+* For Statement:: Another looping statement, that provides
+ initialization and increment clauses.
+* Break Statement:: Immediately exit the innermost enclosing loop.
+* Continue Statement:: Skip to the end of the innermost
+ enclosing loop.
+* Next Statement:: Stop processing the current input record.
+* Next File Statement:: Stop processing the current file.
+* Exit Statement:: Stop execution of `awk'.
+
+
+File: gawk.info, Node: If Statement, Next: While Statement, Prev: Statements, Up: Statements
+
+The `if' Statement
+==================
+
+ The `if'-`else' statement is `awk''s decision-making statement. It
+looks like this:
+
+ if (CONDITION) THEN-BODY [else ELSE-BODY]
+
+CONDITION is an expression that controls what the rest of the statement
+will do. If CONDITION is true, THEN-BODY is executed; otherwise,
+ELSE-BODY is executed (assuming that the `else' clause is present).
+The `else' part of the statement is optional. The condition is
+considered false if its value is zero or the null string, and true
+otherwise.
+
+ Here is an example:
+
+ if (x % 2 == 0)
+     print "x is even"
+ else
+     print "x is odd"
+
+ In this example, if the expression `x % 2 == 0' is true (that is,
+the value of `x' is divisible by 2), then the first `print' statement
+is executed, otherwise the second `print' statement is performed.
+
+ If the `else' appears on the same line as THEN-BODY, and THEN-BODY
+is not a compound statement (i.e., not surrounded by curly braces),
+then a semicolon must separate THEN-BODY from `else'. To illustrate
+this, let's rewrite the previous example:
+
+ awk '{ if (x % 2 == 0) print "x is even"; else
+ print "x is odd" }'
+
+If you forget the `;', `awk' won't be able to parse the statement, and
+you will get a syntax error.
+
+ We would not actually write this example this way, because a human
+reader might fail to see the `else' if it were not the first thing on
+its line.
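+
+ As a runnable variant (a sketch only: it tests the first field of
+each record instead of `x', and the input lines and the shell variable
+`parity' are assumptions), with `else' at the start of its own line no
+semicolon is needed:
+
```shell
parity=$(printf '1\n2\n3\n' | awk '{
    if ($1 % 2 == 0)
        print $1, "is even"
    else
        print $1, "is odd"
}')
echo "$parity"
# prints:
#   1 is odd
#   2 is even
#   3 is odd
```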
+
+
+File: gawk.info, Node: While Statement, Next: Do Statement, Prev: If Statement, Up: Statements
+
+The `while' Statement
+=====================
+
+ In programming, a "loop" means a part of a program that is (or at
+least can be) executed two or more times in succession.
+
+ The `while' statement is the simplest looping statement in `awk'.
+It repeatedly executes a statement as long as a condition is true. It
+looks like this:
+
+ while (CONDITION)
+ BODY
+
+Here BODY is a statement that we call the "body" of the loop, and
+CONDITION is an expression that controls how long the loop keeps
+running.
+
+ The first thing the `while' statement does is test CONDITION. If
+CONDITION is true, it executes the statement BODY. (CONDITION is true
+when the value is not zero and not a null string.) After BODY has been
+executed, CONDITION is tested again, and if it is still true, BODY is
+executed again. This process repeats until CONDITION is no longer
+true. If CONDITION is initially false, the body of the loop is never
+executed.
+
+ This example prints the first three fields of each record, one per
+line.
+
+ awk '{ i = 1
+        while (i <= 3) {
+            print $i
+            i++
+        }
+ }'
+
+Here the body of the loop is a compound statement enclosed in braces,
+containing two statements.
+
+ The loop works like this: first, the value of `i' is set to 1.
+Then, the `while' tests whether `i' is less than or equal to three.
+This is the case when `i' equals one, so the `i'-th field is printed.
+Then the `i++' increments the value of `i' and the loop repeats. The
+loop terminates when `i' reaches 4.
+
+ As you can see, a newline is not required between the condition and
+the body; but using one makes the program clearer unless the body is a
+compound statement or is very simple. The newline after the open-brace
+that begins the compound statement is not required either, but the
+program would be hard to read without it.
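+
+ One possible variation, given here only as a sketch, also tests the
+built-in variable `NF' so that the loop stops early on records with
+fewer than three fields. The sample input and the shell variable
+`fields' are assumptions:
+
```shell
fields=$(echo "alpha beta" | awk '{
    i = 1
    while (i <= 3 && i <= NF) {   # stop early on short records
        print $i
        i++
    }
}')
echo "$fields"
# prints:
#   alpha
#   beta
```
+
+Without the extra test, the loop would also print an empty line for the
+missing third field.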
+
+
+File: gawk.info, Node: Do Statement, Next: For Statement, Prev: While Statement, Up: Statements
+
+The `do'-`while' Statement
+==========================
+
+ The `do' loop is a variation of the `while' looping statement. The
+`do' loop executes the BODY once, then repeats BODY as long as
+CONDITION is true. It looks like this:
+
+ do
+ BODY
+ while (CONDITION)
+
+ Even if CONDITION is false at the start, BODY is executed at least
+once (and only once, unless executing BODY makes CONDITION true).
+Contrast this with the corresponding `while' statement:
+
+ while (CONDITION)
+ BODY
+
+This statement does not execute BODY even once if CONDITION is false to
+begin with.
+
+ Here is an example of a `do' statement:
+
+ awk '{ i = 1
+        do {
+            print $0
+            i++
+        } while (i <= 10)
+ }'
+
+This program prints each input record ten times. It isn't a very realistic example,
+since in this case an ordinary `while' would do just as well. But this
+reflects actual experience; there is only occasionally a real use for a
+`do' statement.
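+
+ The difference between the two loops shows up when CONDITION is
+false from the start. In this sketch the counters `n' and `m' and the
+shell variable `counts' are names assumed for illustration:
+
```shell
counts=$(awk 'BEGIN {
    i = 10
    do { n++ } while (i < 5)    # body runs once, then the test fails
    while (i < 5) { m++ }       # test fails first, body never runs
    print n + 0, m + 0
}')
echo "$counts"    # prints: 1 0
```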
+