Diffstat (limited to 'gawk.info-4')
-rw-r--r-- | gawk.info-4 | 1305 |
1 files changed, 0 insertions, 1305 deletions
diff --git a/gawk.info-4 b/gawk.info-4 deleted file mode 100644 index 100e1d25..00000000 --- a/gawk.info-4 +++ /dev/null @@ -1,1305 +0,0 @@ -This is Info file gawk.info, produced by Makeinfo-1.54 from the input -file gawk.texi. - - This file documents `awk', a program that you can use to select -particular records in a file and perform operations upon them. - - This is Edition 0.15 of `The GAWK Manual', -for the 2.15 version of the GNU implementation -of AWK. - - Copyright (C) 1989, 1991, 1992, 1993 Free Software Foundation, Inc. - - Permission is granted to make and distribute verbatim copies of this -manual provided the copyright notice and this permission notice are -preserved on all copies. - - Permission is granted to copy and distribute modified versions of -this manual under the conditions for verbatim copying, provided that -the entire resulting derived work is distributed under the terms of a -permission notice identical to this one. - - Permission is granted to copy and distribute translations of this -manual into another language, under the above conditions for modified -versions, except that this permission notice may be stated in a -translation approved by the Foundation. - - -File: gawk.info, Node: Actions, Next: Expressions, Prev: Patterns, Up: Top - -Overview of Actions -******************* - - An `awk' program or script consists of a series of rules and -function definitions, interspersed. (Functions are described later. -*Note User-defined Functions: User-defined.) - - A rule contains a pattern and an action, either of which may be -omitted. The purpose of the "action" is to tell `awk' what to do once -a match for the pattern is found. Thus, the entire program looks -somewhat like this: - - [PATTERN] [{ ACTION }] - [PATTERN] [{ ACTION }] - ... - function NAME (ARGS) { ... } - ... - - An action consists of one or more `awk' "statements", enclosed in -curly braces (`{' and `}'). Each statement specifies one thing to be -done. 
The statements are separated by newlines or semicolons. - - The curly braces around an action must be used even if the action -contains only one statement, or even if it contains no statements at -all. However, if you omit the action entirely, omit the curly braces as -well. (An omitted action is equivalent to `{ print $0 }'.) - - Here are the kinds of statements supported in `awk': - - * Expressions, which can call functions or assign values to variables - (*note Expressions as Action Statements: Expressions.). Executing - this kind of statement simply computes the value of the expression - and then ignores it. This is useful when the expression has side - effects (*note Assignment Expressions: Assignment Ops.). - - * Control statements, which specify the control flow of `awk' - programs. The `awk' language gives you C-like constructs (`if', - `for', `while', and so on) as well as a few special ones (*note - Control Statements in Actions: Statements.). - - * Compound statements, which consist of one or more statements - enclosed in curly braces. A compound statement is used in order - to put several statements together in the body of an `if', - `while', `do' or `for' statement. - - * Input control, using the `getline' command (*note Explicit Input - with `getline': Getline.), and the `next' statement (*note The - `next' Statement: Next Statement.). - - * Output statements, `print' and `printf'. *Note Printing Output: - Printing. - - * Deletion statements, for deleting array elements. *Note The - `delete' Statement: Delete. - - -File: gawk.info, Node: Expressions, Next: Statements, Prev: Actions, Up: Top - -Expressions as Action Statements -******************************** - - Expressions are the basic building block of `awk' actions. An -expression evaluates to a value, which you can print, test, store in a -variable or pass to a function. But beyond that, an expression can -assign a new value to a variable or a field, with an assignment -operator. 
- - An expression can serve as a statement on its own. Most other kinds -of statements contain one or more expressions which specify data to be -operated on. As in other languages, expressions in `awk' include -variables, array references, constants, and function calls, as well as -combinations of these with various operators. - -* Menu: - -* Constants:: String, numeric, and regexp constants. -* Variables:: Variables give names to values for later use. -* Arithmetic Ops:: Arithmetic operations (`+', `-', etc.) -* Concatenation:: Concatenating strings. -* Comparison Ops:: Comparison of numbers and strings - with `<', etc. -* Boolean Ops:: Combining comparison expressions - using boolean operators - `||' ("or"), `&&' ("and") and `!' ("not"). - -* Assignment Ops:: Changing the value of a variable or a field. -* Increment Ops:: Incrementing the numeric value of a variable. - -* Conversion:: The conversion of strings to numbers - and vice versa. -* Values:: The whole truth about numbers and strings. -* Conditional Exp:: Conditional expressions select - between two subexpressions under control - of a third subexpression. -* Function Calls:: A function call is an expression. -* Precedence:: How various operators nest. - - -File: gawk.info, Node: Constants, Next: Variables, Prev: Expressions, Up: Expressions - -Constant Expressions -==================== - - The simplest type of expression is the "constant", which always has -the same value. There are three types of constants: numeric constants, -string constants, and regular expression constants. - - A "numeric constant" stands for a number. This number can be an -integer, a decimal fraction, or a number in scientific (exponential) -notation. Note that all numeric values are represented within `awk' in -double-precision floating point. 
Here are some examples of numeric -constants, which all have the same value: - - 105 - 1.05e+2 - 1050e-1 - - A string constant consists of a sequence of characters enclosed in -double-quote marks. For example: - - "parrot" - -represents the string whose contents are `parrot'. Strings in `gawk' -can be of any length and they can contain all the possible 8-bit ASCII -characters including ASCII NUL. Other `awk' implementations may have -difficulty with some character codes. - - Some characters cannot be included literally in a string constant. -You represent them instead with "escape sequences", which are character -sequences beginning with a backslash (`\'). - - One use of an escape sequence is to include a double-quote character -in a string constant. Since a plain double-quote would end the string, -you must use `\"' to represent a single double-quote character as a -part of the string. The backslash character itself is another -character that cannot be included normally; you write `\\' to put one -backslash in the string. Thus, the string whose contents are the two -characters `"\' must be written `"\"\\"'. - - Another use of backslash is to represent unprintable characters such -as newline. While there is nothing to stop you from writing most of -these characters directly in a string constant, they may look ugly. - - Here is a table of all the escape sequences used in `awk': - -`\\' - Represents a literal backslash, `\'. - -`\a' - Represents the "alert" character, control-g, ASCII code 7. - -`\b' - Represents a backspace, control-h, ASCII code 8. - -`\f' - Represents a formfeed, control-l, ASCII code 12. - -`\n' - Represents a newline, control-j, ASCII code 10. - -`\r' - Represents a carriage return, control-m, ASCII code 13. - -`\t' - Represents a horizontal tab, control-i, ASCII code 9. - -`\v' - Represents a vertical tab, control-k, ASCII code 11. - -`\NNN' - Represents the octal value NNN, where NNN are one to three digits - between 0 and 7. 
For example, the code for the ASCII ESC (escape) - character is `\033'. - -`\xHH...' - Represents the hexadecimal value HH, where HH are hexadecimal - digits (`0' through `9' and either `A' through `F' or `a' through - `f'). Like the same construct in ANSI C, the escape sequence - continues until the first non-hexadecimal digit is seen. However, - using more than two hexadecimal digits produces undefined results. - (The `\x' escape sequence is not allowed in POSIX `awk'.) - - A "constant regexp" is a regular expression description enclosed in -slashes, such as `/^beginning and end$/'. Most regexps used in `awk' -programs are constant, but the `~' and `!~' operators can also match -computed or "dynamic" regexps (*note How to Use Regular Expressions: -Regexp Usage.). - - Constant regexps may be used like simple expressions. When a -constant regexp is not on the right hand side of the `~' or `!~' -operators, it has the same meaning as if it appeared in a pattern, i.e. -`($0 ~ /foo/)' (*note Expressions as Patterns: Expression Patterns.). -This means that the two code segments, - - if ($0 ~ /barfly/ || $0 ~ /camelot/) - print "found" - -and - - if (/barfly/ || /camelot/) - print "found" - -are exactly equivalent. One rather bizarre consequence of this rule is -that the following boolean expression is legal, but does not do what -the user intended: - - if (/foo/ ~ $1) print "found foo" - - This code is "obviously" testing `$1' for a match against the regexp -`/foo/'. But in fact, the expression `(/foo/ ~ $1)' actually means -`(($0 ~ /foo/) ~ $1)'. In other words, first match the input record -against the regexp `/foo/'. The result will be either a 0 or a 1, -depending upon the success or failure of the match. Then match that -result against the first field in the record. - - Since it is unlikely that you would ever really wish to make this -kind of test, `gawk' will issue a warning when it sees this construct in -a program. 
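The surprising parse described above can be checked with a small sketch. It spells out the expansion `(($0 ~ /foo/) ~ $1)' explicitly, so no `gawk' warning is involved, and compares it with the test that was probably intended. The variable names `actual' and `intended' and the sample record are illustrative assumptions, not part of the manual.

```shell
# Sample record: the first field is "foo", so the intended test succeeds.
out=$(echo 'foo bar' | awk '{
    # What (/foo/ ~ $1) really means: match $0 first (result 1),
    # then match the string "1" against $1 used as a dynamic regexp.
    actual = (($0 ~ /foo/) ~ $1)
    # What the programmer most likely wanted.
    intended = ($1 ~ /foo/)
    print actual, intended
}')
echo "$out"   # prints: 0 1
```

The first value is 0 because the string `1' does not match the regexp `foo', even though the first field plainly does.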
- - Another consequence of this rule is that the assignment statement - - matches = /foo/ - -will assign either 0 or 1 to the variable `matches', depending upon the -contents of the current input record. - - Constant regular expressions are also used as the first argument for -the `sub' and `gsub' functions (*note Built-in Functions for String -Manipulation: String Functions.). - - This feature of the language was never well documented until the -POSIX specification. - - You may be wondering, when is - - $1 ~ /foo/ { ... } - -preferable to - - $1 ~ "foo" { ... } - - Since the right-hand sides of both `~' operators are constants, it -is more efficient to use the `/foo/' form: `awk' can note that you have -supplied a regexp and store it internally in a form that makes pattern -matching more efficient. In the second form, `awk' must first convert -the string into this internal form, and then perform the pattern -matching. The first form is also better style; it shows clearly that -you intend a regexp match. - - -File: gawk.info, Node: Variables, Next: Arithmetic Ops, Prev: Constants, Up: Expressions - -Variables -========= - - Variables let you give names to values and refer to them later. You -have already seen variables in many of the examples. The name of a -variable must be a sequence of letters, digits and underscores, but it -may not begin with a digit. Case is significant in variable names; `a' -and `A' are distinct variables. - - A variable name is a valid expression by itself; it represents the -variable's current value. Variables are given new values with -"assignment operators" and "increment operators". *Note Assignment -Expressions: Assignment Ops. - - A few variables have special built-in meanings, such as `FS', the -field separator, and `NF', the number of fields in the current input -record. *Note Built-in Variables::, for a list of them. 
These -built-in variables can be used and assigned just like all other -variables, but their values are also used or changed automatically by -`awk'. Each built-in variable's name is made entirely of upper case -letters. - - Variables in `awk' can be assigned either numeric or string values. -By default, variables are initialized to the null string, which is -effectively zero if converted to a number. There is no need to -"initialize" each variable explicitly in `awk', the way you would in C -or most other traditional languages. - -* Menu: - -* Assignment Options:: Setting variables on the command line - and a summary of command line syntax. - This is an advanced method of input. - - -File: gawk.info, Node: Assignment Options, Prev: Variables, Up: Variables - -Assigning Variables on the Command Line ---------------------------------------- - - You can set any `awk' variable by including a "variable assignment" -among the arguments on the command line when you invoke `awk' (*note -Invoking `awk': Command Line.). Such an assignment has this form: - - VARIABLE=TEXT - -With it, you can set a variable either at the beginning of the `awk' -run or in between input files. - - If you precede the assignment with the `-v' option, like this: - - -v VARIABLE=TEXT - -then the variable is set at the very beginning, before even the `BEGIN' -rules are run. The `-v' option and its assignment must precede all the -file name arguments, as well as the program text. - - Otherwise, the variable assignment is performed at a time determined -by its position among the input file arguments: after the processing of -the preceding input file argument. For example: - - awk '{ print $n }' n=4 inventory-shipped n=2 BBS-list - -prints the value of field number `n' for all input records. Before the -first file is read, the command line sets the variable `n' equal to 4. -This causes the fourth field to be printed in lines from the file -`inventory-shipped'. 
After the first file has finished, but before the -second file is started, `n' is set to 2, so that the second field is -printed in lines from `BBS-list'. - - Command line arguments are made available for explicit examination by -the `awk' program in an array named `ARGV' (*note Built-in -Variables::.). - - `awk' processes the values of command line assignments for escape -sequences (*note Constant Expressions: Constants.). - - -File: gawk.info, Node: Arithmetic Ops, Next: Concatenation, Prev: Variables, Up: Expressions - -Arithmetic Operators -==================== - - The `awk' language uses the common arithmetic operators when -evaluating expressions. All of these arithmetic operators follow normal -precedence rules, and work as you would expect them to. This example -divides field three by field four, adds field two, stores the result -into field one, and prints the resulting altered input record: - - awk '{ $1 = $2 + $3 / $4; print }' inventory-shipped - - The arithmetic operators in `awk' are: - -`X + Y' - Addition. - -`X - Y' - Subtraction. - -`- X' - Negation. - -`+ X' - Unary plus. No real effect on the expression. - -`X * Y' - Multiplication. - -`X / Y' - Division. Since all numbers in `awk' are double-precision - floating point, the result is not rounded to an integer: `3 / 4' - has the value 0.75. - -`X % Y' - Remainder. The quotient is rounded toward zero to an integer, - multiplied by Y and this result is subtracted from X. This - operation is sometimes known as "trunc-mod." The following - relation always holds: - - b * int(a / b) + (a % b) == a - - One possibly undesirable effect of this definition of remainder is - that `X % Y' is negative if X is negative. Thus, - - -17 % 8 = -1 - - In other `awk' implementations, the signedness of the remainder - may be machine dependent. - -`X ^ Y' -`X ** Y' - Exponentiation: X raised to the Y power. `2 ^ 3' has the value 8. - The character sequence `**' is equivalent to `^'. 
(The POSIX - standard only specifies the use of `^' for exponentiation.) - - -File: gawk.info, Node: Concatenation, Next: Comparison Ops, Prev: Arithmetic Ops, Up: Expressions - -String Concatenation -==================== - - There is only one string operation: concatenation. It does not have -a specific operator to represent it. Instead, concatenation is -performed by writing expressions next to one another, with no operator. -For example: - - awk '{ print "Field number one: " $1 }' BBS-list - -produces, for the first record in `BBS-list': - - Field number one: aardvark - - Without the space in the string constant after the `:', the line -would run together. For example: - - awk '{ print "Field number one:" $1 }' BBS-list - -produces, for the first record in `BBS-list': - - Field number one:aardvark - - Since string concatenation does not have an explicit operator, it is -often necessary to insure that it happens where you want it to by -enclosing the items to be concatenated in parentheses. For example, the -following code fragment does not concatenate `file' and `name' as you -might expect: - - file = "file" - name = "name" - print "something meaningful" > file name - -It is necessary to use the following: - - print "something meaningful" > (file name) - - We recommend you use parentheses around concatenation in all but the -most common contexts (such as in the right-hand operand of `='). - - -File: gawk.info, Node: Comparison Ops, Next: Boolean Ops, Prev: Concatenation, Up: Expressions - -Comparison Expressions -====================== - - "Comparison expressions" compare strings or numbers for -relationships such as equality. They are written using "relational -operators", which are a superset of those in C. Here is a table of -them: - -`X < Y' - True if X is less than Y. - -`X <= Y' - True if X is less than or equal to Y. - -`X > Y' - True if X is greater than Y. - -`X >= Y' - True if X is greater than or equal to Y. - -`X == Y' - True if X is equal to Y. 
- -`X != Y' - True if X is not equal to Y. - -`X ~ Y' - True if the string X matches the regexp denoted by Y. - -`X !~ Y' - True if the string X does not match the regexp denoted by Y. - -`SUBSCRIPT in ARRAY' - True if array ARRAY has an element with the subscript SUBSCRIPT. - - Comparison expressions have the value 1 if true and 0 if false. - - The rules `gawk' uses for performing comparisons are based on those -in draft 11.2 of the POSIX standard. The POSIX standard introduced the -concept of a "numeric string", which is simply a string that looks like -a number, for example, `" +2"'. - - When performing a relational operation, `gawk' considers the type of -an operand to be the type it received on its last *assignment*, rather -than the type of its last *use* (*note Numeric and String Values: -Values.). This type is *unknown* when the operand is from an -"external" source: field variables, command line arguments, array -elements resulting from a `split' operation, and the value of an -`ENVIRON' element. In this case only, if the operand is a numeric -string, then it is considered to be of both string type and numeric -type. If at least one operand of a comparison is of string type only, -then a string comparison is performed. Any numeric operand will be -converted to a string using the value of `CONVFMT' (*note Conversion of -Strings and Numbers: Conversion.). If one operand of a comparison is -numeric, and the other operand is either numeric or both numeric and -string, then `gawk' does a numeric comparison. If both operands have -both types, then the comparison is numeric. Strings are compared by -comparing the first character of each, then the second character of -each, and so on. Thus `"10"' is less than `"9"'. If there are two -strings where one is a prefix of the other, the shorter string is less -than the longer one. Thus `"abc"' is less than `"abcd"'. 
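As a rough illustration of the rules above, the following sketch compares the same two values once as string constants and once as fields read from input. The variable names `s' and `n' are assumptions chosen for this example only.

```shell
out=$(echo '10 9' | awk '{
    # Both operands are string constants: string comparison,
    # and "1" sorts before "9", so the result is 1.
    s = ("10" < "9")
    # Both operands are fields that look like numbers: numeric
    # comparison, and 10 is not less than 9, so the result is 0.
    n = ($1 < $2)
    print s, n
}')
echo "$out"   # prints: 1 0
```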
- - Here are some sample expressions, how `gawk' compares them, and what -the result of the comparison is. - -`1.5 <= 2.0' - numeric comparison (true) - -`"abc" >= "xyz"' - string comparison (false) - -`1.5 != " +2"' - string comparison (true) - -`"1e2" < "3"' - string comparison (true) - -`a = 2; b = "2"' -`a == b' - string comparison (true) - - echo 1e2 3 | awk '{ print ($1 < $2) ? "true" : "false" }' - -prints `false' since both `$1' and `$2' are numeric strings and thus -have both string and numeric types, thus dictating a numeric comparison. - - The purpose of the comparison rules and the use of numeric strings is -to attempt to produce the behavior that is "least surprising," while -still "doing the right thing." - - String comparisons and regular expression comparisons are very -different. For example, - - $1 == "foo" - -has the value of 1, or is true, if the first field of the current input -record is precisely `foo'. By contrast, - - $1 ~ /foo/ - -has the value 1 if the first field contains `foo', such as `foobar'. - - The right hand operand of the `~' and `!~' operators may be either a -constant regexp (`/.../'), or it may be an ordinary expression, in -which case the value of the expression as a string is a dynamic regexp -(*note How to Use Regular Expressions: Regexp Usage.). - - In very recent implementations of `awk', a constant regular -expression in slashes by itself is also an expression. The regexp -`/REGEXP/' is an abbreviation for this comparison expression: - - $0 ~ /REGEXP/ - - In some contexts it may be necessary to write parentheses around the -regexp to avoid confusing the `gawk' parser. For example, `(/x/ - /y/) -> threshold' is not allowed, but `((/x/) - (/y/)) > threshold' parses -properly. - - One special place where `/foo/' is *not* an abbreviation for `$0 ~ -/foo/' is when it is the right-hand operand of `~' or `!~'! *Note -Constant Expressions: Constants, where this is discussed in more detail. 
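The difference between string comparison and regexp matching discussed in this node can be seen in a short sketch. The input `foobar' and the variable names `eq' and `re' are illustrative assumptions.

```shell
out=$(echo 'foobar' | awk '{
    # Exact string comparison: "foobar" is not precisely "foo".
    eq = ($1 == "foo")
    # Regexp match: "foobar" contains "foo".
    re = ($1 ~ /foo/)
    print eq, re
}')
echo "$out"   # prints: 0 1
```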
- - -File: gawk.info, Node: Boolean Ops, Next: Assignment Ops, Prev: Comparison Ops, Up: Expressions - -Boolean Expressions -=================== - - A "boolean expression" is a combination of comparison expressions or -matching expressions, using the boolean operators "or" (`||'), "and" -(`&&'), and "not" (`!'), along with parentheses to control nesting. -The truth of the boolean expression is computed by combining the truth -values of the component expressions. - - Boolean expressions can be used wherever comparison and matching -expressions can be used. They can be used in `if', `while' `do' and -`for' statements. They have numeric values (1 if true, 0 if false), -which come into play if the result of the boolean expression is stored -in a variable, or used in arithmetic. - - In addition, every boolean expression is also a valid boolean -pattern, so you can use it as a pattern to control the execution of -rules. - - Here are descriptions of the three boolean operators, with an -example of each. It may be instructive to compare these examples with -the analogous examples of boolean patterns (*note Boolean Operators and -Patterns: Boolean Patterns.), which use the same boolean operators in -patterns instead of expressions. - -`BOOLEAN1 && BOOLEAN2' - True if both BOOLEAN1 and BOOLEAN2 are true. For example, the - following statement prints the current input record if it contains - both `2400' and `foo'. - - if ($0 ~ /2400/ && $0 ~ /foo/) print - - The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is true. - This can make a difference when BOOLEAN2 contains expressions that - have side effects: in the case of `$0 ~ /foo/ && ($2 == bar++)', - the variable `bar' is not incremented if there is no `foo' in the - record. - -`BOOLEAN1 || BOOLEAN2' - True if at least one of BOOLEAN1 or BOOLEAN2 is true. For - example, the following command prints all records in the input - file `BBS-list' that contain *either* `2400' or `foo', or both. 
- - awk '{ if ($0 ~ /2400/ || $0 ~ /foo/) print }' BBS-list - - The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is false. - This can make a difference when BOOLEAN2 contains expressions - that have side effects. - -`!BOOLEAN' - True if BOOLEAN is false. For example, the following program - prints all records in the input file `BBS-list' that do *not* - contain the string `foo'. - - awk '{ if (! ($0 ~ /foo/)) print }' BBS-list - - -File: gawk.info, Node: Assignment Ops, Next: Increment Ops, Prev: Boolean Ops, Up: Expressions - -Assignment Expressions -====================== - - An "assignment" is an expression that stores a new value into a -variable. For example, let's assign the value 1 to the variable `z': - - z = 1 - - After this expression is executed, the variable `z' has the value 1. -Whatever old value `z' had before the assignment is forgotten. - - Assignments can store string values also. For example, this would -store the value `"this food is good"' in the variable `message': - - thing = "food" - predicate = "good" - message = "this " thing " is " predicate - -(This also illustrates concatenation of strings.) - - The `=' sign is called an "assignment operator". It is the simplest -assignment operator because the value of the right-hand operand is -stored unchanged. - - Most operators (addition, concatenation, and so on) have no effect -except to compute a value. If you ignore the value, you might as well -not use the operator. An assignment operator is different; it does -produce a value, but even if you ignore the value, the assignment still -makes itself felt through the alteration of the variable. We call this -a "side effect". - - The left-hand operand of an assignment need not be a variable (*note -Variables::.); it can also be a field (*note Changing the Contents of a -Field: Changing Fields.) or an array element (*note Arrays in `awk': -Arrays.). 
These are all called "lvalues", which means they can appear -on the left-hand side of an assignment operator. The right-hand -operand may be any expression; it produces the new value which the -assignment stores in the specified variable, field or array element. - - It is important to note that variables do *not* have permanent types. -The type of a variable is simply the type of whatever value it happens -to hold at the moment. In the following program fragment, the variable -`foo' has a numeric value at first, and a string value later on: - - foo = 1 - print foo - foo = "bar" - print foo - -When the second assignment gives `foo' a string value, the fact that it -previously had a numeric value is forgotten. - - An assignment is an expression, so it has a value: the same value -that is assigned. Thus, `z = 1' as an expression has the value 1. One -consequence of this is that you can write multiple assignments together: - - x = y = z = 0 - -stores the value 0 in all three variables. It does this because the -value of `z = 0', which is 0, is stored into `y', and then the value of -`y = z = 0', which is 0, is stored into `x'. - - You can use an assignment anywhere an expression is called for. For -example, it is valid to write `x != (y = 1)' to set `y' to 1 and then -test whether `x' equals 1. But this style tends to make programs hard -to read; except in a one-shot program, you should rewrite it to get rid -of such nesting of assignments. This is never very hard. - - Aside from `=', there are several other assignment operators that do -arithmetic with the old value of the variable. For example, the -operator `+=' computes a new value by adding the right-hand value to -the old value of the variable. Thus, the following assignment adds 5 -to the value of `foo': - - foo += 5 - -This is precisely equivalent to the following: - - foo = foo + 5 - -Use whichever one makes the meaning of your program clearer. - - Here is a table of the arithmetic assignment operators. 
In each -case, the right-hand operand is an expression whose value is converted -to a number. - -`LVALUE += INCREMENT' - Adds INCREMENT to the value of LVALUE to make the new value of - LVALUE. - -`LVALUE -= DECREMENT' - Subtracts DECREMENT from the value of LVALUE. - -`LVALUE *= COEFFICIENT' - Multiplies the value of LVALUE by COEFFICIENT. - -`LVALUE /= QUOTIENT' - Divides the value of LVALUE by QUOTIENT. - -`LVALUE %= MODULUS' - Sets LVALUE to its remainder by MODULUS. - -`LVALUE ^= POWER' -`LVALUE **= POWER' - Raises LVALUE to the power POWER. (Only the `^=' operator is - specified by POSIX.) - - -File: gawk.info, Node: Increment Ops, Next: Conversion, Prev: Assignment Ops, Up: Expressions - -Increment Operators -=================== - - "Increment operators" increase or decrease the value of a variable -by 1. You could do the same thing with an assignment operator, so the -increment operators add no power to the `awk' language; but they are -convenient abbreviations for something very common. - - The operator to add 1 is written `++'. It can be used to increment -a variable either before or after taking its value. - - To pre-increment a variable V, write `++V'. This adds 1 to the -value of V and that new value is also the value of this expression. -The assignment expression `V += 1' is completely equivalent. - - Writing the `++' after the variable specifies post-increment. This -increments the variable value just the same; the difference is that the -value of the increment expression itself is the variable's *old* value. -Thus, if `foo' has the value 4, then the expression `foo++' has the -value 4, but it changes the value of `foo' to 5. - - The post-increment `foo++' is nearly equivalent to writing `(foo += -1) - 1'. It is not perfectly equivalent because all numbers in `awk' -are floating point: in floating point, `foo + 1 - 1' does not -necessarily equal `foo'. 
But the difference is minute as long as you -stick to numbers that are fairly small (less than a trillion). - - Any lvalue can be incremented. Fields and array elements are -incremented just like variables. (Use `$(i++)' when you wish to do a -field reference and a variable increment at the same time. The -parentheses are necessary because of the precedence of the field -reference operator, `$'.) - - The decrement operator `--' works just like `++' except that it -subtracts 1 instead of adding. Like `++', it can be used before the -lvalue to pre-decrement or after it to post-decrement. - - Here is a summary of increment and decrement expressions. - -`++LVALUE' - This expression increments LVALUE and the new value becomes the - value of this expression. - -`LVALUE++' - This expression causes the contents of LVALUE to be incremented. - The value of the expression is the *old* value of LVALUE. - -`--LVALUE' - Like `++LVALUE', but instead of adding, it subtracts. It - decrements LVALUE and delivers the value that results. - -`LVALUE--' - Like `LVALUE++', but instead of adding, it subtracts. It - decrements LVALUE. The value of the expression is the *old* value - of LVALUE. - - -File: gawk.info, Node: Conversion, Next: Values, Prev: Increment Ops, Up: Expressions - -Conversion of Strings and Numbers -================================= - - Strings are converted to numbers, and numbers to strings, if the -context of the `awk' program demands it. For example, if the value of -either `foo' or `bar' in the expression `foo + bar' happens to be a -string, it is converted to a number before the addition is performed. -If numeric values appear in string concatenation, they are converted to -strings. Consider this: - - two = 2; three = 3 - print (two three) + 4 - -This eventually prints the (numeric) value 27. 
The numeric values of -the variables `two' and `three' are converted to strings and -concatenated together, and the resulting string is converted back to the -number 23, to which 4 is then added. - - If, for some reason, you need to force a number to be converted to a -string, concatenate the null string with that number. To force a string -to be converted to a number, add zero to that string. - - A string is converted to a number by interpreting a numeric prefix -of the string as numerals: `"2.5"' converts to 2.5, `"1e3"' converts to -1000, and `"25fix"' has a numeric value of 25. Strings that can't be -interpreted as valid numbers are converted to zero. - - The exact manner in which numbers are converted into strings is -controlled by the `awk' built-in variable `CONVFMT' (*note Built-in -Variables::.). Numbers are converted using a special version of the -`sprintf' function (*note Built-in Functions: Built-in.) with `CONVFMT' -as the format specifier. - - `CONVFMT''s default value is `"%.6g"', which prints a value with at -most six significant digits. For some applications you will want to -change it to specify more precision. Double precision on most modern -machines gives you 16 or 17 decimal digits of precision. - - Strange results can happen if you set `CONVFMT' to a string that -doesn't tell `sprintf' how to format floating point numbers in a useful -way. For example, if you forget the `%' in the format, all numbers -will be converted to the same constant string. - - As a special case, if a number is an integer, then the result of -converting it to a string is *always* an integer, no matter what the -value of `CONVFMT' may be. Given the following code fragment: - - CONVFMT = "%2.2f" - a = 12 - b = a "" - -`b' has the value `"12"', not `"12.00"'. - - Prior to the POSIX standard, `awk' specified that the value of -`OFMT' was used for converting numbers to strings. `OFMT' specifies -the output format to use when printing numbers with `print'. 
`CONVFMT' -was introduced in order to separate the semantics of conversions from -the semantics of printing. Both `CONVFMT' and `OFMT' have the same -default value: `"%.6g"'. In the vast majority of cases, old `awk' -programs will not change their behavior. However, this use of `OFMT' -is something to keep in mind if you must port your program to other -implementations of `awk'; we recommend that instead of changing your -programs, you just port `gawk' itself! - - -File: gawk.info, Node: Values, Next: Conditional Exp, Prev: Conversion, Up: Expressions - -Numeric and String Values -========================= - - Through most of this manual, we present `awk' values (such as -constants, fields, or variables) as *either* numbers *or* strings. -This is a convenient way to think about them, since typically they are -used in only one way, or the other. - - In truth though, `awk' values can be *both* string and numeric, at -the same time. Internally, `awk' represents values with a string, a -(floating point) number, and an indication that one, the other, or both -representations of the value are valid. - - Keeping track of both kinds of values is important for execution -efficiency: a variable can acquire a string value the first time it is -used as a string, and then that string value can be used until the -variable is assigned a new value. Thus, if a variable with only a -numeric value is used in several concatenations in a row, it only has -to be given a string representation once. The numeric value remains -valid, so that no conversion back to a number is necessary if the -variable is later used in an arithmetic expression. - - Tracking both kinds of values is also important for precise numerical -calculations. Consider the following: - - a = 123.321 - CONVFMT = "%3.1f" - b = a " is a number" - c = a + 1.654 - -The variable `a' receives a string value in the concatenation and -assignment to `b'. The string value of `a' is `"123.3"'. 
If the -numeric value was lost when it was converted to a string, then the -numeric use of `a' in the last statement would lose information. `c' -would be assigned the value 124.954 instead of 124.975. Such errors -accumulate rapidly, and very adversely affect numeric computations. - - Once a numeric value acquires a corresponding string value, it stays -valid until a new assignment is made. If `CONVFMT' (*note Conversion -of Strings and Numbers: Conversion.) changes in the meantime, the old -string value will still be used. For example: - - BEGIN { - CONVFMT = "%2.2f" - a = 123.456 - b = a "" # force `a' to have string value too - printf "a = %s\n", a - CONVFMT = "%.6g" - printf "a = %s\n", a - a += 0 # make `a' numeric only again - printf "a = %s\n", a # use `a' as string - } - -This program prints `a = 123.46' twice, and then prints `a = 123.456'. - - *Note Conversion of Strings and Numbers: Conversion, for the rules -that specify how string values are made from numeric values. - - -File: gawk.info, Node: Conditional Exp, Next: Function Calls, Prev: Values, Up: Expressions - -Conditional Expressions -======================= - - A "conditional expression" is a special kind of expression with -three operands. It allows you to use one expression's value to select -one of two other expressions. - - The conditional expression looks the same as in the C language: - - SELECTOR ? IF-TRUE-EXP : IF-FALSE-EXP - -There are three subexpressions. The first, SELECTOR, is always -computed first. If it is "true" (not zero and not null) then -IF-TRUE-EXP is computed next and its value becomes the value of the -whole expression. Otherwise, IF-FALSE-EXP is computed next and its -value becomes the value of the whole expression. - - For example, this expression produces the absolute value of `x': - - x > 0 ? x : -x - - Each time the conditional expression is computed, exactly one of -IF-TRUE-EXP and IF-FALSE-EXP is computed; the other is ignored. 
This -is important when the expressions contain side effects. For example, -this conditional expression examines element `i' of either array `a' or -array `b', and increments `i'. - - x == y ? a[i++] : b[i++] - -This is guaranteed to increment `i' exactly once, because each time one -or the other of the two increment expressions is executed, and the -other is not. - - -File: gawk.info, Node: Function Calls, Next: Precedence, Prev: Conditional Exp, Up: Expressions - -Function Calls -============== - - A "function" is a name for a particular calculation. Because it has -a name, you can ask for it by name at any point in the program. For -example, the function `sqrt' computes the square root of a number. - - A fixed set of functions are "built-in", which means they are -available in every `awk' program. The `sqrt' function is one of these. -*Note Built-in Functions: Built-in, for a list of built-in functions -and their descriptions. In addition, you can define your own functions -in the program for use elsewhere in the same program. *Note -User-defined Functions: User-defined, for how to do this. - - The way to use a function is with a "function call" expression, -which consists of the function name followed by a list of "arguments" -in parentheses. The arguments are expressions which give the raw -materials for the calculation that the function will do. When there is -more than one argument, they are separated by commas. If there are no -arguments, write just `()' after the function name. Here are some -examples: - - sqrt(x^2 + y^2) # One argument - atan2(y, x) # Two arguments - rand() # No arguments - - *Do not put any space between the function name and the -open-parenthesis!* A user-defined function name looks just like the -name of a variable, and space would make the expression look like -concatenation of a variable with an expression inside parentheses. 
-Space before the parenthesis is harmless with built-in functions, but -it is best not to get into the habit of using space to avoid mistakes -with user-defined functions. - - Each function expects a particular number of arguments. For -example, the `sqrt' function must be called with a single argument, the -number to take the square root of: - - sqrt(ARGUMENT) - - Some of the built-in functions allow you to omit the final argument. -If you do so, they use a reasonable default. *Note Built-in Functions: -Built-in, for full details. If arguments are omitted in calls to -user-defined functions, then those arguments are treated as local -variables, initialized to the null string (*note User-defined -Functions: User-defined.). - - Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of `sqrt(ARGUMENT)' is the square root of the -argument. A function can also have side effects, such as assigning the -values of certain variables or doing I/O. - - Here is a command to read numbers, one number per line, and print the -square root of each one: - - awk '{ print "The square root of", $1, "is", sqrt($1) }' - - -File: gawk.info, Node: Precedence, Prev: Function Calls, Up: Expressions - -Operator Precedence (How Operators Nest) -======================================== - - "Operator precedence" determines how operators are grouped, when -different operators appear close by in one expression. For example, -`*' has higher precedence than `+'; thus, `a + b * c' means to multiply -`b' and `c', and then add `a' to the product (i.e., `a + (b * c)'). - - You can overrule the precedence of the operators by using -parentheses. You can think of the precedence rules as saying where the -parentheses are assumed if you do not write parentheses yourself. 
In -fact, it is wise to always use parentheses whenever you have an unusual -combination of operators, because other people who read the program may -not remember what the precedence is in this case. You might forget, -too; then you could make a mistake. Explicit parentheses will help -prevent any such mistake. - - When operators of equal precedence are used together, the leftmost -operator groups first, except for the assignment, conditional and -exponentiation operators, which group in the opposite order. Thus, `a -- b + c' groups as `(a - b) + c'; `a = b = c' groups as `a = (b = c)'. - - The precedence of prefix unary operators does not matter as long as -only unary operators are involved, because there is only one way to -parse them--innermost first. Thus, `$++i' means `$(++i)' and `++$x' -means `++($x)'. However, when another operator follows the operand, -then the precedence of the unary operators can matter. Thus, `$x^2' -means `($x)^2', but `-x^2' means `-(x^2)', because `-' has lower -precedence than `^' while `$' has higher precedence. - - Here is a table of the operators of `awk', in order of increasing -precedence: - -assignment - `=', `+=', `-=', `*=', `/=', `%=', `^=', `**='. These operators - group right-to-left. (The `**=' operator is not specified by - POSIX.) - -conditional - `?:'. This operator groups right-to-left. - -logical "or". - `||'. - -logical "and". - `&&'. - -array membership - `in'. - -matching - `~', `!~'. - -relational, and redirection - The relational operators and the redirections have the same - precedence level. Characters such as `>' serve both as - relationals and as redirections; the context distinguishes between - the two meanings. - - The relational operators are `<', `<=', `==', `!=', `>=' and `>'. - - The I/O redirection operators are `<', `>', `>>' and `|'. - - Note that I/O redirection operators in `print' and `printf' - statements belong to the statement level, not to expressions. 
The - redirection does not produce an expression which could be the - operand of another operator. As a result, it does not make sense - to use a redirection operator near another operator of lower - precedence, without parentheses. Such combinations, for example - `print foo > a ? b : c', result in syntax errors. - -concatenation - No special token is used to indicate concatenation. The operands - are simply written side by side. - -add, subtract - `+', `-'. - -multiply, divide, mod - `*', `/', `%'. - -unary plus, minus, "not" - `+', `-', `!'. - -exponentiation - `^', `**'. These operators group right-to-left. (The `**' - operator is not specified by POSIX.) - -increment, decrement - `++', `--'. - -field - `$'. - - -File: gawk.info, Node: Statements, Next: Arrays, Prev: Expressions, Up: Top - -Control Statements in Actions -***************************** - - "Control statements" such as `if', `while', and so on control the -flow of execution in `awk' programs. Most of the control statements in -`awk' are patterned on similar statements in C. - - All the control statements start with special keywords such as `if' -and `while', to distinguish them from simple expressions. - - Many control statements contain other statements; for example, the -`if' statement contains another statement which may or may not be -executed. The contained statement is called the "body". If you want -to include more than one statement in the body, group them into a -single compound statement with curly braces, separating them with -newlines or semicolons. - -* Menu: - -* If Statement:: Conditionally execute - some `awk' statements. -* While Statement:: Loop until some condition is satisfied. -* Do Statement:: Do specified action while looping until some - condition is satisfied. -* For Statement:: Another looping statement, that provides - initialization and increment clauses. -* Break Statement:: Immediately exit the innermost enclosing loop. 
-* Continue Statement:: Skip to the end of the innermost - enclosing loop. -* Next Statement:: Stop processing the current input record. -* Next File Statement:: Stop processing the current file. -* Exit Statement:: Stop execution of `awk'. - - -File: gawk.info, Node: If Statement, Next: While Statement, Prev: Statements, Up: Statements - -The `if' Statement -================== - - The `if'-`else' statement is `awk''s decision-making statement. It -looks like this: - - if (CONDITION) THEN-BODY [else ELSE-BODY] - -CONDITION is an expression that controls what the rest of the statement -will do. If CONDITION is true, THEN-BODY is executed; otherwise, -ELSE-BODY is executed (assuming that the `else' clause is present). -The `else' part of the statement is optional. The condition is -considered false if its value is zero or the null string, and true -otherwise. - - Here is an example: - - if (x % 2 == 0) - print "x is even" - else - print "x is odd" - - In this example, if the expression `x % 2 == 0' is true (that is, -the value of `x' is divisible by 2), then the first `print' statement -is executed, otherwise the second `print' statement is performed. - - If the `else' appears on the same line as THEN-BODY, and THEN-BODY -is not a compound statement (i.e., not surrounded by curly braces), -then a semicolon must separate THEN-BODY from `else'. To illustrate -this, let's rewrite the previous example: - - awk '{ if (x % 2 == 0) print "x is even"; else - print "x is odd" }' - -If you forget the `;', `awk' won't be able to parse the statement, and -you will get a syntax error. - - We would not actually write this example this way, because a human -reader might fail to see the `else' if it were not the first thing on -its line. 
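As a small sketch of the semicolon rule described above, the commands below classify each input record as even or odd. Using the first field, `$1', in place of the variable `x', and feeding sample input with `printf', are assumptions made only so that the example can be run as it stands.

```shell
# THEN-BODY is a simple statement on the same line as `else',
# so a semicolon must separate the two.
printf '4\n7\n' | awk '{ if ($1 % 2 == 0) print $1, "is even"; else print $1, "is odd" }'

# When THEN-BODY is a compound statement in braces,
# no semicolon is needed before `else'.
printf '4\n7\n' | awk '{ if ($1 % 2 == 0) { print $1, "is even" } else print $1, "is odd" }'
```

Both commands print `4 is even' and then `7 is odd'.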
- - -File: gawk.info, Node: While Statement, Next: Do Statement, Prev: If Statement, Up: Statements - -The `while' Statement -===================== - - In programming, a "loop" means a part of a program that is (or at -least can be) executed two or more times in succession. - - The `while' statement is the simplest looping statement in `awk'. -It repeatedly executes a statement as long as a condition is true. It -looks like this: - - while (CONDITION) - BODY - -Here BODY is a statement that we call the "body" of the loop, and -CONDITION is an expression that controls how long the loop keeps -running. - - The first thing the `while' statement does is test CONDITION. If -CONDITION is true, it executes the statement BODY. (CONDITION is true -when the value is not zero and not a null string.) After BODY has been -executed, CONDITION is tested again, and if it is still true, BODY is -executed again. This process repeats until CONDITION is no longer -true. If CONDITION is initially false, the body of the loop is never -executed. - - This example prints the first three fields of each record, one per -line. - - awk '{ i = 1 - while (i <= 3) { - print $i - i++ - } - }' - -Here the body of the loop is a compound statement enclosed in braces, -containing two statements. - - The loop works like this: first, the value of `i' is set to 1. -Then, the `while' tests whether `i' is less than or equal to three. -This is the case when `i' equals one, so the `i'-th field is printed. -Then the `i++' increments the value of `i' and the loop repeats. The -loop terminates when `i' reaches 4. - - As you can see, a newline is not required between the condition and -the body; but using one makes the program clearer unless the body is a -compound statement or is very simple. The newline after the open-brace -that begins the compound statement is not required either, but the -program would be hard to read without it. 
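The point that the body is never executed when CONDITION is initially false can be seen with a short sketch. Here the loop runs over all the fields of a record using the built-in variable `NF'; the counter `n' and the sample input are hypothetical, chosen only for illustration.

```shell
# Count how many times the loop body runs for each record.
# The second record is empty, so NF is 0 and the condition
# `i <= NF' is false on the very first test.
printf 'a b\n\nc\n' | awk '{ i = 1; n = 0
       while (i <= NF) {
           n++
           i++
       }
       print "record", NR, "body ran", n, "times"
     }'
```

For the empty second record this reports that the body ran 0 times.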
- - -File: gawk.info, Node: Do Statement, Next: For Statement, Prev: While Statement, Up: Statements - -The `do'-`while' Statement -========================== - - The `do' loop is a variation of the `while' looping statement. The -`do' loop executes the BODY once, then repeats BODY as long as -CONDITION is true. It looks like this: - - do - BODY - while (CONDITION) - - Even if CONDITION is false at the start, BODY is executed at least -once (and only once, unless executing BODY makes CONDITION true). -Contrast this with the corresponding `while' statement: - - while (CONDITION) - BODY - -This statement does not execute BODY even once if CONDITION is false to -begin with. - - Here is an example of a `do' statement: - - awk '{ i = 1 - do { - print $0 - i++ - } while (i <= 10) - }' - -This program prints each input record ten times. It isn't a very realistic example, -since in this case an ordinary `while' would do just as well. But this -reflects actual experience; there is only occasionally a real use for a -`do' statement. -
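The contrast between `do' and `while' can be sketched with a condition that is false from the start. The variables `i' and `j' and the `BEGIN' rule are assumptions made for illustration; the sketch also assumes an `awk' that supports the `do' statement, such as `gawk'.

```shell
awk 'BEGIN {
       i = 5
       do {                       # body runs once before the first test
           print "do body, i =", i
           i++
       } while (i <= 3)

       j = 5
       while (j <= 3) {           # tested first, so the body never runs
           print "while body, j =", j
           j++
       }
       print "done"
     }'
```

Only the `do' body produces a line of output before `done' is printed.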