author: Arnold D. Robbins <arnold@skeeve.com> 2011-04-24 13:03:28 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2011-04-24 13:03:28 +0300
commit: 963bfd011dffc072a3fdc63b74910679d6660f50
tree: f77920bafaa693f657eef6b781b4e419409c51bb /doc/gawk.texi
parent: 8eb45b02e704c95866970005fe771e3507fb935c
Update docs some more.
Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- doc/gawk.texi 67
1 files changed, 37 insertions, 30 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index b4b014e7..60cfd1d7 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -12766,7 +12766,8 @@ You can omit direction and/or key type and/or comparison mode. Provided
that at least one is present, the missing parts of a sort specification
default to @samp{ascending}, @samp{index}, and @samp{string}, respectively.
An empty string, @code{""}, is the same as @samp{ascending index string},
-and a value of @samp{unsorted} will cause @samp{for (index in array) @dots{}} to process
+and a value of @samp{unsorted} will cause @samp{for (index in array) @dots{}}
+to process
the indices in arbitrary order. Another thing to note is that the array sorting
takes place at the time the @code{for} loop is about to
start executing, so changing the value of @code{PROCINFO["sorted_in"]}
@@ -13379,6 +13380,7 @@ END @{
@menu
* Controlling Scanning:: Controlling the order in which arrays are scanned.
+* Controlling Scanning With A Function:: Using a function to control scanning.
@end menu
In programs that use arrays, it is often necessary to use a loop that
@@ -13531,10 +13533,11 @@ numeric value, regardless of what the subarray itself contains,
and all subarrays are treated as being equal to each other. Their
order relative to each other is determined by their index strings.
+@node Controlling Scanning With A Function
@subsubsection Controlling Array Scanning Order With a User-defined Function
-The value of @code{PROCINFO["sorted_in"]} can also be a function name
-that will let you traverse an array based on any custom criterion.
+The value of @code{PROCINFO["sorted_in"]} can also be a function name.
+This lets you traverse an array based on any custom criterion.
The array elements are ordered according to the return value of this
function. This comparison function should be defined with at least
four arguments:
@@ -13553,16 +13556,19 @@ Either @var{v1} or @var{v2}, or both, can be arrays if the array being
traversed contains subarrays as values. The three possible return values
are interpreted this way:
-@quotation
-* If the return value of @code{comp_func(i1, v1, i2, v2)} is less than 0,
+@itemize @bullet
+@item
+If the return value of @code{comp_func(i1, v1, i2, v2)} is less than zero,
index @var{i1} comes before index @var{i2} during loop traversal.
-* If @code{comp_func(i1, v1, i2, v2)} returns 0, @var{i1} and @var{i2}
-come together but relative order with respect to each other is undefined.
+@item
+If @code{comp_func(i1, v1, i2, v2)} returns zero, @var{i1} and @var{i2}
+come together but the relative order with respect to each other is undefined.
-* If the return value of @code{comp_func(i1, v1, i2, v2)} is greater than 0,
+@item
+If the return value of @code{comp_func(i1, v1, i2, v2)} is greater than zero,
@var{i1} comes after @var{i2}.
-@end quotation
+@end itemize
The following comparison function can be used to scan an array in
numerical order of the indices:
@@ -13575,22 +13581,24 @@ function cmp_num_idx(i1, v1, i2, v2)
@}
@end example
-This function will traverse an array based on an order by element values
+This function traverses an array based on an order by element values
rather than by indices:
@example
function cmp_str_val(i1, v1, i2, v2)
@{
# string value comparison, ascending order
- v1 = v1 ""
- v2 = v2 ""
- if (v1 < v2) return -1
+ v1 = v1 ""
+ v2 = v2 ""
+ if (v1 < v2)
+ return -1
return (v1 != v2)
@}
@end example
-A comparison function to make all numbers, and numeric strings without
-any leading or trailing spaces come out first during loop traversal:
+Here is a
+comparison function to make all numbers, and numeric strings without
+any leading or trailing spaces, come out first during loop traversal:
@example
function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
@@ -13612,7 +13620,7 @@ by a specific field position can be used for this purpose:
@example
# sort.awk --- simple program to sort by field position
-# field position is specified by POS
+# field position is specified by the global variable POS
function cmp_field(i1, v1, i2, v2)
@{
@@ -13642,7 +13650,7 @@ and the fields are separated by colons. Running the program produces the
following output:
@example
-@kbd{$ gawk -vPOS=1 -F: -f sort.awk /etc/passwd}
+$ @kbd{gawk -vPOS=1 -F: -f sort.awk /etc/passwd}
@print{} adm:x:3:4:adm:/var/adm:/sbin/nologin
@print{} apache:x:48:48:Apache:/var/www:/sbin/nologin
@print{} avahi:x:70:70:Avahi daemon:/:/sbin/nologin
@@ -13665,14 +13673,14 @@ function cmp_randomize(i1, v1, i2, v2)
As mentioned above, the order of the indices is arbitrary if two
elements compare equal. This is usually not a problem, but letting
-the tied elements come out in arbitrary order can be an issue, specially
+the tied elements come out in arbitrary order can be an issue, especially
when comparing item values. The partial ordering of the equal elements
-may change during next loop traversal, if other elements are added or
+may change during the next loop traversal, if other elements are added or
removed from the array. One way to resolve ties when comparing elements
with otherwise equal values is to include the indices in the comparison
rules. Note that doing this may make the loop traversal less efficient,
so consider it only if necessary. The following comparison functions
-will force a deterministic order, and are based on the fact that the
+force a deterministic order, and are based on the fact that the
indices of two elements are never equal:
@example
@@ -13691,32 +13699,31 @@ function cmp_string(i1, v1, i2, v2)
@}
@end example
-@ignore
-Avoid using the term stable when describing the unpredictable behavior
-if two items compare equal. Usually, the goal of a "stable algorithm"
-is to maintain the original order of the items, which is a meaningless
-concept for a list constructed from a hash.
-@end ignore
+@c Avoid using the term ``stable'' when describing the unpredictable behavior
+@c if two items compare equal. Usually, the goal of a "stable algorithm"
+@c is to maintain the original order of the items, which is a meaningless
+@c concept for a list constructed from a hash.
A custom comparison function can often simplify ordered loop
traversal, and the sky is really the limit when it comes to
designing such a function.
-
When string comparisons are made during a sort, either for element
-values where one or both aren't numbers or for element indices
+values where one or both aren't numbers, or for element indices
handled as strings, the value of @code{IGNORECASE}
(@pxref{Built-in Variables}) controls whether
-the comparisons treat corresponding upper and lower case letters as
+the comparisons treat corresponding uppercase and lowercase letters as
equivalent or distinct.
-This sorting extension is disabled in POSIX mode,
+All sorting based on @code{PROCINFO["sorted_in"]}
+is disabled in POSIX mode,
since the @code{PROCINFO} array is not special in that case.
As a side note, sorting the array indices before traversing
the array has been reported to add 15% to 20% overhead to the
execution time of @command{awk} programs. For this reason,
sorted array traversal is not the default.
+
@c The @command{gawk}
@c maintainers believe that only the people who wish to use a
@c feature should have to pay for it.
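
The comparison-function contract documented in this patch can be tried out with a small script. The following is a minimal sketch and is not taken from the manual. It assumes a `gawk` binary that accepts a function name in `PROCINFO["sorted_in"]` (assumed here to be version 4.0 or later). The names `by_value_then_index`, `cmp_val_then_idx`, and `count` are hypothetical and chosen only for illustration. The function orders elements by numeric value and breaks ties with the index, so it never returns zero and the traversal order is deterministic.

```shell
# Assumption: gawk (4.0 or later) is installed and on PATH as "gawk".
# by_value_then_index is a hypothetical helper: it reads "name number"
# pairs on standard input and prints them ordered by number, then by name.
by_value_then_index() {
    gawk '
    # Hypothetical comparison function following the four-argument
    # contract: negative means i1 first, positive means i2 first.
    function cmp_val_then_idx(i1, v1, i2, v2)
    {
        # Compare the values numerically, in ascending order.
        if (v1 + 0 != v2 + 0)
            return (v1 + 0 < v2 + 0) ? -1 : 1
        # Values tie: fall back to a string comparison of the indices.
        # Two indices are never equal, so zero is never returned.
        return (i1 "" < i2 "") ? -1 : 1
    }

    { count[$1] = $2 }

    END {
        # Set the traversal order just before the loop starts.
        PROCINFO["sorted_in"] = "cmp_val_then_idx"
        for (k in count)
            print k, count[k]
    }'
}
```

Under those assumptions, running `printf "pear 3\napple 1\nfig 3\nkiwi 2\n" | by_value_then_index` should print `apple 1`, `kiwi 2`, `fig 3`, and `pear 3`, one per line. The two elements with value 3 come out in index order rather than arbitrary order. As the text above notes, the tie-break adds comparison work, so it is worth including only when a reproducible order matters.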