Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- | doc/gawk.texi | 66 |
1 files changed, 50 insertions, 16 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi index 760340d6..fa145bb1 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -13785,8 +13785,9 @@ the program produces the following output: @subsection Scanning Multidimensional Arrays There is no special @code{for} statement for scanning a -``multidimensional'' array. There cannot be one, because, in truth, there -are no multidimensional arrays or elements---there is only a +``multidimensional'' array. There cannot be one, because, in truth, +@command{awk} does not have +multidimensional arrays or elements---there is only a multidimensional @emph{way of accessing} an array. @cindex subscripts in arrays, multidimensional, scanning @@ -13813,7 +13814,7 @@ into the individual indices by breaking it apart where the value of @code{SUBSEP} appears. The individual indices then become the elements of the array @code{separate}. -Thus, if a value is previously stored in @code{array[1, "foo"]}; then +Thus, if a value is previously stored in @code{array[1, "foo"]}, then an element with index @code{"1\034foo"} exists in @code{array}. (Recall that the default value of @code{SUBSEP} is the character with code 034.) Sooner or later, the @code{for} statement finds that index and does an @@ -13833,7 +13834,8 @@ separate indices is recovered. @node Arrays of Arrays @section Arrays of Arrays -@command{gawk} supports arrays of +@command{gawk} goes beyond standard @command{awk}'s multidimensional +array access and provides true arrays of arrays. Elements of a subarray are referred to by their own indices enclosed in square brackets, just like the elements of the main array. 
For example, the following creates a two-element subarray at index @samp{1} @@ -18314,7 +18316,7 @@ $ @kbd{gawk -f compdemo.awk} @print{} data[10] = one @print{} data[20] = two @print{} -@print{} Sort function: cmp_num_str_val @ii{Sort all numbers before all strings} +@print{} Sort function: cmp_num_str_val @ii{Sort all numeric values before all strings} @print{} data[one] = 10 @print{} data[two] = 20 @print{} data[100] = 100 @@ -18323,7 +18325,7 @@ $ @kbd{gawk -f compdemo.awk} @end example Consider sorting the entries of a GNU/Linux system password file -according to login names. The following program sorts records +according to login name. The following program sorts records by a specific field position and can be used for this purpose: @example @@ -18464,7 +18466,7 @@ or ``sort based on comparing the values in descending order.'' Having to write a simple comparison function for this purpose for use in all of your programs becomes tedious. For the common simple cases, @command{gawk} provides -the option of supplying special names that do the requested +the option of using special names that do the requested sorting for you. You can think of them as ``predefined'' sorting functions, if you like, although the names purposely include characters @@ -18473,6 +18475,10 @@ that are not valid in real @command{awk} function names. The following special values are available: @table @code +@item "@@unsorted" +Array elements are processed in arbitrary order, which is the default +@command{awk} behavior. + @item "@@ind_str_asc" Order by indices compared as strings; this is the most basic sort. (Internally, array indices are always strings, so with @samp{a[2*5] = 1} @@ -18498,12 +18504,13 @@ Order by element values rather than by indices. Scalar values are compared as numbers. Subarrays, if present, come out last. 
When numeric values are equal, the string values are used to provide an ordering: this guarantees consistent results across different -versions of the C @code{qsort()} function.@footnote{When two elements +versions of the C @code{qsort()} function@footnote{When two elements compare as equal, the C @code{qsort()} function does not guarantee that they will maintain their original relative order after sorting. Using the string value to provide a unique ordering when the numeric values are equal ensures that @command{gawk} behaves consistently -across different environments.} +across different environments.}, which @command{gawk} uses internally +to perform the sorting. @item "@@ind_str_desc" Reverse order from the most basic sort. @@ -18521,12 +18528,6 @@ Subarrays, if present, come out first. @item "@@val_num_desc" Element values, treated as numbers, ordered from high to low. Subarrays, if present, come out first. - -@item "@@unsorted" -Array elements are processed in arbitrary order, which is the normal -@command{awk} behavior. You can also get the normal behavior by just -deleting the @code{"sorted_in"} element from the @code{PROCINFO} array, -if it previously had a value assigned to it. @end table The array traversal order is determined before the @code{for} loop @@ -18561,6 +18562,36 @@ numeric value, regardless of what the subarray itself contains, and all subarrays are treated as being equal to each other. Their order relative to each other is determined by their index strings. +Here are some additional things to bear in mind about sorted +array traversal. + +@itemize @bullet +@item +The value of @code{PROCINFO["sorted_in"]} is global. That is, it affects +all array traversal @code{for} loops. 
If you need to change it within your +own function, you should see if it's defined and save and restore the value: + +@example +function myfunct(p1, p2, save_sorted) +@{ + @dots{} + if ("sorted_in" in PROCINFO) @{ + save_sorted = PROCINFO["sorted_in"] + PROCINFO["sorted_in"] = "@@val_str_desc" # or whatever + @} + @dots{} + if (save_sorted) + PROCINFO["sorted_in"] = save_sorted +@} +@end example + +@item +As mentioned, the default array traversal order is represented by +@code{"@@unsorted"}. You can also get the default behavior by assigning +the null string to @code{PROCINFO["sorted_in"]} or by just deleting the +@code{"sorted_in"} element from the @code{PROCINFO} array. +@end itemize + @node Array Sorting Functions @subsection Sorting Array Values and Indices with @command{gawk} @@ -18588,7 +18619,10 @@ After the call to @code{asort()}, the array @code{data} is indexed from 1 to some number @var{n}, the total number of elements in @code{data}. (This count is @code{asort()}'s return value.) @code{data[1]} @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on. -The array elements are compared as strings. +The comparison is based on the type of the elements +(@pxref{Typing and Comparison}). +All numeric values come before all string values, +which in turn come before all subarrays. @cindex side effects, @code{asort()} function An important side effect of calling @code{asort()} is that |