author | Arnold D. Robbins <arnold@skeeve.com> | 2011-02-17 08:40:51 +0200 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2011-02-17 08:40:51 +0200 |
commit | c6c72b4c08ecc138bcb3453398d756f3702acb11 (patch) | |
tree | b5e16087872343e44196da9f567f439d4226b00e | |
parent | e4e45aaeb8336c6f32be9f147b0d37d04243a2aa (diff) | |
PROCINFO["sorted_in"] value now matters.
-rw-r--r-- | ChangeLog | 9 |
-rw-r--r-- | NEWS | 4 |
-rw-r--r-- | doc/ChangeLog | 7 |
-rw-r--r-- | doc/gawk.info | 514 |
-rw-r--r-- | doc/gawk.texi | 41 |
-rw-r--r-- | eval.c | 264 |
-rw-r--r-- | test/ChangeLog | 5 |
-rw-r--r-- | test/sortfor.awk | 7 |
-rw-r--r-- | test/sortfor.ok | 289 |
9 files changed, 846 insertions, 294 deletions
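
For orientation before the diff itself: the eval.c changes build the ascending order from a memcmp() over the common prefix with a length tie-break, and obtain the descending order by negating that result. The following is a minimal, standalone sketch of that idea, not the code from this commit. The names `struct index_key`, `key_up`, `key_down` and `sort_keys` are hypothetical stand-ins (gawk's real code works on `NODE` with `ahname_str`/`ahname_len`), and the result is normalized to -1/0/1 here, which is an assumption of the sketch rather than something the commit does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gawk's NODE: an index string plus its length. */
struct index_key {
    const char *str;
    size_t len;
};

/* Ascending order: compare the common prefix; if it matches, the shorter key sorts first. */
static int
key_up(const void *p1, const void *p2)
{
    const struct index_key *k1 = *(const struct index_key *const *) p1;
    const struct index_key *k2 = *(const struct index_key *const *) p2;
    size_t n = (k1->len < k2->len) ? k1->len : k2->len;
    int ret = memcmp(k1->str, k2->str, n);

    if (ret == 0 && k1->len != k2->len)
        ret = (k1->len < k2->len) ? -1 : 1;
    /* Normalize to -1, 0 or 1 so the negation in key_down() is always safe. */
    return (ret > 0) - (ret < 0);
}

/* Descending order: negate the ascending result. */
static int
key_down(const void *p1, const void *p2)
{
    return -key_up(p1, p2);
}

/* Sort an array of key pointers in place; descending if the flag is nonzero. */
static void
sort_keys(const struct index_key **keys, size_t count, int descending)
{
    assert(keys != NULL || count == 0);
    qsort(keys, count, sizeof keys[0], descending ? key_down : key_up);
}
```

Calling `sort_keys(keys, n, 0)` on pointers to the keys "b", "ab" and "a" would order them "a", "ab", "b"; passing a nonzero flag reverses that. Because array indices are distinct, the two comparators never have to break ties, which is why plain negation is enough for the case-sensitive orderings.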
@@ -1,3 +1,10 @@ +Tue Feb 15 17:11:26 2011 Pat Rankin <rankin@pactechdata.com> + + * eval.c (sorted_in, sort_up_indx_str, sort_down_indx_str, + sort_up_indx_ignrcase, sort_down_indx_ignrcase, sort_ignorecase): + New functions to sort arrays for `for (index in array)' statements. + (r_interpret: case Op_arrayfor_init): Call sorted_in(). + Wed Feb 16 07:12:50 2011 John Haque <j.eh@mchsi.com> Fix line numbers in the lint, warning and error messages issued @@ -61,7 +68,7 @@ Fri Feb 11 10:26:25 2011 Arnold D. Robbins <arnold@skeeve.com> Thu Feb 10 21:31:36 2011 Andreas Buening <andreas.buening@nexgo.de> - * main.c (load_procinfo): Fix warning about unsed variables in + * main.c (load_procinfo): Fix warning about unsed variables if we don't have multiple groups. * protos.h: Move decls for many standard functions here if they aren't in the header files (OS/2) and bracket inside @@ -96,7 +96,9 @@ Changes from 3.1.8 to 4.0.0 - Probably others that I've forgotten 29. If PROCINFO["sorted_in"] exists, for(iggy in foo) loops sort the - indices before looping over them. + indices before looping over them. The value of this element + provides control over how the indices are sorted before the loop + traversal starts. 30. A new isarray() function exists to distinguish if an item is an array or not, to make it possible to traverse multidimensional arrays. diff --git a/doc/ChangeLog b/doc/ChangeLog index 57828c97..9c2ad6ec 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,3 +1,10 @@ +Tue Feb 15 17:11:26 2011 Pat Rankin <rankin@pactechdata.com> + + * gawk.texi (Builit-in Variables: PROCINFO array, Scanning All + Elements of an Array: `for' statement): Document that the value + of PROCINFO["sorted_in"] matters; sort orders "ascending index + string", "descending index string", and "unsorted" are supported. + Sun Feb 13 19:58:35 2011 Arnold D. 
Robbins <arnold@skeeve.com> * awkcard.in, gawk.1, gawk.texi, gawkinet.texi: Fix typos diff --git a/doc/gawk.info b/doc/gawk.info index 46368574..bfac4508 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -9371,10 +9371,19 @@ with a pound sign (`#'). The value of the `getuid()' system call. `PROCINFO["sorted_in"]' - If this element exists in `PROCINFO', _no matter what its - value_, then `gawk' will cause `for(i in arr) ...' loops to - traverse the array indices in sorted order. *Note Scanning - an Array::, for more information. + If this element exists in `PROCINFO', its value controls the + order in which array indices will be processed by `for(i in + arr) ...' loops. A value of `"ascending index string"', + which may be shortened to `"ascending index"' or just + `"ascending"', will result in either case sensitive or case + insensitive ascending order depending upon the value of + `IGNORECASE'. A value of `"descending index string"', which + may be shortened in a similar manner, will result in the + opposite order. The value `"unsorted"' is also recognized, + yielding the default result of arbitrary order. Any other + value will be ignored, and warned about (at the time of first + `for(in in arr) ...' execution) when lint checking is enabled. + *Note Scanning an Array::, for more information. `PROCINFO["strftime"]' The default time format string for `strftime()'. Assigning a @@ -9888,14 +9897,16 @@ them. Similarly, changing VAR inside the loop may produce strange results. It is best to avoid such things. As an extension, `gawk' makes it possible for you to loop over the -elements of an array in order, sorted by index. Sorting is based on -string comparison (since all array indices are strings), and you cannot -control the style of sorting; it is always from lowest to highest -(`"A"' before `"B"'). To enable this feature, create the array element -`PROCINFO["sorted_in"]' (*note Auto-set::). 
The value of this element -does not matter; `gawk' only tests if the element with this index -exists in `PROCINFO' or not. This extension is disabled in POSIX mode, -since the `PROCINFO' array is not special in that case. For example: +elements of an array in order, based on the value of +`PROCINFO["sorted_in"]' (*note Auto-set::). At present two sorting +options are available: `"ascending index string"' and `"descending +index string"'. They can be shortened by omitting `string' or `index +string'. The value `"unsorted"' can be used as an explicit "no-op" and +yields the same result as when `PROCINFO["sorted_in"]' has no value at +all. If the index strings contain letters, the value of `IGNORECASE' +affects the order of the result. This extension is disabled in POSIX +mode, since the `PROCINFO' array is not special in that case. For +example: $ gawk 'BEGIN { > a[4] = 4 @@ -9906,7 +9917,7 @@ since the `PROCINFO' array is not special in that case. For example: -| 4 4 -| 3 3 $ gawk 'BEGIN { - > PROCINFO["sorted_in"]++ + > PROCINFO["sorted_in"] = "ascending index" > a[4] = 4 > a[3] = 3 > for (i in a) @@ -22678,6 +22689,9 @@ Metacharacters Instead, they denote regular expression operations, such as repetition, grouping, or alternation. +No-op + An operation that does nothing. + Null String A string with no characters in it. It is represented explicitly in `awk' programs by placing two double quote characters next to each @@ -24361,7 +24375,7 @@ Index (line 67) * advanced features, data files as single record: Records. (line 175) * advanced features, fixed-width data: Constant Size. (line 9) -* advanced features, FNR/NR variables: Auto-set. (line 206) +* advanced features, FNR/NR variables: Auto-set. (line 215) * advanced features, gawk: Advanced Features. (line 6) * advanced features, gawk, network programming: TCP/IP Networking. (line 6) @@ -24654,7 +24668,7 @@ Index * Brian Kernighan's awk, extensions: BTL. (line 6) * Broder, Alan J.: Contributors. 
(line 85) * Brown, Martin: Contributors. (line 79) -* BSD-based operating systems: Glossary. (line 591) +* BSD-based operating systems: Glossary. (line 594) * bt debugger command (alias for backtrace): Dgawk Stack. (line 13) * Buening, Andreas <1>: Bugs. (line 71) * Buening, Andreas <2>: Contributors. (line 89) @@ -24856,7 +24870,7 @@ Index (line 47) * dark corner, FILENAME variable <1>: Auto-set. (line 92) * dark corner, FILENAME variable: Getline Notes. (line 19) -* dark corner, FNR/NR variables: Auto-set. (line 206) +* dark corner, FNR/NR variables: Auto-set. (line 215) * dark corner, format-control characters: Control Letters. (line 18) * dark corner, FS as null string: Single Character Fields. (line 20) @@ -25055,7 +25069,7 @@ Index * differences in awk and gawk, regular expressions: Case-sensitivity. (line 26) * differences in awk and gawk, RS/RT variables: Records. (line 167) -* differences in awk and gawk, RT variable: Auto-set. (line 195) +* differences in awk and gawk, RT variable: Auto-set. (line 204) * differences in awk and gawk, single-character fields: Single Character Fields. (line 6) * differences in awk and gawk, split() function: String Functions. @@ -25336,7 +25350,7 @@ Index * floating-point, numbers, AWKNUM internal type: Internals. (line 19) * FNR variable <1>: Auto-set. (line 102) * FNR variable: Records. (line 6) -* FNR variable, changing: Auto-set. (line 206) +* FNR variable, changing: Auto-set. (line 215) * for statement: For Statement. (line 6) * for statement, in arrays: Scanning an Array. (line 20) * force_number() internal function: Internals. (line 27) @@ -25366,7 +25380,7 @@ Index * Free Software Foundation (FSF) <1>: Glossary. (line 297) * Free Software Foundation (FSF) <2>: Getting. (line 10) * Free Software Foundation (FSF): Manual History. (line 6) -* FreeBSD: Glossary. (line 591) +* FreeBSD: Glossary. (line 594) * FS variable <1>: User-modified. (line 56) * FS variable: Field Separators. 
(line 14) * FS variable, --field-separator option and: Options. (line 21) @@ -25510,7 +25524,7 @@ Index * gawk, regular expressions, operators: GNU Regexp Operators. (line 6) * gawk, regular expressions, precedence: Regexp Operators. (line 157) -* gawk, RT variable in <1>: Auto-set. (line 195) +* gawk, RT variable in <1>: Auto-set. (line 204) * gawk, RT variable in <2>: Getline/Variable/File. (line 10) * gawk, RT variable in <3>: Multiple Line. (line 129) @@ -25586,7 +25600,7 @@ Index * GNU long options, printing list of: Options. (line 141) * GNU Project <1>: Glossary. (line 315) * GNU Project: Manual History. (line 11) -* GNU/Linux <1>: Glossary. (line 591) +* GNU/Linux <1>: Glossary. (line 594) * GNU/Linux <2>: I18N Example. (line 55) * GNU/Linux: Manual History. (line 28) * GPL (General Public License) <1>: Glossary. (line 306) @@ -25841,7 +25855,7 @@ Index * lint checking, undefined functions: Pass By Value/Reference. (line 88) * LINT variable: User-modified. (line 98) -* Linux <1>: Glossary. (line 591) +* Linux <1>: Glossary. (line 594) * Linux <2>: I18N Example. (line 55) * Linux: Manual History. (line 28) * list debugger command: Miscellaneous Dgawk Commands. @@ -25912,7 +25926,7 @@ Index * nargs internal variable: Internals. (line 49) * nawk utility: Names. (line 17) * negative zero: Unexpected Results. (line 28) -* NetBSD: Glossary. (line 591) +* NetBSD: Glossary. (line 594) * networks, programming: TCP/IP Networking. (line 6) * networks, support for: Special Network. (line 6) * newlines <1>: Boolean Ops. (line 67) @@ -25957,7 +25971,7 @@ Index * not Boolean-logic operator: Boolean Ops. (line 6) * NR variable <1>: Auto-set. (line 118) * NR variable: Records. (line 6) -* NR variable, changing: Auto-set. (line 206) +* NR variable, changing: Auto-set. (line 215) * null strings <1>: Basic Data Typing. (line 50) * null strings <2>: Truth Values. (line 6) * null strings <3>: Regexp Field Splitting. 
@@ -26006,7 +26020,7 @@ Index * OFS variable <1>: User-modified. (line 124) * OFS variable <2>: Output Separators. (line 6) * OFS variable: Changing Fields. (line 64) -* OpenBSD: Glossary. (line 591) +* OpenBSD: Glossary. (line 594) * OpenSolaris: Other Versions. (line 86) * operating systems, BSD-based: Manual History. (line 28) * operating systems, PC, gawk on: PC Using. (line 6) @@ -26075,8 +26089,8 @@ Index * output, standard: Special FD. (line 6) * p debugger command (alias for print): Viewing And Changing Data. (line 36) -* P1003.1 POSIX standard: Glossary. (line 438) -* P1003.2 POSIX standard: Glossary. (line 438) +* P1003.1 POSIX standard: Glossary. (line 441) +* P1003.2 POSIX standard: Glossary. (line 441) * parameters, number of: Internals. (line 49) * parentheses (): Regexp Operators. (line 79) * parentheses (), pgawk program: Profiling. (line 141) @@ -26389,7 +26403,7 @@ Index * right angle bracket (>), >> operator (I/O): Redirection. (line 50) * right shift, bitwise: Bitwise Functions. (line 32) * Ritchie, Dennis: Basic Data Typing. (line 74) -* RLENGTH variable: Auto-set. (line 182) +* RLENGTH variable: Auto-set. (line 191) * RLENGTH variable, match() function and: String Functions. (line 194) * Robbins, Arnold <1>: Future Extensions. (line 6) * Robbins, Arnold <2>: Bugs. (line 32) @@ -26414,9 +26428,9 @@ Index * RS variable: Records. (line 20) * RS variable, multiline records and: Multiple Line. (line 17) * rshift() function (gawk): Bitwise Functions. (line 51) -* RSTART variable: Auto-set. (line 188) +* RSTART variable: Auto-set. (line 197) * RSTART variable, match() function and: String Functions. (line 194) -* RT variable <1>: Auto-set. (line 195) +* RT variable <1>: Auto-set. (line 204) * RT variable <2>: Getline/Variable/File. (line 10) * RT variable <3>: Multiple Line. (line 129) @@ -26719,7 +26733,7 @@ Index (line 6) * uniq utility: Uniq Program. (line 6) * uniq.awk program: Uniq Program. (line 65) -* Unix: Glossary. 
(line 591) +* Unix: Glossary. (line 594) * Unix awk, backslashes in escape sequences: Escape Sequences. (line 125) * Unix awk, close() function and: Close Files And Pipes. @@ -27042,224 +27056,224 @@ Node: Built-in Variables381978 Node: User-modified383073 Ref: User-modified-Footnote-1391074 Node: Auto-set391136 -Ref: Auto-set-Footnote-1400400 -Node: ARGC and ARGV400605 -Node: Arrays404364 -Node: Array Basics405935 -Node: Array Intro406646 -Node: Reference to Elements410964 -Node: Assigning Elements413234 -Node: Array Example413725 -Node: Scanning an Array415457 -Node: Delete418905 -Ref: Delete-Footnote-1421336 -Node: Numeric Array Subscripts421393 -Node: Uninitialized Subscripts423576 -Node: Multi-dimensional425204 -Node: Multi-scanning428295 -Node: Array Sorting429879 -Ref: Array Sorting-Footnote-1433077 -Node: Arrays of Arrays433271 -Node: Functions437810 -Node: Built-in438632 -Node: Calling Built-in439710 -Node: Numeric Functions441686 -Ref: Numeric Functions-Footnote-1445443 -Ref: Numeric Functions-Footnote-2445779 -Ref: Numeric Functions-Footnote-3445827 -Node: String Functions446096 -Ref: String Functions-Footnote-1467902 -Ref: String Functions-Footnote-2468031 -Ref: String Functions-Footnote-3468279 -Node: Gory Details468366 -Ref: table-sub-escapes470023 -Ref: table-posix-sub471337 -Ref: table-gensub-escapes472237 -Node: I/O Functions473408 -Ref: I/O Functions-Footnote-1480103 -Node: Time Functions480250 -Ref: Time Functions-Footnote-1491116 -Ref: Time Functions-Footnote-2491184 -Ref: Time Functions-Footnote-3491342 -Ref: Time Functions-Footnote-4491453 -Ref: Time Functions-Footnote-5491565 -Ref: Time Functions-Footnote-6491792 -Node: Bitwise Functions492058 -Ref: table-bitwise-ops492616 -Ref: Bitwise Functions-Footnote-1496776 -Node: Type Functions496960 -Node: I18N Functions497398 -Node: User-defined499025 -Node: Definition Syntax499829 -Ref: Definition Syntax-Footnote-1504466 -Node: Function Example504535 -Node: Function Caveats507129 -Node: Calling A 
Function507550 -Node: Variable Scope508665 -Node: Pass By Value/Reference510593 -Node: Return Statement514033 -Node: Dynamic Typing516975 -Node: Indirect Calls517712 -Node: Internationalization527394 -Node: I18N and L10N528822 -Node: Explaining gettext529508 -Ref: Explaining gettext-Footnote-1534570 -Ref: Explaining gettext-Footnote-2534753 -Node: Programmer i18n534918 -Node: Translator i18n539209 -Node: String Extraction540002 -Ref: String Extraction-Footnote-1540963 -Node: Printf Ordering541049 -Ref: Printf Ordering-Footnote-1543833 -Node: I18N Portability543897 -Ref: I18N Portability-Footnote-1546346 -Node: I18N Example546409 -Ref: I18N Example-Footnote-1549044 -Node: Gawk I18N549116 -Node: Advanced Features549733 -Node: Nondecimal Data551052 -Node: Two-way I/O552633 -Ref: Two-way I/O-Footnote-1558047 -Node: TCP/IP Networking558124 -Node: Profiling560967 -Node: Library Functions568367 -Ref: Library Functions-Footnote-1571406 -Node: Library Names571577 -Ref: Library Names-Footnote-1575048 -Ref: Library Names-Footnote-2575268 -Node: General Functions575354 -Node: Nextfile Function576417 -Node: Strtonum Function580798 -Node: Assert Function583749 -Node: Round Function587075 -Node: Cliff Random Function588616 -Node: Ordinal Functions589632 -Ref: Ordinal Functions-Footnote-1592702 -Ref: Ordinal Functions-Footnote-2592954 -Node: Join Function593170 -Ref: Join Function-Footnote-1594941 -Node: Gettimeofday Function595141 -Node: Data File Management598856 -Node: Filetrans Function599488 -Node: Rewind Function603724 -Node: File Checking605177 -Node: Empty Files606271 -Node: Ignoring Assigns608501 -Node: Getopt Function610054 -Ref: Getopt Function-Footnote-1621379 -Node: Passwd Functions621582 -Ref: Passwd Functions-Footnote-1630569 -Node: Group Functions630657 -Node: Walking Arrays638759 -Node: Sample Programs640325 -Node: Running Examples640990 -Node: Clones641718 -Node: Cut Program642841 -Node: Egrep Program652690 -Ref: Egrep Program-Footnote-1660461 -Node: Id 
Program660571 -Node: Split Program664187 -Ref: Split Program-Footnote-1667706 -Node: Tee Program667834 -Node: Uniq Program670637 -Node: Wc Program678060 -Ref: Wc Program-Footnote-1682324 -Node: Miscellaneous Programs682524 -Node: Dupword Program683712 -Node: Alarm Program685743 -Node: Translate Program690465 -Ref: Translate Program-Footnote-1694844 -Ref: Translate Program-Footnote-2695072 -Node: Labels Program695206 -Ref: Labels Program-Footnote-1698577 -Node: Word Sorting698661 -Node: History Sorting702544 -Node: Extract Program704382 -Ref: Extract Program-Footnote-1711863 -Node: Simple Sed711991 -Node: Igawk Program715053 -Ref: Igawk Program-Footnote-1730085 -Ref: Igawk Program-Footnote-2730286 -Node: Anagram Program730424 -Node: Signature Program733522 -Node: Debugger734625 -Node: Debugging735536 -Node: Debugging Concepts735850 -Node: Debugging Terms737706 -Node: Awk Debugging740251 -Node: Sample dgawk session741143 -Node: dgawk invocation741635 -Node: Finding The Bug742817 -Node: List of Debugger Commands749302 -Node: Breakpoint Control750613 -Node: Dgawk Execution Control754089 -Node: Viewing And Changing Data757441 -Node: Dgawk Stack760750 -Node: Dgawk Info762210 -Node: Miscellaneous Dgawk Commands766158 -Node: Readline Support771583 -Node: Dgawk Limitations772410 -Node: Language History774549 -Node: V7/SVR3.1775981 -Node: SVR4778276 -Node: POSIX779718 -Node: BTL780716 -Node: POSIX/GNU781450 -Node: Common Extensions786636 -Node: Contributors787737 -Node: Installation791772 -Node: Gawk Distribution792666 -Node: Getting793150 -Node: Extracting793976 -Node: Distribution contents795654 -Node: Unix Installation800672 -Node: Quick Installation801289 -Node: Additional Configuration Options803251 -Node: Configuration Philosophy804728 -Node: Non-Unix Installation807070 -Node: PC Installation807528 -Node: PC Binary Installation808827 -Node: PC Compiling810675 -Node: PC Testing813619 -Node: PC Using814795 -Node: Cygwin818980 -Node: MSYS819977 -Node: VMS 
Installation820491 -Node: VMS Compilation821097 -Ref: VMS Compilation-Footnote-1822104 -Node: VMS Installation Details822162 -Node: VMS Running823797 -Node: VMS Old Gawk825397 -Node: Bugs825871 -Node: Other Versions829736 -Node: Notes835015 -Node: Compatibility Mode835707 -Node: Additions836490 -Node: Accessing The Source837302 -Node: Adding Code838725 -Node: New Ports844273 -Node: Dynamic Extensions848386 -Node: Internals849762 -Node: Plugin License858878 -Node: Sample Library859512 -Node: Internal File Description860198 -Node: Internal File Ops863905 -Ref: Internal File Ops-Footnote-1868673 -Node: Using Internal File Ops868821 -Node: Future Extensions871198 -Node: Basic Concepts873702 -Node: Basic High Level874459 -Ref: Basic High Level-Footnote-1878494 -Node: Basic Data Typing878679 -Node: Floating Point Issues883204 -Node: String Conversion Precision884287 -Ref: String Conversion Precision-Footnote-1885981 -Node: Unexpected Results886090 -Node: POSIX Floating Point Problems887916 -Ref: POSIX Floating Point Problems-Footnote-1891618 -Node: Glossary891656 -Node: Copying915755 -Node: GNU Free Documentation License953312 -Node: Index978449 +Ref: Auto-set-Footnote-1401037 +Node: ARGC and ARGV401242 +Node: Arrays405001 +Node: Array Basics406572 +Node: Array Intro407283 +Node: Reference to Elements411601 +Node: Assigning Elements413871 +Node: Array Example414362 +Node: Scanning an Array416094 +Node: Delete419618 +Ref: Delete-Footnote-1422049 +Node: Numeric Array Subscripts422106 +Node: Uninitialized Subscripts424289 +Node: Multi-dimensional425917 +Node: Multi-scanning429008 +Node: Array Sorting430592 +Ref: Array Sorting-Footnote-1433790 +Node: Arrays of Arrays433984 +Node: Functions438523 +Node: Built-in439345 +Node: Calling Built-in440423 +Node: Numeric Functions442399 +Ref: Numeric Functions-Footnote-1446156 +Ref: Numeric Functions-Footnote-2446492 +Ref: Numeric Functions-Footnote-3446540 +Node: String Functions446809 +Ref: String Functions-Footnote-1468615 +Ref: 
String Functions-Footnote-2468744 +Ref: String Functions-Footnote-3468992 +Node: Gory Details469079 +Ref: table-sub-escapes470736 +Ref: table-posix-sub472050 +Ref: table-gensub-escapes472950 +Node: I/O Functions474121 +Ref: I/O Functions-Footnote-1480816 +Node: Time Functions480963 +Ref: Time Functions-Footnote-1491829 +Ref: Time Functions-Footnote-2491897 +Ref: Time Functions-Footnote-3492055 +Ref: Time Functions-Footnote-4492166 +Ref: Time Functions-Footnote-5492278 +Ref: Time Functions-Footnote-6492505 +Node: Bitwise Functions492771 +Ref: table-bitwise-ops493329 +Ref: Bitwise Functions-Footnote-1497489 +Node: Type Functions497673 +Node: I18N Functions498111 +Node: User-defined499738 +Node: Definition Syntax500542 +Ref: Definition Syntax-Footnote-1505179 +Node: Function Example505248 +Node: Function Caveats507842 +Node: Calling A Function508263 +Node: Variable Scope509378 +Node: Pass By Value/Reference511306 +Node: Return Statement514746 +Node: Dynamic Typing517688 +Node: Indirect Calls518425 +Node: Internationalization528107 +Node: I18N and L10N529535 +Node: Explaining gettext530221 +Ref: Explaining gettext-Footnote-1535283 +Ref: Explaining gettext-Footnote-2535466 +Node: Programmer i18n535631 +Node: Translator i18n539922 +Node: String Extraction540715 +Ref: String Extraction-Footnote-1541676 +Node: Printf Ordering541762 +Ref: Printf Ordering-Footnote-1544546 +Node: I18N Portability544610 +Ref: I18N Portability-Footnote-1547059 +Node: I18N Example547122 +Ref: I18N Example-Footnote-1549757 +Node: Gawk I18N549829 +Node: Advanced Features550446 +Node: Nondecimal Data551765 +Node: Two-way I/O553346 +Ref: Two-way I/O-Footnote-1558760 +Node: TCP/IP Networking558837 +Node: Profiling561680 +Node: Library Functions569080 +Ref: Library Functions-Footnote-1572119 +Node: Library Names572290 +Ref: Library Names-Footnote-1575761 +Ref: Library Names-Footnote-2575981 +Node: General Functions576067 +Node: Nextfile Function577130 +Node: Strtonum Function581511 +Node: Assert 
Function584462 +Node: Round Function587788 +Node: Cliff Random Function589329 +Node: Ordinal Functions590345 +Ref: Ordinal Functions-Footnote-1593415 +Ref: Ordinal Functions-Footnote-2593667 +Node: Join Function593883 +Ref: Join Function-Footnote-1595654 +Node: Gettimeofday Function595854 +Node: Data File Management599569 +Node: Filetrans Function600201 +Node: Rewind Function604437 +Node: File Checking605890 +Node: Empty Files606984 +Node: Ignoring Assigns609214 +Node: Getopt Function610767 +Ref: Getopt Function-Footnote-1622092 +Node: Passwd Functions622295 +Ref: Passwd Functions-Footnote-1631282 +Node: Group Functions631370 +Node: Walking Arrays639472 +Node: Sample Programs641038 +Node: Running Examples641703 +Node: Clones642431 +Node: Cut Program643554 +Node: Egrep Program653403 +Ref: Egrep Program-Footnote-1661174 +Node: Id Program661284 +Node: Split Program664900 +Ref: Split Program-Footnote-1668419 +Node: Tee Program668547 +Node: Uniq Program671350 +Node: Wc Program678773 +Ref: Wc Program-Footnote-1683037 +Node: Miscellaneous Programs683237 +Node: Dupword Program684425 +Node: Alarm Program686456 +Node: Translate Program691178 +Ref: Translate Program-Footnote-1695557 +Ref: Translate Program-Footnote-2695785 +Node: Labels Program695919 +Ref: Labels Program-Footnote-1699290 +Node: Word Sorting699374 +Node: History Sorting703257 +Node: Extract Program705095 +Ref: Extract Program-Footnote-1712576 +Node: Simple Sed712704 +Node: Igawk Program715766 +Ref: Igawk Program-Footnote-1730798 +Ref: Igawk Program-Footnote-2730999 +Node: Anagram Program731137 +Node: Signature Program734235 +Node: Debugger735338 +Node: Debugging736249 +Node: Debugging Concepts736563 +Node: Debugging Terms738419 +Node: Awk Debugging740964 +Node: Sample dgawk session741856 +Node: dgawk invocation742348 +Node: Finding The Bug743530 +Node: List of Debugger Commands750015 +Node: Breakpoint Control751326 +Node: Dgawk Execution Control754802 +Node: Viewing And Changing Data758154 +Node: Dgawk 
Stack761463 +Node: Dgawk Info762923 +Node: Miscellaneous Dgawk Commands766871 +Node: Readline Support772296 +Node: Dgawk Limitations773123 +Node: Language History775262 +Node: V7/SVR3.1776694 +Node: SVR4778989 +Node: POSIX780431 +Node: BTL781429 +Node: POSIX/GNU782163 +Node: Common Extensions787349 +Node: Contributors788450 +Node: Installation792485 +Node: Gawk Distribution793379 +Node: Getting793863 +Node: Extracting794689 +Node: Distribution contents796367 +Node: Unix Installation801385 +Node: Quick Installation802002 +Node: Additional Configuration Options803964 +Node: Configuration Philosophy805441 +Node: Non-Unix Installation807783 +Node: PC Installation808241 +Node: PC Binary Installation809540 +Node: PC Compiling811388 +Node: PC Testing814332 +Node: PC Using815508 +Node: Cygwin819693 +Node: MSYS820690 +Node: VMS Installation821204 +Node: VMS Compilation821810 +Ref: VMS Compilation-Footnote-1822817 +Node: VMS Installation Details822875 +Node: VMS Running824510 +Node: VMS Old Gawk826110 +Node: Bugs826584 +Node: Other Versions830449 +Node: Notes835728 +Node: Compatibility Mode836420 +Node: Additions837203 +Node: Accessing The Source838015 +Node: Adding Code839438 +Node: New Ports844986 +Node: Dynamic Extensions849099 +Node: Internals850475 +Node: Plugin License859591 +Node: Sample Library860225 +Node: Internal File Description860911 +Node: Internal File Ops864618 +Ref: Internal File Ops-Footnote-1869386 +Node: Using Internal File Ops869534 +Node: Future Extensions871911 +Node: Basic Concepts874415 +Node: Basic High Level875172 +Ref: Basic High Level-Footnote-1879207 +Node: Basic Data Typing879392 +Node: Floating Point Issues883917 +Node: String Conversion Precision885000 +Ref: String Conversion Precision-Footnote-1886694 +Node: Unexpected Results886803 +Node: POSIX Floating Point Problems888629 +Ref: POSIX Floating Point Problems-Footnote-1892331 +Node: Glossary892369 +Node: Copying916512 +Node: GNU Free Documentation License954069 +Node: Index979206 End Tag 
Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 324e8eaf..94e7abbf 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -12710,10 +12710,19 @@ The parent process ID of the current process. The value of the @code{getuid()} system call. @item PROCINFO["sorted_in"] -If this element exists in @code{PROCINFO}, -@emph{no matter what its value}, -then @command{gawk} will cause @samp{for(i in arr) @dots{}} loops -to traverse the array indices in sorted order. +If this element exists in @code{PROCINFO}, its value controls the +order in which array indices will be processed by +@samp{for(i in arr) @dots{}} loops. +A value of @code{"ascending index string"}, which may be shortened to +@code{"ascending index"} or just @code{"ascending"}, will result in either +case sensitive or case insensitive ascending order depending upon +the value of @code{IGNORECASE}. +A value of @code{"descending index string"}, which may be shortened in +a similar manner, will result in the opposite order. +The value @code{"unsorted"} is also recognized, yielding the default +result of arbitrary order. Any other value will be ignored, and +warned about (at the time of first @samp{for(in in arr) @dots{}} +execution) when lint checking is enabled. @xref{Scanning an Array}, for more information. @item PROCINFO["strftime"] @@ -13379,17 +13388,16 @@ reach them. Similarly, changing @var{var} inside the loop may produce strange results. It is best to avoid such things. As an extension, @command{gawk} makes it possible for you to -loop over the elements of an array in order, sorted by index. -Sorting is based on string comparison (since all array indices are -strings), and you cannot control the style of sorting; it is always -from lowest to highest (@code{"A"} before @code{"B"}). -To enable this feature, create the array element +loop over the elements of an array in order, based on the value of @code{PROCINFO["sorted_in"]} (@pxref{Auto-set}). 
-The value of this element does not -matter; @command{gawk} only tests if the element with this index -exists in @code{PROCINFO} or not. -This extension is disabled in POSIX mode, since the @code{PROCINFO} -array is not special in that case. For example: +At present two sorting options are available: @code{"ascending +index string"} and @code{"descending index string"}. They can be +shortened by omitting @samp{string} or @samp{index string}. The value +@code{"unsorted"} can be used as an explicit ``no-op'' and yields the same +result as when @code{PROCINFO["sorted_in"]} has no value at all. If the +index strings contain letters, the value of @code{IGNORECASE} affects +the order of the result. This extension is disabled in POSIX mode, +since the @code{PROCINFO} array is not special in that case. For example: @example $ @kbd{gawk 'BEGIN @{} @@ -13401,7 +13409,7 @@ $ @kbd{gawk 'BEGIN @{} @print{} 4 4 @print{} 3 3 $ @kbd{gawk 'BEGIN @{} -> @kbd{ PROCINFO["sorted_in"]++} +> @kbd{ PROCINFO["sorted_in"] = "ascending index"} > @kbd{ a[4] = 4} > @kbd{ a[3] = 3} > @kbd{ for (i in a)} @@ -30341,6 +30349,9 @@ Characters used within a regexp that do not stand for themselves. Instead, they denote regular expression operations, such as repetition, grouping, or alternation. +@item No-op +An operation that does nothing. + @item Null String A string with no characters in it. It is represented explicitly in @command{awk} programs by placing two double quote characters next to @@ -3,7 +3,7 @@ */ /* - * Copyright (C) 1986, 1988, 1989, 1991-2010 the Free Software Foundation, Inc. + * Copyright (C) 1986, 1988, 1989, 1991-2011 the Free Software Foundation, Inc. * * This file is part of GAWK, the GNU implementation of the * AWK Programming Language. 
@@ -1054,36 +1054,254 @@ update_FNR()
 	}
 }
 
-/* comp_func --- array index comparison function for qsort */
+
+typedef int (*qsort_compfunc)(const void *,const void *);
+
+static qsort_compfunc sorted_in(void);
+static int sort_ignrcase(const char *, size_t,const char *,size_t);
+static int sort_up_index_str(const void *, const void *);
+static int sort_down_index_str(const void *, const void *);
+static int sort_up_index_ignrcase(const void *, const void *);
+static int sort_down_index_ignrcase(const void *, const void *);
+
+/* comp_func --- array index comparison function for qsort, used in debug.c */
 
 int
 comp_func(const void *p1, const void *p2)
 {
-	size_t len1, len2;
-	const char *str1, *str2;
+	return sort_up_index_str(p1, p2);
+}
+
+/* sort_ignorecase --- case insensitive string comparison from cmp_nodes() */
+
+static int
+sort_ignorecase(const char *s1, size_t l1, const char *s2, size_t l2)
+{
+	size_t l;
+	int ret;
+
+	l = (l1 < l2) ? l1 : l2;
+#ifdef MBS_SUPPORT
+	if (gawk_mb_cur_max > 1) {
+		ret = strncasecmpmbs((const unsigned char *) s1,
+				     (const unsigned char *) s2, l);
+	} else
+#endif
+	for (ret = 0; l-- > 0 && ret == 0; s1++, s2++)
+		ret = casetable[*(unsigned char *) s1]
+		      - casetable[*(unsigned char *) s2];
+	if (ret == 0 && l1 != l2)
+		ret = (l1 < l2) ? -1 : 1;
+	return ret;
+}
+
+/*
+ * sort_up_index_str --- qsort comparison function; ascending index strings;
+ * index strings are distinct within an array, so no comparisons will yield
+ * equal and warrant disambiguation
+ */
+
+static int
+sort_up_index_str(const void *p1, const void *p2)
+{
 	const NODE *t1, *t2;
-	int cmp1;
+	const char *str1, *str2;
+	size_t len1, len2;
+	int ret;
 
+	/* Array indexes are strings and distinct, never equal */
 	t1 = *((const NODE *const *) p1);
 	t2 = *((const NODE *const *) p2);
 
-/*
-	t1 = force_string(t1);
-	t2 = force_string(t2);
-*/
-	len1 = t1->ahname_len;
 	str1 = t1->ahname_str;
-
-	len2 = t2->ahname_len;
+	len1 = t1->ahname_len;
 	str2 = t2->ahname_str;
+	len2 = t2->ahname_len;
+
+	ret = memcmp(str1, str2, (len1 < len2) ? len1 : len2);
+	/*
+	 * if they compared equal but their lengths differ, the
+	 * shorter one is a prefix of the longer and compares as less
+	 */
+	if (ret == 0 && len1 != len2)
+		ret = (len1 < len2) ? -1 : 1;
 
-	/* Array indexes are strings, compare as such, always! */
-	cmp1 = memcmp(str1, str2, len1 < len2 ? len1 : len2);
-	/* if prefixes are equal, size matters */
-	return (cmp1 != 0 ? cmp1 :
-		len1 < len2 ? -1 : (len1 > len2));
+	/* indices are unique within an array, so result should never be 0 */
+	assert(ret != 0);
+	return ret;
 }
 
+/* sort_down_index_str --- descending index strings */
+
+static int
+sort_down_index_str(const void *p1, const void *p2)
+{
+	/*
+	 * Negation versus transposed arguments: when all keys are
+	 * distinct, as with array indices here, either method will
+	 * transform an ascending sort into a descending one. But if
+	 * there are equal keys--such as when IGNORECASE is honored--
+	 * that get disambiguated into a determisitc order, negation
+	 * will reverse those but transposed arguments would retain
+	 * their relative order within the rest of the reversed sort.
+	 */
+	return -sort_up_index_str(p1, p2);
+}
+
+/*
+ * sort_up_index_ignrcase --- ascending index string, case insensitive;
+ * case insensitive comparison can cause distinct index strings to compare
+ * equal, so disambiguation in necessary
+ */
+
+static int
+sort_up_index_ignrcase(const void *p1, const void *p2)
+{
+	const NODE *t1, *t2;
+	int ret;
+
+	t1 = *((const NODE *const *) p1);
+	t2 = *((const NODE *const *) p2);
+
+	ret = sort_ignorecase(t1->ahname_str, t1->ahname_len,
+			      t2->ahname_str, t2->ahname_len);
+
+	/*
+	 * if case insensitive result is "they're the same",
+	 * use case sensitive comparison to force distinct order
+	 */
+	if (ret == 0)
+		ret = sort_up_index_str(p1, p2);
+	return ret;
+}
+
+/* sort_down_index_ignrcase --- descending index strings, case insensitive */
+
+static int
+sort_down_index_ignrcase(const void *p1, const void *p2)
+{
+	return -sort_up_index_ignrcase(p1, p2);
+}
+
+struct sort_option {
+	qsort_compfunc func;
+	const char *keydescr;
+};
+
+/*
+ * sorted_in --- fetch and parse value of PROCINFO["sorted_in"] to decide
+ * whether to sort the traversal of ``for (index in array) {}'', and
+ * if so, what ordering to generate; returns a qsort comparison function
+ */
+
+static qsort_compfunc
+sorted_in(void)
+{
+	static struct sort_option sorts[] = {
+		/* explicit no-op */
+		{ (qsort_compfunc) NULL, "unsorted" },
+		/* also matches "ascending index" and "ascending" */
+		{ sort_up_index_str, "ascending index string" },
+		/* also matches "descending index" and "descending" */
+		{ sort_down_index_str, "descending index string" },
+		/*
+		 * to come
+		 */
+		/* also matches "ascending value"; "ascending" won't get here */
+		/* { sort_up_value_str, "ascending value string" },
+		 * { sort_down_value_str, "descending value string" },
+		 * { sort_up_index_num, "ascending index number" },
+		 * { sort_down_index_num, "descending index number" },
+		 * { sort_up_value_num, "ascending value number" },
+		 * { sort_down_value_num, "descending value number" },
+		 * and possibly
+		 * { sort_as_inserted, "insertion-order" },
+		 */
+	};
+	static NODE *sorted_str = NULL;
+	static short warned_extension = FALSE, warned_unrecognized = FALSE;
+
+	NODE *r;
+	const char *s, *descr;
+	size_t i, k, l;
+	short is_number;
+	qsort_compfunc sort_func;
+
+	/* if there's no PROCINFO[], there can be no ["sorted_in"], so no sorting */
+	if (PROCINFO_node == NULL)
+		return (qsort_compfunc) NULL;
+
+	if (sorted_str == NULL) /* do this once */
+		sorted_str = make_string("sorted_in", 9);
+
+	r = (NODE *) NULL;
+	if (in_array(PROCINFO_node, sorted_str))
+		r = *assoc_lookup(PROCINFO_node, sorted_str, TRUE);
+	/* if there's no PROCINFO["sorted_in"], there's no sorting */
+	if (!r)
+		return (qsort_compfunc) 0;
+
+	/* we're going to make use of PROCINFO["sorted_in"] */
+	if (do_lint && ! warned_extension) {
+		warned_extension = TRUE;
+		lintwarn(_("`PROCINFO[\"sorted_in\"]' is a gawk extension"));
+	}
+
+	/* default result is no sorting */
+	sort_func = (qsort_compfunc) NULL;
+
+	/* undocumented synonyms: 0 is "unsorted", 1 is "ascending index" */
+	if (r->flags & MAYBE_NUM)
+		(void) force_number(r);
+	is_number = ((r->flags & NUMBER) != 0);
+	if (is_number) {
+		if (r->numbr == 1)
+			sort_func = sort_up_index_str;
+		if (r->numbr == 0 || r->numbr == 1)
+			goto got_func;
+		/*
+		 * using PROCINFO["sorted_in"] as a number is not a general
+		 * index into sorts[]; entries beyond [1] may get reordered
+		 */
+	}
+
+	r = force_string(r);
+	s = r->stptr;
+	l = r->stlen;
+	while (l > 0 && *s == ' ')
+		++s, --l;
+
+	/* treat empty string the same as 0, a synonym for no sorting */
+	if (l == 0)
+		goto got_func; /* sort_func is still 0 */
+
+	/* scan the list of available sorting orders */
+	for (i = 0; i < (sizeof sorts / sizeof *sorts); ++i) {
+		descr = sorts[i].keydescr;
+		if (((k = strlen(descr)) == l || (k > l && descr[l] == ' '))
+		    && !strncasecmp(s, descr, l)) {
+			sort_func = sorts[i].func;
+			goto got_func;
+		}
+	}
+
+	/* we didn't match any key description; sort_func is still 0 */
+	if ((do_lint || is_number) && ! warned_unrecognized) {
+		warned_unrecognized = TRUE;
+		lintwarn(_("`PROCINFO[\"sorted_in\"]' value is not recognized"));
+	}
+
+got_func:
+	unref(r);
+	if (IGNORECASE) {
+		if (sort_func == sort_up_index_str)
+			sort_func = sort_up_index_ignrcase;
+		if (sort_func == sort_down_index_str)
+			sort_func = sort_down_index_ignrcase;
+	}
+
+	return sort_func;
+}
@@ -2202,10 +2420,7 @@ post:
 			NODE *array;
 			size_t num_elems = 0;
 			size_t i, j;
-			static NODE *sorted_str = NULL;
-
-			if (sorted_str == NULL)
-				sorted_str = make_string("sorted_in", 9);
+			qsort_compfunc sort_compare;
 
 			/* get the array */
 			array = POP_ARRAY();
@@ -2229,9 +2444,10 @@ post:
 				}
 			}
 
-			if (PROCINFO_node != NULL
-					&& in_array(PROCINFO_node, sorted_str))
-				qsort(list, num_elems, sizeof(NODE *), comp_func); /* shazzam! */
+			sort_compare = sorted_in();
+			if (sort_compare)
+				qsort(list, num_elems, sizeof(NODE *),
+					sort_compare); /* shazzam! */
 
 			list[num_elems] = array; /* actual array for use in
 						  * lint warning in Op_arrayfor_incr */
diff --git a/test/ChangeLog b/test/ChangeLog
index 542a9d73..8b2138d6 100644
--- a/test/ChangeLog
+++ b/test/ChangeLog
@@ -1,3 +1,8 @@
+Tue Feb 15 17:11:26 2011 Pat Rankin <rankin@pactechdata.com>
+
+	* sortfor.awk: New values for PROCINFO["sorted_in"].
+	* sortfor.ok: Sync with updated sortfor.awk.
+
 Wed Feb 16 21:09:50 2011 Arnold D. Robbins <arnold@skeeve.com>
 
 	* Makefile.am (lintwarn): New test.
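For readers tracing the key matching in sorted_in() in the eval.c hunk above, here is a minimal standalone sketch of the same prefix rule: leading spaces are skipped, and a value matches a key description when it is either the whole description or a prefix of it that ends at a word boundary, compared case-insensitively. The names `key_descrs` and `match_sort_key` are hypothetical and not part of gawk. The sketch assumes a POSIX `strncasecmp` and returns a table index instead of a comparison function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX); assumed available */

/* hypothetical stand-in for the keydescr column of sorts[] in the patch */
static const char *const key_descrs[] = {
	"unsorted",
	"ascending index string",
	"descending index string",
};

/*
 * match_sort_key --- hypothetical helper, not a gawk function.
 * Returns the index of the matching description, 0 ("unsorted")
 * for an empty value, or -1 if nothing matches.
 */
int
match_sort_key(const char *value)
{
	const char *s = value;
	size_t i, k, l;

	assert(value != NULL);

	/* skip leading spaces, as the patch does */
	l = strlen(s);
	while (l > 0 && *s == ' ')
		++s, --l;

	/* empty string is a synonym for no sorting */
	if (l == 0)
		return 0;

	for (i = 0; i < sizeof key_descrs / sizeof *key_descrs; ++i) {
		const char *descr = key_descrs[i];

		k = strlen(descr);
		/* whole description, or a prefix ending just before a space */
		if ((k == l || (k > l && descr[l] == ' '))
		    && strncasecmp(s, descr, l) == 0)
			return (int) i;
	}
	return -1;
}
```

Under this rule "ascending" and "ascending index" both select entry 1, while "asc" selects nothing because the prefix does not end at a word boundary.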
diff --git a/test/sortfor.awk b/test/sortfor.awk
index f61200da..611eca64 100644
--- a/test/sortfor.awk
+++ b/test/sortfor.awk
@@ -1,8 +1,9 @@
-BEGIN {
-	PROCINFO["sorted_in"] = 1
-}
 { a[$0]++ }
 END {
+	PROCINFO["sorted_in"] = "ascending"
+	for (i in a)
+		print i
+	PROCINFO["sorted_in"] = "descending"
 	for (i in a)
 		print i
 }
diff --git a/test/sortfor.ok b/test/sortfor.ok
index 3da24591..8fc84d06 100644
--- a/test/sortfor.ok
+++ b/test/sortfor.ok
@@ -287,3 +287,292 @@ wjposer1
 zero2
 zeroe0
 zeroflag
+zeroflag
+zeroe0
+zero2
+wjposer1
+widesub4
+widesub3
+widesub2
+widesub
+wideidx2
+wideidx
+uparrfs
+unterm
+uninitialized
+uninit5
+uninit4
+uninit3
+uninit2
+tweakfld
+tradanch
+synerr2
+synerr1
+switch2
+swaplns
+substr
+subslash
+subsepnm
+subi18n
+subamp
+strtonum
+strtod
+strnum1
+strftlng
+strftime
+strcat1
+sprintfc
+splitwht
+splitvar
+splitdef
+splitarr
+splitargv
+splitarg4
+space
+sortempty
+sort1
+shadow
+sclifin
+sclforin
+scalar
+rswhite
+rstest6
+rstest5
+rstest4
+rstest3
+rstest2
+rstest1
+rsstart3
+rsstart2
+rsstart1
+rsnulbig2
+rsnulbig
+rsnul1nl
+rs
+resplit
+reparse
+reint2
+reint
+reindops
+regx8bit
+regeq
+redfilnm
+rebuf
+rebt8b2
+rebt8b1
+range1
+rand
+prtoeval
+prt1eval
+profile2
+profile1
+procinfs
+prmreuse
+prmarscl
+printfbad2
+printfbad1
+printf1
+printf0
+prec
+prdupval
+poundbang
+posix2008sub
+posix
+pipeio2
+pipeio1
+pid
+pcntplus
+patsplit
+parseme
+parsefld
+parse1
+paramtyp
+paramres
+paramdup
+opasnslf
+opasnidx
+onlynl
+ofmts
+ofmtfidl
+ofmtbig
+ofmt
+octsub
+numsubstr
+numindex
+nulrsend
+nors
+noparms
+nonl
+nondec2
+nondec
+noloop2
+noloop1
+nofmtch
+nofile
+noeffect
+nlstrina
+nlinstr
+nlfldsep
+nfset
+nfneg
+nfldstr
+nested
+negexp
+nasty2
+nasty
+mtchi18n
+mmap8k
+minusstr
+messages
+membug1
+mbstr1
+mbprintf3
+mbprintf2
+mbprintf1
+mbfw1
+math
+match3
+match2
+match1
+manyfiles
+manglprm
+longwrds
+longsub
+localenl
+litoct
+lintold
+lint
+leadnl
+leaddig
+lc_num1
+iobug1
+intprec
+intformat
+intest
+inputred
+indirectcall
+ignrcase
+ignrcas2
+igncfs
+igncdym
+icasers
+icasefs
+hsprint
+hex
+gsubtst6
+gsubtst5
+gsubtst4
+gsubtst3
+gsubtst2
+gsubtest
+gsubasgn
+gnureops
+gnuops3
+gnuops2
+getnr2tm
+getnr2tb
+getlnhd
+getlndir
+getlnbuf
+getline3
+getline2
+getline
+gensub2
+gensub
+fwtest2
+fwtest
+funstack
+funsmnam
+funsemnl
+funlen
+fstabplus
+fsspcoln
+fsrs
+fsfwfs
+fsbs
+fpat1
+forsimp
+forref
+fordel
+fnparydl
+fnmisc
+fnasgnm
+fnaryscl
+fnarydel
+fnarray2
+fnarray
+fnamedat
+fmttest
+fmtspcl
+fldchgnf
+fldchg
+fieldwdth
+fflush
+fcall_exit2
+fcall_exit
+exitval2
+exitval1
+eofsplit
+dynlj
+dumpvars
+double2
+double1
+devfd2
+devfd1
+devfd
+delfunc
+delarprm
+delarpm2
+defref
+datanonl
+convfmt
+concat4
+concat3
+concat2
+concat1
+compare2
+compare
+clsflnam
+closebad
+clos1way
+clobber
+childin
+binmode1
+beginfile1
+badargs
+backw
+backgsub
+back89
+awkpath
+asorti
+asort
+asgext
+arysubnm
+aryprm8
+aryprm7
+aryprm6
+aryprm5
+aryprm4
+aryprm3
+aryprm2
+aryprm1
+arynocls
+arynasty
+arryref5
+arryref4
+arryref3
+arryref2
+arrymem1
+arrayref
+arrayprm3
+arrayprm2
+arrayparm
+argtest
+argarray
+anchgsub
+addcomma
+aasorti
+aasort
+aarray1
+aadelete2
+aadelete1
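The ascending and descending listings in sortfor.ok follow from the byte-wise comparison with a length tie-break in sort_up_index_str() and its negation in sort_down_index_str(). Below is a minimal sketch of that pair outside gawk. It assumes a hypothetical `struct key` in place of gawk's NODE and a hypothetical `sort_key_list` helper. Unlike the patch, it clamps the `memcmp` result to -1, 0 or 1 before negating, and it does not assert that keys are distinct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical stand-in for the ahname_str/ahname_len pair of a NODE */
struct key {
	const char *str;
	size_t len;
};

/* ascending: compare the common prefix, then let the shorter key sort first */
static int
key_cmp_up(const void *p1, const void *p2)
{
	const struct key *k1 = p1, *k2 = p2;
	size_t n = (k1->len < k2->len) ? k1->len : k2->len;
	int ret;

	ret = memcmp(k1->str, k2->str, n);
	ret = (ret > 0) - (ret < 0);	/* clamp so negation is always safe */
	if (ret == 0 && k1->len != k2->len)
		ret = (k1->len < k2->len) ? -1 : 1;
	return ret;
}

/* descending: negate the ascending result, as the patch does */
static int
key_cmp_down(const void *p1, const void *p2)
{
	return -key_cmp_up(p1, p2);
}

/* sort_key_list --- hypothetical helper: sort keys in place */
void
sort_key_list(struct key *keys, size_t n, int descending)
{
	assert(keys != NULL || n == 0);
	qsort(keys, n, sizeof *keys, descending ? key_cmp_down : key_cmp_up);
}
```

With the keys zero2, zeroe0 and zeroflag, the ascending order is the one shown in the context lines of the sortfor.ok hunk, because the byte '2' sorts before 'e', which sorts before 'f'.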