Diffstat (limited to 'doc/gawk.info')
-rw-r--r-- | doc/gawk.info | 495 |
1 files changed, 288 insertions, 207 deletions
diff --git a/doc/gawk.info b/doc/gawk.info index d4885eb1..e344c401 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -11160,6 +11160,62 @@ backslashes entered at the lexical level.) The problem with the historical approach is that there is no way to get a literal `\' followed by the matched text. + The 1992 POSIX standard attempted to fix this problem. That standard +says that `sub()' and `gsub()' look for either a `\' or an `&' after +the `\'. If either one follows a `\', that character is output +literally. The interpretation of `\' and `&' then becomes as shown in +*note table-sub-posix-92::. + + You type `sub()' sees `sub()' generates + ------- --------- -------------- + `&' `&' the matched text + `\\&' `\&' a literal `&' + `\\\\&' `\\&' a literal `\', then the matched text + `\\\\\\&' `\\\&' a literal `\&' + +Table 9.2: 1992 POSIX Rules for sub and gsub Escape Sequence Processing + +This appears to solve the problem. Unfortunately, the phrasing of the +standard is unusual. It says, in effect, that `\' turns off the special +meaning of any following character, but for anything other than `\' and +`&', such special meaning is undefined. This wording leads to two +problems: + + * Backslashes must now be doubled in the REPLACEMENT string, breaking + historical `awk' programs. + + * To make sure that an `awk' program is portable, _every_ character + in the REPLACEMENT string must be preceded with a backslash.(1) + + Because of the problems just listed, in 1996, the `gawk' maintainer +submitted proposed text for a revised standard that reverts to rules +that correspond more closely to the original existing practice. The +proposed rules have special cases that make it possible to produce a +`\' preceding the matched text. This is shown in *note +table-sub-proposed::. 
+ + You type `sub()' sees `sub()' generates + ------- --------- -------------- + `\\\\\\&' `\\\&' a literal `\&' + `\\\\&' `\\&' a literal `\', followed by the matched text + `\\&' `\&' a literal `&' + `\\q' `\q' a literal `\q' + `\\\\' `\\' `\\' + +Table 9.3: Proposed rules for sub and backslash + + In a nutshell, at the runtime level, there are now three special +sequences of characters (`\\\&', `\\&' and `\&') whereas historically +there was only one. However, as in the historical case, any `\' that +is not part of one of these three sequences is not special and appears +in the output literally. + + `gawk' 3.0 and 3.1 follow these proposed POSIX rules for `sub()' and +`gsub()'. The POSIX standard took much longer to be revised than was +expected in 1996. The 2001 standard does not follow the above rules. +Instead, the rules there are somewhat simpler. The results are similar +except for one case. + The POSIX rules state that `\&' in the replacement string produces a literal `&', `\\' produces a literal `\', and `\' followed by anything else is not special; the `\' is placed straight into the output. These @@ -11173,9 +11229,21 @@ rules are presented in *note table-posix-sub::. `\\q' `\q' a literal `\q' `\\\\' `\\' `\' -Table 9.2: POSIX rules for `sub()' and `gsub()' +Table 9.4: POSIX rules for `sub()' and `gsub()' - `gawk' follows the POSIX rules. + The only case where the difference is noticeable is the last one: +`\\\\' is seen as `\\' and produces `\' instead of `\\'. + + Starting with version 3.1.4, `gawk' followed the POSIX rules when +`--posix' is specified (*note Options::). Otherwise, it continued to +follow the 1996 proposed rules, since that had been its behavior for +many years. 
+ + When version 4.0.0, was released, the `gawk' maintainer made the +POSIX rules the default, breaking well over a decade's worth of +backwards compatibility.(2) Needless to say, this was a bad idea, and +as of version 4.0.1, `gawk' resumed its historical behavior, and only +follows the POSIX rules when `--posix' is given. The rules for `gensub()' are considerably simpler. At the runtime level, whenever `gawk' sees a `\', if the following character is a @@ -11193,7 +11261,7 @@ the `\' does not, as shown in *note table-gensub-escapes::. `\\\\\\&' `\\\&' a literal `\&' `\\q' `\q' a literal `q' -Table 9.3: Escape Sequence Processing for `gensub()' +Table 9.5: Escape Sequence Processing for `gensub()' Because of the complexity of the lexical and runtime level processing and the special cases for `sub()' and `gsub()', we recommend the use of @@ -11211,6 +11279,14 @@ functions. For example: Although this makes a certain amount of sense, it can be surprising. + ---------- Footnotes ---------- + + (1) This consequence was certainly unintended. + + (2) This was rather naive of him, despite there being a note in this +section indicating that the next major version would move to the POSIX +rules. + File: gawk.info, Node: I/O Functions, Next: Time Functions, Prev: String Functions, Up: Built-in @@ -11734,7 +11810,7 @@ table-bitwise-ops::. 0 | 0 0 | 0 1 | 0 1 1 | 0 1 | 1 1 | 1 0 -Table 9.4: Bitwise Operations +Table 9.6: Bitwise Operations As you can see, the result of an AND operation is 1 only when _both_ bits are 1. The result of an OR operation is 1 if _either_ bit is 1. @@ -24628,7 +24704,7 @@ Index * * (asterisk), * operator, as regexp operator: Regexp Operators. (line 87) * * (asterisk), * operator, null strings, matching: Gory Details. - (line 96) + (line 164) * * (asterisk), ** operator <1>: Precedence. (line 49) * * (asterisk), ** operator: Arithmetic Ops. (line 81) * * (asterisk), **= operator <1>: Precedence. 
(line 95) @@ -24868,7 +24944,7 @@ Index (line 23) * advanced features, network connections, See Also networks, connections: Advanced Features. (line 6) -* advanced features, null strings, matching: Gory Details. (line 96) +* advanced features, null strings, matching: Gory Details. (line 164) * advanced features, operators, precedence: Increment Ops. (line 61) * advanced features, piping into sh: Redirection. (line 143) * advanced features, regexp constants: Assignment Ops. (line 148) @@ -24965,7 +25041,7 @@ Index * asterisk (*), * operator, as regexp operator: Regexp Operators. (line 87) * asterisk (*), * operator, null strings, matching: Gory Details. - (line 96) + (line 164) * asterisk (*), ** operator <1>: Precedence. (line 49) * asterisk (*), ** operator: Arithmetic Ops. (line 81) * asterisk (*), **= operator <1>: Precedence. (line 95) @@ -26389,7 +26465,7 @@ Index * matching, expressions, See comparison expressions: Typing and Comparison. (line 9) * matching, leftmost longest: Multiple Line. (line 26) -* matching, null strings: Gory Details. (line 96) +* matching, null strings: Gory Details. (line 164) * mawk program: Other Versions. (line 35) * McPhee, Patrick: Contributors. (line 100) * memory, releasing: Internals. (line 101) @@ -26470,7 +26546,7 @@ Index * null strings, as array subscripts: Uninitialized Subscripts. (line 43) * null strings, converting numbers to strings: Conversion. (line 21) -* null strings, matching: Gory Details. (line 96) +* null strings, matching: Gory Details. (line 164) * null strings, quoting and: Quoting. (line 62) * number sign (#), #! (executable scripts): Executable Scripts. (line 6) @@ -26686,6 +26762,7 @@ Index * POSIX awk, field separators and: Fields. (line 6) * POSIX awk, FS variable and: User-modified. (line 66) * POSIX awk, function keyword in: Definition Syntax. (line 83) +* POSIX awk, functions and, gsub()/sub(): Gory Details. (line 54) * POSIX awk, functions and, length(): String Functions. 
(line 175) * POSIX awk, GNU long options and: Options. (line 15) * POSIX awk, interval expressions in: Regexp Operators. (line 135) @@ -27579,203 +27656,207 @@ Ref: String Functions-Footnote-2468728 Ref: String Functions-Footnote-3468976 Node: Gory Details469063 Ref: table-sub-escapes470742 -Ref: table-posix-sub472056 -Ref: table-gensub-escapes472969 -Node: I/O Functions474140 -Ref: I/O Functions-Footnote-1480795 -Node: Time Functions480942 -Ref: Time Functions-Footnote-1491834 -Ref: Time Functions-Footnote-2491902 -Ref: Time Functions-Footnote-3492060 -Ref: Time Functions-Footnote-4492171 -Ref: Time Functions-Footnote-5492283 -Ref: Time Functions-Footnote-6492510 -Node: Bitwise Functions492776 -Ref: table-bitwise-ops493334 -Ref: Bitwise Functions-Footnote-1497494 -Node: Type Functions497678 -Node: I18N Functions498148 -Node: User-defined499775 -Node: Definition Syntax500579 -Ref: Definition Syntax-Footnote-1505489 -Node: Function Example505558 -Node: Function Caveats508152 -Node: Calling A Function508573 -Node: Variable Scope509688 -Node: Pass By Value/Reference511663 -Node: Return Statement515103 -Node: Dynamic Typing518084 -Node: Indirect Calls518819 -Node: Internationalization528504 -Node: I18N and L10N529930 -Node: Explaining gettext530616 -Ref: Explaining gettext-Footnote-1535682 -Ref: Explaining gettext-Footnote-2535866 -Node: Programmer i18n536031 -Node: Translator i18n540231 -Node: String Extraction541024 -Ref: String Extraction-Footnote-1541985 -Node: Printf Ordering542071 -Ref: Printf Ordering-Footnote-1544855 -Node: I18N Portability544919 -Ref: I18N Portability-Footnote-1547368 -Node: I18N Example547431 -Ref: I18N Example-Footnote-1550066 -Node: Gawk I18N550138 -Node: Advanced Features550755 -Node: Nondecimal Data552268 -Node: Array Sorting553851 -Node: Controlling Array Traversal554551 -Node: Controlling Scanning With A Function555298 -Node: Controlling Scanning563001 -Ref: Controlling Scanning-Footnote-1566802 -Node: Array Sorting Functions567118 
-Ref: Array Sorting Functions-Footnote-1570634 -Ref: Array Sorting Functions-Footnote-2570727 -Node: Two-way I/O570921 -Ref: Two-way I/O-Footnote-1576353 -Node: TCP/IP Networking576423 -Node: Profiling579267 -Node: Library Functions586741 -Ref: Library Functions-Footnote-1589748 -Node: Library Names589919 -Ref: Library Names-Footnote-1593390 -Ref: Library Names-Footnote-2593610 -Node: General Functions593696 -Node: Strtonum Function594649 -Node: Assert Function597579 -Node: Round Function600905 -Node: Cliff Random Function602448 -Node: Ordinal Functions603464 -Ref: Ordinal Functions-Footnote-1606534 -Ref: Ordinal Functions-Footnote-2606786 -Node: Join Function606995 -Ref: Join Function-Footnote-1608766 -Node: Gettimeofday Function608966 -Node: Data File Management612681 -Node: Filetrans Function613313 -Node: Rewind Function617452 -Node: File Checking618839 -Node: Empty Files619933 -Node: Ignoring Assigns622163 -Node: Getopt Function623716 -Ref: Getopt Function-Footnote-1635020 -Node: Passwd Functions635223 -Ref: Passwd Functions-Footnote-1644198 -Node: Group Functions644286 -Node: Walking Arrays652370 -Node: Sample Programs653939 -Node: Running Examples654604 -Node: Clones655332 -Node: Cut Program656556 -Node: Egrep Program666401 -Ref: Egrep Program-Footnote-1674174 -Node: Id Program674284 -Node: Split Program677900 -Ref: Split Program-Footnote-1681419 -Node: Tee Program681547 -Node: Uniq Program684350 -Node: Wc Program691779 -Ref: Wc Program-Footnote-1696045 -Ref: Wc Program-Footnote-2696245 -Node: Miscellaneous Programs696337 -Node: Dupword Program697525 -Node: Alarm Program699556 -Node: Translate Program704305 -Ref: Translate Program-Footnote-1708692 -Ref: Translate Program-Footnote-2708920 -Node: Labels Program709054 -Ref: Labels Program-Footnote-1712425 -Node: Word Sorting712509 -Node: History Sorting716393 -Node: Extract Program718232 -Ref: Extract Program-Footnote-1725715 -Node: Simple Sed725843 -Node: Igawk Program728905 -Ref: Igawk Program-Footnote-1744062 
-Ref: Igawk Program-Footnote-2744263 -Node: Anagram Program744401 -Node: Signature Program747469 -Node: Debugger748569 -Node: Debugging749480 -Node: Debugging Concepts749893 -Node: Debugging Terms751749 -Node: Awk Debugging754372 -Node: Sample dgawk session755264 -Node: dgawk invocation755756 -Node: Finding The Bug756938 -Node: List of Debugger Commands763424 -Node: Breakpoint Control764735 -Node: Dgawk Execution Control768371 -Node: Viewing And Changing Data771722 -Node: Dgawk Stack775059 -Node: Dgawk Info776519 -Node: Miscellaneous Dgawk Commands780467 -Node: Readline Support785895 -Node: Dgawk Limitations786733 -Node: Language History788922 -Node: V7/SVR3.1790434 -Node: SVR4792755 -Node: POSIX794197 -Node: BTL795205 -Node: POSIX/GNU795939 -Node: Common Extensions801090 -Node: Ranges and Locales802197 -Ref: Ranges and Locales-Footnote-1806804 -Node: Contributors807025 -Node: Installation811287 -Node: Gawk Distribution812181 -Node: Getting812665 -Node: Extracting813491 -Node: Distribution contents815183 -Node: Unix Installation820405 -Node: Quick Installation821022 -Node: Additional Configuration Options822984 -Node: Configuration Philosophy824461 -Node: Non-Unix Installation826803 -Node: PC Installation827261 -Node: PC Binary Installation828560 -Node: PC Compiling830408 -Node: PC Testing833352 -Node: PC Using834528 -Node: Cygwin838713 -Node: MSYS839713 -Node: VMS Installation840227 -Node: VMS Compilation840830 -Ref: VMS Compilation-Footnote-1841837 -Node: VMS Installation Details841895 -Node: VMS Running843530 -Node: VMS Old Gawk845137 -Node: Bugs845611 -Node: Other Versions849464 -Node: Notes854745 -Node: Compatibility Mode855437 -Node: Additions856220 -Node: Accessing The Source857032 -Node: Adding Code858457 -Node: New Ports864424 -Node: Dynamic Extensions868537 -Node: Internals869913 -Node: Plugin License879016 -Node: Sample Library879650 -Node: Internal File Description880336 -Node: Internal File Ops884051 -Ref: Internal File Ops-Footnote-1888832 -Node: 
Using Internal File Ops888972 -Node: Future Extensions891349 -Node: Basic Concepts893853 -Node: Basic High Level894610 -Ref: Basic High Level-Footnote-1898645 -Node: Basic Data Typing898830 -Node: Floating Point Issues903355 -Node: String Conversion Precision904438 -Ref: String Conversion Precision-Footnote-1906138 -Node: Unexpected Results906247 -Node: POSIX Floating Point Problems908073 -Ref: POSIX Floating Point Problems-Footnote-1911778 -Node: Glossary911816 -Node: Copying936792 -Node: GNU Free Documentation License974349 -Node: Index999486 +Ref: table-sub-posix-92472096 +Ref: table-sub-proposed473439 +Ref: table-posix-sub474789 +Ref: table-gensub-escapes476335 +Ref: Gory Details-Footnote-1477542 +Ref: Gory Details-Footnote-2477593 +Node: I/O Functions477744 +Ref: I/O Functions-Footnote-1484399 +Node: Time Functions484546 +Ref: Time Functions-Footnote-1495438 +Ref: Time Functions-Footnote-2495506 +Ref: Time Functions-Footnote-3495664 +Ref: Time Functions-Footnote-4495775 +Ref: Time Functions-Footnote-5495887 +Ref: Time Functions-Footnote-6496114 +Node: Bitwise Functions496380 +Ref: table-bitwise-ops496938 +Ref: Bitwise Functions-Footnote-1501098 +Node: Type Functions501282 +Node: I18N Functions501752 +Node: User-defined503379 +Node: Definition Syntax504183 +Ref: Definition Syntax-Footnote-1509093 +Node: Function Example509162 +Node: Function Caveats511756 +Node: Calling A Function512177 +Node: Variable Scope513292 +Node: Pass By Value/Reference515267 +Node: Return Statement518707 +Node: Dynamic Typing521688 +Node: Indirect Calls522423 +Node: Internationalization532108 +Node: I18N and L10N533534 +Node: Explaining gettext534220 +Ref: Explaining gettext-Footnote-1539286 +Ref: Explaining gettext-Footnote-2539470 +Node: Programmer i18n539635 +Node: Translator i18n543835 +Node: String Extraction544628 +Ref: String Extraction-Footnote-1545589 +Node: Printf Ordering545675 +Ref: Printf Ordering-Footnote-1548459 +Node: I18N Portability548523 +Ref: I18N 
Portability-Footnote-1550972 +Node: I18N Example551035 +Ref: I18N Example-Footnote-1553670 +Node: Gawk I18N553742 +Node: Advanced Features554359 +Node: Nondecimal Data555872 +Node: Array Sorting557455 +Node: Controlling Array Traversal558155 +Node: Controlling Scanning With A Function558902 +Node: Controlling Scanning566605 +Ref: Controlling Scanning-Footnote-1570406 +Node: Array Sorting Functions570722 +Ref: Array Sorting Functions-Footnote-1574238 +Ref: Array Sorting Functions-Footnote-2574331 +Node: Two-way I/O574525 +Ref: Two-way I/O-Footnote-1579957 +Node: TCP/IP Networking580027 +Node: Profiling582871 +Node: Library Functions590345 +Ref: Library Functions-Footnote-1593352 +Node: Library Names593523 +Ref: Library Names-Footnote-1596994 +Ref: Library Names-Footnote-2597214 +Node: General Functions597300 +Node: Strtonum Function598253 +Node: Assert Function601183 +Node: Round Function604509 +Node: Cliff Random Function606052 +Node: Ordinal Functions607068 +Ref: Ordinal Functions-Footnote-1610138 +Ref: Ordinal Functions-Footnote-2610390 +Node: Join Function610599 +Ref: Join Function-Footnote-1612370 +Node: Gettimeofday Function612570 +Node: Data File Management616285 +Node: Filetrans Function616917 +Node: Rewind Function621056 +Node: File Checking622443 +Node: Empty Files623537 +Node: Ignoring Assigns625767 +Node: Getopt Function627320 +Ref: Getopt Function-Footnote-1638624 +Node: Passwd Functions638827 +Ref: Passwd Functions-Footnote-1647802 +Node: Group Functions647890 +Node: Walking Arrays655974 +Node: Sample Programs657543 +Node: Running Examples658208 +Node: Clones658936 +Node: Cut Program660160 +Node: Egrep Program670005 +Ref: Egrep Program-Footnote-1677778 +Node: Id Program677888 +Node: Split Program681504 +Ref: Split Program-Footnote-1685023 +Node: Tee Program685151 +Node: Uniq Program687954 +Node: Wc Program695383 +Ref: Wc Program-Footnote-1699649 +Ref: Wc Program-Footnote-2699849 +Node: Miscellaneous Programs699941 +Node: Dupword Program701129 +Node: 
Alarm Program703160 +Node: Translate Program707909 +Ref: Translate Program-Footnote-1712296 +Ref: Translate Program-Footnote-2712524 +Node: Labels Program712658 +Ref: Labels Program-Footnote-1716029 +Node: Word Sorting716113 +Node: History Sorting719997 +Node: Extract Program721836 +Ref: Extract Program-Footnote-1729319 +Node: Simple Sed729447 +Node: Igawk Program732509 +Ref: Igawk Program-Footnote-1747666 +Ref: Igawk Program-Footnote-2747867 +Node: Anagram Program748005 +Node: Signature Program751073 +Node: Debugger752173 +Node: Debugging753084 +Node: Debugging Concepts753497 +Node: Debugging Terms755353 +Node: Awk Debugging757976 +Node: Sample dgawk session758868 +Node: dgawk invocation759360 +Node: Finding The Bug760542 +Node: List of Debugger Commands767028 +Node: Breakpoint Control768339 +Node: Dgawk Execution Control771975 +Node: Viewing And Changing Data775326 +Node: Dgawk Stack778663 +Node: Dgawk Info780123 +Node: Miscellaneous Dgawk Commands784071 +Node: Readline Support789499 +Node: Dgawk Limitations790337 +Node: Language History792526 +Node: V7/SVR3.1794038 +Node: SVR4796359 +Node: POSIX797801 +Node: BTL798809 +Node: POSIX/GNU799543 +Node: Common Extensions804694 +Node: Ranges and Locales805801 +Ref: Ranges and Locales-Footnote-1810408 +Node: Contributors810629 +Node: Installation814891 +Node: Gawk Distribution815785 +Node: Getting816269 +Node: Extracting817095 +Node: Distribution contents818787 +Node: Unix Installation824009 +Node: Quick Installation824626 +Node: Additional Configuration Options826588 +Node: Configuration Philosophy828065 +Node: Non-Unix Installation830407 +Node: PC Installation830865 +Node: PC Binary Installation832164 +Node: PC Compiling834012 +Node: PC Testing836956 +Node: PC Using838132 +Node: Cygwin842317 +Node: MSYS843317 +Node: VMS Installation843831 +Node: VMS Compilation844434 +Ref: VMS Compilation-Footnote-1845441 +Node: VMS Installation Details845499 +Node: VMS Running847134 +Node: VMS Old Gawk848741 +Node: Bugs849215 +Node: 
Other Versions853068 +Node: Notes858349 +Node: Compatibility Mode859041 +Node: Additions859824 +Node: Accessing The Source860636 +Node: Adding Code862061 +Node: New Ports868028 +Node: Dynamic Extensions872141 +Node: Internals873517 +Node: Plugin License882620 +Node: Sample Library883254 +Node: Internal File Description883940 +Node: Internal File Ops887655 +Ref: Internal File Ops-Footnote-1892436 +Node: Using Internal File Ops892576 +Node: Future Extensions894953 +Node: Basic Concepts897457 +Node: Basic High Level898214 +Ref: Basic High Level-Footnote-1902249 +Node: Basic Data Typing902434 +Node: Floating Point Issues906959 +Node: String Conversion Precision908042 +Ref: String Conversion Precision-Footnote-1909742 +Node: Unexpected Results909851 +Node: POSIX Floating Point Problems911677 +Ref: POSIX Floating Point Problems-Footnote-1915382 +Node: Glossary915420 +Node: Copying940396 +Node: GNU Free Documentation License977953 +Node: Index1003090 End Tag Table |
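Reviewer note on the hunk above: the added text distinguishes the 1996 proposed rules (Table 9.3) from the POSIX rules (Table 9.4) and says they differ only in the last row, where `sub()` sees `\\`. The sketch below is a rough Python model of that runtime-level processing, written only to make the one-row difference concrete. It is not gawk's implementation; the function names `expand_posix` and `expand_proposed_1996` and the `matched` parameter are hypothetical, and the model is only meant to cover the rows described in the patch.

```python
def expand_posix(replacement: str, matched: str) -> str:
    """Model of the POSIX rules: \\& gives a literal &, \\\\ gives a
    literal backslash, and a backslash before anything else is kept as is.
    `replacement` is the string as sub() sees it at run time."""
    out = []
    i = 0
    while i < len(replacement):
        ch = replacement[i]
        nxt = replacement[i + 1] if i + 1 < len(replacement) else ""
        if ch == "\\" and nxt in ("\\", "&"):
            out.append(nxt)        # the escaped character is output literally
            i += 2
        elif ch == "&":
            out.append(matched)    # a bare & stands for the matched text
            i += 1
        else:
            out.append(ch)         # everything else, lone backslashes included
            i += 1
    return "".join(out)


def expand_proposed_1996(replacement: str, matched: str) -> str:
    """Model of the 1996 proposed rules: only the three sequences
    \\\\\\&, \\\\& and \\& are special; any other backslash is literal."""
    out = []
    i = 0
    while i < len(replacement):
        if replacement.startswith("\\\\\\&", i):   # three backslashes, then &
            out.append("\\&")                      # a literal backslash and &
            i += 4
        elif replacement.startswith("\\\\&", i):   # two backslashes, then &
            out.append("\\" + matched)             # literal backslash, then match
            i += 3
        elif replacement.startswith("\\&", i):     # one backslash, then &
            out.append("&")                        # a literal &
            i += 2
        elif replacement[i] == "&":
            out.append(matched)
            i += 1
        else:
            out.append(replacement[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Each input is the "sub() sees" column of the tables in the patch.
    for seen in (r"\\\&", r"\\&", r"\&", r"\q", "\\\\"):
        print(seen, expand_proposed_1996(seen, "MATCH"),
              expand_posix(seen, "MATCH"), sep="\t")
```

Running the module prints one row per table entry; under these assumptions the two columns agree everywhere except for the final input (two backslashes), which matches the sentence the patch adds after Table 9.4.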