-rw-r--r-- | doc/ChangeLog | 4
-rw-r--r-- | doc/gawk.info | 921
-rw-r--r-- | doc/gawk.texi | 134
-rw-r--r-- | doc/gawktexi.in | 124
4 files changed, 605 insertions, 578 deletions
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 94e8b306..17b4b291 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2014-02-14 Arnold D. Robbins <arnold@skeeve.com>
+
+	* gawktexi.in: Lots of small edits.
+
 2014-02-07 Arnold D. Robbins <arnold@skeeve.com>
 
 	* gawktexi.in: More minor fixes, update UPDATE_MONTH.
diff --git a/doc/gawk.info b/doc/gawk.info
index 6523199c..351d0d44 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -4083,18 +4083,23 @@ use for `RS' in this case:
  BEGIN { RS = "\0" } # whole file becomes one record?
 
  `gawk' in fact accepts this, and uses the NUL character for the
-record separator. However, this usage is _not_ portable to other `awk'
-implementations.
+record separator. However, this usage is _not_ portable to most other
+`awk' implementations.
 
- All other `awk' implementations(1) store strings internally as
-C-style strings. C strings use the NUL character as the string
+ Almost all other `awk' implementations(1) store strings internally
+as C-style strings. C strings use the NUL character as the string
 terminator. In effect, this means that `RS = "\0"' is the same as `RS =
 ""'. (d.c.)
 
+ It happens that recent versions of `mawk' can use the NUL character
+as a record separator. However, this is a special case: `mawk' does not
+allow embedded NUL characters in strings.
+
  The best way to treat a whole file as a single record is to simply
 read the file in, one record at a time, concatenating each record onto
 the end of the previous ones.
 
+
  ---------- Footnotes ----------
 
  (1) At least that we know about.
@@ -7171,7 +7176,7 @@ decimal point when reading the `awk' program source code, and for
 command-line variable assignments (*note Other Arguments::). However,
 when interpreting input data, for `print' and `printf' output, and for
 number to string conversion, the local decimal point character is used.
-(d.c.). Here are some examples indicating the difference in behavior,
+(d.c.)
Here are some examples indicating the difference in behavior, on a GNU/Linux system: $ export POSIXLY_CORRECT=1 Force POSIX behavior @@ -31141,7 +31146,7 @@ Index * files, /inet/... (gawk): TCP/IP Networking. (line 6) * files, /inet4/... (gawk): TCP/IP Networking. (line 6) * files, /inet6/... (gawk): TCP/IP Networking. (line 6) -* files, as single records: Records. (line 200) +* files, as single records: Records. (line 204) * files, awk programs in: Long. (line 6) * files, awkprof.out: Profiling. (line 6) * files, awkvars.out: Options. (line 93) @@ -32151,7 +32156,7 @@ Index * records, printing: Print. (line 22) * records, splitting input into: Records. (line 6) * records, terminating: Records. (line 117) -* records, treating files as: Records. (line 200) +* records, treating files as: Records. (line 204) * recursive functions: Definition Syntax. (line 73) * redirection of input: Getline/File. (line 6) * redirection of output: Redirection. (line 6) @@ -32802,456 +32807,456 @@ Node: Leftmost Longest171273 Node: Computed Regexps172474 Node: Reading Files175811 Node: Records177813 -Ref: Records-Footnote-1186702 -Node: Fields186739 -Ref: Fields-Footnote-1189772 -Node: Nonconstant Fields189858 -Node: Changing Fields192060 -Node: Field Separators198019 -Node: Default Field Splitting200721 -Node: Regexp Field Splitting201838 -Node: Single Character Fields205180 -Node: Command Line Field Separator206239 -Node: Full Line Fields209673 -Ref: Full Line Fields-Footnote-1210181 -Node: Field Splitting Summary210227 -Ref: Field Splitting Summary-Footnote-1213326 -Node: Constant Size213427 -Node: Splitting By Content218034 -Ref: Splitting By Content-Footnote-1221783 -Node: Multiple Line221823 -Ref: Multiple Line-Footnote-1227670 -Node: Getline227849 -Node: Plain Getline230065 -Node: Getline/Variable232160 -Node: Getline/File233307 -Node: Getline/Variable/File234648 -Ref: Getline/Variable/File-Footnote-1236247 -Node: Getline/Pipe236334 -Node: Getline/Variable/Pipe239033 -Node: 
Getline/Coprocess240140 -Node: Getline/Variable/Coprocess241392 -Node: Getline Notes242129 -Node: Getline Summary244916 -Ref: table-getline-variants245324 -Node: Read Timeout246236 -Ref: Read Timeout-Footnote-1249977 -Node: Command line directories250034 -Node: Printing250664 -Node: Print252295 -Node: Print Examples253632 -Node: Output Separators256416 -Node: OFMT258176 -Node: Printf259534 -Node: Basic Printf260440 -Node: Control Letters261979 -Node: Format Modifiers265791 -Node: Printf Examples271800 -Node: Redirection274515 -Node: Special Files281480 -Node: Special FD282013 -Ref: Special FD-Footnote-1285638 -Node: Special Network285712 -Node: Special Caveats286562 -Node: Close Files And Pipes287358 -Ref: Close Files And Pipes-Footnote-1294341 -Ref: Close Files And Pipes-Footnote-2294489 -Node: Expressions294639 -Node: Values295771 -Node: Constants296447 -Node: Scalar Constants297127 -Ref: Scalar Constants-Footnote-1297986 -Node: Nondecimal-numbers298168 -Node: Regexp Constants301168 -Node: Using Constant Regexps301643 -Node: Variables304698 -Node: Using Variables305353 -Node: Assignment Options307077 -Node: Conversion308949 -Ref: table-locale-affects314450 -Ref: Conversion-Footnote-1315074 -Node: All Operators315183 -Node: Arithmetic Ops315813 -Node: Concatenation318318 -Ref: Concatenation-Footnote-1321110 -Node: Assignment Ops321230 -Ref: table-assign-ops326218 -Node: Increment Ops327549 -Node: Truth Values and Conditions330983 -Node: Truth Values332066 -Node: Typing and Comparison333115 -Node: Variable Typing333908 -Ref: Variable Typing-Footnote-1337805 -Node: Comparison Operators337927 -Ref: table-relational-ops338337 -Node: POSIX String Comparison341885 -Ref: POSIX String Comparison-Footnote-1342841 -Node: Boolean Ops342979 -Ref: Boolean Ops-Footnote-1347057 -Node: Conditional Exp347148 -Node: Function Calls348880 -Node: Precedence352474 -Node: Locales356143 -Node: Patterns and Actions357232 -Node: Pattern Overview358286 -Node: Regexp Patterns359955 -Node: 
Expression Patterns360498 -Node: Ranges364183 -Node: BEGIN/END367149 -Node: Using BEGIN/END367911 -Ref: Using BEGIN/END-Footnote-1370642 -Node: I/O And BEGIN/END370748 -Node: BEGINFILE/ENDFILE373030 -Node: Empty375944 -Node: Using Shell Variables376260 -Node: Action Overview378545 -Node: Statements380902 -Node: If Statement382756 -Node: While Statement384255 -Node: Do Statement386299 -Node: For Statement387455 -Node: Switch Statement390607 -Node: Break Statement392704 -Node: Continue Statement394694 -Node: Next Statement396487 -Node: Nextfile Statement398877 -Node: Exit Statement401520 -Node: Built-in Variables403936 -Node: User-modified405031 -Ref: User-modified-Footnote-1413389 -Node: Auto-set413451 -Ref: Auto-set-Footnote-1426529 -Ref: Auto-set-Footnote-2426734 -Node: ARGC and ARGV426790 -Node: Arrays430641 -Node: Array Basics432146 -Node: Array Intro432972 -Node: Reference to Elements437289 -Node: Assigning Elements439559 -Node: Array Example440050 -Node: Scanning an Array441782 -Node: Controlling Scanning444096 -Ref: Controlling Scanning-Footnote-1449183 -Node: Delete449499 -Ref: Delete-Footnote-1452264 -Node: Numeric Array Subscripts452321 -Node: Uninitialized Subscripts454504 -Node: Multidimensional456131 -Node: Multiscanning459224 -Node: Arrays of Arrays460813 -Node: Functions465453 -Node: Built-in466272 -Node: Calling Built-in467350 -Node: Numeric Functions469338 -Ref: Numeric Functions-Footnote-1473170 -Ref: Numeric Functions-Footnote-2473527 -Ref: Numeric Functions-Footnote-3473575 -Node: String Functions473844 -Ref: String Functions-Footnote-1496764 -Ref: String Functions-Footnote-2496893 -Ref: String Functions-Footnote-3497141 -Node: Gory Details497228 -Ref: table-sub-escapes498907 -Ref: table-sub-posix-92500261 -Ref: table-sub-proposed501612 -Ref: table-posix-sub502966 -Ref: table-gensub-escapes504511 -Ref: Gory Details-Footnote-1505687 -Ref: Gory Details-Footnote-2505738 -Node: I/O Functions505889 -Ref: I/O Functions-Footnote-1512874 -Node: Time 
Functions513021 -Ref: Time Functions-Footnote-1523954 -Ref: Time Functions-Footnote-2524022 -Ref: Time Functions-Footnote-3524180 -Ref: Time Functions-Footnote-4524291 -Ref: Time Functions-Footnote-5524403 -Ref: Time Functions-Footnote-6524630 -Node: Bitwise Functions524896 -Ref: table-bitwise-ops525458 -Ref: Bitwise Functions-Footnote-1529679 -Node: Type Functions529863 -Node: I18N Functions531014 -Node: User-defined532641 -Node: Definition Syntax533445 -Ref: Definition Syntax-Footnote-1538355 -Node: Function Example538424 -Node: Function Caveats541018 -Node: Calling A Function541439 -Node: Variable Scope542554 -Node: Pass By Value/Reference545517 -Node: Return Statement549025 -Node: Dynamic Typing552006 -Node: Indirect Calls552937 -Node: Library Functions562622 -Ref: Library Functions-Footnote-1566135 -Ref: Library Functions-Footnote-2566278 -Node: Library Names566449 -Ref: Library Names-Footnote-1569920 -Ref: Library Names-Footnote-2570140 -Node: General Functions570226 -Node: Strtonum Function571254 -Node: Assert Function574184 -Node: Round Function577510 -Node: Cliff Random Function579053 -Node: Ordinal Functions580069 -Ref: Ordinal Functions-Footnote-1583141 -Ref: Ordinal Functions-Footnote-2583393 -Node: Join Function583602 -Ref: Join Function-Footnote-1585373 -Node: Getlocaltime Function585573 -Node: Readfile Function589314 -Node: Data File Management591153 -Node: Filetrans Function591785 -Node: Rewind Function595854 -Node: File Checking597241 -Node: Empty Files598335 -Node: Ignoring Assigns600565 -Node: Getopt Function602118 -Ref: Getopt Function-Footnote-1613421 -Node: Passwd Functions613624 -Ref: Passwd Functions-Footnote-1622599 -Node: Group Functions622687 -Node: Walking Arrays630771 -Node: Sample Programs632908 -Node: Running Examples633582 -Node: Clones634310 -Node: Cut Program635534 -Node: Egrep Program645379 -Ref: Egrep Program-Footnote-1653152 -Node: Id Program653262 -Node: Split Program656878 -Ref: Split Program-Footnote-1660397 -Node: Tee 
Program660525 -Node: Uniq Program663328 -Node: Wc Program670757 -Ref: Wc Program-Footnote-1675023 -Ref: Wc Program-Footnote-2675223 -Node: Miscellaneous Programs675315 -Node: Dupword Program676503 -Node: Alarm Program678534 -Node: Translate Program683287 -Ref: Translate Program-Footnote-1687674 -Ref: Translate Program-Footnote-2687922 -Node: Labels Program688056 -Ref: Labels Program-Footnote-1691427 -Node: Word Sorting691511 -Node: History Sorting695395 -Node: Extract Program697234 -Ref: Extract Program-Footnote-1704737 -Node: Simple Sed704865 -Node: Igawk Program707927 -Ref: Igawk Program-Footnote-1723084 -Ref: Igawk Program-Footnote-2723285 -Node: Anagram Program723423 -Node: Signature Program726491 -Node: Advanced Features727591 -Node: Nondecimal Data729477 -Node: Array Sorting731060 -Node: Controlling Array Traversal731757 -Node: Array Sorting Functions740041 -Ref: Array Sorting Functions-Footnote-1743910 -Node: Two-way I/O744104 -Ref: Two-way I/O-Footnote-1749536 -Node: TCP/IP Networking749606 -Node: Profiling752450 -Node: Internationalization759947 -Node: I18N and L10N761372 -Node: Explaining gettext762058 -Ref: Explaining gettext-Footnote-1767126 -Ref: Explaining gettext-Footnote-2767310 -Node: Programmer i18n767475 -Node: Translator i18n771677 -Node: String Extraction772470 -Ref: String Extraction-Footnote-1773431 -Node: Printf Ordering773517 -Ref: Printf Ordering-Footnote-1776301 -Node: I18N Portability776365 -Ref: I18N Portability-Footnote-1778814 -Node: I18N Example778877 -Ref: I18N Example-Footnote-1781515 -Node: Gawk I18N781587 -Node: Debugger782208 -Node: Debugging783179 -Node: Debugging Concepts783612 -Node: Debugging Terms785468 -Node: Awk Debugging788065 -Node: Sample Debugging Session788957 -Node: Debugger Invocation789477 -Node: Finding The Bug790809 -Node: List of Debugger Commands797297 -Node: Breakpoint Control798631 -Node: Debugger Execution Control802295 -Node: Viewing And Changing Data805655 -Node: Execution Stack809011 -Node: Debugger 
Info810478 -Node: Miscellaneous Debugger Commands814460 -Node: Readline Support819636 -Node: Limitations820467 -Node: Arbitrary Precision Arithmetic822719 -Ref: Arbitrary Precision Arithmetic-Footnote-1824368 -Node: General Arithmetic824516 -Node: Floating Point Issues826236 -Node: String Conversion Precision827117 -Ref: String Conversion Precision-Footnote-1828822 -Node: Unexpected Results828931 -Node: POSIX Floating Point Problems831084 -Ref: POSIX Floating Point Problems-Footnote-1834909 -Node: Integer Programming834947 -Node: Floating-point Programming836686 -Ref: Floating-point Programming-Footnote-1843017 -Ref: Floating-point Programming-Footnote-2843287 -Node: Floating-point Representation843551 -Node: Floating-point Context844716 -Ref: table-ieee-formats845555 -Node: Rounding Mode846939 -Ref: table-rounding-modes847418 -Ref: Rounding Mode-Footnote-1850433 -Node: Gawk and MPFR850612 -Node: Arbitrary Precision Floats851867 -Ref: Arbitrary Precision Floats-Footnote-1854310 -Node: Setting Precision854626 -Ref: table-predefined-precision-strings855312 -Node: Setting Rounding Mode857457 -Ref: table-gawk-rounding-modes857861 -Node: Floating-point Constants859048 -Node: Changing Precision860477 -Ref: Changing Precision-Footnote-1861874 -Node: Exact Arithmetic862048 -Node: Arbitrary Precision Integers865186 -Ref: Arbitrary Precision Integers-Footnote-1868204 -Node: Dynamic Extensions868351 -Node: Extension Intro869809 -Node: Plugin License871074 -Node: Extension Mechanism Outline871759 -Ref: load-extension872176 -Ref: load-new-function873654 -Ref: call-new-function874649 -Node: Extension API Description876664 -Node: Extension API Functions Introduction877877 -Node: General Data Types882743 -Ref: General Data Types-Footnote-1888345 -Node: Requesting Values888644 -Ref: table-value-types-returned889375 -Node: Constructor Functions890329 -Node: Registration Functions893349 -Node: Extension Functions894034 -Node: Exit Callback Functions896259 -Node: Extension Version 
String897508 -Node: Input Parsers898158 -Node: Output Wrappers907915 -Node: Two-way processors912425 -Node: Printing Messages914633 -Ref: Printing Messages-Footnote-1915710 -Node: Updating `ERRNO'915862 -Node: Accessing Parameters916601 -Node: Symbol Table Access917831 -Node: Symbol table by name918343 -Node: Symbol table by cookie920090 -Ref: Symbol table by cookie-Footnote-1924220 -Node: Cached values924283 -Ref: Cached values-Footnote-1927732 -Node: Array Manipulation927823 -Ref: Array Manipulation-Footnote-1928921 -Node: Array Data Types928960 -Ref: Array Data Types-Footnote-1931663 -Node: Array Functions931755 -Node: Flattening Arrays935521 -Node: Creating Arrays942373 -Node: Extension API Variables947098 -Node: Extension Versioning947734 -Node: Extension API Informational Variables949635 -Node: Extension API Boilerplate950721 -Node: Finding Extensions954525 -Node: Extension Example955085 -Node: Internal File Description955815 -Node: Internal File Ops959906 -Ref: Internal File Ops-Footnote-1971414 -Node: Using Internal File Ops971554 -Ref: Using Internal File Ops-Footnote-1973907 -Node: Extension Samples974173 -Node: Extension Sample File Functions975697 -Node: Extension Sample Fnmatch984182 -Node: Extension Sample Fork985908 -Node: Extension Sample Inplace987126 -Node: Extension Sample Ord988904 -Node: Extension Sample Readdir989740 -Node: Extension Sample Revout991272 -Node: Extension Sample Rev2way991865 -Node: Extension Sample Read write array992555 -Node: Extension Sample Readfile994438 -Node: Extension Sample API Tests995256 -Node: Extension Sample Time995781 -Node: gawkextlib997145 -Node: Language History999926 -Node: V7/SVR3.11001519 -Node: SVR41003839 -Node: POSIX1005281 -Node: BTL1006667 -Node: POSIX/GNU1007401 -Node: Feature History1013000 -Node: Common Extensions1025964 -Node: Ranges and Locales1027276 -Ref: Ranges and Locales-Footnote-11031894 -Ref: Ranges and Locales-Footnote-21031921 -Ref: Ranges and Locales-Footnote-31032181 -Node: 
Contributors1032402 -Node: Installation1037547 -Node: Gawk Distribution1038441 -Node: Getting1038925 -Node: Extracting1039751 -Node: Distribution contents1041443 -Node: Unix Installation1047148 -Node: Quick Installation1047765 -Node: Additional Configuration Options1050209 -Node: Configuration Philosophy1051945 -Node: Non-Unix Installation1054299 -Node: PC Installation1054757 -Node: PC Binary Installation1056056 -Node: PC Compiling1057904 -Node: PC Testing1060848 -Node: PC Using1062024 -Node: Cygwin1066209 -Node: MSYS1067209 -Node: VMS Installation1067723 -Node: VMS Compilation1068487 -Ref: VMS Compilation-Footnote-11070102 -Node: VMS Dynamic Extensions1070160 -Node: VMS Installation Details1071533 -Node: VMS Running1073780 -Node: VMS GNV1076614 -Node: VMS Old Gawk1077337 -Node: Bugs1077807 -Node: Other Versions1081725 -Node: Notes1087809 -Node: Compatibility Mode1088609 -Node: Additions1089392 -Node: Accessing The Source1090319 -Node: Adding Code1091759 -Node: New Ports1097804 -Node: Derived Files1101939 -Ref: Derived Files-Footnote-11107260 -Ref: Derived Files-Footnote-21107294 -Ref: Derived Files-Footnote-31107894 -Node: Future Extensions1107992 -Node: Implementation Limitations1108575 -Node: Extension Design1109827 -Node: Old Extension Problems1110981 -Ref: Old Extension Problems-Footnote-11112489 -Node: Extension New Mechanism Goals1112546 -Ref: Extension New Mechanism Goals-Footnote-11115911 -Node: Extension Other Design Decisions1116097 -Node: Extension Future Growth1118203 -Node: Old Extension Mechanism1119039 -Node: Basic Concepts1120779 -Node: Basic High Level1121460 -Ref: figure-general-flow1121731 -Ref: figure-process-flow1122330 -Ref: Basic High Level-Footnote-11125559 -Node: Basic Data Typing1125744 -Node: Glossary1129099 -Node: Copying1154561 -Node: GNU Free Documentation License1192118 -Node: Index1217255 +Ref: Records-Footnote-1186901 +Node: Fields186938 +Ref: Fields-Footnote-1189971 +Node: Nonconstant Fields190057 +Node: Changing Fields192259 
+Node: Field Separators198218 +Node: Default Field Splitting200920 +Node: Regexp Field Splitting202037 +Node: Single Character Fields205379 +Node: Command Line Field Separator206438 +Node: Full Line Fields209872 +Ref: Full Line Fields-Footnote-1210380 +Node: Field Splitting Summary210426 +Ref: Field Splitting Summary-Footnote-1213525 +Node: Constant Size213626 +Node: Splitting By Content218233 +Ref: Splitting By Content-Footnote-1221982 +Node: Multiple Line222022 +Ref: Multiple Line-Footnote-1227869 +Node: Getline228048 +Node: Plain Getline230264 +Node: Getline/Variable232359 +Node: Getline/File233506 +Node: Getline/Variable/File234847 +Ref: Getline/Variable/File-Footnote-1236446 +Node: Getline/Pipe236533 +Node: Getline/Variable/Pipe239232 +Node: Getline/Coprocess240339 +Node: Getline/Variable/Coprocess241591 +Node: Getline Notes242328 +Node: Getline Summary245115 +Ref: table-getline-variants245523 +Node: Read Timeout246435 +Ref: Read Timeout-Footnote-1250176 +Node: Command line directories250233 +Node: Printing250863 +Node: Print252494 +Node: Print Examples253831 +Node: Output Separators256615 +Node: OFMT258375 +Node: Printf259733 +Node: Basic Printf260639 +Node: Control Letters262178 +Node: Format Modifiers265990 +Node: Printf Examples271999 +Node: Redirection274714 +Node: Special Files281679 +Node: Special FD282212 +Ref: Special FD-Footnote-1285837 +Node: Special Network285911 +Node: Special Caveats286761 +Node: Close Files And Pipes287557 +Ref: Close Files And Pipes-Footnote-1294540 +Ref: Close Files And Pipes-Footnote-2294688 +Node: Expressions294838 +Node: Values295970 +Node: Constants296646 +Node: Scalar Constants297326 +Ref: Scalar Constants-Footnote-1298185 +Node: Nondecimal-numbers298367 +Node: Regexp Constants301367 +Node: Using Constant Regexps301842 +Node: Variables304897 +Node: Using Variables305552 +Node: Assignment Options307276 +Node: Conversion309148 +Ref: table-locale-affects314648 +Ref: Conversion-Footnote-1315272 +Node: All Operators315381 
+Node: Arithmetic Ops316011 +Node: Concatenation318516 +Ref: Concatenation-Footnote-1321308 +Node: Assignment Ops321428 +Ref: table-assign-ops326416 +Node: Increment Ops327747 +Node: Truth Values and Conditions331181 +Node: Truth Values332264 +Node: Typing and Comparison333313 +Node: Variable Typing334106 +Ref: Variable Typing-Footnote-1338003 +Node: Comparison Operators338125 +Ref: table-relational-ops338535 +Node: POSIX String Comparison342083 +Ref: POSIX String Comparison-Footnote-1343039 +Node: Boolean Ops343177 +Ref: Boolean Ops-Footnote-1347255 +Node: Conditional Exp347346 +Node: Function Calls349078 +Node: Precedence352672 +Node: Locales356341 +Node: Patterns and Actions357430 +Node: Pattern Overview358484 +Node: Regexp Patterns360153 +Node: Expression Patterns360696 +Node: Ranges364381 +Node: BEGIN/END367347 +Node: Using BEGIN/END368109 +Ref: Using BEGIN/END-Footnote-1370840 +Node: I/O And BEGIN/END370946 +Node: BEGINFILE/ENDFILE373228 +Node: Empty376142 +Node: Using Shell Variables376458 +Node: Action Overview378743 +Node: Statements381100 +Node: If Statement382954 +Node: While Statement384453 +Node: Do Statement386497 +Node: For Statement387653 +Node: Switch Statement390805 +Node: Break Statement392902 +Node: Continue Statement394892 +Node: Next Statement396685 +Node: Nextfile Statement399075 +Node: Exit Statement401718 +Node: Built-in Variables404134 +Node: User-modified405229 +Ref: User-modified-Footnote-1413587 +Node: Auto-set413649 +Ref: Auto-set-Footnote-1426727 +Ref: Auto-set-Footnote-2426932 +Node: ARGC and ARGV426988 +Node: Arrays430839 +Node: Array Basics432344 +Node: Array Intro433170 +Node: Reference to Elements437487 +Node: Assigning Elements439757 +Node: Array Example440248 +Node: Scanning an Array441980 +Node: Controlling Scanning444294 +Ref: Controlling Scanning-Footnote-1449381 +Node: Delete449697 +Ref: Delete-Footnote-1452462 +Node: Numeric Array Subscripts452519 +Node: Uninitialized Subscripts454702 +Node: Multidimensional456329 +Node: 
Multiscanning459422 +Node: Arrays of Arrays461011 +Node: Functions465651 +Node: Built-in466470 +Node: Calling Built-in467548 +Node: Numeric Functions469536 +Ref: Numeric Functions-Footnote-1473368 +Ref: Numeric Functions-Footnote-2473725 +Ref: Numeric Functions-Footnote-3473773 +Node: String Functions474042 +Ref: String Functions-Footnote-1496962 +Ref: String Functions-Footnote-2497091 +Ref: String Functions-Footnote-3497339 +Node: Gory Details497426 +Ref: table-sub-escapes499105 +Ref: table-sub-posix-92500459 +Ref: table-sub-proposed501810 +Ref: table-posix-sub503164 +Ref: table-gensub-escapes504709 +Ref: Gory Details-Footnote-1505885 +Ref: Gory Details-Footnote-2505936 +Node: I/O Functions506087 +Ref: I/O Functions-Footnote-1513072 +Node: Time Functions513219 +Ref: Time Functions-Footnote-1524152 +Ref: Time Functions-Footnote-2524220 +Ref: Time Functions-Footnote-3524378 +Ref: Time Functions-Footnote-4524489 +Ref: Time Functions-Footnote-5524601 +Ref: Time Functions-Footnote-6524828 +Node: Bitwise Functions525094 +Ref: table-bitwise-ops525656 +Ref: Bitwise Functions-Footnote-1529877 +Node: Type Functions530061 +Node: I18N Functions531212 +Node: User-defined532839 +Node: Definition Syntax533643 +Ref: Definition Syntax-Footnote-1538553 +Node: Function Example538622 +Node: Function Caveats541216 +Node: Calling A Function541637 +Node: Variable Scope542752 +Node: Pass By Value/Reference545715 +Node: Return Statement549223 +Node: Dynamic Typing552204 +Node: Indirect Calls553135 +Node: Library Functions562820 +Ref: Library Functions-Footnote-1566333 +Ref: Library Functions-Footnote-2566476 +Node: Library Names566647 +Ref: Library Names-Footnote-1570118 +Ref: Library Names-Footnote-2570338 +Node: General Functions570424 +Node: Strtonum Function571452 +Node: Assert Function574382 +Node: Round Function577708 +Node: Cliff Random Function579251 +Node: Ordinal Functions580267 +Ref: Ordinal Functions-Footnote-1583339 +Ref: Ordinal Functions-Footnote-2583591 +Node: Join 
Function583800 +Ref: Join Function-Footnote-1585571 +Node: Getlocaltime Function585771 +Node: Readfile Function589512 +Node: Data File Management591351 +Node: Filetrans Function591983 +Node: Rewind Function596052 +Node: File Checking597439 +Node: Empty Files598533 +Node: Ignoring Assigns600763 +Node: Getopt Function602316 +Ref: Getopt Function-Footnote-1613619 +Node: Passwd Functions613822 +Ref: Passwd Functions-Footnote-1622797 +Node: Group Functions622885 +Node: Walking Arrays630969 +Node: Sample Programs633106 +Node: Running Examples633780 +Node: Clones634508 +Node: Cut Program635732 +Node: Egrep Program645577 +Ref: Egrep Program-Footnote-1653350 +Node: Id Program653460 +Node: Split Program657076 +Ref: Split Program-Footnote-1660595 +Node: Tee Program660723 +Node: Uniq Program663526 +Node: Wc Program670955 +Ref: Wc Program-Footnote-1675221 +Ref: Wc Program-Footnote-2675421 +Node: Miscellaneous Programs675513 +Node: Dupword Program676701 +Node: Alarm Program678732 +Node: Translate Program683485 +Ref: Translate Program-Footnote-1687872 +Ref: Translate Program-Footnote-2688120 +Node: Labels Program688254 +Ref: Labels Program-Footnote-1691625 +Node: Word Sorting691709 +Node: History Sorting695593 +Node: Extract Program697432 +Ref: Extract Program-Footnote-1704935 +Node: Simple Sed705063 +Node: Igawk Program708125 +Ref: Igawk Program-Footnote-1723282 +Ref: Igawk Program-Footnote-2723483 +Node: Anagram Program723621 +Node: Signature Program726689 +Node: Advanced Features727789 +Node: Nondecimal Data729675 +Node: Array Sorting731258 +Node: Controlling Array Traversal731955 +Node: Array Sorting Functions740239 +Ref: Array Sorting Functions-Footnote-1744108 +Node: Two-way I/O744302 +Ref: Two-way I/O-Footnote-1749734 +Node: TCP/IP Networking749804 +Node: Profiling752648 +Node: Internationalization760145 +Node: I18N and L10N761570 +Node: Explaining gettext762256 +Ref: Explaining gettext-Footnote-1767324 +Ref: Explaining gettext-Footnote-2767508 +Node: Programmer i18n767673 
+Node: Translator i18n771875 +Node: String Extraction772668 +Ref: String Extraction-Footnote-1773629 +Node: Printf Ordering773715 +Ref: Printf Ordering-Footnote-1776499 +Node: I18N Portability776563 +Ref: I18N Portability-Footnote-1779012 +Node: I18N Example779075 +Ref: I18N Example-Footnote-1781713 +Node: Gawk I18N781785 +Node: Debugger782406 +Node: Debugging783377 +Node: Debugging Concepts783810 +Node: Debugging Terms785666 +Node: Awk Debugging788263 +Node: Sample Debugging Session789155 +Node: Debugger Invocation789675 +Node: Finding The Bug791007 +Node: List of Debugger Commands797495 +Node: Breakpoint Control798829 +Node: Debugger Execution Control802493 +Node: Viewing And Changing Data805853 +Node: Execution Stack809209 +Node: Debugger Info810676 +Node: Miscellaneous Debugger Commands814658 +Node: Readline Support819834 +Node: Limitations820665 +Node: Arbitrary Precision Arithmetic822917 +Ref: Arbitrary Precision Arithmetic-Footnote-1824566 +Node: General Arithmetic824714 +Node: Floating Point Issues826434 +Node: String Conversion Precision827315 +Ref: String Conversion Precision-Footnote-1829020 +Node: Unexpected Results829129 +Node: POSIX Floating Point Problems831282 +Ref: POSIX Floating Point Problems-Footnote-1835107 +Node: Integer Programming835145 +Node: Floating-point Programming836884 +Ref: Floating-point Programming-Footnote-1843215 +Ref: Floating-point Programming-Footnote-2843485 +Node: Floating-point Representation843749 +Node: Floating-point Context844914 +Ref: table-ieee-formats845753 +Node: Rounding Mode847137 +Ref: table-rounding-modes847616 +Ref: Rounding Mode-Footnote-1850631 +Node: Gawk and MPFR850810 +Node: Arbitrary Precision Floats852065 +Ref: Arbitrary Precision Floats-Footnote-1854508 +Node: Setting Precision854824 +Ref: table-predefined-precision-strings855510 +Node: Setting Rounding Mode857655 +Ref: table-gawk-rounding-modes858059 +Node: Floating-point Constants859246 +Node: Changing Precision860675 +Ref: Changing 
Precision-Footnote-1862072 +Node: Exact Arithmetic862246 +Node: Arbitrary Precision Integers865384 +Ref: Arbitrary Precision Integers-Footnote-1868402 +Node: Dynamic Extensions868549 +Node: Extension Intro870007 +Node: Plugin License871272 +Node: Extension Mechanism Outline871957 +Ref: load-extension872374 +Ref: load-new-function873852 +Ref: call-new-function874847 +Node: Extension API Description876862 +Node: Extension API Functions Introduction878075 +Node: General Data Types882941 +Ref: General Data Types-Footnote-1888543 +Node: Requesting Values888842 +Ref: table-value-types-returned889573 +Node: Constructor Functions890527 +Node: Registration Functions893547 +Node: Extension Functions894232 +Node: Exit Callback Functions896457 +Node: Extension Version String897706 +Node: Input Parsers898356 +Node: Output Wrappers908113 +Node: Two-way processors912623 +Node: Printing Messages914831 +Ref: Printing Messages-Footnote-1915908 +Node: Updating `ERRNO'916060 +Node: Accessing Parameters916799 +Node: Symbol Table Access918029 +Node: Symbol table by name918541 +Node: Symbol table by cookie920288 +Ref: Symbol table by cookie-Footnote-1924418 +Node: Cached values924481 +Ref: Cached values-Footnote-1927930 +Node: Array Manipulation928021 +Ref: Array Manipulation-Footnote-1929119 +Node: Array Data Types929158 +Ref: Array Data Types-Footnote-1931861 +Node: Array Functions931953 +Node: Flattening Arrays935719 +Node: Creating Arrays942571 +Node: Extension API Variables947296 +Node: Extension Versioning947932 +Node: Extension API Informational Variables949833 +Node: Extension API Boilerplate950919 +Node: Finding Extensions954723 +Node: Extension Example955283 +Node: Internal File Description956013 +Node: Internal File Ops960104 +Ref: Internal File Ops-Footnote-1971612 +Node: Using Internal File Ops971752 +Ref: Using Internal File Ops-Footnote-1974105 +Node: Extension Samples974371 +Node: Extension Sample File Functions975895 +Node: Extension Sample Fnmatch984380 +Node: Extension 
Sample Fork986106 +Node: Extension Sample Inplace987324 +Node: Extension Sample Ord989102 +Node: Extension Sample Readdir989938 +Node: Extension Sample Revout991470 +Node: Extension Sample Rev2way992063 +Node: Extension Sample Read write array992753 +Node: Extension Sample Readfile994636 +Node: Extension Sample API Tests995454 +Node: Extension Sample Time995979 +Node: gawkextlib997343 +Node: Language History1000124 +Node: V7/SVR3.11001717 +Node: SVR41004037 +Node: POSIX1005479 +Node: BTL1006865 +Node: POSIX/GNU1007599 +Node: Feature History1013198 +Node: Common Extensions1026162 +Node: Ranges and Locales1027474 +Ref: Ranges and Locales-Footnote-11032092 +Ref: Ranges and Locales-Footnote-21032119 +Ref: Ranges and Locales-Footnote-31032379 +Node: Contributors1032600 +Node: Installation1037745 +Node: Gawk Distribution1038639 +Node: Getting1039123 +Node: Extracting1039949 +Node: Distribution contents1041641 +Node: Unix Installation1047346 +Node: Quick Installation1047963 +Node: Additional Configuration Options1050407 +Node: Configuration Philosophy1052143 +Node: Non-Unix Installation1054497 +Node: PC Installation1054955 +Node: PC Binary Installation1056254 +Node: PC Compiling1058102 +Node: PC Testing1061046 +Node: PC Using1062222 +Node: Cygwin1066407 +Node: MSYS1067407 +Node: VMS Installation1067921 +Node: VMS Compilation1068685 +Ref: VMS Compilation-Footnote-11070300 +Node: VMS Dynamic Extensions1070358 +Node: VMS Installation Details1071731 +Node: VMS Running1073978 +Node: VMS GNV1076812 +Node: VMS Old Gawk1077535 +Node: Bugs1078005 +Node: Other Versions1081923 +Node: Notes1088007 +Node: Compatibility Mode1088807 +Node: Additions1089590 +Node: Accessing The Source1090517 +Node: Adding Code1091957 +Node: New Ports1098002 +Node: Derived Files1102137 +Ref: Derived Files-Footnote-11107458 +Ref: Derived Files-Footnote-21107492 +Ref: Derived Files-Footnote-31108092 +Node: Future Extensions1108190 +Node: Implementation Limitations1108773 +Node: Extension Design1110025 
+Node: Old Extension Problems1111179
+Ref: Old Extension Problems-Footnote-11112687
+Node: Extension New Mechanism Goals1112744
+Ref: Extension New Mechanism Goals-Footnote-11116109
+Node: Extension Other Design Decisions1116295
+Node: Extension Future Growth1118401
+Node: Old Extension Mechanism1119237
+Node: Basic Concepts1120977
+Node: Basic High Level1121658
+Ref: figure-general-flow1121929
+Ref: figure-process-flow1122528
+Ref: Basic High Level-Footnote-11125757
+Node: Basic Data Typing1125942
+Node: Glossary1129297
+Node: Copying1154759
+Node: GNU Free Documentation License1192316
+Node: Index1217453
End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 1dd75e51..63489dae 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -3289,8 +3289,8 @@ The following list describes @command{gawk}-specific options: @table @code @item -b @itemx --characters-as-bytes
-@cindex @code{-b} option
-@cindex @code{--characters-as-bytes} option
+@cindex @option{-b} option
+@cindex @option{--characters-as-bytes} option
Cause @command{gawk} to treat all input data as single-byte characters. In addition, all output written with @code{print} or @code{printf} are treated as single-byte characters.
@@ -3304,8 +3304,8 @@ multibyte characters. This option is an easy way to tell @command{gawk}: @item -c @itemx --traditional
-@cindex @code{-c} option
-@cindex @code{--traditional} option
+@cindex @option{-c} option
+@cindex @option{--traditional} option
@cindex compatibility mode (@command{gawk}), specifying Specify @dfn{compatibility mode}, in which the GNU extensions to the @command{awk} language are disabled, so that @command{gawk} behaves just
@@ -3316,17 +3316,17 @@ which summarizes the extensions. Also see @item -C @itemx --copyright
-@cindex @code{-C} option
-@cindex @code{--copyright} option
+@cindex @option{-C} option
+@cindex @option{--copyright} option
@cindex GPL (General Public License), printing Print the short version of the General Public License and then exit.
@item -d@r{[}@var{file}@r{]} @itemx --dump-variables@r{[}=@var{file}@r{]}
-@cindex @code{-d} option
-@cindex @code{--dump-variables} option
-@cindex @code{awkvars.out} file
-@cindex files, @code{awkvars.out}
+@cindex @option{-d} option
+@cindex @option{--dump-variables} option
+@cindex @file{awkvars.out} file
+@cindex files, @file{awkvars.out}
@cindex variables, global, printing list of Print a sorted list of global variables, their types, and final values to @var{file}. If no @var{file} is provided, print this
@@ -3345,8 +3345,8 @@ names like @code{i}, @code{j}, etc.) @item -D@r{[}@var{file}@r{]} @itemx --debug=@r{[}@var{file}@r{]}
-@cindex @code{-D} option
-@cindex @code{--debug} option
+@cindex @option{-D} option
+@cindex @option{--debug} option
@cindex @command{awk} debugging, enabling Enable debugging of @command{awk} programs (@pxref{Debugging}).
@@ -3358,8 +3358,8 @@ No space is allowed between the @option{-D} and @var{file}, if @item -e @var{program-text} @itemx --source @var{program-text}
-@cindex @code{-e} option
-@cindex @code{--source} option
+@cindex @option{-e} option
+@cindex @option{--source} option
@cindex source code, mixing Provide program source code in the @var{program-text}. This option allows you to mix source code in files with source
@@ -3370,8 +3370,8 @@ programs (@pxref{AWKPATH Variable}). @item -E @var{file} @itemx --exec @var{file}
-@cindex @code{-E} option
-@cindex @code{--exec} option
+@cindex @option{-E} option
+@cindex @option{--exec} option
@cindex @command{awk} programs, location of @cindex CGI, @command{awk} scripts for Similar to @option{-f}, read @command{awk} program text from @var{file}.
@@ -3401,8 +3401,8 @@ with @samp{#!} scripts (@pxref{Executable Scripts}), like so: @item -g @itemx --gen-pot
-@cindex @code{-g} option
-@cindex @code{--gen-pot} option
+@cindex @option{-g} option
+@cindex @option{--gen-pot} option
@cindex portable object files, generating @cindex files, portable object, generating Analyze the source program and
@@ -3413,8 +3413,8 @@ for information about this option. @item -h @itemx --help
-@cindex @code{-h} option
-@cindex @code{--help} option
+@cindex @option{-h} option
+@cindex @option{--help} option
@cindex GNU long options, printing list of @cindex options, printing list of @cindex printing, list of options
@@ -3439,8 +3439,8 @@ find the main source code via the @option{-f} option or on the command-line. @item -l @var{lib} @itemx --load @var{lib}
-@cindex @code{-l} option
-@cindex @code{--load} option
+@cindex @option{-l} option
+@cindex @option{--load} option
@cindex loading, library Load a shared library @var{lib}. This searches for the library using the @env{AWKLIBPATH} environment variable. The correct library suffix for your platform will be
@@ -3451,8 +3451,8 @@ a shared library. @item -L @r{[}value@r{]} @itemx --lint@r{[}=value@r{]}
-@cindex @code{-l} option
-@cindex @code{--lint} option
+@cindex @option{-l} option
+@cindex @option{--lint} option
@cindex lint checking, issuing warnings @cindex warnings, issuing Warn about constructs that are dubious or nonportable to
@@ -3474,16 +3474,16 @@ care to search for all occurrences of each inappropriate construct. As @item -M @itemx --bignum
-@cindex @code{-M} option
-@cindex @code{--bignum} option
+@cindex @option{-M} option
+@cindex @option{--bignum} option
Force arbitrary precision arithmetic on numbers. This option has no effect if @command{gawk} is not compiled to use the GNU MPFR and MP libraries (@pxref{Arbitrary Precision Arithmetic}).
@item -n @itemx --non-decimal-data
-@cindex @code{-n} option
-@cindex @code{--non-decimal-data} option
+@cindex @option{-n} option
+@cindex @option{--non-decimal-data} option
@cindex hexadecimal values@comma{} enabling interpretation of @cindex octal values@comma{} enabling interpretation of @cindex troubleshooting, @code{--non-decimal-data} option
@@ -3498,15 +3498,15 @@ Use with care. @item -N @itemx --use-lc-numeric
-@cindex @code{-N} option
-@cindex @code{--use-lc-numeric} option
+@cindex @option{-N} option
+@cindex @option{--use-lc-numeric} option
Force the use of the locale's decimal point character when parsing numeric input data (@pxref{Locales}). @item -o@r{[}@var{file}@r{]} @itemx --pretty-print@r{[}=@var{file}@r{]}
-@cindex @code{-o} option
-@cindex @code{--pretty-print} option
+@cindex @option{-o} option
+@cindex @option{--pretty-print} option
Enable pretty-printing of @command{awk} programs. By default, output program is created in a file named @file{awkprof.out}. The optional @var{file} argument allows you to specify a different
@@ -3516,16 +3516,16 @@ No space is allowed between the @option{-o} and @var{file}, if @item -O @itemx --optimize
-@cindex @code{--optimize} option
-@cindex @code{-O} option
+@cindex @option{--optimize} option
+@cindex @option{-O} option
Enable some optimizations on the internal representation of the program. At the moment this includes just simple constant folding. The @command{gawk} maintainer hopes to add more optimizations over time. @item -p@r{[}@var{file}@r{]} @itemx --profile@r{[}=@var{file}@r{]}
-@cindex @code{-p} option
-@cindex @code{--profile} option
+@cindex @option{-p} option
+@cindex @option{--profile} option
@cindex @command{awk} profiling, enabling Enable profiling of @command{awk} programs (@pxref{Profiling}).
@@ -3540,8 +3540,8 @@ in the left margin, and function call counts for each function.
@item -P @itemx --posix
-@cindex @code{-P} option
-@cindex @code{--posix} option
+@cindex @option{-P} option
+@cindex @option{--posix} option
@cindex POSIX mode @cindex @command{gawk}, extensions@comma{} disabling Operate in strict POSIX mode. This disables all @command{gawk}
@@ -3590,8 +3590,8 @@ also issues a warning if both options are supplied. @item -r @itemx --re-interval
-@cindex @code{-r} option
-@cindex @code{--re-interval} option
+@cindex @option{-r} option
+@cindex @option{--re-interval} option
@cindex regular expressions, interval expressions and Allow interval expressions (@pxref{Regexp Operators})
@@ -3602,8 +3602,8 @@ and for use in combination with the @option{--traditional} option. @item -S @itemx --sandbox
-@cindex @code{-S} option
-@cindex @code{--sandbox} option
+@cindex @option{-S} option
+@cindex @option{--sandbox} option
@cindex sandbox mode Disable the @code{system()} function, input redirections with @code{getline},
@@ -3615,16 +3615,16 @@ can't access your system (other than the specified input data file). @item -t @itemx --lint-old
-@cindex @code{-L} option
-@cindex @code{--lint-old} option
+@cindex @option{-L} option
+@cindex @option{--lint-old} option
Warn about constructs that are not available in the original version of @command{awk} from Version 7 Unix (@pxref{V7/SVR3.1}). @item -V @itemx --version
-@cindex @code{-V} option
-@cindex @code{--version} option
+@cindex @option{-V} option
+@cindex @option{--version} option
@cindex @command{gawk}, versions of, information about@comma{} printing Print version information for this particular copy of @command{gawk}. This allows you to determine if your copy of @command{gawk} is up to date
@@ -5043,8 +5043,8 @@ These sequences are: @item Collating symbols Multicharacter collating elements enclosed between @samp{[.} and @samp{.]}.
For example, if @samp{ch} is a collating element,
-then @code{[[.ch.]]} is a regexp that matches this collating element, whereas
-@code{[ch]} is a regexp that matches either @samp{c} or @samp{h}.
+then @samp{[[.ch.]]} is a regexp that matches this collating element, whereas
+@samp{[ch]} is a regexp that matches either @samp{c} or @samp{h}.
@cindex bracket expressions, equivalence classes @item Equivalence classes
@@ -5052,7 +5052,7 @@ Locale-specific names for a list of characters that are equal. The name is enclosed between @samp{[=} and @samp{=]}. For example, the name @samp{e} might be used to represent all of
-``e,'' ``@`e,'' and ``@'e.'' In this case, @code{[[=e=]]} is a regexp
+``e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
that matches any of @samp{e}, @samp{@'e}, or @samp{@`e}. @end table
@@ -5096,7 +5096,7 @@ or underscores (@samp{_}): @item \s Matches any whitespace character. Think of it as shorthand for
-@w{@code{[[:space:]]}}.
+@w{@samp{[[:space:]]}}.
@c @cindex operators, @code{\S} (@command{gawk}) @cindex backslash (@code{\}), @code{\S} operator (@command{gawk})
@@ -5104,7 +5104,7 @@ Think of it as shorthand for @item \S Matches any character that is not whitespace. Think of it as shorthand for
-@w{@code{[^[:space:]]}}.
+@w{@samp{[^[:space:]]}}.
@c @cindex operators, @code{\w} (@command{gawk}) @cindex backslash (@code{\}), @code{\w} operator (@command{gawk})
@@ -5112,7 +5112,7 @@ Think of it as shorthand for @item \w Matches any word-constituent character---that is, it matches any letter, digit, or underscore. Think of it as shorthand for
-@w{@code{[[:alnum:]_]}}.
+@w{@samp{[[:alnum:]_]}}.
@c @cindex operators, @code{\W} (@command{gawk}) @cindex backslash (@code{\}), @code{\W} operator (@command{gawk})
@@ -5120,7 +5120,7 @@ letter, digit, or underscore. Think of it as shorthand for @item \W Matches any character that is not word-constituent. Think of it as shorthand for
-@w{@code{[^[:alnum:]_]}}.
+@w{@samp{[^[:alnum:]_]}}.
@c @cindex operators, @code{\<} (@command{gawk}) @cindex backslash (@code{\}), @code{\<} operator (@command{gawk})
@@ -5231,7 +5231,7 @@ are allowed. @item @code{--traditional} Traditional Unix @command{awk} regexps are matched. The GNU operators are not special, and interval expressions are not available.
-The POSIX character classes (@code{[[:alnum:]]}, etc.) are supported,
+The POSIX character classes (@samp{[[:alnum:]]}, etc.) are supported,
as Brian Kernighan's @command{awk} does support them. Characters described by octal and hexadecimal escape sequences are treated literally, even if they represent regexp metacharacters.
@@ -5857,21 +5857,27 @@ BEGIN @{ RS = "\0" @} # whole file becomes one record? @command{gawk} in fact accepts this, and uses the @sc{nul} character for the record separator. However, this usage is @emph{not} portable
-to other @command{awk} implementations.
+to most other @command{awk} implementations.
@cindex dark corner, strings, storing
-All other @command{awk} implementations@footnote{At least that we know
+Almost all other @command{awk} implementations@footnote{At least that we know
about.} store strings internally as C-style strings. C strings use the @sc{nul} character as the string terminator. In effect, this means that @samp{RS = "\0"} is the same as @samp{RS = ""}. @value{DARKCORNER}
+It happens that recent versions of @command{mawk} can use the @sc{nul}
+character as a record separator. However, this is a special case:
+@command{mawk} does not allow embedded @sc{nul} characters in strings.
+
@cindex records, treating files as @cindex files, as single records The best way to treat a whole file as a single record is to simply read the file in, one record at a time, concatenating each record onto the end of the previous ones.
+@c @strong{FIXME}: Using @sc{nul} is good for @file{/proc/environ} etc.
+
@docbook </sidebar> @end docbook
@@ -5902,20 +5908,26 @@ BEGIN @{ RS = "\0" @} # whole file becomes one record?
@command{gawk} in fact accepts this, and uses the @sc{nul} character for the record separator. However, this usage is @emph{not} portable
-to other @command{awk} implementations.
+to most other @command{awk} implementations.
@cindex dark corner, strings, storing
-All other @command{awk} implementations@footnote{At least that we know
+Almost all other @command{awk} implementations@footnote{At least that we know
about.} store strings internally as C-style strings. C strings use the @sc{nul} character as the string terminator. In effect, this means that @samp{RS = "\0"} is the same as @samp{RS = ""}. @value{DARKCORNER}
+It happens that recent versions of @command{mawk} can use the @sc{nul}
+character as a record separator. However, this is a special case:
+@command{mawk} does not allow embedded @sc{nul} characters in strings.
+
@cindex records, treating files as @cindex files, as single records The best way to treat a whole file as a single record is to simply read the file in, one record at a time, concatenating each record onto the end of the previous ones.
+
+@c @strong{FIXME}: Using @sc{nul} is good for @file{/proc/environ} etc.
@end cartouche @end ifnotdocbook @c ENDOFRANGE inspl
@@ -10105,7 +10117,7 @@ point when reading the @command{awk} program source code, and for command-line variable assignments (@pxref{Other Arguments}). However, when interpreting input data, for @code{print} and @code{printf} output, and for number to string conversion, the local decimal point character is used.
-@value{DARKCORNER}.
+@value{DARKCORNER}
Here are some examples indicating the difference in behavior, on a GNU/Linux system:
@@ -34088,7 +34100,7 @@ The option for raw sockets was removed, since it was never implemented (@pxref{TCP/IP Networking}).
@item
-Ranges of the form @code{[d-h]} are treated as if they were in the
+Ranges of the form @samp{[d-h]} are treated as if they were in the
C locale, no matter what kind of regexp is being used, and even if @option{--posix} (@pxref{Ranges and Locales}).
@@ -34296,7 +34308,7 @@ When @command{gawk} switched to using locale-aware regexp matchers, the problems began; especially as both GNU/Linux and commercial Unix vendors started implementing non-ASCII locales, @emph{and making them the default}. Perhaps the most frequently asked question became something
-like ``why does @code{[A-Z]} match lowercase letters?!?''
+like ``why does @samp{[A-Z]} match lowercase letters?!?''
This situation existed for close to 10 years, if not more, and the @command{gawk} maintainer grew weary of trying to explain that
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index e970d9a0..71f960ab 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -3217,8 +3217,8 @@ The following list describes @command{gawk}-specific options: @table @code @item -b @itemx --characters-as-bytes
-@cindex @code{-b} option
-@cindex @code{--characters-as-bytes} option
+@cindex @option{-b} option
+@cindex @option{--characters-as-bytes} option
Cause @command{gawk} to treat all input data as single-byte characters. In addition, all output written with @code{print} or @code{printf} are treated as single-byte characters.
@@ -3232,8 +3232,8 @@ multibyte characters. This option is an easy way to tell @command{gawk}: @item -c @itemx --traditional
-@cindex @code{-c} option
-@cindex @code{--traditional} option
+@cindex @option{-c} option
+@cindex @option{--traditional} option
@cindex compatibility mode (@command{gawk}), specifying Specify @dfn{compatibility mode}, in which the GNU extensions to the @command{awk} language are disabled, so that @command{gawk} behaves just
@@ -3244,17 +3244,17 @@ which summarizes the extensions.
Also see @item -C @itemx --copyright
-@cindex @code{-C} option
-@cindex @code{--copyright} option
+@cindex @option{-C} option
+@cindex @option{--copyright} option
@cindex GPL (General Public License), printing Print the short version of the General Public License and then exit. @item -d@r{[}@var{file}@r{]} @itemx --dump-variables@r{[}=@var{file}@r{]}
-@cindex @code{-d} option
-@cindex @code{--dump-variables} option
-@cindex @code{awkvars.out} file
-@cindex files, @code{awkvars.out}
+@cindex @option{-d} option
+@cindex @option{--dump-variables} option
+@cindex @file{awkvars.out} file
+@cindex files, @file{awkvars.out}
@cindex variables, global, printing list of Print a sorted list of global variables, their types, and final values to @var{file}. If no @var{file} is provided, print this
@@ -3273,8 +3273,8 @@ names like @code{i}, @code{j}, etc.) @item -D@r{[}@var{file}@r{]} @itemx --debug=@r{[}@var{file}@r{]}
-@cindex @code{-D} option
-@cindex @code{--debug} option
+@cindex @option{-D} option
+@cindex @option{--debug} option
@cindex @command{awk} debugging, enabling Enable debugging of @command{awk} programs (@pxref{Debugging}).
@@ -3286,8 +3286,8 @@ No space is allowed between the @option{-D} and @var{file}, if @item -e @var{program-text} @itemx --source @var{program-text}
-@cindex @code{-e} option
-@cindex @code{--source} option
+@cindex @option{-e} option
+@cindex @option{--source} option
@cindex source code, mixing Provide program source code in the @var{program-text}. This option allows you to mix source code in files with source
@@ -3298,8 +3298,8 @@ programs (@pxref{AWKPATH Variable}). @item -E @var{file} @itemx --exec @var{file}
-@cindex @code{-E} option
-@cindex @code{--exec} option
+@cindex @option{-E} option
+@cindex @option{--exec} option
@cindex @command{awk} programs, location of @cindex CGI, @command{awk} scripts for Similar to @option{-f}, read @command{awk} program text from @var{file}.
@@ -3329,8 +3329,8 @@ with @samp{#!} scripts (@pxref{Executable Scripts}), like so: @item -g @itemx --gen-pot
-@cindex @code{-g} option
-@cindex @code{--gen-pot} option
+@cindex @option{-g} option
+@cindex @option{--gen-pot} option
@cindex portable object files, generating @cindex files, portable object, generating Analyze the source program and
@@ -3341,8 +3341,8 @@ for information about this option. @item -h @itemx --help
-@cindex @code{-h} option
-@cindex @code{--help} option
+@cindex @option{-h} option
+@cindex @option{--help} option
@cindex GNU long options, printing list of @cindex options, printing list of @cindex printing, list of options
@@ -3367,8 +3367,8 @@ find the main source code via the @option{-f} option or on the command-line. @item -l @var{lib} @itemx --load @var{lib}
-@cindex @code{-l} option
-@cindex @code{--load} option
+@cindex @option{-l} option
+@cindex @option{--load} option
@cindex loading, library Load a shared library @var{lib}. This searches for the library using the @env{AWKLIBPATH} environment variable. The correct library suffix for your platform will be
@@ -3379,8 +3379,8 @@ a shared library. @item -L @r{[}value@r{]} @itemx --lint@r{[}=value@r{]}
-@cindex @code{-l} option
-@cindex @code{--lint} option
+@cindex @option{-l} option
+@cindex @option{--lint} option
@cindex lint checking, issuing warnings @cindex warnings, issuing Warn about constructs that are dubious or nonportable to
@@ -3402,16 +3402,16 @@ care to search for all occurrences of each inappropriate construct. As @item -M @itemx --bignum
-@cindex @code{-M} option
-@cindex @code{--bignum} option
+@cindex @option{-M} option
+@cindex @option{--bignum} option
Force arbitrary precision arithmetic on numbers. This option has no effect if @command{gawk} is not compiled to use the GNU MPFR and MP libraries (@pxref{Arbitrary Precision Arithmetic}).
@item -n @itemx --non-decimal-data
-@cindex @code{-n} option
-@cindex @code{--non-decimal-data} option
+@cindex @option{-n} option
+@cindex @option{--non-decimal-data} option
@cindex hexadecimal values@comma{} enabling interpretation of @cindex octal values@comma{} enabling interpretation of @cindex troubleshooting, @code{--non-decimal-data} option
@@ -3426,15 +3426,15 @@ Use with care. @item -N @itemx --use-lc-numeric
-@cindex @code{-N} option
-@cindex @code{--use-lc-numeric} option
+@cindex @option{-N} option
+@cindex @option{--use-lc-numeric} option
Force the use of the locale's decimal point character when parsing numeric input data (@pxref{Locales}). @item -o@r{[}@var{file}@r{]} @itemx --pretty-print@r{[}=@var{file}@r{]}
-@cindex @code{-o} option
-@cindex @code{--pretty-print} option
+@cindex @option{-o} option
+@cindex @option{--pretty-print} option
Enable pretty-printing of @command{awk} programs. By default, output program is created in a file named @file{awkprof.out}. The optional @var{file} argument allows you to specify a different
@@ -3444,16 +3444,16 @@ No space is allowed between the @option{-o} and @var{file}, if @item -O @itemx --optimize
-@cindex @code{--optimize} option
-@cindex @code{-O} option
+@cindex @option{--optimize} option
+@cindex @option{-O} option
Enable some optimizations on the internal representation of the program. At the moment this includes just simple constant folding. The @command{gawk} maintainer hopes to add more optimizations over time. @item -p@r{[}@var{file}@r{]} @itemx --profile@r{[}=@var{file}@r{]}
-@cindex @code{-p} option
-@cindex @code{--profile} option
+@cindex @option{-p} option
+@cindex @option{--profile} option
@cindex @command{awk} profiling, enabling Enable profiling of @command{awk} programs (@pxref{Profiling}).
@@ -3468,8 +3468,8 @@ in the left margin, and function call counts for each function.
@item -P @itemx --posix
-@cindex @code{-P} option
-@cindex @code{--posix} option
+@cindex @option{-P} option
+@cindex @option{--posix} option
@cindex POSIX mode @cindex @command{gawk}, extensions@comma{} disabling Operate in strict POSIX mode. This disables all @command{gawk}
@@ -3518,8 +3518,8 @@ also issues a warning if both options are supplied. @item -r @itemx --re-interval
-@cindex @code{-r} option
-@cindex @code{--re-interval} option
+@cindex @option{-r} option
+@cindex @option{--re-interval} option
@cindex regular expressions, interval expressions and Allow interval expressions (@pxref{Regexp Operators})
@@ -3530,8 +3530,8 @@ and for use in combination with the @option{--traditional} option. @item -S @itemx --sandbox
-@cindex @code{-S} option
-@cindex @code{--sandbox} option
+@cindex @option{-S} option
+@cindex @option{--sandbox} option
@cindex sandbox mode Disable the @code{system()} function, input redirections with @code{getline},
@@ -3543,16 +3543,16 @@ can't access your system (other than the specified input data file). @item -t @itemx --lint-old
-@cindex @code{-L} option
-@cindex @code{--lint-old} option
+@cindex @option{-L} option
+@cindex @option{--lint-old} option
Warn about constructs that are not available in the original version of @command{awk} from Version 7 Unix (@pxref{V7/SVR3.1}). @item -V @itemx --version
-@cindex @code{-V} option
-@cindex @code{--version} option
+@cindex @option{-V} option
+@cindex @option{--version} option
@cindex @command{gawk}, versions of, information about@comma{} printing Print version information for this particular copy of @command{gawk}. This allows you to determine if your copy of @command{gawk} is up to date
@@ -4890,8 +4890,8 @@ These sequences are: @item Collating symbols Multicharacter collating elements enclosed between @samp{[.} and @samp{.]}.
For example, if @samp{ch} is a collating element,
-then @code{[[.ch.]]} is a regexp that matches this collating element, whereas
-@code{[ch]} is a regexp that matches either @samp{c} or @samp{h}.
+then @samp{[[.ch.]]} is a regexp that matches this collating element, whereas
+@samp{[ch]} is a regexp that matches either @samp{c} or @samp{h}.
@cindex bracket expressions, equivalence classes @item Equivalence classes
@@ -4899,7 +4899,7 @@ Locale-specific names for a list of characters that are equal. The name is enclosed between @samp{[=} and @samp{=]}. For example, the name @samp{e} might be used to represent all of
-``e,'' ``@`e,'' and ``@'e.'' In this case, @code{[[=e=]]} is a regexp
+``e,'' ``@`e,'' and ``@'e.'' In this case, @samp{[[=e=]]} is a regexp
that matches any of @samp{e}, @samp{@'e}, or @samp{@`e}. @end table
@@ -4943,7 +4943,7 @@ or underscores (@samp{_}): @item \s Matches any whitespace character. Think of it as shorthand for
-@w{@code{[[:space:]]}}.
+@w{@samp{[[:space:]]}}.
@c @cindex operators, @code{\S} (@command{gawk}) @cindex backslash (@code{\}), @code{\S} operator (@command{gawk})
@@ -4951,7 +4951,7 @@ Think of it as shorthand for @item \S Matches any character that is not whitespace. Think of it as shorthand for
-@w{@code{[^[:space:]]}}.
+@w{@samp{[^[:space:]]}}.
@c @cindex operators, @code{\w} (@command{gawk}) @cindex backslash (@code{\}), @code{\w} operator (@command{gawk})
@@ -4959,7 +4959,7 @@ Think of it as shorthand for @item \w Matches any word-constituent character---that is, it matches any letter, digit, or underscore. Think of it as shorthand for
-@w{@code{[[:alnum:]_]}}.
+@w{@samp{[[:alnum:]_]}}.
@c @cindex operators, @code{\W} (@command{gawk}) @cindex backslash (@code{\}), @code{\W} operator (@command{gawk})
@@ -4967,7 +4967,7 @@ letter, digit, or underscore. Think of it as shorthand for @item \W Matches any character that is not word-constituent. Think of it as shorthand for
-@w{@code{[^[:alnum:]_]}}.
+@w{@samp{[^[:alnum:]_]}}.
@c @cindex operators, @code{\<} (@command{gawk}) @cindex backslash (@code{\}), @code{\<} operator (@command{gawk})
@@ -5078,7 +5078,7 @@ are allowed. @item @code{--traditional} Traditional Unix @command{awk} regexps are matched. The GNU operators are not special, and interval expressions are not available.
-The POSIX character classes (@code{[[:alnum:]]}, etc.) are supported,
+The POSIX character classes (@samp{[[:alnum:]]}, etc.) are supported,
as Brian Kernighan's @command{awk} does support them. Characters described by octal and hexadecimal escape sequences are treated literally, even if they represent regexp metacharacters.
@@ -5655,20 +5655,26 @@ BEGIN @{ RS = "\0" @} # whole file becomes one record? @command{gawk} in fact accepts this, and uses the @sc{nul} character for the record separator. However, this usage is @emph{not} portable
-to other @command{awk} implementations.
+to most other @command{awk} implementations.
@cindex dark corner, strings, storing
-All other @command{awk} implementations@footnote{At least that we know
+Almost all other @command{awk} implementations@footnote{At least that we know
about.} store strings internally as C-style strings. C strings use the @sc{nul} character as the string terminator. In effect, this means that @samp{RS = "\0"} is the same as @samp{RS = ""}. @value{DARKCORNER}
+It happens that recent versions of @command{mawk} can use the @sc{nul}
+character as a record separator. However, this is a special case:
+@command{mawk} does not allow embedded @sc{nul} characters in strings.
+
@cindex records, treating files as @cindex files, as single records The best way to treat a whole file as a single record is to simply read the file in, one record at a time, concatenating each record onto the end of the previous ones.
+
+@c @strong{FIXME}: Using @sc{nul} is good for @file{/proc/environ} etc.
@end sidebar @c ENDOFRANGE inspl @c ENDOFRANGE recspl
@@ -9602,7 +9608,7 @@ point when reading the @command{awk} program source code, and for command-line variable assignments (@pxref{Other Arguments}). However, when interpreting input data, for @code{print} and @code{printf} output, and for number to string conversion, the local decimal point character is used.
-@value{DARKCORNER}.
+@value{DARKCORNER}
Here are some examples indicating the difference in behavior, on a GNU/Linux system:
@@ -33237,7 +33243,7 @@ The option for raw sockets was removed, since it was never implemented (@pxref{TCP/IP Networking}). @item
-Ranges of the form @code{[d-h]} are treated as if they were in the
+Ranges of the form @samp{[d-h]} are treated as if they were in the
C locale, no matter what kind of regexp is being used, and even if @option{--posix} (@pxref{Ranges and Locales}).
@@ -33445,7 +33451,7 @@ When @command{gawk} switched to using locale-aware regexp matchers, the problems began; especially as both GNU/Linux and commercial Unix vendors started implementing non-ASCII locales, @emph{and making them the default}. Perhaps the most frequently asked question became something
-like ``why does @code{[A-Z]} match lowercase letters?!?''
+like ``why does @samp{[A-Z]} match lowercase letters?!?''
This situation existed for close to 10 years, if not more, and the @command{gawk} maintainer grew weary of trying to explain that