-rw-r--r-- | awklib/eg/prog/cut.awk | 2
-rw-r--r-- | awklib/eg/prog/split.awk | 8
-rw-r--r-- | awklib/eg/prog/uniq.awk | 5
-rw-r--r-- | doc/ChangeLog | 5
-rw-r--r-- | doc/gawk.info | 642
-rw-r--r-- | doc/gawk.texi | 63
-rw-r--r-- | doc/gawktexi.in | 63
-rw-r--r-- | doc/gawkworkflow.texi | 4
8 files changed, 415 insertions, 377 deletions
diff --git a/awklib/eg/prog/cut.awk b/awklib/eg/prog/cut.awk index 1ec89288..b0987b3a 100644 --- a/awklib/eg/prog/cut.awk +++ b/awklib/eg/prog/cut.awk @@ -4,9 +4,9 @@ # May 1993 # Options: +# -c list Cut characters # -f list Cut fields # -d c Field delimiter character -# -c list Cut characters # # -s Suppress lines without the delimiter # diff --git a/awklib/eg/prog/split.awk b/awklib/eg/prog/split.awk index 8714dad2..16780044 100644 --- a/awklib/eg/prog/split.awk +++ b/awklib/eg/prog/split.awk @@ -7,10 +7,12 @@ # Revised slightly, May 2014 # Rewritten September 2020 -function usage() + +function usage( common) { - print("usage: split [-l count] [-a suffix-len] [file [outname]]") > "/dev/stderr" - print(" split [-b N[k|m]] [-a suffix-len] [file [outname]]") > "/dev/stderr" + common = "[-a suffix-len] [file [outname]]" + printf("usage: split [-l count] %s\n", common) > "/dev/stderr" + printf(" split [-b N[k|m]] %s\n", common) > "/dev/stderr" exit 1 } BEGIN { diff --git a/awklib/eg/prog/uniq.awk b/awklib/eg/prog/uniq.awk index 57c98f2c..e614bf2b 100644 --- a/awklib/eg/prog/uniq.awk +++ b/awklib/eg/prog/uniq.awk @@ -8,7 +8,8 @@ function usage() { - print("Usage: uniq [-udc [-f fields] [-s chars]] [ in [ out ]]") > "/dev/stderr" + print("Usage: uniq [-udc [-f fields] [-s chars]] " \ + "[ in [ out ]]") > "/dev/stderr" exit 1 } @@ -17,7 +18,7 @@ function usage() # -u only nonrepeated lines # -f n skip n fields # -s n skip n characters, skip fields first -# As of 2020, '+' can be used as option character in addition to '-' +# As of 2020, '+' can be used as the option character in addition to '-' # Previously allowed use of -N to skip fields and +N to skip # characters is no longer allowed, and not supported by this version. diff --git a/doc/ChangeLog b/doc/ChangeLog index 5805f050..26c118e9 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,3 +1,8 @@ +2020-11-28 Arnold D. Robbins <arnold@skeeve.com> + + * gawkworkflow.texi: Add an additional web resource. 
+ * gawktexi.in: More edits in sample programs chapter. + 2020-11-20 Arnold D. Robbins <arnold@skeeve.com> * gawktexi.in (Strange values): Correct the description of what diff --git a/doc/gawk.info b/doc/gawk.info index 19658e30..b7f772a2 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -17679,13 +17679,13 @@ pipeline generates a sorted, unique list of the logged-on users: separated with dashes. The list '1-8,15,22-35' specifies characters 1 through 8, 15, and 22 through 35. -'-f LIST' - Use LIST as the list of fields to cut out. - '-d DELIM' Use DELIM as the field-separator character instead of the TAB character. +'-f LIST' + Use LIST as the list of fields to cut out. + '-s' Suppress printing of lines that do not contain the field delimiter. @@ -17693,6 +17693,10 @@ pipeline generates a sorted, unique list of the logged-on users: function (*note Getopt Function::) and the 'join()' library function (*note Join Function::). + The current POSIX version of 'cut' has options to cut fields based on +both bytes and characters. This version does not attempt to implement +those options, as 'awk' works exclusively in terms of characters. + The program begins with a comment describing the options, the library functions needed, and a 'usage()' function that prints out a usage message and exits. 'usage()' is called if invalid arguments are @@ -17701,9 +17705,9 @@ supplied: # cut.awk --- implement cut in awk # Options: + # -c list Cut characters # -f list Cut fields # -d c Field delimiter character - # -c list Cut characters # # -s Suppress lines without the delimiter # @@ -18031,15 +18035,15 @@ Note the comment about invocation: Because several of the options overlap with 'gawk''s, a '--' is needed to tell 'gawk' to stop looking for options. - Next comes the code that handles the 'egrep'-specific behavior. If -no pattern is supplied with '-e', the first nonoption on the command -line is used. 
If the pattern is empty, that means no pattern was -supplied, so it's necessary to print an error message and exit. The -'awk' command-line arguments up to 'ARGV[Optind]' are cleared, so that -'awk' won't try to process them as files. If no files are specified, -the standard input is used, and if multiple files are specified, we make -sure to note this so that the file names can precede the matched lines -in the output: + Next comes the code that handles the 'egrep'-specific behavior. +'egrep' uses the first nonoption on the command line is used. if no +pattern is supplied with '-e'. If the pattern is empty, that means no +pattern was supplied, so it's necessary to print an error message and +exit. The 'awk' command-line arguments up to 'ARGV[Optind]' are +cleared, so that 'awk' won't try to process them as files. If no files +are specified, the standard input is used, and if multiple files are +specified, we make sure to note this so that the file names can precede +the matched lines in the output: if (pattern == "") pattern = ARGV[Optind++] @@ -18099,20 +18103,20 @@ code checks this condition by looking at the values of 'RSTART' and 'RLENGTH'. If those indicate that the match is not over the full line, 'matches' is set to zero (false). - If the user wants lines that did not match, the sense of 'matches' is -inverted using the '!' operator. 'fcount' is incremented with the value -of 'matches', which is either one or zero, depending upon a successful -or unsuccessful match. If the line does not match, the 'next' statement -just moves on to the next input line. - - A number of additional tests are made, but they are only done if we -are not counting lines. First, if the user only wants the exit status -('no_print' is true), then it is enough to know that _one_ line in this -file matched, and we can skip on to the next file with 'nextfile'. -Similarly, if we are only printing file names, we can print the file -name, and then skip to the next file with 'nextfile'. 
Finally, each -line is printed, with a leading file name, optional colon and line -number, and the final colon if necessary: + If the user wants lines that did not match, we invert the sense of +'matches' using the '!' operator. We then increment 'fcount' with the +value of 'matches', which is either one or zero, depending upon a +successful or unsuccessful match. If the line does not match, the +'next' statement just moves on to the next input line. + + We make a number of additional tests, but only if we are not counting +lines. First, if the user only wants the exit status ('no_print' is +true), then it is enough to know that _one_ line in this file matched, +and we can skip on to the next file with 'nextfile'. Similarly, if we +are only printing file names, we can print the file name, and then skip +to the next file with 'nextfile'. Finally, each line is printed, with a +leading file name, optional colon and line number, and the final colon +if necessary: { matches = match($0, pattern) @@ -18499,15 +18503,18 @@ character.(1) 'getopt()' function presented in *note Getopt Function::. The program begins with a standard descriptive comment and then a -'usage()' function describing the options: +'usage()' function describing the options. The variable 'common' keeps +the function's lines short so that they look nice on the page: # split.awk --- do split in awk # # Requires getopt() library function. 
- function usage() + + function usage( common) { - print("usage: split [-l count] [-a suffix-len] [file [outname]]") > "/dev/stderr" - print(" split [-b N[k|m]] [-a suffix-len] [file [outname]]") > "/dev/stderr" + common = "[-a suffix-len] [file [outname]]" + printf("usage: split [-l count] %s\n", common) > "/dev/stderr" + printf(" split [-b N[k|m]] %s\n", common) > "/dev/stderr" exit 1 } @@ -18876,7 +18883,8 @@ of the options and their meanings in comments: function usage() { - print("Usage: uniq [-udc [-f fields] [-s chars]] [ in [ out ]]") > "/dev/stderr" + print("Usage: uniq [-udc [-f fields] [-s chars]] " \ + "[ in [ out ]]") > "/dev/stderr" exit 1 } @@ -18891,7 +18899,7 @@ well as with '-'. An initial 'BEGIN' rule traverses the arguments changing any leading '+' to '-' so that the 'getopt()' function can parse the options: - # As of 2020, '+' can be used as option character in addition to '-' + # As of 2020, '+' can be used as the option character in addition to '-' # Previously allowed use of -N to skip fields and +N to skip # characters is no longer allowed, and not supported by this version. @@ -19106,7 +19114,7 @@ call out to other facilities written in C or C++. For the purposes of 'wc.awk', it's enough to know that the extension is loaded with the '@load' directive, and the additional function we will use is called 'mbs_length()'. This function returns the number of -bytes in a string, and not the number of characters. +bytes in a string, not the number of characters. The '"mbs"' extension comes from the 'gawkextlib' project. *Note gawkextlib:: for more information. @@ -19126,23 +19134,23 @@ standard input. If there are multiple files, it also prints total counts for all the files. The options and their meanings are as follows: +'-c' + Count only bytes. Once upon a time, the 'c' in this option stood + for "characters." But, as explained earlier, bytes and character + are no longer synonymous with each other. + '-l' Count only lines. 
+'-m' + Count only characters. + '-w' Count only words. A "word" is a contiguous sequence of nonwhitespace characters, separated by spaces and/or TABs. Luckily, this is the normal way 'awk' separates fields in its input data. -'-c' - Count only bytes. Once upon a time, the 'c' in this option stood - for "characters." But, as explained earlier, bytes and character - are no longer synonymous with each other. - -'-m' - Count only characters. - Implementing 'wc' in 'awk' is particularly elegant, because 'awk' does a lot of the work for us; it splits lines into words (i.e., fields) and counts them, it counts lines (i.e., records), and it can easily tell @@ -35158,7 +35166,7 @@ Index * BEGIN pattern, assert() user-defined function and: Assert Function. (line 83) * BEGIN pattern, pwcat program: Passwd Functions. (line 143) -* BEGIN pattern, running awk programs and: Cut Program. (line 63) +* BEGIN pattern, running awk programs and: Cut Program. (line 67) * BEGIN pattern, profiling and: Profiling. (line 62) * BEGIN pattern, TEXTDOMAIN variable and: Programmer i18n. (line 60) * BEGIN pattern, @namespace directive and: Changing The Namespace. @@ -35487,7 +35495,7 @@ Index * customized two-way processor: Two-way processors. (line 6) * cut utility: Cut Program. (line 6) * cut utility <1>: Cut Program. (line 6) -* cut.awk program: Cut Program. (line 45) +* cut.awk program: Cut Program. (line 49) * d debugger command (alias for delete): Breakpoint Control. (line 64) * dark corner: Conventions. (line 42) * dark corner, ARGV variable, value of: Executable Scripts. (line 55) @@ -36043,7 +36051,7 @@ Index * field separator, FPAT variable and: User-modified. (line 46) * field separator <1>: User-modified. (line 53) * field separator <2>: User-modified. (line 116) -* field separator, spaces as: Cut Program. (line 103) +* field separator, spaces as: Cut Program. (line 107) * field separator <3>: Changing Fields. (line 64) * fields: Reading Files. (line 14) * fields <1>: Fields. 
(line 6) @@ -36189,7 +36197,7 @@ Index (line 6) * FS variable, in multiline records: Multiple Line. (line 41) * FS variable <1>: User-modified. (line 53) -* FS variable, running awk programs and: Cut Program. (line 63) +* FS variable, running awk programs and: Cut Program. (line 67) * FSF (Free Software Foundation): Manual History. (line 6) * FSF (Free Software Foundation) <1>: Getting. (line 10) * FSF (Free Software Foundation) <2>: Glossary. (line 370) @@ -37607,7 +37615,7 @@ Index * split string into array: String Functions. (line 303) * split utility: Split Program. (line 6) * split() function, array elements, deleting: Delete. (line 61) -* split.awk program: Split Program. (line 50) +* split.awk program: Split Program. (line 51) * sprintf: OFMT. (line 15) * sprintf <1>: String Functions. (line 395) * sprintf() function, print/printf statements and: Round Function. @@ -38308,276 +38316,276 @@ Node: Sample Programs727047 Node: Running Examples727817 Node: Clones728545 Node: Cut Program729769 -Node: Egrep Program739698 -Node: Id Program748709 -Node: Split Program758656 -Ref: Split Program-Footnote-1768430 -Node: Tee Program768603 -Node: Uniq Program771393 -Node: Wc Program778957 -Node: Bytes vs. 
Characters779354 -Node: Using extensions780902 -Node: wc program781660 -Node: Miscellaneous Programs786525 -Node: Dupword Program787738 -Node: Alarm Program789768 -Node: Translate Program794623 -Ref: Translate Program-Footnote-1799188 -Node: Labels Program799458 -Ref: Labels Program-Footnote-1802809 -Node: Word Sorting802893 -Node: History Sorting806965 -Node: Extract Program809190 -Node: Simple Sed817244 -Node: Igawk Program820318 -Ref: Igawk Program-Footnote-1834649 -Ref: Igawk Program-Footnote-2834851 -Ref: Igawk Program-Footnote-3834973 -Node: Anagram Program835088 -Node: Signature Program838150 -Node: Programs Summary839397 -Node: Programs Exercises840611 -Ref: Programs Exercises-Footnote-1844741 -Node: Advanced Features844827 -Node: Nondecimal Data846817 -Node: Array Sorting848408 -Node: Controlling Array Traversal849108 -Ref: Controlling Array Traversal-Footnote-1857476 -Node: Array Sorting Functions857594 -Ref: Array Sorting Functions-Footnote-1862685 -Node: Two-way I/O862881 -Ref: Two-way I/O-Footnote-1870602 -Ref: Two-way I/O-Footnote-2870789 -Node: TCP/IP Networking870871 -Node: Profiling873989 -Node: Advanced Features Summary883303 -Node: Internationalization885147 -Node: I18N and L10N886627 -Node: Explaining gettext887314 -Ref: Explaining gettext-Footnote-1893206 -Ref: Explaining gettext-Footnote-2893391 -Node: Programmer i18n893556 -Ref: Programmer i18n-Footnote-1898505 -Node: Translator i18n898554 -Node: String Extraction899348 -Ref: String Extraction-Footnote-1900480 -Node: Printf Ordering900566 -Ref: Printf Ordering-Footnote-1903352 -Node: I18N Portability903416 -Ref: I18N Portability-Footnote-1905872 -Node: I18N Example905935 -Ref: I18N Example-Footnote-1909210 -Ref: I18N Example-Footnote-2909283 -Node: Gawk I18N909392 -Node: I18N Summary910041 -Node: Debugger911382 -Node: Debugging912382 -Node: Debugging Concepts912823 -Node: Debugging Terms914632 -Node: Awk Debugging917207 -Ref: Awk Debugging-Footnote-1918152 -Node: Sample Debugging 
Session918284 -Node: Debugger Invocation918818 -Node: Finding The Bug920204 -Node: List of Debugger Commands926678 -Node: Breakpoint Control928011 -Node: Debugger Execution Control931705 -Node: Viewing And Changing Data935067 -Node: Execution Stack938608 -Node: Debugger Info940245 -Node: Miscellaneous Debugger Commands944316 -Node: Readline Support949378 -Node: Limitations950274 -Node: Debugging Summary952828 -Node: Namespaces954107 -Node: Global Namespace955218 -Node: Qualified Names956616 -Node: Default Namespace957615 -Node: Changing The Namespace958356 -Node: Naming Rules959970 -Node: Internal Name Management961818 -Node: Namespace Example962860 -Node: Namespace And Features965422 -Node: Namespace Summary966857 -Node: Arbitrary Precision Arithmetic968334 -Node: Computer Arithmetic969821 -Ref: table-numeric-ranges973587 -Ref: table-floating-point-ranges974080 -Ref: Computer Arithmetic-Footnote-1974738 -Node: Math Definitions974795 -Ref: table-ieee-formats977771 -Node: MPFR features978338 -Node: FP Math Caution980056 -Ref: FP Math Caution-Footnote-1981128 -Node: Inexactness of computations981497 -Node: Inexact representation982528 -Node: Comparing FP Values983888 -Node: Errors accumulate985129 -Node: Strange values986585 -Ref: Strange values-Footnote-1989173 -Node: Getting Accuracy989278 -Node: Try To Round991988 -Node: Setting precision992887 -Ref: table-predefined-precision-strings993584 -Node: Setting the rounding mode995414 -Ref: table-gawk-rounding-modes995788 -Ref: Setting the rounding mode-Footnote-1999719 -Node: Arbitrary Precision Integers999898 -Ref: Arbitrary Precision Integers-Footnote-11003073 -Node: Checking for MPFR1003222 -Node: POSIX Floating Point Problems1004696 -Ref: POSIX Floating Point Problems-Footnote-11008981 -Node: Floating point summary1009019 -Node: Dynamic Extensions1011209 -Node: Extension Intro1012762 -Node: Plugin License1014028 -Node: Extension Mechanism Outline1014825 -Ref: figure-load-extension1015264 -Ref: 
figure-register-new-function1016829 -Ref: figure-call-new-function1017921 -Node: Extension API Description1019983 -Node: Extension API Functions Introduction1021696 -Ref: table-api-std-headers1023532 -Node: General Data Types1027781 -Ref: General Data Types-Footnote-11036411 -Node: Memory Allocation Functions1036710 -Ref: Memory Allocation Functions-Footnote-11041211 -Node: Constructor Functions1041310 -Node: API Ownership of MPFR and GMP Values1044776 -Node: Registration Functions1046089 -Node: Extension Functions1046789 -Node: Exit Callback Functions1052111 -Node: Extension Version String1053361 -Node: Input Parsers1054024 -Node: Output Wrappers1066745 -Node: Two-way processors1071257 -Node: Printing Messages1073522 -Ref: Printing Messages-Footnote-11074693 -Node: Updating ERRNO1074846 -Node: Requesting Values1075585 -Ref: table-value-types-returned1076322 -Node: Accessing Parameters1077258 -Node: Symbol Table Access1078495 -Node: Symbol table by name1079007 -Ref: Symbol table by name-Footnote-11082031 -Node: Symbol table by cookie1082159 -Ref: Symbol table by cookie-Footnote-11086344 -Node: Cached values1086408 -Ref: Cached values-Footnote-11089944 -Node: Array Manipulation1090097 -Ref: Array Manipulation-Footnote-11091188 -Node: Array Data Types1091225 -Ref: Array Data Types-Footnote-11093883 -Node: Array Functions1093975 -Node: Flattening Arrays1098473 -Node: Creating Arrays1105449 -Node: Redirection API1110216 -Node: Extension API Variables1113049 -Node: Extension Versioning1113760 -Ref: gawk-api-version1114189 -Node: Extension GMP/MPFR Versioning1115920 -Node: Extension API Informational Variables1117548 -Node: Extension API Boilerplate1118621 -Node: Changes from API V11122595 -Node: Finding Extensions1124167 -Node: Extension Example1124726 -Node: Internal File Description1125524 -Node: Internal File Ops1129604 -Ref: Internal File Ops-Footnote-11140954 -Node: Using Internal File Ops1141094 -Ref: Using Internal File Ops-Footnote-11143477 -Node: Extension 
Samples1143751 -Node: Extension Sample File Functions1145280 -Node: Extension Sample Fnmatch1152929 -Node: Extension Sample Fork1154416 -Node: Extension Sample Inplace1155634 -Node: Extension Sample Ord1159260 -Node: Extension Sample Readdir1160096 -Ref: table-readdir-file-types1160985 -Node: Extension Sample Revout1162052 -Node: Extension Sample Rev2way1162641 -Node: Extension Sample Read write array1163381 -Node: Extension Sample Readfile1165323 -Node: Extension Sample Time1166418 -Node: Extension Sample API Tests1168170 -Node: gawkextlib1168662 -Node: Extension summary1171580 -Node: Extension Exercises1175282 -Node: Language History1176524 -Node: V7/SVR3.11178180 -Node: SVR41180332 -Node: POSIX1181766 -Node: BTL1183147 -Node: POSIX/GNU1183876 -Node: Feature History1189654 -Node: Common Extensions1205973 -Node: Ranges and Locales1207256 -Ref: Ranges and Locales-Footnote-11211872 -Ref: Ranges and Locales-Footnote-21211899 -Ref: Ranges and Locales-Footnote-31212134 -Node: Contributors1212357 -Node: History summary1218354 -Node: Installation1219734 -Node: Gawk Distribution1220678 -Node: Getting1221162 -Node: Extracting1222125 -Node: Distribution contents1223763 -Node: Unix Installation1230243 -Node: Quick Installation1230925 -Node: Shell Startup Files1233339 -Node: Additional Configuration Options1234428 -Node: Configuration Philosophy1236743 -Node: Non-Unix Installation1239112 -Node: PC Installation1239572 -Node: PC Binary Installation1240410 -Node: PC Compiling1240845 -Node: PC Using1241962 -Node: Cygwin1245515 -Node: MSYS1246739 -Node: VMS Installation1247341 -Node: VMS Compilation1248132 -Ref: VMS Compilation-Footnote-11249361 -Node: VMS Dynamic Extensions1249419 -Node: VMS Installation Details1251104 -Node: VMS Running1253357 -Node: VMS GNV1257636 -Node: VMS Old Gawk1258371 -Node: Bugs1258842 -Node: Bug address1259505 -Node: Usenet1262487 -Node: Maintainers1263491 -Node: Other Versions1264676 -Node: Installation summary1271764 -Node: Notes1272973 -Node: 
Compatibility Mode1273767 -Node: Additions1274549 -Node: Accessing The Source1275474 -Node: Adding Code1276911 -Node: New Ports1283130 -Node: Derived Files1287505 -Ref: Derived Files-Footnote-11293165 -Ref: Derived Files-Footnote-21293200 -Ref: Derived Files-Footnote-31293798 -Node: Future Extensions1293912 -Node: Implementation Limitations1294570 -Node: Extension Design1295780 -Node: Old Extension Problems1296924 -Ref: Old Extension Problems-Footnote-11298442 -Node: Extension New Mechanism Goals1298499 -Ref: Extension New Mechanism Goals-Footnote-11301863 -Node: Extension Other Design Decisions1302052 -Node: Extension Future Growth1304165 -Node: Notes summary1304771 -Node: Basic Concepts1305929 -Node: Basic High Level1306610 -Ref: figure-general-flow1306892 -Ref: figure-process-flow1307577 -Ref: Basic High Level-Footnote-11310878 -Node: Basic Data Typing1311063 -Node: Glossary1314391 -Node: Copying1346276 -Node: GNU Free Documentation License1383819 -Node: Index1408939 +Node: Egrep Program739909 +Node: Id Program748920 +Node: Split Program758867 +Ref: Split Program-Footnote-1768757 +Node: Tee Program768930 +Node: Uniq Program771720 +Node: Wc Program779308 +Node: Bytes vs. 
Characters779705 +Node: Using extensions781253 +Node: wc program782007 +Node: Miscellaneous Programs786872 +Node: Dupword Program788085 +Node: Alarm Program790115 +Node: Translate Program794970 +Ref: Translate Program-Footnote-1799535 +Node: Labels Program799805 +Ref: Labels Program-Footnote-1803156 +Node: Word Sorting803240 +Node: History Sorting807312 +Node: Extract Program809537 +Node: Simple Sed817591 +Node: Igawk Program820665 +Ref: Igawk Program-Footnote-1834996 +Ref: Igawk Program-Footnote-2835198 +Ref: Igawk Program-Footnote-3835320 +Node: Anagram Program835435 +Node: Signature Program838497 +Node: Programs Summary839744 +Node: Programs Exercises840958 +Ref: Programs Exercises-Footnote-1845088 +Node: Advanced Features845174 +Node: Nondecimal Data847164 +Node: Array Sorting848755 +Node: Controlling Array Traversal849455 +Ref: Controlling Array Traversal-Footnote-1857823 +Node: Array Sorting Functions857941 +Ref: Array Sorting Functions-Footnote-1863032 +Node: Two-way I/O863228 +Ref: Two-way I/O-Footnote-1870949 +Ref: Two-way I/O-Footnote-2871136 +Node: TCP/IP Networking871218 +Node: Profiling874336 +Node: Advanced Features Summary883650 +Node: Internationalization885494 +Node: I18N and L10N886974 +Node: Explaining gettext887661 +Ref: Explaining gettext-Footnote-1893553 +Ref: Explaining gettext-Footnote-2893738 +Node: Programmer i18n893903 +Ref: Programmer i18n-Footnote-1898852 +Node: Translator i18n898901 +Node: String Extraction899695 +Ref: String Extraction-Footnote-1900827 +Node: Printf Ordering900913 +Ref: Printf Ordering-Footnote-1903699 +Node: I18N Portability903763 +Ref: I18N Portability-Footnote-1906219 +Node: I18N Example906282 +Ref: I18N Example-Footnote-1909557 +Ref: I18N Example-Footnote-2909630 +Node: Gawk I18N909739 +Node: I18N Summary910388 +Node: Debugger911729 +Node: Debugging912729 +Node: Debugging Concepts913170 +Node: Debugging Terms914979 +Node: Awk Debugging917554 +Ref: Awk Debugging-Footnote-1918499 +Node: Sample Debugging 
Session918631 +Node: Debugger Invocation919165 +Node: Finding The Bug920551 +Node: List of Debugger Commands927025 +Node: Breakpoint Control928358 +Node: Debugger Execution Control932052 +Node: Viewing And Changing Data935414 +Node: Execution Stack938955 +Node: Debugger Info940592 +Node: Miscellaneous Debugger Commands944663 +Node: Readline Support949725 +Node: Limitations950621 +Node: Debugging Summary953175 +Node: Namespaces954454 +Node: Global Namespace955565 +Node: Qualified Names956963 +Node: Default Namespace957962 +Node: Changing The Namespace958703 +Node: Naming Rules960317 +Node: Internal Name Management962165 +Node: Namespace Example963207 +Node: Namespace And Features965769 +Node: Namespace Summary967204 +Node: Arbitrary Precision Arithmetic968681 +Node: Computer Arithmetic970168 +Ref: table-numeric-ranges973934 +Ref: table-floating-point-ranges974427 +Ref: Computer Arithmetic-Footnote-1975085 +Node: Math Definitions975142 +Ref: table-ieee-formats978118 +Node: MPFR features978685 +Node: FP Math Caution980403 +Ref: FP Math Caution-Footnote-1981475 +Node: Inexactness of computations981844 +Node: Inexact representation982875 +Node: Comparing FP Values984235 +Node: Errors accumulate985476 +Node: Strange values986932 +Ref: Strange values-Footnote-1989520 +Node: Getting Accuracy989625 +Node: Try To Round992335 +Node: Setting precision993234 +Ref: table-predefined-precision-strings993931 +Node: Setting the rounding mode995761 +Ref: table-gawk-rounding-modes996135 +Ref: Setting the rounding mode-Footnote-11000066 +Node: Arbitrary Precision Integers1000245 +Ref: Arbitrary Precision Integers-Footnote-11003420 +Node: Checking for MPFR1003569 +Node: POSIX Floating Point Problems1005043 +Ref: POSIX Floating Point Problems-Footnote-11009328 +Node: Floating point summary1009366 +Node: Dynamic Extensions1011556 +Node: Extension Intro1013109 +Node: Plugin License1014375 +Node: Extension Mechanism Outline1015172 +Ref: figure-load-extension1015611 +Ref: 
figure-register-new-function1017176 +Ref: figure-call-new-function1018268 +Node: Extension API Description1020330 +Node: Extension API Functions Introduction1022043 +Ref: table-api-std-headers1023879 +Node: General Data Types1028128 +Ref: General Data Types-Footnote-11036758 +Node: Memory Allocation Functions1037057 +Ref: Memory Allocation Functions-Footnote-11041558 +Node: Constructor Functions1041657 +Node: API Ownership of MPFR and GMP Values1045123 +Node: Registration Functions1046436 +Node: Extension Functions1047136 +Node: Exit Callback Functions1052458 +Node: Extension Version String1053708 +Node: Input Parsers1054371 +Node: Output Wrappers1067092 +Node: Two-way processors1071604 +Node: Printing Messages1073869 +Ref: Printing Messages-Footnote-11075040 +Node: Updating ERRNO1075193 +Node: Requesting Values1075932 +Ref: table-value-types-returned1076669 +Node: Accessing Parameters1077605 +Node: Symbol Table Access1078842 +Node: Symbol table by name1079354 +Ref: Symbol table by name-Footnote-11082378 +Node: Symbol table by cookie1082506 +Ref: Symbol table by cookie-Footnote-11086691 +Node: Cached values1086755 +Ref: Cached values-Footnote-11090291 +Node: Array Manipulation1090444 +Ref: Array Manipulation-Footnote-11091535 +Node: Array Data Types1091572 +Ref: Array Data Types-Footnote-11094230 +Node: Array Functions1094322 +Node: Flattening Arrays1098820 +Node: Creating Arrays1105796 +Node: Redirection API1110563 +Node: Extension API Variables1113396 +Node: Extension Versioning1114107 +Ref: gawk-api-version1114536 +Node: Extension GMP/MPFR Versioning1116267 +Node: Extension API Informational Variables1117895 +Node: Extension API Boilerplate1118968 +Node: Changes from API V11122942 +Node: Finding Extensions1124514 +Node: Extension Example1125073 +Node: Internal File Description1125871 +Node: Internal File Ops1129951 +Ref: Internal File Ops-Footnote-11141301 +Node: Using Internal File Ops1141441 +Ref: Using Internal File Ops-Footnote-11143824 +Node: Extension 
Samples1144098 +Node: Extension Sample File Functions1145627 +Node: Extension Sample Fnmatch1153276 +Node: Extension Sample Fork1154763 +Node: Extension Sample Inplace1155981 +Node: Extension Sample Ord1159607 +Node: Extension Sample Readdir1160443 +Ref: table-readdir-file-types1161332 +Node: Extension Sample Revout1162399 +Node: Extension Sample Rev2way1162988 +Node: Extension Sample Read write array1163728 +Node: Extension Sample Readfile1165670 +Node: Extension Sample Time1166765 +Node: Extension Sample API Tests1168517 +Node: gawkextlib1169009 +Node: Extension summary1171927 +Node: Extension Exercises1175629 +Node: Language History1176871 +Node: V7/SVR3.11178527 +Node: SVR41180679 +Node: POSIX1182113 +Node: BTL1183494 +Node: POSIX/GNU1184223 +Node: Feature History1190001 +Node: Common Extensions1206320 +Node: Ranges and Locales1207603 +Ref: Ranges and Locales-Footnote-11212219 +Ref: Ranges and Locales-Footnote-21212246 +Ref: Ranges and Locales-Footnote-31212481 +Node: Contributors1212704 +Node: History summary1218701 +Node: Installation1220081 +Node: Gawk Distribution1221025 +Node: Getting1221509 +Node: Extracting1222472 +Node: Distribution contents1224110 +Node: Unix Installation1230590 +Node: Quick Installation1231272 +Node: Shell Startup Files1233686 +Node: Additional Configuration Options1234775 +Node: Configuration Philosophy1237090 +Node: Non-Unix Installation1239459 +Node: PC Installation1239919 +Node: PC Binary Installation1240757 +Node: PC Compiling1241192 +Node: PC Using1242309 +Node: Cygwin1245862 +Node: MSYS1247086 +Node: VMS Installation1247688 +Node: VMS Compilation1248479 +Ref: VMS Compilation-Footnote-11249708 +Node: VMS Dynamic Extensions1249766 +Node: VMS Installation Details1251451 +Node: VMS Running1253704 +Node: VMS GNV1257983 +Node: VMS Old Gawk1258718 +Node: Bugs1259189 +Node: Bug address1259852 +Node: Usenet1262834 +Node: Maintainers1263838 +Node: Other Versions1265023 +Node: Installation summary1272111 +Node: Notes1273320 +Node: 
Compatibility Mode1274114 +Node: Additions1274896 +Node: Accessing The Source1275821 +Node: Adding Code1277258 +Node: New Ports1283477 +Node: Derived Files1287852 +Ref: Derived Files-Footnote-11293512 +Ref: Derived Files-Footnote-21293547 +Ref: Derived Files-Footnote-31294145 +Node: Future Extensions1294259 +Node: Implementation Limitations1294917 +Node: Extension Design1296127 +Node: Old Extension Problems1297271 +Ref: Old Extension Problems-Footnote-11298789 +Node: Extension New Mechanism Goals1298846 +Ref: Extension New Mechanism Goals-Footnote-11302210 +Node: Extension Other Design Decisions1302399 +Node: Extension Future Growth1304512 +Node: Notes summary1305118 +Node: Basic Concepts1306276 +Node: Basic High Level1306957 +Ref: figure-general-flow1307239 +Ref: figure-process-flow1307924 +Ref: Basic High Level-Footnote-11311225 +Node: Basic Data Typing1311410 +Node: Glossary1314738 +Node: Copying1346623 +Node: GNU Free Documentation License1384166 +Node: Index1409286 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 0886e460..ad2cc6fe 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -25118,13 +25118,13 @@ may be separated by commas, and ranges of characters can be separated with dashes. The list @samp{1-8,15,22-35} specifies characters 1 through 8, 15, and 22 through 35. -@item -f @var{list} -Use @var{list} as the list of fields to cut out. - @item -d @var{delim} Use @var{delim} as the field-separator character instead of the TAB character. +@item -f @var{list} +Use @var{list} as the list of fields to cut out. + @item -s Suppress printing of lines that do not contain the field delimiter. @end table @@ -25134,6 +25134,10 @@ function (@pxref{Getopt Function}) and the @code{join()} library function (@pxref{Join Function}). +The current POSIX version of @command{cut} has options to cut fields based on +both bytes and characters. This version does not attempt to implement those options, +as @command{awk} works exclusively in terms of characters. 
+ The program begins with a comment describing the options, the library functions needed, and a @code{usage()} function that prints out a usage message and exits. @code{usage()} is called if invalid arguments are @@ -25154,9 +25158,9 @@ supplied: @c file eg/prog/cut.awk # Options: +# -c list Cut characters # -f list Cut fields # -d c Field delimiter character -# -c list Cut characters # # -s Suppress lines without the delimiter # @@ -25228,7 +25232,7 @@ incorrect---@command{awk} would separate fields with runs of spaces, TABs, and/or newlines, and we want them to be separated with individual spaces. To this end, we save the original space character in the variable -@code{fs} for later use; after setting @code{FS} to @code{"[ ]"} we can't +@code{fs} for later use; after setting @code{FS} to @code{@w{"[ ]"}} we can't use it directly to see if the field delimiter character is in the string. Also remember that after @code{getopt()} is through @@ -25555,9 +25559,9 @@ Note the comment about invocation: Because several of the options overlap with @command{gawk}'s, a @option{--} is needed to tell @command{gawk} to stop looking for options. -Next comes the code that handles the @command{egrep}-specific behavior. If no -pattern is supplied with @option{-e}, the first nonoption on the -command line is used. +Next comes the code that handles the @command{egrep}-specific behavior. +@command{egrep} uses the first nonoption on the command line is used. +if no pattern is supplied with @option{-e}. If the pattern is empty, that means no pattern was supplied, so it's necessary to print an error message and exit. The @command{awk} command-line arguments up to @code{ARGV[Optind]} @@ -25640,13 +25644,13 @@ the code checks this condition by looking at the values of is not over the full line, @code{matches} is set to zero (false). If the user -wants lines that did not match, the sense of @code{matches} is inverted -using the @samp{!} operator. 
@code{fcount} is incremented with the value of +wants lines that did not match, we invert the sense of @code{matches} +using the @samp{!} operator. We then increment @code{fcount} with the value of @code{matches}, which is either one or zero, depending upon a successful or unsuccessful match. If the line does not match, the @code{next} statement just moves on to the next input line. -A number of additional tests are made, but they are only done if we +We make a number of additional tests, but only if we are not counting lines. First, if the user only wants the exit status (@code{no_print} is true), then it is enough to know that @emph{one} line in this file matched, and we can skip on to the next file with @@ -26158,7 +26162,9 @@ Here is an implementation of @command{split} in @command{awk}. It uses the @code{getopt()} function presented in @ref{Getopt Function}. The program begins with a standard descriptive comment and then -a @code{usage()} function describing the options: +a @code{usage()} function describing the options. 
The variable +@code{common} keeps the function's lines short so that they +look nice on the page: @cindex @code{split.awk} program @example @@ -26178,10 +26184,12 @@ a @code{usage()} function describing the options: @c endfile @end ignore @c file eg/prog/split.awk -function usage() + +function usage( common) @{ - print("usage: split [-l count] [-a suffix-len] [file [outname]]") > "/dev/stderr" - print(" split [-b N[k|m]] [-a suffix-len] [file [outname]]") > "/dev/stderr" + common = "[-a suffix-len] [file [outname]]" + printf("usage: split [-l count] %s\n", common) > "/dev/stderr" + printf(" split [-b N[k|m]] %s\n", common) > "/dev/stderr" exit 1 @} @c endfile @@ -26646,7 +26654,8 @@ the options and their meanings in comments: function usage() @{ - print("Usage: uniq [-udc [-f fields] [-s chars]] [ in [ out ]]") > "/dev/stderr" + print("Usage: uniq [-udc [-f fields] [-s chars]] " \ + "[ in [ out ]]") > "/dev/stderr" exit 1 @} @@ -26665,7 +26674,7 @@ so that the @code{getopt()} function can parse the options: @example @c file eg/prog/uniq.awk -# As of 2020, '+' can be used as option character in addition to '-' +# As of 2020, '+' can be used as the option character in addition to '-' # Previously allowed use of -N to skip fields and +N to skip # characters is no longer allowed, and not supported by this version. @@ -26914,7 +26923,7 @@ For the purposes of @file{wc.awk}, it's enough to know that the extension is loaded with the @code{@@load} directive, and the additional function we will use is called @code{mbs_length()}. This function returns the -number of bytes in a string, and not the number of characters. +number of bytes in a string, not the number of characters. The @code{"mbs"} extension comes from the @code{gawkextlib} project. @xref{gawkextlib} for more information. @@ -26933,23 +26942,23 @@ input. If there are multiple files, it also prints total counts for all the files. 
The options and their meanings are as follows: @table @code -@item -l -Count only lines. - -@item -w -Count only words. -A ``word'' is a contiguous sequence of nonwhitespace characters, separated -by spaces and/or TABs. Luckily, this is the normal way @command{awk} separates -fields in its input data. - @item -c Count only bytes. Once upon a time, the @samp{c} in this option stood for ``characters.'' But, as explained earlier, bytes and character are no longer synonymous with each other. +@item -l +Count only lines. + @item -m Count only characters. + +@item -w +Count only words. +A ``word'' is a contiguous sequence of nonwhitespace characters, separated +by spaces and/or TABs. Luckily, this is the normal way @command{awk} separates +fields in its input data. @end table Implementing @command{wc} in @command{awk} is particularly elegant, diff --git a/doc/gawktexi.in b/doc/gawktexi.in index 1498c8ae..a33e933f 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -24084,13 +24084,13 @@ may be separated by commas, and ranges of characters can be separated with dashes. The list @samp{1-8,15,22-35} specifies characters 1 through 8, 15, and 22 through 35. -@item -f @var{list} -Use @var{list} as the list of fields to cut out. - @item -d @var{delim} Use @var{delim} as the field-separator character instead of the TAB character. +@item -f @var{list} +Use @var{list} as the list of fields to cut out. + @item -s Suppress printing of lines that do not contain the field delimiter. @end table @@ -24100,6 +24100,10 @@ function (@pxref{Getopt Function}) and the @code{join()} library function (@pxref{Join Function}). +The current POSIX version of @command{cut} has options to cut fields based on +both bytes and characters. This version does not attempt to implement those options, +as @command{awk} works exclusively in terms of characters. 
+ The program begins with a comment describing the options, the library functions needed, and a @code{usage()} function that prints out a usage message and exits. @code{usage()} is called if invalid arguments are @@ -24120,9 +24124,9 @@ supplied: @c file eg/prog/cut.awk # Options: +# -c list Cut characters # -f list Cut fields # -d c Field delimiter character -# -c list Cut characters # # -s Suppress lines without the delimiter # @@ -24194,7 +24198,7 @@ incorrect---@command{awk} would separate fields with runs of spaces, TABs, and/or newlines, and we want them to be separated with individual spaces. To this end, we save the original space character in the variable -@code{fs} for later use; after setting @code{FS} to @code{"[ ]"} we can't +@code{fs} for later use; after setting @code{FS} to @code{@w{"[ ]"}} we can't use it directly to see if the field delimiter character is in the string. Also remember that after @code{getopt()} is through @@ -24521,9 +24525,9 @@ Note the comment about invocation: Because several of the options overlap with @command{gawk}'s, a @option{--} is needed to tell @command{gawk} to stop looking for options. -Next comes the code that handles the @command{egrep}-specific behavior. If no -pattern is supplied with @option{-e}, the first nonoption on the -command line is used. +Next comes the code that handles the @command{egrep}-specific behavior. +@command{egrep} uses the first nonoption on the command line is used. +if no pattern is supplied with @option{-e}. If the pattern is empty, that means no pattern was supplied, so it's necessary to print an error message and exit. The @command{awk} command-line arguments up to @code{ARGV[Optind]} @@ -24606,13 +24610,13 @@ the code checks this condition by looking at the values of is not over the full line, @code{matches} is set to zero (false). If the user -wants lines that did not match, the sense of @code{matches} is inverted -using the @samp{!} operator. 
@code{fcount} is incremented with the value of +wants lines that did not match, we invert the sense of @code{matches} +using the @samp{!} operator. We then increment @code{fcount} with the value of @code{matches}, which is either one or zero, depending upon a successful or unsuccessful match. If the line does not match, the @code{next} statement just moves on to the next input line. -A number of additional tests are made, but they are only done if we +We make a number of additional tests, but only if we are not counting lines. First, if the user only wants the exit status (@code{no_print} is true), then it is enough to know that @emph{one} line in this file matched, and we can skip on to the next file with @@ -25124,7 +25128,9 @@ Here is an implementation of @command{split} in @command{awk}. It uses the @code{getopt()} function presented in @ref{Getopt Function}. The program begins with a standard descriptive comment and then -a @code{usage()} function describing the options: +a @code{usage()} function describing the options. 
The variable +@code{common} keeps the function's lines short so that they +look nice on the page: @cindex @code{split.awk} program @example @@ -25144,10 +25150,12 @@ a @code{usage()} function describing the options: @c endfile @end ignore @c file eg/prog/split.awk -function usage() + +function usage( common) @{ - print("usage: split [-l count] [-a suffix-len] [file [outname]]") > "/dev/stderr" - print(" split [-b N[k|m]] [-a suffix-len] [file [outname]]") > "/dev/stderr" + common = "[-a suffix-len] [file [outname]]" + printf("usage: split [-l count] %s\n", common) > "/dev/stderr" + printf(" split [-b N[k|m]] %s\n", common) > "/dev/stderr" exit 1 @} @c endfile @@ -25612,7 +25620,8 @@ the options and their meanings in comments: function usage() @{ - print("Usage: uniq [-udc [-f fields] [-s chars]] [ in [ out ]]") > "/dev/stderr" + print("Usage: uniq [-udc [-f fields] [-s chars]] " \ + "[ in [ out ]]") > "/dev/stderr" exit 1 @} @@ -25631,7 +25640,7 @@ so that the @code{getopt()} function can parse the options: @example @c file eg/prog/uniq.awk -# As of 2020, '+' can be used as option character in addition to '-' +# As of 2020, '+' can be used as the option character in addition to '-' # Previously allowed use of -N to skip fields and +N to skip # characters is no longer allowed, and not supported by this version. @@ -25880,7 +25889,7 @@ For the purposes of @file{wc.awk}, it's enough to know that the extension is loaded with the @code{@@load} directive, and the additional function we will use is called @code{mbs_length()}. This function returns the -number of bytes in a string, and not the number of characters. +number of bytes in a string, not the number of characters. The @code{"mbs"} extension comes from the @code{gawkextlib} project. @xref{gawkextlib} for more information. @@ -25899,23 +25908,23 @@ input. If there are multiple files, it also prints total counts for all the files. 
The options and their meanings are as follows: @table @code -@item -l -Count only lines. - -@item -w -Count only words. -A ``word'' is a contiguous sequence of nonwhitespace characters, separated -by spaces and/or TABs. Luckily, this is the normal way @command{awk} separates -fields in its input data. - @item -c Count only bytes. Once upon a time, the @samp{c} in this option stood for ``characters.'' But, as explained earlier, bytes and character are no longer synonymous with each other. +@item -l +Count only lines. + @item -m Count only characters. + +@item -w +Count only words. +A ``word'' is a contiguous sequence of nonwhitespace characters, separated +by spaces and/or TABs. Luckily, this is the normal way @command{awk} separates +fields in its input data. @end table Implementing @command{wc} in @command{awk} is particularly elegant, diff --git a/doc/gawkworkflow.texi b/doc/gawkworkflow.texi index 0aa68825..4106b75d 100644 --- a/doc/gawkworkflow.texi +++ b/doc/gawkworkflow.texi @@ -2147,6 +2147,10 @@ In particular, the @uref{https://git-scm.com/book/en/v2, See also @uref{http://savannah.gnu.org/maintenance/UsingGit, the Savannah quick introduction to Git}. +A nice article on how Git works is +@uref{http://jwiegley.github.io/git-from-the-bottom-up/, +@cite{Git From The Bottom Up}}, by John Wiegley. + @node TODO @appendix Stuff Still To Do In This Document |