Merge branch 'gawk-5.1-stable'

author: Arnold D. Robbins <arnold@skeeve.com> 2021-01-08 14:50:19 +0200
committer: Arnold D. Robbins <arnold@skeeve.com> 2021-01-08 14:50:19 +0200
commit: 32a9f7f24000827da433b89de7c24c5d95a561c6 (patch)
tree: 130816e4172b9e1bca3fb44f9fe9fca5a9f082c4 /doc/gawk.texi
parent: 117fe375fd1ab8aa02c7000f148142659ee14308 (diff)
parent: 5b3ac78d72621697e717765d2d634d7fc271f3c9 (diff)
download: egawk-32a9f7f24000827da433b89de7c24c5d95a561c6.tar.gz
egawk-32a9f7f24000827da433b89de7c24c5d95a561c6.tar.bz2
egawk-32a9f7f24000827da433b89de7c24c5d95a561c6.zip
1 files changed, 47 insertions, 5 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 664f9e70..078a0ec8 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -59,7 +59,7 @@
 @c applies to and all the info about who's publishing this edition
 
 @c These apply across the board.
-@set UPDATE-MONTH September, 2020
+@set UPDATE-MONTH January, 2021
 @set VERSION 5.1
 @set PATCHLEVEL 0
 
@@ -285,13 +285,13 @@ Fax: +1-617-542-2652
 Email: <email>gnu@@gnu.org</email>
 URL: <ulink url="https://www.gnu.org">https://www.gnu.org/</ulink></literallayout>
 
-<literallayout class="normal">Copyright &copy; 1989, 1991, 1992, 1993, 1996&ndash;2005, 2007, 2009&ndash;2020
+<literallayout class="normal">Copyright &copy; 1989, 1991, 1992, 1993, 1996&ndash;2005, 2007, 2009&ndash;2021
 Free Software Foundation, Inc.
 All Rights Reserved.</literallayout>
 @end docbook
 
 @ifnotdocbook
-Copyright @copyright{} 1989, 1991, 1992, 1993, 1996--2005, 2007, 2009--2020 @*
+Copyright @copyright{} 1989, 1991, 1992, 1993, 1996--2005, 2007, 2009--2021 @*
 Free Software Foundation, Inc.
 @end ifnotdocbook
 @sp 2
@@ -6995,7 +6995,36 @@ if the input text that could match the trailing part is fairly long.
 @command{gawk} attempts to avoid this problem, but currently, there's
 no guarantee that this will never happen.
 
-@quotation NOTE
+@cindex sidebar @subentry Caveats When Using Regular Expressions for @code{RS}
+@ifdocbook
+@docbook
+<sidebar><title>Caveats When Using Regular Expressions for @code{RS}</title>
+@end docbook
+
+Remember that in @command{awk}, the @samp{^} and @samp{$} anchor
+metacharacters match the beginning and end of a @emph{string}, and not
+the beginning and end of a @emph{line}.  As a result, something like
+@samp{RS = "^[[:upper:]]"} can only match at the beginning of a file.
+This is because @command{gawk} views the input file as one long string
+that happens to contain newline characters.
+It is thus best to avoid anchor metacharacters in the value of @code{RS}.
+
+Record splitting with regular expressions works differently than
+regexp matching with the @code{sub()}, @code{gsub()}, and @code{gensub()}
+(@pxref{String Functions}).  Those functions allow a regexp to match the empty string;
+record splitting does not.  Thus, for example @samp{RS = "()"} does @emph{not}
+split records between characters.
+
+@docbook
+</sidebar>
+@end docbook
+@end ifdocbook
+
+@ifnotdocbook
+@cartouche
+@center @b{Caveats When Using Regular Expressions for @code{RS}}
+
+
 Remember that in @command{awk}, the @samp{^} and @samp{$} anchor
 metacharacters match the beginning and end of a @emph{string}, and not
 the beginning and end of a @emph{line}.  As a result, something like
@@ -7003,7 +7032,14 @@ the beginning and end of a @emph{line}.  As a result, something like
 This is because @command{gawk} views the input file as one long string
 that happens to contain newline characters.
 It is thus best to avoid anchor metacharacters in the value of @code{RS}.
-@end quotation
+
+Record splitting with regular expressions works differently than
+regexp matching with the @code{sub()}, @code{gsub()}, and @code{gensub()}
+(@pxref{String Functions}).  Those functions allow a regexp to match the empty string;
+record splitting does not.  Thus, for example @samp{RS = "()"} does @emph{not}
+split records between characters.
+@end cartouche
+@end ifnotdocbook
 
 @cindex @command{gawk} @subentry @code{RT} variable in
 @cindex @code{RT} variable
@@ -7712,6 +7748,12 @@ $ @kbd{echo 'xxAA  xxBxx  C' |}
 @print{} -->C<--
 @end example
 
+Finally, field splitting with regular expressions works differently than
+regexp matching with the @code{sub()}, @code{gsub()}, and @code{gensub()}
+(@pxref{String Functions}).  Those functions allow a regexp to match the
+empty string; field splitting does not.  Thus, for example @samp{FS =
+"()"} does @emph{not} split fields between characters.
+
 @node Single Character Fields
 @subsection Making Each Character a Separate Field