aboutsummaryrefslogtreecommitdiffstats
path: root/doc/gawktexi.in
diff options
context:
space:
mode:
Diffstat (limited to 'doc/gawktexi.in')
-rw-r--r--doc/gawktexi.in49
1 files changed, 29 insertions, 20 deletions
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index 679073bf..f3b7de56 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -32,11 +32,13 @@
@ifnotdocbook
@set BULLET @bullet{}
@set MINUS @minus{}
+@set NUL @sc{nul}
@end ifnotdocbook
@ifdocbook
@set BULLET
@set MINUS
+@set NUL NUL
@end ifdocbook
@set xref-automatic-section-title
@@ -5105,10 +5107,10 @@ with @samp{A}.
@cindex POSIX @command{awk}, period (@code{.})@comma{} using
In strict POSIX mode (@pxref{Options}),
-@samp{.} does not match the @sc{nul}
+@samp{.} does not match the @value{NUL}
character, which is a character with all bits equal to zero.
-Otherwise, @sc{nul} is just another character. Other versions of @command{awk}
-may not be able to match the @sc{nul} character.
+Otherwise, @value{NUL} is just another character. Other versions of @command{awk}
+may not be able to match the @value{NUL} character.
@cindex @code{[]} (square brackets), regexp operator
@cindex square brackets (@code{[]}), regexp operator
@@ -6208,7 +6210,7 @@ a value that you know doesn't occur in the input file. This is hard
to do in a general way, such that a program always works for arbitrary
input files.
-You might think that for text files, the @sc{nul} character, which
+You might think that for text files, the @value{NUL} character, which
consists of a character with all bits equal to zero, is a good
value to use for @code{RS} in this case:
@@ -6217,23 +6219,23 @@ BEGIN @{ RS = "\0" @} # whole file becomes one record?
@end example
@cindex differences in @command{awk} and @command{gawk}, strings, storing
-@command{gawk} in fact accepts this, and uses the @sc{nul}
+@command{gawk} in fact accepts this, and uses the @value{NUL}
character for the record separator.
This works for certain special files, such as @file{/proc/environ} on
-GNU/Linux systems, where the @sc{nul} character is in fact the record separator.
+GNU/Linux systems, where the @value{NUL} character is in fact the record separator.
However, this usage is @emph{not} portable
to most other @command{awk} implementations.
@cindex dark corner, strings, storing
Almost all other @command{awk} implementations@footnote{At least that we know
about.} store strings internally as C-style strings. C strings use the
-@sc{nul} character as the string terminator. In effect, this means that
+@value{NUL} character as the string terminator. In effect, this means that
@samp{RS = "\0"} is the same as @samp{RS = ""}.
@value{DARKCORNER}
-It happens that recent versions of @command{mawk} can use the @sc{nul}
+It happens that recent versions of @command{mawk} can use the @value{NUL}
character as a record separator. However, this is a special case:
-@command{mawk} does not allow embedded @sc{nul} characters in strings.
+@command{mawk} does not allow embedded @value{NUL} characters in strings.
@cindex records, treating files as
@cindex treating files, as single records
@@ -9927,7 +9929,7 @@ double-quotation marks. For example:
@cindex strings, length limitations
represents the string whose contents are @samp{parrot}. Strings in
@command{gawk} can be of any length, and they can contain any of the possible
-eight-bit ASCII characters including ASCII @sc{nul} (character code zero).
+eight-bit ASCII characters including ASCII @value{NUL} (character code zero).
Other @command{awk}
implementations may have difficulty with some character codes.
@@ -13462,9 +13464,8 @@ the beginning, in the following manner:
@example
NF != 4 @{
- err = sprintf("%s:%d: skipped: NF != 4\n", FILENAME, FNR)
- print err > "/dev/stderr"
- next
+ printf("%s:%d: skipped: NF != 4\n", FILENAME, FNR) > "/dev/stderr"
+ next
@}
@end example
@@ -19659,8 +19660,8 @@ function mystrtonum(str, ret, n, i, k, c)
ret = 0
for (i = 1; i <= n; i++) @{
c = substr(str, i, 1)
- # index() returns 0 if c not in string,
- # includes c == "0"
+ # index() returns 0 if c not in string,
+ # includes c == "0"
k = index("1234567", c)
ret = ret * 8 + k
@@ -19673,8 +19674,8 @@ function mystrtonum(str, ret, n, i, k, c)
for (i = 1; i <= n; i++) @{
c = substr(str, i, 1)
c = tolower(c)
- # index() returns 0 if c not in string,
- # includes c == "0"
+ # index() returns 0 if c not in string,
+ # includes c == "0"
k = index("123456789abcdef", c)
ret = ret * 16 + k
@@ -30398,7 +30399,7 @@ and is managed by @command{gawk} from then on.
The API defines several simple @code{struct}s that map values as seen
from @command{awk}. A value can be a @code{double}, a string, or an
array (as in multidimensional arrays, or when creating a new array).
-String values maintain both pointer and length since embedded @sc{nul}
+String values maintain both pointer and length since embedded @value{NUL}
characters are allowed.
@quotation NOTE
@@ -30530,7 +30531,7 @@ Scalar values in @command{awk} are either numbers or strings. The
indicates what is in the @code{union}.
Representing numbers is easy---the API uses a C @code{double}. Strings
-require more work. Since @command{gawk} allows embedded @sc{nul} bytes
+require more work. Since @command{gawk} allows embedded @value{NUL} bytes
in string values, a string must be represented as a pair containing a
data-pointer and length. This is the @code{awk_string_t} type.
@@ -37776,7 +37777,7 @@ the derived files, because that keeps the repository less cluttered,
and it is easier to see the substantive changes when comparing versions
and trying to understand what changed between commits.
-However, there are two reasons why the @command{gawk} maintainer
+However, there are several reasons why the @command{gawk} maintainer
likes to have everything in the repository.
First, because it is then easy to reproduce any given version completely,
@@ -37845,6 +37846,14 @@ the maintainer is no different than Jane User who wants to try to build
Thus, the maintainer thinks that it's not just important, but critical,
that for any given branch, the above incantation @emph{just works}.
+@c Added 9/2014:
+A third reason to have all the files is that without them, using @samp{git
+bisect} to try to find the commit that introduced a bug is exceedingly
+difficult. The maintainer tried to do that on another project that
+requires running bootstrapping scripts just to create @command{configure}
+and so on; it was really painful. When the repository is self-contained,
+using @command{git bisect} in it is very easy.
+
@c So - that's my reasoning and philosophy.
What are some of the consequences and/or actions to take?