author | Kaz Kylheku <kaz@kylheku.com> | 2016-09-25 10:40:51 -0700 |
---|---|---|
committer | Kaz Kylheku <kaz@kylheku.com> | 2016-09-25 10:40:51 -0700 |
commit | b0bbe6e9dfd169f78b4908296d6edba52ed9a707 (patch) | |
tree | 9fe148cacf3e5252928b098fe33c41af6e9cf614 /txr.1 | |
parent | 7656e99c9e1ffb509a6310cadca26c4c1c7008c9 (diff) | |
awk macro: proper fs semantics in paragraph mode.
* share/txr/stdlib/awk.tl (sys:awk-state): New
slots: par-mode, par-mode-fs, par-mode-prev-fs.
(sys:awk-state rec-to-f): In paragraph mode,
detect that fs has changed since the last call.
In that case, take the user's fs and add to it
a newline match. If it is a regex, take the source,
add the syntax and recompile the regex. If it's
a string, build regex around it and compile.
(sys:awk-state loop): Maintain the par-mode-t
variable in the state structure as the rs
value triggers transitions into or out of
paragraph mode.
* txr.1: Updated documentation for rs.
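
The fs handling described above can be illustrated with a small Python sketch. This is not the TXR Lisp code from awk.tl: the function name `paragraph_mode_fs` is hypothetical, and using an alternation with a newline is an assumption about what "add the syntax" amounts to. It shows only the idea of taking a regex's source or a fixed string and compiling a separator that also matches a newline.

```python
import re


def paragraph_mode_fs(fs):
    """Return a compiled regex matching the user's fs or a newline.

    Hypothetical helper for illustration; the alternation form is an
    assumption, not taken from awk.tl.
    """
    if isinstance(fs, re.Pattern):
        # Regex case: take the source text and keep the original flags.
        source, flags = fs.pattern, fs.flags
    else:
        # Fixed-string case: escape it so it is matched literally.
        source, flags = re.escape(fs), 0
    # Recompile with a newline alternative added.
    return re.compile(f"(?:{source})|\n", flags)


if __name__ == "__main__":
    print(paragraph_mode_fs(":").split("a:b\nc"))
    print(paragraph_mode_fs(re.compile(r",\s*")).split("a, b\nc,d"))
```

A real implementation would also cache the compiled result and rebuild it only when fs changes, as the commit message describes; the sketch omits that step.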
Diffstat (limited to 'txr.1')
-rw-r--r-- | txr.1 | 29 |
1 files changed, 12 insertions, 17 deletions
@@ -38597,27 +38597,22 @@
 or more blank lines (empty lines or lines containing only a mixture of
 tabs and spaces). This means that, effectively, the record-separating
 sequences match the regular expression
 .codn "/\en[ \en\et]*\en/" .
-There is a difference between paragraph mode and simply using the above
+
+There are two differences between paragraph mode and simply using the above
 regular expression as
 .codn rs .
-The difference is that if the first record which is read upon entering
+The first difference is that if the first record which is read upon entering
 paragraph mode is empty (because the input begins with a match for the
-separator regex), then that record is thrown away, and the next record
-is read.
-
-Note that the POSIX Awk paragraph mode (which occurs when
-.code RS
-is blank) there is an additional difference: regardless of the value
-of the field separator
-.codn FS ,
-newline characters separate fields. This behavior is not implemented
-in the
-.code awk
-macro. Since newlines are included as separators in under the default field
-separation, the behaviors match in that case. Code using a custom
+separator regex), then that record is thrown away, and the next record is read.
+The second difference is that, if field separation based on the
+.code fs
+variable is in effect, then regardless of the value of
+.codn fs ,
+newline characters separate fields. Therefore, the programmer-defined
 .code fs
-must explicitly include a match for newline to obtain that as a field
-separator.
+doesn't have to include a match for newline. Moreover, if it is a simple
+fixed string, it need not be converted to a regular expression which also
+matches a newline.
 .coNP Variable @ krs
 .desc
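
The two paragraph-mode differences documented in this hunk can be sketched in Python. This is a minimal model, not the awk macro itself: the function name `paragraph_records` and the sample data are hypothetical, and only a fixed-string fs is handled. The record separator is the regex quoted in the man page text, written without the troff escapes.

```python
import re

# Record separator from the documentation: a newline, optional blank
# content, and another newline, i.e. one or more blank lines.
RECORD_SEP = re.compile(r"\n[ \n\t]*\n")


def paragraph_records(text, fs=":"):
    """Split text into paragraph-mode records, then into fields.

    Hypothetical sketch: fs is treated as a fixed string only.
    """
    records = RECORD_SEP.split(text)
    # First difference: an empty first record, caused by input that
    # begins with a separator match, is thrown away.
    if records and records[0] == "":
        records = records[1:]
    # Second difference: a newline separates fields regardless of fs.
    field_sep = re.compile(f"(?:{re.escape(fs)})|\n")
    return [field_sep.split(record) for record in records]


if __name__ == "__main__":
    sample = "\n\nalice:30\nparis\n\nbob:25\nrome"
    for fields in paragraph_records(sample):
        print(fields)
```

With the sample input, each paragraph becomes one record of three fields even though fs is only a colon, because the newline inside each paragraph also acts as a field separator.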