author Kaz Kylheku <kaz@kylheku.com> 2016-10-26 20:19:42 -0700
committer Kaz Kylheku <kaz@kylheku.com> 2016-10-26 20:19:42 -0700
commit e0dbcc3a6455d990c0a0ecde74e279e8f3b53843
tree 835afaf66a49e1e9b0183f13705d83be76c7b07a /tests
parent 88268ee75421084cc412d26250beb7483f49c1b3
Fix tok-str semantics once again.
The problem is that when the regular expression is capable of matching empty strings, tok-str will extract an empty token immediately following a non-empty token. For instance, (tok-str "a,b" /[^,]*/) extracts ("a" "" "b") instead of just ("a" "b"). This is poor behavior, and the way to fix it is to impose a rule that an empty token must not be extracted immediately at the ending position of a previous token. Only a non-empty token can be consecutive to a token.

* lib.c (tok_str): Rewrite the logic of the loop, using the prev_empty flag to suppress empty tokens which immediately follow non-empty tokens. The addition of 1 to the position when the token is empty, to skip a character, is done at the bottom of the loop, and a new last_end variable keeps track of the end position of the last extracted token for the purposes of extracting the keep-between area if keep_sep is true. The old loop is preserved intact and enabled by compatibility.

* tests/015/split.tl: Multiple empty-regex test cases for tok-str updated.

* txr.1: Updated tok-str documentation and also added a note about the conditions under which split-str and tok-str, invoked with keep-sep true, produce equivalent output. Added compatibility notes.
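The rule described above can be illustrated with a small sketch. This is not the code from lib.c, which is not part of this diff: the function name tok_str_sketch is hypothetical, Python's re module stands in for the TXR regex engine, and keep-sep handling is left out. It assumes only what the message states: an empty match is dropped when it starts exactly where the previous token ended, and an empty match advances the position by one character.

```python
import re

def tok_str_sketch(text, pattern):
    """Collect tokens matching pattern, skipping any empty match that
    begins exactly where the previously extracted token ended.

    Hypothetical illustration only; keep-sep is not modeled.
    """
    regex = re.compile(pattern)
    tokens = []
    pos = 0
    last_end = None  # end position of the last extracted token, if any
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        start, end = m.span()
        is_empty = (start == end)
        # Suppress an empty token sitting right at the end of the
        # previous token; every other match is extracted.
        if not (is_empty and start == last_end):
            tokens.append(text[start:end])
            last_end = end
        # An empty match consumes nothing, so step over one character
        # at the bottom of the loop to guarantee progress.
        pos = end + 1 if is_empty else end
    return tokens

print(tok_str_sketch("a,b", r"[^,]*"))
```

Under these assumptions the "a,b" example yields two tokens, because the empty matches at the comma and at the end of the string both start where a token has just ended.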
Diffstat (limited to 'tests')
-rw-r--r-- tests/015/split.tl 16
1 files changed, 8 insertions, 8 deletions
diff --git a/tests/015/split.tl b/tests/015/split.tl
index 30a8e01c..ae77a642 100644
--- a/tests/015/split.tl
+++ b/tests/015/split.tl
@@ -123,34 +123,34 @@
(split-str "abcacabcac" #/ab?/ t) ("" "ab" "c" "a" "c" "ab" "c" "a" "c"))
(mtest
- (tok-str "" #//) nil
- (tok-str "a" #//) nil
+ (tok-str "" #//) ("")
+ (tok-str "a" #//) ("" "")
(tok-str "" #/a/) nil
(tok-str "a" #/a/) ("a"))
(mtest
- (tok-str "" #// t) ("")
- (tok-str "a" #// t) ("a")
+ (tok-str "" #// t) ("" "" "")
+ (tok-str "a" #// t) ("" "" "a" "" "")
(tok-str "" #/a/ t) ("")
(tok-str "a" #/a/ t) ("" "a" ""))
(mtest
- (tok-str "ab" #//) ("")
+ (tok-str "ab" #//) ("" "" "")
(tok-str "ab" #/a/) ("a")
(tok-str "ab" #/b/) ("b")
(tok-str "ab" #/ab/) ("ab")
(tok-str "ab" #/abc/) nil)
(mtest
- (tok-str "ab" #// t) ("a" "" "b")
+ (tok-str "ab" #// t) ("" "" "a" "" "b" "" "")
(tok-str "ab" #/a/ t) ("" "a" "b")
(tok-str "ab" #/b/ t) ("a" "b" "")
(tok-str "ab" #/ab/ t) ("" "ab" "")
(tok-str "ab" #/abc/ t) ("ab"))
(mtest
- (tok-str "abc" #//) ("" "")
- (tok-str "abc" #// t) ("a" "" "b" "" "c"))
+ (tok-str "abc" #//) ("" "" "" "")
+ (tok-str "abc" #// t) ("" "" "a" "" "b" "" "c" "" ""))
(mtest
(tok-str "abc" #/a/) ("a")