path: root/regex.c
Commit message | Author | Age | Files | Lines
* cobj: rename poorly named default operation. | Kaz Kylheku | 2017-05-15 | 1 | -2/+2
  Renaming cobj_hash_op to cobj_eq_hash_op. This function is only appropriate to use with COBJ objects which use eq as their equal function. I've spotted one instance of an inappropriate use which has to be addressed by a different commit: the equal function is other than eq, but cobj_hash_op is used for the equal hash.
  * lib.h (cobj_hash_op): Declaration renamed to cobj_eq_hash_op.
  * hash.c (cobj_hash_op): Renamed to cobj_eq_hash_op. (hash_iter_ops): Refer to renamed cobj_eq_hash_op.
  * ffi.c (ffi_type_builtin_ops, ffi_type_struct_ops, ffi_type_ptr_ops, ffi-closure_ops, ffi_call_desc_ops): Likewise.
  * lib.c (cptr_ops): Likewise.
  * parser.c (parser_ops): Likewise.
  * rand.c (random_state_ops): Likewise.
  * regex.c (char_set_ops, regex_obj_ops): Likewise.
  * socket.c (dgram_strm_ops): Likewise.
  * stream.c (null_ops, stdio_ops, tail_ops, pipe_ops, dir_ops, string_in_ops, byte_in_ops, strlist_in_ops, string_out_ops, strlist_out_ops, cat_stream_ops, record_adapter_ops): Likewise.
  * struct.c (struct_type_ops): Likewise.
  * sysif.c (cptr_dl_ops): Likewise.
  * syslog.c (syslog_strm_ops): Likewise.
  * unwind.c (cont_ops): Likewise.
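The concern behind the rename is that a hash function has to agree with the equality function it is paired with. The following is a minimal sketch of that mismatch in plain C; `struct box` and the three functions are hypothetical illustrations, not TXR's COBJ machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical object with content that can be compared. */
struct box { char text[16]; };

/* Identity (eq-style) hash: derived from the object's address alone.
   Only valid when equality is also identity. */
size_t box_eq_hash(const struct box *b)
{
  return (size_t) (uintptr_t) b;
}

/* Content-based equality: two distinct objects can be equal. */
int box_equal(const struct box *a, const struct box *b)
{
  assert(a != NULL && b != NULL);
  return strcmp(a->text, b->text) == 0;
}

/* Content-based hash that agrees with box_equal: equal objects
   always produce the same value. */
size_t box_equal_hash(const struct box *b)
{
  size_t h = 5381;
  const char *p;

  assert(b != NULL);
  for (p = b->text; *p != '\0'; p++)
    h = h * 33 + (unsigned char) *p;
  return h;
}
```

Two boxes holding the same text compare equal under `box_equal` yet hash differently under `box_eq_hash`, so a table keyed on content equality would fail to find one when given the other. That is the kind of inappropriate pairing the commit message mentions.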
* Rename badly named default_bool_arg | Kaz Kylheku | 2017-03-17 | 1 | -3/+3
  * lib.h (default_bool_arg): Inline function renamed to default_null_arg.
  * eval.c (if_fun, pad, ginterate, giterate, range_star, range, constantp, macroexpand_1, macro_form_p, expand_with_free_refs, do_expand, eval_intrinsic, func_get_name, make_env_intrinsic): Follow rename.
  * arith.c (lognot): Likewise.
  * gc.c (gc_finalize): Likewise.
  * glob.c (glob_wrap): Likewise.
  * hash.c (group_reduce, gethash_n): Likewise.
  * lib.c (print, multi_sort, lazy_str, vector, iff, tok_str, split_str_keep, search_str, remove_if, val): Likewise.
  * match.c (match_fun): Likewise.
  * parser.c (lisp_parse_impl, regex_parse): Likewise.
  * rand.c (make_random_state): Likewise.
  * regex.c (read_until_match, search_regex, regex_compile): Likewise.
  * socket.c (sock_accept, sock_connect): Likewise.
  * stream.c (open_files_star, open_files, run, open_process, open_tail, get_string, record_adapter): Likewise.
  * struct.c (static_slot_ensure, static_slot_ens_rec, clear_struct, make_struct_type): Likewise.
  * sysif.c (exec_wrap, errno_wrap, cobj_ops_init): Likewise.
  * unwind.c (uw_capture_cont, uw_find_frames_impl): Likewise.
* Fix missing nao terminator in formatted printing. | Kaz Kylheku | 2017-03-13 | 1 | -1/+1
  * arith.c (trunc1, trunc, floorf, ceili): Add missing nao terminator to uw_throwf calls.
  * debug.c (debug): Missing nao terminator in format call.
  * eval.c (expand_opt_params_rec, me_equot): Missing nao terminator in eval_error call.
  * lib.c (use_sym): Missing nao in uw_throw call.
  * regex.c (reg_derivative): Missing nao in uw_throwf.
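A variadic C function cannot tell where its argument list ends, so a sentinel-terminated call that omits the sentinel reads past the real arguments. The sketch below shows the general pattern only; `total_length` and `END_ARGS` are hypothetical names standing in for TXR's variadic functions and its `nao` terminator.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical sentinel marking the end of the argument list.
   It plays the role that nao plays in the calls fixed above. */
#define END_ARGS ((const char *) 0)

/* Sums the lengths of all string arguments up to the END_ARGS sentinel.
   Omitting the sentinel at a call site is undefined behavior: va_arg
   keeps reading beyond the arguments that were actually passed. */
size_t total_length(const char *first, ...)
{
  va_list ap;
  size_t total = 0;
  const char *s;

  va_start(ap, first);
  for (s = first; s != END_ARGS; s = va_arg(ap, const char *)) {
    const char *p = s;
    while (*p++ != '\0')
      total++;
  }
  va_end(ap);
  return total;
}
```

A correct call looks like `total_length("ab", "cde", END_ARGS)`. The compiler cannot diagnose a call that leaves the sentinel off, which is presumably why such omissions had to be found and fixed by hand.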
* Bump copyright year to 2017. | Kaz Kylheku | 2017-01-23 | 1 | -1/+1
  * LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl: Add 2017 to all copyright headers and strings.
* Fix inconsistency in regex-source. | Kaz Kylheku | 2016-12-25 | 1 | -2/+8
  If we compile the regex expression (compound "str*"), calling regex-source on the compiled regex object yields "str*". That, of course, is treated as regex character syntax if fed back to regex-compile, and the * becomes an operator. We want the source to be (compound "str*"). This happens because the AST optimizer reduces (compound X) -> X.
  * regex.c (regex_compile): If the optimized expression is just a character string atom S, then for the purposes of maintaining the source code, convert it to (compound S).
* Adding functions fr^$, fr^, fr$ and frr. | Kaz Kylheku | 2016-12-01 | 1 | -0/+28
  * regex.c (regex_range_full_fun, regex_range_left_fun, regex_range_right_fun, regex_range_search_fun): New functions. (regex_init): Register fr^$, fr^, fr$ and frr intrinsics.
  * txr.1: Documented.
* Add stream printing context. | Kaz Kylheku | 2016-10-20 | 1 | -2/+4
  This is some infrastructure which will support *print-circle*.
  * lib.h (struct strm_ctx): Forward declared. (struct cobj_ops): Add context parameter to print function pointer. (cobj_print_op, obj_print_impl): Add context parameter to declarations.
  * hash.c (hash_print_op): Take context argument and pass it down in obj_print_impl calls.
  * lib.c (cobj_print_op, out_quasi_str): Likewise. (obj_print_impl): Likewise, and also pass to COBJ print method. (obj_print, obj_pprint): Pass null pointer as context argument to obj_print_impl.
  * regex.c (regex_print): Take context parameter and ignore it.
  * socket.c (dgram_print): Likewise.
  * stream.h (struct strm_ctx): New struct type. (struct strm_base): New ctx member, pointer to struct strm_ctx. (stream_print_op): Add context parameter to declaration. (get_set_ctx, get_ctx): Declared.
  * stream.c (strm_base_init): Add null pointer to initializer. (strm_base_cleanup): Add assertion against context pointer being non-null: that indicates that some stream operation installed a context pointer and neglected to restore it to null before returning, which is bad because context will be stack allocated. (stream_print_op, stdio_stream_print, cat_stream_print): Take context parameter and ignore it. (get_set_ctx, get_ctx): New functions.
  * struct.c (struct_type_print): Take context parameter and ignore it. (struct_inst_print): Take context parameter and pass down to obj_print_impl.
* Support n-ary and and or operators in regex. | Kaz Kylheku | 2016-10-10 | 1 | -1/+63
  Since much regex code assumes these are binary, the easiest and briefest approach is to implement a code transformation pass which rewrites n-ary forms into binary.
  * regex.c (reg_nary_unfold, reg_nary_to_bin): New functions. (regex_compile): Put raw sexp through reg_nary_to_bin to expand the nary syntax.
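The rewrite from n-ary to binary forms can be illustrated on printed expressions. This is only a sketch: it works on strings rather than on TXR's list structure, the function names are hypothetical, and right-nesting is an assumption, since the commit message does not say which way the real pass nests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends s to buf, tracking the used length. Returns 0 on success,
   -1 if s plus the terminating null does not fit. */
static int append(char *buf, size_t bufsz, size_t *used, const char *s)
{
  size_t len = strlen(s);

  if (len >= bufsz - *used)
    return -1;
  memcpy(buf + *used, s, len + 1);
  *used += len;
  return 0;
}

/* Writes the right-nested binary form of (op args[0] ... args[n-1])
   into buf. A single argument is emitted as-is. Returns 0 on success,
   -1 if n is zero or the result does not fit. */
int nary_to_bin(const char *op, const char *const *args, size_t n,
                char *buf, size_t bufsz)
{
  size_t used = 0, i;

  assert(op != NULL && args != NULL && buf != NULL);
  if (n == 0 || bufsz == 0)
    return -1;
  buf[0] = '\0';

  /* Open one binary form for every argument except the last. */
  for (i = 0; i + 1 < n; i++)
    if (append(buf, bufsz, &used, "(") || append(buf, bufsz, &used, op) ||
        append(buf, bufsz, &used, " ") ||
        append(buf, bufsz, &used, args[i]) ||
        append(buf, bufsz, &used, " "))
      return -1;

  /* Emit the last argument, then close every form that was opened. */
  if (append(buf, bufsz, &used, args[n - 1]))
    return -1;
  for (i = 0; i + 1 < n; i++)
    if (append(buf, bufsz, &used, ")"))
      return -1;
  return 0;
}
```

Given the operator `or` and the arguments `a`, `b`, `c`, this produces `(or a (or b c))`, so downstream code that only understands two-argument forms needs no changes.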
* Simplify some regex tree walking code. | Kaz Kylheku | 2016-10-10 | 1 | -18/+10
  * regex.c (reg_expand_nongreedy, reg_compile_csets): Generalize the compound_s case slightly by referring to sym rather than hard-coded compound_s. Then handle most of the regex operators under this same case. Their semantics are not relevant to the expansions being performed in these functions: all their arguments are regexes to be recursed over.
* New function rra. | Kaz Kylheku | 2016-10-03 | 1 | -0/+50
  * regex.c (range_regex_all, regex_range_all): New functions. (regex_init): Register rra intrinsic function.
  * regex.h (range_regex_all, regex_range_all): Declared.
  * txr.1: Documented rra.
* New rr function. | Kaz Kylheku | 2016-10-03 | 1 | -0/+12
  * regex.c (regex_range_search): New function. (regex_init): Register regex_range_search as rr intrinsic.
  * regex.h (regex_range_search): Declared.
  * txr.1: Documented rr, and added reference to it in description of regex-range.
* search-regex improvement: negative start and more. | Kaz Kylheku | 2016-10-03 | 1 | -40/+52
  * regex.c (search_regex): Handle negative starting positions according to the convention elsewhere and fail excessively negative ones. Consistently fail on starting positions exceeding the length of the string. Handle zero length matches by reporting them against the start position or position one past the last character, based on the value of from-end.
  * txr.1: search-regex documentation updated.
* Synchronize license comments with LICENSE. | Kaz Kylheku | 2016-10-01 | 1 | -16/+17
  * Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Revert to verbatim 2-Clause BSD.
* Flurry of regex bugfixes. | Kaz Kylheku | 2016-09-25 | 1 | -11/+27
  * regex.c (match_regex): Bail if pos is too positive, beyond length of string. (match_regex_right): Include the pos == end case in the iteration, so we can match an empty suffix of the string. The inner loop guard takes care of not feeding any characters from the string into the regex machine in this case; we just feed the terminating zero to get the final state. (match_regst): Normalize a negative pos, otherwise the sub_str calculation will be junk, since match_regex returns a normalized position. After normalizing, check that if the position is still negative, the match must fail. (match_regst_right_old, match_regst_right): Use zero rather than t as the range end in sub_str. That way if len is zero and neg(len) produces zero, an empty string will be sliced out. For negative values, the zero serves as one position beyond the last char, just like t. (do_match_full_offs, regex_match_full, regex_range_full, regex_range_left): Fail match if normalized starting pos is negative. (regex_range_right): Fix completely bogus calculation of the returned range in the case when the end position defaults to the string length.
* regex.c: code formatting. | Kaz Kylheku | 2016-09-25 | 1 | -1/+1
  * regex.c (puts_clear_flag): Fix bad indentation.
* New function: regex-source. | Kaz Kylheku | 2016-09-25 | 1 | -0/+7
  * regex.c (regex_source): New function. (regex_init): regex-source intrinsic registered.
  * regex.h (regex_source): Declared.
  * txr.1: Documented.
* Bugfix in regex printing: & operator. | Kaz Kylheku | 2016-09-25 | 1 | -1/+1
  * regex.c (print_rec): Fix checking arg1 for consp but accessing arg2.
* New regex functions: m^$, m^, m$, and others. | Kaz Kylheku | 2016-09-23 | 1 | -0/+130
  * regex.c (do_match_full, do_match_full_offs, do_match_left, do_match_left_offs, do_match_right, do_match_right_offs): New static functions. (regex_match_full_fun, regex_match_right_fun, regex_match_full, regex_match_left, regex_match_right, regex_range_full, regex_range_left, regex_range_right): New functions. (regex_init): Register f^$, f^, f$, m^$, m^, m$, r^$, r^ and r$ intrinsics.
  * regex.h (regex_match_full_fun, regex_match_right_fun, regex_match_full, regex_match_left, regex_match_right, regex_range_full, regex_range_left, regex_range_right): Declared.
  * txr.1: Documented new functions.
* Semantics change in match-regex-right. | Kaz Kylheku | 2016-09-22 | 1 | -3/+59
  The way the end-position argument works in match-regex-right and match-regst-right is poorly considered. It basically enforces a constraint that there is a match which ends at that position and does not go beyond. This patch changes it to work right: the functions test that the regex matches up to that position, as if the string ended there.
  * regex.c (match_regex_right_old): New static function, identical to the previous match_regex_right. Since we won't ever be using this inside TXR from any other module, we don't make it external. (match_regex_right): Rewritten to new semantics. (match_regst_right_old): New static function; provides the semantics of the old match_regst_right based on match_regex_right_old. (regex_init): Register match-regex-right and match-regst-right intrinsics to the match_regex_right_old and match_regst_right_old functions if compatibility <= 150 is requested. Otherwise they go to the rewritten new functions.
  * txr.1: Documentation updated, and compat notes added.
* Fix match-regex not conforming to documentation. | Kaz Kylheku | 2016-09-22 | 1 | -3/+15
  The documentation says that match-regex returns the length. Actually, it returns the position after the last character matched. This makes a difference when the match doesn't begin at character zero. The actual behavior is that of the match_regex C function which has behaved that way since the dawn of TXR, and internals depend on it behaving that way. So the internal function is being retained, and a new function is being registered as the match-regex intrinsic. The choice of binding for match-regex is subject to the compatibility option. The behavior of match-regst is also being fixed since its return value is incorrect due to this issue. Since its return value makes no sense at all (does not represent the matched text), it is not subject to the compatibility option; it is just fixed to conform with the documentation.
  * regex.c (match_regex_len): New function. (match_regst): Keep using match_regex, but use its return value properly. This simplifies the range extraction code, which is why match_regex works that way in the first place. (regex_init): Register match-regex to match_regex_len, unless compatibility <= 150 is requested; then register to match_regex.
  * regex.h (match_regex_len): Declared.
  * txr.1: Compatibility notes added.
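The difference between "position after the last matched character" and "length of the match" only shows up for a nonzero start position. A minimal sketch, using a literal-prefix test as a stand-in for a real regex match; `match_end` and `match_len` are hypothetical names, not the `match_regex` and `match_regex_len` functions in regex.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a regex match: tests whether the literal
   pat occurs in str starting at pos. Returns the position one past the
   last matched character, or -1 on failure. */
long match_end(const char *str, long pos, const char *pat)
{
  size_t len, plen;

  assert(str != NULL && pat != NULL);
  len = strlen(str);
  plen = strlen(pat);

  if (pos < 0 || (size_t) pos > len || plen > len - (size_t) pos)
    return -1;
  if (strncmp(str + pos, pat, plen) != 0)
    return -1;
  return pos + (long) plen;
}

/* Length-returning wrapper: the end position minus the start. */
long match_len(const char *str, long pos, const char *pat)
{
  long end = match_end(str, pos, pat);
  return end < 0 ? -1 : end - pos;
}
```

Matching `cd` in `abcdef` from position 2 gives an end position of 4 but a length of 2. From position 0 the two values coincide, which may be why the discrepancy with the documentation went unnoticed. The end position is also what a caller needs to slice out the matched range directly.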
* Support functional argument in regsub. | Kaz Kylheku | 2016-09-22 | 1 | -23/+41
  * regex.c (regsub): Allow the second argument to be a function, which is called with str as an argument, and returns a range which indicates what part of the string is to be replaced, or else nil.
  * txr.1: Documented functional argument of regsub.
* Support negative positions in regex matching funs. | Kaz Kylheku | 2016-09-21 | 1 | -1/+9
  * regex.c (match_regex, match_regex_right): Detect a negative start or end position, respectively, and add the string length to it. If it is still negative, bail with nil.
  * txr.1: Documented.
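The normalization rule described here is small enough to sketch directly. `normalize_pos` is a hypothetical helper, and returning -1 stands in for bailing with nil.

```c
#include <assert.h>

/* Normalizes a possibly negative position against a string of the
   given length: -1 denotes the last character, -length the first.
   Returns the non-negative position, or -1 if the position is still
   negative after the length has been added. */
long normalize_pos(long pos, long length)
{
  assert(length >= 0);
  if (pos < 0)
    pos += length;
  return pos < 0 ? -1 : pos;
}
```

For a five-character string, -1 maps to 4 and -5 maps to 0, while -6 is rejected. Non-negative positions pass through unchanged; whether they exceed the string length is a separate check.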
* Move regex intrinsic registrations to regex.c. | Kaz Kylheku | 2016-09-21 | 1 | -0/+14
  * eval.c (eval_init): Remove all regex-related function registrations from here.
  * regex.c (regex_init): Move regex-related function registrations here.
* regex: optimize double complement. | Kaz Kylheku | 2016-09-16 | 1 | -40/+46
  * regex.c (reg_optimize): Implement ~~R -> R reduction.
* regex: add case to complement optimization. | Kaz Kylheku | 2016-09-15 | 1 | -0/+2
  * regex.c (reg_optimize): Based on the reasoning in the previous commit, we can also statically optimize a complement whose argument is the t regex: match nothing. We convert that to match everything: the .* regex. Now (regex-compile "~[]") -> #/.*/.
* regex: fix broken complement operator. | Kaz Kylheku | 2016-09-15 | 1 | -1/+3
  The form (match-regex "xy" #/~ab/) should return 2 (full match) because "xy" is in the complement of the set { "ab" }. It wrongly returns 1.
  * regex.c (reg_derivative): Handle the case when the derivative of the complement's constituent expression yields nil. This means that the complemented regex matches the input. In this case, the complement must lapse to the .+ regex: match one or more characters. That is to say, if the input has at least one more character, there is a match, which covers all such characters. Otherwise there is no match: the input matches the complemented regex. In the t case, the return value is also wrong. If the complemented regex hits a brick wall (matches nothing, not even the empty string), the correct complement is "match everything": the .* regex. Not the match empty string regex!
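The expected result for `~ab` against `xy` can be checked with a toy derivative-based matcher. This is a textbook Brzozowski-style sketch, not the code in `reg_derivative`: it keeps the complement structurally instead of lapsing to `.+` or `.*`, performs no simplification, and all names and the fixed-size node pool are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Regex node kinds for a toy derivative-based matcher. */
enum kind { NOTHING, EMPTY, CHR, SEQ, OR, STAR, COMPL };

struct re {
  enum kind kind;
  char ch;                 /* used by CHR only */
  const struct re *a, *b;  /* operands, where applicable */
};

/* Fixed-size bump allocator: keeps the sketch free of malloc/free. */
static struct re pool[4096];
static size_t pool_used;

static const struct re *mk(enum kind k, char ch,
                           const struct re *a, const struct re *b)
{
  struct re *r;

  assert(pool_used < sizeof pool / sizeof pool[0]);
  r = &pool[pool_used++];
  r->kind = k; r->ch = ch; r->a = a; r->b = b;
  return r;
}

/* Does r match the empty string? */
static int nullable(const struct re *r)
{
  switch (r->kind) {
  case EMPTY: case STAR: return 1;
  case SEQ: return nullable(r->a) && nullable(r->b);
  case OR: return nullable(r->a) || nullable(r->b);
  case COMPL: return !nullable(r->a);
  default: return 0;  /* NOTHING, CHR */
  }
}

/* Derivative of r with respect to character c: the regex matching
   whatever may follow c in a string matched by r. */
static const struct re *deriv(const struct re *r, char c)
{
  switch (r->kind) {
  case CHR:
    return mk(r->ch == c ? EMPTY : NOTHING, 0, NULL, NULL);
  case SEQ: {
    const struct re *d = mk(SEQ, 0, deriv(r->a, c), r->b);
    return nullable(r->a) ? mk(OR, 0, d, deriv(r->b, c)) : d;
  }
  case OR:
    return mk(OR, 0, deriv(r->a, c), deriv(r->b, c));
  case STAR:
    return mk(SEQ, 0, deriv(r->a, c), r);
  case COMPL:
    /* The derivative of a complement is the complement of the
       derivative. */
    return mk(COMPL, 0, deriv(r->a, c), NULL);
  default:
    return mk(NOTHING, 0, NULL, NULL);  /* NOTHING, EMPTY */
  }
}

/* Length of the longest prefix of s matched by r, or -1 if none. */
long match_prefix(const struct re *r, const char *s)
{
  long i, best = nullable(r) ? 0 : -1;

  for (i = 0; s[i] != '\0'; i++) {
    r = deriv(r, s[i]);
    if (nullable(r))
      best = i + 1;
  }
  return best;
}

/* Builds the regex ~ab: the complement of the literal "ab". */
const struct re *compl_ab(void)
{
  const struct re *ab = mk(SEQ, 0, mk(CHR, 'a', NULL, NULL),
                           mk(CHR, 'b', NULL, NULL));
  return mk(COMPL, 0, ab, NULL);
}
```

With this sketch, `match_prefix(compl_ab(), "xy")` is 2, the value the commit message says the form should return. For the input `ab` the longest match is 1, because `a` is in the complement while `ab` itself is not.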
* NFA regex optimization: use just one set array. | Kaz Kylheku | 2016-07-19 | 1 | -48/+31
  We don't have to flip between two arrays, since the nfa_closure and nfa_move_closure can write the output set into the same array.
  * regex.c (struct nfa_machine): Replace flip and flop members with a single set. (nfa_closure, nfa_move_closure): out array parameter removed; in renamed to set. References to in and out simply replaced with set. (nfa_run): Allocate one set instead of two, plus the stack. Remove code to swap the two pointers on each iteration. (regex_machine_reset): Prepare initial closure in the one and only set array. (regex_machine_init): Allocate set array, rather than flip and flop. (regex_machine_cleanup): Free set array and null out pointer rather than flip and flop arrays. (regex_machine_feed): Pass just the set to the nfa_move_closure function. Remove flip/flop pointer swapping.
* NFA regex optimization: combine move and closure. | Kaz Kylheku | 2016-07-19 | 1 | -37/+90
  * regex.c (struct nfa_machine_t): Remove move and clos array pointers, replace with flip and flop. Remove nmove member. (nfa_move): Static function removed. (nfa_move_closure): New static function, based on nfa_move and logic from nfa_closure. (nfa_run): Use nfa_move_closure and flip between two arrays. (regex_machine_reset): Remove reference to nmove member in nfa_machine_t. Prepare initial closure in flip array. (regex_machine_init): Allocate flip and flop arrays, rather than removed move and clos. (regex_machine_cleanup): Free flip and flop arrays and zero out the pointers, rather than removed move and clos. (regex_machine_feed): Replace nfa_move and nfa_closure with combined nfa_move_closure from flip to flop, and exchange of flip and flop arrays.
* New --free-all option for freeing memory on exit. | Kaz Kylheku | 2016-06-07 | 1 | -4/+16
  Although we are garbage-collected, being able to clean up on shutdown is nevertheless useful for uncovering leaks. Leaks can occur, for instance, due to neglect to free out-of-heap satellite data from objects that are reclaimed by gc. This feature is long overdue.
  * arith.c, arith.h (arith_free_all): New function.
  * gc.c, gc.h (gc_free_all): New function.
  * lib.c (init): Remove program name parameter and redundant initialization of progname global variable.
  * lib.h (progname): Superfluous declaration removed. This is already declared in txr.h. (init): Declaration updated.
  * regex.c (char_set_destroy): Do not check the static allocation flag here; just destroy the object. Do check for a null pointer, though. (char_set_cobj_destroy): This cobj destructor now checks the static flag of the char set object and avoids freeing it. Thus our char set singletons are left alone by gc, but our global freeing function takes care of them. (wide_cs): New static variable moved out of wide_display_char_p to static scope. (regex_free_all): New function.
  * regex.h (regex_free_all): Declared.
  * txr.c (progname): const qualifier and initializer removed. (main): Ensure progname is always dynamically allocated, even in the argv[0] == 0 case. Do not pass progname to init; it doesn't take that argument any more. (free_all): New static function. (txr_main): Implement --free-all option.
  * txr.h (progname): Declaration updated.
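The division of labor between the two char set destructors can be sketched as follows. The struct layout and the function names are hypothetical and only loosely mirror `char_set_destroy` and `char_set_cobj_destroy`; the return values exist purely so the behavior can be observed.

```c
#include <stdlib.h>

/* Hypothetical character-set object. is_static marks singletons that
   are meant to live for the whole run of the program. */
struct char_set {
  int is_static;
  unsigned char *bits;
};

/* Allocates a set; returns NULL if the object itself cannot be
   allocated. The bitmap is zero-filled and may be NULL on failure. */
struct char_set *char_set_new(int is_static)
{
  struct char_set *set = malloc(sizeof *set);

  if (set == NULL)
    return NULL;
  set->is_static = is_static;
  set->bits = calloc(32, 1);  /* room for 256 bits */
  return set;
}

/* Low-level destroy: frees unconditionally, tolerating a null pointer.
   Returns 1 if an object was freed, 0 otherwise. */
int char_set_destroy(struct char_set *set)
{
  if (set == NULL)
    return 0;
  free(set->bits);
  free(set);
  return 1;
}

/* Destructor invoked by the collector: leaves static singletons alone
   so that only the global cleanup path ever frees them. */
int char_set_gc_destroy(struct char_set *set)
{
  if (set != NULL && set->is_static)
    return 0;
  return char_set_destroy(set);
}
```

With this split, a free-everything routine run at exit can call the unconditional destroy on each singleton, while the collector's destructor never touches them during normal operation.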
* Some streamlining in the cons recycling. | Kaz Kylheku | 2016-05-15 | 1 | -1/+1
  * lib.c (rcyc_pop): Just assume that *plist points to a cons and access the fields directly. (rcyc_cons): Don't bother with rplacd. (rcyc_list): Don't bother with set macro.
  * regex.c (read_until_match): Defensive coding: locally ensure that rcyc_pop won't be called on a nil stack, which will now segfault.
* Recycle conses in unget-char and read-until-match. | Kaz Kylheku | 2016-04-20 | 1 | -3/+7
  * regex.c (read_until_match): Use rcyc_pop instead of pop to move the conses to the recycle list. We know these are not shared with anything. Adding additional logic to completely recycle the stack.
  * socket.c (dgram_get_char): Use rcyc_pop to get the character from the push-back list.
  * stream.c (stdio_get_char): Likewise.
* read-until-match can optionally keep matched text. | Kaz Kylheku | 2016-04-20 | 1 | -21/+19
  * regex.c (read_until_match): New argument, include_match. Three times repeated termination code refactored into block reached by forward goto. (regex_init): Registration of read-until-match updated.
  * regex.h (read_until_match): Declaration updated.
  * stream.c (struct record_adapter_base): New member, include_match. (record_adapter_get_line): Pass match to read_until_match as new argument. (record_adapter): New argument, include_match. (stream_init): Update registration of record-adapter.
  * stream.h (record_adapter): Declaration updated.
  * txr.1: Updated.
* Fix broken read_until_match. | Kaz Kylheku | 2016-04-19 | 1 | -17/+51
  * regex.c (read_until_match): Completely rewrite broken, unsalvageable, garbage logic.
* Header file cleanup. | Kaz Kylheku | 2016-01-22 | 1 | -1/+0
  * arith.c, cadr.c, debug.c, eval.c, filter.c, gencadr.txr, glob.c, hash.c, linenoise/linenoise.c, lisplib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Remove unnecessary header files.
* Regex printing not escaping [ and ]. | Kaz Kylheku | 2016-01-12 | 1 | -1/+2
  * regex.c (print_rec): Handle '[' and ']' in backslash-adding switch.
* Print control chars in regexes using \x. | Kaz Kylheku | 2016-01-12 | 1 | -53/+70
  * lib.c (out_str_char): Static function becomes extern.
  * lib.h (out_str_char): Declared.
  * regex.c (puts_clear_flag, putc_clear_flag): New static functions. (print_class_char): Take semicolon flag argument. Use out_str_char to render characters not escaped locally. Clear the semicolon flag. (paren_print_rec): Take semicolon flag argument, and pass it down. Clear it when printing parentheses. (print_rec): Take semicolon flag argument, and pass down to lower level functions. Use putc_clear_flag and puts_clear_flag instead of put_string and put_char. Use out_str_char for char object not escaped locally. (regex_print): Define semi_flag and pass it down to print_rec.
* regex_print: [ and ] in char class must be escaped. | Kaz Kylheku | 2016-01-12 | 1 | -1/+1
  * regex.c (print_class_char): Add missing character cases.
* Record-delimiting stream adapter. | Kaz Kylheku | 2016-01-01 | 1 | -0/+41
  * regex.c (read_until_match): New function. (regex_init): Registered read-until-match intrinsic.
  * regex.h (read_until_match): Declared.
  * stream.c (struct delegate_base): New struct type. (delegate_base_mark, delegate_put_string, delegate_put_char, delegate_put_byte, delegate_get_char, delegate_get_byte, delegate_unget_char, delegate_unget_byte, delegate_close, delegate_flush, delegate_seek, delegate_truncate, delegate_get_prop, delegate_set_prop, delegate_get_error, delegate_get_error_str, delegate_clear_error, make_delegate_stream): New static functions. (struct record_adapter_base): New struct type. (record_adapter_base_mark, record_adapter_mark_op, record_adapter_get_line): New static functions. (record_adapter_ops): New static structure. (record_adapter): New function. (stream_init): Registered record-adapter intrinsic.
  * stream.h (record_adapter): Declared.
  * txr.1: Documented read-until-match and record-adapter.
* Copyright year bump. | Kaz Kylheku | 2015-12-31 | 1 | -1/+1
  * LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright.
  * linenoise/LICENSE, linenoise/linenoise.c, linenoise/linenoise.h: Bump one principal author's copyright from 2014 to 2015. The code is based on a snapshot of 2015 upstream work.
* range-regex returns range, not cons. | Kaz Kylheku | 2015-12-07 | 1 | -2/+2
  * regex.c (range_regex): Return range. (search_regst): Use appropriate accessors on range returned by range_regex.
  * lib.c (tok_where): Destructure range returned by range_regex, using range_bind.
  * txr.1: Documented changed behavior.
* Fix serious regression in search_regex. | Kaz Kylheku | 2015-11-06 | 1 | -3/+1
  * regex.c (search_regex): In the Sep 7 2015 commit titled "Don't use prot1 for temporary gc protection", a rel1 call was left behind, causing an assert whenever the function is used for a successful "from end" search.
* Stop using C library setjmp/longjmp. | Kaz Kylheku | 2015-10-25 | 1 | -1/+0
  TXR is moving to custom assembly-language routines. This is mainly motivated by a very dubious thing done in the GNU C Library setjmp and longjmp in the name of security. Evidently, glibc's setjmp "mangles" certain pointer values which are stored into the jmp_buf buffer. It's been that way since 2005, evidently. This means that, firstly, all along, the use of setjmp in gc.c to get registers into a buffer so they can be scanned has not actually worked properly. More importantly, this pointer mangling in setjmp and longjmp is very hostile to a stack copying implementation of delimited continuations. The reason is that continuations contain jmp_buf buffers, which get relocated in the process of capturing and reviving a continuation. Any pointers in a jmp_buf which point into the captured stack segment have to be fixed up to point into the relocated location. Mangled pointers make this difficult, requiring hacks which are specific to glibc and the machine architecture. We might as well implement a clean, well-behaved setjmp and longjmp.
  * Makefile (jmp.o): New object file. (dbg/%.o, opt/%.o): New rules for .S prerequisites.
  * args.c, arith.c, cadr.c, combi.c, debug.c, eval.c, filter.c, glob.c, hash.c, lib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Removed <setjmp.h> include.
  * gc.c: Switch to struct jmp and jmp_save, instead of jmp_buf and setjmp.
  * jmp.S: New source file.
  * signal.h (struct jmp): New struct type. (jmp_save, jmp_restore): New function declarations denoting assembly language routines in jmp.S. (extended_jmp_buf): Uses struct jmp instead of setjmp. (extended_setjmp): Use jmp_save instead of setjmp. (extended_longjmp): Use jmp_restore instead of longjmp.
* Additional reductions for and. | Kaz Kylheku | 2015-09-29 | 1 | -0/+6
  * regex.c (reg_optimize): If the empty regex is and-ed with another regex, that other regex must be nullable, otherwise the and matches nothing. This is captured in some new reductions for the and operator.
* Simplify and optimization. | Kaz Kylheku | 2015-09-29 | 1 | -4/+1
  * regex.c (reg_optimize): No need to check reg_matches_all in and optimization case because the argument object has already been reduced that way by reg_optimize recursion.
* Optimize some cases of the regex branch operator. | Kaz Kylheku | 2015-09-29 | 1 | -0/+43
  * regex.c (reg_compl_char_p): New static function. (reg_optimize): Optimize various cases of the or operator: (R|) -> R?, (a|b) -> [ab] and others.
* Some optimizations for * ? and +. | Kaz Kylheku | 2015-09-29 | 1 | -4/+21
  * regex.c (regex_optimize): Simplify compounded uses of repetition operators: RR* -> R, R+? -> R* and so on.
* Regex printer fails on \w, \s or \d in char class. | Kaz Kylheku | 2015-09-29 | 1 | -0/+2
  * regex.c (print_rec): Bugfix: handle symbols in character class syntax.
* More complement optimizations. | Kaz Kylheku | 2015-09-28 | 1 | -0/+19
  * regex.c (reg_optimize): Transform ~.*c to (.*[^c])? and ~c.* to ([^c].*)? where c is a single-character match.
* Streamline some regex optimizations. | Kaz Kylheku | 2015-09-28 | 1 | -15/+48
  * regex.c (reg_single_char_p, invert_single): New static functions. (reg_optimize): Simplify complement operator optimizations using new functions.
* Optimization for one-character range. | Kaz Kylheku | 2015-09-27 | 1 | -2/+7
  * regex.c (reg_optimize): [a] -> a. Also take advantage of this where the complement case generates [a].