summaryrefslogtreecommitdiffstats
path: root/match.c
Commit message (Collapse)AuthorAgeFilesLines
* Hash-bang support for .tl files.Kaz Kylheku2015-07-021-1/+1
| | | | | | | | | | | | | | | | * parser.c (read_eval_stream): New boolean argument to request hash bang support. * parser.h (read_eval_stream): Declaration updated. * eval.c (sys_load): Pass new thid argument to read_eval_stream, to decline hash bang support. * match.c (v_load): Likewise. * txr.c (txr_main): Request hash bang support from read_eval_stream. Thus files referenced from the txr command line can have a #! line, which is ignored.
* @(load) and @(include) now load Lisp code.Kaz Kylheku2015-06-121-29/+36
| | | | | | | | | | | * match.c (v_load): Check txr_lisp_p flag coming out of open_txr_file and handle the Lisp case usin read_eval_stream. * parser.c (read_eval_stream, get_parser, parser_errors): New functions. * parser.h (read_eval_stream, get_parser, parser_errors): Declared.
* Preparing for lisp loading.Kaz Kylheku2015-06-101-1/+2
| | | | | | | | | | | | | * parser.c (open_txr_file): Rewritten to take new argument which indicates whether to treat an unsuffixed file as TXR or TXR Lisp, and is updated to indicate which is the case by looking at the suffix. * parser.h (open_txr_file): Declaration updated. * match.c (v_load): Follow change in open_txr_file. * txr.c (txr_main): Likewise.
* * match.c (v_load): Call parse_once rater than parse.Kaz Kylheku2015-06-071-1/+1
| | | | | | | | | | | | | * parser.c (regex_parse): Likewise. * txr.c (txr_main): Likewise. * parser.h (parse): Declaration updated. (parse_once): Declared. * parser.y (parse_once): New function, same as old parse implementation. (parse): Becomes one argument function which works with a previously initialized parser and continues the parse.
* Ligher weight debug instrumentation.Kaz Kylheku2015-05-221-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This speeds up the TXR Lisp interpreter, because do_eval sets up a debug frame and uses debug_return. * debug.c (debug_block_s): Symbol removed. (debug_init): Remove initialization of debug_block_s. * debug.h (debug_block_s): Declaration removed. (debug_enter): Do not establish a named block or a catch block; no time-wasting unwind stack manipulation at all. The debug_depth variable is managed by the extended setjmp context now. Provide a return value variable, and a well-defined name to branch to to exit from the debug block. (debug_return): Do not use heavy-weight uw_block_return; simply set the return variable and branch to debug_return_out label. * signal.h (EJ_DBG_MEMB, EJ_DBG_SAVE, EJ_DBG_REST, EJ_OPT_MEMB, EJ_OPT_SAVE, EJ_OPT_REST): New macros. (extended_jmp_buf): Define optional global state variables using EJ_OPT_MEMB. (extended_setjmp): Save and restore optional globals using EJ_OPT_SAVE and EJ_OPT_RESTORE. Now debug_depth is saved and restored if debugging support is compiled in. * match.c (open_data_source): Remove bogus debug_return invocations which were uncovered here by changes to the macro. * eval.c (do_eval, expand_macro): debug_return must now be after debug_end, because it won't dynamically clean up frames that it doesn't know about. The set_dyn_env is no longer unreachable in expand_macro; it is now necessary because debug_return isn't doing the longjmp that previously restored dyn_env.
* Slight internal representation change of string-only exceptions.Kaz Kylheku2015-02-061-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | One upshot of all this is that (throw 'foo "msg") now does exactly the same thing as (throwf 'foo "msg"). A message-only exception really is a one-string exception argument list ("message ..."), like the documentation says. * unwind.h (struct uw_catch): exception member renamed to args. (uw_catch): Macro follows structure member rename. * eval.c (op_catch): Removed now unnecessary kludge of turning non-list exception argument list into a one-element argument list. * match.c (v_try): Similar hack to the one in op_catch removed here. * unwind.c (uw_unwind_to_exit_point, uw_push_catch): Follows rename of exception member. (uw_throw): The exception parameter is renamed to args. The kludge removed from op_catch re-appears here, because numerous calls to uw_throw just pass a string as args. It's less of a kludge here because this is the master entry point to exception processing, and it straightens out the representation right away. The exception arguments or message are printed in a clearer way.
* Update copyright notices from 2014 to 2015.Kaz Kylheku2015-02-011-1/+1
| | | | | | | | | | | * arith.c, arith.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Update. * LICENSE, METALICENSE: Likewise.
* * match.c (h_trailer): Bugfix: not returning new variableKaz Kylheku2015-01-051-2/+2
| | | | bindings captured in trailer section. Ouch!
* * Makefile: Removing trailing spaces.Kaz Kylheku2014-10-241-41/+41
| | | | | | | | | | (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks. * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c, * lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c, * signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.
* Source file inclusion implemented: needed for macros.Kaz Kylheku2014-10-201-6/+17
| | | | | | | | | | | | | | | | | | | | | * match.c (include_s): New symbol variable. (v_load): Function extended to handle include semantics. (include): External wrapper function for doing inclusion via v_load. (syms_init): include_s initialized. * match.h (include_s): Declared. (include): Declared. * parser.y (check_for_include): New static function. (clauses_rev): Use check_for_include to replace @(include ..) directive. * txr.1: Documented include. * genvim.txr: Added include symbol. * txr.vim: Regenerated.
* * match.c (match_fun): Bugfix: replace incorrect plain returnKaz Kylheku2014-10-191-1/+1
| | | | | | | | | with debug_return. This causes a stray debug frame to be left on the environment stack which turns to garbage, leading to an invalid longjmp in another debug_return elsewhere which tries to use that frame. This was diagnosed by valgrind indicating accesses below the stack frame, and also by glibc "longjmp causes uninitialized stack frame" abort.
* * match.c (mf_all): Drop data_lineno parameter. InitializeKaz Kylheku2014-10-181-26/+24
| | | | | | | | | | | the corresponding member based on whether or not data is nil. (do_match_line, mf_from_ml, match_filter, match_fun, extract): Do not pass starting line number argument to mf_all. This fixes a bug when the line number at @(eof) for an empty file comes out as zero. (mf_args, v_skip, v_fuzz, v_next, v_gather, v_collect, open_data_source, match_files): Use zero and one instead of num(0) and num(1).
* * match.c (v_eof): Bugfix: we are at EOF not only whenKaz Kylheku2014-10-171-1/+1
| | | | | the remaining data is nil but when it is (nil). This happens for interactive streams.
* * match.c (dest_bind): Remove the restriction of not allowingKaz Kylheku2014-10-171-2/+14
| | | | | | | | | | | @(expr ...) and @var on the left side of a bind. This is useful, and necessary for @(line @(lisp expr)) to work: matching computed line numbers and character positions. * txr.1: Document use of Lisp on left hand side of bind, that there is a restriction on the left hand side of a set, and that Lisp can be used in a line or chr directive for computed matches.
* Converting cast expressions to macros that are retargettedKaz Kylheku2014-10-171-71/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to C++ style casts when compiling as C++. * lib.h (strip_qual, convert, coerce): New casting macros. (TAG_MASK, tag, type, wli_noex, auto_str, static_str, litptr, num_fast, chr, lit_noex, nil, nao): Use cast macros. * arith.c (mul, isqrt_fixnum, bit): Use cast macros. * configure (INT_PTR_MAX): Define using cast macro. * debug.c (debug_init): Use cast macro. * eval.c (do_eval, expand_macro, reg_op, reg_mac, eval_init): Use cast macros. * filter.c (filter_init): Use cast macro. * gc.c (more, mark_obj, in_heap, mark, sweep_one, unmark): Use cast macros. * hash.c (hash_double, equal_hash, eql_hash, hash_equal_op, hash_hash_op, hash_print_op, hash_mark, make_hash, make_similar_hash, copy_hash, gethash_c, gethash, gethash_f, gethash_n, remhash, hash_count, get_hash_userdata, set_hash_userdata, hash_iter_destroy, hash_iter_mark, hash_begin, hash_uni, hash_diff, hash_isec): Use cast macros. * lib.c (code2type, chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc, chk_strdup, num, c_num, string, mkstring, mkustring, upcase_str, downcase_str, string_extend, sub_str, cat_str, trim_str, c_chr, vector, vec_set_length, copy_vec, sub_vec, cat_vec, cobj_print_op, obj_init): Likewise. * match.c (do_match_line, hv_trampoline, match_files, dir_tables_init): Likewise. * parser.l (grammar): Likewise. * parser.y (parse): Likewise. * rand.c (make_state, make_random_state, random_fixnum, random): Likewise. * regex.c (CHAR_SET_L2_LO, CHAR_SET_L2_HI, CHAR_SET_L1_LO, CHAR_SET_L1_HI, CHAR_SET_L0_LO, CHAR_SET_L0_HI, L0_full, L0_fill_range, L1_full, L1_fill_range, L1_contains, L1_free, L2_full, L2_fill_range, L2_contains, L2_free, L3_fill_range, L3_contains, L3_free, char_set_create, char_set_cobj_destroy, nfa_state_accept, nfa_state_empty, nfa_state_single, nfa_state_wild, nfa_state_set,
* * match.c (dest_bind): More detailed log message for variableKaz Kylheku2014-10-161-1/+2
| | | | mismatch.
* New @(line) and @(chr) directives.Kaz Kylheku2014-10-161-1/+48
| | | | | | | | | | | | | * match.c (line_s): New variable. (h_chr, v_line): New static functions. (syms_init): line_s initialized. (dir_tables_init): Register v_line and h_chr. * match.h (line_s): Declared. * txr.1: Document @(line) and @(chr) directives. * txr.vim: Regenerated.
* * match.c (subst_vars): Fix buggy rendering of TXR Lisp expressionsKaz Kylheku2014-10-151-5/+24
| | | | | | | | | | | | | | | | | that evaluate to lists. For instance `@(list)` renders to the string "nil", and `@(list 1 2)` renders as "(1 2)". The desired behavior is "" and "1 2", respectively. (do_output_line): In output directives, there is a similar problem. A @(list) in the middle of an output block turns to nil, and a @(list 1 2) renders in parentheses as (1 2). Furthermore, there is the additional problem that no filtering is applied to the interpolated value. These behaviors are subject to the compatibility option, since they change the externally visible behavior of TXR programs. * txr.1: Document that empty lists in @(output) variable substitutions turn into nothing. Document value of 100 for -C option, describing the above issue.
* Eliminating the extra list wrapping applied to regularKaz Kylheku2014-10-031-28/+30
| | | | | | | | | | | | | | | | | | | expression objects in the syntax tree. The parser just puts out a #<regex ...> instead of (#<regex ...> regex-syntax). * eval.c (do_eval): We no longer need the hack of treating (#<regex> ...) as a special form which evaluates to #<regex>. (expand): We no longer have to skip over regex syntax, so the case is removed. * match.c (h_var, do_txeval, do_match_line): regexp cases are no longer subcases of consp but stand on their own. In do_match_line, we introduce a COBJ case into the type switch for regexes. * parser.y: regexes are now compiled in the regex and lisp_regex grammar rules instead of the dependent rules, and are not wrapped in extra syntax.
* * match.c (h_var): Fix regression introduced in 2014-08-11Kaz Kylheku2014-10-031-4/+3
| | | | | | | | | | | commit. The incompleteness of that change broke the case of an unbound variable followed by a bound variable. The value of the second variable was still being wrapped in the old complicated representation before being pushed to the front of the spec. * txr.1: Replace bogus text which says that variables are not bound to regexes, and so regex matches from variable substitutions do not arise. This works fine after this change.
* * match.c (v_load): Fix regression introduced in 94: broken @(load).Kaz Kylheku2014-08-291-1/+1
|
* Uprooting stupidities in handling of output variables.Kaz Kylheku2014-08-131-5/+1
| | | | | | | | | | | | | | | | | | | | * parser.y (o_elems_transform): Remove useless function which was only unwrapping the strange parse of output vars. (o_elems_opt, rep_elem, quasilit, wordsqlit): Eliminate o_elems_transform call. (o_var, q_var): Eliminate the phrase structure rules which match an extra o_elem or quasi_item, and which incorporate them into the var syntax tree element. Place the modifiers into the third position, not fourth. * eval.c (subst_vars): Eliminate handling of "pat" element. Actually that was not even there thanks to o_elems_transform being applied: dead code. Pull modifiers from the third element of the var form now, not fourth. * match.c (subst_vars): Similar changes as in the match.c subst_vars function. Here the pat variable is even more obviously useless; if it is not nil, it is just punted back to the spec.
* Fix regression in previous change: we must match a compound textKaz Kylheku2014-08-131-8/+13
| | | | | | | | | | | | element whole, and not break it up. * match.c (search_match): Take a spec argument. (h_var): Turn a text element into a one-element spec and process with search_match. * txr.1: Updated text about matching of variables followed by a directive or function, and about consecutive variables via directive.
* When a variable is delimited by some form other thanKaz Kylheku2014-08-121-95/+110
| | | | | | | | | | | | | | | | | | | | the contents of a variable, fixed string or regex, we now use the entire tail of the specline to find the match. So for instance @var@(trailer)foo works as intuition might expect. * match.c (search_form): Static function removed. (search_match): New static function based on search_form. Does not handle regexes, and does not update c->bindings. (h_var): Renamed local variable pat to next. Added a few missing rlcp's. Combined the cases when pat is a cons to one block so consp isn't repeatedly tested. Function now handles a var followed by (sys:text ...) elements specially; the first element of the text block is pulled out and matched. Implemented "var delimiting spec" general case which matches the entire tail of the spec at successive character positions until a match is found, and the skipped text goes into the variable.
* First cut at restructuring how variable matching works in the patternKaz Kylheku2014-08-111-22/+15
| | | | | | | | | | | | | | | | | language. The goal is to remove the strict behavior of using only one element of context after a variable. variable form at parse time: we unravel that first. * parser.y (grammar): Simplifying the phrase structure rule for the var element. All the variants that have a trailing elem are removed. The abstract syntax changes; the modifier moves to the third position in the list. * match.c (h_var): Matching change: the element which follows a variable is now pulled from the specline rather than the variable syntax, which is how it should have been done in the first place. The modifiers are pulled from a different spot in the variable syntax.
* Big switch to reentrant lexing and parsing.Kaz Kylheku2014-08-021-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * parser.l (YY_INPUT): Stop relying on removed yyin_stream; refer to stream via yyextra. (yyin_stream, lineno, errors, spec_file_str, prepared_error_message): Global variables removed. (yyget_column, yyset_column): Missing prototypes not generated by flex in bison bridge mode have to be added by us to avoid warning. (yyerror): Takes parser and scanner as parameters. Prepared error message is now in the parser context. Calls to other error handling functions receive scanner context. (yyerr): New function. (yyerrorf, yyerrprepf): Takes scanner argument, chases extra data to get to parser, and refers to parser variables instead of globals. (num_esc): Scanner argument added. (%option reentrant, %option bison-bridge, %option extra-type): New flex options. (grammar): yyscanner added everywhere. (end_of_char): Takes scanner argument. (parse_init): Removed references to yyin_stream and prepared_error_message. (parse_reset): Function renamed to open_txr_file. Returns results via pointers instead of setting global variables. (regex_parse, lisp_parse): Use reentrant parser interface. * parser.y (yyerror): Prototype removed. (yylex): Prototype moved after grammar, with new arguments. (sym_helper, define_transform): Take scanner argument. (make_expr): Takes parser argument. (rlrec): New static function. (rl): Function turned into macro. (mkexp, symhlpr): New macros. (%purse-parser, %parse-param, %lex-param): New Yacc options. (grammar): Actions re-worked for reentrance. Parser and scanner contexts are passed down to helper functions, in some cases via the three new macros. The result of the parse is stored in the syntax_tree member of the parser_t structure instead of a global. The yylex function receives the scanner instance. (get_spec): Function removed. (parse): New function. * parser.h (lineno, errors, yyin_stream, spec_file_str): Declarations removed. (parser_t): New struct. (yyerr): New function declared. (yyparse, yyerror, yyerrorf, end_of_regex, end_of_char, yylex, yylex_destroy): Declarations updated.
* * Makefile, arith.c, arith.h, combi.c, combi.h, configure, debug.c,Kaz Kylheku2014-07-231-16/+16
| | | | | | | | debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Synchronize license header with LICENSE.
* * match.c (subst_vars): Bugfix: I neglected to apply theKaz Kylheku2014-07-221-1/+1
| | | | | filter which is in effect to the result of interpolating a TXR Lisp expression, oops!
* * match.c (v_do, v_require): Set up and tear down environment frame,Kaz Kylheku2014-07-151-2/+11
| | | | | | | like other situations that evaluate TXR Lisp from the pattern language. Otherwise obscure things will go wrong. (h_do): Same as above, and additionally, add the forgotten call to install the bindings into the match context.
* * match.c (h_eol): Fix broken horizontal @(eol).Kaz Kylheku2014-07-151-1/+1
| | | | | It should be returning next_spec_k, rather than bindings, which indicate a complete match.
* Optimization: add missing tail updates to some listKaz Kylheku2014-06-201-1/+1
| | | | | | | | | | | collecting loops. * lib.c (tuples_func, where, sel): Catch return value of list_collect and update tail variable. * match.c (do_txeval): Likewise. * parser.y (expand_meta): Likewise for list_collect_nconc.
* * Makefile: Install share/txr/stdlib/*.txr material.Kaz Kylheku2014-06-121-1/+2
| | | | | | | | | | | * match.c (do_txeval): If a variable is not in the bindings, fall back on treating it as a TXR Lisp dynamic variable. This allows us to refer to the stdlib variable from a quasistring in a @(load ...) directive. * txr.c (sysroot_init): Register new variable, *txr-version*. * share/txr/stdlib/ver.txr: New file.
* * match.c (v_load): use the abs_path_p function instead ofKaz Kylheku2014-06-121-1/+1
| | | | | | | | | | | | | checking for leading slash. * stream.c (abs_path_p): New function. (stream_init): Register abs_path_p as abs-path-p. * stream.h (abs_path_p): Declared. * txr.1: Documented abs-path-p. * dep.mk: Updated.
* The dumping of bindings and printing of false must nowKaz Kylheku2014-06-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | be explicitly requested by the -B option. * match.c (opt_nobindings): Variable removed. (opt_print_bindings): New variable. (extract): Print bindings or "false" if opt_print_bindings is true. * stream.c (output_produced): Variable removed. (stdio_put_string, stdio_put_char, stdio_put_byte): Remove update of output_produced. * stream.h (output_produced): Declaration removed. * txr.1: Documentation updated. * txr.c (txr_main): Option 'b' does nothing. 'B', 'l', 'a', and '--lisp-bindings' set opt_print_bindings to 1. * txr.h (opt_nobindings): Declaration removed. (opt_print_bindings): Declared. * unwind.c (uw_throw): When exiting due to a query error or file error, print false when opt_print_bindings is true.
* String type related bugfixes: neglecting to handle all three kinds inKaz Kylheku2014-05-101-0/+4
| | | | | | | | | | | | | | | some places. In particular, the test case echo : | ./txr -c '@a:@a' - breaks because of neglected LIT in do_match_line. * arith.c (tofloat, toint): Handle LIT type in switch. * lib.c (ref, refset, replace, update): Handle LSTR type. * match.c (do_match_line, do_output_line): Handle LSTR and LIT objects in switch.
* Change to how locations are passed around, for the sake of generationalKaz Kylheku2014-03-291-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GC. The issue being solved here is the accuracy of the gc_set function. The existing impelmentation is too conservative. It has no generation information about the memory location being stored, and so it assumes the worst: that it is a location in the middle of a gen 1 object. This is sub-optimal, creating unacceptable pressure against the checkobj array and, worse, as a consequence causing unreachable gen 0 objects to be tenured into gen 1. To solve this problem, we replace "val *" pointers with a structure of type "loc" which keeps track of the object too, which lets us discover the generation. I tried another approach: using just a pointer with a bitfield indicating the generation. This turned out to have a serious issue: such a bitfield goes stale when the object is moved to a different generation. The object holding the memory location is in gen 1, but the annotated pointer still indicates gen 0. The gc_set function then makes the wrong decision, and premature reclamation takes place. * combi.c (perm_init_common, comb_gen_fun_common, rcomb_gen_fun_common, rcomb_list_gen_fun): Update to new interfaces for managing mutation. * debug.c (debug): Update to new interfaces for managing mutation. Avoid loc variable name. * eval.c (env_fbind, env_fbind): Update to new interfaces for managing mutation. (lookup_var_l, dwim_loc): Return loc type and update to new interfaces. (apply_frob_args, op_modplace, op_dohash, transform_op, mapcarv, mappendv, repeat_infinite_func, repeat_times_func): Update to new interfaces for managing mutation. * eval.h (lookup_var_l): Declaration updated. * filter.c (trie_add, trie_compress, trie_compress_intrinsic, * build_filter, built_filter_from_list, filter_init): Update to new * interfaces. * gc.c (gc_set): Rewritten to use loc type which provides the exact generation. We do not need the in_malloc_range hack any more, since we have the backpointer to the object. (gc_push): Take loc rather than raw pointer. * gc.h (gc_set, gc_push): Declarations updated. * hash.c (struct hash): The acons* functions use loc instead of val * now. (hash_equal_op, copy_hash, gethash_c, inhash, gethash_n, pushhash, Change to how locations are passed around, for the sake of generational GC. The issue being solved here is the accuracy of the gc_set function. The existing impelmentation is too conservative. It has no generation information about the memory location being stored, and so it assumes the worst: that it is a location in the middle of a gen 1 object. This is sub-optimal, creating unacceptable pressure against the checkobj array and, worse, as a consequence causing unreachable gen 0 objects to be tenured into gen 1. To solve this problem, we replace "val *" pointers with a structure of type "loc" which keeps track of the object too, which lets us discover the generation. I tried another approach: using just a pointer with a bitfield indicating the generation. This turned out to have a serious issue: such a bitfield goes stale when the object is moved to a different generation. The object holding the memory location is in gen 1, but the annotated pointer still indicates gen 0. The gc_set function then makes the wrong decision, and premature reclamation takes place. * combi.c (perm_init_common, comb_gen_fun_common, rcomb_gen_fun_common, rcomb_list_gen_fun): Update to new interfaces for managing mutation. * debug.c (debug): Update to new interfaces for managing mutation. Avoid loc variable name. * eval.c (env_fbind, env_fbind): Update to new interfaces for managing mutation. (lookup_var_l, dwim_loc): Return loc type and update to new interfaces. (apply_frob_args, op_modplace, op_dohash, transform_op, mapcarv, mappendv, repeat_infinite_func, repeat_times_func): Update to new interfaces for managing mutation. * eval.h (lookup_var_l): Declaration updated. * filter.c (trie_add, trie_compress, trie_compress_intrinsic, * build_filter, built_filter_from_list, filter_init): Update to new * interfaces. * gc.c (gc_set): Rewritten to use loc type which provides the exact generation. We do not need the in_malloc_range hack any more, since we have the backpointer to the object. (gc_push): Take loc rather than raw pointer. * gc.h (gc_set, gc_push): Declarations updated. * hash.c (struct hash): The acons* functions use loc instead of val * now. (hash_equal_op, copy_hash, gethash_c, inhash, gethash_n, pushhash,
* * eval.c (me_quasilist): New static function.Kaz Kylheku2014-03-251-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | (eval_init): Register me_quasilist as quasilist macro expander. * lib.c (quasilist_s): New global variable. (obj_init): quasilist_s initialized. * lib.h (quasilist_s): Declared. * match.c (do_txreval): Handle quasilist syntax. * parser.l (QWLIT): New exclusive state. Extend lexical grammar to transition to QWLIT state upon the #` or #*` sequence which kicks off a word literal, and in that state, piecewise lexically analyze the QLL, mostly by borrowing rules from quasiliterals. * parser.y (QWORDS, QWSPLICE): New tokens. (n_exprs): Integrate splicing form of QLL syntax. (n_expr): Integrate non-splicing form of QLL syntax. (litchars): Propagate line number info. (quasilit): Fix "string literal" wording in error message. * txr.1: Introduced WLL abbreviation for word list literals, cleaned up the text a little, and documented QLL's.
* * match.c (v_trailer): Fix segfault. The code whichKaz Kylheku2014-03-091-1/+1
| | | | | | helps implement the special interaction between @(accept) and @(trailer) was not handling the situation when there is not current unwind exit point.
* * lib.c (lazy_sub_str): Bugfix: "from" was mistakenly usedKaz Kylheku2014-03-091-1/+1
| | | | | | | in the adjustment of the "to" value. * match.c (search_form): Use predefined constants for -1 and 1 instead of calling num.
* Fixing broken processing of horizontal matching acrossKaz Kylheku2014-03-091-2/+9
| | | | | | | | | | | | | | long lines produced by @(freeform). Once the matching passes about 4000 characters, the "consume_prefix" function kicks in to save memory. Then any code which is not properly written to handle this displaced situation will break. * match.c (h_text, h_var, h_coll, h_parallel, h_fun): Bugfix. The recursive calls to match_line return an absolute position. From this value we must subtract c->base if we are to compare it with c->pos, or update c->pos. If we use the absolute value, we are abruptly jumping ahead in the data.
* * match.c (LOG_MATCH, LOG_MISMATCH): Wouldn't you know it;Kaz Kylheku2014-03-071-2/+2
| | | | | | | | the format strings in these macros contained a workaround for the broken * variable field width syntax, specifying ~*~a where the extra ~ in the middle just feeds a character that the broken state machine expects. These workarounds broke when I fixed the formatting, making -v mode useless.
* * lib.c (assert_s): New global variable.Kaz Kylheku2014-03-061-0/+68
| | | | | | | | | | | | | | | (obj_init): Intern assert symbol, store in assert_s. * lib.h (assert_s): Declared. * match.c (typed_error, v_assert, h_assert): New static functions. (dir_tables_init): Register v_assert and h_assert. Register assert_s as non-data-matching directive. * unwind.c (uw_init): Register assert as a subtype of error. * txr.1: Describe assert.
* * match.c: (v_next): Set the "curfile" in the context to "env" whenKaz Kylheku2014-03-061-1/+2
| | | | | | scanning environment. (open_data_source): Regression: was not setting c->curfile when opening anything.
* * match.c (match_files): Fix it again. The data (nil)Kaz Kylheku2014-03-061-1/+1
| | | | can occur from an interactive/real-time stream.
* Fixing regression caused by the 2014-02-19 change ("Fixed long-runningKaz Kylheku2014-03-051-2/+8
| | | | | | | | | | | | | | | issue ..."). * match.c (open_data_source): if c->data is t, but c->files is nil, set c->data to nil: we cannot possibly open anything later. (match_files): We need to call open_data_source one more time just before processing a line with horizontal material. The previous call(s) to open_data_source might not have opened anything. Before accesing car(c.data) the correct test is consp(c.data), not c.data. In the else clause, we now specificially check for nilp(c.data) which is the correct indicator of no more data. If c.data is any other atom at that point, we have an internal error, for which an assertion is added now.
* Replacing uses of the eq function which are used only as C booleans,Kaz Kylheku2014-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with just using the == operator. Removing cobj_equal_op since it's indistinguishable from eq. Streamlining missingp and null_or_missing_p. * eval.c (transform_op): eq to ==. (c_var_ops): cobj_equal_op to eq. * filter.c (trie_compress, trie_lookup_feed_char, filter_string_tree, html_hex_continue, html_dec_continue): eq to ==. * hash.c (hash_iter_ops): cobj_equal to eq. * lib.c (countq, getplist, getplist_f, search_str_tree, posq): eq to ==. (cobj_equal_op): Function removed. * lib.h (cobj_equal_op): Declaration removed. (missingp): Becomes a simple macro that yields a C boolean instead of t/nil val, because it's only used that way. (null_or_missing_p): Becomes inline function returning int. * match.c (v_output): eq to ==. * rand.c (random_state_ops): cobj_equal_op to eq. * regex.c (char_set_obj_ops, regex_obj_ops): cobj_equal_op to eq. (reg_derivative): Silly if3 expression replaced by null. (regexp): Redundant if2 expression wrapped around eq removed. * stream.c (null_ops, stdio_ops, tail_ops, pipe_ops, string_in_ops, byte_in_ops, string_out_ops, strlist_out_ops, dir_ops, cat_stream_ops): cobj_equal_op to eq. * syslog.c (syslog_strm_ops): cobj_equal_op to eq.
* The C function nullp is being renamed to null, and the rarelyKaz Kylheku2014-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | used global variable null which holds a symbol becomes null_s. A new macro called nilp is added that more efficiently checks whether an object is nil, producing a C boolean value rather than t or nil. Most of the uses of nullp in the codebase just become the more streamlined nilp. * debug.c (show_bindings): nullp to nilp * eval.c (lookup_var, lookup_var_l, lookup_fun, lookup_sym_lisp1, do_eval, expand_qquote, expand_quasi, expand_op): nullp to nilp. (op_modplace): nullp to null. (eval_init): Update registration of null and not from C function nullp to null. * filter.c (trie_compress, html_hex_continue): nullp to nil. (filter_string_tree): null to null_s. * hash.c (hash_next): nullp to nilp. * lib.c (null): Variable renamed to null_s. (code2type): null to null_s. (lazy_flatten_scan, chainv, lazy_str, lazy_str_force_upto, obj_print, obj_pprint): nullp to nilp. (obj_init): null to null_s; nullp to null. * lib.h (null): declaration changed to null_s. (nullp): Inline function renamed to null. (nilp): New macro. * match.c (do_match_line): nullp to nilp. * rand.c (make_random_state): Likewise. * regex.c (compile_regex): Likewise.
* Fixing a long-running issue in the TXR pattern language: prematureKaz Kylheku2014-02-191-21/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | opening of files, prior to directives that actually need data. The documentation basically lied that this is the case: namely, the text "A file isn't opened until the query demands material from that file, and then the contents are read on demand, not all at once." This is now a fact. * match.c (non_matching_directive_table): New global variable. (open_data_source): New static function. Contains an almost verbatim migration of the source-opening logic that used to be in match_files. The useless assignment to c->nil is gone, and c->data == t is explicitly tested for. Instead of assuming that only the @(next) directive does not need to have a data source open, the table of non-matching directives is consulted. Opening the data source is now skipped for numerous directives. (match_files): Call open_data_source within the loop. This means that even after processing numerous non-matching directives, we will still correctly set up the data lazy list. (dir_tables_init): Initialize non_matching_directive_table, protect from GC and populate with numerous directives. * txr.1: Improved documentation for @(next :args), and removed a description of the hack that a single @(next) at the top of the query suppressed the opening of the data source.
* * eval.c (subst_vars): Bugfix: results of expressions notKaz Kylheku2014-02-111-3/+4
| | | | | | | | | | | | treated in the same way as variables: lists not stringified, causing expansions with parentheses, and sometimes errors due to unhandled objects. Also, use tostringp instead of format for stringifying objects. stringifying object. Bugfix.k * match.c (subst_vars): Added comment similar to the one in the subst_vars of eval.c. Removed superfluous conversion code where the str variable is already known to be a string.
* Changes to the list collection mechanism to improveKaz Kylheku2014-01-221-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the extension of list operations over vectors and strings. * eval.c (do_eval_args, bindings_helper, op_each, subst_vars, supplement_op_syms, mapcarv, mappendv): Switch from list_collect_* macros to functions. * lib.c (copy_list): Switch from list_collect* macros to functions. Use list_collect_nconc for the final terminator. Doing a copy there with list_collect_append was actually wasteful, and now that list_collect_append calls copy_list in places, it triggered runaway recursion. (make_like): Bugfix: list_vector was used instead of vector_list. (to_seq, list_collect, list_collect_nconc, list_collect_append): New functions. (append2, appendv, nappend2, sub_list, replace_list, ldiff, remq, remql, remqual, remove_if, keep_if, proper_plist_to_alist, improper_plist_to_alist, split_str, split_str_set, tok_str, list_str, chain, andf, orf, lis_vector, mapcar, mapcon, mappend, merge, set_diff, env): Switch from list_collect* macros to functions. (replace_str, replace_vec): Allow single item replacement sequence. * lib.h (to_seq): Declared. (list_collect, list_collect_nconc, list_collect_append): Macros removed, replaced by function declarations of the same name. These functions return the new ptail since they cannot assign to it, requiring all uses to be updated to do the assignment of the returned value. (list_collect_decl): Use val rather than obj_t *. * match.c (vars_to_bindings, h_coll, subst_vars, extract_vars, extract_bindings, do_output_line, do_output, v_gather, v_collect): Switch from list_collect* macros to functions. * parser.y (o_elems_transform): Likewise. * regex.c (dv_compile_regex, regsub): Likewise. * txr.c (txr_main): Likewise.