path: root/parser.y
Commit message | Author | Age | Files | Lines
* Synchronize license comments with LICENSE. | Kaz Kylheku | 2016-10-01 | 1 | -16/+17
    * Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Revert to verbatim 2-Clause BSD.
* Regression: @(rep) wrongly diagnoses empty clause. | Kaz Kylheku | 2016-09-06 | 1 | -2/+2
    Introduced on 2016-04-27 in 7afbcc19.
    * parser.y (elem): Check $4 phrase position for empty clauses, rather than $2. That's where they are.
* Fix emulation of TXR 135 @(if) semantics. | Kaz Kylheku | 2016-08-29 | 1 | -2/+4
    * parser.y (if_clause, elif_clauses_opt): The previous commit changes the emulation of old @(if) behavior, since expressions obtained via the n_exprs_opt grammar phrase are not subject to expand_meta. We can counteract this by calling expand_meta in the compatibility code.
* Fix broken expansion in @(if) and output @(repeat). | Kaz Kylheku | 2016-08-29 | 1 | -5/+5
    * parser.y (if_clause, elif_clauses_opt, repeat_clause, rep_elem): Recognize argument expressions as n_exprs_opt rather than exprs_opt, so that expand_meta is not applied. They are Lisp expressions, which are broken by expand_meta. A failing test case is op syntax with @num metanum variables, e.g. @(if (foo (op bar @1.slot))).
* Allow empty @(catch) and @(finally). | Kaz Kylheku | 2016-06-05 | 1 | -12/+3
    * parser.y (catch_clauses_opt): Don't diagnose empty catch and finally. There is no benefit in doing so; moreover, it contravenes the documentation, which explicitly says these may be empty. I.e. this fixes a regression.
* Diagnose empty clauses better in parallel constructs. | Kaz Kylheku | 2016-04-27 | 1 | -86/+119
    In this change we allow the grammar, as such, to express empty clauses in the parallel constructs like @(some), @(cases), @(gather) and so forth. However, we raise an error when these occur. This results in a cleaner diagnostic behavior. In the future, empty clauses may be allowed; the semantics has to be worked out. An empty clause should neither succeed nor fail; the behavior should be as if it is not there.
    The general strategy in this patch is to eliminate the use of the clauses nonterminal symbol (completely) and use clauses_opt everywhere, and then diagnose when that produces a nil. Some elems are similarly changed to elems_opt in the horizontal versions of directives.
    * parser.y (clauses): Nonterminal symbol completely removed.
      (spec): Empty production rule removed, by using clauses_opt instead of clauses.
      (all_clause, some_clause, none_clause, maybe_clause, cases_clause, choose_clause, gather_clause): Instead of diagnosing empty clauses via a dedicated grammar rule, diagnose by looking for nil subtree emerging from a clauses phrase.
      (gather_parts): Use clauses_opt. Yield nil instead of a cons if this clauses_opt is nil.
      (additional_gather_parts): Simplify grammar and actions by recursing back to gather_parts. If gather_parts produces nil, diagnose the empty and/or subclause situation.
      (collect_clause): Use clauses_opt and diagnose empty. Simplify error production, which doesn't have to look at the lookahead token yychar any more to implement the empty diagnostic.
      (clause_parts): Use clauses_opt instead of clauses. Yield nil instead of a cons if clauses_opt yields nil.
      (additional_parts): Grammar and actions simplified by recursing back to clause_parts. If the recursive clause_parts produces nil, generate empty and/or subclause diagnostic.
      (elem): Use elems_opt for the various horizontal clauses and implement empty checks for them. Explicit empty productions are eliminated.
      (clause_parts_h): Changed in way analogous to clause_parts.
      (additional_parts_h): Changed analogously to additional_parts.
      (try_clause): Use clauses_opt and catch empties. Eliminate the second error production.
      (catch_clauses_opt): Diagnose empty catch and finally, unless in compatibility mode with 139 or less. The grammar already uses clauses_opt here; empty catches and finally had been allowed. Fixed messages in error productions for catch and finally to refer to catch and finally rather than try.
* Strengthen against resource leaks upon exceptions. | Kaz Kylheku | 2016-04-21 | 1 | -2/+2
    * glob.c (glob_wrap): Perform argument conversions that might throw before allocating UTF-8 string.
    * parser.y (text): In the action for SPACE, the lexeme is not needed so free($1) right away. If regex_compile were to throw an exception, that lexeme will leak.
    * socket.c (getaddrinfo_wrap): Harden against leakage of node_u8 and service_u8 strings with an unwind block. For instance, the hints structure could contain bad values which cause addrinfo_in to throw.
    * stream.c (make_string_byte_input_stream): Perform possibly throwing argument conversions before allocating resources.
    * sysif.c (mkdir_wrap, mknod_wrap, chmod_wrap, symlink_wrap, link_wrap, setenv_wrap, crypt_wrap): Likewise.
    * syslog.c (openlog_wrap, syslog_wrapv): Likewise.
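The "convert before allocating" ordering described in this commit can be illustrated with a minimal C sketch. It is not TXR code: `setjmp`/`longjmp` stands in for TXR's exception unwinding, and every name here (`to_mode`, `wrap_leaky`, `wrap_safe`, `leaked_after`) is hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf handler;      /* stand-in for an exception handler frame */
static int live_allocs;      /* number of buffers not yet released */
static char *outstanding;    /* last unreleased buffer, so the demo can clean up */

static char *dup_tracked(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p == NULL)
        abort();
    memcpy(p, s, n);
    live_allocs++;
    outstanding = p;
    return p;
}

static void free_tracked(char *p)
{
    free(p);
    live_allocs--;
    outstanding = NULL;
}

/* Hypothetical argument conversion that "throws" on bad input. */
static int to_mode(int raw)
{
    if (raw < 0)
        longjmp(handler, 1);
    return raw & 0777;
}

/* Allocate first, convert second: a throw skips free_tracked. */
int wrap_leaky(const char *path, int raw_mode)
{
    char *copy = dup_tracked(path);
    int mode = to_mode(raw_mode);
    free_tracked(copy);
    return mode;
}

/* Convert first, allocate second: a throw happens while nothing is held. */
int wrap_safe(const char *path, int raw_mode)
{
    int mode = to_mode(raw_mode);
    char *copy = dup_tracked(path);
    free_tracked(copy);
    return mode;
}

/* Runs fn under the handler and reports how many buffers it left behind. */
int leaked_after(int (*fn)(const char *, int), int raw_mode)
{
    int leaked;
    live_allocs = 0;
    outstanding = NULL;
    if (setjmp(handler) == 0)
        fn("dir", raw_mode);
    leaked = live_allocs;
    free(outstanding);       /* release whatever the leaky path abandoned */
    outstanding = NULL;
    assert(leaked >= 0);
    return leaked;
}
```

Calling `leaked_after(wrap_leaky, -1)` reports one abandoned buffer, while `leaked_after(wrap_safe, -1)` reports none. Where reordering is not possible, the commit uses an unwind block instead (as in getaddrinfo_wrap), which this sketch does not model.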
* Bugfix: @(if expr) not macro-expanding expr. | Kaz Kylheku | 2016-04-18 | 1 | -2/+2
    * parser.y (if_clause, elif_clauses_opt): Add missing expand calls.
* Bugfix: @(output) not expanding some Lisp exprs. | Kaz Kylheku | 2016-04-17 | 1 | -1/+1
    * parser.y (make_expr): This function is part of a hack for converting some hard-coded syntax like @(if ...) or @(and ...) in an @(output) block to a Lisp expression. Alas, the crucial step of expanding the Lisp form was neglected.
* Fix proper-listp to proper-list-p. | Kaz Kylheku | 2016-04-14 | 1 | -1/+1
    This is really a gratuitous incompatibility with Common Lisp and other dialects. Let's fix it internally also, but keep the proper-listp function binding for backwards compatibility.
    * eval.c (dot_to_apply, me_op): Update proper_listp call to proper_list_p.
      (eval_init): Register proper-list-p to the same C function as proper-listp, and that C function is now called proper_list_p.
    * lib.c (proper_listp): Renamed to proper_list_p.
    * lib.h (proper_listp): Declaration updated.
    * parser.y (define_transform): Update proper_listp call.
    * txr.1: Replace all occurrences of proper-listp with proper-list-p. Add note explaining the rename situation.
* New semantics for @(if) directive. | Kaz Kylheku | 2016-03-22 | 1 | -10/+33
    * eval.h (if_s): Declared.
    * match.c (v_if): New static function.
      (dir_tables_init): Register v_if in v_directive_table under if symbol.
    * parser.y (IF): Token assigned to <lineno> type.
      (if_clause, elif_clauses_opt, else_clause_opt): New syntactic representation, understood by v_if.
    * txr.1: Documented if semantics more precisely, dropped the text about it being syntactic sugar for a cases with require, added compatibility note.
* Support binding in @(repeat)/@(rep) :vars. | Kaz Kylheku | 2016-03-16 | 1 | -1/+47
    * match.c (extract_bindings): Check for (var expr) syntax, evaluate and bind.
    * match.h (vars_k): Declared.
    * parser.y (expand_repeat_rep_args): New static function.
      (repeat_rep_helper): The :counter and :vars arguments of repeat/rep must be macro-expanded, since there can be Lisp expressions there. This supports the new feature, but also fixes the bug of :counter (var form) not expanding form.
    * txr.1: Updated documentation about :vars in @(repeat).
* New :mandatory keyword in until/last clauses. | Kaz Kylheku | 2016-01-15 | 1 | -11/+28
    * match.c (mandatory_k): New keyword variable.
      (h_coll, v_gather, v_collect): Implement :mandatory logic.
      (syms_init): Initialize mandatory_k.
    * parser.l (grammar): The UNTIL and LAST tokens must be matched similarly to collect, without consuming the closing parenthesis, allowing a list of items to be parsed between the symbol and the closure, in the NESTED state.
    * parser.y (gather_clause, collect_clause, elem, repeat_parts_opt, rep_parts_opt): Adjust to new until/last syntax. In the matching productions, the abstract syntax changes to incorporate the options. In the output productions, we throw an error if options are present.
    * txr.1: Documented :mandatory for collect, coll and gather.
* Copyright year bump. | Kaz Kylheku | 2015-12-31 | 1 | -1/+1
    * LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright.
    * linenoise/LICENSE, linenoise/linenoise.c, linenoise/linenoise.h: Bump one principal author's copyright from 2014 to 2015. The code is based on a snapshot of 2015 upstream work.
* TXR quasiliterals and output vars treated as Lisp. | Kaz Kylheku | 2015-12-26 | 1 | -17/+28
    * eval.c (format_field): Function moved here from match.c, along with the introduction of a new behavior: if a meta-expr occurs among the modifiers, its constituent expression is evaluated in its place. This allows for @{a @[expr]} which was previously not allowed in Lisp quasiliterals, but worked in TXR quasiliterals due to the treatment of @ by txeval.
      (subst_vars): Static function turns external, so code in match.c can call it instead of the subst_vars in that module. For that purpose, it needs to take a filter argument and process filters, like the match.c subst_vars.
      (op_quasi_lit): Pass nil as filter argument to subst_vars.
    * eval.h (format_field, subst_vars): Declared.
    * match.c (format_field): Function removed, moved to eval.c and slightly changed.
      (subst_vars): Renamed to tx_subst_vars. By default, now just a wrapper for subst_vars. In compatibility mode, invokes the old logic.
      (do_txeval, do_output_line): Call tx_subst_vars rather than subst_vars.
    * match.h (format_field): Declaration removed.
    * parser.y (expr): Grammar production removed: no longer referenced.
      (o_var): Braced variable case now parsed as n_expr, and expanded as expr by default, since this is Lisp now. In compatibility mode, expanded using expand_meta. Also SYMTOK case must be subject to expansion; an output var can now be a symbol macro.
      (expand_meta): Expand a quasi-literal as Lisp, except in compatibility mode.
    * txr.1: Bit of a documentation update. Existing doc isn't totally clear.
* New --debug-expansion option. | Kaz Kylheku | 2015-12-18 | 1 | -0/+10
    * txr.c (opt_dbg_expansion): New global variable.
      (help): Print summary for --debug-expansion.
      (txr_main): Recognize new option and set flag.
    * parser.y (parse_once): Suppress debug stepping around parser if opt_dbg_expansion is false.
    * txr.h (opt_dbg_expansion): Declared.
    * txr.1: Documented new option.
* Bugfix: dot syntax doesn't record source loc info. | Kaz Kylheku | 2015-12-17 | 1 | -2/+4
    * parser.y (n_expr): Fall back on getting line number info from parser->lineno, if it didn't come from the operands.
* Remove useless test from rlcp_tree. | Kaz Kylheku | 2015-11-28 | 1 | -5/+4
    * parser.y (rlcp_tree): Remove redundant test, around the for loop, of a condition which is the same as its guard condition.
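The kind of redundancy removed here can be shown with a small C sketch. The list type and function names are hypothetical and are not taken from parser.y; the point is only that an `if` whose condition equals the loop's guard adds nothing.

```c
#include <assert.h>
#include <stddef.h>

typedef struct cell {
    struct cell *next;
} cell_t;

/* Before: the outer test repeats the loop's own guard condition. */
size_t count_guarded(const cell_t *c)
{
    size_t n = 0;
    if (c != NULL) {
        for (; c != NULL; c = c->next)
            n++;
    }
    return n;
}

/* After: the for loop already does nothing when c is NULL on entry. */
size_t count_plain(const cell_t *c)
{
    size_t n = 0;
    for (; c != NULL; c = c->next)
        n++;
    return n;
}

/* Convenience check: both forms must agree on any list. */
void check_equivalent(const cell_t *c)
{
    assert(count_guarded(c) == count_plain(c));
}
```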
* WIP: fix erroneous use of rlcp that should be rl. | Kaz Kylheku | 2015-11-26 | 1 | -8/+7
    * parser.y (hash, struct, range): Fix rlcp being wrongly used to copy nonexistent line number info from an integer.
* @(rep) as shorthand for @(coll :vars nil). | Kaz Kylheku | 2015-11-20 | 1 | -2/+9
    * match.c (h_coll): Check for rep symbol, and handle similarly to v_coll. Use symbol in error message.
      (dir_tables_init): Bind rep symbol to h_coll.
    * parser.y (elems): Don't generate rep_elem phrase structure for the sake of catching "rep outside of output"; this production now conflicts with the intent to allow this.
      (elem): Add various REP productions which are clones of COLL.
    * txr.1: Documented new @(rep) usage.
* New iread function. | Kaz Kylheku | 2015-11-07 | 1 | -2/+42
    The read function no longer works like it used to on an interactive terminal because of the support for .. and . syntax on a top-level expression. The iread function is provided which uses a modified syntax that doesn't support these operators on a top-level expression. The parser thus doesn't look one token ahead, and so iread can return immediately.
    * eval.c (eval_init): Register iread intrinsic function.
    * parser.c (prime_parser): Only push back the recently seen token when priming for a regular Lisp read. Handle the prime_interactive method by preparing a SECRET_ESCAPE_I token.
      (lisp_parse_impl): New static function, formed from previous lisp_parse. Takes a boolean argument indicating interactive mode.
      (prime_parser_post): New function.
      (lisp_parse): Now a wrapper for lisp_parse_impl which passes a nil to indicate noninteractive read.
      (iread): New function.
    * parser.h (enum prime_parser): New member, prime_interactive.
      (scrub_scanner, iread, prime_parser_post): Declared.
    * parser.l (prime_scanner): Handle the prime_interactive case the same way as prime_lisp.
      (scrub_scanner): New function.
    * parser.y (SECRET_ESCAPE_I): New token type.
      (i_expr): New nonterminal symbol. Like n_expr, but doesn't support dot or dotdot operators, except in nested subexpressions.
      (spec): Handle SECRET_ESCAPE_I by way of i_expr.
      (sym_helper): Before freeing the token lexeme, call scrub_scanner. If the token is registered as the scanner's most recently seen token, the scanner must forget that registration, because it is no longer valid.
      (parse): Call prime_parser_post.
    * txr.1: Documented iread.
* New range type, distinct from cons cell. | Kaz Kylheku | 2015-11-01 | 1 | -3/+14
    * eval.c (eval_init): Register intrinsic functions rcons, rangep, from and to.
    * gc.c (mark_obj): Traverse RNG objects.
      (finalize): Handle RNG in switch.
    * hash.c (equal_hash, eql_hash): Hashing for RNG objects.
    * lib.c (range_s, rcons_s): New symbol variables.
      (code2type): Handle RNG type.
      (eql, equal): Equality for ranges.
      (less_tab_init): Table extended to cover RNG.
      (less): Semantics defined for ranges.
      (rcons, rangep, from, to): New functions.
      (obj_init): range_s and rcons_s variables initialized.
      (obj_print_impl): Produce #R notation for ranges.
      (generic_funcall, dwim_set): Recognize range objects for indexing.
    * lib.h (enum type): New enum member, RNG. MAXTYPE redefined to RNG value.
      (TYPE_SHIFT): Increased to 5 since there are now 16 type codes.
      (struct range): New struct type.
      (union obj): New member rn, of type struct range.
      (range_s, rcons_s, rcons, rangep, from, to): Declared.
      (range_bind): New macro.
    * parser.l (grammar): New rule for recognizing the #R sequence as HASH_R token.
    * parser.y (HASH_R): New terminal symbol.
      (range): New nonterminal symbol.
      (n_expr): Derives the new range symbol. The n_expr DOTDOT n_expr rule produces rcons expression rather than cons.
    * match.c (format_field): Recognize rcons syntax in fields which is now what ranges translate to. Also recognize range object.
    * tests/013/maze.tl (neigh): Fix code which destructures range as a cons. That can't be done any more.
    * txr.1: Document ranges.
* Renaming some functions for consistency. | Kaz Kylheku | 2015-10-16 | 1 | -1/+1
    * combi.c (perm_list, perm_str, rperm_list, reperm_gen_fun, rperm_vec, comb_vec, rcomb_list, rcomb_vec, rcomb_str): Follow rename of list_vector to list_vec.
    * eval.c (vector_list_s): Global variable renamed to vec_list_s.
      (expand_qquote): Follow vector_list_s to vec_list_s.
      (eval_init): Follow renames of all identifiers. Functions num-chr, chr-num, vector-list and list-vector are registered under new names, while remaining registered under old names.
    * eval.h (vector_list_s): Declaration renamed.
    * filter.c (url_encode): Follow chr_num to chr_int rename.
    * lib.c (make_like, interpose, shuffle): Follow vector_list to vec_list rename.
      (tolist, replace, replace_list): Follow list_vector to list_vec rename.
      (num_chr): Renamed to int_chr.
      (chr_num): Renamed to chr_int.
      (vector_list): Renamed to vec_list.
      (list_vector): Renamed to list_vec.
    * lib.h (num_chr, chr_num, list_vector, vector_list): Declarations renamed.
    * parser.y (vector): Follow vector_list to vec_list rename.
    * txr.1: Updated documentation for num-chr, chr-num, list-vector and vector-list with new names, and notes about the old names being supported, but obsolescent.
* Introducing structs. | Kaz Kylheku | 2015-09-02 | 1 | -2/+21
    * args.c (args_cat_zap): New function.
    * args.h (args_cat_zap): Declared.
    * eval.c (struct_lit_s): New symbol variable.
      (eval_init): Initialize struct_lit_s.
    * eval.h (struct_lit_s): Declared.
    * gc.c (finalize): If a symbol has a struct slot hash attached to it, we must free it when the symbol is reclaimed.
    * lib.c (make_sym): Initialize symbol's slot_cache pointer to null.
      (copy): Copy structure objects.
      (init): Call struct_init to initialize struct module.
    * lib.h (SLOT_CACHE_SIZE): New preprocessor symbol.
      (slot_cache_line_t, slot_cache_t): New typedefs.
      (struct sym): New member, slot_cache.
    * lisplib.c (struct_set_entries, struct_instantiate): New static functions.
      (lisplib_init): Register new functions in dl_table.
    * parser.y (HASH_S): New terminal symbol.
      (struct): New grammar rule.
      (n_expr): Derive struct.
      (yybadtoken): Map HASH_S to #S string.
    * parser.l (grammar): Recognize #S and return HASH_S token.
    * share/txr/stdlib/place.tl (slot): New defplace.
    * share/txr/stdlib/struct.tl: New file.
    * struct.c: New file.
    * struct.h: New file.
    * Makefile (OBJS): Adding struct.o.
* Print parser error message for parse-time exceptions. | Kaz Kylheku | 2015-08-29 | 1 | -3/+25
    * parser.y (parse_once, parse): Catch error exceptions coming out of yyparse, print message, then re-throw. This way we see the file and line number near where that happened.
* New --yydebug option. | Kaz Kylheku | 2015-08-24 | 1 | -0/+9
    * parser.y (have_yydebug): New global constant.
      (yydebug_onoff): New function.
    * parser.h (have_yydebug, yydebug_onoff): Declared.
    * txr.c (help): List --yydebug option.
      (txr_main): --yydebug option implemented.
* Fix broken @@@<n>/@@@rest references in quasiliterals. | Kaz Kylheku | 2015-08-19 | 1 | -2/+2
    * parser.y (quasi_meta_helper): When obj is a sys:var, leave it alone; don't add another layer of var. Also, do the same if it is a sys:expr.
    * tests/012/quasi.tl: Added test case.
* Quasiquote regression from 110. | Kaz Kylheku | 2015-08-19 | 1 | -5/+5
    The problem is that one-argument function calls like @(whatever arg) in a quasiliteral are being turned into sys:var items.
    * parser.y (quasi_meta_helper): Remove bogus check on length. The default case is now var, so the var_s check actually matters. The integerp check for the argument of a var form didn't do anything because the entire if statement conditionally selected a useless goto. Removing it for consistent treatment of var items.
    * tests/012/quasi.tl: Some new test cases involving @rest. These new tests pass whether or not we have that integerp(second(obj)) test in the quasi_meta_helper function. Either way @rest and @@rest produce the same thing.
* Get Berkeley Yacc port of the parser working again. | Kaz Kylheku | 2015-08-14 | 1 | -0/+7
    * parser.y (byacc_fool): New grammar nonterminal symbol and dummy rule set.
      (spec): Use dummy byacc_fool to create a fake continuation in the grammar, so the Berkeley-Yacc-generated parser doesn't throw a syntax error. Our YYACCEPT prevents the byacc_fool part from consuming more than one token of lookahead. Bison doesn't need this because it has $default actions which reduce regardless of the lookahead token. BYacc insists on reducing only if it can match $end (end of input), and not other tokens, which constitute syntax errors.
* Remove unwanted yyparse declaration from y.tab.h. | Kaz Kylheku | 2015-08-14 | 1 | -0/+1
    * Makefile (y.tab.c): Putting in an ugly workaround for an obnoxious new behavior introduced in Bison 3.x, which breaks our build on platforms that have a newer Bison. After generating y.tab.h, we remove the unwanted declaration with sed.
    * parser.y (yyparse): Declare, since y.tab.h doesn't any more, and the newer Bison's parse skeletons expect it.
* Word splices not quite on board with consing dot handling. | Kaz Kylheku | 2015-08-14 | 1 | -2/+2
    * parser.y (r_exprs): The WSPLICE and QWSPLICE syntax at the front of a list must now initialize the terminating atom to unique_s, not to nil. Without this we get mysterious "misplaced consing dot" errors (even though no consing dot occurs).
* Use new pushback token priming for single regex parse. | Kaz Kylheku | 2015-08-12 | 1 | -2/+2
    * parser.h (enum prime_parser): New enum.
      (prime_parser, prime_scanner, parse): Declarations updated with new argument.
    * parser.c (prime_parser): New argument of enum prime_parser type. Select appropriate secret token for regex and Lisp case. Pass prime selector down to prime_scanner.
      (regex_parse): Do not prepend secret escape to string. Do not use parse_once function; instead do the parser init and cleanup here and use the parse function.
      (lisp_parse): Pass new argument to parse, configuring the parser to be primed for Lisp parsing.
    * parser.l (grammar): Rule producing SECRET_ESCAPE_R removed.
      (prime_scanner): New argument. Pop the scanner state down to INITIAL. Then unconditionally switch to appropriate state based on priming configuration.
    * parser.y (parse): New argument for priming selection, passed down to prime parser.
* Revision to .. syntax. | Kaz Kylheku | 2015-08-12 | 1 | -10/+3
    * parser.y (r_exprs, n_expr): Move the DOTDOT syntactic sugar rule from r_exprs to n_expr, where it is much simpler. This also means that the a..b syntax is now an expression by itself; it need not be enclosed in a list. The DOTDOT operator is made right associative; or rather its existing %right declaration is now activated.
    * txr.1: Remove documentation stating that the .. notation must be used in a list, and not in the dotted position of an improper list. Document the behavior in the dotted position, and document right associativity.
* Crafting a better parser-priming hack. | Kaz Kylheku | 2015-08-12 | 1 | -1/+2
    The method of inserting a character sequence which generates a SECRET_ESCAPE_E token is being replaced with a purely token based method. Because we don't manipulate the input stream, the lexer is not involved. We don't have to flush its state and deal with the carry-over of the yy_hold_char.
    This comes about because recent changes expose a weakness in the old scheme. Now that a top-level expression can have the form expr.expr, it means that the Yacc parser reads one token ahead, to see whether there is a dot or something else. This lookahead token is discarded. We must re-create it when we call yyparse again.
    This re-creation is done by creating a custom yylex function, which can maintain pushback tokens. We can prime this array of pushback tokens to generate the SECRET_ESCAPE_E, as well as to re-inject the lookahead symbol that was thrown away by the previous yyparse.
    To know which lookahead symbol to re-inject is simple: the scanner just keeps a copy of the most recent token that it returns to the parser. When the parser returns, that token must be the lookahead one.
    The tokens we keep now in the parser structure are subject to garbage collection, and so we must mark them. Since the YYSTYPE union has no type field, a new API is opened up into the garbage collector to help implement a conservative GC technique.
    * gc.c (gc_is_heap_obj): New function.
    * gc.h (gc_is_heap_obj): Declared.
    * match.c: Include y.tab.h. This is now needed by any module that needs to instantiate a parser_t structure, because members of type YYSTYPE occur in the structure. (parser.h can still be included without y.tab.h, but only an incomplete declaration for the parser structure is then given, and a few functions are not declared.)
    * parser.c (yy_tok_mark): New static function.
      (parser_mark): Mark the recent token and the pushback tokens.
      (parser_common_init): Initialize the recent token, the pushback tokens, and the pushback stack index.
      (pushback_token): New static function.
      (prime_parser): hold_byte argument removed. Body considerably simplified. The catenated stream trick is no longer required. All we do here is set up two pushback tokens and prime the scanner, if necessary, so it is in the right start state for Lisp.
    * parser.l (YY_DECL): Take over definition of scanning function, renaming to yylex_impl, so we can implement yylex.
      (grammar): Rule which produces SECRET_ESCAPE_E token removed.
      (reset_scanner): Function removed.
      (yylex): New function.
    * parser.h (struct parser): Now only forward-declared unless y.tab.h has been included. New members, recent_tok, tok_pushback and tok_idx.
      (yyset_hold_char): Declared.
      (reset_scanner): Declaration removed.
      (yylex): Declared (if y.tab.h included).
      (prime_parser): Declaration updated.
      (prime_scanner): Declared.
    * Makefile: express new dependency on existence of y.tab.h of txr.o, match.o and parser.o.
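The pushback scheme described in this commit can be sketched in C roughly as follows. This is a simplified illustration, not the code from parser.c or parser.l: the token type, the array-backed "scanner", and the names `lexer_t`, `scan_impl`, `push_back`, `next_token` and `prime` are all hypothetical stand-ins, and garbage-collector marking is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token: the real parser pairs a token code with YYSTYPE. */
typedef struct {
    int code;               /* 0 means end of input */
} token_t;

enum { PUSHBACK_MAX = 4 };

typedef struct {
    const int *input;       /* stand-in scanner input: 0-terminated codes */
    size_t pos;
    token_t recent;         /* most recent token handed to the parser */
    token_t pushback[PUSHBACK_MAX];
    int idx;                /* number of tokens currently pushed back */
} lexer_t;

/* Stand-in for the generated scanner function. */
static token_t scan_impl(lexer_t *lx)
{
    token_t t = { lx->input[lx->pos] };
    if (t.code != 0)
        lx->pos++;
    return t;
}

void push_back(lexer_t *lx, token_t t)
{
    assert(lx->idx < PUSHBACK_MAX);
    lx->pushback[lx->idx++] = t;
}

/* Wrapper the parser calls: pushed-back tokens take priority over input. */
token_t next_token(lexer_t *lx)
{
    token_t t = (lx->idx > 0) ? lx->pushback[--lx->idx] : scan_impl(lx);
    lx->recent = t;         /* remember it: it may become discarded lookahead */
    return t;
}

/* Before re-entering the parser: re-inject the discarded lookahead, then
   stack the escape token on top so that it is delivered first. */
void prime(lexer_t *lx, int escape_code)
{
    token_t esc = { escape_code };
    if (lx->recent.code != 0)
        push_back(lx, lx->recent);
    push_back(lx, esc);
}
```

With input codes 10, 20, 30: if the parser consumed 10 and read 20 only as lookahead before returning, calling `prime(&lx, 99)` makes the next parse see 99, then 20, then 30. The input stream itself is never touched, which is the property the commit message emphasizes.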
* Dot with no whitespace generates qref syntax. | Kaz Kylheku | 2015-08-10 | 1 | -4/+16
    a.b.(expr ...).c -> (qref a b (expr ...) c)
    Consing dot requires whitespace.
    * eval.c (qref_s): New symbol global variable.
      (eval_init): Initialize qref_s.
    * eval.h (qref_s): Declared.
    * parser.l (REQWS): New pattern definition, required whitespace.
      (grammar): New rules to scan CONSDOT (space required on both sides) and LAMBDOT (space required after).
    * parser.y (CONSDOT, LAMBDOT): New token types.
      (list): (. n_expr) rule replaced with LAMBDOT and CONSDOT.
      (r_exprs): r_exprs . n_expr consing dot rule replaced with CONSDOT.
      (n_expr): New n_expr . n_expr rule introduced here for producing qref expressions.
      (yybadtoken): Handle CONSDOT and LAMBDOT.
    * txr.1: Documented qref dot.
* Diagnose bad consing dot syntax like (a . b . c). | Kaz Kylheku | 2015-08-10 | 1 | -7/+22
    * parser.y (r_exprs): Use unique object in the terminating cons to indicate the empty spot where the dotted cdr item will go. Check for misplaced consing dot.
      (misplaced_consing_dot_check): New static function. Checks for the terminator atom spot being taken already. Thus, the spot may be taken only by the very last reduction, such that the next reduction is r_exprs -> n_exprs where the terminating atom is processed.
    * parser.c (unique_s): New global variable.
      (parse_init): Initialize unique_s.
    * parser.h (unique_s): Declared.
    * share/txr/stdlib/place.tl (sys:placelet-1): We have a misplaced consing dot here! It was working correctly by "terminating atom propagation" behavior, which allowed (a . b c d) to produce (a c d . b). If a single terminating atom occurred in the middle of a list, it was promoted to the end.
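The sentinel idea behind this check can be sketched in C as below. It is only an illustration under assumptions: the builder structure and the names `list_builder_t`, `builder_add` and `builder_dot` are hypothetical, `NULL` plays the role of nil, and the real grammar actions construct the list differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The address of this object serves as the "nothing here yet" marker.
   NULL cannot be used for that, because nil is a valid dotted atom. */
static const char unique_marker = 0;
#define UNIQUE ((const void *) &unique_marker)

enum { MAX_ITEMS = 8 };

typedef struct {
    const void *items[MAX_ITEMS];
    size_t count;
    const void *tail;       /* UNIQUE until a dotted atom has been supplied */
} list_builder_t;

void builder_init(list_builder_t *b)
{
    b->count = 0;
    b->tail = UNIQUE;
}

/* Adds an ordinary element; fails if a dotted atom was already given,
   which corresponds to input like (a . b c). */
bool builder_add(list_builder_t *b, const void *item)
{
    if (b->tail != UNIQUE)
        return false;
    assert(b->count < MAX_ITEMS);
    b->items[b->count++] = item;
    return true;
}

/* Fills the terminating-atom spot; fails if it is already taken,
   which corresponds to input like (a . b . c). */
bool builder_dot(list_builder_t *b, const void *atom)
{
    if (b->tail != UNIQUE || b->count == 0)
        return false;
    b->tail = atom;
    return true;
}
```

Because the empty spot is marked with a dedicated object rather than nil, an explicit `(a . nil)` still counts as having used the dot, so a second dot or a trailing element after it is rejected.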
* Better diagnosis for loose @ forms. | Kaz Kylheku | 2015-08-06 | 1 | -4/+5
    * eval.c (op_meta_error): New static function.
      (eval_init): Register sys:var and sys:expr as operators that throw error.
    * parser.y (sym_helper): Take parser_t instead of scanner_t argument so we have access to the name and line number. Obtain scanner internally from parser. Add source location info to (sys:var ...) form.
      (symhlpr): Retarget macro to pass parser rather than scanner to sym_helper.
* Handle setting of parse name through prime_parser. | Kaz Kylheku | 2015-07-10 | 1 | -2/+2
    * parser.c (prime_parser): Take name as argument, and install it into parser.
      (lisp_parse): Pass name to parse, instead of setting it in the parser object.
    * parser.y (parse): Take name as argument and pass down to prime_parser.
    * parser.h (prime_parser, parse): Declarations updated.
* Bugfix: lexer loses unmatched "hold char" between top-level forms. | Kaz Kylheku | 2015-07-10 | 1 | -2/+1
    Test case: file containing 4(prinl 3). Scanner consumes 4 and (. The ( is lost when the scanner is reset for the next call to yyparse, resulting in just prinl being read and interpreted as a variable.
    * parser.c (prime_parser): If present, append hold byte to priming string. Takes parser_t * instead of parser, and returns void now.
    * parser.l (reset_scanner): Now returns int value, the value of the scanner's yy_hold_char variable which is nonzero when the scanner is hanging on to an unmatched byte of input.
    * parser.h (reset_scanner, prime_parser): Declarations updated.
    * parser.y (parse): Pass hold byte returned by reset_scanner to prime_parser.
* Parser cleanup: embed scanner in parser. | Kaz Kylheku | 2015-07-09 | 1 | -17/+4
    * parser.c (parser_destroy): New GC finalizer static function.
      (parser_ops): Register parser_destroy.
      (parser_common_init): New function, shared by parse and parse_once. Initializes embedded scanner.
      (parser_cleanup): New function, shared by parse_once and parser_destroy.
      (parser): Use parser_common_init.
    * parser.h (parser_t): New member, yyscan.
      (reset_scanner, parser_common_init): Declared.
    * parser.l (reset_scanner): New function.
    * parser.y (parse_once): Use parser_common_init, and thus perform only a few initializations. Do not define scanner as a local variable.
      (parse): Call reset_scanner instead of yylex_init since the scanner is being reused, and for the same reason do not call yylex_destroy. GC will do that now.
* First round of quasiliteral-related fixes. | Kaz Kylheku | 2015-06-26 | 1 | -18/+23
    * parser.l: Do not try to recognize floating-point literals in QSPECIAL state; that is not possible because @134.3 in a quasiliteral parses as a METANUM followed by ".3". On the other hand, recognize METANUM literals in QSPECIAL state, so that @@123 scans. Recognize @ as a token in QSPECIAL state, so @@abc will scan. When transitioning from QSILIT and QWLIT states to QSPECIAL upon scanning @, return a @ token, which is now parsed in the grammar.
    * parser.y (quasi_meta_helper): New static function.
      (q_var): Do not handle SYMTOK any more, only the braced variable syntax. SYMTOK is handled as a n_expr. Braced vars are handled with explicit '@' token, which is now produced by the scanner when it shifts from QSILIT to QSPECIAL.
      (quasi_item): No longer necessary to recognize various forms here such as quotes and splices. Just recognize a n_expr, preceded by '@'.
* Error handling improvement in read. | Kaz Kylheku | 2015-06-10 | 1 | -0/+7
  * parser.y (spec): New grammar production to handle the cases in which SECRET_ESCAPE_E is not followed by anything (the input ends before any object is scanned, or there is no input token which starts an object).
  * parser.c (lisp_parse): Deal with EOF indication from parser (the syntax_tree member of parser_t set to nao).
* * parser.y (yybadtoken): Print unexpected character | Kaz Kylheku | 2015-06-10 | 1 | -1/+1
  literally rather than as a Lisp character literal.
* * parser.c (stream_parser_hash): New static variable. | Kaz Kylheku | 2015-06-07 | 1 | -0/+1
  (parser_mark): Mark parser and primer members. (parser, ensure_parser): New argument: primer. (get_parser_impl, ensure_parser): New static functions. (prime_parser): New function. (lisp_parse): Multiple calls to this function on the same stream now logically continue the parse, not resetting the line number to 1. (parse_init): Initialize and gc-protect stream_parser_hash.
  * parser.h (parser_t): New members, primer and parser. (prime_parser): Declared. (parser): Declaration updated.
  * parser.y (parse): Now responsible for calling prime_parser.
* * match.c (v_load): Call parse_once rather than parse. | Kaz Kylheku | 2015-06-07 | 1 | -1/+21
  * parser.c (regex_parse): Likewise.
  * txr.c (txr_main): Likewise.
  * parser.h (parse): Declaration updated. (parse_once): Declared.
  * parser.y (parse_once): New function, same as old parse implementation. (parse): Becomes a one-argument function which works with a previously initialized parser and continues the parse.
* Fix source location for dangling unquotes and splices. | Kaz Kylheku | 2015-04-30 | 1 | -10/+22
  * parser.y (grammar): Propagate the parser line number to the unquote or splice form, if it has not received location info from its operand (because its operand is an atom). In the quasi_item case, we also use rlcp_tree to make sure the info is propagated through the list being consed up. (rlcp_tree): Bugfix: propagate the source location info to every cons in the list itself, not just into their cars.
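The rlcp_tree bugfix can be pictured with a toy model. The following is a sketch under stated assumptions, not TXR's implementation: `struct cell`, its `line` field, and `propagate_line` are hypothetical, atoms are not modeled (a `car` is either NULL or a nested list), and leaving already-recorded locations untouched is an assumption made here for illustration. The point it shows is that the cells of the list spine receive the location too, in addition to whatever is reachable through each `car`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy cons cell; TXR's real objects and location bookkeeping differ. */
struct cell {
    struct cell *car; /* NULL, or a nested list */
    struct cell *cdr; /* NULL terminates the list */
    int line;         /* 0 means "no location recorded" (assumption) */
};

/* Record line on every cons in the tree that has no location yet:
 * each cell of the list spine, and recursively every nested list
 * found in a car. */
void propagate_line(struct cell *list, int line)
{
    for (; list != NULL; list = list->cdr) {
        if (list->line == 0)
            list->line = line;           /* the spine cell itself */
        propagate_line(list->car, line); /* conses reachable via car */
    }
}
```

Visiting only `list->car` in the loop, without the assignment to the spine cell, would reproduce the kind of gap the commit message describes.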
* Remove silly package lookup from keywordp. | Kaz Kylheku | 2015-04-25 | 1 | -1/+1
  This tiny change yields a 165% (2.65X) speedup in the tst/tests/011/mandel.txr test case.
  * lib.c (keywordp): Use keyword_package_var instead of the keyword_package macro which looks up the global environment.
  * parser.y (sym_helper): Likewise.
* Fix quasistring regression introduced in TXR 81. | Kaz Kylheku | 2015-04-18 | 1 | -0/+16
  * parser.y (expand_meta): This function must recognize quasistrings: inside (sys:quasi ...) forms, (sys:var ...) forms do not denote TXR Lisp variables. These must not be expanded. Doing so is not only wrong, but the way it was done broke brace variables by stripping their arguments.
* Allow quasiquotes in braces and quasiliterals, and quotes in braces. | Kaz Kylheku | 2015-04-15 | 1 | -0/+2
  * parser.l: Consolidate rules for recognizing quote, unquote, and quasiquote. An effect of this is that quasiquotes can now occur in braces and in string quasiliterals.
  * parser.y (quasi_item): Support quotes and quasiquotes as quasi items, that is to say, objects denoted by @ in a quasiliteral.
* * Makefile: Removing trailing spaces. | Kaz Kylheku | 2014-10-24 | 1 | -23/+23
  (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks.
  * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c,
  * lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c,
  * signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.