summaryrefslogtreecommitdiffstats
path: root/parser.y
Commit message (Collapse)AuthorAgeFilesLines
* Fix broken @@@<n>/@@@rest references in quasiliterals.Kaz Kylheku2015-08-191-2/+2
| | | | | | | | * parser.y (quasi_meta_helper): When obj is a sys:var, leave it alone; don't add another layer of var. Also, do the same if it is a sys:expr. * tests/012/quasi.tl: Added test case.
* Quasiquote regression from 110.Kaz Kylheku2015-08-191-5/+5
| | | | | | | | | | | | | | | | The problem is that one-argument function calls like @(whatever arg) in a quasiliteral being turned into sys:var items. * parser.y (quasi_meta_helper): Remove bogus check on length. The default case is now var, so the var_s check actually matters. The integerp check for the argument of a var form didn't do anything because the entire if statment conditionally selected a useless goto. Removing it for consistent treatment of var items. * tests/012/quasi.tl: Some new test cases involving @rest. These new tests pass whether or not we have that integerp(second(obj)) test in the quasi_meta_helper function. Either way @rest and @@rest produce the same thing.
* Get Berkeley Yacc port of the parser working again.Kaz Kylheku2015-08-141-0/+7
| | | | | | | | | | | | | * parser.y (byacc_fool): New grammar nonterminal symbol and dummy rule set. (spec): Use dummy byacc_fool to create a fake continuation in the grammar, so the Berkeley-Yacc-generated parser doesn't throw a syntax error. Our YYACCEPT prevents the byacc_fool part from consuming more than one token of lookahead. Bison doesn't need this because it has $default actions which reduce regardless of the lookahead token. BYacc insists on reducing only if it can match $end (end of input), and not other tokens, which constitute syntax errors.
* Remove unwanted yyparse declaration from y.tab.h.Kaz Kylheku2015-08-141-0/+1
| | | | | | | | | | * Makefile (y.tab.c): Putting in an ugly workaround for an obnoxious new behavior introduced in Bison 3.x, which breaks our build on platforms that have a newer Bison. After generating y.tab.h, we remove the unwanted declaration with sed. * parser.y (yyparse): Declare, since y.tab.h doesn't any more, and the newer Bison's parse skeletons expect it.
* Word splices not quite on board with consing dot handling.Kaz Kylheku2015-08-141-2/+2
| | | | | | | * parser.y (r_exprs): The WSPLICE and QWSPLICE syntax at the front of a list must now initialize the terminating atom to unique_s, not to nil. Without this we get mysterious "misplaced consing dot" errors (even though no consing dot occurs).
* Use new pushback token priming for single regex parse.Kaz Kylheku2015-08-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | * parser.h (enum prime_parser): New enum. (prime_parser, prime_scanner, parse): Declarations updated with new argument. * parser.c (prime_parser): New argument of enum prime_parser type Select appropriate secret token for regex and Lisp case. Pass prime selector down to prime_scanner. (regex_parse): Do not prepend secret escape to string. Do not use parse_once function; instead do the parser init and cleanup here and use the parse function. (lisp_parse): Pass new argument to parse, configuring the parser to be primed for Lisp parsing. * parser.l (grammar): Rule producing SECRET_ESCAPE_R removed. (prime_scanner): New argument. Pop the scanner state down to INITIAL. Then unconditionally switch to appopriate state based on priming configuration. * parser.y (parse): New argument for priming selection, passed down to prime parser.
* Revision to .. syntax.Kaz Kylheku2015-08-121-10/+3
| | | | | | | | | | | | | * parser.y (r_exprs, n_expr): Move the DOTDOT syntactic sugar rule from r_exprs to n_expr, where it is much simpler. This also means that the a..b syntax is now an expression by itself; it need not be enclosed in a list. The DOTDOT operator is made right associative; or rather its existing %right declaration is now activated. * txr.1: Remove documentation stating that the .. notation must be used in a list, and not in the dotted position of an improper list. Document the behavior in the dotted position, and document right associativity.
* Crafting a better parser-priming hack.Kaz Kylheku2015-08-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The method of inserting a character sequence which generates a SECRET_TOKEN_E token is being replaced with a purely token based method. Because we don't manipulate the input stream, the lexer is not involved. We don't have to flush its state and deal with the carry-over of the yy_hold_char. This comes about because recent changes expose a weakness in the old scheme. Now that a top-level expression can have the form expr.expr, it means that the Yacc parser reads one token ahead, to see whether there is a dot or something else. This lookahead token is discarded. We must re-create it when we call yyparse again. This re-creation is done by creating a custom yylex function, which can maintain pushback tokens. We can prime this array of pushback tokens to generate the SECRET_TOKEN_E, as well as to re-inject the lookahead symbol that was thrown away by the previous yyparse. To know which lookahead symbol to re-inject is simple: the scanner just keeps a copy of the most recent token that it returns to the parser. When the parser returns, that token must be the lookahead one. The tokens we keep now in the parser structure are subject to garbage collection, and so we must mark them. Since the YYSTYPE union has no type field, a new API is opened up into the garbage collector to help implement a conservative GC technique. * gc.c (gc_is_heap_obj): New function. * gc.h (gc_is_heap_obj): Declared. * match.c: Include y.tab.h. This is now needed by any module that needs to instantiate a parser_t structure, because members of type YYSTYPE occur in the structure. (parser.h can still be included without y.tab.h, but only an incomplete declaration for the parser strucure is then given, and a few functions are not declared.) * parser.c (yy_tok_mark): New static function. (parser_mark): Mark the recent token and the pushback tokens. (parser_common_init): Initialize the recent token, the pushback tokens, and the pushback stack index. (pushback_token): New static function. (prime_parser): hold_byte argument removed. Body considerably simplified. The catenated stream trick is no longer required. All we do here is set up two pushback tokens and prime the scanner, if necessary, so it is in the right start state for Lisp. * parser.l (YY_DECL): Take over definition of scanning function, renaming to yylex_impl, so we can implement yylex. (grammar): Rule which produces SECRET_ESCAPE_E token removed. (reset_scanner): Function removed. (yylex): New function. * parser.h (struct parser): Now only forward-declared unless y.tab.h has been included. New members, recent_tok, tok_pushback and tok_idx. (yyset_hold_char): Declared. (reset_scanner): Declaration removed. (yylex): Declared (if y.tab.h included). (prime_parser): Declaration updated. (prime_scanner): Declared. * Makefile: express new dependency on existence of y.tab.h of txr.o, match.o and parser.o.
* Dot with no whitespace generates qref syntax.Kaz Kylheku2015-08-101-4/+16
| | | | | | | | | | | | | | | | | | | | | | | a.b.(expr ...).c -> (qref a b (expr ...) c) Consing dot requires whitespace. * eval.c (qref_s): New symbol global variable. (eval_init): Initialize qref_s. * eval.h (qref_s): Declared. * parser.l (REQWS): New pattern definition, required whitespace. (grammar): New rules to scan CONSDOT (space required on both sides) and LAMBDOT (space required after). * parser.y (CONSDOT, LAMBDOT): New token types. (list): (. n_expr) rule replaced with LAMBDOT and CONSDOT. (r_exprs): r_exprs . n_expr consing dot rule replaced with CONSDOT. (n_expr): New n_expr . n_expr rule introduced here for producing qref expressions. (yybadtoken): Handle CONSDOT and LAMBDOT. * txr.1: Documented qref dot.
* Diagnose bad consing dot syntax like (a . b . c).Kaz Kylheku2015-08-101-7/+22
| | | | | | | | | | | | | | | | | | | | | * parser.y (r_exprs): Use unique object in the terminating cons to indicate the empty spot where the dotted cdr item will go. Check for misplaced consing dot. (misplaced_consing_dot_check): New static function. Checks for the terminator atom spot being taken already. Thus, the spot may be taken only by the very last reduction, such that the next reduction is r_exprs -> n_exprs where the terminating atom is processed. * parser.c (unique_s): New global variable. (parse_init): Initialize unique_s. * parser.h (unique_s): Declared. * share/txr/stdlib/place.tl (sys:placelet-1): We have a misplaced consing dot here! It was working correctly by "terminating atom propagation" behavior, which allowed (a . b c d) to produce (a c d . b). If a single terminating atom occurred in the middle of a list, it was promoted to the end.
* Better diagnosis for loose @ forms.Kaz Kylheku2015-08-061-4/+5
| | | | | | | | | | | | | * eval.c (op_meta_error): New static function. (eval_init): Register sys:var and sys:expr as operators that throw error. * parser.y (sym_helper): Take parser_t instead of scanner_t argument so we have access to the name and line number. Obtain scanner internally from parser. Add source location info to (sys:var ...) form. (symhlpr): Retarget macro to pass parser rather than scanner to sm_helper.
* Handle setting of parse name through prime_parser.Kaz Kylheku2015-07-101-2/+2
| | | | | | | | | | | | * parser.c (prime_parser): Take name as argument, and install it into parser. (lisp_parser): Pass name to parse, instead of setting it in the parser object. * parser.y (parse): Take name as argument and pass down to prime_parser. * parser.h (prime_parser, parse): Declarations updated.
* Bugfix: lexer loses unmatched "hold char" between top-level forms.Kaz Kylheku2015-07-101-2/+1
| | | | | | | | | | | | | | | | | | Test case: file containing 4(prinl 3). Scanner consumes 4 and (. The ( is lost when the scanner is reset for the next call to yyparse, resulting in jut prinl being read and interpreted as a variable. * parser.c (prime_parser): If present, append hold byte to priming string. Takes parser_t * instead of parser, and returns void now. * parser.l (reset_scanner): Now returns int value, the value of the scanner's yy_hold_char variable which is nonzero when the scanner is hanging on to an unmatched byte of input. * parser.h (reset_scanner, prime_parser): Declarations updated. * parser.y (parse): Pass hold byte returned by reset_scanner to prime_parser.
* Parser cleanup: embed scanner in parser.Kaz Kylheku2015-07-091-17/+4
| | | | | | | | | | | | | | | | | | | | | | | * parser.c (parser_destroy): New GC finalizer static function. (parser_ops): Register parser_destroy. (parser_common_init): New function, shared by parse and parse_once. Initializes embedded scanner. (parser_cleanup): New function, shared by parse_once and parser_destroy. (parser): Use parser_common_init. * parser.h (parser_t): New member, yyscan. (reset_scanner, parser_common_init): Declared. * parser.l (reset_scanner): New function. * parser.y (parse_once): Use parser_common_init, and thus perform only a few initializations. Do not define scanner as a local variable. (parse): Call reset_scanner instead of yylex_init since the scanner is being reused, and for the same reason do not call yylex_destroy. GC will do that now.
* First round of quasiliteral-related fixes.Kaz Kylheku2015-06-261-18/+23
| | | | | | | | | | | | | | | | | | | | | * parser.l: Do not try to recognize floating-point literals in QSPECIAL state; that is not possible because @134.3 in a quasiliteral parses as a METANUM followed by ".3". On the other hand, recognize METANUM literals in QSPECIAL state, so that @@123 scans. Recognize @ as a token in QSPECIAL state, so @@abc will scan. When transitioning from QSILIT and QWLIT states to QSPECIAL upon scanning @, return a @ token, which is now parsed in the grammar. * parser.y (quasi_meta_helper): New static function. (q_var): Do not handle SYMTOK any more, only the braced variable syntax. SYMTOK is handled as a n_expr. Braced vars are handled with explicit '@' token, which is now produced by the scanner when it shifts from QSILIT to QSPECIAL. (quasi_item): No longer necessary to recognize various forms here such as quotes and splices. Just recognize a n_expr, preceded by '@'.
* Error handling improvement in read.Kaz Kylheku2015-06-101-0/+7
| | | | | | | | | | * parser.y (spec): New grammar production to handle the cases that SECRET_ESCAPE_E is not followed by anything (the input ends before any object is scanned, or there is no input token which starts an object). * parser.c (lisp_parse): Deal with EOF indication from parser (the syntax_tree member of parser_t set to nao).
* * parser.y (yybadtoken): Print unexpected characterKaz Kylheku2015-06-101-1/+1
| | | | literally rather than as a Lisp character literal.
* * parser.c (stream_parser_hash): New static variable.Kaz Kylheku2015-06-071-0/+1
| | | | | | | | | | | | | | | | | | (parser_mark): Mark parser and primer members. (parser, ensure_parser): new argument: primer. (get_parser_impl, ensure_parser): New static functions. (prime_parser): New function. (lisp_parse): Multiple calls to this function on the same stream now logically continue the parse, not resetting the line number to 1. (parse_init): Initialize and gc-protect stream_parser_hash. * parser.h (parser_t): New members, primer and parser. (prime_parser): Declared. (parser): Declaration updated. * parser.y (parse): Now responsible for calling prime_parser.
* * match.c (v_load): Call parse_once rater than parse.Kaz Kylheku2015-06-071-1/+21
| | | | | | | | | | | | | * parser.c (regex_parse): Likewise. * txr.c (txr_main): Likewise. * parser.h (parse): Declaration updated. (parse_once): Declared. * parser.y (parse_once): New function, same as old parse implementation. (parse): Becomes one argument function which works with a previously initialized parser and continues the parse.
* Fix source location for dangling unquotes and splices.Kaz Kylheku2015-04-301-10/+22
| | | | | | | | | | * parser.y (grammar): Propagate the parser line number to the unquote or splice form, if it has not received location info from its operand (because its operand is an atom). In the quasi_item case, we also use rlcp_tree to make sure the info is propagated through the list being consed up. (rlcp_tree): Bugfix: propagate the source location info to every cons in the list itself, not just into their cars.
* Remove silly package lookup from keywordp.Kaz Kylheku2015-04-251-1/+1
| | | | | | | | | | This tiny change yields a 165% (2.65X) speedup in the tst/tests/011/mandel.txr test case. * lib.c (keywordp): Use keyword_package_var instead of the keyword_package macro which looks up the global environment. * parser.y (sym_helper): Likewise.
* Fix quasistring regression introduced in TXR 81.Kaz Kylheku2015-04-181-0/+16
| | | | | | | | * parser.y (expand_meta): This function must recognize quasistrings, inside (sys:quasi ...) forms, (sys:var ...) forms do not denote TXR Lisp variables. These must not be expanded. Doing so is not only wrong, but the way it was done broke brace variables by stripping their arguments.
* Allow quasiquotes in braces and quasiliterals, and quotes in braces.Kaz Kylheku2015-04-151-0/+2
| | | | | | | | | * parser.l: Consolidate rules for recognizing quote, unquote, and quasiquote. An effect of this is that quasiquotes can now occur in braces and in string quasiliterals. * parser.y (quasi_item): Support quotes and quasiquotes as quasi items: that is to say, i.e. objects denoted by @ in a quasiliteral.
* * Makefile: Removing trailing spaces.Kaz Kylheku2014-10-241-23/+23
| | | | | | | | | | (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks. * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c, * lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c, * signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.
* Source file inclusion implemented: needed for macros.Kaz Kylheku2014-10-201-2/+17
| | | | | | | | | | | | | | | | | | | | | * match.c (include_s): New symbol variable. (v_load): Function extended to handle include semantics. (include): External wrapper function for doing inclusion via v_load. (syms_init): include_s initialized. * match.h (include_s): Declared. (include): Declared. * parser.y (check_for_include): New static function. (clauses_rev): Use check_for_include to replace @(include ..) directive. * txr.1: Documented include. * genvim.txr: Added include symbol. * txr.vim: Regenerated.
* * parser.y (r_exprs): New grammar symbol. r_exprs usesKaz Kylheku2014-10-191-16/+46
| | | | | | | | | left-recursive rules to avoid filling the yacc stack, and returns the items in reverse order. The output of each r_exprs-generating rule consists of a list which gives the terminating atom, and then the list contents in reverse order. (n_exprs): Becomes wrapper for r_exprs which deals with the terminating atom, and reversed items.
* * parser.y: Allow TXR to support large programs, and efficiently so.Kaz Kylheku2014-10-181-5/+7
| | | | | | | | | | | (clauses_rev): New grammar symbol. Uses a left-recursive rule that does not consume an amount of parser stack proportional to the number of clauses, and sticks to efficient consing, which means that the list is built up in reverse. (clauses): Now just a wrapper rule for clauses_rev which nreverses its output. (clauses_opt): Retargetted to use clauses_rev instead of clauses, and reverse its output.
* Converting cast expressions to macros that are retargettedKaz Kylheku2014-10-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to C++ style casts when compiling as C++. * lib.h (strip_qual, convert, coerce): New casting macros. (TAG_MASK, tag, type, wli_noex, auto_str, static_str, litptr, num_fast, chr, lit_noex, nil, nao): Use cast macros. * arith.c (mul, isqrt_fixnum, bit): Use cast macros. * configure (INT_PTR_MAX): Define using cast macro. * debug.c (debug_init): Use cast macro. * eval.c (do_eval, expand_macro, reg_op, reg_mac, eval_init): Use cast macros. * filter.c (filter_init): Use cast macro. * gc.c (more, mark_obj, in_heap, mark, sweep_one, unmark): Use cast macros. * hash.c (hash_double, equal_hash, eql_hash, hash_equal_op, hash_hash_op, hash_print_op, hash_mark, make_hash, make_similar_hash, copy_hash, gethash_c, gethash, gethash_f, gethash_n, remhash, hash_count, get_hash_userdata, set_hash_userdata, hash_iter_destroy, hash_iter_mark, hash_begin, hash_uni, hash_diff, hash_isec): Use cast macros. * lib.c (code2type, chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc, chk_strdup, num, c_num, string, mkstring, mkustring, upcase_str, downcase_str, string_extend, sub_str, cat_str, trim_str, c_chr, vector, vec_set_length, copy_vec, sub_vec, cat_vec, cobj_print_op, obj_init): Likewise. * match.c (do_match_line, hv_trampoline, match_files, dir_tables_init): Likewise. * parser.l (grammar): Likewise. * parser.y (parse): Likewise. * rand.c (make_state, make_random_state, random_fixnum, random): Likewise. * regex.c (CHAR_SET_L2_LO, CHAR_SET_L2_HI, CHAR_SET_L1_LO, CHAR_SET_L1_HI, CHAR_SET_L0_LO, CHAR_SET_L0_HI, L0_full, L0_fill_range, L1_full, L1_fill_range, L1_contains, L1_free, L2_full, L2_fill_range, L2_contains, L2_free, L3_fill_range, L3_contains, L3_free, char_set_create, char_set_cobj_destroy, nfa_state_accept, nfa_state_empty, nfa_state_single, nfa_state_wild, nfa_state_set,
* Purge stray occurrences of "void *" from code base.Kaz Kylheku2014-10-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * lib.c (cobj_print_op): In the format call, cast the C pointer to val, since the ~p conversion now takes a val rather than void *. (cptr_equal_op, obj_print, obj_pprint): Remove cast to void *, since obj now already has the type that ~p expects. * lib.h (struct any): Use mem_t * instead of void *. * parser.h (yyscan_t): Repeat the definition from inside the Flex-generated lex.yy.c. (yylex_init, yylex_destroy, yyget_extra, yyset_extra): Re-declare using yyscan_t typedef in place of void *. * parser.l (yyget_column, yyerrprepf): Re-declare using yyscan_t. (grammar): Use yyg in place of yyscanner in calls to yyerrprepf. * parser.y (yylex, %lex-param): Use yyscan_t instead of void *. (parse): Use yyscan_t for local variable. * signal.c (stack): Change type from void * to mem_t *. * stream.c (vformat): Conversion specifier p extracts val instead of void *. (run): Use casts that only remove const, not all the way to void *. * txr.1: Documented p conversion specifier of format. * Makefile (OBJS-y): Initialize with := to make sure it is a simple variable, and not a macro. (SRCS): New variable, listing source files. (enforce): New rule for enforcing coding conventions. Currently checks for void * occurrences.
* More type safety, with help from C++ compiler.Kaz Kylheku2014-10-141-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | * parser.h (scanner_t): New typedef. Cleverly, we use yyguts_t as the basis for this, which pays off in a few places, getting rid of conversions. (parser_t): scanner member redeclared to scanner_t *. (yyerror, yyerr, yyerrof, end_of_regex, end_of_char): Declarations updated. * parser.l (yyerror, yyerrorf, num_esc): scanner argument becomes scanner_t * instead of void *. (yyscanner): Occurrences replaced by yyg: why should we refer to the type-unsafe void *yyscanner, when we know that in the middle of the yylex function we have the properly typed yyguts_t *yyg. (end_of_regex, end_of_char): Argument changes to scanner_t * and name strategically changes to yyg, eliminating the local variable and cast. * parser.y (sym_helper): scanner argument becomes scanner_t * instead of void *. (%parse-param}: void *scnr becomes scanner_t *scnr. (define-transform, yybadtoken): Use scanner_t * instead of void *. (parse): Since yylex_init takes a void **, we use it to initialize a local void *, and then assign it to the structure member.
* C++ upkeep.Kaz Kylheku2014-10-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TXR's support for compiling as C++ pays off: C++ compiler finds serious bugs introduced in August 2 ("Big switch to reentrant lexing and parsing"). The yyerror function was being misused; some of the calls reversed the scanner and parser arguments. Since one of the two parameters is void *, this reversal wasn't caught. * parser.l (yyerror): Fix first two arguments being reversed. (num_esc): Change previously correct call to yyerror to follow reversed arguments, so that it stays correct. * parser.y (%parse-param): Change order of these directives so that the scnr parameter is before the parser parameter. This causes the yacc-generated calls to yyerror to have the arguments in the correct order. It also has the effect of changing the signature of yyparse, reversing its parameters. (parse): Update call to yyparse to new argument order. * parser.h (yyparse): Declaration removed. (yyerror): Declaration updated. * regex.c (regex_kind_t): New enum typedef. (struct regex): Use regex_kind_t rather than an enum inside the struct, which has different scoping rules under C++. * txr.c (get_self_path): Fix signed/unsigned warning.
* Eliminating the extra list wrapping applied to regularKaz Kylheku2014-10-031-10/+8
| | | | | | | | | | | | | | | | | | | expression objects in the syntax tree. The parser just puts out a #<regex ...> instead of (#<regex ...> regex-syntax). * eval.c (do_eval): We no longer need the hack of treating (#<regex> ...) as a special form which evaluates to #<regex>. (expand): We no longer have to skip over regex syntax, so the case is removed. * match.c (h_var, do_txeval, do_match_line): regexp cases are no longer subcases of consp but stand on their own. In do_match_line, we introduce a COBJ case into the type switch for regexes. * parser.y: regexes are now compiled in the regex and lisp_regex grammar rules instead of the dependent rules, and are not wrapped in extra syntax.
* Uprooting stupidities in handling of output variables.Kaz Kylheku2014-08-131-54/+10
| | | | | | | | | | | | | | | | | | | | * parser.y (o_elems_transform): Remove useless function which was only unwrapping the strange parse of output vars. (o_elems_opt, rep_elem, quasilit, wordsqlit): Eliminate o_elems_transform call. (o_var, q_var): Eliminate the phrase structure rules which match an extra o_elem or quasi_item, and which incorporate them into the var syntax tree element. Place the modifiers into the third position, not fourth. * eval.c (subst_vars): Eliminate handling of "pat" element. Actually that was not even there thanks to o_elems_transform being applied: dead code. Pull modifiers from the third element of the var form now, not fourth. * match.c (subst_vars): Similar changes as in the match.c subst_vars function. Here the pat variable is even more obviously useless; if it is not nil, it is just punted back to the spec.
* First cut at restructuring how variable matching works in the patternKaz Kylheku2014-08-111-18/+3
| | | | | | | | | | | | | | | | | language. The goal is to remove the strict behavior of using only one element of context after a variable. variable form at parse time: we unravel that first. * parser.y (grammar): Simplifying the phrase structure rule for the var element. All the variants that have a trailing elem are removed. The abstract syntax changes; the modifier moves to the third position in the list. * match.c (h_var): Matching change: the element which follows a variable is now pulled from the specline rather than the variable syntax, which is how it should have been done in the first place. The modifiers are pulled from a different spot in the variable syntax.
* * parser.l (yyerr): Function removed; it is not used in the lexer,Kaz Kylheku2014-08-071-34/+36
| | | | | | | | | | and converted to a macro in the parser. * parser.y (define_transform): take a parser argument rather than scanner. Set up scnr local variable for yyerr macro. Remove scnr argument from macro calls. (yyerr): New macro. (grammar): Remove scnr argument from yyerr calls.
* Reentrant parser regression.Kaz Kylheku2014-08-071-64/+52
| | | | | | | | | | | * parser.y (yybadtok): New macro. (yybadtoken): Function must take parser argument. (grammar): Replace uses of yybadtoken with yybadtok. * parser.h (yybadtoken): Declaration updated. * parser.l (grammar): Fix incorrect yyprepf calls that are missing the yyscanner parameter.
* * parser.y: Back port from Berkeley Yacc to GNU Bison.Kaz Kylheku2014-08-051-0/+5
| | | | | | | | We need a prototype of yylex that is in scope of the grammar, but YYSTYPE is not defined there. * parser.l: Bison 3 declares yyparse in y.tab.h, so we have to reorder some #includes.
* Big switch to reentrant lexing and parsing.Kaz Kylheku2014-08-021-128/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * parser.l (YY_INPUT): Stop relying on removed yyin_stream; refer to stream via yyextra. (yyin_stream, lineno, errors, spec_file_str, prepared_error_message): Global variables removed. (yyget_column, yyset_column): Missing prototypes not generated by flex in bison bridge mode have to be added by us to avoid warning. (yyerror): Takes parser and scanner as parameters. Prepared error message is now in the parser context. Calls to other error handling functions receive scanner context. (yyerr): New function. (yyerrorf, yyerrprepf): Takes scanner argument, chases extra data to get to parser, and refers to parser variables instead of globals. (num_esc): Scanner argument added. (%option reentrant, %option bison-bridge, %option extra-type): New flex options. (grammar): yyscanner added everywhere. (end_of_char): Takes scanner argument. (parse_init): Removed references to yyin_stream and prepared_error_message. (parse_reset): Function renamed to open_txr_file. Returns results via pointers instead of setting global variables. (regex_parse, lisp_parse): Use reentrant parser interface. * parser.y (yyerror): Prototype removed. (yylex): Prototype moved after grammar, with new arguments. (sym_helper, define_transform): Take scanner argument. (make_expr): Takes parser argument. (rlrec): New static function. (rl): Function turned into macro. (mkexp, symhlpr): New macros. (%purse-parser, %parse-param, %lex-param): New Yacc options. (grammar): Actions re-worked for reentrance. Parser and scanner contexts are passed down to helper functions, in some cases via the three new macros. The result of the parse is stored in the syntax_tree member of the parser_t structure instead of a global. The yylex function receives the scanner instance. (get_spec): Function removed. (parse): New function. * parser.h (lineno, errors, yyin_stream, spec_file_str): Declarations removed. (parser_t): New struct. (yyerr): New function declared. (yyparse, yyerror, yyerrorf, end_of_regex, end_of_char, yylex, yylex_destroy): Declarations updated.
* * parser.l: Allow unquotes and splices in QSPECIAL and BRACED states.Kaz Kylheku2014-07-301-0/+2
| | | | | | | | | * parser.y (quasi_item): Support splices as items. * genvim.txr: Syntax highlighting support for unquotes in quasiliterals. * txr.vim: Updated.
* * Makefile, arith.c, arith.h, combi.c, combi.h, configure, debug.c,Kaz Kylheku2014-07-231-16/+16
| | | | | | | | debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Synchronize license header with LICENSE.
* * parser.y (METANUM): Forgotten %right declaration for thisKaz Kylheku2014-07-221-1/+1
| | | | token has been resulting in a shift-reduce conflict.
* * configure: Add a check, in the case that we cannot make anKaz Kylheku2014-07-221-0/+5
| | | | | | | | | | | | | | | | | executable, whether this is due to being required to use C99. For instance, the Solaris environment requires compilation using the C99 dialect if _XOPEN_SOURCE is set to 600 or higher. * debug.c: When compiling as C99, we have to obey the special C99 conventions for instantiating inline functions. * hash.c: Likewise. * lib.c: Likewise. * parser.y: Likewise. * unwind.c: Likewise.
* Bugfix: macros not being expanded in expansions embedded inKaz Kylheku2014-06-201-3/+23
| | | | | | | | | | | | | | quasilierals: i.e. the forms X and Y in `@{X}` and `@{X Y}`, where X and Y can be Lisp symbol macros or compound forms that is a macro call. * eval.c (expand_quasi): Handle the var forms in a quasi. * parser.y (n_exprs_opt, q_var): New grammar nonterminals. q_var is a clone of o_var, but with different construction behavior. It fixes the bug that o_var applies expand_meta to embedded Lisp forms, which is not appropriate for TXR Lisp quasiliterals. (quasi_item): Derive q_var rather than o_var.
* Optimization: add missing tail updates to some listKaz Kylheku2014-06-201-1/+1
| | | | | | | | | | | collecting loops. * lib.c (tuples_func, where, sel): Catch return value of list_collect and update tail variable. * match.c (do_txeval): Likewise. * parser.y (expand_meta): Likewise for list_collect_nconc.
* * lib.c (obj_print): Render character DC00 as "pnul".Kaz Kylheku2014-06-151-0/+1
| | | | | | | | | | | | | Clean up code which chooses rendering for characters. Print C0 and C1 control characters, as well as D800-DFFF, FFFE and FFFF and characters above FFFF using hex; others are printed using the #\<char> notation. * parser.y (char_from_name): map "pnul" to DC00. * txr.1: Documented pnul, clarified character printing rules, and added a cautionary note about possible ambiguity in printing.
* Change to how locations are passed around, for the sake of generationalKaz Kylheku2014-03-291-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GC. The issue being solved here is the accuracy of the gc_set function. The existing impelmentation is too conservative. It has no generation information about the memory location being stored, and so it assumes the worst: that it is a location in the middle of a gen 1 object. This is sub-optimal, creating unacceptable pressure against the checkobj array and, worse, as a consequence causing unreachable gen 0 objects to be tenured into gen 1. To solve this problem, we replace "val *" pointers with a structure of type "loc" which keeps track of the object too, which lets us discover the generation. I tried another approach: using just a pointer with a bitfield indicating the generation. This turned out to have a serious issue: such a bitfield goes stale when the object is moved to a different generation. The object holding the memory location is in gen 1, but the annotated pointer still indicates gen 0. The gc_set function then makes the wrong decision, and premature reclamation takes place. * combi.c (perm_init_common, comb_gen_fun_common, rcomb_gen_fun_common, rcomb_list_gen_fun): Update to new interfaces for managing mutation. * debug.c (debug): Update to new interfaces for managing mutation. Avoid loc variable name. * eval.c (env_fbind, env_fbind): Update to new interfaces for managing mutation. (lookup_var_l, dwim_loc): Return loc type and update to new interfaces. (apply_frob_args, op_modplace, op_dohash, transform_op, mapcarv, mappendv, repeat_infinite_func, repeat_times_func): Update to new interfaces for managing mutation. * eval.h (lookup_var_l): Declaration updated. * filter.c (trie_add, trie_compress, trie_compress_intrinsic, * build_filter, built_filter_from_list, filter_init): Update to new * interfaces. * gc.c (gc_set): Rewritten to use loc type which provides the exact generation. We do not need the in_malloc_range hack any more, since we have the backpointer to the object. (gc_push): Take loc rather than raw pointer. * gc.h (gc_set, gc_push): Declarations updated. * hash.c (struct hash): The acons* functions use loc instead of val * now. (hash_equal_op, copy_hash, gethash_c, inhash, gethash_n, pushhash, Change to how locations are passed around, for the sake of generational GC. The issue being solved here is the accuracy of the gc_set function. The existing impelmentation is too conservative. It has no generation information about the memory location being stored, and so it assumes the worst: that it is a location in the middle of a gen 1 object. This is sub-optimal, creating unacceptable pressure against the checkobj array and, worse, as a consequence causing unreachable gen 0 objects to be tenured into gen 1. To solve this problem, we replace "val *" pointers with a structure of type "loc" which keeps track of the object too, which lets us discover the generation. I tried another approach: using just a pointer with a bitfield indicating the generation. This turned out to have a serious issue: such a bitfield goes stale when the object is moved to a different generation. The object holding the memory location is in gen 1, but the annotated pointer still indicates gen 0. The gc_set function then makes the wrong decision, and premature reclamation takes place. * combi.c (perm_init_common, comb_gen_fun_common, rcomb_gen_fun_common, rcomb_list_gen_fun): Update to new interfaces for managing mutation. * debug.c (debug): Update to new interfaces for managing mutation. Avoid loc variable name. * eval.c (env_fbind, env_fbind): Update to new interfaces for managing mutation. (lookup_var_l, dwim_loc): Return loc type and update to new interfaces. (apply_frob_args, op_modplace, op_dohash, transform_op, mapcarv, mappendv, repeat_infinite_func, repeat_times_func): Update to new interfaces for managing mutation. * eval.h (lookup_var_l): Declaration updated. * filter.c (trie_add, trie_compress, trie_compress_intrinsic, * build_filter, built_filter_from_list, filter_init): Update to new * interfaces. * gc.c (gc_set): Rewritten to use loc type which provides the exact generation. We do not need the in_malloc_range hack any more, since we have the backpointer to the object. (gc_push): Take loc rather than raw pointer. * gc.h (gc_set, gc_push): Declarations updated. * hash.c (struct hash): The acons* functions use loc instead of val * now. (hash_equal_op, copy_hash, gethash_c, inhash, gethash_n, pushhash,
* More generational GC fixes. One GC fix.Kaz Kylheku2014-03-271-1/+1
| | | | | | | | | | | | | | | * combi.c (perm_init_common, comb_gen_fun_common, rcomb_gen_fun_common): Use set macro instead of plain assignment. * hash.c (hash_grow, copy_hash, hash_update_1): Use set macro instead of plain assignment. * lib.c (nreverse, lazy_appendv_func, lazy_appendv, vec_push, refset): Use set macro instead of plain assignment. (make_package): Assign all fields of the newly created PKG object before calling a function which can trigger GC. * parser.y (rlset): Use set macro.
* * eval.c (me_quasilist): New static function.Kaz Kylheku2014-03-251-5/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | (eval_init): Register me_quasilist as quasilist macro expander. * lib.c (quasilist_s): New global variable. (obj_init): quasilist_s initialized. * lib.h (quasilist_s): Declared. * match.c (do_txreval): Handle quasilist syntax. * parser.l (QWLIT): New exclusive state. Extend lexical grammar to transition to QWLIT state upon the #` or #*` sequence which kicks off a word literal, and in that state, piecewise lexically analyze the QLL, mostly by borrowing rules from quasiliterals. * parser.y (QWORDS, QWSPLICE): New tokens. (n_exprs): Integrate splicing form of QLL syntax. (n_expr): Integrate non-splicing form of QLL syntax. (litchars): Propagate line number info. (quasilit): Fix "string literal" wording in error message. * txr.1: Introduced WLL abbreviation for word list literals, cleaned up the text a little, and documented QLL's.
* * parser.y (yybadtoken): Add missing cases for new tokenKaz Kylheku2014-03-251-0/+2
| | | | types WORDS and WSPLICE.
* Introducing word list literals.Kaz Kylheku2014-03-251-3/+17
| | | | | | | | | | | | | | | | * parser.l (WLIT): New exclusive start state. Extend lexical grammar to transition to WLIT state upon the #" or #*" sequence which kicks off a word literal, and in that state, piecewise lexically analyze the literal, mostly by borrowing rules from other literals. * parser.y (WORDS, WSPLICE): New tokens. (n_exprs): Integrate splicing form of word list literal syntax. (n_expr): Integrate non-splicit for of word list literal syntax. (litchars): Propagate line number info. (wordslit): New grammar rule. * txr.1: Updated.