summaryrefslogtreecommitdiffstats
path: root/parser.l
Commit message (Collapse)AuthorAgeFilesLines
* Synchronize license comments with LICENSE.Kaz Kylheku2016-10-011-16/+17
| | | | | | | | | | | | | | | | | | | | * Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Revert to verbatim 2-Clause BSD.
* Allow whitespace between @ and ; in comments.Kaz Kylheku2016-05-231-2/+2
| | | | | | | * parser.l (grammar): Recognize {WS}* between @ and ; (or the legacy #) in comments. * txr.1: Documentation updated.
* Handle non-UTF-8 byte in regex scanned from string.Kaz Kylheku2016-04-211-0/+6
| | | | | | | | The current behavior is that there is no lex rule for this, so such a byte gets echoed. parser.l (grammar): Add fallback rule to match one byte in SREGEX state and turn it into 0xDCxx character.
* Better job of diagnosing out-of-range char escapes.Kaz Kylheku2016-04-211-2/+9
| | | | | | * parser.l (num_esc): Check for converted value being out of the range of wchar_t or beyond 0x10FFFF, whichever is less.
* Bugfix: allow newline in regex parsing from string.Kaz Kylheku2016-04-181-1/+7
| | | | | | | * parser.l (grammar): The newline character is incorrectly handled by the same rule under the SREGEX and REGEX states. In the SREGEX state, just return it as a REGCHAR, not forgetting to increment the line number.
* Trailing whitespace.Kaz Kylheku2016-04-181-1/+1
| | | | * parser.l: Remove trailing whitespace.
* Revamp bad character messages in lexer.Kaz Kylheku2016-04-011-4/+15
| | | | | | | | * parser.l (grammar): Drop colon from unrecognized escape message. "bad character in directive" handles various cases to avoid printing junk to the terminal. Basic message harmonizes with the one in the yybadtoken function in the parser. Non-UTF-8 byte printed as TXR hex integer literal.
* gc bug: prepared_msg field of struct parser.Kaz Kylheku2016-03-071-1/+2
| | | | | | | * parser.l (yyerrprepf): Replace wrong bare assignment to parser->prepared_msg with proper set macro which handles the mutation of a mature generation object such that it points to a baby object.
* New :mandatory keyword in until/last clauses.Kaz Kylheku2016-01-151-5/+4
| | | | | | | | | | | | | | | | | | | * match.c (mandatory_k): New keyword variable. (h_coll, v_gather, v_collect): Implement :mandatory logic. (syms_init): Initialize mandatory_k. * parser.l (grammar): The UNTIL and LAST tokens must be matched similarly to collect, without consuming the closing parenthesis, allowing a list of items to be parsed between the symbol and the closure, in the NESTED state. * parser.y (gather_clause, collect_clause, elem, repeat_parts_opt, rep_parts_opt): Adjust to new until/last syntax. In the matching productions, the abstract syntax changes to incorporate the options. In the output productions, we throw an error if options are present. * txr.1: Documented :mandatory for collect, coll and gather.
* Copyright year bump.Kaz Kylheku2015-12-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | * LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright. * linenoise/LICENSE, linenoise/linenoise.c, linenoise/linenoise.h: Bump one principal author's copyright from 2014 to 2015. The code is based on a snapshot of 2015 upstream work.
* Implementing *print-base* and ~d format directive.Kaz Kylheku2015-11-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * debug.c (show_bindings): Use ~d for level, so as not to be influenced by *print-base*. (debug): Use ~d for line numbers. * lib.c (gensym): Use ~d conversion specifier for formatting gensym counter into symbol name. * match.c (LOG_MISMATCH, LOG_MATCH): Use ~d for line number references. (h_skip, h_coll, h_fun, h_chr, match_line_completely, v_skip, v_fuzz, v_gather, v_collect, v_output, v_filter, v_fun, v_assert, v_load, v_line, h_assert, open_data_source): Use ~d for line refs, number of iterations, errno values. * parser.c (repl): Use ~d for prompt line numbers, numbered variables and the expr-<n> string in error messages. * parser.l (yyerrorf, source_loc_str): Use ~d for line numbers. * stream.c (print_base_s): New symbol variable. (formatv): Implement *print-base*. (stdio_maybe_read_error, stdio_maybe_error, stdio_close, pipe_close, open_directory, open_file, open_fileno, open_tail, open_process, run, remove_path): Use ~d for errno values. (stream_init): Initialize print_base_s and register *print-base* special variable. sysif.c (mkdir_wrap, ensure_dir, getcwd_wrap, mknod_wrap, chmod_wrap, symlink_wrap, link_wrap, readlink_wrap, excec_wrap, stat_impl, pipe_wrap, poll_wrap, getgroups_wrap, setuid_wrap, seteuid_wrap, setgid_wrap): Use ~d for errno values and system function results. * txr.1: Documented *print-base* and ~d conversion specifier.
* New iread function.Kaz Kylheku2015-11-071-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The read function no longer works like it used to on an interactive terminal because of the support for .. and . syntax on a top-level expression. The iread function is provided which uses a modified syntax that doesn't support these operators on a top-level expression. The parser thus doesn't look one token ahead, and so iread can return immediately. * eval.c (eval_init): Register iread intrinsic function. * parser.c (prime_parser): Only push back the recently seen token when priming for a regular Lisp read. Handle the prime_interactive method by preparing a SECRET_ESCAPE_I token. (lisp_parse_impl): New static function, formed from previous lisp_parse. Takes a boolean argument indicating interactive mode. (prime_parser_post): New function. (lisp_parse): Now a wrapper for lisp_parse_impl which passes a nil to indicate noninteractive read. (iread): New function. * parser.h (enum prime_parser): New member, prime_interactive. (scrub_scanner, iread, prime_parser_post): Declared. * parser.l (prime_scanner): Handle the prime_interactive case the same way as prime_lisp. (scrub_scanner): New function. * parser.y (SECRET_ESCAPE_I): New token type. (i_expr): New nonterminal symbol. Like n_expr, but doesn't support dot or dotdot operators, except in nested subexpressions. (spec): Handle SECRET_ESCAPE_I by way of i_expr. (sym_helper): Before freeing the token lexeme, call scrub_scanner. If the token is registered as the scanner's most recently seen token, the scanner must forget that registration, because it is no longer valid. (parse): Call prime_parser_post. * txr.1: Documented iread.
* New range type, distinct from cons cell.Kaz Kylheku2015-11-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * eval.c (eval_init): Register intrinsic functions rcons, rangep from and to. (eval_init): Register rangep intrinsic. * gc.c (mark_obj): Traverse RNG objects. (finalize): Handle RNG in switch. * hash.c (equal_hash, eql_hash): Hashing for for RNG objects. * lib.c (range_s, rcons_s): New symbol variables. (code2type): Handle RNG type. (eql, equal): Equality for ranges. (less_tab_init): Table extended to cover RNG. (less): Semantics defined for ranges. (rcons, rangep, from, to): New functions. (obj_init): range_s and rcons_s variables initialized. (obj_print_impl): Produce #R notation for ranges. (generic_funcall, dwim_set): Recognize range objects for indexing * lib.h (enum type): New enum member, RNG. MAXTYPE redefined to RNG value. (TYPE_SHIFT): Increased to 5 since there are now 16 type codes. (struct range): New struct type. (union obj): New member rn, of type struct range. (range_s, rcons_s, rcons, rangep, from, to): Declared. (range_bind): New macro. * parser.l (grammar): New rule for recognizing the #R sequence as HASH_R token. * parser.y (HASH_R): New terminal symbol. (range): New nonterminal symbol. (n_expr): Derives the new range symbol. The n_expr DOTDOT n_expr rule produces rcons expression rather than const. * match.c (format_field): Recognize rcons syntax in fields which is now what ranges translate to. Also recognize range object. * tests/013/maze.tl (neigh): Fix code which destructures range as a cons. That can't be done any more. * txr.1: Document ranges.
* Better diagnostic for cramped floating literals.Kaz Kylheku2015-10-071-2/+7
| | | | | | * parser.l: Different text needed for ).1 and a.1 cases, because the insertion of a zero cannot fix the latter. Might as well make the messages more detailed.
* syntax: be tolerant of carriage returns.Kaz Kylheku2015-09-161-15/+16
| | | | | | | | | | | | | | This is needed for multi-line mode with CR line breaks. It also makes TXR tolerant when code is ported among systems with different line endings. * parser.l (NL): New lex named pattern, matching three possible line terminators: CR, NL or CR-NL. (grammar): In places where \n was previously matched, use {NL}. In a few places where \n is in a character class, add \r. In one place (comment matching), the the pattern . which implicitly doesn't match newlines had to be replaced with [^\r\n].
* Parse errors lose program prefix and parens.Kaz Kylheku2015-09-061-2/+7
| | | | | | | | | | | * parser.l (yyerrorf): Don't print the program prefix and parenthes, except if compatibility to 114 or older is requested. The main motivation for this is the repl, where the program prefix is not informative. The new format is also a de facto standard which is compatible with other parsers. Vim understands it directly. * txr.1: Documented.
* One-liner to allow @{obj.slot} in quasiliterals.Kaz Kylheku2015-09-021-1/+1
| | | | | | | | * parser.l (grammar): Recognize '.' token in BRACED state also. * genvim.txr: @{obj.slot ...} syntax highlighting support. Include txr_dot and txr_dotdot in txr_bracevar region.
* Introducing structs.Kaz Kylheku2015-09-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * args.c (args_cat_zap): New function. * args.h: (args_cat_zap): Declared. * eval.c (struct_lit_s): New symbol variable. (eval_init): Initialize struct_lit_s. * eval.h (struct_lit_s): Declared. * gc.c (finalize): If a symbol has a struct slot hash attached to it, we must free it when the symbol is reclaimed. * lib.c (make_sym): Initialize symbol's slot_cache pointer to null. (copy): Copy structure objects. (init): Call struct_init to initialize struct module. * lib.h (SLOT_CACHE_SIZE): New preprocessor symbol (slot_cache_line_t, slot_cache_t): New typedefs. (struct sym): New member, slot_cache. * lisplib.c (struct_set_entries, struct_instantiate): New static functions. (liplib_init): Register new functions in dl_table. parser.y (HASH_S): New terminal symbol. (struct): New grammar rule. (n_expr): Derive struct. (yybadtoken): Map HASH_S to #S string. parser.l (grammar): Recognize #S and return HASH_S token. share/txr/stdlib/place.tl (slot): New defplace. share/txr/stdlib/struct.tl: New file. struct.c: New file. struct.h: New file. * Makefile (OBJS): Adding struct.o.
* Allow slashes in regex passed to regex-parse.Kaz Kylheku2015-08-151-16/+15
| | | | | | | | | | | * parser.l (SREGEX): New start state, for stand-alone regex parsing. (grammar): All REGEX state rules are active in the SREGEX state also. The rule for the / character returns a REGCHAR if in the SREGEX state, so it is treated as an ordinary character. * txr.1: Updated regex-parse documentation about the treatment of the slash. Also added notes about double escaping when a string literal is passed to regex-parse.
* Floating-point constant tightening.Kaz Kylheku2015-08-121-8/+8
| | | | | | | | | | | | | * parser.l (grammar): Change order of rule which recognizes FLODOT with a one-character trailing context other than a dot, and the rule which diagnoses trailing junk. The issue is that this order gives the wrong interpretation to 123.E, treating it as 123. followed by E rather than trailing junk, like in the case of 123.0E or 123.B. * txr.1: Adding the valid example 1.E5. Removing references to dot as consing dot. Fixed documentation which says that 1.E is 1 followed by a consing dot and E. The wrong behavior in fact produced 1.0 followed by E. No consing dot semantics.
* Use new pushback token priming for single regex parse.Kaz Kylheku2015-08-121-7/+10
| | | | | | | | | | | | | | | | | | | | | | | * parser.h (enum prime_parser): New enum. (prime_parser, prime_scanner, parse): Declarations updated with new argument. * parser.c (prime_parser): New argument of enum prime_parser type Select appropriate secret token for regex and Lisp case. Pass prime selector down to prime_scanner. (regex_parse): Do not prepend secret escape to string. Do not use parse_once function; instead do the parser init and cleanup here and use the parse function. (lisp_parse): Pass new argument to parse, configuring the parser to be primed for Lisp parsing. * parser.l (grammar): Rule producing SECRET_ESCAPE_R removed. (prime_scanner): New argument. Pop the scanner state down to INITIAL. Then unconditionally switch to appopriate state based on priming configuration. * parser.y (parse): New argument for priming selection, passed down to prime parser.
* Crafting a better parser-priming hack.Kaz Kylheku2015-08-121-19/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The method of inserting a character sequence which generates a SECRET_TOKEN_E token is being replaced with a purely token based method. Because we don't manipulate the input stream, the lexer is not involved. We don't have to flush its state and deal with the carry-over of the yy_hold_char. This comes about because recent changes expose a weakness in the old scheme. Now that a top-level expression can have the form expr.expr, it means that the Yacc parser reads one token ahead, to see whether there is a dot or something else. This lookahead token is discarded. We must re-create it when we call yyparse again. This re-creation is done by creating a custom yylex function, which can maintain pushback tokens. We can prime this array of pushback tokens to generate the SECRET_TOKEN_E, as well as to re-inject the lookahead symbol that was thrown away by the previous yyparse. To know which lookahead symbol to re-inject is simple: the scanner just keeps a copy of the most recent token that it returns to the parser. When the parser returns, that token must be the lookahead one. The tokens we keep now in the parser structure are subject to garbage collection, and so we must mark them. Since the YYSTYPE union has no type field, a new API is opened up into the garbage collector to help implement a conservative GC technique. * gc.c (gc_is_heap_obj): New function. * gc.h (gc_is_heap_obj): Declared. * match.c: Include y.tab.h. This is now needed by any module that needs to instantiate a parser_t structure, because members of type YYSTYPE occur in the structure. (parser.h can still be included without y.tab.h, but only an incomplete declaration for the parser strucure is then given, and a few functions are not declared.) * parser.c (yy_tok_mark): New static function. (parser_mark): Mark the recent token and the pushback tokens. (parser_common_init): Initialize the recent token, the pushback tokens, and the pushback stack index. (pushback_token): New static function. (prime_parser): hold_byte argument removed. Body considerably simplified. The catenated stream trick is no longer required. All we do here is set up two pushback tokens and prime the scanner, if necessary, so it is in the right start state for Lisp. * parser.l (YY_DECL): Take over definition of scanning function, renaming to yylex_impl, so we can implement yylex. (grammar): Rule which produces SECRET_ESCAPE_E token removed. (reset_scanner): Function removed. (yylex): New function. * parser.h (struct parser): Now only forward-declared unless y.tab.h has been included. New members, recent_tok, tok_pushback and tok_idx. (yyset_hold_char): Declared. (reset_scanner): Declaration removed. (yylex): Declared (if y.tab.h included). (prime_parser): Declaration updated. (prime_scanner): Declared. * Makefile: express new dependency on existence of y.tab.h of txr.o, match.o and parser.o.
* Diagnose ambiguous floats like (a b).4 and x.y.5Kaz Kylheku2015-08-101-0/+30
| | | | | | | | These look like integers involved in qref dot syntax. * parser.l (DOTFLO): New pattern definition. (grammar): New rules for detecting cramped floating literals.
* Dot with no whitespace generates qref syntax.Kaz Kylheku2015-08-101-0/+11
| | | | | | | | | | | | | | | | | | | | | | | a.b.(expr ...).c -> (qref a b (expr ...) c) Consing dot requires whitespace. * eval.c (qref_s): New symbol global variable. (eval_init): Initialize qref_s. * eval.h (qref_s): Declared. * parser.l (REQWS): New pattern definition, required whitespace. (grammar): New rules to scan CONSDOT (space required on both sides) and LAMBDOT (space required after). * parser.y (CONSDOT, LAMBDOT): New token types. (list): (. n_expr) rule replaced with LAMBDOT and CONSDOT. (r_exprs): r_exprs . n_expr consing dot rule replaced with CONSDOT. (n_expr): New n_expr . n_expr rule introduced here for producing qref expressions. (yybadtoken): Handle CONSDOT and LAMBDOT. * txr.1: Documented qref dot.
* Handle abc: token syntax.Kaz Kylheku2015-08-101-2/+2
| | | | | | * parser.l (BTREG, NTREG): Allow an empty string symbol name with a nonempty package name. Without this, abc: parses as abc :.
* * eval.c (force): Default the new second argument of source_loc_str.Kaz Kylheku2015-08-041-4/+4
| | | | | | | | | | | | | | | | | | | (eval_error): Derive location of error from the last_form_evaled, if form doesn't have it. (eval_init): Re-register source-loc-str as binary with an optional arg. * match.c (debuglf, sem_error, file_err, typed_error): Default new argument of source_loc_str. * parser.h (source_loc_str): Declaration updated. * parser.l (source_loc_str): Take second argument which specifies alternative value if the source loc info is not found. * unwind.c (uw_throw): Simplify code thanks to source_loc_str default argument. * txr.1: Document new argument of source-loc-str.
* * parser.l (grammar): Do not allow unescaped newline inKaz Kylheku2015-07-231-1/+15
| | | | | | | | | | | word list literals and word list quasiliterals, except in <= 109 compatibility mode. An escaped newline in these literals, together with surrounding whitespace, now produces a single space, except in <= 109 compatibility mode. * txr.1: Documented new rules for WLL's and QLL's, and added compatibility notes.
* Bugfix: lexer loses unmatched "hold char" between top-level forms.Kaz Kylheku2015-07-101-1/+5
| | | | | | | | | | | | | | | | | | Test case: file containing 4(prinl 3). Scanner consumes 4 and (. The ( is lost when the scanner is reset for the next call to yyparse, resulting in jut prinl being read and interpreted as a variable. * parser.c (prime_parser): If present, append hold byte to priming string. Takes parser_t * instead of parser, and returns void now. * parser.l (reset_scanner): Now returns int value, the value of the scanner's yy_hold_char variable which is nonzero when the scanner is hanging on to an unmatched byte of input. * parser.h (reset_scanner, prime_parser): Declarations updated. * parser.y (parse): Pass hold byte returned by reset_scanner to prime_parser.
* Parser cleanup: embed scanner in parser.Kaz Kylheku2015-07-091-0/+8
| | | | | | | | | | | | | | | | | | | | | | | * parser.c (parser_destroy): New GC finalizer static function. (parser_ops): Register parser_destroy. (parser_common_init): New function, shared by parse and parse_once. Initializes embedded scanner. (parser_cleanup): New function, shared by parse_once and parser_destroy. (parser): Use parser_common_init. * parser.h (parser_t): New member, yyscan. (reset_scanner, parser_common_init): Declared. * parser.l (reset_scanner): New function. * parser.y (parse_once): Use parser_common_init, and thus perform only a few initializations. Do not define scanner as a local variable. (parse): Call reset_scanner instead of yylex_init since the scanner is being reused, and for the same reason do not call yylex_destroy. GC will do that now.
* Bugfix: allow @1 in brace variables.Kaz Kylheku2015-06-071-1/+1
| | | | | | | * parser.l (grammar): Scan a METANUM token in the BRACED state also. This allows us to correctly reference op arguments in a quasiliteral, as in `foo @{@1 [1..2] ","} bar`.
* Support trailing semicolon after hex/octal characters.Kaz Kylheku2015-07-021-2/+9
| | | | | | | | | | | | | | | | * parser.l (%option): Remove nounput option since we need yyunput. (grammar): Rule for matching hex and octal escape in SPECIAL state recognizes optional semicolon. In 109 compatibility, this is pushed back into the stream, otherwise consumed. * txr.1: Updated documentation, including compat notes. * genvim.txr (txr_char): Include optional semicolon in match. Corrected some errors where 8 and 9 were being included as matches for octal digits. (txr_error): Default match for \x or \o not followed by digits.
* Third round of quasiliteral-related fixes.Kaz Kylheku2015-06-261-0/+6
| | | | | | * parser.l (char_esc): Recognize \@ escape. (grammar): Add a rule for a \@ escape in quasiliterals, and quasi word list literals.
* Second round of quasiliteral-related fixes.Kaz Kylheku2015-06-261-1/+6
| | | | | | | | | * parser.l: Only shift to QSPECIAL state when @ is followed by a trailing context consisting of certain characters. Not every kind of Lisp object syntax can be introduced with @ in a quasiliteral. Adding a rule to produce an error when @ appears that is not followed by an allowed character.
* First round of quasiliteral-related fixes.Kaz Kylheku2015-06-261-7/+8
| | | | | | | | | | | | | | | | | | | | | * parser.l: Do not try to recognize floating-point literals in QSPECIAL state; that is not possible because @134.3 in a quasiliteral parses as a METANUM followed by ".3". On the other hand, recognize METANUM literals in QSPECIAL state, so that @@123 scans. Recognize @ as a token in QSPECIAL state, so @@abc will scan. When transitioning from QSILIT and QWLIT states to QSPECIAL upon scanning @, return a @ token, which is now parsed in the grammar. * parser.y (quasi_meta_helper): New static function. (q_var): Do not handle SYMTOK any more, only the braced variable syntax. SYMTOK is handled as a n_expr. Braced vars are handled with explicit '@' token, which is now produced by the scanner when it shifts from QSILIT to QSPECIAL. (quasi_item): No longer necessary to recognize various forms here such as quotes and splices. Just recognize a n_expr, preceded by '@'.
* Do not allow unrecognized escapes in regex.Kaz Kylheku2015-04-191-11/+14
| | | | | | | | | | | | * parser.l (REGOP): New regex alias for matching all regex special characters. (grammar): Several rules for regex special characters merged together. New rule introduced to match a special character after a backslash, making it literal. The old rule which makes literal any character after a backslash now throws an error, unless version 105 comaptibility is selected. * txr.1: Documented this behavior change.
* Allow quasiquotes in braces and quasiliterals, and quotes in braces.Kaz Kylheku2015-04-151-6/+1
| | | | | | | | | * parser.l: Consolidate rules for recognizing quote, unquote, and quasiquote. An effect of this is that quasiquotes can now occur in braces and in string quasiliterals. * parser.y (quasi_item): Support quotes and quasiquotes as quasi items: that is to say, i.e. objects denoted by @ in a quasiliteral.
* Diagnose trailing junk in numeric literals.Kaz Kylheku2015-04-141-14/+19
| | | | | | | * parser.l: Combining the handling of hex, octal and binary numeric literals into a single rule. Implementing an additional rule which diagnoses such tokens that have trailing junk. Thus, something like #x1F2AZ is now invalid syntax.
* * parser.c (open_txr_file, regex_parse, lisp_parse): FunctionsKaz Kylheku2014-12-211-77/+0
| | | | | | | moved here from parser.l. * parser.l (open_txr_file, regex_parse, lisp_parse): Functions moved from here to parser.c.
* * Makefile (OBJS): Add parser.o.Kaz Kylheku2014-12-211-1/+1
| | | | | | | | | | * parser.h (parser_s): Declared. (parse_init): Declaration removed. (parser_l_init): Declared. * parser.l (parse_init): Function renamed to parser_l_init. * parser.c: New file.
* Improved error reporting, particularly for macro expansion.Kaz Kylheku2015-02-211-4/+2
| | | | | | | | | | | | | | | | | | | * eval.c (last_form_expanded): New variable. (do_expand): New static function; contains previous expand function. (expand): Becomes a wrapper for do_expand, with re-entry counting. (eval_init): GC-protect last_form_expanded. * eval.h (last_form_expanded): Declared. * parser.l (regex_parse, lisp_parse): Just use a simple word for the name of the regex or string parse location, not the entire expression itself. * unwind.c (uw_throw): Check whether expansion was going on when the unhandled exception was thrown and print additional information.
* * hash.c (hash_begin): Use coerce macro instead of raw C cast.Kaz Kylheku2014-10-251-1/+1
| | | | | | Use cobj_handle so hash argument is validated. * parser.l (YY_INPUT): Use convert macro instead of raw C cast.
* GNU Flex's libfl library provides nothing. Let us not refer to it. ItKaz Kylheku2014-10-241-0/+9
| | | | | | | | | | | | only causes build issues on some systems where it is not provided in the standard location, or is not cross-compiled properly. * Makefile (LEXLIB): Reference to variable removed. * configure (lexlib): Variable removed. (LEXLIB): config.make variable not generated. * parser.l (yywrap): Provide this trivial function as inline.
* * Makefile: Removing trailing spaces.Kaz Kylheku2014-10-241-2/+2
| | | | | | | | | | (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks. * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c, * lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c, * signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.
* * parser.l (lisp_parse): Bugfix: the error_stream argumentKaz Kylheku2014-10-191-3/+12
| | | | | | | | | must be checked to be a stream before we plant it in place of std_error, otherwise we will get a type exception thrown lower down, which leads to runaway recursion as TXR tries to print the error messages on std_error. * dep.mk: Regenerated.
* Converting cast expressions to macros that are retargettedKaz Kylheku2014-10-171-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to C++ style casts when compiling as C++. * lib.h (strip_qual, convert, coerce): New casting macros. (TAG_MASK, tag, type, wli_noex, auto_str, static_str, litptr, num_fast, chr, lit_noex, nil, nao): Use cast macros. * arith.c (mul, isqrt_fixnum, bit): Use cast macros. * configure (INT_PTR_MAX): Define using cast macro. * debug.c (debug_init): Use cast macro. * eval.c (do_eval, expand_macro, reg_op, reg_mac, eval_init): Use cast macros. * filter.c (filter_init): Use cast macro. * gc.c (more, mark_obj, in_heap, mark, sweep_one, unmark): Use cast macros. * hash.c (hash_double, equal_hash, eql_hash, hash_equal_op, hash_hash_op, hash_print_op, hash_mark, make_hash, make_similar_hash, copy_hash, gethash_c, gethash, gethash_f, gethash_n, remhash, hash_count, get_hash_userdata, set_hash_userdata, hash_iter_destroy, hash_iter_mark, hash_begin, hash_uni, hash_diff, hash_isec): Use cast macros. * lib.c (code2type, chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc, chk_strdup, num, c_num, string, mkstring, mkustring, upcase_str, downcase_str, string_extend, sub_str, cat_str, trim_str, c_chr, vector, vec_set_length, copy_vec, sub_vec, cat_vec, cobj_print_op, obj_init): Likewise. * match.c (do_match_line, hv_trampoline, match_files, dir_tables_init): Likewise. * parser.l (grammar): Likewise. * parser.y (parse): Likewise. * rand.c (make_state, make_random_state, random_fixnum, random): Likewise. * regex.c (CHAR_SET_L2_LO, CHAR_SET_L2_HI, CHAR_SET_L1_LO, CHAR_SET_L1_HI, CHAR_SET_L0_LO, CHAR_SET_L0_HI, L0_full, L0_fill_range, L1_full, L1_fill_range, L1_contains, L1_free, L2_full, L2_fill_range, L2_contains, L2_free, L3_fill_range, L3_contains, L3_free, char_set_create, char_set_cobj_destroy, nfa_state_accept, nfa_state_empty, nfa_state_single, nfa_state_wild, nfa_state_set,
* Purge stray occurrences of "void *" from code base.Kaz Kylheku2014-10-171-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * lib.c (cobj_print_op): In the format call, cast the C pointer to val, since the ~p conversion now takes a val rather than void *. (cptr_equal_op, obj_print, obj_pprint): Remove cast to void *, since obj now already has the type that ~p expects. * lib.h (struct any): Use mem_t * instead of void *. * parser.h (yyscan_t): Repeat the definition from inside the Flex-generated lex.yy.c. (yylex_init, yylex_destroy, yyget_extra, yyset_extra): Re-declare using yyscan_t typedef in place of void *. * parser.l (yyget_column, yyerrprepf): Re-declare using yyscan_t. (grammar): Use yyg in place of yyscanner in calls to yyerrprepf. * parser.y (yylex, %lex-param): Use yyscan_t instead of void *. (parse): Use yyscan_t for local variable. * signal.c (stack): Change type from void * to mem_t *. * stream.c (vformat): Conversion specifier p extracts val instead of void *. (run): Use casts that only remove const, not all the way to void *. * txr.1: Documented p conversion specifier of format. * Makefile (OBJS-y): Initialize with := to make sure it is a simple variable, and not a macro. (SRCS): New variable, listing source files. (enforce): New rule for enforcing coding conventions. Currently checks for void * occurrences.
* More type safety, with help from C++ compiler.Kaz Kylheku2014-10-141-22/+18
| | | | | | | | | | | | | | | | | | | | | | | | | * parser.h (scanner_t): New typedef. Cleverly, we use yyguts_t as the basis for this, which pays off in a few places, getting rid of conversions. (parser_t): scanner member redeclared to scanner_t *. (yyerror, yyerr, yyerrof, end_of_regex, end_of_char): Declarations updated. * parser.l (yyerror, yyerrorf, num_esc): scanner argument becomes scanner_t * instead of void *. (yyscanner): Occurrences replaced by yyg: why should we refer to the type-unsafe void *yyscanner, when we know that in the middle of the yylex function we have the properly typed yyguts_t *yyg. (end_of_regex, end_of_char): Argument changes to scanner_t * and name strategically changes to yyg, eliminating the local variable and cast. * parser.y (sym_helper): scanner argument becomes scanner_t * instead of void *. (%parse-param}: void *scnr becomes scanner_t *scnr. (define-transform, yybadtoken): Use scanner_t * instead of void *. (parse): Since yylex_init takes a void **, we use it to initialize a local void *, and then assign it to the structure member.
* C++ upkeep.Kaz Kylheku2014-10-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TXR's support for compiling as C++ pays off: C++ compiler finds serious bugs introduced in August 2 ("Big switch to reentrant lexing and parsing"). The yyerror function was being misused; some of the calls reversed the scanner and parser arguments. Since one of the two parameters is void *, this reversal wasn't caught. * parser.l (yyerror): Fix first two arguments being reversed. (num_esc): Change previously correct call to yyerror to follow reversed arguments, so that it stays correct. * parser.y (%parse-param): Change order of these directives so that the scnr parameter is before the parser parameter. This causes the yacc-generated calls to yyerror to have the arguments in the correct order. It also has the effect of changing the signature of yyparse, reversing its parameters. (parse): Update call to yyparse to new argument order. * parser.h (yyparse): Declaration removed. (yyerror): Declaration updated. * regex.c (regex_kind_t): New enum typedef. (struct regex): Use regex_kind_t rather than an enum inside the struct, which has different scoping rules under C++. * txr.c (get_self_path): Fix signed/unsigned warning.
* * eval.c (eval_init): Update registration of lisp-parse and readKaz Kylheku2014-09-021-2/+8
| | | | | | | | | | | | | | | | | | | | to account for new parameter. * lib.c (syntax_error_s): New symbol_variable. (obj_init): New symbol variable initialized. * lib.h (syntax_error_s): Declared. * parser.h (lisp_parse): Declaration updated. * parser.l (lisp_parse): Takes third parameter. * txr.1: Third parameter of read described. * txr.c (txr_main): Pass colon_k to third parameter of lisp_parse to obtain exception throwing behavior. * unwind.c (uw_init): Register syntax-error as subtype of error.
* * parser.l (yyerr): Function removed; it is not used in the lexer,Kaz Kylheku2014-08-071-5/+0
| | | | | | | | | | and converted to a macro in the parser. * parser.y (define_transform): take a parser argument rather than scanner. Set up scnr local variable for yyerr macro. Remove scnr argument from macro calls. (yyerr): New macro. (grammar): Remove scnr argument from yyerr calls.