txr - TXR: A data munging language.

	Commit message (Collapse)	Author	Age	Files	Lines
*	Better identify functions that misuse COBJ-s and hashes.	Kaz Kylheku	2018-11-07	1	-1/+1
	In this patch, the cobj_handle, cobj_ops and variants of gethash get an additional argument to identify the caller. Many functions are updated to pass this down. * buf.c (buf_strm): Pass self name to cobj_handle. * eval.c (env_fbind, env_vbind, rt_defvarl, me_case): Pass self name to gethash_c or gethash_e. (load): Pass self name to read_eval_stream and read_compiled_file. (reg_symacro): Pass situation-identifying string to gethash_c. * ffi.c (ffi_type_struct_checked, ffi_closure_struct_checked, ffi_call_desc_checked, uni_struct_checked): Take self name parameter, and pass down to cobj_handle. (ffi_get_type, ffi_get_lisp_type): Take self name and pass down to ffi_type_struct_checked. (union_get_ptr): Take self name and pass to uni_struct_checked. (ffi_union_in, ffi_union_put): Pass self name to union_get_ptr. (ffi_type_compile): Pass self name to ffi_get_lisp_type. (ffi_make_call_desc): Pass self name to ffi_type_struct_checked, ffi_get_type and ffi_call_desc_checked. (ffi_make_closure): Pass self name to ffi_call_desc_checked. (ffi_closure_get_fptr): Take self name, pass to ffi_closure_struct_checked. (ffi_typedef, ffi_size, ffi_alignof, ffi_offsetof, ffi_arraysize, ffi_elemsize, ffi_elemtype, ffi_put_into, ffi_put, ffi_in, ffi_get, ffi_out, make_carray): Pass self name to ffi_closure_struct_checked. (carray_struct_checked): Take self name, pass to cobj_handle.
(carray_set_length, carray_dup, carray_own, carray_free, carray_type, length_carray, copy_carray, carray_ptr, buf_carray, vec_carray, list_carray, carray_ref, carray_refset, carray_sub, carray_replace, carray_get_common, carray_put_common, unum_carray, num_carray, put_carray, fill_carray): Pass self name to carray_struct_checked. (carray_blank, carray_buf, carray_cptr): Pass self name to ffi_type_struct_checked. (carray_pun): Pass self name to carray_struct_checked and ffi_type_struct_checked. (make_union): Pass self name to ffi_type_struct_checked. (union_members, union_get, union_put, union_in, union_out): Pass self name to uni_struct_checked. (make_zstruct, zero_fill, put_obj, get_obj, fill_obj): Pass self-name to ffi_type_struct_checked. * ffi.h (ffi_closure_get_fptr, union_get_ptr): Declarations updated. * filter.c (trie_add): Pass self-name to gethash_l. * hash.c (make_similar_hash, copy_hash, hash_count, get_hash_userdata, set_hash_userdata, hash_begin, hash_next, hash_uni, hash_diff, hash_isec): Pass self name to cobj_handle. (gethash_c, gethash_e): Take self name parameter and pass down to cobj_handle. (gethash_f): Take self parameter and pass down to gethash_e. (gethash, inhash, gethash_n, sethash, pushhash, remhash, clearhash, hash_update_1): Pass self name to gethash_e or gethash_c. * hash.h (gethash_c, gethash_e, gethash_f): Declarations updated. (gethash_l): Take self name, and pass down to gethash_c. * lib.c (class_check): Take self name parameter and use in type mismatch diagnostic. (use_sym, unuse_sym, symbol_needs_prefix, find_symbol, intern, unintern, intern_fallback, unique, in, sel, obj_print_impl, populate_obj_hash, obj_hash_merge): Pass self name to gethash_f or gethash_l. (symbol_visible, obj_init): Pass situation-identifying string to gethash_e. (cobj_handle, cobj_ops): Take self name parameter and pass down to class_check. * lib.h (class_check, cobj_handle, cobj_ops): Declarations updated.
* match.c (v_load): Pass self name to read_compiled_file and read_eval_stream. * parser.c (get_parser_impl): Take self name and pass to cobj_handle. (ensure_parser): Pass situation-identifying string to gethash_c. (parser_circ_def): Pass self-name to gethash_c. (lisp_parser_impl): Pass self name to get_parser_impl and class_check. (lisp_parse, nread, iread): Pass self-name to lisp_parser_impl. (read_file_common): Take self name parameter and pass down to get_parser_impl. (read_eval_stream, read_compiled_file): Take self name and pass down to read_file_common. (load_rcfile): Pass situation-identifying string to read_eval_stream. (get_visible_syms): Pass situation-identifying string to gethash_c. (parser_errors, parser_eof): Pass self name to cobj_handle. * parser.h (read_eval_stream, read_compiled_file): Declarations updated. * parser.y (rlset): Pass self name to gethash_c. * rand.c (make_random_state, random_state_get_vec, random_fixnum, random_float): Pass self name to cobj_handle. * regex.c (regex_source, regex_print, regex_run): Pass self-name to cobj_handle. (regex_machine_init): Take self name param and pass to cobj_handle. (search_regex, match_regex, match_regex_right, regex_prefix_match, read_until_match): Pass self-name to regex_machine_init. * stream.c (stdio_get_fd): Pass self name to cobj_handle. (generic_get_line): Get COBJ operations via unsafe, direct object access rather than cobj_ops. (set_mode_props): Get object handle via unsafe, direct object access.
(stream_fd, sock_family, sock_type, sock_peer, set_sock_peer, get_string_from_stream, get_list_from_stream, stream_set_prop, stream_get_prop, close_stream, get_error, get_error_str, clear_error, get_line, get_char, get_byte, unget_char, unget_byte, put_buf, fill_buf, put_string, put_char, put_byte, flush_stream, seek_stream, truncate_stream, get_indent_mode, test_set_indent_mode, set_indent_mode, get_indent, set_indent, inc_indent, width_check, force_break, get_set_ctx, get_ctx): Pass self name to cobj_ops. (make_delegate_stream): Take self name parameter, pass down to cobj_ops. (record_adapter): Pass self name down to make_delegate_stream. (format): Pass self name to class_check. * struct.c (stype_handle): Pass self name to cobj_handle. (make_struct_type): Pass self name to class_check. * txr.c (read_eval_stream_noerr): Take self name parameter, pass to read_eval_stream. (txr_main): Pass situation-identifying string to read_compiled_file and read_eval_stream_noerr. * unwind.c (revive_cont): Pass self-name to cobj_handle. * vm.c (vm_desc_struct): Take self name parameter, pass to cobj_handle. (vm_desc_nlevels, vm_desc_nregs, vm_desc_bytecode, vm_desc_datavec, vm_desc_symvec, vm_execute_toplevel, vm_execute_closure, vm_closure_entry): Pass self name to vm_desc_struct. (vm_closure_struct): Take self name parameter, pass to cobj_handle.
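The idea behind the self-name argument can be sketched in a few lines of C. This is only an illustration, not TXR's code: `struct demo_obj` and `demo_handle` are hypothetical stand-ins for a COBJ and for cobj_handle, and the diagnostic wording is invented. The point is that a type-checking accessor which receives the caller's name can report which function was misused.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a COBJ: a handle tagged with a class name. */
struct demo_obj {
    const char *cls;
    void *handle;
};

/* Hypothetical analogue of a checked handle accessor. On a class mismatch
   it writes a diagnostic prefixed with the caller's name ("self") into msg
   and returns NULL; otherwise it returns the stored handle. */
static void *demo_handle(const char *self, struct demo_obj *obj,
                         const char *cls, char *msg, size_t size)
{
    assert(self != NULL && obj != NULL && cls != NULL);

    if (strcmp(obj->cls, cls) != 0) {
        snprintf(msg, size, "%s: object of class %s is not of class %s",
                 self, obj->cls, cls);
        return NULL;
    }

    return obj->handle;
}
```

A caller passes its own user-visible name, so a misuse surfaces as a message beginning with that name rather than with the name of the low-level accessor.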
*	txr: support variable in positive match.	Kaz Kylheku	2018-05-22	1	-0/+1
	In the @{var mod} syntax in the pattern language, allow mod to be a variable which contains a regex or integer, not just an integer or regex literal. * match.c (h_var): Check for modifier being a variable, and resolve it. * parser.y (modifiers): Allow a SYMTOK phrase. * txr.1: Documented.
*	parser: propagate copyright to generated parser.	Kaz Kylheku	2018-04-16	1	-2/+2
	* parser.y: Move the copyright comment header into the %{ ... %} section so that it is copied to the generated parser, rather than stripped away by the generator. The problem is that Bison adds a GPL header to the file wrongly implying that the whole thing is under the GPL (with a special exception). Without our copyright header there, it looks as if the whole file is from Bison.
*	parser: show starting line of unterminated form.	Kaz Kylheku	2018-04-15	1	-0/+5
	* parser.y (parse): Note the line number before parsing. If the error seems to be bad termination, issue an extra message indicating the starting line number of the form.
*	parser: @(if) hack in output must use usr package.	Kaz Kylheku	2018-04-10	1	-6/+3
	* match.c (else_s, elif_s): New symbol variables. (syms_init): Initialize new variables with interned symbols. * match.h (else_s, elif_s): Declared. * parser.y (not_a_clause): Refer to if_s, else_s and elif_s, which are symbols in the usr package, instead of interning symbols in whatever package is current.
*	lib: get rid of preprocessor macros for packages.	Kaz Kylheku	2018-04-05	1	-1/+1
	The identifiers user_package, system_package and keyword_package are preprocessor symbols that expand to other preprocessor symbols for no good reason. Time to get rid of this. * lib.c (system_package_var, keyword_package_var, user_package_var): Variables renamed to system_package, keyword_package and user_package. (symbol_package, keywordp, obj_init): Fix variable references to follow rename. * lib.h (keyword_package, user_package, system_package): Macros removed. (system_package_var, keyword_package_var, user_package_var): Variables renamed. * eval.c (eval_init): Fix variable references to follow rename. * parser.y (sym_helper): Likewise.
*	parser: don't generate special lits outside quasiquote.	Kaz Kylheku	2018-04-04	1	-9/+22
	The parser generates a sys:hash-lit, sys:struct-lit or sys:vector-lit whenever a hash, struct or vector literal contains unquotes. This allows the quasiquote expander to treat these objects as ordinary list structure when interpolating inside them, and then recognize these symbols and construct the implied real objects. The issue is that these literals are generated even if the unquotes occur outside of a backquote. For instance if a vector literal like #(,a) occurs out of the blue, not in any backquote, this is still a (sys:vector-lit (sys:unquote a)) and not an actual vector. The issue is compounded because this substitution takes place even if there is no actual comma or splice notation. Even the following is a sys:vector-lit: #((sys:unquote x)). In any case, it causes problems for compiled files, because such material can occur in the data vector of a compiled toplevel form. In this patch we modify the parser to keep track of the quasiquote/unquote level. The special literals are generated only when the object occurs inside a quasiquote. * parser.h (struct parser): New member, quasi_level. * parser.c (parser_common_init): Initialize the parser's new quasi_level member. * parser.y (vector, hash, struct): To decide whether to generate the special literal, don't just check whether unquotes occur in the list. Check that we are in a quasiquote, indicated by the quasiquoting level being positive. (i_expr, n_expr): Use mid-rule actions on the quasiquote, unquote and splice rules to bump the quasiquoting level in one direction before recognizing the object, and then bump in the opposite direction when reducing the rule. (parse): Initialize quasi_level.
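The level tracking described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the grammar actions from parser.y: `struct demo_parser`, `demo_bump` and `demo_vector_kind` are hypothetical names, and the sketch assumes that a quasiquote raises the level while an unquote or splice lowers it for the duration of the enclosed object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the parser state; only the counter is shown. */
struct demo_parser {
    int quasi_level;
};

enum demo_kind { DEMO_REAL_VECTOR, DEMO_VECTOR_LIT };

/* Called with +1 before the object of a quasiquote is recognized and with
   -1 when the rule is reduced; an unquote or splice does the reverse. */
static void demo_bump(struct demo_parser *p, int delta)
{
    assert(p != NULL && (delta == 1 || delta == -1));
    p->quasi_level += delta;
}

/* The special literal is produced only if the vector contains unquotes
   and the parser is currently inside a quasiquote. */
static enum demo_kind demo_vector_kind(const struct demo_parser *p,
                                       int has_unquotes)
{
    assert(p != NULL);
    return (has_unquotes && p->quasi_level > 0) ? DEMO_VECTOR_LIT
                                                : DEMO_REAL_VECTOR;
}
```

With the counter at zero, a literal such as #(,a) encountered outside any backquote falls into the `DEMO_REAL_VECTOR` branch, which corresponds to the behavior the patch aims for.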
*	parser: avoid consing for buf literals.	Kaz Kylheku	2018-04-03	1	-12/+7
	* parser.y (buflit, buflit_items): Don't cons up a list of bytes in buflit_items which are then assembled into a buffer. Rather, the buflit_items rules construct and fill a buffer object directly. The buflit rule then just has to signal the end of the buffer literal to the lexer, and trim the buffer to the actual size. We will need this for efficient loading of compiled files, in which the virtual machine code is represented as a buffer literal.
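The fill-then-trim approach can be illustrated with a plain growable byte array. This sketch does not use TXR's buffer API; `struct demo_buf`, `demo_buf_put` and `demo_buf_trim` are hypothetical names, and the doubling growth policy is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer: len bytes used out of cap allocated. */
struct demo_buf {
    unsigned char *data;
    size_t len, cap;
};

/* Appends one byte, doubling the capacity when full.
   Returns 1 on success, 0 if memory could not be obtained. */
static int demo_buf_put(struct demo_buf *b, unsigned char byte)
{
    assert(b != NULL);

    if (b->len == b->cap) {
        size_t ncap = b->cap ? b->cap * 2 : 16;
        unsigned char *ndata = realloc(b->data, ncap);
        if (ndata == NULL)
            return 0;
        b->data = ndata;
        b->cap = ncap;
    }

    b->data[b->len++] = byte;
    return 1;
}

/* Shrinks the allocation to the bytes actually stored, as the final step
   once the end of the literal has been seen. */
static void demo_buf_trim(struct demo_buf *b)
{
    unsigned char *ndata;

    assert(b != NULL);

    if (b->len == 0) {
        free(b->data);
        b->data = NULL;
        b->cap = 0;
        return;
    }

    ndata = realloc(b->data, b->len);
    if (ndata != NULL) {
        b->data = ndata;
        b->cap = b->len;
    }
}
```

Each parsed byte costs one store into the array instead of one list cell, and no second pass is needed to copy a list into the final object.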
*	packages: drop no-fallback-list interning restriction.	Kaz Kylheku	2018-03-09	1	-14/+1
	Removing the restriction that qualified pkg:sym syntax may not cause interning to take place if pkg has a nonempty fallback list. This serves no purpose, only hindering the flexibility of the package system. * parser.y (sym_helper): When processing a qualified symbol, if the package exists, just intern it. * txr.1: Revise all text which touched on the removed rule, to remove all traces of it from the documentation.
*	Copyright year bump 2018.	Kaz Kylheku	2018-02-15	1	-1/+1
	* LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/doloop.tl, share/txr/stdlib/error.tl, share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/pmac.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, win/cleansvg.txr: Extended Copyright line to 2018.
*	bugfix: don't record source loc of symbols and numbers.	Kaz Kylheku	2018-02-05	1	-4/+22
	This prevents a bug which manifests itself as a totally bogus file name and line number being reported in a diagnostic. The cause is that source loc info is recorded for an interned symbol when it is first encountered in one place in the code. Then that symbol occurs again in another place (perhaps a different file) in such a way that its source loc info is inherited into a surrounding generated form which now has incorrect source loc info: the location of the first occurrence of the symbol, not of this form. Then when some error is reported against the form, the bogus source loc info is shown. * parser.y (rlviable): New static function. (rlset): Only record source loc for forms which satisfy rlviable.
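A predicate of this kind can be sketched as a switch over object types. The commit message does not show the body of rlviable, so the following is an assumption about its general shape: `enum demo_type` and `demo_rlviable` are hypothetical, and the only cases grounded in the commit title are symbols and numbers.

```c
#include <assert.h>

/* Hypothetical type tags; a real object system has many more. */
enum demo_type {
    DEMO_NIL, DEMO_SYM, DEMO_NUM, DEMO_CONS, DEMO_STR, DEMO_VEC
};

/* Returns 1 if source location info may be recorded for an object of this
   type. Symbols and numbers are excluded because the same object can occur
   in many places, so a location recorded at its first occurrence would be
   misleading later. Excluding DEMO_NIL is an extra assumption. */
static int demo_rlviable(enum demo_type t)
{
    assert(t >= DEMO_NIL && t <= DEMO_VEC);

    switch (t) {
    case DEMO_NIL:
    case DEMO_SYM:
    case DEMO_NUM:
        return 0;
    default:
        return 1;
    }
}
```

A location-recording routine would consult this predicate first and do nothing when it returns 0.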
*	read, iread: source location recording now conditional.	Kaz Kylheku	2017-12-29	1	-58/+69
	Recording of source location info incurs a time and space penalty. We don't want to impose this on programs which are just reading large amounts of Lisp data that isn't code. * eval.c (eval_init): Register lisp-parse and read functions to the newly introduced nread function rather than lisp_parse. lisp_parse continues to record source location info unconditionally. * parser.c (rec_source_loc_s): New symbol variable. (parser_common_init): Set the new member of the parser structure, rec_source_loc, according to the current value of the special var rec-source-loc. (lisp_parse_impl): New second argument, rlcp_p. If true, it overrides the rec_source_loc member of the parser structure to true. (lisp_parse): Pass true argument to rlcp_p parameter of lisp_parse_impl, so parsing via lisp_parse always records source loc info. (nread): New function. (iread): Pass true argument to rlcp_p parameter of lisp_parse_impl, so rec-source-loc controls whether source location info is recorded. (parse_init): Initialize rec_source_loc_s symbol variable, and register the rec-source-loc special var. * parser.h (struct parser): New member, rec_source_loc. (rec_source_loc_s, nread): Declared. * parser.y (rlcp_parser): New static function. Like rlcp but does nothing if parser->rec_source_loc is false. (rlc): New macro. (grammar): Replace rlcp uses with rlc, which expands to a call to rlcp_parser. (rlrec): Do nothing if source loc recording is not enabled in the parser. (make_expr, uref_helper): Replace rlcp with rlc. This is possible because these functions have a parser local variable that the macro expansion can refer to. (parse_once): Override rec_source_loc in the parser to 1, so that source loc info is always recorded when parsing is invoked through this function. * txr.1: Documented rec-source-loc and added text under read and iread.
*	cleanup: remove unnecessary header includes.	Kaz Kylheku	2017-09-19	1	-4/+0
	* eval.c: doesn't need rand.h. * filter.c: doesn't need gc.h. * parser.l: doesn't need eval.h. * parser.y: doesn't need utf8.h, stream.h, args.h or cadr.h.
*	parser: fix precedence of DOTDOT.	Kaz Kylheku	2017-09-07	1	-1/+14
	The problem is that a.b .. c.d parses as (qref a b..c d), which is useless and counterintuitive. Let's fix it, but with a backward compatibility switch to give more leeway to any hapless people out there whose code happens to depend on this unfortunate situation. We basically use two token numbers for the .. token: OLD_DOTDOT, and DOTDOT. Both are wired into the grammar. In backward compatibility mode, the lexer pumps out OLD_DOTDOT. Otherwise DOTDOT. * parser.l (grammar): When .. is scanned, return OLD_DOTDOT when in compatibility with 185 or earlier. Otherwise DOTDOT. * parser.y (OLD_DOTDOT): New terminal symbol; introduced at the same high precedence previously occupied by DOTDOT. (DOTDOT): Changes precedence to lower than '.' and UREFDOT. (n_expr): Two productions added involving OLD_DOTDOT. These are copy and paste of the existing productions involving DOTDOT; the only difference is that OLD_DOTDOT replaces DOTDOT. (yybadtoken): Handle OLD_DOTDOT. * txr.1: Compat notes added.
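The lexer-side choice between the two token numbers can be sketched as a small function. The names and token values here are hypothetical, and the convention that a compatibility value of 0 means "no compatibility mode requested" is an assumption made for the example; only the threshold of 185 comes from the commit message.

```c
#include <assert.h>

/* Hypothetical token numbers; a generated parser assigns the real ones. */
enum demo_token { DEMO_DOTDOT = 258, DEMO_OLD_DOTDOT = 259 };

/* Chooses the token to return when ".." is scanned. compat is the requested
   compatibility version, with 0 assumed to mean that none was requested. */
static enum demo_token demo_dotdot_token(int compat)
{
    assert(compat >= 0);

    if (compat != 0 && compat <= 185)
        return DEMO_OLD_DOTDOT; /* old, high-precedence behavior */

    return DEMO_DOTDOT;         /* new, lower-precedence behavior */
}
```

Because both tokens appear in the grammar with their own precedence, the parser tables stay fixed and the compatibility decision is confined to this one point in the scanner.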
*	txr -i honored despite parse-time exception.	Kaz Kylheku	2017-09-06	1	-2/+2
	The -i option is now honored even if an error is thrown while parsing a .txr file or while reading and evaluating the forms of a .tl file. * parser.y (parse_once, parse): Wording change in message when exception is caught. Only exceptions derived from error are caught. * txr.c (parse_once_noerr, read_eval_stream_noerr): New static functions. (txr_main): Use parse_once_noerr and read_eval_stream_noerr instead of parse_once and read_eval_stream. Don't exit if a TXR file has parser errors; in that situation, exit only if interactive mode is not requested, otherwise go interactive. Make sure self-path is registered to the name of the input source in this case also. * unwind.h (ignerr_func_body): New macro.
*	parser: bugfix: empty buf literal problem.	Kaz Kylheku	2017-08-22	1	-1/+2
	* parser.y (buflit): Fix neglect to call end_of_buflit in the empty buffer literal case, which precipitates syntax errors when an empty buffer literal #b'' is embedded in other syntax.
*	parser: fix byacc regression related to hash-semi.	Kaz Kylheku	2017-08-17	1	-0/+1
	The "byacc_fool" rule needed a small update when in November 2016 it became involved in newly introduced #; (hash semicolon) syntax for commenting out. The problem is that when a top-level nested list expression is followed by #; then a byacc-generated parser throws a syntax error. This is because the byacc_fool production only generates a n_expr, or empty. Thus only those symbols are allowed which may begin a n_expr. The hash-semicolon token isn't one of them. * parser.y (byacc_fool): Add a production rule which generates a HASH_SEMI. Thus HASH_SEMI is now one of the terminals that may legally follow any sentential form that matches a byacc_fool rule after it.
*	parser: more efficient treatment of string literals.	Kaz Kylheku	2017-08-17	1	-27/+12
	In this patch we switch the string literal parser from right recursion to left recursion, so that it doesn't require a Yacc stack depth proportional to the number of characters in the literal. Secondly, we build the string directly in a syntax-directed way, rather than building a list of characters and then walking the list to build a string. This was discovered as a byacc regression, though the fix is not for the sake of byacc but basic efficiency. The Sep 29, 2015 commit 111650e235ab2e529fa1529b1c9a23688a11cd1f "Implementation of static slots for structures." extended the string literal in the struct.tl test case in such a way that if the parser is generated by byacc rather than GNU Bison, the test case fails with a "yacc stack overflow". I haven't done any regression testing with byacc in over two years so I didn't notice this. Quasiliterals could use this treatment also. Word list literals benefit from this change, but they still use a Yacc stack depth proportional to the number of words, since the accumulation of words is right recursive. * parser.y (lit_char_helper): Static function removed. (restlitchar): New grammar nonterminal symbol. (strlit, quasi_item, wordslit): No need to call lit_char_helper. (litchars): A litchars is now either a single LITCHAR, or else a LITCHARS followed by a sequence of more. This sequence is a separate production rule called restlitchar, which is purely left recursive. (If litchars is made directly left recursive without this helper rule, intractable reduce/reduce and shift/reduce conflicts arise.)
*	Bugfix: (sys:expr . atom) bad syntax out of parser.	Kaz Kylheku	2017-08-02	1	-1/+1
	* parser.y (expand_meta): Fix incorrect conversion of (sys:var x) when x is a non-bindable term to (sys:expr . x). Should be (sys:expr x). This doesn't have that much of an impact, I don't think. It prevents certain degenerate forms from working like @(bind x @"str"). The bad thing is that this particular one has a silent problem: @"str" wrongly evaluates to #\s. Nevertheless, this doesn't seem worth the addition of a compat flag test; the odds of someone depending on @"str" producing #\s in some pattern language code seem vanishingly low.
*	Continuing implementation of buffers.	Kaz Kylheku	2017-04-21	1	-2/+36
	* Makefile (OBJS): New objects itypes.o and buf.o. * buf.c, buf.h: New files. * itypes.c, itypes.h: New files. * lib.c (obj_print_impl): Handle BUF via buf_print and buf_pprint. (init): Call itypes_init and buf_init. * parser.h (end_of_buflit): Declared. * parser.l (BUFLIT): New exclusive state. (grammar): New rules for recognizing start of buffer literal and its interior. (end_of_buflit): New function. * parser.y (HASH_B_QUOTE): New token. (buflit, buflit_items, buflit_item): New nonterminals and corresponding grammar rules. (i_expr, n_expr): These symbols now generate a buflit; a buffer literal is a kind of expression. (yybadtoken): Handle HASH_B_QUOTE case.
*	parser: add some error cases to hash notations.	Kaz Kylheku	2017-04-08	1	-1/+9
	Produce better diagnostics for expressions like #[... or #Habc. * parser.y (vector, hash, struct, range): Add error productions.
*	parser: refactor grammar to banish #[] etc.	Kaz Kylheku	2017-04-07	1	-16/+26
	Turns out we have an over-eager parser which recognizes invalid notation such as #[...], #S[...] and others. This is because the grammar nonterminal list is overloaded with phrase structures. The syntax of a vector literal, for instance, is '#' list, but a list can be '[' ... and other expressions. * parser.y (dwim, meta, compound): New non-terminal symbols. Dwim derives the square bracketed "dwim" expressions that were previously overloaded into list. Meta derives the @ exprs. compound is what list used to be. (list): Handle only (...) list expressions. (o_elem, modifiers): Derives compound rather than list, thus preserving existing behavior. (i_expr, n_expr): Likewise. All other references to the list nonterminal stay, thereby trimming the grammar of dubious expressions.
*	parser: fix a...b syntax error.	Kaz Kylheku	2017-04-02	1	-0/+6
	This issue has implications mainly for read/print consistency. The (rcons a .b) expression prints a...b, but that doesn't read back. The reason is that the . on .b isn't preceded by whitespace, and so isn't the UREFDOT token recognized in a n_expr. It's just the '.' token which is a syntax error in that situation. * parser.y (n_expr): New special case rule to handle the phrase pattern n_expr DOTDOT '.' n_expr which is now a syntax error.
*	parser: support uref dot as top-level expr.	Kaz Kylheku	2017-03-15	1	-0/+8
	* parser.y (hash_semi_or_n_expr, hash_semi_or_i_expr): Add grammar rules for leading dot.
*	parser: factor repeated uref-related code.	Kaz Kylheku	2017-03-15	1	-32/+20
	* parser.y (uref_helper): New static function. (list, i_dot_expr, n_expr, n_dot_expr): Replace most action code related to unbound ref dot syntax with call to uref_helper.
*	Add in-package directive.	Kaz Kylheku	2017-03-13	1	-0/+4
	* match.c (in_package_s): New symbol variable. (syms_init): Initialize in_package_s. * match.h (in_package_s): Declared. * parser.y (check_parse_time_action): Add case for in-package. Evaluate just with eval, as a case of the in-package macro. * txr.1: Documented.
*	New directive: mdo.	Kaz Kylheku	2017-03-12	1	-5/+11
	* eval.h (progn_s): Declaration added. * match.c (mdo_s): New symbol variable. (syms_init): Initialize mdo_s. * match.h (mdo_s): Declared. * parser.y (check_for_include): Renamed to check_parse_time_action and implements mdo, not only include. (clauses_rev): Follow rename of function. * txr.1: Documented.
*	uref: the a.b.c syntax extended to .a.b.c	Kaz Kylheku	2017-03-06	1	-13/+54
	Now it is possible to use a leading dot on the referencing dot syntax. This is the "unbound reference dot". It expands to the uref macro, which denotes an unbound-reference: it produces a function which takes an object as the argument, and curries the reference implied by the remaining arguments. * eval.c (uref_s): New global symbol variable. (eval_init): Intern uref symbol and init uref_s. * eval.h (uref_s): Declared. * lib.c (simple_qref_args_p): A qref expression is now also not simple if it contains an embedded uref, meaning that it cannot be rendered into the dot notation without ambiguity. (obj_print_impl): Support printing (uref a b c) as .a.b.c. * lisplib.c (struct_set_entries): Add uref to the list of autoload triggers for struct.tl. * parser.l (DOTDOT): Consume any leading whitespace as part of recognizing the DOTDOT token. Otherwise the new rule for UREFDOT, which matches (mandatory) leading space will take precedence, causing " .." to be scanned wrong. (UREFDOT): Rule for new kind of dot token, which is preceded by mandatory whitespace, and isn't consing dot (which has mandatory trailing whitespace too, matched by an earlier rule). * parser.y (UREFDOT): New token type. (i_dot_expr, n_dot_expr): New grammar rules. (list): Handle a leading dot on the first element of a list as a special case. Things are done this way because trying to work a UREFDOT into the grammar otherwise causes intractable conflicts. (i_expr): The ^, ' and , punctuators are now followed by an i_dot_expr, so that the expression can be an unbound dot. (n_expr): Same change as in i_expr, but using n_dot_expr. Plus new UREFDOT n_expr production. * share/txr/stdlib/struct.tl (uref): New macro. * txr.1: Documented.
*	Harmonize code with previous commit.	Kaz Kylheku	2017-03-04	1	-2/+3
	* parser.y (expand_repeat_rep_args): Use a sym local variable to avoid evaluating first(arg) twice, like the previous commit does in another case of this function.
*	bugfix: :vars in output repeat not registered.	Kaz Kylheku	2017-03-04	1	-3/+7
	Test case: @(output) @ (repeat :vars (x (y 42)) @ (list x y) @ (end) @(end) x and y are spuriously reported as unbound variables in the (list x y) form. * parser.y (expand_repeat_rep_args): Do the missing calls to match_reg_var when processing :vars list.
*	Support horizontal @(block), phase 1.	Kaz Kylheku	2017-02-15	1	-0/+5
	Unresolved issue: horizontal @(accept) terminating in a vertical @(block) or horizontal @(block) in a different line, or vertical @(accept) caught in horizontal context. * match.c (h_block, h_accept_fail): New functions. (dir_tables_init): Register horizontal @(block), @(accept) and @(fail). * parser.y (elem): Support BLOCK syntax.
*	bugfix: :filter not handled right in output var.	Kaz Kylheku	2017-01-26	1	-4/+3
	This issue was fixed in quasiliterals only. Because of the implementation duplicity between output vars and quasiliteral vars, we have to fix it in two places. When the parser handles quasiliterals, it builds vars without expanding the contents. The quasiliteral expander takes care of recognizing (sys:var ...) forms and properly handles them and their attributes, avoiding expanding the argument of a :filter keyword. When the parser handles an o_var that is a braced variable, it calls expand on its contents right there, then builds the (sys:var ...) form from the expanded contents. Why don't we just call expand_quasi in the o_var rule to have a single (sys:var ...) form expanded exactly how it is done in quasiliterals? * eval.c (expand_quasi): Change static function to external. * eval.h (expand_quasi): Declared. * parser.y (o_var): Construct an unexpanded (sys:var ...) form, and then wrap it in a one-element list. This is a de-facto quasi-items list, which can be expanded by expand_quasi. Then we pull the car of the expansion to get our expanded var.
*	bugfix: catch arguments not registered properly.	Kaz Kylheku	2017-01-23	1	-1/+1
	Symptom: variables appearing in a @(catch) are reported as unbound variables anyway. * parser.y (process_catch_exprs): The parameters are the second element of the catch form, not its rest.
*	Bump copyright year to 2017.	Kaz Kylheku	2017-01-23	1	-1/+1
	* LICENSE, LICENSE-CYG, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl, share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/package.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl: Add 2017 to all copyright headers and strings.
*	parser bugfix: expand used instead of expand_forms.	Kaz Kylheku	2017-01-22	1	-1/+1
\| \| \| \| \|	* parser.y (o_var): fix expand wrongly being called on a list of forms.
*	Enable unbound warnings when expanding TXR code.	Kaz Kylheku	2017-01-22	1	-12/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With this change, Lisp expansion-time warnings are no longer suppressed during the parsing of the TXR pattern language. Embedded Lisp expressions can refer to TXR pattern variables, which generates spurious warnings that must be suppressed. Since TXR pattern variables are dynamically introduced in a very flexible way, it's hard to do an exact job of this. We take the crude approach that warnings are suppressed for all pattern variables that appear anywhere in the TXR code. To do that, we identify, at parse time, all directives which can bind new variables, and register those variables as if they were tentative global defs, purging all pending warnings for them. * match.c (binding_directive_table): New static hash table. (match_reg_var, match_reg_params, match_reg_elem): New functions. (match_reg_var_rec): New static function. (dir_tables_init): gc-protect binding_directive_table, and populate its entries. * match.h (into_k, named_k): Declared. (match_reg_var, match_reg_params, match_reg_elem): Declared. * parser.y (process_catch_exprs): New static function. (elem): Call match_reg_elem for each basic directive, to process the variables in that directive according to its operator symbol. Do this for each compound form elem and variable elem. The horizontal @(define) elem has its own grammar production here, and we handle its parameter list in that rule. (define_clause): Handle the parameters of a vertical @(define). It binds pattern variables, and so we must suppress unbound warnings for those. (catch_clauses_opt): Process the parameters bound by @(catch) clauses. (output_clause): Suppress warnings for the variables nominated by any :into or :named argument. (expand_repeat_rep_args): Suppress warnings for :counter variable, and for :vars variables. 
(parse_once): Remove the warning-muffling handler frame set up around the yyparse call. * txr.c (txr_main): Suppress warnings for TXR variables defined using -D syntax on the command line. Dump deferred warnings after parsing a .txr file.
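The registration scheme described in this message can be pictured with a toy sketch. It is not TXR's implementation: `warn_state`, `warn_unbound` and `register_var` are hypothetical names, fixed-size arrays stand in for TXR's hash tables, and the rule that a name already registered suppresses later warnings is an assumption made for illustration. The sketch shows only the idea that registering a pattern variable purges the warnings already pending for it.

```c
#include <string.h>

#define MAX_NAMES 16

/* Hypothetical bookkeeping; not TXR's data structures. */
typedef struct {
  const char *pending[MAX_NAMES];  /* deferred unbound-variable warnings */
  int npending;
  const char *defs[MAX_NAMES];     /* names registered as tentative defs */
  int ndefs;
} warn_state;

static int is_tentative_def(const warn_state *ws, const char *name)
{
  for (int i = 0; i < ws->ndefs; i++)
    if (strcmp(ws->defs[i], name) == 0)
      return 1;
  return 0;
}

/* Expander side: defer a warning for an unbound symbol, unless the
   name was already registered (assumed behavior). */
static void warn_unbound(warn_state *ws, const char *name)
{
  if (!is_tentative_def(ws, name) && ws->npending < MAX_NAMES)
    ws->pending[ws->npending++] = name;
}

/* Parser side: a binding directive was seen. Record the name and
   purge every warning already pending for it. */
static void register_var(warn_state *ws, const char *name)
{
  int kept = 0;
  for (int i = 0; i < ws->npending; i++)
    if (strcmp(ws->pending[i], name) != 0)
      ws->pending[kept++] = ws->pending[i];
  ws->npending = kept;
  if (!is_tentative_def(ws, name) && ws->ndefs < MAX_NAMES)
    ws->defs[ws->ndefs++] = name;
}
```

Whatever remains in `pending` after parsing corresponds to the deferred warnings that are dumped once a .txr file has been read.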
*	Improve accuracy of expansion of repeat/rep args.	Kaz Kylheku	2017-01-22	1	-8/+10
\| \| \| \| \| \| \|	* parser.y (expand_repeat_rep_args): Correctly handle situation when :counter or :vars appears as an argument to another keyword. (A warning might be generated here, since this situation is wrong.)
*	bugfix: expand macros in a number of directives.	Kaz Kylheku	2017-01-21	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the last round of changes on this topic, bringing proper macro expansion to the arguments to @(skip), @(fuzz), @(next), @(call), @(cat), @(load) and @(close). * match.c (match_expand_keyword_args): Only process the keyword arguments if they are followed by an argument. Process @(next) arguments here too: :list and :string take a Lisp expression, but :tlist and :var take an argument which is not a Lisp expression and must be handled properly. Also, expand any non-keyword expression. This handles the <source> argument of @(next). (match_expand_elem): New function. * match.h (match_expand_elem): Declared. * parser.h (expand_meta): Declared. * parser.y (expand_meta): Static function becomes external. (elem): Expand elem other than require or do using match_expand_elem. We don't fold require and do into this because match_expand_elem has a backward compat switch in it that doesn't apply to these.
*	bugfix: expand dest arg of @(output).	Kaz Kylheku	2017-01-21	1	-1/+12
\| \| \| \| \|	* parser.y (expand_form_ver): New inline function. (output_clause): If exprs are present, expand first one.
*	Expand lisp forms in @(mod) and @(modlast) args.	Kaz Kylheku	2017-01-19	1	-4/+15
\| \| \| \| \| \|	* parser.y (expand_forms_ver): New function. (repeat_parts_opt, rep_parts_opt): Expand the exprs_opt that follow MOD or MODLAST.
*	Bugfix: expand macros in collect, coll, gather.	Kaz Kylheku	2017-01-19	1	-16/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the argument lists of @(collect)/@(repeat), @(coll)/@(rep) and @(gather), Lisp expressions can appear as arguments to keywords or for supplying default values for variables. These are not being macro-expanded. * match.c (match_expand_vars): New static function. (match_expand_keyword_args): New function. * match.h (match_expand_keyword_args): Declared. * parser.y (gather_clause, collect_clause, elem): Use new function in match.c to expand the argument lists.
*	Eliminate rejection of empty clauses.	Kaz Kylheku	2017-01-08	1	-87/+27
\| \| \| \| \| \|	* parser.y (grammar): Remove all checks which raise a syntax error if a clause is empty. These reject some correct situations, getting in the programmer's way.
*	Bugfix: incorrect quasi-quoting over #R syntax.	Kaz Kylheku	2016-12-25	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The issue is that ^#R(,(+ 2 2) ,(+ 3 3)) produces 4..6 rather than #R(4..6). 4..6 is, of course, the syntax (rcons 4 6) which evaluates to a range. Here we want the range to which it evaluates, not the syntax. * eval.c (expand_qquote_rec): Handle the case when the qquoted_form is a range atom: expand the from and to parts, and generate a rcons expression. Though this seems to be opposite to the previous paragraph, it's the right thing. * parser.y (range): Drop the unquotes_occurs case which produces rcons syntax. Produce a range object, always. This is the source of the problem: a (rcons ...) expression was produced here which was just traversed by the qquote expander as a list. It's the expander that must produce the rcons expression.
*	Tweak terminology in some parser error messages.	Kaz Kylheku	2016-12-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Eliminate references to a "list expression". This term specifically denotes (list ...) and not any compound expression. * parser.y (define_clause): Change "unterminated list expression" to "unterminated define directive". (output_clause): Change "unterminated list expression" to "unterminated output directive". (list): Change "list expression" to "expression".
*	parser: fix problems at EOF involving #; syntax.	Kaz Kylheku	2016-12-06	1	-24/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses a problem whereby if a TXR Lisp file ends with an erased object notation such as #;(a b c), there is a syntax error. The strategy is to simplify the grammar so that a single yyparse primed with SECRET_ESCAPE_E or SECRET_ESCAPE_I will read either an object, or just one instance of the #; notation. If #;OBJ is read, then the parse tree is returned as the nao value. The caller knows that #;OBJ must have occurred because there are no errors and the parser isn't at EOF, yet there is no parse tree. Then in lisp_parse we can loop on this situation, and make adjustments elsewhere also. So that iread continues to work, we must separate the parser_eof condition from the lookahead token. Under iread, we were clearing the token in prime_parser_post, but that was having the side effect of making the parser look like it is in EOF. We now preserve the EOF indication in a flag, so we can manipulate the token. * parser.h (struct parser): new member, eof. * parser.c (parser_common_init): Initialize new eof flag in parser structure to zero. (prime_parser_post): Set the eof flag if the parser's most recent token is zero. (lisp_parse_impl): Call the parser repeatedly while there are no errors, and no EOF, yet no object has been produced. This indicates that a #; erasure has been processed. (read_eval_stream): Restructure the logic here for clarity. Do not break the loop if error_val was returned from the parser, but there are no errors, and the parser isn't at EOF. This behavior is probably redundant with respect to the loop in lisp_parse_impl. (read_eval_ret_last): Bugfixes here. Pass an error indicating value down to lisp_parse, like in read_eval_stream and make the logic similar. (parser_eof): Just return an indication based on the eof flag. 
* parser.y (hash_semis_n_expr, hash_semis_i_expr, ignored_i_exprs, ignored_n_exprs): Grammar rules removed. (hash_semi_or_n_expr, hash_semi_or_i_expr): New grammar rules. (spec): Retarget SECRET_ESCAPE_E and SECRET_ESCAPE_I cases to new rules. (parse): Clear eof flag to zero.
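The looping strategy described in this message can be illustrated with a toy sketch. It is not TXR's code: `toy_parser`, `toy_parse_one` and `toy_read` are hypothetical names, and the "parser" merely walks an array of integers in which -1 stands for an erased #;OBJ. Only the `eof` member mirrors something named in the message. The sketch shows the control flow alone: keep calling the single-unit parser while there is no error, no EOF, and no object.

```c
#include <stddef.h>

#define TOY_NAO (-1)  /* "no object"; stands in for the nao value */

/* Hypothetical stand-in for the parser state. */
typedef struct {
  const int *items;  /* toy input: >= 0 is an object, -1 is an erased #;OBJ */
  size_t count;
  size_t pos;
  int errors;
  int eof;           /* EOF kept in a flag, separate from any token */
} toy_parser;

/* Reads exactly one unit: either an object or one erasure.
   Returns TOY_NAO for an erasure, and also at end of input,
   in which case the eof flag is set. */
static int toy_parse_one(toy_parser *p)
{
  if (p->pos >= p->count) {
    p->eof = 1;
    return TOY_NAO;
  }
  return p->items[p->pos++];
}

/* No errors, no EOF, yet no object: an erasure was just consumed,
   so try again. Returns 1 and stores the object on success,
   0 on error or EOF. */
static int toy_read(toy_parser *p, int *obj_out)
{
  for (;;) {
    int obj = toy_parse_one(p);
    if (p->errors || p->eof)
      return 0;
    if (obj != TOY_NAO) {
      *obj_out = obj;
      return 1;
    }
  }
}
```

With the input { -1, 7, -1 }, the first call to `toy_read` skips the leading erasure and yields 7; the second call consumes the trailing erasure and then reports a clean EOF rather than an error, which is the situation the patch is concerned with.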
*	Eliminate duplicated warning-suppressing function.	Kaz Kylheku	2016-11-28	1	-6/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (warning_continue): Static function removed. (no_warn_expand): Use uw_muffle_warning instead of removed function. * parser.y (warning_continue): Static function removed. (parse_once): Use uw_muffle_warning instead of removed function. * unwind.c (uw_muffle_warning): New function. * unwind.h (uw_muffle_warning): Declared.
*	Expander warns about unbound variables.	Kaz Kylheku	2016-11-26	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_exception): New static function. (eval_error): Reduced to wrapper around eval_exception. (eval_warn): New function. (me_op): Bind the rest symbol in a shadowing env to suppress warnings about unbound rest. (do_expand): Throw a warning when a bindable symbol is traversed that has no binding. (expand): Don't install atoms as last_form_expanded. * lib.c (warning_s, restart_s, continue_s): New symbol variables. (obj_init): Initialize new symbol variables. * lib.h (warning_s, restart_s, continue_s): Declared. * lisplib.c (except_set_entries): New entries for ignwarn and macro-time-ignwarn. * parser.c (repl_warning): New static function. (repl): Use repl_warning function as a handler for warning exceptions: to print their message and then continue by throwing a continue exception. * parser.y (warning_continue): New static function. (parse_once): Use warning_continue to ignore warnings. In other words, we suppress warnings from Lisp that is mixed into TXR pattern language code, because this produces too many false positives. * share/txr/stdlib/except.tl (ignwarn, macro-time-ignwarn): New macros. * share/txr/stdlib/place.tl (call-update-expander, call-clobber-expander, call-delete-expander): Ignore warnings around calls to sys:expand, because of some gensym-related false positives (we expand code into which we inserted some gensyms, without having inserted the constructs which bind them). * tests/011/macros-2.txr: Suppress unbound variable warnings from a test case. * tests/012/ifa.tl: Bind unbound x y variables in one test case. * tests/012/struct.tl: Suppress unbound variable warnings in some test cases. * unwind.c (uw_throw): If a warning is unhandled, then print its message with a "warning" prefix and then throw a continue exception. 
(uw_register_subtype): Eliminate the check for sub already being a subtype of sup. This allows us to officially register new types against t. (uw_late_init): Register continue exception type as a subtype of the restart type. Formally register warning type. * txr.1: Documented ignwarn.
*	bugfix: quasilit read/print consistency, part 2.	Kaz Kylheku	2016-11-26	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In this commit I'm addressing the issue introduced in part 1 that expressions in @(output) blocks are still using (sys:expr ...) wrapping, but are passed down to an evaluator which now expects unwrapped expressions. As part of this change, I'm changing the representation of @expr from (sys:expr . expr) to (sys:expr expr). * eval.c (format_field): Adjust access to sys:expr expression based on new representation. (transform_op): Likewise. * lib.c (obj_print_impl): Likewise. * match.c (dest_bind): Likewise. (do_txeval): Likewise. (do_output_line): Likewise, in some compat code. Here is the fix for the issue: when calling tx_subst_vars, we pass a list of one element containing the expression, not wrapped in sys:expr. Previously, we passed a one-element list containing the sys:expr. * parser.y (o_elem): If a list occurs in the syntax, represent it as (sys:expr list) rather than (sys:expr . list). (list): Do the same for @ n_expr syntax. (expand_meta, make_expr): Harmonize with the representation change.
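The difference between the two representations comes down to where the wrapped expression sits in the cons structure. The following toy sketch is not TXR's object model: `cell`, `mk_atom`, `mk_cons`, `expr_of_old` and `expr_of_new` are hypothetical names used only to show that the old dotted shape keeps the expression in the rest of the form, while the new shape keeps it as the second element of a two-element list.

```c
#include <stdlib.h>

/* Toy cons cell; not TXR's val type. Cells are never freed here. */
typedef struct cell {
  const char *atom;        /* non-NULL for an atom such as a symbol */
  struct cell *car, *cdr;  /* meaningful only when atom is NULL */
} cell;

static cell *mk_cell(const char *atom, cell *car, cell *cdr)
{
  cell *c = malloc(sizeof *c);
  if (c == NULL)
    abort();
  c->atom = atom;
  c->car = car;
  c->cdr = cdr;
  return c;
}

static cell *mk_atom(const char *name) { return mk_cell(name, NULL, NULL); }
static cell *mk_cons(cell *a, cell *d) { return mk_cell(NULL, a, d); }

/* Old shape (sys:expr . E): the expression is the rest of the form,
   so (sys:expr . (foo x)) is the same structure as (sys:expr foo x). */
static cell *expr_of_old(cell *form) { return form->cdr; }

/* New shape (sys:expr E): the expression is the second element. */
static cell *expr_of_new(cell *form) { return form->cdr->car; }
```

Every accessor that previously took the rest of the form has to take the second element instead, which is why the message lists a "Likewise" adjustment for each function that inspects sys:expr.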
*	bugfix: quasilit read/print consistency, part 1.	Kaz Kylheku	2016-11-26	1	-20/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bug is that `@@@a` prints as `@@a` which reads as a different object. In this patch we simplify how quasiliterals are represented. Embedded expressions are no longer (sys:expr E), just E. Meta-numbers N and variables V are still (sys:var N). However `@@a` and `@a` remain equivalent. * eval.c (subst_vars): No need to look for expr_s; just evaluate a compound form. The recursive nested case is unnecessary and is removed. (expand_quasi): Do not handle expr_s; it is not part of the quasi syntax any more. * lib.c (out_quasi_str): Do not look for expr_s in the quasi syntax; just print any expression with a @ in the fallback case. * match.c (tx_subst_vars): Analogous changes to those done in subst_vars in eval.c. * parser.y (quasi_meta_helper): Static function removed. This was responsible for the issue due to stripping a level of meta from expressions already having a meta on them. (quasi_item): In the `@` n_expr syntax case, no longer call quasi_meta_helper. The remaining logic is simple enough to put in line. Symbols and integers get wrapped with (sys:var ...); other expressions are integrated into the syntax as-is.
*	Completion of fallback list implementation.	Kaz Kylheku	2016-11-16	1	-2/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (find_symbol): New function. (symbol_present): Search the fallback list also to determine whether the symbol is visible. * lib.h (find_symbol): Declared. * parser.y (sym_helper): Implement a new behavior for qualified symbols. Interning new symbols is only allowed for packages that have an empty fallback list. * parser.c (get_visible_syms): New static function. (find_matching_syms): Use get_visible_syms to get the list of eligible symbols. This way the fallback list of the package is included if it is the current package. * share/txr/stdlib/package.tl (defpackage): Do not insert a default (:use usr) if there is no :use clause. Since defpackage is very new, no need for backward compatibility; the amount of code depending on this is likely zero. * txr.1: Documented fallback list feature.