txr - TXR: A data munging language.

	Commit message	Author	Age	Files	Lines
*	args: don't use alloca for const size cases.	Kaz Kylheku	2022-10-15	1	-16/+16
	* args.h (args_decl_list): This macro now handles only constant values of N. It declares an anonymous container struct type which juxtaposes the struct args header with exactly N values. This is simply defined as a local variable without alloca. (args_decl_constsize): Like args_decl, but requiring a constant N; implemented via args_decl_list. (args_decl_list_dyn): New name for the old args_decl_list which calls alloca. No places in the code depend on this at all, except the definition of args_decl. (args_decl): Retargeted to args_decl_list_dyn. There is some inconsistency in the macro naming in that args_decl_constsize depends on args_decl_list, and args_decl depends on args_decl_list_dyn. This was done to minimize diffs. Most direct uses of args_decl_list have a constant size, but a large number of args_decl uses do not have a constant size. * eval.c (op_catch): Use args_decl_constsize. * ffi.c (ffi_struct_in, ffi_struct_get, union_out): Likewise. * ftw.c (ftw_callback): Likewise. * lib.c (funcall, funcall1, funcall2, funcall3, funcall4, uniq, relate): Likewise. * socket.c (sockaddr_in_unpack, sockaddr_in6_unpack, sockaddr_un_unpack): Likewise. * stream.c (formatv): Likewise. * struct.c (struct_from_plist, struct_from_args, make_struct_lit): Likewise. * sysif.c (termios_unpack): Likewise. * time.c (broken_time_struct): Likewise.
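The constant-size case described in the args commit above can be sketched roughly as follows. This is only an illustration of the idea (a header placed next to exactly N value slots in an ordinary local variable); `value`, `struct args_hdr`, `ARGS_DECL_CONST`, `args_push` and `sum3` are hypothetical names, not the definitions in args.h.

```c
#include <assert.h>
#include <stddef.h>

typedef long value;                 /* hypothetical stand-in for a TXR value */

struct args_hdr {                   /* hypothetical header: capacity and fill count */
  size_t argc;
  size_t fill;
};

/* Declares a local variable holding the header followed by exactly N slots.
   N must be an integer constant expression, so neither alloca nor a VLA
   is involved. */
#define ARGS_DECL_CONST(NAME, N) \
  struct { struct args_hdr hdr; value arg[N]; } NAME = { { (N), 0 }, { 0 } }

/* Append one value, checking that the fixed capacity is not exceeded. */
static void args_push(struct args_hdr *hdr, value *slots, value v)
{
  assert(hdr->fill < hdr->argc);
  slots[hdr->fill++] = v;
}

/* Example use: pass three values through a fixed-size argument block. */
static value sum3(value a, value b, value c)
{
  ARGS_DECL_CONST(args, 3);
  value total = 0;
  size_t i;

  args_push(&args.hdr, args.arg, a);
  args_push(&args.hdr, args.arg, b);
  args_push(&args.hdr, args.arg, c);

  for (i = 0; i < args.hdr.fill; i++)
    total += args.arg[i];

  return total;
}
```

Because the storage is a plain automatic variable, the compiler knows its size statically, which is the property the commit relies on when it moves the constant-N call sites away from alloca.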
*	funcall: consolidate VM fun handling.	Kaz Kylheku	2022-10-14	1	-34/+80
	* lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Handle FVM case separately, regardless of the f.variadic flag. If the VM function is variadic, the call can still use the special cases.
*	funcall: handle optargs in funcall helpers.	Kaz Kylheku	2022-10-14	1	-8/+85
	* lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Handle some situations when the function is a built-in with optional args.
*	funcall: don't route to generic_fun on optargs.	Kaz Kylheku	2022-10-14	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Do not go through the generic_funcall slow path just because the target function has optional arguments. It's possible that the call is supplying all of the required arguments. Let's try it like that and then if it doesn't work and there are optionals, check again and go the generic_funcall route. This might not be an overall improvement by itself, if we end up going to generic_funcall in more cases than not. However, this change paves the way for more changes: handling some cases of optargs in these helpers.
*	json: support standard-style formatting.	Kaz Kylheku	2022-10-11	1	-36/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stream.c (standard_k, print_json_format_s): New symbol variables. (stream_init): New variables initialized. * stream.h (enum json_fmt): New enum. (standard_k, print_json_format_s): Declared. * lib.c (out_json_rec): Take enum json_fmt param, and pass it recursively. Printing for vector and dictionaries reacts to argument value. (out_json, put_json): Examine value of special var print-json-format and calculate enum json_fmt value from this. Pass to out_json_rec. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	put-json: restore indent on unwinding.	Kaz Kylheku	2022-10-11	1	-1/+7
	The following situation is observed in the listener. 1> (put-json #H(() (a 1))) { print: invalid object a in JSON during evaluation at expr-1:1 of form (put-json #H(() (a 1))) 1> 1> An indent established in the aborted JSON print job has been left in the stream. * lib.c (put_json): Save the indent value also, not only the mode. Restore the indent mode and value on unwinding, not just on a normal exit from out_json_rec, similarly to what obj_print does.
*	strings: revert caching of hash value.	Kaz Kylheku	2022-10-08	1	-24/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Research indicates that this is something useful in languages that abuse strings for implementing symbols. We have interned symbols. * lib.h (struct string): Remove hash member. * lib.c (string_own, string, string_utf8, mkustring, string_extend, replace_str, chr_str_set): Remove all initializations and updates of the removed hash member. * hash.c (equal_hash): Do not cache string hash value.
*	strings: take advantage of malloc_usable_size	Kaz Kylheku	2022-10-06	1	-13/+67
	On platforms which have the malloc_usable_size function, we don't have to store the allocated size of an object; malloc provides us the allocated size (which may be larger than we requested). Here we take advantage of this for strings. And since we don't have to store the string allocated size any more, we use that field for something else: storing the hash code (for seed zero). This can speed up some hashing operations. * configure (have_malloc_usable_size): New variable. Configure test for malloc_usable_size. We have to try several header files, too. We set the configure variable HAVE_MALLOC_USABLE_SIZE, and possibly HAVE_MALLOC_H or HAVE_MALLOC_NP_H. * lib.h (struct string): If HAVE_MALLOC_USABLE_SIZE is true, we define a member called hash instead of alloc. Also, we change alloc to cnum. * lib.c: Include <malloc_np.h> if HAVE_MALLOC_NP_H is defined. (string_own, string, string_utf8, mkstring, mkustring, init_str, string_extend, string_finish, string_set_code, string_get_code, length_str, replace_str, chr_str_set): Fix code for both cases. On platforms with malloc_usable_size, we have the allocated size from malloc, so we don't have to retrieve it from the object or store it. Any operations which mutate the string must reset the hash field to zero; zero means "hash has not been calculated". * hash.c (equal_hash): Just retrieve a string's hash value, if it is nonzero, otherwise calculate, cache it and return it. * gc.c (mark_obj): The alloc member of struct string is a machine integer now; no need to mark it.
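The caching rule in the malloc_usable_size commit above (a zero hash field means "not yet calculated", and every mutation resets it) can be sketched as below. The struct, the function names and the hash function itself are hypothetical and only illustrate the protocol; they are not the code in lib.c or hash.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cstr {                  /* hypothetical string object */
  char buf[32];
  unsigned long hash;          /* 0 means "hash has not been calculated" */
};

static int hash_computations;  /* counts how often the slow path runs */

/* Any mutation must invalidate the cached hash by resetting it to zero. */
static void cstr_set(struct cstr *s, const char *text)
{
  strncpy(s->buf, text, sizeof s->buf - 1);
  s->buf[sizeof s->buf - 1] = '\0';
  s->hash = 0;
}

/* Return the cached hash if present, otherwise calculate and cache it.
   A string whose real hash happens to be 0 is simply recalculated each time. */
static unsigned long cstr_hash(struct cstr *s)
{
  if (s->hash == 0) {
    unsigned long h = 5381;
    const char *p;

    for (p = s->buf; *p != '\0'; p++)
      h = h * 31 + (unsigned char) *p;

    hash_computations++;
    s->hash = h;
  }
  return s->hash;
}
```

The cost of this scheme is that every mutating operation has to remember the reset, which is part of what the later "revert caching of hash value" commit removes again.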
*	seq-iter: bugfix: floating-point ranges.	Kaz Kylheku	2022-09-15	1	-24/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (seq_iter_get_range_bignum): Static function renamed to seq_iter_get_range_number because it in fact generalizes to numbers. (seq_iter_peek_range_bignum): Renamed to seq_iter_peek_range_number. (seq_iter_get_rev_range_bignum): Renamed to seq_iter_get_rev_range_number. (seq_iter_peek_rev_range_bignum): Renamed to seq_iter_peek_rev_range_number. (si_range_bignum_ops): Renamed to si_range_number_ops. (si_rev_range_bignum_ops): Renamed to si_rev_range_number_ops. (seq_iter_init_with_info): Handle ranges where the from value is floating-point. Also, if the from-value is bignum that fits into cnum range, we now try to handle that as a cnum range. * tests/012/iter.tl: New tests.
*	Implement NaN boxing.	Kaz Kylheku	2022-09-13	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On platforms with 64 bit pointers, and therefore 64-bit-wide TXR values, we can use a representation technique which allows double floating-point values to be unboxed. Fixnum integers are reduced from 62 bits to 50, and there is a little more complexity in the run-time type checking and dispatch which costs extra cycles. The support is currently off by default; it must be explicitly enabled with ./configure --nan-boxing. * lib.h (NUM_MAX, NUM_MIN, NUM_BIT): Define separately for NaN boxing. (TAG_FLNUM, TAG_WIDTH, NAN_TAG_BIT, NAN_TAG_MASK, TAG_BIGMASK, TAG_BIGSHIFT, NAN_FLNUM_DELTA): New preprocessor symbols. (enum type, type_t): The FLNUM enumeration constant moves to just after LIT, so that its value is the same as TAG_FLNUM. (struct flonum): Does not exist under NaN boxing. (union obj): No fl member under NaN boxing. (tag, is_ptr): Separately defined for NaN boxing. (is_flo): New function under NaN boxing. (tag_ex): New function. It's like tag, but identifies floating-point values as TAG_FLNUM. The tag function continues to map them to TAG_PTR, which is wrong under NaN boxing, but needed in order not to separately write tons of cases in the arith.c module. (type): Use tag_ex, so TAG_FLNUM is handled, if it exists. (auto_str, static_str, litptr, num_fast, chr, c_n, c_u): Different definition for NaN boxing. (c_ch, c_f): New function. (throw_mismatch): Attribute with NORETURN. (nao): Separate definition for NaN boxing. * lib.c (seq_kind_tab): Reorder initializer to follow enum reordering. (seq_iter_rewind): use c_n and c_ch functions, since type checking has been done in those cases. The self parameter is no longer needed. (iter_more): use c_ch on CHR object. 
(equal): Use c_f accessor to get double value rather than assuming there is a struct flonum representation. (stringp): Use tag_ex, otherwise a floating-point number is identified as TAG_PTR. (diff, isec, isecp): Don't pass removed self parameter to seq_iter_rewind. * arith.c (c_unum, c_dbl_num, c_dbl_unum, plus, minus, signum, gt, lt, ge, le, numeq, logand, logior, logxor, logxor_old, bit, bitset, tofloat, toint, width, c_num, c_fixnum): Extract floating-point value using c_f accessor. Handle CHR type separately from NUM because the storage representation is no longer identical; CHR values have a two bit tag over bits where NUM has ordinary value bits. NUM is tagged at the NaN level with the upper 14 bits being 0xFFFC. The remaining 50 bits are the value. (flo): Construct unboxed float under NaN boxing by taking image of double as a 64 bit value, and adding the delta offset, then casting to the val pointer type. (c_flo): Separate implementation for NaN boxing. (integerp, numberp): Use tag_ex. * buf.c (str_buf, buf_int): Separate CHR and NUM cases, like in numerous arith.c functions. * chksum.c (sha256_hash, md5_hash): Use c_ch accessor for CHR value. * hash.c (equal_hash, eql_hash): Handle CHR separately. Use c_f accessor for floating-point value. (eq_hash): Use tag_ex and handle TAG_FLNUM value under NaN boxing. Handle CHR separately from NUM. * ffi.c (ffi_float_put, ffi_double_put, carray_uint, carray_int): Handle CHR and NUM separately. * stream.c (formatv): Use c_f accessor. * configure: disable automatic selection of NaN boxing on 64 bit platforms, for now. Add test whether -Wno-strict-aliasing is supported by the compiler, performed only if NaN boxing is enabled. We need to disable this warning because it goes off on the code that reinterprets an integer as a double and vice versa.
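The representation described in the NaN boxing commit above can be sketched in a simplified form. The commit states that a float is built by taking the 64-bit image of the double and adding a delta, and that NUM has its upper 14 bits set with 50 value bits below them. The specific delta of 2^48, the NaN canonicalization and all names below are assumptions made for this illustration; the real constants are those defined in lib.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLO_DELTA ((uint64_t) 1 << 48)    /* assumed offset, not TXR's value */
#define NUM_TAG   0xFFFC000000000000ULL   /* upper 14 bits all set */
#define NUM_MASK  0x0003FFFFFFFFFFFFULL   /* lower 50 bits: the value */

/* Box a double: take its bit image and add the delta. NaNs are first
   reduced to one canonical pattern so the result stays below NUM_TAG. */
static uint64_t box_double(double d)
{
  uint64_t bits;

  memcpy(&bits, &d, sizeof bits);
  if (d != d)
    bits = 0x7FF8000000000000ULL;
  return bits + FLO_DELTA;
}

static int is_boxed_double(uint64_t v)
{
  return v >= FLO_DELTA && v < NUM_TAG;
}

static double unbox_double(uint64_t v)
{
  uint64_t bits = v - FLO_DELTA;
  double d;

  assert(is_boxed_double(v));
  memcpy(&d, &bits, sizeof d);
  return d;
}

/* Box a 50-bit signed integer under the NUM tag. */
static uint64_t box_fixnum(int64_t n)
{
  assert(n >= -((int64_t) 1 << 49) && n < ((int64_t) 1 << 49));
  return NUM_TAG | ((uint64_t) n & NUM_MASK);
}

/* Recover the integer by sign-extending the low 50 bits. */
static int64_t unbox_fixnum(uint64_t v)
{
  uint64_t u = v & NUM_MASK;

  assert((v & NUM_TAG) == NUM_TAG);
  if (u & ((uint64_t) 1 << 49))
    return (int64_t) u - ((int64_t) 1 << 50);
  return (int64_t) u;
}
```

Using memcpy for the reinterpretation sidesteps the strict-aliasing concern that the commit handles with a compiler flag; that is a choice of this sketch, not a claim about how arith.c does it.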
*	Reduce proliferation of TAG_SHIFT.	Kaz Kylheku	2022-09-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	* arith.c (num_to_buffer, c_unum, c_dbl_num, c_dbl_unum, c_num, c_fixnum): Use c_n inline function instead of open coding exactly the same thing. * lib.c (c_chr): Likewise. * struct.c (make_struct_type, lookup_slot, lookup_static_slot_desc, static_slot_p): Likewise.
*	syntax: read and print [. x] and [. @x].	Kaz Kylheku	2022-09-08	1	-1/+7
	* lib.c (obj_print_impl): Handle (dwim . atom) syntax by printing [. atom]. Note that (dwim . @var) and (dwim . @(expr)) already print as [. @var] and [. @(expr)]; this is not new. But none of these forms are supported by reading without the accompanying change to the parser. * parser.y (dwim): Handle the [. expr] and [ . expr] syntax, so that forms like [. a] and [. @a] have print-read consistency. The motivation is to be able to use [. @args] in pattern matching to match DWIM forms; I tried that and was surprised to have it blow up in my face. * tests/012/readprint.tl: New test file. Future printer/parser changes will be tested here. Historically, changes to the syntax have not been consistently unit-tested. * y.tab.c.shipped: Regenerated.
*	New macro: close-lazy-streams.	Kaz Kylheku	2022-08-28	1	-1/+21
	* lib.c (lazy_stream_s): New symbol variable. (lazy_streams_binding): New static variable. (lazy_stream_register): New static function. (lazy_stream_cons): If the stream is associated with a lazy cons, register it with lazy_stream_register. (obj_init): gc-protect lazy_streams_binding variable. Intern the sys:lazy-streams symbol. * lib.h (lazy_streams_s): Declared. * eval.c (eval_init): Register sys:lazy-streams special variable. * stdlib/getput.tl (close-lazy-streams): New macro. * autoload.c (getput_set_entries): Trigger autoload on close-lazy-streams symbol. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New function: search-all	Kaz Kylheku	2022-08-17	1	-9/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): search-all intrinsic registered. * lib.c (search_common): New Boolean argument all, indicating whether all positions are to be returned. We must handle this in the two places where empty key and sequence are handled, and also in the main loop. A trick is used: the found variable is now bound by list_collect_decl, but not used for collecting unless all is true. (search, rsearch, contains): Pass 0 for all argument of search_common. (search_all): New function. * lib.h (search_all): Declared. * tests/012/seq.tl: New tests. * txr.1: Documented. * stdlib/doc-syms.tl: Regenerated.
*	stringp: rewrite.	Kaz Kylheku	2022-07-28	1	-5/+13
	* lib.c (stringp): Examine tag and then type separately, rather than using the canned type function. This leads to slightly nicer code, shorter by a couple of instructions.
*	New function: count.	Kaz Kylheku	2022-07-18	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general count function, with keyfun and testfun, is noticeably absent. Let's implement it. * lib.[ch] (count): New function. * eval.c (eval_init): Register count intrinsic. * tests/012/seq.tl: Some tests for count. * txr.1: Add count to count-if section. Revise documentation based on pos/pos-if. * stdlib/doc-syms.tl: Updated.
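The shape of a general count with a key function and a test function, as mentioned in the count commit above, might look like the following over a plain int array. The argument order, the defaults and every name here are assumptions for illustration; the actual TXR Lisp function is documented in txr.1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*keyfun_t)(int elem);            /* maps an element to its key */
typedef int (*testfun_t)(int item, int key);  /* compares item against a key */

static int same(int item, int key) { return item == key; }
static int identity(int elem) { return elem; }
static int abs_key(int elem) { return abs(elem); }

/* Count the elements whose key matches item under testfun.
   NULL selects the assumed defaults: equality and identity. */
static size_t count_items(int item, const int *seq, size_t len,
                          testfun_t testfun, keyfun_t keyfun)
{
  size_t i, n = 0;

  assert(seq != NULL || len == 0);

  if (testfun == NULL)
    testfun = same;
  if (keyfun == NULL)
    keyfun = identity;

  for (i = 0; i < len; i++)
    if (testfun(item, keyfun(seq[i])))
      n++;

  return n;
}
```

With the sequence 1, -1, 2, 1, counting the item 1 with the defaults matches only the two literal 1 elements, while passing `abs_key` as the key function also matches the -1.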
*	New function: str	Kaz Kylheku	2022-06-12	1	-0/+33
	The str function is like mkstring but allows a fill pattern to be specified. * eval.c (eval_init): str intrinsic registered. * lib.[ch] (str): New function. * tests/015/str.tl: New file. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New: spln and tokn functions.	Kaz Kylheku	2022-05-30	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of trying to work the new count parameter into the spl and tok functions, it's better to make new ones. * eval.c (eval_init): spln and tokn intrinsics registered. * lib.[ch] (spln, tokn): New functions. * tests/015/split.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	tok-str: takes count argument.	Kaz Kylheku	2022-05-28	1	-4/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Update registration of tok-str. * lib.c (tok_str): New argument, count_opt. Implemented in the compat 155 case; what the heck. (tok): Pass nil to new parameter of tok_str. * lib.h (tok_str): Declaration updated. * tests/015/split.tl: New tests. * txr.1: Documented.
*	split-str: new count parameter.	Kaz Kylheku	2022-05-17	1	-7/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Fix up registration of split-str to account for new parameter. * lib.c (split_str_keep): Implement new optional count argument. (spl): Pass nil value to split_str_keep for new argument. I'd like this function to benefit from this argument also, but the design isn't settled. (split_str): Pass nil argument to split_str_keep. * lib.h (split_str_keep): Declaration updated. * tests/015/split.tl: New tests. * txr.1: Documented.
*	Print ([...] . @var) and ([...] . @(expr)) notation.	Kaz Kylheku	2022-05-11	1	-2/+11
	This change fixes objects like (@a @b . @c) being printed as (@a @b sys:var c). This is piggybacked into the logic which renders dotted unquotes. In other words, we are already printing (x . ,y) in that form rather than (x sys:unquote y); we just recognize sys:var, and sys:expr in the same code. * lib.c (obj_print_impl): Recognize dotted metavariables and metaexpressions similarly to dotted unquotes.
*	New function: isecp.	Kaz Kylheku	2022-03-30	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register isecp intrinsic. * lib.c (isecp): New function. * lib.h (isecp): Declared. * stdlib/compiler.tl (lambda-apply-transform, dump-compiled-objects): Use isecp instead of isec, since the actual intersection of symbols isn't needed, only whether it exists. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New function: partition-if.	Kaz Kylheku	2022-02-23	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register partition-if intrinsic. * lib.c (partition_if_countdown_funv, partition_if_func): New functions. (partition_if): New function. * lib.h (partition_if): Declared. * tests/012/seq.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	find-max: convert to seq_info iteration.	Kaz Kylheku	2022-02-22	1	-64/+17
\| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (find_max): Simplify into a single loop rather than handling various sequence types specially. This means it works for all iterable objects now. * txr.1: find-max documentation updated; discussion of hash tables removed, since the described behavior is the one expected for hash tables as iterables. * tests/012/seq.tl: Add some test coverage.
*	New functions: find-max-key and find-min-key.	Kaz Kylheku	2022-02-21	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register find-max-key and find-min-key intrinsics. * lib.c (find_max_key, find_min_key): New functions. * lib.h (find_max_key, find_min_key): Declared. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	ffi: move socket stuff to socket module.	Kaz Kylheku	2022-02-17	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* alloca.h (zalloca): Macro moved here from ffi.c; it's useful to any code that wants to do a zero-filled alloca, and socket.c needs it now. * ffi.c (HAVE_SOCKETS): All includes conditional on HAVE_SOCKETS removed. (zalloca): Macro removed; moved to alloca.h. (ffi_type_struct_checked, ffi_type_lookup): Static functions changed to external linkage. (ffi_type_size, ffi_type_put, ffi_type_get): New functions, used by external module that has incomplete definition of struct txr_ffi_type. (type_by_size): New static array, moved out of ffi_init_extra_types function. (ffi_type_by_size): New function. (ffi_init_extra_types): type_by_size array relocated to file scope. (sock_opt, sock_set_opt): Moved to socket.c, and adjusted to use newly developed external access to needed ffi mechanisms. (ffi_init): Numerous definitions related to sockets removed; these go to socket.c. * ffi.h (struct txr_ffi_type): Declared here now as incomplete type. (ffi_type_struct_checked, ffi_type_size, ffi_type_put, ffi_type_get, ffi_type_lookup, ffi_type_by_size): Declared. * lib.c (init): Call new function sock_init. * socket.c (sock_opt, sock_set_opt): New functions, moved from ffi.c, and slightly adapted to work with external interfaces exposed by ffi.c. (sock_init): New function. This performs unconditional initializations not keyed to the lazy loading lisplib.c mechanism. Here we create the socklen-t FFI type. FFI types lookup doesn't trigger lazy loading, so we do it this way; the alternative would be to introduce lazy load triggering to FFI type lookup, just for this one type. (sock_load_init): All the socket function and variable registrations move here from ffi_init.
*	all: wrong self name.	Kaz Kylheku	2022-02-13	1	-1/+1
	* lib.c (all_satisfy): self should be "all" rather than "some".
*	lib: fix return value of separate for nil seq.	Paul A. Patience	2022-02-07	1	-1/+1
	* lib.c (separate): Return (list nil nil) instead of just nil when the sequence parameter is nil, as is documented.
*	Use null_string throughout code base.	Kaz Kylheku	2022-02-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (load): Use null_string instead of lit(""). * lib.c (obj_init): Likewise. * match.c (LOG_MATCH, LOG_MISMATCH, do_txeval): Likewise. * parser.c (regex_parse, lisp_parse_impl, find_matching_syms): Likewise. * stream.c (do_parse_mode): Likewise. * txr.c (sysroot_init): Likewise. (txr_main): Replace string(L"") with null_string.
*	New function: copy-cptr.	Kaz Kylheku	2022-01-28	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): copy-cptr intrinsic registered. * lib.c (copy_cptr): New function. (copy): Use copy_cptr for CPTR objects. * lib.h (copy_cptr): Declared. * tests/017/ffi-misc.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	Remove numerous unused global functions.	Kaz Kylheku	2022-01-23	1	-108/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.[ch] (lookup_global_var_l): Remove. * itypes.[ch] (c_schar): Likewise. * lib.[ch] (null_list, rcyc_list, gequal, func_n6v, func_n7v, func_n8v, do_pa_123_23, pa_123_23, orf, aconsql_new_c): Likewise. (obj_init): Remove references to null_list. * mpi/mpi-config.h (MP_FOR_TXR): New preprocessor symbol, defined as 1. * mpi/mpi.c (mp_get_prec, mp_set_prec, mp_init_array, mp_clear_array, mp_set_word, mp_exptmod_d, mp_cmp_d, mp_cmp_mag, mp_cmp_int, mp_lcm, mp_xgcd, mp_invmod, mp_char2value): Exclude using #if !MPI_FOR_TXR, rather than remove. We don't bother excluding the declarations in the header. * utf8.[ch] (w_freopen): Remove.
*	lib: new functions nand, nor, nandf and norf.	Paul A. Patience	2022-01-22	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (me_nand, me_nor, nor_fun, nand_fun): New functions. (eval_init): Register new intrinsics. * lib.c (nandv, norv): New functions. * lib.h (nandv, norv): Declared. * txr.1: Documented, along with trivial fixes to the descriptions of and, or, andf, orf and notf. * stdlib/doc-syms.tl: Updated.
*	cptr-int: allow full unsigned range.	Kaz Kylheku	2022-01-13	1	-1/+4
	The cptr-int function requires an address to be expressed as a signed integer, which is inconvenient. E.g. -2147483648 to 2147483647 in a 32 bit address space. Let's fix it to accept an extended range. * lib.c (cptr_int): Convert the argument value to a ucnum if it is positive according to plusp, otherwise to cnum. Then convert either one to the mem_t * pointer. Thus we can accept either signed or unsigned values. * txr.1: Document the extended range of cptr-int.
*	Copyright year bump 2022.	Kaz Kylheku	2022-01-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	*LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2022.
*	Casts have crept into the code not wrapped by macros.	Kaz Kylheku	2022-01-06	1	-5/+5
	It is against TXR coding conventions to use the C cast notation. The usage creeps into the code. To find instances of this, we must compile using GNU g++, and add -Wold-style-cast via EXTRA_FLAGS. * eval.c (prof_call): Use macro instead of cast. * ffi.c (pad_retval, ffi_varray_alloc, make_ffi_type_union, carray_dup, carray_replace, uint_carray, int_carray, put_carray, fill_carray): Likewise. * itypes.c (c_i64, c_u64): Likewise. * lib.c (cyr, chk_xalloc, split_str_keep, vector, cobj_register): Likewise. * linenoise.c (record_undo): Likewise. Also, drop one superfluous cast: wstrdup_fn returns wchar_t *. (flash, edit_insert, edit_insert_str): Use macro instead of cast. * mpi/mpi.c (s_mp_ispow2d): Likewise. * parser.c (lino_getch): Likewise. * rand.c (make_random_state, random_buf): Likewise. * stream.c (generic_get_line, do_parse_mode): Likewise. * struct.c (get_duplicate_supers, call_initfun_chain, call_postinitfun_chain): Likewise. * sysif.c (c_time): Likewise. * tree.c (tr_insert): Likewise.
*	lazy-str-get-trailing-list: spurious empty string issue.	Kaz Kylheku	2022-01-04	1	-0/+6
	* lib.c (lazy_str_get_trailing_list): Remove the spurious empty string caused by splitting on the terminator. Whenever the materialized prefix is non-empty, and there is a non-empty terminator, the prefix necessarily ends in the terminator. If we split on the terminator, the list of pieces ends in an empty string, which is undesirable. This has to be subject to compat, unfortunately; it's a very visible behavior that affects the continuation of line-based matching after the @(freeform) directive. * tests/006/freeform-5.txr: With this fix, we no longer have to match the spurious blank line coming from @(freeform). * tests/015/lazy-str.tl: New file. * txr.1: Updated documentation with compat notes. There was some outright incorrect text describing lazy-str-get-trailing-list. Also, the lazy-str-force-upto and lazy-str-force were under-documented. The return value of the former was not completely described: that it returns t in the other case when not returning nil. It wasn't mentioned that the functions observe the limit-count. Moreover, the exact algorithm for forcing is now documented.
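The spurious empty string described in the lazy-str-get-trailing-list commit above is a general property of splitting text that ends in its separator. The sketch below counts the pieces of such a split, with and without dropping the trailing empty piece; `split_count` and its flag are hypothetical names and do not reflect how lib.c implements the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the pieces obtained by splitting str on term.
   If drop_trailing_empty is nonzero, an empty final piece (which occurs
   exactly when a non-empty str ends in term) is not counted. */
static size_t split_count(const char *str, const char *term,
                          int drop_trailing_empty)
{
  size_t tlen = strlen(term);
  size_t pieces = 1;
  const char *p = str;
  const char *hit;

  assert(tlen > 0);

  while ((hit = strstr(p, term)) != NULL) {
    pieces++;
    p = hit + tlen;
  }

  /* p now points at the start of the last piece */
  if (drop_trailing_empty && *p == '\0' && pieces > 1)
    pieces--;

  return pieces;
}
```

Splitting "a\nb\n" on "\n" gives the pieces "a", "b" and an empty string; dropping the last one leaves the two lines a line-oriented consumer expects.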
*	Eliminate declaration-after-statement everywhere.	Kaz Kylheku	2021-12-29	1	-35/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The use of -ansi doesn't by itself diagnose instances of some constructs we don't want in the project, like mixed declarations and statements. * configure (diag_flags): Add -Werror=declaration-after-statement. This is C only, so filter it out for C++. Also add -Werror=vla. * HACKING: Update inaccurate statements about what dialect we are using. TXR isn't pure C90: some GCC extensions are used. We even use long long if the configure script detects it as working, and some C99 library features. * buf.c (replace_buf, buf_list): Fix by reordering. * eval.c (op_dohash, op_load_time_lit): Fix by reordering. * ffi.c (ffi_simple_release): Fix by reordering. (align_sw_get): Fix empty macro to expand to dummy declaration so a semicolon after it isn't interpreted as a statement. On platforms with alignment, remove a semicolon from the macro so that it requires one. (ffi_i8_put, ffi_u8_put): Fix by reordering. * gc.c (gc_init): Fix with extra braces. * hash.c (hash_init): Fix by reordering. * lib.c (list_collect_revappend, sub_iter, replace_str, replace_vec, mapcar_listout, mappend, mapdo, window_map_list, subst): Fix by reordering. (gensym, find, rfind, pos, rpos, in, search_common): Fix by renaming optional argument and using declaration instead of assignment. * linenoise/linenoise.c (edit_in_editor): Fix by reordering. * parser.c (is_balanced_line): Fix by reordering. * regex.c (nfa_count_one, print_rec): Fix by reordering. * signal.c (sig_mask): Fix by reordering. * stream.c (get_string): Fix by renaming optional argument and using declaration instead of assignment. * struct.c (lookup_static_slot_desc): Fix by turning mutated variable into block local. (umethod_args_fun): Fix by reordering. (get_special_slot): Fix by new scope via braces. 
* sysif.c (usleep_wrap): Fix by new scope via braces. (setrlimit_wrap): Fix by new scope via braces. * time.c (time_string_meth, time_parse_meth): Fix by reordering. * tree.c (tr_do_delete_spec): Fix by new scope via braces. * unwind.h (uw_block_beg): New macro which doesn't define RESULTVAR but expects it to refer to an existing one. (uw_block_begin): Replace do while (0) with enum trick so that we have a declaration that requires a semicolon, rather than a statement, allowing declarations to follow. (uw_match_env_begin): Now opens a scope and features the same enum trick as in uw_block_begin. This fixes a declaration-follows-statement issue in the v_output function in match.c. (uw_match_env_end): Closes scope opened by uw_match_env_begin. * unwind.c (revive_cont): Fix by introducing variable, and using new uw_block_beg macro. * vm.c (vm_execute_closure): Fix using combination of local variable and reordering.
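The "enum trick" mentioned for uw_block_begin in the commit above can be illustrated in isolation. A macro that ends in `do { ... } while (0)` expands to a statement, so under -Werror=declaration-after-statement nothing may be declared after it in the same block. Ending the macro in an enum declaration still demands the caller's semicolon but remains a declaration. `BLOCK_BEGIN` and `demo` are hypothetical names, not the unwind.h macros.

```c
#include <assert.h>

/* Defines the result variable, then ends in an incomplete enum declaration
   that is completed by the semicolon written at the call site. */
#define BLOCK_BEGIN(VAR) \
  int VAR = 0;           \
  enum { VAR ## _needs_semicolon }

static int demo(void)
{
  BLOCK_BEGIN(result);
  int x = 41;            /* legal in C90: only declarations came before */

  result = x + 1;
  assert(result == 42);
  return result;
}
```

The enumerator exists only to give the enum a body; pasting the variable name into it keeps two uses of the macro in one scope from colliding.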
*	The pairlis function comes to TXR Lisp.	Kaz Kylheku	2021-12-22	1	-0/+19
	* eval.c (eval_init): Register pairlis intrinsic. * lib.c, lib.h (pairlis): New function. * tests/012/seq.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New functions: subq, subql, subqual and subst.	Kaz Kylheku	2021-12-22	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register new intrinsics. * lib.c, lib.h (subq, subql, subqual, subst): New functions. * tests/012/seq.tl: New test cases. * stdlib/optimize.tl (subst): Function removed. The new subst drop-in replaces this one. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	less: bug, vectors not supported.	Kaz Kylheku	2021-12-20	1	-1/+2
	* lib.c (less_tab_init): Add missing initialization for VEC, with a priority above CONS: all vectors are greater than conses. The BUF priority is bumped to 7. * tests/012/less.tl: New file.
*	tree: support for duplicate keys.	Kaz Kylheku	2021-12-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* tree.c (tr_insert): New argument for allowing duplicate. If it is true, suppresses the case of replacing a node, causing the logic to fall through to traversing right, so the duplicate key effectively looks like it is greater than the existing duplicates, and gets inserted as the rightmost duplicate. (tr_do_delete_specific, tr_delete_specific): New static functions. (tree_insert_node): New parameter, passed to tr_insert. (tree_insert): New parameter, passed to tree_insert_node. (tree_delete_specific_node): New function. (tree): New parameter to allow duplicate keys in the elements sequence. (tree_construct): Pass t to tree to allow duplicate elements. (tree_init): Update registrations of tree, tree-insert and tree-insert-node. Register tree-delete-specific-node function. * tree.h (tree, tree_insert_node, tree_insert): Declarations updated. (tree_delete_specific_node): Declared. * lib.c (seq): Pass t argument to tree_insert, allowing duplicates. * parser.c (circ_backpatch): Likewise. * parser.y (tree): Pass t to new argument of tree, so duplicates are preserved in the element list of the #T literal. * y.tab.c.shipped: Updated. * tests/010/tree.tl: Test cases for duplicate keys. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
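The duplicate-key rule in the tree commit above (an equal key is treated as greater, so it lands to the right of existing duplicates) can be sketched on a plain unbalanced binary search tree. The node layout and function names are hypothetical, and the sketch ignores whatever balancing tree.c performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
  int key;
  int val;
  struct node *left, *right;
};

/* Insert key/val. Without allow_dup an equal key replaces the stored value.
   With allow_dup the equal case falls through to the right branch, so the
   new node becomes the rightmost duplicate. */
static struct node *tree_put(struct node *root, int key, int val, int allow_dup)
{
  if (root == NULL) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
      abort();
    n->key = key;
    n->val = val;
    n->left = n->right = NULL;
    return n;
  }

  if (key < root->key)
    root->left = tree_put(root->left, key, val, allow_dup);
  else if (key == root->key && !allow_dup)
    root->val = val;
  else
    root->right = tree_put(root->right, key, val, allow_dup);

  return root;
}

/* In-order walk storing values into out; returns the new element count. */
static size_t tree_vals(const struct node *root, int *out, size_t n)
{
  if (root == NULL)
    return n;
  n = tree_vals(root->left, out, n);
  out[n++] = root->val;
  return tree_vals(root->right, out, n);
}
```

An in-order walk then yields duplicates in insertion order, which matches the commit's description of a new duplicate behaving as if it were greater than the existing ones.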
*	tree-count: new function.	Kaz Kylheku	2021-12-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	* tree.c (tree_count): New function. (tree_init): tree-count intrinsic registered. * tree.h (tree_count): Declared. * lib.c (length): Support search tree argument via tree_count. * tests/010/tree.tl: Test cases for tree-count, indirectly via len. * txr.1: Documented.
*	iter-reset: gc problem.	Kaz Kylheku	2021-12-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (iter_reset): When we reinitialize the iterator, it can allocate a new secondary object, e.g. using hash_begin, which is stored into the iterator. This is potentially a wrong-way assignment in terms of GC generations and so we must call mut(iter) to indicate that the object has been suspiciously mutated. We only do this if the iterator has a mark function. If it doesn't have one, then it isn't wrapping a heap object, and so doesn't have this issue. (seq_reset): This has the same issue, and the fix is the same. Since this function is obsolescent, we don't bother doing the si->ops->mark check; we optimize for code size instead.
*	iter-begin: gc problem.	Kaz Kylheku	2021-12-17	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue 1: the seq_iter_init_with_info function potentially allocates an object via hash_begin or tree_begin and installs it into the iterator. The problem is that under iter_begin, the iterator is a heaped object; this extra allocation can trigger gc which pushes the iterator into the mature generation; yet the assignment in seq_iter_init_with_info is just a plain assignment without using the set macro. Issue 2: when gc is triggered in the above situations, it crashes due to the struct seq_iter being incompletely initialized. The mark function tries to dereference the si->ops pointer. Alas, this is initialized in the wrong order inside seq_iter_init_with_info. Concretely, tree_begin is called first, and then the it->ops = &si_tree_ops assignment is performed, which means that if the garbage collector runs under tree_begin, it sees a null it->ops pointer. However, this issue cannot just be fixed here by rearranging the code because that leaves Issue 1 unsolved. Also, this initialization order is not an issue for stack-allocated struct seq_iters. The fix for Issue 1 and Issue 2 is to reorder things in iter_begin. Initialize the iterator structure first, and then create the iterator cobj. Now, of course, that goes against the usual correct protocol for object initialization. If we just do this re-ordering naively, we have Issue 3: the familiar problem that the cobj() call triggers gc, and the iterator object (e.g. from tree_iter) that has been stored into the seq_iter structure is not visible to the GC, and is reclaimed. * lib.c (iter_begin): Reorder the calls so that seq_iter_init_with_info is called first, and then cobj is called to create the heap-allocated iterator from it, taking care of Issue 1 and Issue 2. To avoid Issue 3, after initializing the structure, we pull out the vulnerable iterator object into a local variable, and pass it to gc_hint(), to ensure that the variable is spilled into the stack, thereby protecting it from reclamation. (seq_begin): This function has exactly the same issue, fixed in the same way.
*	rot, nrot: new functions.	Kaz Kylheku	2021-12-07	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): nrot, rot intrinsics registered. * lib.c (nrot, rot): New functions. * lib.h (nrot, rot): Declared. * tests/012/seq.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	tuples*: new function.	Kaz Kylheku	2021-12-04	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register tuples* intrinsic. * lib.c (tuples_star_func): New static function. (tuples_star): New function. * lib.h (tuples_star): Declared. * tests/012/seq.tl: New test cases. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	tuples: change to abstract iteration.	Kaz Kylheku	2021-12-02	1	-10/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* lib.c (make_like): In the COBJ case, recognize an iterator object. Pull out the underlying object and recurse on it. This is needed in tuples_func, where make_like will now be called on the abstract iterator, rather than the actual sequence object. (tuples_func): The incoming object is now an iterator, and not a sequence; we need to handle it with iter_more, iter_item and iter_step. (tuples): Instead of nullify, begin iteration with iter_begin, and use iter_more to test for empty. In the non-empty case, propagate the iterator through the lazy cons car field, rather than the sequence.
*	tuples: check length argument.	Kaz Kylheku	2021-12-02	1	-0/+4
\| \| \| \| \| \| \|	* lib.c (tuples): Check that the n argument giving the tuple size is a positive integer. * tests/012/seql.tl: Test case added.
*	less: symbolic arguments: fix crash and incorrectness.	Kaz Kylheku	2021-11-01	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \|	* lib.c (less): We cannot directly access right->s.package because the right operand can be nil. This causes a crash. Furthermore, the separate NIL case is wrong. If the left object is nil, the same logic must be carried out as for SYM. The opposite operand might have the same name, and so packages have to be compared. We simply merge the two cases, and make sure we use the proper accessors symbol_name and symbol_package to avoid blowing up on nil.
*	printer: bug: fallback syms printed without prefix.	Kaz Kylheku	2021-10-12	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a basic read/print consistency problem. When a symbol is printed that is anywhere in the fallback list of the current package, we are dumping it unqualified, even if it is hidden by a same-named symbol in the current package itself or such a symbol occurring earlier in the fallback list. * lib.c (symbol_needs_prefix): When the to-be-printed symbol is found in the fallback list, re-scan the current package for a symbol having the same name, as well as the preceding nodes in the fallback list. If such a symbol is found, then the to-be-printed symbol must be package-qualified. * tests/012/syms.expected: New file. * tests/012/syms.tl: Likewise. * tests/012/compile.tl: Pull syms into compile job. * txr.1: Clarify text about this. The existing text's only reasonable interpretation supports the behavior which this patch ensures (which is needed on grounds of read/print consistency) but the text lacks precision.
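The shadowing check described in the entry above can be sketched roughly as follows. This is a simplified model, not the code of symbol_needs_prefix: packages are modeled as NULL-terminated arrays of names, and `struct package`, `has_sym` and `needs_prefix` are hypothetical names introduced for the illustration. Special cases that a real printer has to handle (keywords, uninterned symbols and so on) are ignored.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified package model: a name plus a
   NULL-terminated array of symbol names. */
struct package {
    const char *name;
    const char **syms;
};

/* Does package p contain a symbol with this name? */
static int has_sym(const struct package *p, const char *name)
{
    for (const char **s = p->syms; *s != NULL; s++)
        if (strcmp(*s, name) == 0)
            return 1;
    return 0;
}

/* Decide whether the symbol `name`, whose home package is `home`,
   must be printed with a package prefix when `cur` is the current
   package and fallback[0..nfb-1] is its fallback list. */
static int needs_prefix(const struct package *cur,
                        const struct package **fallback, size_t nfb,
                        const struct package *home, const char *name)
{
    if (home == cur)
        return 0;               /* symbol of the current package */

    for (size_t i = 0; i < nfb; i++) {
        if (fallback[i] == home) {
            /* Found in the fallback list: it may be printed
               unqualified only if no same-named symbol hides it,
               either in the current package or in an earlier
               fallback package. */
            if (has_sym(cur, name))
                return 1;
            for (size_t j = 0; j < i; j++)
                if (has_sym(fallback[j], name))
                    return 1;
            return 0;
        }
    }

    return 1;                   /* not reachable unqualified */
}
```

Under this model, a symbol from a later fallback package is printed unqualified only when reading the bare name back in the same package context would resolve to that same symbol, which is the read/print consistency property the entry is concerned with.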