summaryrefslogtreecommitdiffstats
path: root/filter.c
Commit message (Collapse)AuthorAgeFilesLines
* New function regex-from-trie.Kaz Kylheku2016-10-121-0/+50
| | | | | | | * filter.c (regex_from_trie): New static function. (filter_init): Register regex-from-trie intrinsic. * txr.1: Documented regex-from-trie.
* Synchronize license comments with LICENSE.Kaz Kylheku2016-10-011-16/+17
| | | | | | | | | | | | | | | | | | | | * Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/awk.tl, share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl, share/txr/stdlib/conv.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/socket.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/termios.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, socket.c, socket.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Revert to verbatim 2-Clause BSD.
* Header file cleanup.Kaz Kylheku2016-01-221-1/+0
| | | | | | | * arith.c, cadr.c, debug.c, eval.c, filter.c, gencadr.txr, glob.c, hash.c, linenoise/linenoise.c, lisplib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Remove unncessary header files.
* Support for Base 64 encoding.Kaz Kylheku2016-01-211-1/+142
| | | | | | | | | | | | | * filter.c (tobase64_k, frombase64_k): New keyword symbol variables. (col_check, get_base64_char, b64_code): New static functions. (base64_decode, base64_encode): New functions. (filter_init): Initialize new keyword symbol variables, and register base64-encode and base64-decode intrinsic functions. * filter.h (base64_encode, base64_decode): Declared. * txr.1: Documented base64-encode, base64-decode, as well as :tobase64 and :frombase64.
* Copyright year bump.Kaz Kylheku2015-12-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | * LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright. * linenoise/LICENSE, linenoise/linenoise.c, linenoise/linenoise.h: Bump one principal author's copyright from 2014 to 2015. The code is based on a snapshot of 2015 upstream work.
* Renaming html filters to get rid of underscore.Kaz Kylheku2015-11-201-15/+19
| | | | | | | | | | | | | | | | | * filter.c (to_html_k, from_html_k, to_html_relaxed_k): Variables removed. (tohtml_k, tohtml_star_k, fromhtml_k): New keyword symbol variables. (to_html_table): Array renamed to tohtml_table. (to_html_relaxed_table): Renamed to tohtml_star_table. (from_html_table): Renamed to fromhtml_table. (html_encode): Refers to tohtml_k. (html_encode_star): Refers to tohtml_star_k. (html_decode): Refers to fromhtml_k. (filter_init): Initialize new symbol variables. Remove old registrations. Register filters under old names too. * txr.1: Document new names.
* New :to_html_relaxed filter and html-encode*.Kaz Kylheku2015-11-201-0/+17
| | | | | | | | | | | * filter.c (to_html_relaxed_k): New keyword symbol variable. (to_html_relaxed_table): New static array. (html_encode_star): New static function. (filter_init): Initialize new symbol variable. Register new filter. Register html-encode* function. * txr.1: Documented :to_html_relaxed filter and html-encode* function.
* Filtering to HTML should include ' -> '.Kaz Kylheku2015-11-201-0/+1
| | | | * filter.c (to_html_table): Add entry for apostrophe.
* Stop using C library setjmp/longjmp.Kaz Kylheku2015-10-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TXR is moving to custom assembly-language routines. This is mainly motivated by a very dubious thing done in the GNU C Library setjmp and longjmp in the name of security. Evidently, glibc's setjmp "mangles" certain pointer values which are stored into the jmp_buf buffer. It's been that way since 2005, evidently. This means that, firstly, all along, the use of setjmp in gc.c to get registers into a buffer so they can be scanned has not actually worked properly. More importantly, this pointer mangling in setjmp and longjmp is very hostile to a stack copying implementation of delimited continuations. The reason is that continuations contain jmp_buf buffers, which get relocated in the process of capturing and reviving a continuation. Any pointers in a jmp_buf which point into the captured stack segment have to be fixed up to point into the relocated location. Mangled pointers make this difficult, requiring hacks which are specific to glibc and the machine architecture. We might as well implement a clean, well-behaved setjmp and longjmp. * Makefile (jmp.o): New object file. (dbg/%.o, opt/%.o): New rules for .S prerequisites. * args.c, arith.c, cadr.c, combi.c, cadr.c, combi.c, debug.c, eval.c, filter.c, glob.c, hash.c, lib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Removed <setjmp.h> include. * gc.c: Switch to struct jmp and jmp_save, instead of jmp_buf and setjmp. * jmp.S: New source file. * signal.h (struct jmp): New struct type. (jmp_save, jmp_restore): New function declarations denoting assembly language routines in jmp.S. (extended_jmp_buf): Uses struct jmp instead of setjmp. (extended_setjmp): Use jmp_save instead of setjmp. (extended_longjmp): Use jmp_restore instead of longjmp.
* Renaming some functions for consistency.Kaz Kylheku2015-10-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * combi.c (perm_list, perm_str, rperm_list, reperm_gen_fun, rperm_vec, comb_vec, rcomb_list, rcomb_vec, rcomb_str): Follow rename of list_vector to list_vec. * eval.c (vector_list_s): Global variable renamed to vec_list_s. (expand_qquote): Follow vector_list_s to vec_list_s. (eval_init): Follow renames of all identifiers. Functions num-chr, chr-num, vector-list and list-vector are registered under new names, while remaining registered under old names. * eval.h (vector_list_s): Declaration renamed. * filter.c (url_encode): Follow chr_num to chr_int rename. * lib.c (make_like, interpose, shuffle): Follow vector_list to vec_list rename. (tolist, replace, replace_list): Follow list_vector to list_vec rename. (num_chr): Renamed to int_chr. (chr_num): Renamed to chr_int. (vector_list): Renamed to vec_list. (list_vector): Renamed to list_vec. * lib.h (num_chr, chr_num, list_vector, vector_list): * Declarations renamed. * parser.y (vector): Follow vector_list to vec_list rename. * txr.1: Updated documentation for num-chr, chr-num, list-vector and vector-list with new names, and notes about the old names being supported, but obsolescent.
* Fix misspelled :toinger filter.Kaz Kylheku2015-09-101-3/+3
| | | | | | | | | | | | | | | | | | | The manual documents a filter called :tointeger, but it has never existed. The keyword was misspelled in the source code from the beginning as :toinger. I'm fixing the typo and renaming the filter to :toint at the same time. * filter.c (tointeger_k, toint_k): Former renamed to latter. (filter_init): Initialize toint_k with correctly spelled symbol. Register filter under toint_k. * filter.h (tointeger_k, toint_k): Former renamed to latter. * txr.1: Changed mention of :tointeger to :toint.
* C++ upkeep: resolve multiple definitions of fun_k.Kaz Kylheku2015-08-071-2/+1
| | | | | | | | | | | | | | | * eval.c (fun_k): Global definition removed. (eval_init): Do not initialize fun_k here. * filter.c (fun_k): Definition removed. (filter_init): Do not initialize fun_k. * filter.h (fun_k): Declaration removed. * lib.c (fun_k): Defined in this file now. (obj_init): Initialize fun_k here. * lib.h (fun_k): Declare here.
* * filter.c, utf8.c: Fix bad indentation introduced in whitespaceKaz Kylheku2015-07-301-1/+1
| | | | fix on 2013-08-09.
* Update copyright notices from 2014 to 2015.Kaz Kylheku2015-02-011-1/+1
| | | | | | | | | | | * arith.c, arith.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Update. * LICENSE, METALICENSE: Likewise.
* * Makefile: Removing trailing spaces.Kaz Kylheku2014-10-241-2/+2
| | | | | | | | | | (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks. * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c, * lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c, * signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.
* Converting cast expressions to macros that are retargettedKaz Kylheku2014-10-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to C++ style casts when compiling as C++. * lib.h (strip_qual, convert, coerce): New casting macros. (TAG_MASK, tag, type, wli_noex, auto_str, static_str, litptr, num_fast, chr, lit_noex, nil, nao): Use cast macros. * arith.c (mul, isqrt_fixnum, bit): Use cast macros. * configure (INT_PTR_MAX): Define using cast macro. * debug.c (debug_init): Use cast macro. * eval.c (do_eval, expand_macro, reg_op, reg_mac, eval_init): Use cast macros. * filter.c (filter_init): Use cast macro. * gc.c (more, mark_obj, in_heap, mark, sweep_one, unmark): Use cast macros. * hash.c (hash_double, equal_hash, eql_hash, hash_equal_op, hash_hash_op, hash_print_op, hash_mark, make_hash, make_similar_hash, copy_hash, gethash_c, gethash, gethash_f, gethash_n, remhash, hash_count, get_hash_userdata, set_hash_userdata, hash_iter_destroy, hash_iter_mark, hash_begin, hash_uni, hash_diff, hash_isec): Use cast macros. * lib.c (code2type, chk_malloc, chk_malloc_gc_more, chk_calloc, chk_realloc, chk_strdup, num, c_num, string, mkstring, mkustring, upcase_str, downcase_str, string_extend, sub_str, cat_str, trim_str, c_chr, vector, vec_set_length, copy_vec, sub_vec, cat_vec, cobj_print_op, obj_init): Likewise. * match.c (do_match_line, hv_trampoline, match_files, dir_tables_init): Likewise. * parser.l (grammar): Likewise. * parser.y (parse): Likewise. * rand.c (make_state, make_random_state, random_fixnum, random): Likewise. * regex.c (CHAR_SET_L2_LO, CHAR_SET_L2_HI, CHAR_SET_L1_LO, CHAR_SET_L1_HI, CHAR_SET_L0_LO, CHAR_SET_L0_HI, L0_full, L0_fill_range, L1_full, L1_fill_range, L1_contains, L1_free, L2_full, L2_fill_range, L2_contains, L2_free, L3_fill_range, L3_contains, L3_free, char_set_create, char_set_cobj_destroy, nfa_state_accept, nfa_state_empty, nfa_state_single, nfa_state_wild, nfa_state_set,
* * filter.c (filter_init): Expose the trie-lookup-begin,Kaz Kylheku2014-08-091-0/+3
| | | | | | trie-value-at and trie-lookup-feed-char functions as intrinsics. * txr.1: Document exposed functions.
* * Makefile, arith.c, arith.h, combi.c, combi.h, configure, debug.c,Kaz Kylheku2014-07-231-16/+16
| | | | | | | | debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Synchronize license header with LICENSE.
* Change to how locations are passed around, for the sake of generationalKaz Kylheku2014-03-291-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GC. The issue being solved here is the accuracy of the gc_set function. The existing impelmentation is too conservative. It has no generation information about the memory location being stored, and so it assumes the worst: that it is a location in the middle of a gen 1 object. This is sub-optimal, creating unacceptable pressure against the checkobj array and, worse, as a consequence causing unreachable gen 0 objects to be tenured into gen 1. To solve this problem, we replace "val *" pointers with a structure of type "loc" which keeps track of the object too, which lets us discover the generation. I tried another approach: using just a pointer with a bitfield indicating the generation. This turned out to have a serious issue: such a bitfield goes stale when the object is moved to a different generation. The object holding the memory location is in gen 1, but the annotated pointer still indicates gen 0. The gc_set function then makes the wrong decision, and premature reclamation takes place. * combi.c (perm_init_common, comb_gen_fun_common, rcomb_gen_fun_common, rcomb_list_gen_fun): Update to new interfaces for managing mutation. * debug.c (debug): Update to new interfaces for managing mutation. Avoid loc variable name. * eval.c (env_fbind, env_fbind): Update to new interfaces for managing mutation. (lookup_var_l, dwim_loc): Return loc type and update to new interfaces. (apply_frob_args, op_modplace, op_dohash, transform_op, mapcarv, mappendv, repeat_infinite_func, repeat_times_func): Update to new interfaces for managing mutation. * eval.h (lookup_var_l): Declaration updated. * filter.c (trie_add, trie_compress, trie_compress_intrinsic, * build_filter, built_filter_from_list, filter_init): Update to new * interfaces. * gc.c (gc_set): Rewritten to use loc type which provides the exact generation. We do not need the in_malloc_range hack any more, since we have the backpointer to the object. (gc_push): Take loc rather than raw pointer. * gc.h (gc_set, gc_push): Declarations updated. * hash.c (struct hash): The acons* functions use loc instead of val * now. (hash_equal_op, copy_hash, gethash_c, inhash, gethash_n, pushhash, Change to how locations are passed around, for the sake of generational GC. The issue being solved here is the accuracy of the gc_set function. The existing impelmentation is too conservative. It has no generation information about the memory location being stored, and so it assumes the worst: that it is a location in the middle of a gen 1 object. This is sub-optimal, creating unacceptable pressure against the checkobj array and, worse, as a consequence causing unreachable gen 0 objects to be tenured into gen 1. To solve this problem, we replace "val *" pointers with a structure of type "loc" which keeps track of the object too, which lets us discover the generation. I tried another approach: using just a pointer with a bitfield indicating the generation. This turned out to have a serious issue: such a bitfield goes stale when the object is moved to a different generation. The object holding the memory location is in gen 1, but the annotated pointer still indicates gen 0. The gc_set function then makes the wrong decision, and premature reclamation takes place. * combi.c (perm_init_common, comb_gen_fun_common, rcomb_gen_fun_common, rcomb_list_gen_fun): Update to new interfaces for managing mutation. * debug.c (debug): Update to new interfaces for managing mutation. Avoid loc variable name. * eval.c (env_fbind, env_fbind): Update to new interfaces for managing mutation. (lookup_var_l, dwim_loc): Return loc type and update to new interfaces. (apply_frob_args, op_modplace, op_dohash, transform_op, mapcarv, mappendv, repeat_infinite_func, repeat_times_func): Update to new interfaces for managing mutation. * eval.h (lookup_var_l): Declaration updated. * filter.c (trie_add, trie_compress, trie_compress_intrinsic, * build_filter, built_filter_from_list, filter_init): Update to new * interfaces. * gc.c (gc_set): Rewritten to use loc type which provides the exact generation. We do not need the in_malloc_range hack any more, since we have the backpointer to the object. (gc_push): Take loc rather than raw pointer. * gc.h (gc_set, gc_push): Declarations updated. * hash.c (struct hash): The acons* functions use loc instead of val * now. (hash_equal_op, copy_hash, gethash_c, inhash, gethash_n, pushhash,
* * eval.c (eval_init): Registration of url_encode and url_decodeKaz Kylheku2014-03-111-0/+28
| | | | | | | | | | | | | | | | moved to filter.c. * filter.c (trie_compress_intrinsic, html_encode, html_decode): New static functions. (filter_init): Register make_trie, trie_add, trie_compress_intrinsic, filter_string_tree, filter_equal, html_encode and html_decode as intrinsics. Move registration of url_encode and url_decode here. * genvim.txr: Look for registrations in filter.c too. * txr.1: Documented. * txr.vim: Updated.
* Replacing uses of the eq function which are used only as C booleans,Kaz Kylheku2014-02-221-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with just using the == operator. Removing cobj_equal_op since it's indistinguishable from eq. Streamlining missingp and null_or_missing_p. * eval.c (transform_op): eq to ==. (c_var_ops): cobj_equal_op to eq. * filter.c (trie_compress, trie_lookup_feed_char, filter_string_tree, html_hex_continue, html_dec_continue): eq to ==. * hash.c (hash_iter_ops): cobj_equal to eq. * lib.c (countq, getplist, getplist_f, search_str_tree, posq): eq to ==. (cobj_equal_op): Function removed. * lib.h (cobj_equal_op): Declaration removed. (missingp): Becomes a simple macro that yields a C boolean instead of t/nil val, because it's only used that way. (null_or_missing_p): Becomes inline function returning int. * match.c (v_output): eq to ==. * rand.c (random_state_ops): cobj_equal_op to eq. * regex.c (char_set_obj_ops, regex_obj_ops): cobj_equal_op to eq. (reg_derivative): Silly if3 expression replaced by null. (regexp): Redundant if2 expression wrapped around eq removed. * stream.c (null_ops, stdio_ops, tail_ops, pipe_ops, string_in_ops, byte_in_ops, string_out_ops, strlist_out_ops, dir_ops, cat_stream_ops): cobj_equal_op to eq. * syslog.c (syslog_strm_ops): cobj_equal_op to eq.
* The C function nullp is being renamed to null, and the rarelyKaz Kylheku2014-02-221-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | used global variable null which holds a symbol becomes null_s. A new macro called nilp is added that more efficiently checks whether an object is nil, producing a C boolean value rather than t or nil. Most of the uses of nullp in the codebase just become the more streamlined nilp. * debug.c (show_bindings): nullp to nilp * eval.c (lookup_var, lookup_var_l, lookup_fun, lookup_sym_lisp1, do_eval, expand_qquote, expand_quasi, expand_op): nullp to nilp. (op_modplace): nullp to null. (eval_init): Update registration of null and not from C function nullp to null. * filter.c (trie_compress, html_hex_continue): nullp to nil. (filter_string_tree): null to null_s. * hash.c (hash_next): nullp to nilp. * lib.c (null): Variable renamed to null_s. (code2type): null to null_s. (lazy_flatten_scan, chainv, lazy_str, lazy_str_force_upto, obj_print, obj_pprint): nullp to nilp. (obj_init): null to null_s; nullp to null. * lib.h (null): declaration changed to null_s. (nullp): Inline function renamed to null. (nilp): New macro. * match.c (do_match_line): nullp to nilp. * rand.c (make_random_state): Likewise. * regex.c (compile_regex): Likewise.
* * arith.c: Revised error messages to refer to Lisp names insteadKaz Kylheku2014-01-151-1/+1
| | | | | | | | | | of C names of functions, or otherwise clarified them. * filter.c: Likewise. * lib.c: Likewise. * match.c: Likewise.
* First cut at signal handling support.Kaz Kylheku2013-12-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Makefile (OBJS-y): Include signal.o if have_posix_sigs is "y". * configure (have_posix_sigs): New variable, set by detecting POSIX signal stuff. * dep.mk: Regenerated. * arith.c, debug.c, eval.c, filter.c, hash.c, match.c, parser.y, parser.l, rand.c, regex.c, syslog.c, txr.c, utf8.c: Include new signal.h header, now required by unwind, and the <signal.h> system header. * eval.c (exit_wrap): New function. (eval_init): New functions registered as intrinsics: exit_wrap, set_sig_handler, get_sig_handler, sig_check. * gc.c (release): Unused functions removed. * gc.h (release): Declaration removed. * lib.c (init): Call sig_init. * stream.c (set_putc, se_getc, se_fflush): New static functions. (stdio_put_char_callback, stdio_get_char_callback, stdio_put_byte, stdio_flush, stdio_get_byte): Use new functions to enable signals when blocked on I/O. (tail_strategy): Allow signals across sleep. (pipev_close): Allow signals across waitpid. (se_pclose): New static function. (pipe_close): Use new function to enable signals across pclose. * unwind.c (uw_unwind_to_exit_point): use extended_longjmp instead of longjmp. * unwind.h (struct uw_block, struct uw_catch): jb member changes from jmp_buf to extended_jmp_buf. (uw_block_begin, uw_simple_catch_begin, uw_catch_begin): Use extended_setjmp instead of setjmp. * signal.c: New file. * signal.h: New file.
* Bumping copyrights to 2014 and expressing them as year ranges.Kaz Kylheku2013-12-101-1/+1
| | | | Fixing some errors in copyright comments.
* * filter.c, utf8.c: Tabs changed to spaces. For some reason, filter.cKaz Kylheku2013-08-091-41/+41
| | | | used 4 space tabs and utf8.c used 8 space tabs, inconsistently.
* Rounding out hash table functionality with lazy lists thatKaz Kylheku2012-04-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | can walk table content in different ways. * eval.c (op_dohash): Follow interface change of hash_next. (eval_init): hash-keys, hash-values, hash-pairs and hash-alist intrinsics introduced. * filter.c (trie_compress): Follow interface change of hash_next. * hash.c (hash_next): Silly interface which takes a pointer to the iterator has changed to just take the iterator. The function unambiguously returns nil when the iteration ends, so there is no need to set the iterator variable to nil. (maphash): Follows interface change of hash_next. (hash_keys_lazy, hash_values_lazy, hash_pairs_lazy, hash_alist_lazy): New static functions. (hash_keys, hash_values, hash_pairs, hash_alist): New functions. * hash.h (hash_next): Declaration updated. (hash_keys, hash_values, hash_pairs, hash_alist): Declared. * lib.c (make_half_lazy_cons): New way of constructing lazy cons, with the car field specified. It simplifies situations when the previous cons computes the car of the next one. Why hadn't I thought of this before? * lib.h (make_half_lazy_cons): Declared. * txr.1: Doc stubs for new hash functions. * txr.vim: Highlighting for new hash functions.
* * configure: Support a gen-gc configuration variable whichKaz Kylheku2012-04-031-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | causes CONFIG_GEN_GC to be defined as 1 in config.h. * eval.c (op_defvar, dwim_loc, op_modplace, transform_op): Handle mutating assignments via set macro. (op_dohash): Inform gc about mutated variables. TODO here. * filter.c (trie_add, trie_compress): Handle mutating assignments via set macro. * gc.c (BACKPTR_VEC_SIZE, FULL_GC_INTERVAL): New preprocessor symbols. (backptr, backptr_idx, partial_gc_count, full): New static variables. (make_obj): Initialize generation to zero. (gc): Added logic for deciding between full and partial gc. (gc_set, gc_mutated): New functions. * gc.h (gc_set, gc_mutated): Declared. * hash.c (hash_mark): Changed useless use of vecref_l to vecref. (gethash_f): Use set when assigning through *found since it is a possible mutation. * lib.c (car_l, cdr_l, vecref_l): Got rid of loc macro uses. Using the value properly is going to be the caller's responsibility. (push): push may be a mutation, so use set. (intern): Uset set to mutate a hash entry. (acons_new_l, aconsq_new_l): Use set when replacing *list. * lib.h (PTR_BIT): New preprocessor symbol. (obj_common): New macro for defining common object fields. type_t is split into two bitfields, half a pointer wide, allowing for generation to be represented. (struct any, struct cons, struct string, struct sym, struct package, struct func, struct vec, struct lazy_cons, struct cobj, struct env, struct bignum, struct flonum): Use obj_common macro to defined common fields. (loc): Macro removed. (set, mut): Macros conditionally defined for real functionality. (list_collect, list_collect_nconc, list_collect_append): Replace mutating operations with set. * match.c (dest_set, v_cat, v_output, v_filter): Replace mutating operations with set. * stream.c (string_in_get_line, string_in_get_char, strlist_out_put_string, strlist_out_put_char): Replace mutating operations with set. * unwind.c (uw_register_subtype): Replace mutating operation with set.
* Filtering on lists and nested lists is hereby made to work.Kaz Kylheku2012-03-261-17/+25
| | | | | | | | | | | | | | | | | | For instance given @(bind a ("a" "b" "c")) it is now possible to do @(filter :upcase a) whereby a promptly takes on the value ("A" "B" "C"). * filter.c (string_filter): Function renamed to string_tree_filter. (compound_filter): Follows rename. (filter_string): Function renamed to filter string tree. Can filter tree of strings, or possibly other objects, if the filter function allows. (filter_equal): No special case test for objects that are strings. Just put them through the filter. * filter.h (filter_string): Declaration updated. * match.c (format_field, subst_vars, v_filter): Follow rename.
* * eval.c (eval_init): New intrinsic num-str registered.Kaz Kylheku2012-03-261-0/+9
| | | | | | | | | | | | | | | * filter.c (tonumber_k, tointeger_k, tofloat_k, hextoint_k): New keyword variables. (filter_init): New variables initialized; new filters registered. * filter.h (tonumber_k, tointeger_k, tofloat_k, hextoint_k): Declared. * lib.c (num_str): New function. * lib.h (num_str): Declared. * txr.1: New filters documented.
* * eval.c (eval_init): url_decode has two parameters now,Kaz Kylheku2012-03-181-6/+16
| | | | | | | | | | | | | | | | | | so we make the second one optional. * filter.c (topercent_k, frompercent_k): New keyword variables. (url_encode, url_decode): Take a second parameter, space_plus. This determines whether or not to apply the rule that a space encodes as a + character. (filter_init): Initialize new keyword variables, and register :topercent and :frompercent filters. Fix the previous registrations of :tourl and :fromurl using currying. * filter.h (urlencode, urldecode): Declarations updated. (topercent_k, frompercent_k): Declared. * txr.1: Documented.
* * filter.c (digit_value): static function moved.Kaz Kylheku2012-03-171-14/+12
| | | | | (html_hex_continue): Use digit_value instead of hex digits string literal.
* Version 61txr-61Kaz Kylheku2012-03-151-0/+1
| | | | | | | | | | | | | | | | | | | | | * txr.c (version): Bumped. * txr.1: Bumped version and set date. * configure (txr_ver): Bumped. * eval.c (op_modplace): Fix warning about uninitialized variable. No bug. * filter.c: gcc compilation regresion: missing <stdio.h> breaks inclusion of "stream.h" header. Strangely, didn't show up when configured for compiling with g++ on Ubuntu. * match.c (match_filter): Fixed ununsed variable warning. * txr.vim: Bunch of missing keywords added. * dep.mk: Regenerated.
* Implementing URL filtering.Kaz Kylheku2012-03-131-0/+79
| | | | | | | | | | | | | | | | * eval.c (eval_init): New intrinsic functions: url-encode, url-decode. * filter.c (tourl_k, fromurl_k): New keyword variables. (is_url_reserved, digit_value): New static functions. (url_encode, url_decode): New functions. (filter_init): Intialize new keyword variables and register new :tourl and :fromurl filters. * filter.h (tourl_k, fromurl_k, url_encode, url_decode): Declared. * txr.1: Updated. * txr.vim: Likewise.
* * arith.c: Updated copyright year.Kaz Kylheku2012-02-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * arith.h: Likewise. * debug.c: Added copyright header. * debug.h: Updated copyright year. * eval.c: Likewise. * eval.h: Likewise. * filter.c: Likewise. * filter.h: Likewise. * gc.c: Likewise. * gc.h: Likewise. * hash.c: Likewise. * hash.h: Likewise. * lib.c: Likewise. * lib.h: Likewise. * match.c: Likewise. * match.h: Likewise. * parser.h: Likewise. * regex.c: Likewise. * regex.h: Likewise. * stream.c: Likewise. * stream.h: Likewise. * txr.c: Likewise, and e-mail address. * txr.h: Updated copyright year. * unwind.c: Likewise. * unwind.h: Likewise.
* * match.c (match_funcall): Function renamed to match_filter.Kaz Kylheku2012-02-161-1/+1
| | | | | | * match.h (match_funcall): Declaration updated. * filter.c (get_filter): Updated.
* Infrastructure for storing line number informationKaz Kylheku2011-11-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | outside of the code, in hash tables. * filter.c (make_trie, trie_add): Update to three-argument make_hash. * hash.c (struct hash): New members, hash_fun, assoc_fun acons_new_l_fun. (ll_hash): Renamed to equal_hash. (eql_hash): New static function. (cobj_hash_op): Follows ll_hash rename. (hash_grow): Use new function indirection to call hashing function. (make_hash): New argument to specify type of hashing. Initialize new members of struct hash. (gethash_l, gethash, remhash): Use function indirection for hashing and chain search and update. (pushhash): New function. * hash.h (make_hash): Declaration updated with new parameter. (pushhash): Declared. * lib.c (eql_f): New global variable. (eql, assq, aconsq_new, aconsq_new_l): New functions. (make_package): Updated to new three-argument make_hash. (obj_init): gc-protect and initialize new variable eql_f. * lib.h (eql, assq, aconsq_new, aconsq_new_l): Declared. * match.c (dir_tables_init): Updated to there-argument make_hash. * parser.h (form_to_ln_hash, ln_to_forms_hash): Global variables declared. * parser.l (form_to_ln_hash, ln_to_forms_hash): New global variables. (grammar): Set yylval.lineno for tokens that are classified to that type in parser.y. (parse_init): Initialize and gc-protect new global variables. * parser.y (rl): New static helper function. (%union): New member, lineno. (ALL, SOME, NONE, MAYBE, CASES, CHOOSE, GATHER, AND, OR, END, COLLECT, UNTIL, COLL, OUTPUT, REPEAT, REP, SINGLE, FIRST, LAST, EMPTY, DEFINE, TRY, CATCH, FINALLY, ERRTOK, '('): Reclassified as lineno type. In the grammar, these keywords can thus provide a stable line number from the lexer. (grammar): Numerous rules updated to add constructs to the line number hash tables via the rl helper. * dep.mk: Updated. * Makefile (depend): Use the installed, stable txr in the system path to update dependencies rather than locally built ./txr, to prevent the problem that txr is broken because out out-of-date dependencies, and thus cannot regenerate dependencies.
* * filter.c (function_filter): Function removed.Kaz Kylheku2011-10-251-7/+1
| | | | | | | | | | | | | | (get_filter): Treat (:fun ...) syntax as a single function call with extra arguments, currying it up as curried function that invokes match_funcall once. * match.c (match_funcall): Extended to take a list of the additional arguments from get_filter. Adds these to the function call form generated for the v_func call. * match.h (match_funcall): Declaration updated. * txr.1: Function Filter additional arguments documented.
* Shorthand for filters which map multiple texts to a commonKaz Kylheku2011-10-251-2/+2
| | | | | | | | | | | | | | | | | replacement text. * filter.c (build_filter_from_list): Allow tuples to denote multiple keys mapping to the same value. * lib.c (do_curry_123_2, do_curry_123_1): New static functions. (curry_123_2, curry_123_1): New functions. * lib.h (curry_123_2, curry_123_1): New functions declared. * match.c (v_deffilter): Allow tuples of strings rather than just pairs. * txr.1: Updated.
* * filter.c (fun_k): New keyword variable.Kaz Kylheku2011-10-251-5/+6
| | | | | | | (function_filter): Use :fun keyword symbol instead of fun. (filter_init): New keyword variable initialized. * filter.h (upcase_k, downcase_k, fun_k): Declared.
* * filter.c (function_filter): New function.Kaz Kylheku2011-10-241-4/+15
| | | | | | | | | | | | | | | | | | | | | | (get_filter): Handle (fun ...) syntax. * match.c (v_bind): Establish dynamic environment frame around dest_bind, and stash the bindings there so filters can have access to the bindings. (v_output): Likewise, around do_output calls. (v_fun): New function. (match_files): Function handling broken out into v_fun. (match_funcall): New function. * match.h (match_funcall): Declared. * unwind.c (uw_push_env): Initialize match_context. (uw_get_match_context, uw_set_match_context): New functions. * unwind.h (struct uw_dynamic_env): New member, match_context. (uw_get_match_context, uw_set_match_context): Declared. * txr.1: Documented function filters.
* Task #11474Kaz Kylheku2011-10-221-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | * filter.c (filter_equal): Takes two filters instead of one. (lfilt_k, rfilt_k): New keyword variables. (filter_init): New keyword variables initialized. * filter.h (filter_equal): Declaration updated. (lfilt_k, rfilt_k): Declared. * lib.c (funcall4): New function. (do_curry_1234_34): New static function. (curry_1234_34): New function. (do_swap_12_21): New static function. (swap_12_21): New function. * lib.h (funcall4, curry_1234_34, swap_12_21): Declared. * match.c (dest_bind): Swap use the function argument swapping combinator when calling tree find such that the value being searched is on the left and pattern material is on the right. (v_bind): Implemented :lfilt and :rfilt. * txr.1: Documented :lfilt and :rfilt.
* * filter.c (get_filter_trie): Function renamed to get_filter. A filterKaz Kylheku2011-10-221-2/+21
| | | | | | | | | | | | | | is not necessarily a trie. (string_filter, compound_filter): New functions. (get_filter): Recognize a compound filters and return a function which implements it. * filter.h (get_filter_trie): Declaration renamed. * match.c (format_field, v_bind, v_output): Follow get_filter_trie rename. Error message text updated. * txr.1: Describe compound filters.
* Task #11474Kaz Kylheku2011-10-221-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | * filter.c (filter_equal): New function. (upcase_k, downcase_k): New keyword variables. (filter_init): New keyword variables initialized, and new upcase and downcase filters registered. * filter.h (filter_equal): Declared. * lib.c (tree_find): Takes new argument, the equality test function. (upcase_str, downcase_str): New functions. (do_curry_123_23): New static function. (curry_123_23): New function. * lib.h (tree_find): Declaration updated. (upcase_str, downcase_str, curry_123_23): Declared. * match.c (dest_bind): Updated to take equality function. Uses it and passes it down to tree_find. (v_bind): Filter feature implemented. (h_var, v_try): Add equal_f to dest_bind argument list. * txr.1: Updated to describe new filters and bind arguments.
* * filter.c (trie_filter_string): Fix warning about uninitializedKaz Kylheku2011-10-161-1/+1
| | | | variable (not a bug, but compiler cannot prove that).
* Following up to previous commit's TODO.Kaz Kylheku2011-10-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | * filter.c (struct filter_par): wchar_t becomes wchli_t. * lib.h (wchli_t): New type: an incomplete structure type, so that a pointer to this type is incompatible with anything else. (wli): Macro produces const wchli_t * pointer instead of const wchar_t *. (auto_str, static_str): Accept a const wchli_t * instead of const wchar_t *, making it impossible to misuse these functions by passing in a literal. * stream.c (string_out_put_char): These type changes showed this hack to have a bug. Confronted with the need to cast from const wchar_t * to const wchli_t *, it's obvious that the conversion has to be done properly with the + 1 in the one platform case, but not the other. * txr.c (version): Type changed to const wchli_t. * txr.h (version): Declaration updated.
* Ported to Cygwin.Kaz Kylheku2011-10-091-258/+258
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TODO: there should be some type safety with the new wli macro so that if it is forgotten, there will be a diagnostic. * configure (lit_align): New configuration variable and configuration test. Generates LIT_ALIGN in config.h. Fixed the integer-holds-pointer test for the different output from the nm program on Cygwin. The arrays become common symbols marked C which do not show an offset attribute, only size: one less column. * filter.c (to_html_table, from_html_table): wrap wide string literals with the wli macro. This must be done from now on for all literals and initializes of arrays that are going to be directly converted to type tagged val-s. * lib.h (wli): New macro. (auto_str, static_str, litptr, lit_noex): Handle wide literals on platforms where they are aligned to only two bytes, such that we don't have two bits in the pointer. We can still add our 11 bit type tag, but then when recovering the pointer to the data, we have may have to fix up the pointer. * parser.l: Another portability issue here. Flex generates a scanner which has #include <unistd.h> in the middle, after the source file's own #includes which can introduce macros. On Cygwin, there is some hygiene problem whereby our "noreturn" macro causes the <unistd.h> header to generate bad syntax and fail to compile. Stupid Cygwin and even stupider flex! The workaround is to include <unistd.h> at the top in the flex source. * stream.c (string_out_put_char): This is one more place where the string literal handling hack spreads. * txr.c (version): Wrap string in wli.
* * LICENSE, Makefile, configure, filter.c, filter.h, gc.c, gc.h, hash.c,Kaz Kylheku2011-10-041-1/+1
| | | | | | hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated e-mail address.
* Maintaining C++ compiling (except for two issues that willKaz Kylheku2011-10-011-2/+3
| | | | | | | | | | | | | | | | need another commit). * filter.c: Include "gc.h" for prototype of protect. (struct filter_pair): Use const wchar_t *, so we can assign literals. (html_hex_continue): Ditto. * lib.c (and): Function renamed to andf, since and is a C++ operator. * lib.h (and): Declaration renamed. * match.c (match_files): Use of and updated to andf.
* * filter.c (filters, filter_init): Serious gc bug fixed: neglected toKaz Kylheku2011-10-011-0/+2
| | | | | inform the garbage collector about the filters global variable. Ouch!