path: root/utf8.h
Commit message | Author | Age | Files | Lines
* Update copyright notices from 2014 to 2015. | Kaz Kylheku | 2015-02-01 | 1 | -1/+1
  * arith.c, arith.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Update.
  * LICENSE, METALICENSE: Likewise.
* * Makefile, arith.c, arith.h, combi.c, combi.h, configure, debug.c, | Kaz Kylheku | 2014-07-23 | 1 | -16/+16
  debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Synchronize license header with LICENSE.
* A trivial change in the UTF-8 decoder allows TXR to handle null bytes | Kaz Kylheku | 2014-02-15 | 1 | -0/+3
  in text.
  * utf8.h (UTF8_ADMIT_NUL): New preprocessor symbol.
  (utf8_decoder): New member, flags.
  * utf8.c (utf8_decoder_init): Initialize flags to 0.
  (utf8_decode): If a null byte is encountered in the input, then convert it to 0xDC00, rather than keeping it as zero, unless flags contains UTF8_ADMIT_NUL.
  * txr.1: Document handling of null bytes.
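The null-byte rule described in this commit can be pictured with a minimal sketch. This is not the code from utf8.c: the names sketch_decoder, sketch_decode_ascii and SKETCH_ADMIT_NUL are hypothetical, the flag value 1 is an assumption (the actual UTF8_ADMIT_NUL definition is not shown in this log), and only the single-byte case is covered.

```c
/* Assumed flag value; the real UTF8_ADMIT_NUL value is not shown in this log. */
#define SKETCH_ADMIT_NUL 1

/* Hypothetical stand-in for the decoder's new flags member. */
struct sketch_decoder {
  int flags;
};

/* Maps one single-byte (ASCII-range) input to a decoded character.
   A null byte becomes 0xDC00 unless the admit-NUL flag is set;
   every other byte below 0x80 passes through unchanged. */
long sketch_decode_ascii(const struct sketch_decoder *ud, unsigned char ch)
{
  if (ch == 0 && (ud->flags & SKETCH_ADMIT_NUL) == 0)
    return 0xDC00;
  return ch;
}
```

With flags left at 0, as utf8_decoder_init does according to the commit message, a null byte in the input comes out as 0xDC00; a caller that wants a literal zero has to set the flag explicitly.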
* * stream.c (remove_path, rename_path): New functions. | Kaz Kylheku | 2014-01-28 | 1 | -0/+2
  * stream.h (remove_path, rename_path): Declared.
  * utf8.c (w_remove, w_rename): New functions.
  * utf8.h (w_remove, w_rename): Declared.
  * eval.c (eval_init): Registered remove_path and rename_path as intrinsics.
* Bumping copyrights to 2014 and expressing them as year ranges. | Kaz Kylheku | 2013-12-10 | 1 | -1/+1
  Fixing some errors in copyright comments.
* * stream.c (struct stdio_handle): New member, mode. | Kaz Kylheku | 2013-11-28 | 1 | -0/+1
  (stdio_stream_mark): Mark the new member during gc.
  (stdio_seek): When we seek, we should reset the utf8 machine.
  (tail_strategy): New function.
  (tail_get_line, tail_get_char, tail_get_byte): Use tail_strategy for polling the file at EOF.
  (open_tail): Store the mode in the file handle.
  * utf8.c (w_freopen): New function.
  * utf8.h (w_freopen): Declared.
* * arith.c: Updated copyright year. | Kaz Kylheku | 2012-02-25 | 1 | -1/+1
  * arith.h: Likewise.
  * debug.c: Added copyright header.
  * debug.h: Updated copyright year.
  * eval.c: Likewise.
  * eval.h: Likewise.
  * filter.c: Likewise.
  * filter.h: Likewise.
  * gc.c: Likewise.
  * gc.h: Likewise.
  * hash.c: Likewise.
  * hash.h: Likewise.
  * lib.c: Likewise.
  * lib.h: Likewise.
  * match.c: Likewise.
  * match.h: Likewise.
  * parser.h: Likewise.
  * regex.c: Likewise.
  * regex.h: Likewise.
  * stream.c: Likewise.
  * stream.h: Likewise.
  * txr.c: Likewise, and e-mail address.
  * txr.h: Updated copyright year.
  * unwind.c: Likewise.
  * unwind.h: Likewise.
* * utf8.c (utf8_from_uc, utf8_decode): Impose a minimum value on the | Kaz Kylheku | 2012-02-02 | 1 | -1/+1
  decoded character based on which UTF-8 case it is from. This rejects overlong forms.
  * utf8.h (struct utf8_decoder): New member, wch_min.
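The minimum-value check can be illustrated with a small stand-alone sketch. It is not the decoder from utf8.c: sketch_utf8_decode_one is a hypothetical name, the one-shot buffer interface and the -1 error return are assumptions made for brevity, and only the local variable wch_min mirrors the member named in the commit. The minimums used are the standard ones for UTF-8: 0x80 for two-byte, 0x800 for three-byte and 0x10000 for four-byte sequences.

```c
#include <stddef.h>

/* Decodes one UTF-8 sequence from buf[0..len). Returns the code point,
   or -1 if the sequence is truncated, malformed, or overlong.
   Surrogates and values above 0x10FFFF are not checked in this sketch. */
long sketch_utf8_decode_one(const unsigned char *buf, size_t len)
{
  long wch, wch_min;
  size_t need, i;

  if (len == 0)
    return -1;

  if (buf[0] < 0x80) {
    return buf[0];                       /* single byte: no minimum needed */
  } else if ((buf[0] & 0xE0) == 0xC0) {
    wch = buf[0] & 0x1F; need = 1; wch_min = 0x80;
  } else if ((buf[0] & 0xF0) == 0xE0) {
    wch = buf[0] & 0x0F; need = 2; wch_min = 0x800;
  } else if ((buf[0] & 0xF8) == 0xF0) {
    wch = buf[0] & 0x07; need = 3; wch_min = 0x10000;
  } else {
    return -1;                           /* stray continuation or invalid lead byte */
  }

  if (len < need + 1)
    return -1;                           /* truncated sequence */

  for (i = 1; i <= need; i++) {
    if ((buf[i] & 0xC0) != 0x80)
      return -1;                         /* not a continuation byte */
    wch = (wch << 6) | (buf[i] & 0x3F);
  }

  /* A value below the minimum for this length is an overlong form. */
  return (wch < wch_min) ? -1 : wch;
}
```

For example, the two-byte sequence C0 80 decodes to 0, which is below the two-byte minimum of 0x80, so it is rejected rather than accepted as an alternative spelling of the null character.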
* * LICENSE, Makefile, configure, filter.c, filter.h, gc.c, gc.h, hash.c, | Kaz Kylheku | 2011-10-04 | 1 | -1/+1
  hash.h, lib.c, lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated e-mail address.
* * LICENSE, Makefile, configure, gc.c, gc.h, hash.c, hash.h, lib.c, | Kaz Kylheku | 2011-09-23 | 1 | -1/+1
  lib.h, match.c, match.h, parser.h, parser.l, parser.y, regex.c, regex.h, stream.c, stream.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Updated copyright year.
* Bump copyrights to 2010. | Kaz Kylheku | 2010-10-05 | 1 | -1/+1
* More void * to mem_t * conversion. | Kaz Kylheku | 2009-12-05 | 1 | -2/+2
* Provide both char * and unsigned char * interfaces in UTF-8 module. | Kaz Kylheku | 2009-11-14 | 1 | -4/+8
  Fix mixing of unsigned and plain char *.
* Fixed broken utf8_from. | Kaz Kylheku | 2009-11-12 | 1 | -0/+14
  Added utf8_encode, utf8_decoder_init, utf8_decode.
* Big conversion to wide characters and UTF-8 support. | Kaz Kylheku | 2009-11-11 | 1 | -0/+32
  This is incomplete. There are too many dependencies on wide character support from the C stream I/O library, and implicit use of some encoding which may not be UTF-8. The regex code does not handle wide characters properly. Character type is still int in some places, rather than wchar_t. Test suite passes though.