path: root/stream.c
Commit message | Author | Age | Files | Lines
* streams: new get-buf function. | Kaz Kylheku | 2025-06-03 | 1 | -0/+52
    * stream.c (get_buf): New function. (stream_init): Register get-buf intrinsic.
    * stream.h (get_buf): Declared.
    * stdlib/getput.tl (sys:get-buf-common): Function removed. (file-get-buf, command-get-buf, map-command-buf, map-process-buf): Use get-buf instead of sys:get-buf-common.
    * txr.1: Documented.
* streams: get-string for string byte input stream. | Kaz Kylheku | 2025-06-02 | 1 | -1/+2
    * stream.c (byte_in_ops): Wire get_string operation to generic_get_string, giving the stream get-line and get-string support.
* streams: regression: gc issue in get_string_from_stream. | Kaz Kylheku | 2025-05-31 | 1 | -1/+1
    * stream.c (get_string_from_stream_common): The so->buf = 0 assignment must precede the call to string_own(buf), because the string out stream object may already be garbage, and the string_own call will reclaim it. If we don't null out the buffer, the string will get ownership of a freed buffer. This reproduced in the CSV test case on MacOS Lion, 32 bit x86.
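The ordering rule in this fix can be illustrated with a minimal, self-contained sketch. It does not use TXR's real types or collector: struct out_stream, struct owned_string, reclaim_stream and string_take are hypothetical stand-ins, and the "collection" is simulated by having the allocating call free whatever the stream still holds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the stream and string objects. */
struct out_stream { char *buf; };
struct owned_string { char *data; };

/* Simulates the collector reclaiming a dead stream:
   it frees whatever buffer the stream still points to. */
static void reclaim_stream(struct out_stream *so)
{
    free(so->buf);
    so->buf = NULL;
}

/* Simulates an allocating call that takes ownership of buf and,
   as a side effect, may reclaim a stream that is already garbage. */
static struct owned_string *string_take(char *buf, struct out_stream *garbage)
{
    struct owned_string *s = malloc(sizeof *s);
    assert(s != NULL);
    if (garbage != NULL)
        reclaim_stream(garbage);
    s->data = buf;
    return s;
}

/* Detach the buffer from the stream BEFORE the allocating call, so the
   simulated collection finds nothing to free. With the two statements
   swapped, the string would end up owning freed memory. */
static struct owned_string *extract_string(struct out_stream *so)
{
    char *buf = so->buf;
    so->buf = NULL;
    return string_take(buf, so);
}
```

The sketch only models the hazard; the actual fix is the single reordered assignment described in the commit.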
* buf: alternative constructor with C type arguments. | Kaz Kylheku | 2025-05-30 | 1 | -2/+2
    Many functions call make_buf, having to convert C types to the Lisp arguments using num or unum. Those conversions immediately get undone inside make_buf, and are subject to a wasteful type check.
    * buf.c (make_buf_fast): New function.
    * buf.h (make_buf): Misleading parameter renamed. (make_buf_fast): Declared. (sub_buf, buf_list, make_buf_stream, buf_fash, buf_and, buf_trunc): Replace make_buf with make_buf_fast.
    * lib.c (seq_build_init): Likewise.
    * ffi.c (ffi_put): Likewise.
    * stream.c (get_line_as_buf, iobuf_get): Likewise.
    * parser.y (buflit, buflit_items): Likewise.
    * y.tab.c.shipped: Regenerated.
* buf: use C types in buf and buf_strm structures. | Kaz Kylheku | 2025-05-30 | 1 | -3/+5
    Using Lisp types for lengths, indices and buffer sizes has been awkward due to all the conversions. The code in buf.c, and other code elsewhere that touches buffers, improves overall when we revise this decision. Mostly there are fewer conversions from Lisp to C, which require a type check and a self symbol for error diagnostics, like c_unum(len, self). In a few places, there are conversions in the other direction that were not needed before, like unum(b->len). These are simpler and faster.
    * lib.h (struct buf): Members len and size change from val to ucnum (pointer-sized unsigned integer).
    * gc.c (mark_obj): No need to do anything with BUF any more; it has no Lisp object references.
    * buf.c (BUF_BORROWED): New preprocessor symbol. Because the allocation size can no longer be nil to indicate that the buffer is borrowed (the buffer object doesn't own the memory) we use this value instead: the highest value of the ucnum type. (buf_check_alloc_size): len parameter changes from cnum to ucnum. Defends against the BUF_BORROWED value. (buf_check_index): Returns ucnum rather than cnum. (err_oflow): NORETURN attribute added to prevent some spurious compiler warnings. (prepare_pattern): Drop c_unum conversion of len. (make_buf): Some locals change from cnum to ucnum. We lose an unnecessary conversion. (init_borrowed_buf): Take len as ucnum rather than val. Use BUF_BORROWED value for allocated size to indicate borrowed status. (make_borrowed_buf): Take len as ucnum. (make_duplicate_buf, make_owned_buf): Take len as ucnum, and take a self argument. Check for the size being BUF_BORROWED and reject. (make_ubuf): Parameter renamed. Lose a conversion from ucnum to val. (copy_buf): Check for allocated size being BUF_BORROWED to distinguish the two cases, rather than it being nil or not. (buf_shrink): Simplifies: loses a C to Lisp integer conversion, and no longer needs the local variable self. (buf_trim): Reject borrowed buffers by noticing the BUF_BORROWED value. (buf_do_set_len): len param becomes ucnum. Two Lisp-to-C integer conversions disappear; one C-to-Lisp moves elsewhere in the code. (buf_set_length): Use buf_check_len to check and convert incoming len to ucnum. This is an improvement over the previous approach of letting buf_do_set_len just rely on c_num conversions and their generic diagnostics. (buf_free): Check for borrowed buffer by comparing allocated size to BUF_BORROWED constant. (length_buf): ucnum to Lisp conversion now required here. (buf_alloc_size): Check for alloc_size being BUF_BORROWED and convert that to a nil return value. When returning an integer size, we need a conversion to a Lisp integer. (sub_buf): Use self symbol rather than lit("sub") when obtaining buffer handle. Use buf_check_len to validate the length and convert to C type. (replace_buf): Some cnum local variables become ucnum. We need a very careful comparison of w and l because w remains signed while l is unsigned. (buf_list): Substantially rewritten. We don't calculate the length of the sequence upfront, but extend the buffer as we add the elements to it. (buf_move_bytes): size parameter changes from cnum to ucnum. Lisp arithmetic replaced with C arithmetic; conversions eliminated. (buf_put_buf): Conversion eliminated in call to buf_move_bytes. (buf_put_bytes): Function reduced to wrapper for buf_move_bytes, since it is almost identical. The only difference is that it performs memcpy rather than memmove, which is not worth a separate function. (buf_put_i8, buf_put_u8, buf_put_char, buf_put_uchar, buf_get_i8, buf_get_u8): Simplified with C arithmetic and fewer conversions; cnum use replaced with ucnum. (buf_get_bytes): size parameter goes from cnum to ucnum. Overflow check for p + size addition added. (buf_print): Two conversions removed. (buf_str_sep): Conversion removed. (struct buf_strm): pos member changes from val to ucnum. (buf_strm_mark): Do not mark p->pos, no longer a Lisp object. (buf_strm_put_byte_callback): Lisp arithmetic removed, but a unum conversion is needed now in calling buf_put_uchar. That could be eliminated by not using the public interface. (buf_strm_get_byte_callback): Eliminate buf_check_index to validate the stream position; we simply check it against b->len. Becomes a simple one-liner. (buf_strm_get_char): Local variable index renamed to pos. Two conversions from and to Lisp eliminated, leaving no conversions. (buf_strm_unget_byte): Local variable p renamed to pos and changes from cnum to ucnum. Two conversions eliminated, leaving no conversions. (buf_strm_fill_buf): Conversions eliminated. Check for the allocated size being BUF_BORROWED, in which case we fall back on using the length. Lisp arithmetic eliminated. (buf_strm_seek): Offset calculation done with C arithmetic and bounds checks. (buf_strm_truncate): Check incoming len with buf_check_len and convert to ucnum. Lisp arithmetic and conversions eliminated; buf_do_set_len used instead of public interface buf_set_length. (buf_strm_get_error): Use C comparison rather than ge function, and convert to t or nil result. (buf_strm_get_error_str): Bug: do not call errno_to_string since buffers don't talk to an operating system API that uses errno. The only error condition is eof. Thus, return either "eof" or "no error". (make_buf_stream): Initialize pos to 0 rather than Lisp zero. (swap32, buf_str, str_buf, buf_int, buf_uint, int_buf, uint_buf): Conversions eliminated; int_buf and uint_buf use C multiplication by 8. We know this doesn't overflow because the MPI bignums restrict the number of bits to something countable by a word. (buf_compress, buf_decompress, str_compress, str_decompress): Conversions eliminated. (buf_ash, buf_fash, buf_and, buf_test, buf_or, buf_xor, buf_not, buf_trunc, buf_bitset, buf_bit, buf_zero_p, buf_count_ones, binary_width, buf_xor_pattern): Make necessary adjustments, adding and/or eliminating conversions.
    * buf.h (make_borrowed_buf, init_borrowed_buf, make_owned_buf, make_duplicate_buf, buf_put_bytes): Declarations updated.
    * lib.c (equal, less): Conversions eliminated in BUF cases.
    * eval.c (map_common): Add self argument to make_owned_buf call.
    * chksum.c (chksum_ensure_buf): len param changes from cnum to ucnum. Conversions eliminated and use of lt() switches to C less-than operator. (sha1_stream, sha1_buf, sha1, sha1_hash, sha1_end, sha256_stream, sha256_buf, sha256, sha256_hash, sha256_end, md5_stream, md5_buf, md5, md5_hash, md5_end): Adjustments: conversions eliminated. (crc32_buf): Conversion eliminated.
    * genchksum.txr: Changes to chksum.c actually made here.
    * ffi.c (ffi_buf_in, ffi_buf_get, ffi_buf_d_in, ffi_buf_d_get, buf_carray, put_carray, fill_carray, put_obj, get_obj): Simplified with removal of conversions. (fill_obj): Necessary adjustments, leaving same number of conversions.
    * hash.c (equal_hash): Remove conversion from BUF case.
    * rand.c (make_random_state): Remove conversion of seed to Lisp integer. (random_buf): Pass self to make_owned_buf.
    * strudel.c (strudel_unget_byte, strudel_fill_buf): Conversions removed, streamlining code.
    * stream.c (iobuf_get, iobuf_put): We can no longer overload the len field to serve as the linked list link, since it's no longer a pointer. We instead use the struct any union member, which has a next pointer for this purpose. Because "t.next" overlaps with "b.size", and we must not clobber the size field, we save "b.size" by copying it into "b.len". When pulling buffers from the iobuf_free_list, we restore b.size from b.len. For good measure, we add a bug_unless assertion that the size is the expected one. I ran into a test case failure while working on this due to the size being clobbered to zero, and subsequent I/O with that zero-sized buffer being interpreted as EOF.
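The sentinel idea described for BUF_BORROWED, together with an overflow-safe range check in unsigned arithmetic, can be sketched as follows. This is an illustration under assumptions, not the code in buf.c: the ucnum typedef is modeled with uintptr_t, and struct buf, init_borrowed, buf_is_borrowed, alloc_size_ok and range_ok are hypothetical simplified names. The range check here avoids the addition altogether, which is one common way to sidestep wraparound; the commit itself describes adding an overflow check for the p + size addition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a pointer-sized unsigned integer, modeled with uintptr_t. */
typedef uintptr_t ucnum;

/* The highest ucnum value is reserved to mean "memory not owned". */
#define BUF_BORROWED UINTPTR_MAX

/* Simplified buffer: plain C integers, no Lisp object references. */
struct buf {
    unsigned char *data;
    ucnum len;   /* bytes in use */
    ucnum size;  /* allocated size, or BUF_BORROWED */
};

static void init_borrowed(struct buf *b, unsigned char *data, ucnum len)
{
    assert(data != NULL || len == 0);
    b->data = data;
    b->len = len;
    b->size = BUF_BORROWED;
}

static int buf_is_borrowed(const struct buf *b)
{
    return b->size == BUF_BORROWED;
}

/* A requested allocation size must never collide with the sentinel. */
static int alloc_size_ok(ucnum size)
{
    return size != BUF_BORROWED;
}

/* Is [pos, pos + n) inside the buffer? Written with a subtraction so
   that pos + n is never computed and cannot wrap around. */
static int range_ok(const struct buf *b, ucnum pos, ucnum n)
{
    return pos <= b->len && n <= b->len - pos;
}
```

A sentinel in the value range keeps the struct free of tagged values, at the cost of one unusable size, which every size-taking entry point then has to reject.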
* utf8: move duplicated code from parser and stream. | Kaz Kylheku | 2025-05-28 | 1 | -13/+2
    * stream.c (struct byte_input_ungetch): Removed. (byte_in_unget_char_callback): Removed. (byte_in_unget_char): Use struct utf8_tiny_buf instead of struct byte_input_ungetch, and use utf8_tiny_buf_putc instead of byte_in_unget_char_callback.
    * parser.c (struct shadow_ungetch): Removed. (shadow_unget_char_callback): Removed. (shadow_unget_char): Use struct utf8_tiny_buf instead of struct shadow_ungetch, and use utf8_tiny_buf_putc instead of shadow_unget_char_callback.
    * utf8.c (utf8_tiny_buf_putc): New function, identical to shadow_unget_char_callback and byte_in_unget_char_callback.
    * utf8.h (struct utf8_tiny_buf): New struct type, identical to removed struct shadow_ungetch and struct byte_input_ungetch. (utf8_tiny_buf_putc): Declared.
* streams: bugfix in string input seek error diagnostic. | Kaz Kylheku | 2025-05-28 | 1 | -1/+1
    * stream.c (string_in_seek): Update the value of len with the calculated value so that the out-of-bounds seek diagnostic shows the string length rather than nil.
* streams: seek operation for string byte input stream. | Kaz Kylheku | 2025-05-28 | 1 | -1/+37
    * stream.c (byte_in_seek): New function. (byte_in_ops): Wire in byte_in_seek.
    * tests/018/streams.tl: New tests.
* streams: seek-stream must fail on extracted string out stream. | Kaz Kylheku | 2025-05-26 | 1 | -0/+3
    * stream.c (string_out_seek): Do the extracted error check so that we fail if the data has already been removed from the stream. There is a test case already for this, which is failing.
* streams: seek and truncate ops for string streams. | Kaz Kylheku | 2025-05-25 | 1 | -18/+137
    This patch makes seek-stream and truncate-stream work for string output streams, and seek-stream for string input streams.
    * stream.c (string_in_seek): New static function. (string_in_ops): Wire seek operation to string_in_seek. (struct string_out): New member, len. Keeps track of the length of the string, so that fill can be freely positioned. (string_out_put_string): We can no longer add the string with a null terminator, because the put operation could be happening at any position. We only add the null terminator when we are writing the data at the end. This function also now supports buffer extension: the seek operation can seek beyond the current string. The seek operation then calls string_out_put_string with a null string. This function then grows the buffer as needed. In that case there is a need to fill the space with space characters. (string_out_truncate, string_out_seek): New static functions. (string_out_ops): Wire in string_out_seek and string_out_truncate. (make_string_output_stream): Initialize new so->len member to zero. (get_string_from_stream_common): New function, renamed from get_string_from_stream, and taking a parameter to optionally request non-destructive readout. (stream_init): Update registration of get-string-from-stream to get_string_from_stream_common.
    * stream.h (get_string_from_stream_common): Declared. (get_string_from_stream): Becomes inline function which calls get_string_from_stream_common, defaulting the argument. The reason I didn't add the argument to get_string_from_stream is to avoid having to edit numerous calls to get_string_from_stream throughout the code base.
    * tests/018/streams.tl: New tests.
    * txr.1: Documentation updated to correct text claiming that string streams don't support truncate-stream and seek-stream, and describe the support in detail.
* streams: implement get_string using get_string virtual op. | Kaz Kylheku | 2025-05-24 | 1 | -12/+6
    * stream.c (get_string): Replace inefficient loop that pushes characters into a string output stream with a call to ops->get_string. Because that virtual can return nil at end-of-stream or if nchars is zero, we check for that and convert that to an empty string return, since get-string never returns nil.
    * txr.1: Clarify that get-string never returns nil, and also document the conditions under which get-delimited-string (the most transparent wrapper around the get_string virtual) does return nil.
* streams: replace get_line virtual with new interface. | Kaz Kylheku | 2025-05-24 | 1 | -33/+107
    * stream.h (struct strm_ops): The simple get_line virtual is replaced by get_string, which takes a character limit and a delimiting stop character. (strm_ops_init): Rename get_line parameter to get_string. (get_string_s): Declared. (generic_get_line): Declaration removed. (generic_get_string, get_delimited_string): Declared.
    * stream.c (get_string_s): New symbol variable. (unimpl_get_line): Function removed. (unimpl_get_string): New function. (null_get_line): Function removed. (null_get_string): New function. (fill_stream_ops): Configure ops->get_string rather than ops->get_line. (null_ops): Wire null_get_string in place of null_get_line. (generic_get_line): Renamed to generic_get_string. (generic_get_string): Implement the limit and stop_char parameters. (get_line_limited_check): New static function. (stdio_ops): Wire in generic_get_string instead of generic_get_line. (tail_get_line): Replaced by tail_get_string. (tail_get_string): Call generic_get_string instead of generic_get_line, and pass the limit and stop_char arguments down. (tail_ops): Wire in tail_get_string instead of tail_get_line. (pipe_ops): Wire generic_get_string instead of generic_get_line. (dir_get_line): Renamed to dir_get_string. (dir_get_string): Use get_line_limited_check to defend against unhandled argument values. (dir_ops): Wire dir_get_string instead of dir_get_line. (string_in_get_line): Replaced by string_in_get_string. (string_in_get_string): Implement limit and stop_char parameters. (string_in_ops): Wire string_in_get_string instead of string_in_get_line. (strlist_in_get_line): Replaced with strlist_in_get_string. (strlist_in_get_string): Use get_line_limited_check to defend against unsupported arguments. (strlist_in_ops): Wire in strlist_in_get_string instead of strlist_in_get_line. (cat_get_line): Replaced by cat_get_string. (cat_get_string): Rather than recursing into the get_line public interface, we fetch the stream's get_string virtual and pass all arguments to it. (cat_stream_ops): Wire cat_get_string instead of cat_get_line. (record_adapter_get_line): Replaced by record_adapter_get_string. (record_adapter_get_string): Use get_line_limited_check to guard against unsupported arguments. (record_adapter_ops): Wire record_adapter_get_string instead of record_adapter_get_line. (get_line): Implement using get_string virtual now. We pass UINT_PTR_MAX as limit, which means no character limit, and '\n' as the delimiter for reading a line. (get_delimited_string): New function, which exposes the full semantics of the get_string virtual. (stream_init): Initialize get_string_s. Register get-delimited-string function. Use get_string_s symbol in registration of get-string.
    * strudel.c (strudel_get_line): Replaced by strudel_get_string. (strudel_get_string): Look up the get-string method and pass all arguments to it, encoded into Lisp values in the right way, nil indicating not present. (strudel_ops): Wire strudel_get_string in place of strudel_get_line.
    * parser.c (shadow_ops_template): Replace generic_get_line with generic_get_string.
    * buf.c (buf_strm_ops): Likewise.
    * socket.c (dgram_strm_ops): Likewise.
    * gzio.c (gzio_ops_rd): Likewise.
    * stdlib/stream-wrap.tl (stream-wrap get-line): Method replaced by (stream-wrap get-string). This calls get-delimited-string rather than get-line.
    * tests/018/streams.tl: New tests, mainly concerned with the new logic in the string input stream which has its own implementation of get_string with several cases.
    * txr.1: Document new get-delimited-string function, and the get-string method of the delegate stream, removing the documentation for removed get-line method.
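A read operation parameterized by a character limit and a stop character, as this commit describes for the get_string virtual, might look like the following sketch over an in-memory source. Everything here is hypothetical: struct src and read_delimited are illustrative names, the return convention (-1 standing in for a nil return) is an assumption, and so is the choice to consume the stop character without storing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory character source. */
struct src {
    const char *data;
    size_t len;
    size_t pos;
};

/* Reads at most `limit` characters into `out` (capacity `cap`, including
   the terminating NUL). Reading stops after the stop character has been
   consumed; the stop character is not stored. A negative stop_ch means
   "no delimiter". Returns the number of characters stored, or -1 when
   the limit is zero or the source is already exhausted. */
static long read_delimited(struct src *s, char *out, size_t cap,
                           size_t limit, int stop_ch)
{
    size_t n = 0;

    assert(cap > 0);

    if (limit == 0 || s->pos >= s->len)
        return -1;

    while (n < limit && n + 1 < cap && s->pos < s->len) {
        char ch = s->data[s->pos++];
        if (stop_ch >= 0 && ch == stop_ch)
            break;
        out[n++] = ch;
    }

    out[n] = '\0';
    return (long) n;
}
```

Under these assumptions, reading a line is the special case of an effectively unlimited `limit` (SIZE_MAX in this sketch) with '\n' as the stop character, and an empty line yields 0 rather than -1, which keeps "empty" distinguishable from "exhausted".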
* streams: fill_buf for string byte input streams. | Kaz Kylheku | 2025-05-24 | 1 | -1/+20
    * stream.c (byte_in_fill_buf): New function. (byte_in_ops): Wire byte_in_fill_buf in place of generic_fill_buf.
* parser: recover stream data buffered by lexer. | Kaz Kylheku | 2025-05-23 | 1 | -4/+4
    After parsing out of a stream, we attach a shadow stream which temporarily patches the operations to make it appear that bytes that were taken into the lexer have been pushed back. This lets us call an ordinary input operation after a parsing operation to read the data immediately following the parsed-out construct. (Almost: there is still the issue of the parser consuming one token of lookahead in some situations.)
    * stream.h (struct strm_base): New member shadow_obj. This is a context pointer used by the shadow stream operations. (generic_fill_buf): Declare previously internal function.
    * stream.c (strm_base_init): Initialize shadow_obj to null. (generic_fill_buf): Function changed to external linkage. Also, reloads the ops pointer from the stream on each loop iteration. This is because it can change; part of the buffer may be filled by shadow_get_byte, which can detach the shadow operations, so then the rest of the buffer is filled by something else like stdio_get_byte. (generic_get_line): Reload ops in the loop, like in generic_fill_buf, for the same reason.
    * parser.l: Include <stddef.h> for ptrdiff_t. (scanner_has_buffered_bytes, scanner_get_buffered_bytes): New functions.
    * parser.c (SHADOW_TAB_SIZE): New preprocessor symbol. (shadow_tab): New static array. (struct shadow_context, struct shadow_ungetch): New struct types. (lisp_parse_impl): After calling parse, call parser_shadow_stream_attach to attach the shadow stream context and operations onto the stream. (shadow_detach, shadow_destroy_op, shadow_mark_op, shadow_put_string, shadow_put_char, shadow_put_byte, shadow_get_char_callback, shadow_get_char, shadow_unget_char_callback, shadow_unget_char, shadow_get_byte, shadow_unget_byte, shadow_put_buf, shadow_close, shadow_flush, shadow_seek, shadow_truncate): New static functions. (shadow_ops_template): New static structure. (customize_shad_ops): New static function. (parser_shadow_stream_attach): New function. (parser_free_all): New function.
    * parser.h (scanner_has_buffered_bytes, scanner_get_buffered_bytes, parser_shadow_stream_attach, parser_free_all): Declared.
    * txr.c (free_all): Call parser_free_all.
    * tests/018/streams.tl: New test cases.
    * lex.yy.c.shipped: Regenerated.
* streams: fix uninitialized last_op in stdio streams. | Kaz Kylheku | 2025-05-23 | 1 | -6/+0
    * stream.c (tail_strategy, make_stdio_stream_common): A recent change introduced this problem. Initialization of h->last_op must no longer be compile-time conditional on CONFIG_STDIO_STRICT, because it is used in non-strict mode also.
* streams: move utf8 decoder into strm_base. | Kaz Kylheku | 2025-05-23 | 1 | -19/+16
    This refactoring is required because some upcoming hack work is going to depend on the assumption that every stream object has a utf8_decoder_t.
    * utf8.h (utf8_decoder_initializer): New macro, used when initializing strm_base.
    * stream.h: Some of the content is now hidden unless the preprocessor symbol STREAM_IMPL is defined. This is motivated by the fact that stream.h now depends on utf8.h. When STREAM_IMPL is not defined, the bits that depend on utf8.h are disabled and we don't have to touch files which include "stream.h" which don't refer to those bits. (struct strm_base): New member ud, of type utf8_decoder_t.
    * stream.c: Include "utf8.h" before "stream.h" and ensure STREAM_IMPL is defined. (strm_base_init): Initialize ud member of struct strm_base using utf8_decoder_initializer. (struct stdio_handle, struct byte_input, struct string_out): Remove member ud. (stdio_switch, stdio_seek, stdio_get_char, stdio_get_byte, stdio_unget_byte, stdio_fill_buf, tail_strategy, byte_in_get_char, string_out_byte_flush): Refer to new ud member in base. (make_stdio_stream_common, make_string_byte_input_stream, make_string_output_stream): No need to call utf8_decoder_init since strm_base_init takes care of it.
    * buf.c, gzio.c, hash.c, lib.c, parser.c, regex.c, socket.c, struct.c, strudel.c, syslog.c, tree.c: Move #include "utf8.h" above "stream.h", or in some cases add it. Define STREAM_IMPL before #include "stream.h".
* streams: get-char for string byte input streams. | Kaz Kylheku | 2025-05-22 | 1 | -2/+59
    String byte input streams extended to provide character input (get-char), and any mixture of unget-byte and unget-char. Also fill-buf is supported.
    * stream.c (struct byte_input): New utf8_decoder_t member ud. (struct byte_input_ungetch): New struct type. (byte_in_get_char_callback, byte_in_get_char, byte_in_unget_char_callback, byte_in_unget_char): New functions. (byte_in_ops): Wire in byte_in_get_char, byte_in_unget_char and byte_in_unget_byte. Also generic_fill_buf. (make_string_byte_input_stream): Initialize the UTF-8 decoder.
    * tests/018/streams.tl: New tests.
    * txr.1: Documented.
* streams: improve pushback semantics of stdio streams. | Kaz Kylheku | 2025-05-22 | 1 | -25/+45
    * utf8.[ch] (utf8_getc, utf8_ungetc): New functions which allow the push-back buffer of the decoder to be accessed. We can use the decoder's push-back buffer to implement a stream's byte push-back, so that the behavior is then consistent: invalid bytes pushed back by the decoder are treated uniformly with bytes pushed back using unget-char.
    * stream.c (stdio_switch): Bugfix: reset the UTF8 decoder when changing direction. Without this, it is possible that pushed back bytes in the decoder's buffer will be read, even though write operations moved the position. Thus stdio_switch is now defined as a function regardless of whether CONFIG_STDIO_STRICT is in effect. (stdio_get_byte): If there are pushed back characters present, throw an error. Otherwise, try to get a byte from the UTF8 buffer's pushback first via utf8_getc. If that produces something, just return it. Otherwise fall back on reading from the stdio stream. (stdio_unget_byte): If there are pushed back characters present, throw an error. Otherwise push back the character using utf8_ungetc. If that reports no space, throw an error. (stdio_fill_buf): Take bytes from the push-back buffer in the UTF8 decoder first, then fread the rest from the stdio stream, if necessary.
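The "push-back first, then the underlying source" order described for stdio_get_byte and stdio_fill_buf can be sketched with a small reader. This is a simplified illustration, not the TXR code: struct pb_reader and the pb_* functions are hypothetical names, an in-memory array stands in for the stdio stream, and the push-back area is modeled as a small LIFO stack rather than the UTF-8 decoder's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PB_MAX 8

/* Hypothetical reader: a byte source plus a small push-back stack. */
struct pb_reader {
    const unsigned char *src;  /* stands in for the underlying stream */
    size_t len;
    size_t pos;
    unsigned char pb[PB_MAX];
    size_t npb;
};

/* Push a byte back; returns 0 when there is no space, so that the
   caller can report an error. */
static int pb_unget(struct pb_reader *r, int byte)
{
    if (r->npb == PB_MAX)
        return 0;
    r->pb[r->npb++] = (unsigned char) byte;
    return 1;
}

/* Get one byte: pushed-back bytes take priority over the source.
   Returns -1 at end of input. */
static int pb_get(struct pb_reader *r)
{
    if (r->npb > 0)
        return r->pb[--r->npb];
    if (r->pos < r->len)
        return r->src[r->pos++];
    return -1;
}

/* Fill dst with up to n bytes: drain the push-back stack first, then
   bulk-copy the remainder from the source. Returns the byte count. */
static size_t pb_fill(struct pb_reader *r, unsigned char *dst, size_t n)
{
    size_t i = 0;

    while (i < n && r->npb > 0)
        dst[i++] = r->pb[--r->npb];

    if (i < n) {
        size_t avail = r->len - r->pos;
        size_t take = (n - i < avail) ? n - i : avail;
        memcpy(dst + i, r->src + r->pos, take);
        r->pos += take;
        i += take;
    }

    return i;
}
```

Routing both single-byte reads and bulk fills through the same push-back area is what makes the two paths agree on the order in which bytes are seen.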
* quasistrings: support format notation. | Kaz Kylheku | 2025-05-18 | 1 | -0/+7
    This patch adds support to quasiliterals to have the inserted items formatted via a format conversion specifier, for example @~3,3a:abc is @abc modified by ~3,3a format conversion. When the inserted value is a list, the conversion is distributed over the elements individually. Otherwise it applies to the entire item.
    * eval.c (fmt_tostring, fmt_cat): Take additional format string argument. If it isn't nil, then do the string conversion via the fmt1 function rather than tostring. (do_format_field): Take format string argument, and pass down to fmt_cat. (format_field): Take format string argument and pass down to do_format_field. (fmt_simple, fmt_flex): Pass nil format string argument to fmt_tostring. (fmt_simple_fmstr, fmt_flex_fmstr): New static functions, like fmt_simple and fmt_flex but with format string arg. Used as run-time support for compiler-generated quasilit code for cases when format conversion specifier is present. (subst_vars): Extract the new format string from each variable item. Pass it down to fmt_tostring, format_field and fmt_cat. (eval_init): Register sys:fmt-simple-fmstr and sys:flex-fmstr intrinsics.
    * eval.h (format_field): Declaration updated.
    * lib.c (out_quasi_str_sym): Take format string argument. If it is present, output it after the @, followed by a colon, to reproduce the read notation. (out_quasi_str): Pass down the format string, taken from the fourth element of a sys:var item of the quasiliteral. For simple symbolic items, pass down nil.
    * match.c (tx_subst_vars): Pass nil as new argument of format_field. The output variables of the TXR Pattern language do not exhibit this feature.
    * parser.l (FMT): New pattern for matching the format string part. (grammar): The rule which recognizes @ in quasiliterals optionally scans the format notation, and turns it into a string attached to the token's semantic value, which is now of type val (see parser.y remarks).
    * parser.y (tokens): The '@' token's %type changed from lineno to val so it can carry the format string. (q_var): If format string is present in the @ symbol, then include it as the fourth element of the sys:var form. This rule handles braced items. (meta): We can no longer get the line number from the @ item, so we get it from n_expr. (quasi_item): Similar to q_var change here. This handles @ followed by unbraced items: symbols and other expressions.
    * stdlib/compiler.tl (expand-quasi-mods): Take format string argument. When the format string is present, then generate code which uses the new alternative run-time support functions, and passes them the format string as an argument. (expand-quasi-args): Extend the sys:var match to extract the format string if it is present. Pass it down to expand-quasi-mods.
    * stdlib/match.tl (expand-quasi-match): Add an error case diagnosing the situation when the program tries to use a format-conversion-endowed item in a quasilit pattern.
    * stream.[ch] (fmt1): New function.
    * tests/012/quasi.tl: New tests.
    * txr.1: Documented.
    * lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
* get-csv: further get-char optimization. | Kaz Kylheku | 2025-01-30 | 1 | -15/+6
    Another 5-6% gained from this.
    * stream.c (us_get_char, us_unget_char): Static functions removed. (get_csv): Retrieve the get_char and unget_char pointers from the strm_ops structure outside of the loop, and then just call these pointers. Careful: the unget_char virtual has reversed parameters compared to the global function.
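The optimization described here, fetching a virtual function pointer from the operations table once and calling it directly inside the loop, can be sketched as follows. The types are hypothetical stand-ins (struct stream, struct stream_ops, mem_get_char, count_commas), not TXR's strm_ops. The sketch assumes the operations table does not change while the loop runs; code that can swap a stream's operations mid-read would have to reload the pointer instead.

```c
#include <assert.h>
#include <stddef.h>

struct stream;

/* Hypothetical operations table with one virtual function. */
struct stream_ops {
    int (*get_char)(struct stream *);
};

/* Hypothetical stream reading from a NUL-terminated string. */
struct stream {
    const struct stream_ops *ops;
    const char *data;
    size_t pos;
};

/* Returns the next character, or -1 at the end of the string. */
static int mem_get_char(struct stream *s)
{
    char c = s->data[s->pos];
    if (c == '\0')
        return -1;
    s->pos++;
    return (unsigned char) c;
}

static const struct stream_ops mem_ops = { mem_get_char };

/* Counts commas in the rest of the stream. The virtual is looked up
   once, before the loop, so each iteration is a single indirect call
   with no repeated table lookup or argument checking. */
static size_t count_commas(struct stream *s)
{
    int (*get_char)(struct stream *) = s->ops->get_char;
    size_t n = 0;
    int ch;

    assert(get_char != NULL);

    while ((ch = get_char(s)) != -1)
        if (ch == ',')
            n++;

    return n;
}
```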
* get-csv: use unsafe version of string-extend. | Kaz Kylheku | 2025-01-30 | 1 | -5/+5
    Another almost 16% speedup.
    * lib.c (us_length_STR): New static function. (string_extend): Use us_length_STR, since we know the object is of type STR. (us_string_extend_STR_CHR): New function. (length_str): Handle STR case via us_length_STR.
    * lib.h (us_string_extend_STR_CHR): Declared.
    * stream.c (get_csv): Use us_string_extend_STR_CHR instead of string_extend.
* get-csv: speed up with unsafe get-char. | Kaz Kylheku | 2025-01-30 | 1 | -4/+17
    I'm seeing about a 6% improvement in get-csv from this.
    * stream.c (us_get_char, us_unget_char): New static functions, which assume all arguments have correct type. (get_csv): If we use source_opt, validate that it's a stream with class_check. Use us_get_char and us_unget_char.
* get-csv: bugfix: return nil on EOF. | Kaz Kylheku | 2025-01-24 | 1 | -3/+9
    * stream.c (get_csv): Let's add a new state init. If get_char returns nil and we are in the init state, let's bail to a nil return. While we are at it, let's not allocate the record or string until we read at least one character. If we read a character in the init state, let's allocate those two objects, and then change to the rfield state and fall through to it to handle the character.
    * tests/010/csv.tl: Fix one incorrect test: (tocsv "") now returns nil, as it should. Add tests for multiple record extraction, also covering missing line termination on the last record as well as CR-LF termination.
    * txr.1: Documented nil return conditions.
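The init-state technique described in this fix can be sketched with a reduced state machine. This is a hypothetical simplification, not get_csv itself: csv_count_fields only counts the fields of the next record in a NUL-terminated string instead of building field strings, the state names are illustrative, and a return of 0 stands in for the nil return at end of input.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative states: nothing read yet, regular field, inside a
   quoted field, and just after a quote inside a quoted field. */
enum csv_state { CSV_INIT, CSV_RFIELD, CSV_QFIELD, CSV_QUOT };

/* Counts the fields of the next record starting at *pp and advances
   *pp past it. Returns 0 if the input is exhausted before a single
   character is read, which is the "init state at end of input" case. */
static size_t csv_count_fields(const char **pp)
{
    const char *p = *pp;
    enum csv_state st = CSV_INIT;
    size_t fields = 0;
    int done = 0;

    assert(p != NULL);

    while (!done) {
        int ch = (unsigned char) *p;

        /* End of input is handled in one place, outside the switch. */
        if (ch == 0)
            break;
        p++;

        switch (st) {
        case CSV_INIT:
            /* First character seen: only now does a record exist. */
            fields = 1;
            st = CSV_RFIELD;
            /* fall through to handle the character as a field char */
        case CSV_RFIELD:
            if (ch == ',')
                fields++;
            else if (ch == '"')
                st = CSV_QFIELD;
            else if (ch == '\n')
                done = 1;
            break;
        case CSV_QFIELD:
            if (ch == '"')
                st = CSV_QUOT;
            break;
        case CSV_QUOT:
            if (ch == '"') {
                st = CSV_QFIELD;   /* doubled quote: literal quote */
            } else if (ch == ',') {
                fields++;
                st = CSV_RFIELD;
            } else if (ch == '\n') {
                done = 1;
            } else {
                st = CSV_RFIELD;
            }
            break;
        }
    }

    *pp = p;
    return fields;
}
```

Deferring all per-record setup to the first character is what lets an empty input be told apart from a record consisting of one empty field.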
* New functions for producing CSV. | Kaz Kylheku | 2025-01-24 | 1 | -0/+42
    * stream.c (put_csv, tocsv): New functions. (stream_init): put-csv and tocsv intrinsics registered.
    * stream.h (put_csv, tocsv): Declared.
    * tests/010/csv.tl (mtest-pcsv): New macro. New test cases.
    * txr.1: Documented.
* get-csv: refactor into switches. | Kaz Kylheku | 2025-01-21 | 1 | -19/+26
    * stream.c (get_csv): All cases handle end-of-stream the same way, so we check for nil outside of the case switch. Then only characters need to be handled, so we can call c_chr(ch) and switch on it.
* get-csv: rewrite in C. | Kaz Kylheku | 2025-01-21 | 1 | -0/+78
    * autoload.c (csv_set_entries, csv_instantiate): Functions removed. (autoload_init): Autoload registration for stdlib/csv removed.
    * stdlib/csv.tl: File removed.
    * stream.c (get_csv): New function. (stream_init): Register get-csv intrinsic.
    * stream.h (get_csv): Declared.
* Copyright year bump 2025. | Kaz Kylheku | 2025-01-01 | 1 | -1/+1
    * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright bumped to 2025.
* json: new special var *print-json-type*.  Kaz Kylheku  2024-07-12  1  -1/+2
  This variable controls whether we emit the "__type" key for structures.
  * lib.c (out_json_rec): React to the new variable, via the flag in the json_opts structure: include the "__type" key only if it is requested. (out_json, put_json): Initialize the type flag in the json_opts according to the *print-json-type* dynamic variable.
  * stream.c (print_json_type_s): New symbol variable. (stream_init): print_json_type_s initialized, and corresponding special variable registered, with initial value t.
  * stream.h (struct json_opts): New bitfield member, type. (print_json_type_s): Declared.
  * txr.1: Documented.
* open-process: new ?fdno option for selecting stream fd.  Kaz Kylheku  2024-06-26  1  -4/+20
  If, for instance, ?2 is specified in the mode string argument of open-process and related functions, this means that file descriptor 2 of the process will be used as the data source (or sink) for the stream that is returned by the function. With this feature we can easily read the standard error of a process while leaving its standard output unredirected.
  * stream.c (do_parse_mode): Parse the ? mode option. (open_subprocess): Check for the presence of the alternative file descriptor in the stdio_mode structure, and use it instead of STDIN_FILENO or STDOUT_FILENO.
  * stream.h (struct stdio_mode): New member, streamfd. (stdio_mode_init_blank, stdio_mode_init_r, stdio_mode_init_rpb, stdio_mode_init_blank, stdio_mode_init_r, stdio_mode_init_rpb): Update initializer macros to cover the new member, setting it to the default value -1 (not specified).
  * txr.1: Documented.
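As a rough illustration of the ?fdno option described in this commit, the sketch below scans a mode string for a ? followed by decimal digits and records the descriptor number. The names struct mode_sketch and parse_streamfd are hypothetical, and the handling of a ? with no digits is an assumption; the real do_parse_mode in stream.c handles many more options and may differ in detail. Only the -1 default for an unspecified descriptor is taken from the commit message.

```c
#include <ctype.h>

/* Hypothetical stand-in for the relevant part of struct stdio_mode. */
struct mode_sketch {
    int malformed;  /* assumption: set when '?' is not followed by digits */
    int streamfd;   /* -1 means "not specified", as in the commit message */
};

/* Scan a mode string such as "r?2" for a '?' followed by decimal digits.
 * Assumption: all other mode characters are simply skipped here. */
static struct mode_sketch parse_streamfd(const char *mode)
{
    struct mode_sketch m = { 0, -1 };
    const char *p;

    for (p = mode; *p != '\0'; p++) {
        if (*p != '?')
            continue;
        if (!isdigit((unsigned char) p[1])) {
            m.malformed = 1;
            break;
        }
        m.streamfd = 0;
        /* Accumulate the digits that follow the '?'. */
        while (isdigit((unsigned char) p[1])) {
            m.streamfd = m.streamfd * 10 + (p[1] - '0');
            p++;
        }
    }
    return m;
}
```

With this sketch, a mode string of "r?2" yields streamfd 2, while a plain "r" leaves streamfd at -1 so that a caller would fall back to the usual standard descriptor.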
* cobj: clone method streamlines copy; structs get copy method.  Kaz Kylheku  2024-06-17  1  -12/+24
  * lib.h (struct cobj_ops): New function pointer, clone. (cobj_ops_init, cobj_ops_init_ex): Add clone argument to macros.
  * lib.c (seq_iter_cobj_ops): Use copy_iter as the clone operation. (cptr_ops): Use copy_cptr as clone operation. (copy): Replace if statements by check whether COBJ has a clone operation. If so, we use it to copy the object.
  * struct.h (enum special_slot): New member, copy_m.
  * struct.c (copy_s): New symbol variable. (special_sym): Associate copy_m enum value with copy symbol. (struct_init): Initialize copy_s with interned symbol. (struct_inst_clone): New static function. (struct_type_ops): Specify no clone operation via null pointer. (struct_inst_ops): Specify struct_inst_clone as clone operation.
  * arith.c (psq_ops): Indicate no clone operation via null pointer.
  * buf.c (buf_strm_ops): Likewise.
  * chksum.c (sha1_ops, sha256_ops, md5_ops): Likewise.
  * ffi.c (ffi_type_builtin_ops, ffi_type_struct_ops, ffi_type_ptr_ops, ffi_type_enum_ops, ffi_closure_ops, union_ops): Likewise. (carray_borrowed_ops, carray_owned_ops, carray_mmap_ops): Specify copy_carray as clone operation.
  * gc.c (prot_array_ops): Indicate no clone operation via null pointer.
  * gzio.c (gzio_ops_rd, gzio_ops_wr): Likewise.
  * hash.c (hash_iter_ops): Likewise. (hash_ops): Specify copy_hash as clone operation.
  * parser.c (parser_ops): Indicate no clone operation via null pointer.
  * rand.c (random_state_clone): New static function. (random_state_ops): Use random_state_clone as clone function.
  * regex.c (char_set_obj_ops, regex_obj_ops): Indicate no clone operation via null pointer.
  * socket.c (dgram_strm_ops): Likewise.
  * stream.c (null_ops, stdio_ops, tail_ops, pipe_ops, dir_ops, string_in_ops, byte_in_ops, strlist_in_ops, string_out_ops, strlist_out_ops, cat_stream_ops, record_adapter_ops): Likewise.
  * strudel.c (strudel_ops): Likewise.
  * sysif.c (cptr_dl_ops, opendir_ops): Likewise.
  * syslog.c (syslog_strm_ops): Likewise.
  * unwind.c (cont_ops): Likewise.
  * vm.c (vm_desc_ops, vm_closure_ops): Likewise.
  * tree.c (tree_ops): Use copy_search_tree for clone operation. (tree_iter_ops): Use copy_tree_iter for clone operation.
  * genchksum.txr: Changes in chksum.c specified in one place here.
  * tests/012/oop.tl: Couple of new tests.
  * txr.1: Documented.
* Copyright year bump 2024.  Kaz Kylheku  2024-01-18  1  -1/+1
  * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2024.
* lib: review cobj calls for gc incorrectness and fix.  Kaz Kylheku  2024-01-06  1  -2/+3
  I looked at all cobj calls to see if there is a potential problem, looking for situations whereby the cobj call could trigger a gc that would destroy Lisp objects that the new object either stores, or that its continued initialization depends on.
  * stream.c (make_strlist_input_stream): Call cobj earlier, then fill in the structure. Use chk_calloc to allocate the structure so any Lisp objects in it look like nil until it is initialized.
  * struct.c (make_struct_impl, make_lazy_struct): Use gc_hint on the type argument, to pin down the st structure that we use in initializations after the cobj call. If the type object were to disappear, the st structure would become invalid.
  * tree.c (copy_search_tree, make_similar_tree): Use gc_hint on the tree argument to pin down the otr structure that we reference in initializations. (copy_tree_iter): Use gc_hint on iter, for similar reasons.
  * vm.c (vm_copy_closure): Use gc_hint on oclosure, to pin down the environment we are copying from it.
* cygwin: run, sh: mangle termination status word.  Kaz Kylheku  2023-12-28  1  -0/+10
  * stream.c (run): On Cygwin, the spawnvp function is returning a 16 bit termination status word where the upper 8 bits are the termination status if the termination is normal, otherwise the lower 8 bits hold a termination signal. Let's massage it so that the function returns an integer termination status, or nil if the termination was abnormal, same as we do on POSIX platforms with fork/wait.
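The status word layout described in this commit can be sketched as follows. This is an illustration only, not the code in stream.c: massage_status and STATUS_ABNORMAL are hypothetical names, the sentinel -1 stands in for the nil that the real function returns, and treating a nonzero lower byte as the sign of abnormal termination is an assumption drawn from the wording of the message.

```c
/* Sentinel for abnormal termination; the real code returns nil instead. */
#define STATUS_ABNORMAL (-1)

/* Convert a 16-bit termination status word, laid out as described in the
 * commit message, into an exit status or STATUS_ABNORMAL. */
static int massage_status(int status)
{
    int sig = status & 0xFF;          /* lower 8 bits: termination signal */
    int code = (status >> 8) & 0xFF;  /* upper 8 bits: exit status */

    /* Assumption: a nonzero signal byte means abnormal termination. */
    return sig != 0 ? STATUS_ABNORMAL : code;
}
```

Under these assumptions a status word of 0x0300 maps to exit status 3, and a word of 0x0009 maps to the abnormal sentinel.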
* sh-esc: clean up mess I made.  Kaz Kylheku  2023-11-25  1  -8/+39
  Not all special characters can just be backslash escaped. Spaces and newlines must be quoted.
  * stream.c (sh_esc_common): New function. Handles both sh-esc and sh-esc-all logic, distinguished by a flag. Quoting is used, rather than backslash escaping. If the string contains no special characters, it is just returned. If it can be double quoted, it is double quoted. Otherwise it is single quoted and any contained single quotes are replaced by '\''. (sh_esc, sh_esc_all): Now just wrap sh_esc_common. (sh_esc_dq): Remove the newline from the set of escaped characters. Escaping a newline generates a continuation sequence which eats the newline.
  * tests/018/sh-esc.tl: Most test cases deleted; many new test cases added.
  * txr.1: Documentation revised.
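The three-way quoting decision described for sh_esc_common can be sketched in plain C as below. This is not the TXR implementation: sh_quote_sketch is a hypothetical name, it works on narrow C strings rather than Lisp strings, it ignores the sh-esc versus sh-esc-all flag, and both character sets are assumptions chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed character sets; the sets used by sh_esc_common may differ. */
static const char *special_chars = " \t\n|&;<>()$`\\\"'*?[]#~=%!{}";
static const char *dq_unsafe = "$`\\\"!";

/* Return a newly allocated, shell-quoted copy of s (caller frees),
 * or NULL if allocation fails. */
static char *sh_quote_sketch(const char *s)
{
    size_t len = strlen(s), i, j = 0;
    char *out;

    /* Worst case: every character is a single quote, which expands to
     * four characters, plus two enclosing quotes and the terminator. */
    out = malloc(4 * len + 3);
    if (out == NULL)
        return NULL;

    if (strpbrk(s, special_chars) == NULL) {
        /* No special characters: the string is returned unchanged. */
        memcpy(out, s, len + 1);
    } else if (strpbrk(s, dq_unsafe) == NULL) {
        /* Nothing that stays special inside double quotes. */
        out[j++] = '"';
        memcpy(out + j, s, len);
        j += len;
        out[j++] = '"';
        out[j] = '\0';
    } else {
        /* Single quote the string, replacing each ' by '\'' */
        out[j++] = '\'';
        for (i = 0; i < len; i++) {
            if (s[i] == '\'') {
                memcpy(out + j, "'\\''", 4);
                j += 4;
            } else {
                out[j++] = s[i];
            }
        }
        out[j++] = '\'';
        out[j] = '\0';
    }
    return out;
}
```

The single-quote branch closes the quoted region, emits a backslash-escaped quote, and reopens the region, which is why each embedded quote becomes the four-character sequence named in the commit message.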
* New T mode for open-file.  Kaz Kylheku  2023-09-23  1  -0/+13
  The T mode uses O_TMPFILE to create an unlinked temporary file.
  * stream.h (struct stdio_mode): New flag, tmpfile. (stdio_mode_init_blank, stdio_mode_init_r, stdio_mode_init_rpb): Updated to cover new bitfield member.
  * stream.c (w_open_mode): If tmpfile flag is on, add O_TMPFILE. (do_parse_mode): Recognize "T" mode selector and set all appropriate mode bits. If we are not on a platform that has O_TMPFILE, set the malformed flag.
  * txr.1: Documented.
* Use vargs typedef instead of struct args *.  Kaz Kylheku  2023-09-05  1  -4/+4
  The vargs typedef is underused. Let's use it consistently everywhere.
  * args.c, * args.h, * arith.c, * eval.c, * ffi.c, * gc.c, * hash.c, * lib.c, * lib.h, * parser.c, * stream.c, * struct.c, * struct.h, * syslog.c, * syslog.h, * unwind.c, * vm.c, * vm.h: All "struct args *" declarations replaced with existing "varg" typedef that comes from lib.h.
* New functions for shell escaping.  Kaz Kylheku  2023-09-01  1  -0/+24
  * stream.c (sh_esc, sh_esc_all, sh_esc_dq, sh_esc_sq): New static functions. (stream_init): sh-esc, sh-esc-all, sh-esc-dq, sh-esc-sq: Intrinsics registered.
  * tests/018/sh-esc.tl: New file.
  * txr.1: Documented.
* close-stream: new : protocol from close method.  Kaz Kylheku  2023-08-07  1  -3/+10
  * stream.c (close_stream): If the underlying method returns the colon symbol :, then keep the cached close_result as nil, so that the method can be called again, but return t to the caller to indicate success.
  * tests/018/close-delegate.tl: Test case added.
  * tests/018/close-delegate.expected: Updated.
  * txr.1: Documented.
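The caching rule that close_stream follows after this commit can be modeled with a small standalone sketch. Everything here is a simplified stand-in: the enum replaces the Lisp values nil, t and :, and struct stream_sketch, close_stream_sketch, refcount_close and refs_sketch are hypothetical names rather than anything in stream.c. The reference-counting method mirrors the delegate use case that motivated the change.

```c
/* Hypothetical stand-ins for the Lisp values nil, t and the : symbol. */
enum val_sketch { NIL, T, COLON };

struct stream_sketch {
    enum val_sketch close_result;                      /* cached result; NIL = none */
    enum val_sketch (*close)(struct stream_sketch *);  /* underlying close method */
    int close_calls;                                   /* for demonstration only */
};

static enum val_sketch close_stream_sketch(struct stream_sketch *s)
{
    enum val_sketch res;

    if (s->close_result != NIL)
        return s->close_result;  /* non-nil result cached: method not called */

    res = s->close(s);

    if (res == COLON)
        return T;                /* report success, but leave the cache as NIL */

    s->close_result = res;       /* storing NIL keeps the method callable */
    return res;
}

/* Example method: a delegate with a hypothetical reference count that
 * returns : until the last reference is dropped. */
static int refs_sketch = 2;

static enum val_sketch refcount_close(struct stream_sketch *s)
{
    s->close_calls++;
    return --refs_sketch > 0 ? COLON : T;
}
```

In this model the first close drops one reference and reports success without caching, the second performs the real close and caches t, and any later close returns the cached t without invoking the method.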
* streams: a few close functions should return t.  Kaz Kylheku  2023-08-07  1  -0/+2
  * socket.c (dgram_close): Return t when a descriptor is closed, returning nil only when the object is already in a closed state.
  * stream.c (dev_null_close, dir_close): Likewise.
* streams: close-stream only caches non-nil result.  Kaz Kylheku  2023-08-07  1  -3/+3
  This is motivated by trying to implement a struct delegate stream which performs reference counting in close, in order to close the real stream when the count hits zero. The caching behavior of close-stream is a problem.
  * stream.c (strm_base_init): Initialize close_result to nil, rather than nao. (strm_base_mark): Don't check close_result for nao. (close_stream): Suppress the call to op->close if close_result has a non-nil value, rather than a value other than nao.
  * tests/018/close-delegate.tl, * tests/018/close-delegate.expected: New files.
  * txr.1: Document that only a non-nil return is cached by close-stream.
* Copyright year bump 2023.  Kaz Kylheku  2023-01-01  1  -1/+1
  * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2023.
* streams: fixes in few type error diagnostics.  Kaz Kylheku  2022-11-21  1  -5/+7
  * stream.c (make_string_byte_input_stream, get_string_from_stream): Use self in diagnostic, and print bad object using ~s rather than ~a. (get_list_from_stream): Likewise, and add missing nao as well. (catenated_stream_push): Add self string, use in diagnostics, print bad object using ~s.
* args: don't use alloca for const size cases.  Kaz Kylheku  2022-10-15  1  -1/+1
  * args.h (args_decl_list): This macro now handles only constant values of N. It declares an anonymous container struct type which juxtaposes the struct args header with exactly N values. This is simply defined as a local variable without alloca. (args_decl_constsize): Like args_decl, but requiring a constant N; implemented via args_decl_list. (args_decl_list_dyn): New name for the old args_decl_list which calls alloca. No places in the code depend on this at all, except the definition of args_decl. (args_decl): Retargeted to args_decl_list_dyn. There is some inconsistency in the macro naming in that args_decl_constsize depends on args_decl_list, and args_decl depends on args_decl_list_dyn. This was done to minimize diffs. Most direct uses of args_decl_list have a constant size, but a large number of args_decl uses do not have a constant size.
  * eval.c (op_catch): Use args_decl_constsize.
  * ffi.c (ffi_struct_in, ffi_struct_get, union_out): Likewise.
  * ftw.c (ftw_callback): Likewise.
  * lib.c (funcall, funcall1, funcall2, funcall3, funcall4, uniq, relate): Likewise.
  * socket.c (sockaddr_in_unpack, sockaddr_in6_unpack, sockaddr_un_unpack): Likewise.
  * stream.c (formatv): Likewise.
  * struct.c (struct_from_plist, struct_from_args, make_struct_lit): Likewise.
  * sysif.c (termios_unpack): Likewise.
  * time.c (broken_time_struct): Likewise.
* json: support standard-style formatting.  Kaz Kylheku  2022-10-11  1  -0/+4
  * stream.c (standard_k, print_json_format_s): New symbol variables. (stream_init): New variables initialized.
  * stream.h (enum json_fmt): New enum. (standard_k, print_json_format_s): Declared.
  * lib.c (out_json_rec): Take enum json_fmt param, and pass it recursively. Printing for vector and dictionaries reacts to argument value. (out_json, put_json): Examine value of special var *print-json-format* and calculate enum json_fmt value from this. Pass to out_json_rec.
  * txr.1: Documented.
  * stdlib/doc-syms.tl: Updated.
* streams: new function inc-indent-abs.  Kaz Kylheku  2022-10-11  1  -0/+13
  * stream.c (inc_indent_abs): New function. (stream_init): inc-indent-abs intrinsic registered.
  * stream.h (inc_indent_abs): Declared.
  * txr.1: Documented.
  * stdlib/doc-syms.tl: Updated.
* Implement NaN boxing.  Kaz Kylheku  2022-09-13  1  -1/+1
  On platforms with 64 bit pointers, and therefore 64-bit-wide TXR values, we can use a representation technique which allows double floating-point values to be unboxed. Fixnum integers are reduced from 62 bits to 50, and there is a little more complexity in the run-time type checking and dispatch which costs extra cycles. The support is currently off by default; it must be explicitly enabled with ./configure --nan-boxing.
  * lib.h (NUM_MAX, NUM_MIN, NUM_BIT): Define separately for NaN boxing. (TAG_FLNUM, TAG_WIDTH, NAN_TAG_BIT, NAN_TAG_MASK, TAG_BIGMASK, TAG_BIGSHIFT, NAN_FLNUM_DELTA): New preprocessor symbols. (enum type, type_t): The FLNUM enumeration constant moves to just after LIT, so that its value is the same as TAG_FLNUM. (struct flonum): Does not exist under NaN boxing. (union obj): No fl member under NaN boxing. (tag, is_ptr): Separately defined for NaN boxing. (is_flo): New function under NaN boxing. (tag_ex): New function. It's like tag, but identifies floating-point values as TAG_FLNUM. The tag function continues to map them to TAG_PTR, which is wrong under NaN boxing, but needed in order not to separately write tons of cases in the arith.c module. (type): Use tag_ex, so TAG_FLNUM is handled, if it exists. (auto_str, static_str, litptr, num_fast, chr, c_n, c_u): Different definition for NaN boxing. (c_ch, c_f): New functions. (throw_mismatch): Attribute with NORETURN. (nao): Separate definition for NaN boxing.
  * lib.c (seq_kind_tab): Reorder initializer to follow enum reordering. (seq_iter_rewind): Use c_n and c_ch functions, since type checking has been done in those cases. The self parameter is no longer needed. (iter_more): Use c_ch on CHR object. (equal): Use c_f accessor to get double value rather than assuming there is a struct flonum representation. (stringp): Use tag_ex, otherwise a floating-point number is identified as TAG_PTR. (diff, isec, isecp): Don't pass removed self parameter to seq_iter_rewind.
  * arith.c (c_unum, c_dbl_num, c_dbl_unum, plus, minus, signum, gt, lt, ge, le, numeq, logand, logior, logxor, logxor_old, bit, bitset, tofloat, toint, width, c_num, c_fixnum): Extract floating-point value using c_f accessor. Handle CHR type separately from NUM because the storage representation is no longer identical; CHR values have a two bit tag over bits where NUM has ordinary value bits. NUM is tagged at the NaN level with the upper 14 bits being 0xFFFC. The remaining 50 bits are the value. (flo): Construct unboxed float under NaN boxing by taking image of double as a 64 bit value, and adding the delta offset, then casting to the val pointer type. (c_flo): Separate implementation for NaN boxing. (integerp, numberp): Use tag_ex.
  * buf.c (str_buf, buf_int): Separate CHR and NUM cases, like in numerous arith.c functions.
  * chksum.c (sha256_hash, md5_hash): Use c_ch accessor for CHR value.
  * hash.c (equal_hash, eql_hash): Handle CHR separately. Use c_f accessor for floating-point value. (eq_hash): Use tag_ex and handle TAG_FLNUM value under NaN boxing. Handle CHR separately from NUM.
  * ffi.c (ffi_float_put, ffi_double_put, carray_uint, carray_int): Handle CHR and NUM separately.
  * stream.c (formatv): Use c_f accessor.
  * configure: Disable automatic selection of NaN boxing on 64 bit platforms, for now. Add test whether -Wno-strict-aliasing is supported by the compiler, performed only if NaN boxing is enabled. We need to disable this warning because it goes off on the code that reinterprets an integer as a double and vice versa.
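The delta-offset construction described for flo and c_flo can be sketched as follows. This is a simplified model, not the TXR code: val_sketch, box_double, unbox_double and FLNUM_DELTA_SKETCH are hypothetical names, the delta value is an arbitrary placeholder rather than the real NAN_FLNUM_DELTA, and the sketch copies bits with memcpy instead of the pointer reinterpretation that the strict-aliasing note in the commit alludes to. It assumes a 64-bit IEEE 754 double.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical delta; the real NAN_FLNUM_DELTA in lib.h may differ. */
#define FLNUM_DELTA_SKETCH UINT64_C(0x0004000000000000)

typedef uint64_t val_sketch;  /* stand-in for the pointer-sized val type */

/* Box: take the image of the double as a 64-bit value and add the delta. */
static val_sketch box_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);    /* copy the bit image of the double */
    return bits + FLNUM_DELTA_SKETCH;  /* unsigned addition wraps modulo 2^64 */
}

/* Unbox: subtract the delta and reinterpret the bits as a double. */
static double unbox_double(val_sketch v)
{
    uint64_t bits = v - FLNUM_DELTA_SKETCH;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Because the addition and subtraction are done in unsigned 64-bit arithmetic, the round trip is exact for every bit pattern; choosing a delta that moves ordinary doubles away from the encodings reserved for tagged values is the part that the real constant has to get right.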
* close-stream: process wait cleanup.  Kaz Kylheku  2022-06-24  1  -18/+6
  * stream.c (pipe_close_status_helper): Revise error messages. Get rid of impossible cases: we will not get WIFSTOPPED or WIFCONTINUED unless we used the WUNTRACED option in waitpid, which we don't. On platforms without HAVE_SYS_WAIT, don't throw if the status is nonzero; just return nil. It could be a normal termination.
* More HAVE_FORK_STUFF cleanup.  Kaz Kylheku  2022-06-24  1  -3/+3
  * stream.c (pipe_close_status_helper, pipe_close, pipe_ops): Included in #if HAVE_FORK_STUFF block. (stream_init): Refer to pipe ops only if HAVE_FORK_STUFF is true.
* streams: remove old code for popen on MinGW.  Kaz Kylheku  2022-06-21  1  -156/+8
  This is vestigial code from a time when TXR supported being compiled with MinGW. Except in the case of the functions run and sh, which are implementable without fork on Cygwin using the spawn family of functions, there won't be any fallback; if HAVE_FORK_STUFF is zero or missing, then certain functions will be absent.
  * stream.c (struct stdio_handle): Do not define a pid member if we don't HAVE_FORK_STUFF. (se_pclose): Function removed; we won't be using any non-POSIX platform's popen/pclose. (pipe_close): Don't call the removed se_pclose. (make_pipe_stream): Function removed. (fds_subst, fds_swizzle): Don't define these if HAVE_FORK_STUFF is absent, and so are HAVE_WSPAWN and HAVE_SPAWN. They are now only needed by the version of the run function that uses spawn or wspawn. (open_command, open_process): Remove the versions of these functions based on popen. (string_extend_count, win_escape_cmd, win_escape_arg, win_make_cmdline): Remove these functions used by the above open_process; we have no need for encoding arguments into a Windows command line string, since the Cygwin/Cygnal libraries do that for us in their spawn and exec functions.
  * stream.h (make_pipe_stream): Function removed.
  * utf8.[ch] (w_popen): Function removed.
* bugfix: missing gzip support in open-command.  Kaz Kylheku  2022-06-21  1  -52/+74
  * stream.c (pipe_close_status_helper): New function, factored out of pipe_close and used by it, and also by gzio_close. (pipe_close): Call pipe_close_status_helper, which now contains the classification of process wait status codes. (open_fileno): Now takes optional pid argument. If this is specified, then make_pipevp_stream is used. (open_subprocess): Use the open_fileno function, rather than fopen. This simplifies things too, except that we have to catch exceptions. Pass pid to the newly added parameter of open_fileno so that we obtain a proper pipe stream that will wait for the process to terminate when closed. (mkstemp_wrap): Pass nil for pid argument of open_fileno. (stream_init): Update registration of open-fileno.
  * gzio.c (struct gzio_handle): New member, pid. (gzio_close): If there is a nonzero pid, wait for the process to terminate. (make_gzio_stream): Initialize h->pid to zero. (make_gzio_pipe_stream): New function.
  * parser.c (lino_fdopen): Pass nil for pid argument of open_fileno.
  * gzio.h (make_gzio_pipe_stream): Declared.
  * tests/018/gzip.tl: New test.