txr - TXR: A data munging language.

	Commit message	Author	Age	Files	Lines
*	Version 301.HEAD txr-301 master	Kaz Kylheku	13 days	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* RELNOTES: Updated. * configure (txr_ver): Bumped version. * stdlib/ver.tl (lib-version): Bumped. * txr.1: Bumped version and date. * txr.vim, tl.vim: Regenerated. * protsym.c: Regenerated.
*	compiler: bug: end instruction balance in tjmp/bjmp handling.	Kaz Kylheku	14 days	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	I was reading the compiler code and noticed that the code template for unwind protect has two end instructions: one for the protected calculation and one for the unwinds. * stdlib/compiler.tl (convert-t-b-jumps): When a uwprot instruction is encountered, the balance must be incremented by 2, in order to skip past two end instructions. Without this we will end up with incorrect code for a block return that jumps out of a block, in which there is a subsequent unwind-protect.
*	compiler: rewrite the rewrite function.	Kaz Kylheku	14 days	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/optimize.tl (rewrite): Rewrite. The function is begging to be rewritten. I mean, just look at its name! This is not just for shits and giggles. The rewrite makes it tail-recursive, so it makes a test case for TCO. Instead of using a list-builder object, it does the traditional thing: builds the output in reverse and then calls nreverse. The generated code is lovely. (rewrite-case): Pass a nil argument for the new accumulator parameter of rewrite.
*	compiler: replace lazy list integers with iterables.	Kaz Kylheku	2025-06-20	1	-6/+6
\| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler (get-datavec, get-symvec, comp-switch, comp-catch, comp-progn, comp-or)): Replace uses of the range function with much more memory efficient integer and integer range iteration.
*	compiler: only last case of tree-case is tail position.	Kaz Kylheku	2025-06-20	1	-2/+6
\| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-tree-case): Disable the tail position for all but the last cases. The reason is that the case result values are checked for : fallthrough. It's a bad hack we should think about restricting to static cases.
*	compiler: optimized block returns.	Kaz Kylheku	2025-06-19	1	-20/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.t (blockinfo): New slots: label, oreg. These inform the compiler, when it is generating a jump out of a block, what register to put in the block return value and where to jump. (env lookup-block): Lose the mark-used optional argument; this function is only called in one place, and that place will now decide whether to mark the block used after doing the lookup, not before. (env extend-block): Add the parameters label and oreg, to pass through these values to the block-info structure's new slots. (compiler): New slot: bjmp-occurs. We are going to use a pseudo instruction (bjmp ...) to denote a call out of a block similarly to how we used (tjmp ...) for a tail call. There will be a similar post-processing needed for them. (compiler comp-block): Pass oreg and lskip to extend-block, so block returns in the inner compilation have this info if they need to compile a direct jump out of the block. The esc-blocks needs to be set conditionally. If we are compiling a block, then name is not a symbol but an expression evaluating to it, and so we don't extend esc-blocks; there can be no direct jumps out of a block with a dynamic name. (Or perhaps there could be with more complication and more work). The case when the block is eliminated is more complicated now. Even though the block is eliminated, there can be jumps out of that block in the code. Those jumps expect the output register to be oreg and they expect the lskip label to be present, so we need to add these features to the bfrag.code and also adjust bfrag.oreg. (compiler comp-return-from): We use esc-blocks* to decide whether to compile a jmp or a dynamic block return. In the one case, we must inform the compiler structure that a bjmp instruction is now present. 
In the other we must indicate that the block is used for a dynamic transfer, and so cannot be optimized away. (convert-tjmps): Rename to convert-t-b-jmps and handle the bjmp instruction. When a (bjmp <label>) is seen, we scan forward to an occurrence of <label>, similarly to how for a (tjmp <...>) we scan toward a (jend ...) function end. We insert any intervening end instructions before the bjmp and convert to jmp. (compiler optimize): Call convert-t-b-jmps if either the tjmp-occurs or bjmp-occurs flag is set. These flags could be merged into a single one, but let's leave it for now.
*	compiler: block escape list.	Kaz Kylheku	2025-06-19	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new dynamic variable esc-blocks keeps track of what blocks, in a given scope, may be abandoned by a simple jmp instruction instead of a full blown dynamic return. We will not try to handle unwinding statically; any contour that needs unwinding cannot be jumped across. * stdlib/compiler.tl (esc-blocks): New special variable. (compile-in-top-level): Clear esc-blocks for top-level compilations. (compiler (comp-unwind-protect, comp-catch, comp-lambda-impl, comp-prof): These contexts cannot be abandoned by a jmp instruction: we cannot jump out of the middle of an unwind-protect, catch, lambda or prof. So we compile these with esc-blocks set to nil. New blocks entirely contained in these constructs can of course build up the list locally. E.g. within a function, of course we can have blocks that are abandoned by a simple jmp instruction. Just we cannot jump out. (compiler comp-block): When compiling a block, we bind esc-blocks to a list consisting of the previous value, plus the new block name consed to the front.
*	compiler: disable tail position in top level.	Kaz Kylheku	2025-06-19	1	-1/+2
\| \| \| \| \| \|	* stdlib/compiler.tl (compile-in-toplevel): We need to disable the tail position for compilations that take place in a top-level environment, like load-time forms.
*	compiler: opt-tail-calls compiler option.	Kaz Kylheku	2025-06-19	2	-10/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/comp-opts.tl (compile-opts): New slot, opt-tail-calls. (%warning-syms%): Variable removed; this just lists the slots of compile-opts, which can be obtained with the slots function. Moreover, it is badly named, since there is now a non-diagnostic option. (compile-opts): Specify an initial value of : for opt-tail-calls. This means that opt-level controls whether it is enabled. * stdlib/compiler.tl: Update comment about optimization levels, since level 2 now does tail calls. (compiler comp-lambda-impl): Only enable tail calls if the option has the default value : and opt-level is at least two, or else if the option has value t. (with-compile-opts): Recognize : as valid option value. Don't refer to %warning-syms% but the slots of compile-opts. We use load-time to avoid calculating this repeatedly. The wording of the error message has to be adjusted since not all options are diagnostic any more. * autoload.c (compiler_set_entries): Add opt-tail-calls to slot-name autoload triggers for the compiler module. * txr.1: Mention tail calls and the opt-tail-calls option in the description of opt-level 2. Document opt-tail-calls under compile-opts section. We describe what tail calls are there and also adjust the wording since not all options are diagnostic. Describe the three-valued system for code generation options.
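The three-valued rule for opt-tail-calls described in this entry can be modeled roughly as follows. This is an illustrative Python sketch, not the TXR Lisp code in stdlib/compiler.tl; the name tail_calls_enabled and the string ":" standing in for the : keyword are assumptions made for the example.

```python
# Hypothetical stand-in for the TXR keyword `:` ("let opt-level decide").
DEFAULT = ":"

def tail_calls_enabled(opt_tail_calls, opt_level):
    """Model of the rule stated in the commit message (names are illustrative)."""
    if opt_tail_calls == DEFAULT:
        # Default value: the optimization level decides; level 2 and up enable it.
        return opt_level >= 2
    # Otherwise only an explicit true value (t in TXR Lisp) enables tail calls.
    return opt_tail_calls is True
```

With this model, an explicit true value turns tail calls on even at opt-level 0, and an explicit false value turns them off at any level.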
*	compiler: unnecessary test in link-graph.	Kaz Kylheku	2025-06-19	1	-2/+1
\| \| \| \| \| \|	* stdlib/optimize.tl (basic-blocks link-graph): Remove unnecessary null test of nxbl inside a when that is conditional on it not being null.
*	compiler: TCO: redundant code handling optionals.	Kaz Kylheku	2025-06-19	1	-8/+5
\| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-tail-call): Throw away the code generated by lambda-apply-transform for doing defaulting of optionals. The function already contains code to do that, right at the top where the tail call jumps. Defaulting optionals twice is not just a waste of time, but can evaluate twice the expressions which provide the default values.
*	compiler/load: tlo version number increment.	Kaz Kylheku	2025-06-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new tail call optimization relies on a fix to the VM's block instruction. This means that .tlo files in which TCO has been applied might not run correctly with TXR 300 or older. For that reason, we bump up the version number. * parser.c (read_file_common): Accept version 8.0 files, while continuing to allow 6 and 7 regardless of minor number. We get picky about minor number so that in the future we can use a minor number increment for backward compatible changes like this. We would only like to go to version 9 if the VM changes in such a way that we cannot load 8 any more. If we can still load 8.0, we would like to go to 8.1. * stdlib/compiler.tl (%tlo-ver%): Change to 8.0. * txr.1: Documented.
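The version acceptance policy in this entry might be modeled like this. It is a Python sketch of the stated rule only, not the C code in parser.c; the function name is hypothetical, and rejecting every major version other than 6, 7 and 8 is an assumption, since the message does not say how those are treated.

```python
def tlo_version_loadable(major, minor):
    """Hypothetical model of the .tlo version check described in the message."""
    if major in (6, 7):
        return True        # older formats: accepted regardless of minor number
    if major == 8:
        return minor == 0  # picky about the minor number: only 8.0 is accepted
    return False           # assumption: anything else is rejected
```

Being strict about the minor number is what lets a later 8.1 signal a backward compatible addition: a reader that only knows 8.0 refuses the file instead of misinterpreting it.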
*	compiler: TCO code complete.	Kaz Kylheku	2025-06-19	1	-9/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed point iteration over stdlib works; tests pass. * stdlib/compiler.tl (tail-fun-info): Remove called slot. This is replaced by tjmp-occurs in the compiler. New slot, label. Identifies the backwards jump label for the tail call. (compiler): New slot, tjmp-occurs. If any tail call jump occurs we set this. Special post processing is required to insert some instructions before the jmp, in order to bail out of some nested blocks/frames. (compiler compile): Pass env in two parameter positions to comp-setq. Compile new setq-in-env compiler-only operator which recurses to comp-setq but allows the variable env to be independently specified. (compiler comp-setq): Take two environment parameters; one for resolving the value, and the other the variable. We need this capability for setting the function parameters before the tail call jump. The parameters are in an outer environment and may be shadowed. (compiler comp-setq-in-env): New method; parses compiler-generated (setq-in-env <var> <val> <env-obj>) syntax and calls comp-setq. (compiler comp-lambda-impl): If there is a tail context for this lambda, create the jump label for it and store it in the context. Also, we need the tfn.env to be nenv not env; env is the outside context of the lambda, without the parameters! Also, we inject the label into the top of the code. (compiler comp-fun-form): If we are in tail position, compile the function form via comp-tail-call. Turn off the tail position before recursing: the arguments of the tail call are not themselves in a tail position. (compiler comp-tail-call): New function. This is the workhorse. To generate the tail call, we create a fake lambda and use the lambda-apply-transform function in order to obtain the syntax for an immediate call. 
We then destructure the pieces, arrange them into the code we need and compile it in the correct environments to generate the fragment, adding the backwards jump to it. This requires a post-processing fixup. (compiler comp-for): Bugfix: the body of a for is not in tail position, only the result forms. (compiler comp-prof): Also disable tail position; we don't want code to jump out of a prof block. (convert-tjmps): New function. This has to analyze the code to find (tjmp ...) pseudo-instructions representing the backwards jumps of tail calls. Before these jmps, we have to insert end instructions, so that the tail call does not jump out of a nested context, such as a variable frame/dframe or block. (usr:compile): When an interpreted function object is compiled, or a symbol naming such an object, we set up the tail-fun-info structure for it, so that tail calls work, like we are already doing for defun and labels.
*	compiler: bug: handling of block returns.	Kaz Kylheku	2025-06-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The handling of block returning instructions ret and abscsr is incorrect and causes miscompilations, such as infinite loops. * stdlib/optimize.tl (basic-blocks jump-ops): Remove ret and abscsr. These instructions will no longer terminate basic blocks. (basic-blocks link-graph): Remove the instructions from the pattern match here; they won't occur any more as the last instruction of a block. Note that they were being handled together as a jend: effectively as a signal indicating the brick wall end of control flow with no next basic block. This is what caused the problems.
*	compiler: eliminate wasteful treg nulling.	Kaz Kylheku	2025-06-17	2	-7/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/optimize.tl (live-info): New slot, clobbered. (basic-block): New slot, cycle. Struct also inherits clobbered slot from live-info. (basic-block print): Print clobbered and cycle. (basic-blocks local-liveness): Calculate clobbered for each instruction and from that for the basic block. (basic-blocks identify-cycle-members): New method. Discovers which basic blocks are part of any cycle, and identifies them by setting the new cycle slot to t. (basic-blocks do-peephole-block): New local functions here for determining whether a register has been clobbered before the first instruction, either in the same basic block or any ancestors. Only works when the block is not part of a cycle. We add a peephole pattern matching move instructions that set tregs to (t 0)/nil. When we are not in a cycle block, and the treg has not previously been clobbered, we know it is clean: it still has the initial nil value set by the VM and we can remove the instruction. * stdlib/compiler.tl (compiler optimize): Call the identify-cycle-members method before peephole.
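The idea behind identify-cycle-members, marking every basic block that lies on some cycle of the control flow graph, can be illustrated with a small Python sketch. It is not the method in stdlib/optimize.tl: the dictionary representation, the name cycle_members and the simple reachability search are all assumptions chosen for brevity.

```python
def cycle_members(links):
    """Return the set of blocks that lie on at least one cycle.

    `links` maps each block name to a list of its successor block names.
    A block is on a cycle exactly when it can reach itself through one or
    more edges.
    """
    def reachable_from(start):
        seen = set()
        stack = list(links.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(links.get(node, ()))
        return seen

    return {block for block in links if block in reachable_from(block)}
```

For a graph a -> b -> c -> b, c -> d, only b and c are reported. Per the entry, a block outside any cycle is the only place where the peephole pass may reason that a treg still holds its initial nil.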
*	compiler: don't null tregs before closure.	Kaz Kylheku	2025-06-17	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler eliminate-frame): New optional no-tregs parameter. If true, it disables the generation of code to null the tregs. (compiler comp-lambda-impl): Pass t to no-tregs parameter of eliminate-frame to disable the nulling. It is not required for a lambda, which executes with fresh t-registers, implicitly initialized to nil. The presence of these instructions prevents the when-match pattern from matching, which expects the instruction sequence to start with a close instruction, so that the effect of eliminate-frame is then lost. This is a latent bug exposed by the previous commit. We would have seen this previously in a lambda occurring inside a loop.
*	compiler: remove loop-nest counter hack	Kaz Kylheku	2025-06-16	1	-10/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The loop-nest counter in the compiler context is positive whenever a loop is being compiled. This informs the eliminate-frame function that a variable-binding block can be executed multiple times, and so when variables are converted to registers, those registers have to be explicitly initialized to nil (in order to bring about the semantics of fresh lexical variables being nil). In this patch, we get rid of the counter and just always generate the zero-initializations. They get well optimized away. The code is usually the same. Sometimes four bytes longer or shorter. I'm noticing smaller frame sizes across the board due to registers being eliminated. * stdlib/compiler.tl (compiler): Remove loop-nest slot. (compiler eliminate-frame): Unconditionally emit the mov instructions which set all the new tregs to nil (i.e. copy the value of nil register (t 0)). (comp-for): Do not increment and decrement the loop count.
*	compiler: bug: register liveness of block insn.	Kaz Kylheku	2025-06-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/optimize.tl (basic-blocks local-liveness): For the block instruction, we must mark the destination register as being defined. This is in spite of the block instruction not actually itself writing to it. The entire block as such defines the register; any part of the block could be interrupted by a nonlocal return to that block, which will then set that register and jump past the block. Nothing in the block should be doing anything with the register except for a final move into it at the end of the block and the end instruction which references it. If we don't use ref-def, it's possible for register renaming code to wrongly rename the destination register.
*	compiler: whitespace issue in optimizer.	Kaz Kylheku	2025-06-16	1	-1/+1
\| \| \| \| \|	* stdlib/optimize.tl (basic-blocks do-peephole-block): Fix a bad indentation of one line.
*	compiler: rename local variable in optimizer.	Kaz Kylheku	2025-06-16	1	-7/+7
\| \| \| \| \| \| \|	* stdlib/optimize.tl (basic-blocks merge-jump-thunks): A local variable named bb is used for walking a list of basic blocks, and shadowing the self object, also named bb. It should be called bl.
*	compiler: bug: bad slot ref in optimizer.	Kaz Kylheku	2025-06-16	1	-1/+1
\| \| \| \| \| \| \| \|	* stdlib/optimize.tl (basic-blocks do-peephole-block): Update the links slot of the correct object, the basic block bl, not the basic blocks graph bb. This indicates that the code was never run hitherto. Some compiler changes I'm making revealed it.
*	compiler: missing wasteful register move elimination.	Kaz Kylheku	2025-06-16	1	-0/+2
\| \| \| \| \| \| \| \| \|	* stdlib/optimize.tl (basic-blocks do-peephole-block): Adding a case to remove a (mov X X) instruction, moving any register to itself. It's astonishing that this is missing. I'm seeing it happen in tail call cases now because when a tail call passes an unchanging argument, that becomes a self-assignment.
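The new peephole case amounts to dropping any move whose source and destination are the same register. A minimal Python sketch, assuming instructions are modeled as tuples such as ("mov", dst, src); the tuple encoding and the name drop_self_moves are illustrative and not taken from stdlib/optimize.tl:

```python
def drop_self_moves(code):
    """Remove (mov X X) instructions, which copy a register onto itself."""
    return [
        insn for insn in code
        if not (len(insn) == 3 and insn[0] == "mov" and insn[1] == insn[2])
    ]
```

In a tail call such as (f (+ x 1) y) inside f, the unchanged argument y is assigned back to its own parameter, which is how such a self-move can arise.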
*	compiler: forgotten not/null reductions in if.	Kaz Kylheku	2025-06-16	1	-0/+6
\| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-if): Recognize cases like (if (not <expr>) <then> <else>) and convert to (if <expr> <else> <then>). Also the test (true <expr>) is reduced to <expr>.
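The two reductions can be sketched in Python over forms modeled as nested tuples. This is only an illustration of the rewrite rules named in this entry, not the comp-if method itself; the function name is hypothetical, and applying the rules repeatedly to nested tests is an assumption of the sketch.

```python
def reduce_if(form):
    """Simplify the test of an ("if", test, then, else) form.

    (if (not e) a b) and (if (null e) a b) become (if e b a);
    (if (true e) a b) becomes (if e a b).
    """
    op, test, then, els = form
    if op != "if":
        raise ValueError("expected an if form")
    while isinstance(test, tuple) and len(test) == 2:
        if test[0] in ("not", "null"):
            # Negated test: strip the negation and swap the branches.
            test, then, els = test[1], els, then
        elif test[0] == "true":
            # (true e) only normalizes truthiness, which `if` does anyway.
            test = test[1]
        else:
            break
    return ("if", test, then, els)
```

For instance, ("if", ("not", "x"), "a", "b") comes back as ("if", "x", "b", "a").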
*	compiler: value is not optional in fbind/lbind.	Kaz Kylheku	2025-06-15	1	-1/+1
\| \| \| \| \| \| \| \| \|	This should have been part of the May 26, 2025 commit d70b55a0023eda8d776f18d224e4487f5e0d484e. * stdlib/compiler.tl (compiler comp-fbind): The form is not optional in fbind/lbind bindings; the syntax is (sym form); we don't have to use optional binding syntax.
*	compiler: prepare tail call identification context.	Kaz Kylheku	2025-06-15	1	-2/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (tail-fun-info): New struct type. The tail-fun special will be bound to instances of this. (compiler compile): Handle sys:rt-defun specially, via new comp-rt-defun. (compiler comp-return-from): Adjustment here; tail-fun does not carry the name, but a context structure with a name slot. (compiler comp-fbind): When compiling lbind, and thus potentially recursive functions, bind tail-fun to a new tail-fun-info context object carrying the name and lambda function. The env will be filled in later, during the compilation of the lambda. (compiler comp-lambda-impl): When compiling exactly that lambda expression that is indicated by the tail-fun structure, store the parameter environment object into that structure, and also bind tail-pos to indicate that the body of the lambda is in the tail position. (compiler comp-rt-defun): New method, which destructures the (sys:rt-defun ...) call to extract the name and lambda, and uses those to wrap a tail-fun-info context around the compilation, similarly to what is done for local functions in comp-fbind.
*	compiler: tidiness issue in top dispatcher.	Kaz Kylheku	2025-06-15	1	-4/+4
\| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler compile): Move the compiler-let case into the "compiler-only special operators" group. Consolidate the group of specially handled functions.
*	compiler: immediately called lambda: code gen tweak.	Kaz Kylheku	2025-06-15	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses some irregularities in the output of lambda-apply-transform, to make its output easier to destructure and use in tail recursion logic, in which the inner bindings will be turned into assignments of existing variables. * stdlib/compiler.tl (lambda-apply-transform): Move the binding of the al-val gensym from the inner let* block to the outer let/let where other gensyms are bound. Replace the ign-1 and ign-2 temporaries by a single gensym. Ensure that this gensym is bound.
*	compiler: track tail positions.	Kaz Kylheku	2025-06-13	1	-30/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (ntp): New macro. (tail-pos, tail-fun): New special variables. (compiler (comp-setq, comp-lisp1-setq, comp-setqf, comp-if, comp-ift, comp-switch, comp-unwind-protect, comp-block, comp-catch, comp-let, comp-fbind, comp-lambda-impl, comp-fun, comp-for)): Identify non-tail-position expressions and turn off the tail position flag for recursing over those. (compiler comp-return-from): The returned expression is in the tail position if this is the block for the current function, otherwise not. (compiler (comp-progn, comp-or)): Positions other than the last are non-tail. (compiler comp-prog1): Nothing is tail position in a prog1 that has two or more expressions. (usr:compile-toplevel): For a new compile job, bind tail-pos to nil. There is no tail position until we are compiling a named function (not yet implemented).
*	infix: no phony infix over lambdas and such.	Kaz Kylheku	2025-06-03	1	-1/+2
\| \| \| \| \| \| \|	* stdlib/infix.tl (funp): Do not recognize list forms as functions, such as lambda expressions or (meth ...) syntax. It causes surprisingly wrong transformations.
*	streams: new get-buf function.	Kaz Kylheku	2025-06-03	1	-26/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	* stream.c (get_buf): New function. (stream_init): Register get-buf intrinsic. * stream.h (get_buf): Declared. * stdlib/getput.tl (sys:get-buf-common): Function removed. (file-get-buf, command-get-buf, map-command-buf, map-process-buf): Use get-buf instead of sys:get-buf-common. * txr.1: Documented.
*	*-get-buf: bug in skipping non-seekable streams.	Kaz Kylheku	2025-06-02	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/getput.tl (sys:get-buf-common): Fix incorrect algorithm for skipping forward in a stream that doesn't support seek-stream. The problem is that when the seek amount is greater than 4096, it does nothing but 4096 byte reads, which will overshoot the target position if it isn't divisible by 4096. The last read must be adjusted to the remaining seek amount. * tests/018/getput.tl: New test case using property-based approach to show that the read-based skip in get-buf-common fetches the same data as the seek-based skip.
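The corrected skipping logic can be illustrated in Python. This is a sketch of the technique only, not the TXR Lisp code in stdlib/getput.tl; the name skip_by_reading and the use of io.BytesIO as a stand-in stream are assumptions for the example.

```python
import io

def skip_by_reading(stream, count, chunk=4096):
    """Advance a binary stream by `count` bytes using reads only.

    Returns the number of bytes actually skipped, which is less than
    `count` if the stream ends early.
    """
    remaining = count
    while remaining > 0:
        # Cap each read at the remaining amount so the last one cannot overshoot.
        data = stream.read(min(chunk, remaining))
        if not data:
            break  # end of stream
        remaining -= len(data)
    return count - remaining

# Property in the spirit of the new test: read-based and seek-based skips agree.
payload = bytes(range(256)) * 40      # 10240 bytes
by_read = io.BytesIO(payload)
by_seek = io.BytesIO(payload)
skip_by_reading(by_read, 5000)        # 5000 is not a multiple of 4096
by_seek.seek(5000)
assert by_read.read(16) == by_seek.read(16)
```

Without the min(chunk, remaining) cap, the second read would consume a full 4096 bytes and leave the stream at offset 8192 instead of 5000, which is the overshoot this entry fixes.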
*	Version 300.txr-300	Kaz Kylheku	2025-05-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* RELNOTES: Updated. * configure (txr_ver): Bumped version. * stdlib/ver.tl (lib-version): Bumped. * txr.1: Bumped version and date. * txr.vim, tl.vim: Regenerated. * protsym.c: Regenerated.
*	compiler: function bindings syntax cannot be atom.	Kaz Kylheku	2025-05-26	1	-3/+2
\| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-fbind): We don't have to normalize the function binding syntax of a sys:fbind or sys:lbind. This code was copied and pasted from (compiler comp-let). These bindings are always (name lambda) pairs and are machine-generated that way. If a name occurred, it would not be correct to rewrite it to (name).
*	streams: replace get_line virtual with new interface.	Kaz Kylheku	2025-05-24	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stream.h (struct strm_ops): The simple get_line virtual is replaced by get_string, which takes a character limit and a delimiting stop character. (strm_ops_init): Rename get_line parameter to get_string. (get_string_s): Declared. (generic_get_line): Declaration removed. (generic_get_string, get_delimited_string): Declared. * stream.c (get_string_s): New symbol variable. (unimpl_get_line): Function removed. (unimpl_get_string): New function. (null_get_line): Function removed. (null_get_string): New function. (fill_stream_ops): Configure ops->get_string rather than ops->get_line. (null_ops): Wire null_get_string in place of null_get_line. (generic_get_line): Renamed to generic_get_string. (generic_get_string): Implement the limit and stop_char parameters. (get_line_limited_check): New static function. (stdio_ops): Wire in generic_get_string instead of generic_get_line. (tail_get_line): Replaced by tail_get_string. (tail_get_string): Call generic_get_string instead of generic_get_line, and pass the limit and stop_char arguments down. (tail_ops): Wire in tail_get_string instead of tail_get_line. (pipe_ops): Wire generic_get_string instead of generic_get_line. (dir_get_line): Renamed to dir_get_string. (dir_get_string): Use get_line_limited_check to defend against unhandled argument values. (dir_ops): Wire dir_get_string instead of dir_get_line. (string_in_get_line): Replaced by string_in_get_string. (string_in_get_string): Implement limit and stop_char parameters. (string_in_ops): Wire string_in_get_string instead of string_in_get_line. (strlist_in_get_line): Replaced with strlist_in_get_string. (strlist_in_get_string): Use get_line_limited_check to defend against unsupported arguments. 
(strlist_in_ops): Wire in strlist_in_get_string instead of strlist_in_get_line. (cat_get_line): Replaced by cat_get_string. (cat_get_string): Rather than recursing into the get_line public interface, we fetch the stream's get_string virtual and pass all arguments to it. (cat_stream_ops): Wire cat_get_string instead of cat_get_line. (record_adapter_get_line): Replaced by record_adapter_get_string. (record_adapter_get_string): Use get_line_limited_check to guard against unsupported arguments. (record_adapter_ops): Wire record_adapter_get_string instead of record_adapter_get_line. (get_line): Implement using get_string virtual now. We pass UINT_PTR_MAX as limit, which means no character limit, and '\n' as the delimiter for reading a line. (get_delimited_string): New function, which exposes the full semantics of the get_string virtual. (stream_init): Initialize get_string_s. Register get-delimited-string function. Use get_string_s symbol in registration of get-string. * strudel.c (strudel_get_line): Replaced by strudel_get_string. (strudel_get_string): Look up the get-string method and pass all arguments to it, encoded into Lisp values in the right way, nil indicating not present. (strudel_ops): Wire strudel_get_string in place of strudel_get_line. * parser.c (shadow_ops_template): Replace generic_get_line with generic_get_string. * buf.c (buf_strm_ops): Likewise. * socket.c (dgram_strm_ops): Likewise. * gzio.c (gzio_ops_rd): Likewise. * stdlib/stream-wrap.tl (stream-wrap get-line): Method replaced by (stream-wrap get-string). This calls get-delimited-string rather than get-line. * tests/018/streams.tl: New tests, mainly concerned with the new logic in the string input stream which has its own implementation of get_string with several cases. * txr.1: Document new get-delimited-string function, and the get-string method of the delegate stream, removing the documentation for removed get-line method.
*	quasistrings: support format notation.	Kaz Kylheku	2025-05-18	2	-15/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support to quasiliterals to have the inserted items formatted via a format conversion specifier, for example @~3,3a:abc is @abc modified by ~3,3a format conversion. When the inserted value is a list, the conversion is distributed over the elements individually. Otherwise it applies to the entire item. * eval.c (fmt_tostring, fmt_cat): Take additional format string argument. If it isn't nil, then do the string conversion via the fmt1 function rather than tostring. (do_format_field): Take format string argument, and pass down to fmt_cat. (format_field): Take format string argument and pass down to do_format_field. (fmt_simple, fmt_flex): Pass nil format string argument to fmt_tostring. (fmt_simple_fmstr, fmt_flex_fmstr): New static functions, like fmt_simple and fmt_flex but with format string arg. Used as run-time support for compiler-generated quasilit code for cases when format conversion specifier is present. (subst_vars): Extract the new format string from each variable item. Pass it down to fmt_tostring, format_field and fmt_cat. (eval_init): Register sys:fmt-simple-fmstr and sys:flex-fmstr intrinsics. * eval.h (format_field): Declaration updated. * lib.c (out_quasi_str_sym): Take format string argument. If it is present, output it after the @, followed by a colon, to reproduce the read notation. (out_quasi_str): Pass down the format string, taken from the fourth element of a sys:var item of the quasiliteral. For simple symbolic items, pass down nil. * match.c (tx_subst_vars): Pass nil as new argument of format_field. The output variables of the TXR Pattern language do not exhibit this feature. * parser.l (FMT): New pattern for matching the format string part. 
(grammar): The rule which recognizes @ in quasiliterals optionally scans the format notation, and turns it into a string attached to the token's semantic value, which is now of type val (see parser.y remarks). * parser.y (tokens): The '@' token's %type changed from lineno to val so it can carry the format string. (q_var): If format string is present in the @ symbol, then include it as the fourth element of the sys:var form. This rule handles braced items. (meta): We can no longer get the line number from the @ item, so we get it from n_expr. (quasi_item): Similar to q_var change here. This handles @ followed by unbraced items: symbols and other expressions. * stdlib/compiler.tl (expand-quasi-mods): Take format string argument. When the format string is present, then generate code which uses the new alternative run-time support functions, and passes them the format string as an argument. (expand-quasi-args): Extend the sys:var match to extract the format string if it is present. Pass it down to expand-quasi-mods. * stdlib/match.tl (expand-quasi-match): Add an error case diagnosing the situation when the program tries to use a format-conversion-endowed item in a quasilit pattern. * stream.[ch] (fmt1): New function. * tests/012/quasi.tl: New tests. * txr.1: Documented. * lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
*	compiler: fix unidiomatic if/cond combination.	Kaz Kylheku	2025-05-17	1	-13/+12
\| \| \| \| \| \| \|	* stdlib/compiler.tl (expand-quasi-mods): Fix unidiomatic if form which continues with a cond fallback. All I'm doing here is flattening (if a b (cond (c d) ...)) to (cond (a b) (c d) ...).
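The flattening described here, (if a b (cond (c d) ...)) to (cond (a b) (c d) ...), can be shown as a tiny rewrite over forms modeled as nested Python tuples. The representation and the name flatten_if_cond are illustrative assumptions; the commit makes this change by hand in expand-quasi-mods rather than with a rewriting function.

```python
def flatten_if_cond(form):
    """Rewrite ("if", a, b, ("cond", clause, ...)) as ("cond", (a, b), clause, ...).

    Any other form is returned unchanged.
    """
    if (isinstance(form, tuple) and len(form) == 4 and form[0] == "if"
            and isinstance(form[3], tuple) and form[3][:1] == ("cond",)):
        _, test, then, cond_form = form
        # The if test and consequent become the first cond clause.
        return ("cond", (test, then)) + cond_form[1:]
    return form
```

The two shapes evaluate the same way: the if test is tried first, and the cond clauses are reached only when it fails.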
*	infix: add missing ~ complement operator.	Kaz Kylheku	2025-05-16	1	-0/+1
\| \| \| \| \| \| \| \| \|	* stdlib/infix.tl (toplevel): New ~ operator, prefix at level 35, tied to lognot function. * tests/012/infix.tl: New tests. * txr.1: Documented.
*	hmac: use buf-xor-pattern function.	Kaz Kylheku	2025-05-11	1	-11/+9
\| \| \| \| \|	* stdlib/hmac.tl (hmac-impl): Eliminate xor loop by using buf-xor-pattern.
*	match: make @(require) work over args in lambda-match.	Kaz Kylheku	2025-05-10	1	-32/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The issue is that in a lambda-match, when we wrap @(require) around an argument match, it becomes a single pattern which matches the variadic arguments as a list. As a result, the function also becomes variadic, and a list is consed up for the match. A nested list is worth a thousand words: Before this change: 1> (expand '(lambda-match (@(require (@a @b) (= 5 (+ a b))) (cons a b)))) (lambda (. #:rest0014) (let (#:result0015) (or (let* (a b) (let ((#:g0017 (list* #:rest0014))) (if (consp #:g0017) (let ((#:g0018 (car #:g0017)) (#:g0019 (cdr #:g0017))) (sys:setq a #:g0018) (if (consp #:g0019) (let ((#:g0020 (car #:g0019)) (#:g0021 (cdr #:g0019))) (sys:setq b #:g0020) (if (equal #:g0021 '()) (if (and (= 5 (+ a b))) (progn (sys:setq #:result0015 (cons a b)) t)))))))))) #:result0015)) After this change: 1> (expand '(lambda-match (@(require (@a @b) (= 5 (+ a b))) (cons a b)))) (lambda (#:arg-00015 #:arg-10016) (let (#:result0017) (or (let* (b a) (sys:setq b #:arg-10016) (sys:setq a #:arg-00015) (if (and (= 5 (+ a b))) (progn (sys:setq #:result0017 (cons a b)) t)))) #:result0017)) @(require (@a @b)) now leads to a two-argument function. The guard condition is applied to the a and b variables extracted from the arguments rather than a list. * stdlib/match.tl (when-exprs-match): Macro removed. (struct lambda-clause): New slot, require-conds. (parse-lambda-match-clause): Recognize @(require ...) syntax, destructure it and recurse into the argument pattern it contains. Because the incoming syntax includes the clause body, for the recursive call we synthesize syntax consisting of the pattern extracted from the @(require), coupled with the clause body. 
When the recursive call gives us a lambda-clause structure, we then add the require guard expressions to it. So in essence the behavior is that we parse the (@(require argpat cond ...) body) as if it were (argpat body), and decorate the object with the extracted conditions. (expand-lambda-match): This now takes an env argument due to the fact that when-exprs-match was removed. when-exprs-match relied on its :env parameter to get the environment, needed for compile-match. Now expand-lambda-match directly calls compile-match, doing all the work that when-exprs-match helper was doing. Integrating that into expand-lambda-match allows us to add logic to detect that the lambda-clause structure has require conditions, and add the code as a guard to the compiled match using add-guards-post. (lambda-match, defun-match, :match): Pass environment argument to expand-lambda-match.
*	compiler: improvements in reporting form in diagnostics.	Kaz Kylheku	2025-05-09	2	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (ctx_name): Do not report just the car of the form. If the form starts with three symbols, list those; if two, list those. * stdlib/compiler.tl (expand-defun): Set the defun form as the macro ancestor of the lambda, rather than propagating source location info. Then diagnostics that previously referred to a lambda will correctly refer to the defun and, thanks to the above change in eval.c, include its name. * stdlib/pmac.t (define-param-expander): A similar change here. We make the define-param-expander form the macro ancestor of the lambda, so that diagnostics against the lambda will show that form.
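The naming rule described for ctx_name can be modeled roughly as follows. This is a Python sketch, not the C code in eval.c: forms are represented as Python lists, symbols as strings, and the fallback to the car when fewer than two leading symbols are found is an assumption drawn from the wording of the message.

```python
def ctx_name(form):
    """Return the leading symbols used to name a form in a diagnostic.

    Hypothetical model: symbols are Python strings, compound forms are lists.
    """
    lead = []
    for item in form[:3]:
        if not isinstance(item, str):
            break  # stop at the first non-symbol element
        lead.append(item)
    # Two or three leading symbols are listed; otherwise only the car is used.
    return lead if len(lead) >= 2 else form[:1]


print(ctx_name(["defun", "foo", ["x"], "x"]))
print(ctx_name(["let", [["a", 1]], "a"]))
```

Under this model a defun form is named by both `defun` and the function name, which is what lets a diagnostic against the lambda inside it show which function is involved.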
*	trace: new parameter macro :trace for simple static tracing	Kaz Kylheku	2025-05-09	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \|	Just add :trace to the head of the parameter list, and the function is traced. No dynamic turning on and off though. * autoload.c (trace_set_entries): Trigger autoload on :trace keyword. * stdlib/trace.tl (:trace): New parameter list expander. * txr.1: Documented.
*	dig: don't stop at form that has source location.	Kaz Kylheku	2025-05-09	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	* stdlib/error.tl (sys:dig): Do not stop tracing the macro ancestry of a form upon finding source location info. This can be counterproductive because there are situations in which intermediate forms receive the info via rlcp or rlcp_tree. We would like to get to the form that actually exists in that file at that line number.
*	quips: Unicode quip.	Kaz Kylheku	2025-05-08	1	-0/+1
\| \| \| \|	* stdlib/quips.tl (%quips%): New one about grapheme clusters.
*	match: new macros in the "each" family.	Kaz Kylheku	2025-05-08	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* autoload.c (match_set_entries): Trigger autoload on new symbols in function namespace: each-match-case, collect-match-cases, append-match-cases, keep-match-cases, each-match-case-product, collect-match-case-products, append-match-case-products, keep-match-case-products. * stdlib/match.tl (each-match-case, collect-match-cases, append-match-cases, keep-match-cases, each-match-case-product, collect-match-case-products, append-match-case-products, keep-match-case-products): New macros. * tests/011/patmatch.tl: New tests. * txr.1: Documented.
*	infix: bug: non-infix expressions conflated with infix.	Kaz Kylheku	2025-05-07	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem is that (parse-infix '(x < y < z)) and (parse-infix '(x < (< y z))) produce exactly the same parse and will be treated the same way. But we would like (< y z) to be left alone. The fix is to annotate all compound terms such that finish-infix will not recurse into them. * stdlib/infix.tl (parse-infix): When an operand is seen that is a compound expression X, it is turned into @X, in other words (sys:expr X). (finish-infix): Recognize (sys:expr X) and convert it into X without recursing into it. * tests/012/infix.tl: Update a number of test cases. * txr.1: Documented.
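The annotation idea can be illustrated with a toy model. This is a Python sketch under stated assumptions, not the code in stdlib/infix.tl: expressions are nested Python lists, the `finish` function only merges right-nested `<` chains as a stand-in for the real post-processing, and the names `MARK`, `mark_operand` and `finish` are hypothetical.

```python
MARK = "sys:expr"   # stand-in for the (sys:expr X) wrapper
CHAIN = {"<"}       # toy set of operators whose nested chains get merged


def mark_operand(term):
    """Parser side: wrap a compound operand so later passes leave it alone."""
    return [MARK, term] if isinstance(term, list) else term


def finish(expr):
    """Post-processing side: merge parsed chains, unwrap marked operands."""
    if not isinstance(expr, list) or not expr:
        return expr
    if expr[0] == MARK:
        return expr[1]  # unwrap without recursing into the operand
    op, *args = expr
    out = []
    for arg in args:
        # Decide whether to merge before unwrapping, so a marked operand
        # is never mistaken for part of a parsed chain.
        chained = op in CHAIN and isinstance(arg, list) and arg[:1] == [op]
        arg = finish(arg)
        if chained:
            out.extend(arg[1:])
        else:
            out.append(arg)
    return [op, *out]


parsed_chain = ["<", "x", ["<", "y", "z"]]                   # from x < y < z
explicit_call = ["<", "x", mark_operand(["<", "y", "z"])]    # from x < (< y z)
print(finish(parsed_chain))
print(finish(explicit_call))
```

Without the marker both inputs would be the same list, so no later pass could tell the written-out call from the parsed chain.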
*	infix: do not process square bracket forms.	Kaz Kylheku	2025-05-02	1	-6/+1
\| \| \| \| \| \| \| \| \|	* stdlib/infix.tl (infix-expand-hook): Do not process the interior of square bracket forms; just pass them through. Of course, square brackets continue to denote postfix array indexing. * txr.1: Updated and revised.
*	infix: phony infix requires 3 or more elements.	Kaz Kylheku	2025-05-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is undesirable to translate (1 fun) into (fun 1). Only cases similar to these patterns are translated, using list as an example: (1 list 2) -> (list 1 2) (1 list 2 3) -> (list 1 2 3) (1 list 2 + 3) -> (list 1 (+ 2 3)) (list 2 3) -> (list 2 3) (list 2 + 3) -> (list (+ 2 3)) * stdlib/infix.tl (infix-expand-hook): Restrict the phony infix case to three or more elements. * txr.1: Update phony infix case 1 as requiring three or more elements. Also add (1 list) example emphasizing that it's not recognized.
*	infix: superfix relational operators; better code.	Kaz Kylheku	2025-05-01	1	-18/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit extends infix with a post-processing step applied to the output of parse-infix which improves the code, and also implements math-like semantics for relational operators, which I'm calling superfix. Improving the code means that expressions like a + b + c, which turn into (+ (+ a b) c) are cleaned up into (+ a b c). This is done for all n-ary operators. superfix means that clumps of certain operators behave as a compound. For instance a < b <= c means (and (< a b) (<= b c)), where b is evaluated only once. Some relational operators are n-ary; for those we generate the n-ary expression, so that a = b = c < d becomes (and (= a b c) (< c d)). * stdlib/infix.tl (ifx-env): New special variable. We use this for communicating the macro environment down into the new finish-infix function, without having to pass a parameter through all the recursion. (eq, eql, equal, neq, neql, nequal, /=, <, >, <=, >=, less, greater, lequal, gequal): These operators become right associative, and are merged into a single precedence level. (finish-infix): New function which coalesces compounds of n-ary operations and converts the postfix chains of relational operators into the correct translation of superfix semantics. (infix-expand-hook): Call finish-infix on the output of parse-infix, taking care to bind the ifx-env variable to the environment we are given. * tests/012/infix.tl: New tests. * txr.1: Documented.
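The two post-processing effects described in this message can be modeled in a few lines. This is a Python sketch, not the finish-infix function itself: expressions are nested Python lists, only `+` and `*` are treated as flattenable, the set of n-ary relational operators is an assumption, and the single evaluation of a shared operand such as b is not modeled. The names `flatten_nary` and `superfix` are hypothetical.

```python
NARY = {"+", "*"}                        # assumed associative n-ary operators
NARY_REL = {"=", "<", ">", "<=", ">="}   # assumed n-ary relational operators


def flatten_nary(expr):
    """Collapse (+ (+ a b) c) into (+ a b c) for operators in NARY."""
    if not isinstance(expr, list) or not expr:
        return expr
    op, *args = expr
    args = [flatten_nary(a) for a in args]
    if op in NARY:
        merged = []
        for a in args:
            if isinstance(a, list) and a[:1] == [op]:
                merged.extend(a[1:])   # splice the nested call's arguments
            else:
                merged.append(a)
        args = merged
    return [op, *args]


def superfix(operands, ops):
    """Translate the chain operands[0] ops[0] operands[1] ... into an and form."""
    assert len(operands) == len(ops) + 1
    if not ops:
        return operands[0]
    terms = []
    i = 0
    while i < len(ops):
        j = i
        # A run of the same n-ary relational operator becomes one call.
        if ops[i] in NARY_REL:
            while j + 1 < len(ops) and ops[j + 1] == ops[i]:
                j += 1
        terms.append([ops[i], *operands[i:j + 2]])
        i = j + 1
    return terms[0] if len(terms) == 1 else ["and", *terms]


print(flatten_nary(["+", ["+", "a", "b"], "c"]))
print(superfix(["a", "b", "c"], ["<", "<="]))
print(superfix(["a", "b", "c", "d"], ["=", "=", "<"]))
```

Adjacent terms share their boundary operand, which is why c appears in both `(= a b c)` and `(< c d)` for the a = b = c < d example.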
*	infix: bug: (a = b) not parsing.	Kaz Kylheku	2025-04-29	1	-1/+2
\| \| \| \| \| \| \| \| \|	* stdlib.tl (detect-infix): Do not detect a prefix operator followed by argument, followed by anything whatsoever as being infix. The pair must be followed by nothing, or by a non-argument. * txr.1: Documented.
*	infix: bug: handle dotted function calls.	Kaz Kylheku	2025-04-28	1	-1/+1
\| \| \| \| \| \|	* stdlib/infix.tl (infix-expand-hook): In the phony prefix case, require rest to be a cons, rather than non-nil in order to invoke cdr.