Commit message | Author | Age | Files | Lines
* Version 301. [HEAD] [txr-301] [master] | Kaz Kylheku | 11 days | 7 | -1616/+1685
  * RELNOTES: Updated.
  * configure (txr_ver): Bumped version.
  * stdlib/ver.tl (lib-version): Bumped.
  * txr.1: Bumped version and date.
  * txr.vim, tl.vim: Regenerated.
  * protsym.c: Regenerated.
* compiler: bug: end instruction balance in tjmp/bjmp handling. | Kaz Kylheku | 12 days | 1 | -1/+4
  I was reading the compiler code and noticed that the code template for unwind-protect has two end instructions: one for the protected calculation and one for the unwinds.
  * stdlib/compiler.tl (convert-t-b-jumps): When a uwprot instruction is encountered, the balance must be incremented by 2, in order to skip past two end instructions. Without this we will end up with incorrect code for a block return that jumps out of a block in which there is a subsequent unwind-protect.
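The balance rule can be modeled in a few lines. This is a toy sketch in Python, not the TXR Lisp code: the tuple encoding of instructions, the `ENDS_OWNED` table and the function name are assumptions made for illustration. Only the fact that uwprot owns two end instructions comes from the commit message.

```python
# Toy model of a forward scan from a (bjmp <label>) pseudo-instruction.
# The instruction encoding is an assumption for illustration only.

# Number of "end" instructions owned by each opening instruction.
# uwprot owning two is stated in the commit; the others are assumed.
ENDS_OWNED = {"block": 1, "frame": 1, "uwprot": 2}

def ends_before_jump(code, start, label):
    """Count end instructions that close contexts open at code[start].

    Scans forward from the pseudo-jump at index `start` to `label`.
    Ends that belong to constructs opened after the jump are skipped
    by means of a balance counter.
    """
    balance = 0
    needed = 0
    for insn in code[start + 1:]:
        op = insn[0]
        if op == "label" and insn[1] == label:
            return needed
        if op in ENDS_OWNED:
            balance += ENDS_OWNED[op]
        elif op == "end":
            if balance > 0:
                balance -= 1   # closes something opened after the jump
            else:
                needed += 1    # closes a context that encloses the jump
    raise ValueError(f"label {label!r} not found")

code = [
    ("block",),        # block being returned from
    ("frame",),
    ("bjmp", "out"),   # pseudo-instruction: jump out of the block
    ("end",),          # closes the frame
    ("uwprot",),       # a later unwind-protect inside the block
    ("end",),          # end of the protected calculation
    ("end",),          # end of the unwind forms
    ("end",),          # closes the block
    ("label", "out"),
]
```

In this model, counting uwprot as owning a single end would misattribute one of its ends to the enclosing block and report three ends instead of two.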
* compiler: rewrite the rewrite function. | Kaz Kylheku | 12 days | 1 | -8/+10
  * stdlib/optimize.tl (rewrite): Rewrite. The function is begging to be rewritten. I mean, just look at its name! This is not just for shits and giggles. The rewrite makes it tail-recursive, so it makes a test case for TCO. Instead of using a list-builder object, it does the traditional thing: builds the output in reverse and then calls nreverse. The generated code is lovely.
  (rewrite-case): Pass a nil argument for the new accumulator parameter of rewrite.
* compiler: replace lazy list integers with iterables. | Kaz Kylheku | 12 days | 1 | -6/+6
  * stdlib/compiler.tl (compiler (get-datavec, get-symvec, comp-switch, comp-catch, comp-progn, comp-or)): Replace uses of the range function with much more memory-efficient integer and integer range iteration.
* doc: mention new block optimization. | Kaz Kylheku | 12 days | 1 | -0/+2
  * txr.1: Mention new block jump optimization for *opt-level* 2.
* compiler: only last case of tree-case is tail position. | Kaz Kylheku | 12 days | 1 | -2/+6
  * stdlib/compiler.tl (compiler comp-tree-case): Disable the tail position for all but the last case. The reason is that the case result values are checked for : fallthrough. It's a bad hack we should think about restricting to static cases.
* compiler: optimized block returns. | Kaz Kylheku | 13 days | 1 | -20/+49
  * stdlib/compiler.tl (blockinfo): New slots: label, oreg. These inform the compiler, when it is generating a jump out of a block, which register receives the block return value and where to jump.
  (env lookup-block): Lose the mark-used optional argument; this function is only called in one place, and that place will now decide whether to mark the block used after doing the lookup, not before.
  (env extend-block): Add the parameters label and oreg, to pass through these values to the block-info structure's new slots.
  (compiler): New slot: bjmp-occurs. We are going to use a pseudo instruction (bjmp ...) to denote a call out of a block similarly to how we used (tjmp ...) for a tail call. There will be a similar post-processing needed for them.
  (compiler comp-block): Pass oreg and lskip to extend-block, so block returns in the inner compilation have this info if they need to compile a direct jump out of the block. The *esc-blocks* variable needs to be set conditionally. If we are compiling a block*, then name is not a symbol but an expression evaluating to it, and so we don't extend *esc-blocks*; there can be no direct jumps out of a block with a dynamic name. (Or perhaps there could be with more complication and more work.) The case when the block is eliminated is more complicated now. Even though the block is eliminated, there can be jumps out of that block in the code. Those jumps expect the output register to be oreg and they expect the lskip label to be present, so we need to add these features to the bfrag.code and also adjust bfrag.oreg.
  (compiler comp-return-from): We use *esc-blocks* to decide whether to compile a jmp or a dynamic block return. In the one case, we must inform the compiler structure that a bjmp instruction is now present. In the other we must indicate that the block is used for a dynamic transfer, and so cannot be optimized away.
  (convert-tjmps): Rename to convert-t-b-jmps and handle the bjmp instruction. When a (bjmp <label>) is seen, we scan forward to an occurrence of <label>, similarly to how for a (tjmp <...>) we scan toward a (jend ...) function end. We insert any intervening end instructions before the bjmp and convert to jmp.
  (compiler optimize): Call convert-t-b-jmps if either the tjmp-occurs or bjmp-occurs flag is set. These flags could be merged into a single one, but let's leave it for now.
* compiler: block escape list. | Kaz Kylheku | 13 days | 1 | -1/+9
  The new dynamic variable *esc-blocks* keeps track of what blocks, in a given scope, may be abandoned by a simple jmp instruction instead of a full-blown dynamic return. We will not try to handle unwinding statically; any contour that needs unwinding cannot be jumped across.
  * stdlib/compiler.tl (*esc-blocks*): New special variable.
  (compile-in-toplevel): Clear *esc-blocks* for top-level compilations.
  (compiler (comp-unwind-protect, comp-catch, comp-lambda-impl, comp-prof)): These contexts cannot be abandoned by a jmp instruction: we cannot jump out of the middle of an unwind-protect, catch, lambda or prof. So we compile these with *esc-blocks* set to nil. New blocks entirely contained in these constructs can of course build up the list locally. E.g. within a function, of course we can have blocks that are abandoned by a simple jmp instruction. We just cannot jump out.
  (compiler comp-block): When compiling a block, we bind *esc-blocks* to a list consisting of the previous value, plus the new block name consed to the front.
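The scoping rule for *esc-blocks* can be illustrated with a small walker over nested tuples. This is a hedged sketch in Python, not the compiler's code: the tuple encoding of forms, the `BARRIERS` set and the function name `classify_returns` are all assumptions for illustration.

```python
# Forms are nested tuples: (operator, operand...). This encoding is an
# illustrative assumption, not TXR's representation.
BARRIERS = {"unwind-protect", "catch", "lambda", "prof"}

def classify_returns(form, esc_blocks=()):
    """Return (block-name, kind) for every return-from in `form`.

    kind is "jmp" when the named block is in the current escape list,
    otherwise "dynamic".
    """
    op, *args = form
    if op == "return-from":
        name = args[0]
        return [(name, "jmp" if name in esc_blocks else "dynamic")]
    if op == "block":
        name, *body = args
        inner = (name,) + esc_blocks   # new name added to the front
    elif op in BARRIERS:
        body, inner = args, ()         # cannot jump out of these
    else:
        body, inner = args, esc_blocks
    results = []
    for sub in body:
        if isinstance(sub, tuple):
            results += classify_returns(sub, inner)
    return results

form = ("block", "outer",
        ("return-from", "outer"),
        ("unwind-protect",
         ("return-from", "outer"),
         ("block", "inner", ("return-from", "inner"))))
```

Here the first return from `outer` can be a plain jump, the one inside the unwind-protect must stay dynamic, and the return from `inner` is again a plain jump because `inner` is entirely contained in the unwind-protect.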
* compiler: disable tail position in top level. | Kaz Kylheku | 13 days | 1 | -1/+2
  * stdlib/compiler.tl (compile-in-toplevel): We need to disable the tail position for compilations that take place in a top-level environment, like load-time forms.
* compiler: opt-tail-calls compiler option. | Kaz Kylheku | 13 days | 4 | -16/+48
  * stdlib/comp-opts.tl (compile-opts): New slot, opt-tail-calls.
  (%warning-syms%): Variable removed; this just lists the slots of compile-opts, which can be obtained with the slots function. Moreover, it is badly named, since there is now a non-diagnostic option.
  (*compile-opts*): Specify an initial value of : for opt-tail-calls. This means that *opt-level* controls whether it is enabled.
  * stdlib/compiler.tl: Update comment about optimization levels, since level 2 now does tail calls.
  (compiler comp-lambda-impl): Only enable tail calls if the option has the default value : and *opt-level* is at least two, or else if the option has value t.
  (with-compile-opts): Recognize : as valid option value. Don't refer to %warning-syms% but the slots of compile-opts. We use load-time to avoid calculating this repeatedly. The wording of the error message has to be adjusted since not all options are diagnostic any more.
  * autoload.c (compiler_set_entries): Add opt-tail-calls to slot-name autoload triggers for the compiler module.
  * txr.1: Mention tail calls and the opt-tail-calls option in the description of *opt-level* 2. Document opt-tail-calls under compile-opts section. We describe what tail calls are there and also adjust the wording since not all options are diagnostic. Describe the three-valued system for code generation options.
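The three-valued decision described for comp-lambda-impl reduces to a small predicate. The sketch below is a Python model under stated assumptions: the string ":" stands in for the TXR Lisp : symbol, True for t, None for nil, and the function name is hypothetical.

```python
DEFAULT = ":"   # stands in for the ':' symbol (assumed encoding)

def tail_calls_enabled(opt_tail_calls, opt_level):
    """Decide whether tail calls are compiled.

    ':'  -> follow the optimization level (enabled at level 2 and up)
    True -> forced on, regardless of level
    anything else (e.g. None for nil) -> off
    """
    if opt_tail_calls == DEFAULT:
        return opt_level >= 2
    return opt_tail_calls is True
```

The point of the third value is that an explicit t or nil always wins, while the default defers to *opt-level*.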
* compiler: unnecessary test in link-graph. | Kaz Kylheku | 13 days | 1 | -2/+1
  * stdlib/optimize.tl (basic-blocks link-graph): Remove unnecessary null test of nxbl inside a when that is conditional on it not being null.
* compiler: TCO: redundant code handling optionals. | Kaz Kylheku | 13 days | 1 | -8/+5
  * stdlib/compiler.tl (compiler comp-tail-call): Throw away the code generated by lambda-apply-transform for doing defaulting of optionals. The function already contains code to do that, right at the top where the tail call jumps. Defaulting optionals twice is not just a waste of time, but can evaluate twice the expressions which provide the default values.
* compiler/load: tlo version number increment. | Kaz Kylheku | 14 days | 3 | -2/+11
  The new tail call optimization relies on a fix to the VM's block instruction. This means that .tlo files in which TCO has been applied might not run correctly with TXR 300 or older. For that reason, we bump up the version number.
  * parser.c (read_file_common): Accept version 8.0 files, while continuing to allow 6 and 7 regardless of minor number. We get picky about minor number so that in the future we can use a minor number increment for backward compatible changes like this. We would only like to go to version 9 if the VM changes in such a way that we cannot load 8 any more. If we can still load 8.0, we would like to go to 8.1.
  * stdlib/compiler.tl (%tlo-ver%): Change to 8.0.
  * txr.1: Documented.
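The acceptance rule for compiled-file versions can be written as a predicate. This is a Python sketch of the rule as stated in the message, not the code in read_file_common; the function name is hypothetical, and rejecting every major version other than 6, 7 and 8 is an assumption.

```python
def tlo_version_loadable(major, minor):
    """Model of the stated rule: 6.x and 7.x load with any minor
    number; major version 8 loads only as 8.0."""
    if major in (6, 7):
        return True
    return major == 8 and minor == 0
```

Under this rule a hypothetical 8.1 file would be refused by this release, which is what leaves room to use a minor increment for the next backward-compatible change.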
* compiler: TCO code complete. | Kaz Kylheku | 14 days | 1 | -9/+69
  Fixed point iteration over stdlib works; tests pass.
  * stdlib/compiler.tl (tail-fun-info): Remove called slot. This is replaced by tjmp-occurs in the compiler. New slot, label. Identifies the backwards jump label for the tail call.
  (compiler): New slot, tjmp-occurs. If any tail call jump occurs we set this. Special post processing is required to insert some instructions before the jmp, in order to bail out of some nested blocks/frames.
  (compiler compile): Pass env in two parameter positions to comp-setq. Compile new setq-in-env compiler-only operator which recurses to comp-setq but allows the variable env to be independently specified.
  (compiler comp-setq): Take two environment parameters; one for resolving the value, and the other the variable. We need this capability for setting the function parameters before the tail call jump. The parameters are in an outer environment and may be shadowed.
  (compiler comp-setq-in-env): New method; parses compiler-generated (setq-in-env <var> <val> <env-obj>) syntax and calls comp-setq.
  (compiler comp-lambda-impl): If there is a tail context for this lambda, create the jump label for it and store it in the context. Also, we need the tfn.env to be nenv not env; env is the outside context of the lambda, without the parameters! Also, we inject the label into the top of the code.
  (compiler comp-fun-form): If we are in tail position, compile the function form via comp-tail-call. Turn off the tail position before recursing: the arguments of the tail call are not themselves in a tail position.
  (compiler comp-tail-call): New function. This is the workhorse. To generate the tail call, we create a fake lambda and use the lambda-apply-transform function in order to obtain the syntax for an immediate call. We then destructure the pieces, arrange them into the code we need and compile it in the correct environments to generate the fragment, adding the backwards jump to it. This requires a post-processing fixup.
  (compiler comp-for): Bugfix: the body of a for is not in tail position, only the result forms.
  (compiler comp-prof): Also disable tail position; we don't want code to jump out of a prof block.
  (convert-tjmps): New function. This has to analyze the code to find (tjmp ...) pseudo-instructions representing the backwards jumps of tail calls. Before these jmps, we have to insert end instructions, so that the tail call does not jump out of a nested context, such as a variable frame/dframe or block.
  (usr:compile): When an interpreted function object is compiled, or a symbol naming such an object, we set up the tail-fun-info structure for it, so that tail calls work, like we are already doing for defun and labels.
* vm: don't adjust ip in normal block return. | Kaz Kylheku | 2025-06-18 | 1 | -0/+1
  This issue does not come up in code generated by the current compiler. A normal block return is always the end instruction which immediately precedes the exit point. Therefore the "vm->ip = exitpt" does nothing in that case; it only changes vm->ip if the block is being abandoned by a block return.
  In the TCO work I'm doing, it's possible for a tail call to occur in a block. Prior to the tail call's jump, an end instruction will be executed to terminate the block. That end instruction is inserted by the compilation of the tail call. In this situation, the "vm->ip = exitpt" is wrong; it diverts control to the end of the block, skipping the jmp instruction.
  * vm.c (vm_block): If we are terminating normally---i.e., the vm_execute of the block returns---then adjust the exitpt variable to the current ip. Then the subsequent vm->ip = exitpt will do nothing; execution continues after the end instruction that terminated the block.
* compiler: bug: handling of block returns. | Kaz Kylheku | 2025-06-18 | 1 | -2/+2
  The handling of block returning instructions ret and abscsr is incorrect and causes miscompilations, such as infinite loops.
  * stdlib/optimize.tl (basic-blocks jump-ops): Remove ret and abscsr. These instructions will no longer terminate basic blocks.
  (basic-blocks link-graph): Remove the instructions from the pattern match here; they won't occur any more as the last instruction of a block. Note that they were being handled together as a jend: effectively as a signal indicating the brick wall end of control flow with no next basic block. This is what caused the problems.
* compiler: eliminate wasteful treg nulling. | Kaz Kylheku | 2025-06-17 | 2 | -7/+56
  * stdlib/optimize.tl (live-info): New slot, clobbered.
  (basic-block): New slot, cycle. Struct also inherits clobbered slot from live-info.
  (basic-block print): Print clobbered and cycle.
  (basic-blocks local-liveness): Calculate clobbered for each instruction and from that for the basic block.
  (basic-blocks identify-cycle-members): New method. Discovers which basic blocks are part of any cycle, and identifies them by setting the new cycle slot to t.
  (basic-blocks do-peephole-block): New local functions here for determining whether a register has been clobbered before the first instruction, either in the same basic block or any ancestors. Only works when the block is not part of a cycle. We add a peephole pattern matching move instructions that set tregs to (t 0)/nil. When we are not in a cycle block, and the treg has not previously been clobbered, we know it is clean: it still has the initial nil value set by the VM and we can remove the instruction.
  * stdlib/compiler.tl (compiler optimize): Call the identify-cycle-members method before peephole.
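One straightforward way to find the basic blocks that lie on a cycle is to test, for each block, whether it can reach itself. The Python sketch below shows that idea only; it is not the algorithm used by identify-cycle-members, and the graph encoding and function name are assumptions.

```python
def cycle_members(successors):
    """Return the set of nodes that lie on at least one cycle.

    `successors` maps each basic-block id to the ids it can transfer
    control to. A node is on a cycle exactly when it is reachable
    from itself. Simple and quadratic in the worst case.
    """
    def reachable_from(start):
        seen = set()
        stack = list(successors.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(successors.get(node, ()))
        return seen

    return {n for n in successors if n in reachable_from(n)}

graph = {"a": ["b"], "b": ["c", "d"], "c": ["b"], "d": []}
```

In `graph`, blocks `b` and `c` form a loop, so a register-nulling move in either of them could run more than once and would not qualify for removal under the rule described above.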
* compiler: don't null tregs before closure. | Kaz Kylheku | 2025-06-17 | 1 | -3/+5
  * stdlib/compiler.tl (compiler eliminate-frame): New optional no-tregs parameter. If true, it disables the generation of code to null the tregs.
  (compiler comp-lambda-impl): Pass t to no-tregs parameter of eliminate-frame to disable the nulling. It is not required for a lambda, which executes with fresh t-registers, implicitly initialized to nil. The presence of these instructions prevents the when-match pattern from matching, which expects the instruction sequence to start with a close instruction, so that the effect of eliminate-frame is then lost. This is a latent bug exposed by the previous commit. We would have seen this previously in a lambda occurring inside a loop.
* compiler: remove loop-nest counter hack | Kaz Kylheku | 2025-06-16 | 1 | -10/+3
  The loop-nest counter in the compiler context is positive whenever a loop is being compiled. This informs the eliminate-frame function that a variable-binding block can be executed multiple times, and so when variables are converted to registers, those registers have to be explicitly initialized to nil (in order to bring about the semantics of fresh lexical variables being nil).
  In this patch, we get rid of the counter and just always generate the zero-initializations. They get well optimized away. The code is usually the same. Sometimes four bytes longer or shorter. I'm noticing smaller frame sizes across the board due to registers being eliminated.
  * stdlib/compiler.tl (compiler): Remove loop-nest slot.
  (compiler eliminate-frame): Unconditionally emit the mov instructions which set all the new tregs to nil (i.e. copy the value of nil register (t 0)).
  (comp-for): Do not increment and decrement the loop count.
* compiler: bug: register liveness of block insn. | Kaz Kylheku | 2025-06-16 | 1 | -1/+1
  * stdlib/optimize.tl (basic-blocks local-liveness): For the block instruction, we must mark the destination register as being defined. This is in spite of the block instruction not actually itself writing to it. The entire block as such defines the register; any part of the block could be interrupted by a nonlocal return to that block, which will then set that register and jump past the block. Nothing in the block should be doing anything with the register except for a final move into it at the end of the block and the end instruction which references it. If we don't use ref-def, it's possible for register renaming code to wrongly rename the destination register.
* Makefile: bad calculation of STDLIB_LATE_TLOS. | Kaz Kylheku | 2025-06-16 | 1 | -3/+2
  * Makefile (STDLIB_LATE_TLOS): Fix the incorrect filter-out expression. We want to subtract from all tlo's the early and middle ones.
* compiler: whitespace issue in optimizer. | Kaz Kylheku | 2025-06-16 | 1 | -1/+1
  * stdlib/optimize.tl (basic-blocks do-peephole-block): Fix a bad indentation of one line.
* compiler: rename local variable in optimizer. | Kaz Kylheku | 2025-06-16 | 1 | -7/+7
  * stdlib/optimize.tl (basic-blocks merge-jump-thunks): A local variable named bb is used for walking a list of basic blocks, shadowing the self object, which is also named bb. It should be called bl.
* compiler: bug: bad slot ref in optimizer. | Kaz Kylheku | 2025-06-16 | 1 | -1/+1
  * stdlib/optimize.tl (basic-blocks do-peephole-block): Update the links slot of the correct object, the basic block bl, not the basic blocks graph bb. This indicates that the code was never run hitherto. Some compiler changes I'm making revealed it.
* autoload: bug: not clearing *expand-hook* correctly. | Kaz Kylheku | 2025-06-16 | 1 | -1/+1
  * autoload.c (autoload_try): We must bind the symbol name expand_hook_s, not its value from the expand_hook macro. This bug causes an interaction with ifx. If a form expanded under ifx hits autoload, the infix expansion hook is in effect for that autoloaded file. Users of the compiled TXR will not experience any ill effects, but when compiled files are removed, triggering fallback on source code, bad things happen.
* compiler: missing wasteful register move elimination. | Kaz Kylheku | 2025-06-16 | 1 | -0/+2
  * stdlib/optimize.tl (basic-blocks do-peephole-block): Adding a case to remove a (mov X X) instruction, moving any register to itself. It's astonishing that this is missing. I'm seeing it happen in tail call cases now because when a tail call passes an unchanging argument, that becomes a self-assignment.
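The (mov X X) rule is about as small as a peephole pattern gets. A minimal Python model follows, assuming instructions are tuples of the form (opcode, destination, source); that encoding and the function name are illustrative, not the optimizer's.

```python
def drop_self_moves(code):
    """Remove mov instructions whose source and destination coincide."""
    return [insn for insn in code
            if not (insn[0] == "mov" and insn[1] == insn[2])]

# A tail call that passes parameter (t 3) through unchanged would yield
# the first instruction below (assumed register notation).
code = [
    ("mov", ("t", 3), ("t", 3)),   # self-assignment: removable
    ("mov", ("t", 4), ("t", 0)),   # real move: kept
    ("jmp", "top"),
]
```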
* compiler: forgotten not/null reductions in if. | Kaz Kylheku | 2025-06-16 | 1 | -0/+6
  * stdlib/compiler.tl (compiler comp-if): Recognize cases like (if (not <expr>) <then> <else>) and convert to (if <expr> <else> <then>). Also the test (true <expr>) is reduced to <expr>.
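The two reductions can be sketched on forms written as Python tuples. This is an illustration, not comp-if itself: the tuple encoding is assumed, treating null the same as not follows the commit subject, and applying the rules repeatedly to nested tests is an assumption of this sketch.

```python
def reduce_if(form):
    """Simplify the test of an ("if", test, then, else) tuple.

    (if (not x) a b)  becomes  (if x b a)
    (if (true x) a b) becomes  (if x a b)
    """
    op, test, then, els = form
    if op != "if":
        raise ValueError("expected an if form")
    while isinstance(test, tuple) and len(test) == 2:
        if test[0] in ("not", "null"):
            test, then, els = test[1], els, then   # swap the branches
        elif test[0] == "true":
            test = test[1]                         # drop the wrapper
        else:
            break
    return ("if", test, then, els)
```

Swapping the branches removes the negation from the generated code instead of computing a boolean and then testing it.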
* hash: unused variables. | Kaz Kylheku | 2025-06-15 | 1 | -2/+0
  * hash.c (hash_isec): Remove unused variables h1 and h2.
* compiler: value is not optional in fbind/lbind. | Kaz Kylheku | 2025-06-15 | 1 | -1/+1
  This should have been part of the May 26, 2025 commit d70b55a0023eda8d776f18d224e4487f5e0d484e.
  * stdlib/compiler.tl (compiler comp-fbind): The form is not optional in fbind/lbind bindings; the syntax is (sym form); we don't have to use optional binding syntax.
* compiler: prepare tail call identification context. | Kaz Kylheku | 2025-06-15 | 1 | -2/+27
  * stdlib/compiler.tl (tail-fun-info): New struct type. The *tail-fun* special will be bound to instances of this.
  (compiler compile): Handle sys:rt-defun specially, via new comp-rt-defun.
  (compiler comp-return-from): Adjustment here; *tail-fun* does not carry the name, but a context structure with a name slot.
  (compiler comp-fbind): When compiling lbind, and thus potentially recursive functions, bind *tail-fun* to a new tail-fun-info context object carrying the name and lambda function. The env will be filled in later, during the compilation of the lambda.
  (compiler comp-lambda-impl): When compiling exactly that lambda expression that is indicated by the *tail-fun* structure, store the parameter environment object into that structure, and also bind *tail-pos* to indicate that the body of the lambda is in the tail position.
  (compiler comp-rt-defun): New method, which destructures the (sys:rt-defun ...) call to extract the name and lambda, and uses those to wrap a tail-fun-info context around the compilation, similarly to what is done for local functions in comp-fbind.
* compiler: tidiness issue in top dispatcher. | Kaz Kylheku | 2025-06-15 | 1 | -4/+4
  * stdlib/compiler.tl (compiler compile): Move the compiler-let case into the "compiler-only special operators" group. Consolidate the group of specially handled functions.
* compiler: immediately called lambda: code gen tweak. | Kaz Kylheku | 2025-06-15 | 1 | -11/+11
  This patch addresses some irregularities in the output of lambda-apply-transform, to make its output easier to destructure and use in tail recursion logic, in which the inner bindings will be turned into assignments of existing variables.
  * stdlib/compiler.tl (lambda-apply-transform): Move the binding of the al-val gensym from the inner let* block to the outer let/let where other gensyms are bound. Replace the ign-1 and ign-2 temporaries by a single gensym. Ensure that this gensym is bound.
* ffi: remove dud elements array of ffi types. | Kaz Kylheku | 2025-06-14 | 1 | -27/+0
  We compute an array of ffi_type for aggregates, but never actually use it for anything; we don't give it to libffi. Let's get rid of it.
  * ffi.c (struct txr_ffi_type): Remove elements member.
  (ffi_type_struct_destroy_op): Remove freeing of elements and assignment to zero.
  (ffi_struct_calcft, ffi_union_calcft, ffi_array_calcft): Remove allocation and calculation of elements.
  (make_ffi_type_struct, make_ffi_type_union): No need to free elements when we are replacing the existing type.
* lib: optimize set functions with hashes. | Kaz Kylheku | 2025-06-13 | 2 | -10/+67
  * lib.c (diff, symdiff, isec, isecp, uni): When both sequences have 50 items or more, abandon the current approach, build hash tables and use a hash operation. This avoids impractically quadratic behavior on large inputs.
  * txr.1: Remove wording which states that the diff implementation de facto preserves orders of items from the left argument, like the obsolete set-diff function. Adjusted other wording.
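The size-based switch can be illustrated for intersection. This Python sketch is not the lib.c implementation: it assumes hashable items, uses a Python set where TXR builds hash tables, and happens to preserve left-argument order, which the message indicates the real functions no longer promise. Only the threshold of 50 is taken from the commit.

```python
HASH_THRESHOLD = 50   # threshold stated in the commit message

def isec(left, right):
    """Items of `left` that also occur in `right`."""
    if len(left) >= HASH_THRESHOLD and len(right) >= HASH_THRESHOLD:
        lookup = set(right)                   # one pass to build the table
        return [x for x in left if x in lookup]
    return [x for x in left if x in right]    # nested scan, fine when small
```

For two inputs of n items each, the nested scan costs on the order of n squared comparisons, while the hashed path costs on the order of n table operations, so the switch only matters once both inputs are large.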
* hash: new functions hash-seq and hash-isecp. | Kaz Kylheku | 2025-06-13 | 4 | -3/+91
  * hash.c (hash_seq, hash_isecp): New functions.
  (hash_init): hash-seq and hash-isecp intrinsics registered.
  * hash.h (hash_seq, hash_isecp): Declared.
  * tests/010/hash.tl: New tests.
  * txr.1: Documented.
* compiler: track tail positions. | Kaz Kylheku | 2025-06-13 | 1 | -30/+53
  * stdlib/compiler.tl (ntp): New macro.
  (*tail-pos*, *tail-fun*): New special variables.
  (compiler (comp-setq, comp-lisp1-setq, comp-setqf, comp-if, comp-ift, comp-switch, comp-unwind-protect, comp-block, comp-catch, comp-let, comp-fbind, comp-lambda-impl, comp-fun, comp-for)): Identify non-tail-position expressions and turn off the tail position flag for recursing over those.
  (compiler comp-return-from): The returned expression is in the tail position if this is the block for the current function, otherwise not.
  (compiler (comp-progn, comp-or)): Positions other than the last are non-tail.
  (compiler comp-prog1): Nothing is tail position in a prog1 that has two or more expressions.
  (usr:compile-toplevel): For a new compile job, bind *tail-pos* to nil. There is no tail position until we are compiling a named function (not yet implemented).
* listener: ignore_eof_count must be volatile. | Kaz Kylheku | 2025-06-03 | 1 | -1/+1
  * parser.c (repl): Due to the longjmp-like non-local control transfers taking place, ignore_eof_count must be volatile. The reason is that we change it after saving the context, and then examine it after catching an exception. I'm seeing it have a bad value after an exception is caught, resulting in the "** EOF ignored by user preference" message even though I configured an integer value.
* listener: bugfix: evaluate *listener-ignore-eof* after rcfile. | Kaz Kylheku | 2025-06-03 | 1 | -1/+3
  * parser.c (repl): Our first sampling of *listener-ignore-eof* must occur after we load the rcfile, where it is typically configured, otherwise we pick up a nil value. If Ctrl-D is used on the very first command of a session, TXR will then quit in spite of the user having configured the variable.
* infix: no phony infix over lambdas and such. | Kaz Kylheku | 2025-06-03 | 1 | -1/+2
  * stdlib/infix.tl (funp): Do not recognize list forms as functions, such as lambda expressions or (meth ...) syntax. It causes surprisingly wrong transformations.
* listener: new *listener-ignore-eof* variable. | Kaz Kylheku | 2025-06-03 | 2 | -3/+56
  * parser.c (listener_ignore_eof_s): New symbol variable.
  (repl): Copy the value of *listener-ignore-eof* into a local variable, which is reloaded after each command evaluation. On EOF, handle the cases involving the variable: positive integers count down, any other integer values quit, any non-nil value prevents quitting.
  (parse_init): Initialize listener_ignore_eof_s with interned symbol, and register the *listener-ignore-eof* variable.
  * txr.1: Documented.
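The EOF cases listed for repl can be modeled as a small decision function. This is a Python sketch under stated assumptions, not the parser.c logic: None stands for nil and True for t, the function name is hypothetical, and the exact count at which the real listener finally quits is not given in the message, so the off-by-one behavior here is a guess.

```python
def on_eof(setting):
    """Decide what to do when the listener reads EOF.

    Returns (quit, new_setting), where `setting` models the local copy
    of *listener-ignore-eof* that the listener keeps between commands.
    """
    if setting is None:                    # nil: EOF quits
        return True, setting
    if isinstance(setting, int) and not isinstance(setting, bool):
        if setting > 0:
            return False, setting - 1      # positive integer: count down
        return True, setting               # any other integer: quit
    return False, setting                  # other non-nil value: never quit
```

Because the local copy is reloaded after each command evaluation, the countdown in this model only runs across consecutive EOFs.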
* listener: deodorize EOF handling. | Kaz Kylheku | 2025-06-03 | 2 | -4/+6
  This is motivated by seeing a poor behavior, whose manifestation is platform dependent. In the listener if we run, say (get-json) and hit Ctrl-D, then after the get-json function reports failure, the listener will quit as if it received EOF. On older glibc/Linux systems, the listener does not experience EOF. Furthermore, in the EOF situation, this misleading diagnostic is seen: ** error reading interactive input.
  * parser.c (repl): Call clear_error on in_stream just before calling linenoise. This gets rid of any sticky EOF condition left behind by an input operation. On POSIX systems, if you use stdin to read from a terminal and receive EOF, you must clearerr(stdin) before continuing to read from the terminal. Otherwise input operations on the stream can just return the cached error indication without attempting to perform any input on the file descriptor. Somehow we are getting away without doing this on older systems like Ubuntu 18. Maybe something changed in the glibc stdio implementation.
  * linenoise/linenoise.c (complete_line): Don't directly return -1 on EOF, just set the stop = 1 variable, so the WEOF value will be returned, similarly to how it is done in history_search.
  (history_search): Set the error variable on EOF.
  (edit): Set the error variable on EOF.
* streams: new get-buf function. | Kaz Kylheku | 2025-06-03 | 4 | -26/+94
  * stream.c (get_buf): New function.
  (stream_init): Register get-buf intrinsic.
  * stream.h (get_buf): Declared.
  * stdlib/getput.tl (sys:get-buf-common): Function removed.
  (file-get-buf, command-get-buf, map-command-buf, map-process-buf): Use get-buf instead of sys:get-buf-common.
  * txr.1: Documented.
* *-get-buf: bug in skipping non-seekable streams. | Kaz Kylheku | 2025-06-02 | 2 | -4/+11
  * stdlib/getput.tl (sys:get-buf-common): Fix incorrect algorithm for skipping forward in a stream that doesn't support seek-stream. The problem is that when the seek amount is greater than 4096, it does nothing but 4096 byte reads, which will overshoot the target position if it isn't divisible by 4096. The last read must be adjusted to the remaining seek amount.
  * tests/018/getput.tl: New test case using property-based approach to show that the read-based skip in get-buf-common fetches the same data as the seek-based skip.
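The corrected skipping algorithm is easy to state in code. The following Python sketch models it on a byte stream; it is not the TXR Lisp code in getput.tl, and the function name is hypothetical. The 4096-byte chunk size is the one named in the commit.

```python
import io

CHUNK = 4096

def skip_by_reading(stream, amount):
    """Advance `stream` by `amount` bytes using reads only.

    Each read is capped at the number of bytes still to be skipped, so
    the final read cannot overshoot the target position. Returns the
    number of bytes actually skipped, which is smaller than `amount`
    if the stream ends early.
    """
    remaining = amount
    while remaining > 0:
        data = stream.read(min(CHUNK, remaining))
        if not data:
            break                  # end of stream
        remaining -= len(data)
    return amount - remaining

# Comparison in the spirit of the new test case: a read-based skip and
# a seek-based skip should leave the same data still to be read.
data = bytes(i % 251 for i in range(10000))
by_read = io.BytesIO(data)
skip_by_reading(by_read, 5000)
by_seek = io.BytesIO(data)
by_seek.seek(5000)
```

With the buggy approach, skipping 5000 bytes would perform two full 4096-byte reads and land at offset 8192.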
* json: *read-bad-json* allows single quoted strings. | Kaz Kylheku | 2025-06-02 | 8 | -5302/+5366
  It happens in the wild that sometimes JSON-like data must be processed in which strings are delimited by single quotes rather than double quotes. The data is valid Javascript syntax, so JS people don't even notice anything is wrong.
  * parser.c (struct parser): New member, json_quote_char. This helps the scanner keep track of which closing character it is expecting.
  * parser.c (parser_common_init): Initialize json_quote_char.
  * parser.l (JPUNC, NJPUNC): Include single quote (ASCII apostrophe) in JPUNC, and exclude it from NJPUNC.
  (grammar): When we see either a double quote or single quote in JLIT mode, we return it as itself if that character is the delimiter for the currently scanned string. Otherwise we return it as a LITCHAR, which gets accumulated by the parser into the current string. Include the double. When we see either a double quote or single quote, we transition to the JLIT state. The parser will check whether a single quoted literal is allowed. We allow \' escapes in a single-quote literal unconditionally. We allow them in a double-quoted literal also, but only in read bad JSON mode.
  * parser.y (json_val): Recognize single-quoted literals, but generate an error unless in read bad JSON mode. Also, error production for unterminated single quote only diagnosed that way in read bad JSON mode, otherwise rejected as invalid JSON.
  * tests/010/json.tl: New tests.
  * txr.1: Documented.
  * lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
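The role of the remembered quote character can be shown with a hand-written scanner. This Python sketch is only a model of the rules stated in the message, not the flex/yacc implementation: the function name and its `bad_json` flag are hypothetical, and only the escapes for backslash, double quote and single quote are modeled.

```python
def scan_json_string(text, bad_json=False):
    """Scan one string literal at the start of `text`; return its contents."""
    if not text or text[0] not in ('"', "'"):
        raise ValueError("not a string literal")
    quote = text[0]                  # the closing character we expect
    if quote == "'" and not bad_json:
        raise ValueError("single-quoted string needs bad-JSON mode")
    out = []
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == quote:              # matching delimiter ends the literal
            return "".join(out)
        if ch == "\\" and i + 1 < len(text):
            esc = text[i + 1]
            if esc == "'" and quote == '"' and not bad_json:
                raise ValueError("\\' not allowed in a strict double-quoted string")
            if esc not in ("\\", '"', "'"):
                raise ValueError("escape not modeled in this sketch")
            out.append(esc)
            i += 2
            continue
        out.append(ch)               # includes the other quote character
        i += 1
    raise ValueError("unterminated string literal")
```

The non-delimiting quote character is accumulated as ordinary text, which is what lets a double-quoted string contain an apostrophe and a single-quoted one contain double quotes.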
* streams: get-string for string byte input stream. | Kaz Kylheku | 2025-06-02 | 1 | -1/+2
  * stream.c (byte_in_ops): Wire get_string operation to generic_get_string, giving the stream get-line and get-string support.
* Version 300. [txr-300] | Kaz Kylheku | 2025-05-31 | 7 | -1409/+1552
  * RELNOTES: Updated.
  * configure (txr_ver): Bumped version.
  * stdlib/ver.tl (lib-version): Bumped.
  * txr.1: Bumped version and date.
  * txr.vim, tl.vim: Regenerated.
  * protsym.c: Regenerated.
* streams: regression: gc issue in get_string_from_stream. | Kaz Kylheku | 2025-05-31 | 1 | -1/+1
  * stream.c (get_string_from_stream_common): The so->buf = 0 assignment must precede the call to string_own(buf), because the string out stream object may already be garbage, and the string_own call will reclaim it. If we don't null out the buffer, the string will get ownership of a freed buffer. This reproduced in the CSV test case on MacOS Lion, 32 bit x86.
* parser: scan buflit characters faster. | Kaz Kylheku | 2025-05-30 | 2 | -3907/+3935
  * parser.l (BUFLIT): Instead of scanning a hexadecimal digit and using strtol, we scan three separate cases, and do a very simple subtraction in each one. TXR Lisp .tlo files are full of large buffer literals, so this affects loading speed.
  * lex.yy.c.shipped: Regenerated.
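The three-case conversion amounts to a range check followed by one subtraction. The Python sketch below models it; in the scanner the three cases are separate lexer rules, which this model collapses into branches. The function names are hypothetical, and the helper's insistence on an even number of digits is an assumption of the sketch, not a statement about TXR's buffer literal syntax.

```python
def hex_digit_value(ch):
    """Value of one hexadecimal digit via a range check and subtraction."""
    code = ord(ch)
    if ord("0") <= code <= ord("9"):
        return code - ord("0")
    if ord("a") <= code <= ord("f"):
        return code - ord("a") + 10
    if ord("A") <= code <= ord("F"):
        return code - ord("A") + 10
    raise ValueError(f"not a hex digit: {ch!r}")

def decode_hex_pairs(hex_text):
    """Combine consecutive digit pairs into bytes (helper for the demo)."""
    digits = [hex_digit_value(c) for c in hex_text]
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    return bytes(hi * 16 + lo for hi, lo in zip(digits[::2], digits[1::2]))
```

Each digit costs a couple of comparisons and one subtraction, with no call into a general-purpose number parser.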
* parser: two fixes in buf literals. | Kaz Kylheku | 2025-05-30 | 2 | -4/+4
  * parser.y (buflit_items): Here we have length_buf($$) referring to the semantic result value of the rule. It should be referring to $1. It works because the Bison-generated code runs the $$ = $1 logic before all rules.
  (buflit_item): Let's use num_fast rather than num to produce the byte value since.
  * y.tab.c.shipped: Regenerated.
* buf: alternative constructor with C type arguments. | Kaz Kylheku | 2025-05-30 | 7 | -19/+32
  Many functions call make_buf, having to convert C types to the Lisp arguments using num or unum. Those conversions immediately get undone inside make_buf, and are subject to a wasteful type check.
  * buf.c (make_buf_fast): New function.
  * buf.h (make_buf): Misleading parameter renamed.
  (make_buf_fast): Declared.
  (sub_buf, buf_list, make_buf_stream, buf_fash, buf_and, buf_trunc): Replace make_buf with make_buf_fast.
  * lib.c (seq_build_init): Likewise.
  * ffi.c (ffi_put): Likewise.
  * stream.c (get_line_as_buf, iobuf_get): Likewise.
  * parser.y (buflit, buflit_items): Likewise.
  * y.tab.c.shipped: Regenerated.