summaryrefslogtreecommitdiffstats
path: root/stdlib/compiler.tl
Commit message (Collapse)AuthorAgeFilesLines
* compiler: bug: end instruction balance in tjmp/bjmp handling.Kaz Kylheku14 days1-1/+4
| | | | | | | | | | | | | I was reading the compiler code and noticed that the code template for unwind protect has two end instructions: one for the protected calculation and one for the unwinds. * stdlib/compiler.tl (convert-t-b-jumps): When a uwprot instruction is encountered, the balance must be incremented by 2, in order to skip past two end instructions. Without this we will end up with incorrect code for a block return that jumps out of a block, in which there is a subsequent unwind-protect.
* compiler: replace lazy list integers with iterables.Kaz Kylheku2025-06-201-6/+6
| | | | | | | * stdlib/compiler.tl (compiler (get-datavec, get-symvec, comp-switch, comp-catch, comp-progn, comp-or)): Replace uses of the range function with much more memory efficient integer and integer range iteration.
* compiler: only last case of tree-case is tail position.Kaz Kylheku2025-06-201-2/+6
| | | | | | | | * stdlib/compiler.tl (compiler comp-tree-case): Disable the tail position for all but the last cases. The reason is that the case result values are checked for : fallthrough. It's a bad hack we should think about restricting to static cases.
* compiler: optimized block returns.Kaz Kylheku2025-06-191-20/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stdlib/compiler.t (blockinfo): New slots: label, oreg. These inform the compiler, when it is generating a jump out of a block, what register to put in the block return value and where to jump. (env lookup-block): Lose the mark-used optional argument; this function is only called in one place, and that place will now decide whether to mark the block used after doing the lookup, not before. (env extend-block): Add the parameters label and oreg, to pass through these values to the block-info structure's new slots. (compiler): New slot: bjmp-occurs. We are going to use a pseudo instruction (bjmp ...) to denote a call out of a block similarly to how we used (tjmp ...) for a tail call. There will be a similar post-processing needed for them. (compiler comp-block): Pass oreg and lskip to extend-block, so block returns in the inner compilation have this info if they need to compile a direct jump out of the block. The *esc-blocks* needs to be set conditionally. If we are compiling a block*, then name is not a symbol but an expression evaluating to it, and so we don't extend *esc-blocks*; there can be no direct jumps out of a block with a dynamic name. (Or perhaps there could be with more complication and more work). The case when the block is eliminated is more complicated now. Even though the block is eliminated, there can be jumps out of that block in the code. Those jumps expect the output register to be oreg and they expect the lskip label to be present, so we need to add these features to the bfrag.code and also adjust bfrag.oreg. (compiler comp-return-from): We use *esc-blocks* to decide whether to compile a jmp or a dynamic block return. In the one case, we must inform the compiler structure that a bjmp instruction is now present. In the other we must indicate that the block is used for a dynamic transfer, and so cannot be optimized away. (convert-tjmps): Rename to convert-t-b-jmps and handle the bjmp instruction. When a (bjmp <label>) is seen, we scan forward to an occurrence of <label>, similarly to how for a (tjmp <...>) we scan toward a (jend ...) function end. We insert any intervening end instructions before the bjmp and convert to jmp. (compiler optimize): Call convert-t-b-jmps if either the tjmp-occurs or bjmp-occurs flag is set. These flags could be merged into a single one, but let's leave it for now.
* compiler: block escape list.Kaz Kylheku2025-06-191-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | The new dynamic variable *esc-blocks* keeps track of what blocks, in a given scope, may be abandoned by a simple jmp instruction instead of a full blown dynamic return. We will not try to handle unwinding statically; any contour that needs unwinding cannot be jumped across. * stdlib/compiler.tl (*esc-blocks*): New special variable. (compile-in-top-level): Clear *esc-blocks* for top-level compilations. (compiler (comp-unwind-protect, comp-catch, comp-lambda-impl, comp-prof): These contexts cannot be abandoned by a jmp instruction: we cannot jump out of the middle of an unwind-protect, catch, lambda or prof. So we compile these with *esc-blocks* set to nil. New blocks entirely contained in these constructs can of course build up the list locally. E.g. within a function, of course we can have blocks that are abandoned by a simple jmp instruction. Just we cannot jump out. (compiler comp-block): When compiling a block, we bind *esc-blocks* to a list consisting of the previous value, plus the new block name consed to the front.
* compiler: disable tail position in top level.Kaz Kylheku2025-06-191-1/+2
| | | | | | * stdlib/compiler.tl (compile-in-toplevel): We need to disable the tail position for compilations that take place in a top-level environment, like load-time forms.
* compiler: opt-tail-calls compiler option.Kaz Kylheku2025-06-191-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stdlib/comp-opts.tl (compile-opts): New slot, opt-tail-calls. (%warning-syms%): Variable removed; this just lists the slots of compile-opts, which can be obtained with the slots function. Moreover, it is badly named, since there is now a non-diagnostic option. (*compile-opts*): Specify an initial value of : for opt-tail-calls. This means that *opt-level* controls whether it is enabled. * stdlib/compiler.tl: Update comment about optimization levels, since level 2 now does tail calls. (compiler comp-lambda-impl): Only enable tail calls if the option has the default value : and *opt-level* is at least two, or else if the option has value t. (with-compile-opts): Recognize : as valid option value. Don't refer to %warning-syms% but the slots of compile-opts. We use load-time to avoid calculating this repeatedly. The wording of the error message has to be adjusted since not all options are diagnosic any more. * autoload.c (compiler_set_entries): Add opt-tail-calls to slot-name autoload triggers for the compiler module. * txr.1: Mention tail calls and the opt-tail-calls option in the description of *opt-level* 2. Document opt-tail-calls under compile-opts section. We describe what tail calls are there and also adjust the wording since not all options diagnostic. Describe the three-valued system for code generation options.
* compiler: TCO: redundant code handling optionals.Kaz Kylheku2025-06-191-8/+5
| | | | | | | | | | * stdlib/compiler.tl (compiler comp-tail-call): Throw away the code generated by lambda-apply-transform for doing defaulting of optionals. The function already contains code to do that, right at the top where the tail call jumps. defaulting optionals twice is not just a waste of time, but can evaluate twice the expressions which provide the default values.
* compiler/load: tlo version number increment,Kaz Kylheku2025-06-191-1/+1
| | | | | | | | | | | | | | | | | | | | The new tail call optimization relies on a fix to the VM's block instruction. This means that .tlo files in which TCO has been applied might not run correctly with TXR 300 or older. For that reason, we bump up the version number. * parser.c (read_file_common): Accept version 8.0 files, while continuing to allow 6 and 7 regardless of minor number. We get picky about minor number so that in the future we can use a a minor number increment for backward compatible changes like this. We would only like to go to version 9 if the VM changes in such a way that we cannot load 8 any more. If we can still load 8.0, we would like to go to 8.1. * stdlib/compiler.tl (%tlo-ver%): Change to 8.0. * txr.1: Documented.
* compiler: TCO code complete.Kaz Kylheku2025-06-191-9/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixed point iteration over stdlib works; tests pass. * stdlib/compiler.tl (tail-fun-info): Remove called slot. This is replaced by tjmp-occurs in the compiler. New slot, label. Identifies the backwards jump label for the tail call. (compiler): New slot, tjmp-occurs. If any tail call jump occurs we set this. Special post processing is required to insert some instructions before the jmp, in order to bail out of some nested blocks/frames. (compiler compile): Pass env in two parmeter positions to comp-setq. Compile new setq-in-env compiler-only operator which recurses to comp-setq but allows the variable env to be independently specified. (compiler comp-setq): Take two environment parameters; one for resolving the value, and the other the variable. We need this capability for setting the function parameters in before the tail call jump. The parameters are in an outer environment and may be shadowed. (compiler comp-setq-in-env): New method; parses compiler- generated (setq-in-env <var> <val> <env-obj>) syntax and calls comp-setq. (compiler comp-lambda-impl): If there is a tail context for this lambda, create the jump label for it and store it in the context. Also, we need the tfn.env to be nenv not env; env is the outside context of the lambda, without the parameters! Also, we inject the label into the top of the code. (compiler comp-fun-form): If we are in tail position, compile the function form via comp-tail-call. Turn off the tail position before recursing: the arguments of the tail call are not themselves in a tail position. (compiler comp-tail-call): New function. This is the workhorse. To generate the tail call, we create a fake lambda and use the lambda-apply-transform-function in order to obtain the syntax for an immediate call. We then destructure the pieces, arrange them into the code we need and compile it in the correct environments to generate the fragment, adding the backwards jump to it. This requires a post-processing fixup. (compiler comp-for): Bugfix: the body of a for is not in tail position, only the result forms. (compiler comp-prof): Also disable tail position; we don't want code to jump out of a prof block. (convert-tjmps): New function. This has to analyze the code to find (tjmp ...) pseudo-instructions representing the backwards jumps of tail calls. Before these jmps, we have to insert end instructions, so that the tail call does not jump out of a nested context, such as a variable frame/dframe or block. (usr:compile): When an interpreted function object is compiled, or a symbol naming such an object, we set up the tail-fun-info structure for it, so that tail calls work, like we are already doing for defun and labels.
* compiler: eliminate wasteful treg nulling.Kaz Kylheku2025-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | * stdlib/optimize.tl (live-info): New slot, clobbered. (basic-block): New slot, cycle. Struct also inherits clobbered slot from live-info. (basic-block print): Print clobbered and cycle. (basic-blocks local-liveness): Calculate clobbered for each instruction and from that for the basic block. (basic-blocks identify-cycle-members): New method. Discovers which basic blocks are part of any cycle, and identifies them by setting the new cycle slot to t. (basic-blocks do-peephole-block): New local functions here for determining whether a register has been clobbered before the first instruction, either in the same basic block or any ancestors. Only works when the block is not part of a cycle. We add a peephole pattern matching move instructions that set tregs to (t 0)/nil. When we are not in a cycle block, and the treg has not previously been clobbered, we know it is clean: it still has the initial nil value set by the VM and we can remove the instruction. * stdlib/compiler.tl (compiler optimize): Call the identify-cycle-members method before peephole.
* compiler: don't null tregs before closure.Kaz Kylheku2025-06-171-3/+5
| | | | | | | | | | | | | | | | * stdlib/compiler.tl (compiler eliminate-frame): New optional no-tregs parameter. If true, it disables the generation of code to null the tregs. (compiler comp-lambda-impl): Pass t to no-regs parameter of eliminate-frame to disable the nulling. It is not required for a lambda which executes which fresh t-registers, implicitly initialized to nil. The presence of these instructions prevents the when-match pattern from matching, which expects the instruction sequence to start with a close instruction, so that the effect of eliminate-frame is then lost. This is a latent bug exposed by the previous commit. We would have seen this previously in a lambda occurring inside a loop.
* compiler: remove loop-nest counter hackKaz Kylheku2025-06-161-10/+3
| | | | | | | | | | | | | | | | | | | | | | The loop-nest counter in the compiler context is positive whenever a loop is being compiled. This informs the eliminate-frame function that a variable-binding block can be executed multiple times, and so when variables are converted to registers, those registers have to be explicitly initialized to nil (in order to bring about the semantics of fresh lexical variables being nil). In this patch, we get rid of the counter and just always generate the zero-initializations. They get well optimized away. The code is usually the same. Sometimes four bytes longer or shorter. I'm noticing smaller frame sizes across the board due to registers being eliminated. * stdlib/compiler.tl (compiler): Remove loop-nest slot. (compiler eliminate-frame): Unconditionally emit the mov instructions which set all the new tregs to nil (i.e. copy the value of nil register (t 0)). (comp-for): Do not increment and decrement the loop count.
* compiler: forgotten not/null reductions in if.Kaz Kylheku2025-06-161-0/+6
| | | | | | | * stdlib/compiler.tl (compiler comp-if): Recognize cases like (if (not <expr>) <then> <else>) and convert to (if <expr> <else> <then>). Also the test (true <expr>) is reduced to <expr>.
* compiler: value is no optional in fbind/lbind.Kaz Kylheku2025-06-151-1/+1
| | | | | | | | | This should hav been part of the May 26, 2025 commit d70b55a0023eda8d776f18d224e4487f5e0d484e. * stdlib/compiler.tl (compiler comp-fbind): The form is not optional in fbind/lbind bindings; the syntax is (sym form); we don't have to use optional binding syntax.
* compiler: prepare tail call identification context.Kaz Kylheku2025-06-151-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | * stdlib/compiler.tl (tail-fun-info): New struct type. The *tail-fun* special will be bound to instances of this. (compiler compile): Handle sys:rt-defun specially, via new comp-rt-defun. (compiler comp-return-from): Adjustment here; *tail-fun* does not carry the name, but a context structure with a name slot. (compiler comp-fbind): Whe compiling lbind, and thus potentially recursive functions, bind *tail-fun* to a new tail-fun-info context object carrying the name and lambda function. The env will be filled in later the compilation of the lambda. (compiler comp-lambda-impl): When compiling exactly that lambda expression that is indicated the *tail-fun* structure, store the parameter environment object into that structure, and also bind *tail-pos* to indicate that the body of the lambda is in the tail position. (compiler comp-rt-defun): New method, which destructures the (sys:rt-defun ...) call to extract the name and lambda, and uses those to wrap a tail-fun-info context around the compilation, similarly to what is done for local functions in comp-fbind.
* compiler: tidiness issue in top dispatcher.Kaz Kylheku2025-06-151-4/+4
| | | | | | | * stdlib/compiler.tl (compiler compile): Move the compiler-let case into the "compiler-only special operators" group. Consolidate the group of specially handled functions.
* compiler: immediately called lambda: code gen tweak.Kaz Kylheku2025-06-151-11/+11
| | | | | | | | | | | | | | This patch addresses some irregularities in the output of lambda-appply-transform, to make its output easier to destructure and use in tail recursion logic, in which the inner bindings will be turned into assignments of existing variables. * stdlib/compiler.tl (lambda-apply-transform): Move the binding of the al-val gensym from the inner let* block to the outer let/let where other gensyms are bound. Replace the ign-1 and ign-2 temporaries by a single gensym. Ensure that this gensym is bound.
* compiler: track tail positions.Kaz Kylheku2025-06-131-30/+53
| | | | | | | | | | | | | | | | | | | | | * stdlib/compiler.tl (ntp): New macro. (*tail-pos*, *tail-fun*): New special variables. (compiler (comp-setq, comp-lisp1-setq, comp-setqf, comp-if, comp-ift, comp-switch, comp-unwind-protect, comp-block, comp-catch, comp-let, comp-fbind, comp-lambda-impl, comp-fun, comp-for)): Identify non-tail-position expressions and turn off the tail position flag for recursing over those. (compiler comp-return-from): The returned expression is in the tail position if this is the block for the current function, otherwise not. (compiler (comp-progn, comp-or)): Positions other than the last are non-tail. (compiler comp-prog1): Nothing is tail position in a prog1 that has two or more expressions. (usr:compile-toplevel): For a new compile job, bind *tail-pos* to nil. There is no tail position until we are compiling a named function (not yet implemented).
* compiler: function bindings syntax cannot be atom.Kaz Kylheku2025-05-261-3/+2
| | | | | | | | | * stdlib/compiler.tl (compiler comp-fbind): We don't have to normalize the function binding syntax of a sys:fbind or sys:lbind. This code was copy and pated from (compiler comp-let). These bindings are always (name lambda) pairs and are machine-generated that way. If a name ocurred, it woudl not be correct to rewrite it to (name).
* quasistrings: support format notation.Kaz Kylheku2025-05-181-15/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support to quasiliterals to have the inserted items formatted via a format conversion specifier, for example @~3,3a:abc is @abc modified by ~3,3a format conversion. When the inserted value is a list, the conversion is distributed over the elements individually. Otherwise it applies to the entire item. * eval.c (fmt_tostring, fmt_cat): Take additional format string argument. If it isn't nil, then do the string conversion via the fmt1 function rather than tostring. (do_format_field): Take format string argument, and pass down to fmt_cat. (format_field); Take format string argument and pass down to do_format_field. (fmt_simple, fmt_flex): Pass nil format string argument to fmt_tostring. (fmt_simple_fmstr, fmt_flex_fmstr): New static functions, like fmt_simple and fmt_flex but with format string arg. Used as run-time support for compiler-generated quasilit code for cases when format conversion specifier is present. (subst_vars): Extract the new format string frome each variable item. Pass it down to fmt_tostring, format_field and fmt_cat. (eval_init): Register sys:fmt-simple-fmstr and sys:flex-fmstr intrinsics. * eval.h (format_field): Declaration updated. * lib.c (out_quasi_str_sym): Take format string argument. If it is present, output it after the @, followed by a colon, to reproduce the read notation. (out_quasi_str): Pass down the format string, taken from the fourth element of a sys:var item of the quasiliteral. For simple symbolic items, pass down nil. * match.c (tx_subst_vars): Pass nil as new argument of format_field. The output variables of the TXR Pattern language do not exhibit this feature. * parser.l (FMT): New pattern for matching the format string part. (grammar): The rule which recognizes @ in quasiliterals optionally scans the format notation, and turns it into a string attached to the token's semantic value, which is now of type val (see parser.y remarks). * parser.y (tokens): The '@' token's %type changed from lineno to val so it can carry the format string. (q_var): If format string is present in the @ symbol, then include it as the fourth element of the sys:var form. This rule handles braced items. (meta): We can no longer get the line number from the @ item, so we get it from n_expr. (quasi_item): Similar to q_var change here. This handles @ followed by unbraced items: symbols and other expressions. * stdlib/compiler.tl (expand-quasi-mods): Take format string argument. When the format string is present, then generate code which uses the new alternative run-time support functions, and passes them the format string as an argument. (expand-quasi-args): Extend the sys:var match to extract the format string if it is present. Pass it down to expand-quasi-mods. * stdlib/match.tl (expand-quasi-match): Add an error case diagnosing the situation when the program tries to use a format-conversion-endowed item in a quasilit pattern. * stream.[ch] (fmt1): New function. * tests/012/quasi.tl: New tests. * txr.1: Documented. * lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
* compiler: fix unidiomatic if/cond combination.Kaz Kylheku2025-05-171-13/+12
| | | | | | | * stdlib/compiler.tl (expand-quasi-mods): Fix unidiomatic if form which continues with a cond fallback. All I'm doing here is flattening (if a b (cond (c d) ...)) to (cond (a b) (c d) ...).
* compiler: improvements in reporting form in diagnostics.Kaz Kylheku2025-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | * eval.c (ctx_name): Do not report just the car of the form. If the form starts with three symbols, list those; if two, list those. * stdlib/compiler.tl (expand-defun): Set the defun form as the macro ancestor of the lambda, rather than propagating source location info. Then diagnostics that previously refer to a lambda will correctly refer to the defun and thank to the above change in eval.c, include its name. * stdlib/pmac.t (define-param-expander): A similar change here. We make the define-param-expander form the macro ancestor of the lambda, so that diagnostics agains tthe lambda will show that form.
* Rebind *expand-hook* in load and compile-file.Kaz Kylheku2025-04-171-0/+1
| | | | | | | | | | | | * eval.c (loadv): Rebind *expand-hook* to its current value, like we do with *package*. * match.c (v_load): Likewise. * stdlib/compiler.tl (compile-file-conditionally): Likewise. * txr.1: Documented.
* New function keep: generalized keepqual.Kaz Kylheku2025-03-281-0/+1
| | | | | | | | | | | * eval.c (eval_init): Register keep intrinsic. * lib.[ch] (keep): New function. * stdlib/compiler.tl (compiler comp-fun-form): Transform two argument keep to keepqual. * txr.1: Documented.
* compiler: reduce some equal-based sequence functions.Kaz Kylheku2025-03-281-0/+5
| | | | | | | | | | | | | | | * stdlib/compiler.tl (compiler comp-fun-form): Recognize two-argument forms of remove, count, pos, member and subst. When these don't specify test, key or map functions, they are equivalent to remqual, countqual, posqual, memqual and subqual. These functions are a bit faster because they have no arguments to default and some of their C implementations call the equal function either directly or via a pointer, rather than via going via funcall. The exceptions are posqual and subqual which actually call pos; but even for these it is still slightly advantageous to convert to to the fixed arity function, because funcall2 doesn't have to default the optional arguments with colon_k.
* Copyright year bump 2025.Kaz Kylheku2025-01-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright bumped to 2025.
* tests: suppress warnings in seq.tl.Kaz Kylheku2024-03-081-30/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When tests/012/compile.tl compiles tests/012/seq.tl, there are now some compiler warnings due to constant expressions that throw. We introduce a new compiler option to suppress them, and then use it. * stdlib/comp-opts.tl: New file. The definitions related to compiler options are moved here out of compile.tl, so that optimize.tl can use them. * stdlib/compiler.tl (compile-opts, %warning-syms%, when-opt, *compile-opts*, opt-controlled-diag): Moved to comp-opts.tl. New constant-throws option added to compile-opts and %warning-syms%. (safe-constantp): Make the constant expression throws diagnostic conditional on the new option. * stdlib/optimize.tl: Load comp-opts file. (basic-blocks do-peephole-block): Make diagnostic about throwing situation subject to constant-throws option. * tests/012/seq.tl: Turn off constant-throws warning option before the ref tests that work with ranges. Fix: one of the expressions calls refs with the wrong number of arguments, which was unintentional. * txr.1: Document new diagnostic option.
* compiler: use cons-count.Kaz Kylheku2024-02-091-1/+1
| | | | | | * stdlib/compiler.tl (simplify-variadic-lambda): Use cons-count to find occurrences of the rest variable rather than flatten and count.
* compiler: take advantage of fixed @(end) match.Kaz Kylheku2024-02-081-2/+1
| | | | | | * stdlib/compiler.tl (simplify-variadic-lambda): Remove work-around where two patterns are combined with or, expressing it the way it wants to be.
* compiler: inlined chain: simplify variadic lambdas.Kaz Kylheku2024-02-081-2/+15
| | | | | | | | | | | | | | | The opip syntax often generates lambdas that have a trailing parameter and use [sys:apply ...]. This is wasteful in the second and subsequent argument positions of a chain, because we know that only a single value is coming from the previous function. We can pattern match these lambdas and convert the trailing argument to a single fixed parameter. * stdlib/compiler.tl (simplify-variadic-lambda): New function. (inline-chain-rec): Try to simplify every function through simplify-variadic-lambda. The leftmost function is treated in inline-chain, so these are all second and subsequent functions.
* compiler: implement inlining for chain expressions.Kaz Kylheku2024-02-071-1/+34
| | | | | | | | | | | | | | | | | | | | The opip syntax and its variants transforms into chain expressions. Currently, we emit actual chain function calls, and so all the chain arguments that are lambda expressions have become closures. In this commit, an inlining optimization is introduced which turns some chain function calls into chained expressions. The lambdas are then immediately called, and so succumb to the lambda-eliminating optimization. * stdlib/compiler.tl (compiler comp-fun-form): Handle chain forms. At optimization level 6 or higher, if the form is eligible for the transform, perform it. (inline-chain-rec, can-inline-chain, inline-chain): New functions. * txr.1: Mention that *opt-level* 6 does this chain optimization.
* compiler: whitespace issue.Kaz Kylheku2024-02-071-1/+1
| | | | | * stdlib/compiler (lambda-apply-transform): Fix misleading indentation.
* Copyright year bump 2024.Kaz Kylheku2024-01-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2024.
* compiler: optimizer must watch for throwing constant exprsKaz Kylheku2023-12-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have these issues, which are regressions: 1> (compile-toplevel '(/ 1 0)) ** expr-1:1: warning: sys:b/: constant expression (sys:b/ 1 0) throws ** /: division by zero ** during evaluation at expr-1:1 of form (sys:b/ 1 0) 1> (compile-toplevel '(let ((a 1) (b 0)) (/ a b))) ** /: division by zero ** during evaluation at expr-1:1 of form (compile-toplevel [...]) While the compiler's early pass constant folding is careful to detect constant expressions that throw, care was not taken in the optimizer's later constant folding which takes place after constant values are propagated around. After the fix: 1> (compile-toplevel '(let ((a 1) (b 0) (c t)) (if c (/ a b)))) ** expr-1:1: warning: let: function sys:b/ with arguments (1 0) throws #<sys:vm-desc: 9aceb20> 2> (compile-toplevel '(let ((a 1) (b 0) (c nil)) (if c (/ a b)))) #<sys:vm-desc: 9aef9f0> * stdlib/compiler.tl (compiler): New slot top-form. (compile-toplevel): Initialize the top-form slot of the compiler. The optimizer uses this to issue a warning now. Since the warning is based on analyzing generated code, we cannot trace it to the code more precisely than to the top-level form. * stdlib/optimize.tl (basic-blocks): New slot, warned-insns. List of instructions that have been warned about. (basic-blocks do-peephole-block): Rearrange the constant folding case so that as part of the pattern match condition, we include the fact that the function will not throw when called with those constant arguments. Only in that case do we do the optimization. We warn in the case when the function call does throw. A function rejected due to throwing could be processed through this rule multiple times, under multiple peephole passes, so for that reason we use the warned-insns list to suppress duplicate warnings.
* compiler: don't retain last form if it's an atom.Kaz Kylheku2023-12-201-1/+2
| | | | | | * stdlib/compiler.tl (compiler compile): Don't store form into me.last-form if it's an atom; it won't be useful or error reporting.
* compiler: handle non-locally-exiting top-level forms.Kaz Kylheku2023-12-111-1/+4
| | | | | | | | | | | | | | * stdlib/compiler.tl (compile-file-conditionally): When evaluation of a compiled top-level form is not suppressed, there is a risk that it can terminate non-locally, via throwing an exception or performing a block return. The compilation of the file is then aborted. We can do better: using an unwind-protect, we can catch all non-local control transfers out of the form and just ignore them. The motivation for this is that it lets us compile files which call (return-from load ...), without requiring that it be written as (compile-only (return-from load ...)). Other things will work, like compiling a (load "foo") where foo doesn't exist or aborts due to errors.
* compiler/match: eliminate (subtypep (typeof x) y).Kaz Kylheku2023-08-091-0/+2
| | | | | | | | | * stdlib/compiler.tl (compiler comp-fun-form): Recognize the pattern (subtypep (typeof x) y) and rewrite it to (typep x y). * stdlib/match.tl (compile-struct-match): Don't generate the (subtype (typeof x) y) pattern, but (typeof x y).
* compiler: bug: ensure numbers externalized sanely.Kaz Kylheku2023-08-061-0/+3
| | | | | | | | | | | | * stdlib/compiler.tl (dump-to-tlo): To ensure numbers are externalized in such a way that they will be loaded back exactly, we need to set a few special variables. For integers, we want *print-base* to be 10. Numbers printed in other bases cannot be read back correctly. Octal, hex and binary could be, but they would need to be printed with the correct prefixes. For floating-point values, we want to switch to the default print format, and use flo-max-dig for the precision. That one s not not the default value; the default is flo-dig.
* compiler: compress symbol tables also.Kaz Kylheku2023-07-261-22/+46
| | | | | | | | | | | | | | | | | | | | | | | | When functions are optimized away due to constant folding, instead of replacing them with a nil, we now compact the table to close the gaps and renumber the references in the code. * stdlib/compiler.tl (compiler null-stab): Method removed. (compiler compact-dregs): Renamed to compact-dregs-and-syms. Now compacts the symbol table also. This is combined with D-reg compacting because it makes just two passes through the instruction: a pass to identify the used D registers and symbol indices, and then another pass to edit the instructions with the renamed D registers and renumbered symbol indices. (compiler optimize): Remove the call to the null-unused-data on the basic-blocks object; nulling out D regs and symbol table entries is no longer required. Fllow the rename of compact-dregs to compact-dregs-and-syms which is called the same way otherwise. * stdlib/optimize.tl (basic-blocks null-unused-data): No longer used method removed.
* compiler: compact D registers.Kaz Kylheku2023-07-251-11/+33
| | | | | | | | | | | | | | | | | | | | | | | | We now have some constant folding in the optimizer too, not just in the front end compiler pass. This is leaving behind dead D registers that are not referenced in the code. Let's compact the D register table to close the gap. * stdlib/compiler.tl (compiler get-dreg): In this function we no longer check that we have allocated too many D registers. We let the counter blow past %lev-size%. Because this creates the fighting chance that the compaction of D regs will reduce their number to %lev-size% or less. By doing this, we allow code to be compilable that otherwise would not be: code that allocates too many D regs which are then optimized away. (compiler compact-dregs): New function. Does all the work. (compiler optimize): Compact the D regs at optimization level 5 or higher. (compile-toplevel): Check for an overflowing D reg count here, after optimization. * stdlib/optimize.tl (basic-blocks null-unused-data): Here, we no longer have to do anything with the D registers.
* compiler: code formatting.Kaz Kylheku2023-07-251-3/+3
| | | | | | | | * stdlib/compiler.tl (compiler get-dreg): Fix indentation proble. * stdlib/optimize.tl (basic-block fill-treg-compacting-map): Likewise.
* compiler: move material into constfun.tlKaz Kylheku2023-07-151-30/+0
| | | | | | | | | | | | | * stdlib/compiler.tl (%effect-free-funs%, %effect-free%, %functional-funs%, %functional%): Move variables into stdlib/constfun.tl * stdlib/constfun.tl %effect-free-funs%, %effect-free%, %functional-funs%, %functional%): Moved here. * stdlib/optimize.tl: Use load-for to express dependency on constfun module; don't depend on the compiler having loaded it.
* compiler: constant folding in optimizer.Kaz Kylheku2023-07-151-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The compiler handles trivial constant folding over the source code, as a source to source transformation. However, there are more opportunities for constant folding after data flow optimizations of the VM code. Early constant folding will not fold, for instance, (let ((a 2) (b 3)) (* a b)) but we can reduce this to an end instruction that returns the value of a D register that holds 6. Data flow optimizations will propagate the D registers for 2 and 3 into the gcall instruction. We can then recognize that we have a gcall with nothing but D register operands, calling a constant-foldable function. We can allocate a new D register to hold the result of that calculation and just move that D register's value into the target register of the original gcall. * stdlib/compiler.tl (compiler get-dreg): When allocating a new D reg, we must invalidate the datavec slot which is calculated from the data hash. This didn't matter before, because until now, get-datavec was called after compilation, at which point no new D regs will exist. That is changing; the optimizer can allocate D regs. (compiler null-dregs, compiler null-stab): New methods. (compiler optimize): Pass self to constructor for basic-blocks. basic-blocks now references back to the compiler. At optimization level 5 or higher, constant folding can now happen, so we call the new method in the optimizer to null the unused data. This overwrites unused D registers and unused parts of the symbol vector with nil. * stdlib/optimize (basic-blocks): Boa constructor now takes a new leftmost param, the compiler. (basic-blocks do-peephole-block): New optimization case: gcall instruction invoking const-foldable function, with all arguments being dregs. (basic-blocks null-unused-data): New method.
* compiler: more logging regarding compiled files.Kaz Kylheku2023-06-051-12/+23
| | | | | | | | * stdlib/compiler.tl (clean-file): Under a log-level of 1 or more, report clean-file removes a file. (compile-update-file): Under a log level of 1 or more, report when a compiled file was skipped due to being up-to-date.
* compiler: new compiler option log-levelKaz Kylheku2023-06-041-3/+15
| | | | | | | | | | | | | | | | | | With log-level, we can obtain trace messages about what file is being compiled and individual forms within that file. * autoload.c (compiler_set_entries): Intern the slot symbol log-level. * stdlib/compiler.tl (compile-opts): New slot, log-level. (%warning-syms%): Add log-level to %warning-syms%. Probably we need to rename this variable. (compile-file-conditionally): Implement the two log level messages. (with-compile-opts): Allow/recognize integer option values. * txr.1: Documented.
* compiler: new function, clean-file.Kaz Kylheku2023-06-041-0/+16
| | | | | | | | | | | | | | | | This function simplifies cleaning, by allowing a file to be cleaned to be identified in much the same way as an input file to load or compile-file. * autoload.c (compiler_set_entries): The clean-file symbol is interned and becomes an autoload trigger for the compiler module. * stdlib/compiler.tl (clean-file): New function. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
* bug: compile-file can put out nil, confusing load.Kaz Kylheku2023-06-031-1/+1
| | | | | | | | | | | | | | | | | | | | The file compiler combines compiled forms into a single list as much as possible so that objects in the list can share structure (e.g. merged string literals). However, when package-manipulating forms occur, like defpackage, it has to spit these lists, since the package manipulations of an earlier form affect the processing of a later form, such as whether symbols in that form are valid. This splitting does not take care of the case that an empty piece may result when the very last form is a package manipulation form. A nil gets written to the .tlo file, which the load function does not like; load thinks that since this is not a valid list of compiled forms, it must be the version number field of a catenated .tlo file, and proceeds to find it an invalid, incompatible version. * stdlib/compiler.tl (dump-to-tlo): Use partition* rather than split*. partition* doesn't leave empty pieces.
* compiler: fbind/lbind: elide unnecessary frames.Kaz Kylheku2023-05-241-9/+15
| | | | | | | | | | | * stdlib/compiler.tl (comp-fbind): When after removing unused functions we are left with an empty list (or the list of functions was empty to begin with), let's only emit the body fragment without any frame wrapping. We can't just return bfrag because that was compiled in the environment which matches the frame. Instead of the expense of compiling the code again, we rely on eliminate-frame to move all v registers up one level.
* with-compile-options: reimplement using compiler-letKaz Kylheku2023-05-161-12/+14
| | | | | | | | | | | | | | | | | | | The with-compile-opts macro is rewritten such that it cad occur inside code that is being compiled, and change compiler options for individual subexpressions. It continues to work as before in scripted build steps such as when calls to (compile-file ...) are wrapped in it. However, for the time being, that now only works in interpreted code, because with this change, when a with-compile-opts form is compiled, it no longer arranges for the binding of *compile-opts* to be visible to the subforms; the binding affects the compiler's own environment. * stdlib/compiler.tl (with-compile-opts): Rewrite. * txr.1: Documented.