txr - TXR: A data munging language.

	Commit message	Author	Age	Files	Lines
*	compiler: forgotten not/null reductions in if.	Kaz Kylheku	2025-06-16	1	-0/+6
\| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-if): Recognize cases like (if (not <expr>) <then> <else>) and convert to (if <expr> <else> <then>). Also the test (true <expr>) is reduced to <expr>.
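The reduction described above can be pictured with a small sketch. This is not the code in stdlib/compiler.tl: it is a hypothetical Python illustration in which forms are nested lists, symbols are strings, and the function name `reduce_if` is made up. Looping over nested wrappers is also an assumption of the sketch.

```python
def reduce_if(form):
    """Rewrite (if (not e) a b) to (if e b a), and (if (true e) a b) to (if e a b).

    Forms are modelled as nested Python lists with string symbols
    (a hypothetical representation chosen only for this sketch).
    """
    if not (isinstance(form, list) and len(form) == 4 and form[0] == "if"):
        return form
    _, test, then, els = form
    # Peel off wrappers around the test for as long as any remain.
    while isinstance(test, list) and len(test) == 2:
        if test[0] in ("not", "null"):
            # A negated test means the two branches trade places.
            test, then, els = test[1], els, then
        elif test[0] == "true":
            # (true e) tests the same thing as e.
            test = test[1]
        else:
            break
    return ["if", test, then, els]
```

For example, `reduce_if(["if", ["not", "x"], "a", "b"])` yields `["if", "x", "b", "a"]`.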
*	compiler: value is not optional in fbind/lbind.	Kaz Kylheku	2025-06-15	1	-1/+1
\| \| \| \| \| \| \| \| \|	This should have been part of the May 26, 2025 commit d70b55a0023eda8d776f18d224e4487f5e0d484e. * stdlib/compiler.tl (compiler comp-fbind): The form is not optional in fbind/lbind bindings; the syntax is (sym form); we don't have to use optional binding syntax.
*	compiler: prepare tail call identification context.	Kaz Kylheku	2025-06-15	1	-2/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (tail-fun-info): New struct type. The tail-fun special will be bound to instances of this. (compiler compile): Handle sys:rt-defun specially, via new comp-rt-defun. (compiler comp-return-from): Adjustment here; tail-fun does not carry the name, but a context structure with a name slot. (compiler comp-fbind): When compiling lbind, and thus potentially recursive functions, bind tail-fun to a new tail-fun-info context object carrying the name and lambda function. The env will be filled in later, during the compilation of the lambda. (compiler comp-lambda-impl): When compiling exactly that lambda expression that is indicated by the tail-fun structure, store the parameter environment object into that structure, and also bind tail-pos to indicate that the body of the lambda is in the tail position. (compiler comp-rt-defun): New method, which destructures the (sys:rt-defun ...) call to extract the name and lambda, and uses those to wrap a tail-fun-info context around the compilation, similarly to what is done for local functions in comp-fbind.
*	compiler: tidiness issue in top dispatcher.	Kaz Kylheku	2025-06-15	1	-4/+4
\| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler compile): Move the compiler-let case into the "compiler-only special operators" group. Consolidate the group of specially handled functions.
*	compiler: immediately called lambda: code gen tweak.	Kaz Kylheku	2025-06-15	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses some irregularities in the output of lambda-apply-transform, to make its output easier to destructure and use in tail recursion logic, in which the inner bindings will be turned into assignments of existing variables. * stdlib/compiler.tl (lambda-apply-transform): Move the binding of the al-val gensym from the inner let* block to the outer let/let where other gensyms are bound. Replace the ign-1 and ign-2 temporaries by a single gensym. Ensure that this gensym is bound.
*	compiler: track tail positions.	Kaz Kylheku	2025-06-13	1	-30/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (ntp): New macro. (tail-pos, tail-fun): New special variables. (compiler (comp-setq, comp-lisp1-setq, comp-setqf, comp-if, comp-ift, comp-switch, comp-unwind-protect, comp-block, comp-catch, comp-let, comp-fbind, comp-lambda-impl, comp-fun, comp-for)): Identify non-tail-position expressions and turn off the tail position flag for recursing over those. (compiler comp-return-from): The returned expression is in the tail position if this is the block for the current function, otherwise not. (compiler (comp-progn, comp-or)): Positions other than the last are non-tail. (compiler comp-prog1): Nothing is tail position in a prog1 that has two or more expressions. (usr:compile-toplevel): For a new compile job, bind tail-pos to nil. There is no tail position until we are compiling a named function (not yet implemented).
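The rules listed above can be sketched as a recursive walk that carries a tail-position flag. The following Python is a hypothetical illustration, not the compiler's code: forms are nested lists, only `if`, `progn`, `or` and `prog1` get special treatment, and the name `tail_calls` is invented for the sketch.

```python
def tail_calls(form, tail=True, out=None):
    """Collect the call forms that sit in tail position.

    Forms are nested Python lists whose first element is the operator
    (a hypothetical representation used only for this sketch).
    """
    if out is None:
        out = []
    if not isinstance(form, list) or not form:
        return out
    op, args = form[0], form[1:]
    if op == "if":
        # The test is never in tail position; the branches inherit the flag.
        tail_calls(args[0], False, out)
        for branch in args[1:]:
            tail_calls(branch, tail, out)
    elif op in ("progn", "or"):
        # Positions other than the last are non-tail.
        for i, sub in enumerate(args):
            tail_calls(sub, tail and i == len(args) - 1, out)
    elif op == "prog1":
        # With two or more expressions, nothing is in tail position.
        for sub in args:
            tail_calls(sub, tail and len(args) == 1, out)
    else:
        # Ordinary call: its arguments are evaluated before the call happens.
        if tail:
            out.append(form)
        for sub in args:
            tail_calls(sub, False, out)
    return out
```

Turning the flag off while recursing into non-tail subforms plays the role that the log assigns to the ntp macro; how that macro is actually written is not shown in the log.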
*	compiler: function bindings syntax cannot be atom.	Kaz Kylheku	2025-05-26	1	-3/+2
\| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-fbind): We don't have to normalize the function binding syntax of a sys:fbind or sys:lbind. This code was copied and pasted from (compiler comp-let). These bindings are always (name lambda) pairs and are machine-generated that way. If a name occurred, it would not be correct to rewrite it to (name).
*	quasistrings: support format notation.	Kaz Kylheku	2025-05-18	1	-15/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support to quasiliterals to have the inserted items formatted via a format conversion specifier, for example @~3,3a:abc is @abc modified by ~3,3a format conversion. When the inserted value is a list, the conversion is distributed over the elements individually. Otherwise it applies to the entire item. * eval.c (fmt_tostring, fmt_cat): Take additional format string argument. If it isn't nil, then do the string conversion via the fmt1 function rather than tostring. (do_format_field): Take format string argument, and pass down to fmt_cat. (format_field): Take format string argument and pass down to do_format_field. (fmt_simple, fmt_flex): Pass nil format string argument to fmt_tostring. (fmt_simple_fmstr, fmt_flex_fmstr): New static functions, like fmt_simple and fmt_flex but with format string arg. Used as run-time support for compiler-generated quasilit code for cases when format conversion specifier is present. (subst_vars): Extract the new format string from each variable item. Pass it down to fmt_tostring, format_field and fmt_cat. (eval_init): Register sys:fmt-simple-fmstr and sys:flex-fmstr intrinsics. * eval.h (format_field): Declaration updated. * lib.c (out_quasi_str_sym): Take format string argument. If it is present, output it after the @, followed by a colon, to reproduce the read notation. (out_quasi_str): Pass down the format string, taken from the fourth element of a sys:var item of the quasiliteral. For simple symbolic items, pass down nil. * match.c (tx_subst_vars): Pass nil as new argument of format_field. The output variables of the TXR Pattern language do not exhibit this feature. * parser.l (FMT): New pattern for matching the format string part.
(grammar): The rule which recognizes @ in quasiliterals optionally scans the format notation, and turns it into a string attached to the token's semantic value, which is now of type val (see parser.y remarks). * parser.y (tokens): The '@' token's %type changed from lineno to val so it can carry the format string. (q_var): If format string is present in the @ symbol, then include it as the fourth element of the sys:var form. This rule handles braced items. (meta): We can no longer get the line number from the @ item, so we get it from n_expr. (quasi_item): Similar to q_var change here. This handles @ followed by unbraced items: symbols and other expressions. * stdlib/compiler.tl (expand-quasi-mods): Take format string argument. When the format string is present, then generate code which uses the new alternative run-time support functions, and passes them the format string as an argument. (expand-quasi-args): Extend the sys:var match to extract the format string if it is present. Pass it down to expand-quasi-mods. * stdlib/match.tl (expand-quasi-match): Add an error case diagnosing the situation when the program tries to use a format-conversion-endowed item in a quasilit pattern. * stream.[ch] (fmt1): New function. * tests/012/quasi.tl: New tests. * txr.1: Documented. * lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
*	compiler: fix unidiomatic if/cond combination.	Kaz Kylheku	2025-05-17	1	-13/+12
\| \| \| \| \| \| \|	* stdlib/compiler.tl (expand-quasi-mods): Fix unidiomatic if form which continues with a cond fallback. All I'm doing here is flattening (if a b (cond (c d) ...)) to (cond (a b) (c d) ...).
*	compiler: improvements in reporting form in diagnostics.	Kaz Kylheku	2025-05-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (ctx_name): Do not report just the car of the form. If the form starts with three symbols, list those; if two, list those. * stdlib/compiler.tl (expand-defun): Set the defun form as the macro ancestor of the lambda, rather than propagating source location info. Then diagnostics that previously referred to a lambda will correctly refer to the defun and, thanks to the above change in eval.c, include its name. * stdlib/pmac.tl (define-param-expander): A similar change here. We make the define-param-expander form the macro ancestor of the lambda, so that diagnostics against the lambda will show that form.
*	Rebind expand-hook in load and compile-file.	Kaz Kylheku	2025-04-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (loadv): Rebind expand-hook to its current value, like we do with package. * match.c (v_load): Likewise. * stdlib/compiler.tl (compile-file-conditionally): Likewise. * txr.1: Documented.
*	New function keep: generalized keepqual.	Kaz Kylheku	2025-03-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	* eval.c (eval_init): Register keep intrinsic. * lib.[ch] (keep): New function. * stdlib/compiler.tl (compiler comp-fun-form): Transform two argument keep to keepqual. * txr.1: Documented.
*	compiler: reduce some equal-based sequence functions.	Kaz Kylheku	2025-03-28	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-fun-form): Recognize two-argument forms of remove, count, pos, member and subst. When these don't specify test, key or map functions, they are equivalent to remqual, countqual, posqual, memqual and subqual. These functions are a bit faster because they have no arguments to default and some of their C implementations call the equal function either directly or via a pointer, rather than going via funcall. The exceptions are posqual and subqual which actually call pos; but even for these it is still slightly advantageous to convert to the fixed arity function, because funcall2 doesn't have to default the optional arguments with colon_k.
*	Copyright year bump 2025.	Kaz Kylheku	2025-01-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright bumped to 2025.
*	tests: suppress warnings in seq.tl.	Kaz Kylheku	2024-03-08	1	-30/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When tests/012/compile.tl compiles tests/012/seq.tl, there are now some compiler warnings due to constant expressions that throw. We introduce a new compiler option to suppress them, and then use it. * stdlib/comp-opts.tl: New file. The definitions related to compiler options are moved here out of compiler.tl, so that optimize.tl can use them. * stdlib/compiler.tl (compile-opts, %warning-syms%, when-opt, compile-opts, opt-controlled-diag): Moved to comp-opts.tl. New constant-throws option added to compile-opts and %warning-syms%. (safe-constantp): Make the constant expression throws diagnostic conditional on the new option. * stdlib/optimize.tl: Load comp-opts file. (basic-blocks do-peephole-block): Make diagnostic about throwing situation subject to constant-throws option. * tests/012/seq.tl: Turn off constant-throws warning option before the ref tests that work with ranges. Fix: one of the expressions calls refs with the wrong number of arguments, which was unintentional. * txr.1: Document new diagnostic option.
*	compiler: use cons-count.	Kaz Kylheku	2024-02-09	1	-1/+1
\| \| \| \| \| \|	* stdlib/compiler.tl (simplify-variadic-lambda): Use cons-count to find occurrences of the rest variable rather than flatten and count.
*	compiler: take advantage of fixed @(end) match.	Kaz Kylheku	2024-02-08	1	-2/+1
\| \| \| \| \| \|	* stdlib/compiler.tl (simplify-variadic-lambda): Remove work-around where two patterns are combined with or, expressing it the way it wants to be.
*	compiler: inlined chain: simplify variadic lambdas.	Kaz Kylheku	2024-02-08	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The opip syntax often generates lambdas that have a trailing parameter and use [sys:apply ...]. This is wasteful in the second and subsequent argument positions of a chain, because we know that only a single value is coming from the previous function. We can pattern match these lambdas and convert the trailing argument to a single fixed parameter. * stdlib/compiler.tl (simplify-variadic-lambda): New function. (inline-chain-rec): Try to simplify every function through simplify-variadic-lambda. The leftmost function is treated in inline-chain, so these are all second and subsequent functions.
*	compiler: implement inlining for chain expressions.	Kaz Kylheku	2024-02-07	1	-1/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The opip syntax and its variants transform into chain expressions. Currently, we emit actual chain function calls, and so all the chain arguments that are lambda expressions become closures. In this commit, an inlining optimization is introduced which turns some chain function calls into chained expressions. The lambdas are then immediately called, and so succumb to the lambda-eliminating optimization. * stdlib/compiler.tl (compiler comp-fun-form): Handle chain forms. At optimization level 6 or higher, if the form is eligible for the transform, perform it. (inline-chain-rec, can-inline-chain, inline-chain): New functions. * txr.1: Mention that opt-level 6 does this chain optimization.
*	compiler: whitespace issue.	Kaz Kylheku	2024-02-07	1	-1/+1
\| \| \| \| \|	* stdlib/compiler.tl (lambda-apply-transform): Fix misleading indentation.
*	Copyright year bump 2024.	Kaz Kylheku	2024-01-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h, args.c, args.h, arith.c, arith.h, autoload.c, autoload.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c, gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, psquare.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl, stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl, stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl, stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h, termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year bumped to 2024.
*	compiler: optimizer must watch for throwing constant exprs	Kaz Kylheku	2023-12-20	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have these issues, which are regressions: 1> (compile-toplevel '(/ 1 0)) expr-1:1: warning: sys:b/: constant expression (sys:b/ 1 0) throws /: division by zero during evaluation at expr-1:1 of form (sys:b/ 1 0) 1> (compile-toplevel '(let ((a 1) (b 0)) (/ a b))) /: division by zero during evaluation at expr-1:1 of form (compile-toplevel [...]) While the compiler's early pass constant folding is careful to detect constant expressions that throw, care was not taken in the optimizer's later constant folding which takes place after constant values are propagated around. After the fix: 1> (compile-toplevel '(let ((a 1) (b 0) (c t)) (if c (/ a b)))) expr-1:1: warning: let: function sys:b/ with arguments (1 0) throws #<sys:vm-desc: 9aceb20> 2> (compile-toplevel '(let ((a 1) (b 0) (c nil)) (if c (/ a b)))) #<sys:vm-desc: 9aef9f0> * stdlib/compiler.tl (compiler): New slot top-form. (compile-toplevel): Initialize the top-form slot of the compiler. The optimizer uses this to issue a warning now. Since the warning is based on analyzing generated code, we cannot trace it to the code more precisely than to the top-level form. * stdlib/optimize.tl (basic-blocks): New slot, warned-insns. List of instructions that have been warned about. (basic-blocks do-peephole-block): Rearrange the constant folding case so that as part of the pattern match condition, we include the fact that the function will not throw when called with those constant arguments. Only in that case do we do the optimization. We warn in the case when the function call does throw. A function rejected due to throwing could be processed through this rule multiple times, under multiple peephole passes, so for that reason we use the warned-insns list to suppress duplicate warnings.
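The guarded folding rule and the duplicate-warning suppression described above can be illustrated with a short sketch. This is hypothetical Python, not the code in stdlib/optimize.tl: the instruction shape, the `FOLDABLE` table and the name `fold_call` are all assumptions made for the example.

```python
import operator

# Hypothetical table of constant-foldable functions.
FOLDABLE = {"+": operator.add, "*": operator.mul, "/": operator.truediv}

def fold_call(insn, warned, warnings):
    """Try to fold a call whose operands are all constants.

    insn is a tuple (function-name, arg, ...). Returns a 1-tuple holding
    the folded value, or None when the call must be left in the code.
    """
    name, *args = insn
    fn = FOLDABLE.get(name)
    if fn is None:
        return None
    try:
        value = fn(*args)
    except Exception as exc:
        # The call throws for these constants: do not fold it, and warn
        # only once even if several peephole passes revisit it.
        if insn not in warned:
            warned.append(insn)
            warnings.append(f"function {name} with arguments {tuple(args)} throws: {exc}")
        return None
    return (value,)
```

Calling `fold_call(("/", 1, 0), warned, warnings)` twice with the same lists leaves the call unfolded both times and records a single warning, which mirrors the role of the warned-insns list.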
*	compiler: don't retain last form if it's an atom.	Kaz Kylheku	2023-12-20	1	-1/+2
\| \| \| \| \| \|	* stdlib/compiler.tl (compiler compile): Don't store form into me.last-form if it's an atom; it won't be useful for error reporting.
*	compiler: handle non-locally-exiting top-level forms.	Kaz Kylheku	2023-12-11	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compile-file-conditionally): When evaluation of a compiled top-level form is not suppressed, there is a risk that it can terminate non-locally, via throwing an exception or performing a block return. The compilation of the file is then aborted. We can do better: using an unwind-protect, we can catch all non-local control transfers out of the form and just ignore them. The motivation for this is that it lets us compile files which call (return-from load ...), without requiring that it be written as (compile-only (return-from load ...)). Other things will work, like compiling a (load "foo") where foo doesn't exist or aborts due to errors.
*	compiler/match: eliminate (subtypep (typeof x) y).	Kaz Kylheku	2023-08-09	1	-0/+2
\| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-fun-form): Recognize the pattern (subtypep (typeof x) y) and rewrite it to (typep x y). * stdlib/match.tl (compile-struct-match): Don't generate the (subtypep (typeof x) y) pattern, but (typep x y).
*	compiler: bug: ensure numbers externalized sanely.	Kaz Kylheku	2023-08-06	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (dump-to-tlo): To ensure numbers are externalized in such a way that they will be loaded back exactly, we need to set a few special variables. For integers, we want print-base to be 10. Numbers printed in other bases cannot be read back correctly. Octal, hex and binary could be, but they would need to be printed with the correct prefixes. For floating-point values, we want to switch to the default print format, and use flo-max-dig for the precision. That one is not the default value; the default is flo-dig.
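Why the precision and the base matter for exact read-back can be shown with a general floating-point illustration. The Python below is not TXR code. It assumes IEEE double precision, where 15 significant digits are always preserved and 17 are always sufficient; that flo-dig and flo-max-dig correspond to these two counts is an assumption of the sketch.

```python
x = 0.1 + 0.2

# 15 significant digits lose information: the text reads back as another double.
short_text = format(x, ".15g")
# 17 significant digits are enough to identify any double uniquely.
full_text = format(x, ".17g")

short_round_trips = float(short_text) == x   # False
full_round_trips = float(full_text) == x     # True

# Integers: digits printed in base 8 without a prefix are misread as decimal.
n = 255
octal_text = format(n, "o")                  # "377"
octal_round_trips = int(octal_text) == n     # False: read back as 377
decimal_round_trips = int(format(n, "d")) == n
```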
*	compiler: compress symbol tables also.	Kaz Kylheku	2023-07-26	1	-22/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When functions are optimized away due to constant folding, instead of replacing them with a nil, we now compact the table to close the gaps and renumber the references in the code. * stdlib/compiler.tl (compiler null-stab): Method removed. (compiler compact-dregs): Renamed to compact-dregs-and-syms. Now compacts the symbol table also. This is combined with D-reg compacting because it makes just two passes through the instructions: a pass to identify the used D registers and symbol indices, and then another pass to edit the instructions with the renamed D registers and renumbered symbol indices. (compiler optimize): Remove the call to the null-unused-data on the basic-blocks object; nulling out D regs and symbol table entries is no longer required. Follow the rename of compact-dregs to compact-dregs-and-syms which is called the same way otherwise. * stdlib/optimize.tl (basic-blocks null-unused-data): No longer used method removed.
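The two-pass scheme described above can be sketched for the D register half of the job. The Python below is a hypothetical illustration, not the compact-dregs-and-syms method: instructions are tuples, a D register operand is written `("d", index)`, and the function name `compact_dregs` is invented. Symbol table indices could be renumbered with the same kind of map.

```python
def is_dreg(operand):
    """True for operands of the hypothetical form ("d", index)."""
    return isinstance(operand, tuple) and len(operand) == 2 and operand[0] == "d"

def compact_dregs(code, data):
    """Close the gaps left by unreferenced D registers.

    code: list of instructions, each a tuple (opcode, operand, ...).
    data: list of constants, indexed by D register number.
    Returns the rewritten code and the compacted data table.
    """
    # Pass 1: identify the D register indices that the code still uses.
    used = sorted({op[1] for insn in code for op in insn[1:] if is_dreg(op)})
    # Old index -> new gap-free index.
    remap = {old: new for new, old in enumerate(used)}
    # Pass 2: edit the instructions to refer to the renumbered registers.
    new_code = [
        (insn[0], *[("d", remap[op[1]]) if is_dreg(op) else op for op in insn[1:]])
        for insn in code
    ]
    new_data = [data[old] for old in used]
    return new_code, new_data
```

With `code = [("gcall", ("t", 2), ("d", 0), ("d", 3)), ("end", ("d", 3))]` and `data = [2, 99, 98, 6]`, registers 1 and 2 are dropped and register 3 becomes register 1.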
*	compiler: compact D registers.	Kaz Kylheku	2023-07-25	1	-11/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We now have some constant folding in the optimizer too, not just in the front end compiler pass. This is leaving behind dead D registers that are not referenced in the code. Let's compact the D register table to close the gap. * stdlib/compiler.tl (compiler get-dreg): In this function we no longer check that we have allocated too many D registers. We let the counter blow past %lev-size%, because this creates a fighting chance that the compaction of D regs will reduce their number to %lev-size% or less. By doing this, we allow code to be compilable that otherwise would not be: code that allocates too many D regs which are then optimized away. (compiler compact-dregs): New function. Does all the work. (compiler optimize): Compact the D regs at optimization level 5 or higher. (compile-toplevel): Check for an overflowing D reg count here, after optimization. * stdlib/optimize.tl (basic-blocks null-unused-data): Here, we no longer have to do anything with the D registers.
*	compiler: code formatting.	Kaz Kylheku	2023-07-25	1	-3/+3
\| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler get-dreg): Fix indentation problem. * stdlib/optimize.tl (basic-block fill-treg-compacting-map): Likewise.
*	compiler: move material into constfun.tl	Kaz Kylheku	2023-07-15	1	-30/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (%effect-free-funs%, %effect-free%, %functional-funs%, %functional%): Move variables into stdlib/constfun.tl. * stdlib/constfun.tl (%effect-free-funs%, %effect-free%, %functional-funs%, %functional%): Moved here. * stdlib/optimize.tl: Use load-for to express dependency on constfun module; don't depend on the compiler having loaded it.
*	compiler: constant folding in optimizer.	Kaz Kylheku	2023-07-15	1	-4/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The compiler handles trivial constant folding over the source code, as a source to source transformation. However, there are more opportunities for constant folding after data flow optimizations of the VM code. Early constant folding will not fold, for instance, (let ((a 2) (b 3)) (* a b)) but we can reduce this to an end instruction that returns the value of a D register that holds 6. Data flow optimizations will propagate the D registers for 2 and 3 into the gcall instruction. We can then recognize that we have a gcall with nothing but D register operands, calling a constant-foldable function. We can allocate a new D register to hold the result of that calculation and just move that D register's value into the target register of the original gcall. * stdlib/compiler.tl (compiler get-dreg): When allocating a new D reg, we must invalidate the datavec slot which is calculated from the data hash. This didn't matter before, because until now, get-datavec was called after compilation, at which point no new D regs will exist. That is changing; the optimizer can allocate D regs. (compiler null-dregs, compiler null-stab): New methods. (compiler optimize): Pass self to constructor for basic-blocks. basic-blocks now references back to the compiler. At optimization level 5 or higher, constant folding can now happen, so we call the new method in the optimizer to null the unused data. This overwrites unused D registers and unused parts of the symbol vector with nil. * stdlib/optimize.tl (basic-blocks): Boa constructor now takes a new leftmost param, the compiler. (basic-blocks do-peephole-block): New optimization case: gcall instruction invoking const-foldable function, with all arguments being dregs. (basic-blocks null-unused-data): New method.
*	compiler: more logging regarding compiled files.	Kaz Kylheku	2023-06-05	1	-12/+23
\| \| \| \| \| \| \| \|	* stdlib/compiler.tl (clean-file): Under a log level of 1 or more, report when clean-file removes a file. (compile-update-file): Under a log level of 1 or more, report when a compiled file was skipped due to being up-to-date.
*	compiler: new compiler option log-level	Kaz Kylheku	2023-06-04	1	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With log-level, we can obtain trace messages about what file is being compiled and individual forms within that file. * autoload.c (compiler_set_entries): Intern the slot symbol log-level. * stdlib/compiler.tl (compile-opts): New slot, log-level. (%warning-syms%): Add log-level to %warning-syms%. Probably we need to rename this variable. (compile-file-conditionally): Implement the two log level messages. (with-compile-opts): Allow/recognize integer option values. * txr.1: Documented.
*	compiler: new function, clean-file.	Kaz Kylheku	2023-06-04	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function simplifies cleaning, by allowing a file to be cleaned to be identified in much the same way as an input file to load or compile-file. * autoload.c (compiler_set_entries): The clean-file symbol is interned and becomes an autoload trigger for the compiler module. * stdlib/compiler.tl (clean-file): New function. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	bug: compile-file can put out nil, confusing load.	Kaz Kylheku	2023-06-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The file compiler combines compiled forms into a single list as much as possible so that objects in the list can share structure (e.g. merged string literals). However, when package-manipulating forms occur, like defpackage, it has to split these lists, since the package manipulations of an earlier form affect the processing of a later form, such as whether symbols in that form are valid. This splitting does not take care of the case that an empty piece may result when the very last form is a package manipulation form. A nil gets written to the .tlo file, which the load function does not like; load thinks that since this is not a valid list of compiled forms, it must be the version number field of a catenated .tlo file, and proceeds to find it an invalid, incompatible version. * stdlib/compiler.tl (dump-to-tlo): Use partition* rather than split. partition* doesn't leave empty pieces.
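The empty trailing piece is easy to reproduce in miniature. The Python below is a hypothetical sketch of the failure mode only; it does not reproduce the exact semantics of TXR's split or partition* functions, and both helper names are invented.

```python
def split_pieces(forms, cut_after):
    """Cut forms after each index in cut_after, keeping empty pieces."""
    pieces, start = [], 0
    for idx in sorted(cut_after):
        pieces.append(forms[start:idx + 1])
        start = idx + 1
    # The remainder is appended even when nothing is left over.
    pieces.append(forms[start:])
    return pieces

def nonempty_pieces(forms, cut_after):
    """Same cuts, but empty pieces are never produced in the result."""
    return [piece for piece in split_pieces(forms, cut_after) if piece]
```

When the last form is the package-manipulating one, for instance `split_pieces(["(defun f)", "(defpackage p)"], [1])`, the first helper returns a trailing `[]`, the analogue of the stray nil written to the .tlo file; the second helper returns only the one real piece.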
*	compiler: fbind/lbind: elide unnecessary frames.	Kaz Kylheku	2023-05-24	1	-9/+15
\| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (comp-fbind): When after removing unused functions we are left with an empty list (or the list of functions was empty to begin with), let's only emit the body fragment without any frame wrapping. We can't just return bfrag because that was compiled in the environment which matches the frame. Instead of the expense of compiling the code again, we rely on eliminate-frame to move all v registers up one level.
*	with-compile-options: reimplement using compiler-let	Kaz Kylheku	2023-05-16	1	-12/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The with-compile-opts macro is rewritten such that it can occur inside code that is being compiled, and change compiler options for individual subexpressions. It continues to work as before in scripted build steps such as when calls to (compile-file ...) are wrapped in it. However, for the time being, that now only works in interpreted code, because with this change, when a with-compile-opts form is compiled, it no longer arranges for the binding of compile-opts to be visible to the subforms; the binding affects the compiler's own environment. * stdlib/compiler.tl (with-compile-opts): Rewrite. * txr.1: Documented.
*	New special operator: compiler-let	Kaz Kylheku	2023-05-16	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* eval.c (compiler_let_s): New symbol variable. (op_let): Recognize compiler-let for sequential binding. (do_expand): Traverse and diagnose compiler-let form. (eval_init): Initialize compiler_let_s and register the interpreted version of the operator. * stdlib/compiler.tl (compiler compile): Handle compiler-let form. (compiler comp-compiler-let): New method. (no-dvbind-eval): New function. * autoload.c (compiler_set_entries): Intern the compiler-let symbol in the user package. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	New special operator: progv	Kaz Kylheku	2023-05-15	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adding a progv operator, similar to the Common Lisp one. * eval.c (progv_s): New symbol variable. (op_progv): New static function. (do_expand): Recognize and traverse the progv form. (rt_progv): New static function: run-time support for compiled progv. (eval_init): Initialize progv_s, and register the op_progv operator interpreting function. * stdlib/compiler.tl (compiler compile): Handle progv operator ... (compiler comp-progv): ... via this new method. * tests/019/progv.tl: New file. * txr.1: Documented. * stdlib/doc-syms.tl: Updated.
*	compiler: spelling error in diagnostic.	Kaz Kylheku	2023-05-12	1	-1/+1
\| \| \| \| \|	* stdlib/compiler.tl (with-compile-opts): Remove stray character from "uncrecognized".
*	compiler: multiple issues in macro-parameter forms.	Kaz Kylheku	2023-05-05	1	-24/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a defmacro form is compiled, the entire form is retained as a literal in the output. This is wasteful and gives away the source code. In spite of that, errors in using the macro are incorrectly reported against defmacro, because that is the first symbol in the form. These issues arise with what arguments are passed as the first two parameters of the compiler's expand-bind-mac-params function, and what exactly it does with them. We make a tweak to that, as well as some tweaks to all the calls. * stdlib/compiler.tl (expand-bind-mac-params): There is a mix-up here in that both the ctx-form and err-form arguments are ending up in the compiled output. Let's have only the first argument, ctx-form, going into the compiled output. Thus that is what is inserted into the sys:bind-mac-check call that is generated. Secondly, ctx-form should not be passed to the constructor for mac-param-parser. ctx-form is a to-be-evaluated expression which might just be a gensym; we cannot use it at compile time for error reporting. Here we must use the second argument. Thus the second argument is now used only for two purposes: copying the source code info to the output code, and for error reporting in the mac-param-parser class. This second purpose is minor, because the code has been passed through the macro expander before being compiled, which has caught all the errors. Thus the argument is changed to rlcp-form, reflecting its principal use. (comp-tree-bind, comp-tree-case): Calculate a simplified version of the tree-bind or tree-case form for error reporting and pass that as the ctx-form argument of expand-bind-mac-params. Just pass form as the second argument. (comp-mac-param-bind, comp-mac-env-param-bind): Just pass form as the second argument of expand-bind-mac-params.
*	compiler: bugfix: lingering funarg eval order issue.	Kaz Kylheku	2023-04-17	1	-3/+1
\| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler comp-call-impl): We can no longer free the temporary registers as-we-go based on whether the argument expression frag uses them as the output register frag. Let's just put them all into the aoregs list to be freed afterward.
*	compiler: better handling for mutated locals in funargs.	Kaz Kylheku	2023-04-17	1	-20/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of the conservative strategy in compiler comp-var of loading variables into t-registers, and relying on optimization to remove them, let's just go back to the old way: variables are just registers. For function calls, we can detect mutated variables and generate the conservative code. * stdlib/compiler.tl (frag): New slots vbin and alt-oreg. When a variable access is compiled, the binding is recorded in vbin, and the desired output register in alt-oreg. (simplify-var-spy): New struct type, used for detecting mutated lexical variables when we compile a function argument list. (compiler comp-var): Revert to the old compilation strategy for lexicals: the code fragment is empty, and the output register is just the v-reg. However, we record the variable binding and remember the caller's desired register in the new frag fields. (compiler comp-setq): Also revert the strategy here. Here we get our frag from a recursive compilation, so we just annotate it. (compiler comp-call-impl): Use the simplify-var-spy to obtain a list of the lexical variables that were mutated. This is used for rewriting the frags, if necessary. (handle-mutated-var-args): New function. If the mutated-vars list is non-empty, it rewrites the frag list. Every element in the frag which is a compiled reference to a lexical variable which is mutated over the evaluation of the arg list is substituted with a conservative frag which loads the variable into a temporary register. That register thus samples the value of the variable at the correct point in the left-to-right evaluation, so the function is called with the correct values.
*	compiler: bugfix: eval order of variables.	Kaz Kylheku	2023-04-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have the following problem: when function call argument expressions mutate some of the variables that are being passed as arguments, the left-to-right semantics isn't obeyed. The problem is that the function call simply refers to the registers that hold the variables, rather than to the evaluated values. For instance (fun a (inc a)) will translate to something like (gcall <n> (v 3) (v 3)) which is incorrect: both argument positions refer to the current value of a, whereas we need the left argument to refer to the value before the increment. * stdlib/compiler.tl (compiler comp-var): Do not assert the variable as the output register, with null code. Indicate that the value is in the caller's output register, and if necessary generate the move. (compiler comp-setq): When compiling the right-hand-side, use the original output register, so that we don't end up reporting the variable as the result location.
*	compiler: discard wrongheaded discards.	Kaz Kylheku	2023-04-08	1	-55/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (compiler): Remove discards slot. (compile-in-toplevel, compile-with-fresh-tregs): Do not save and restore discards. (compiler maybe-mov): Method removed. It doesn't require the compiler object so it can just be a function. (maybe-mov): New function. (compiler alloc-discard-treg): Method removed. (compiler free-treg): No need to do anything with discards. (compiler maybe-alloc-treg): No need to check discards. (compiler (comp-setq, comp-if, comp-ift, comp-switch, comp-block, comp-catch, comp-let, comp-fbind, comp-lambda-impl, comp-or, comp-tree-case, comp-load-time-lit)): Use maybe-mov function instead of method. (compiler comp-progn): Use alloc-treg rather than alloc-discard-treg, and use maybe-mov function.
*	compiler: iterate on level 4-5 optimizations.	Kaz Kylheku	2023-04-07	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/optimize.tl (basic-blocks num-blocks): New method. * stdlib/compiler.tl (compiler optimize): At optimization level 6, instead of performing one extra pass of jump threading, dead-code elimination and peephole optimizations, keep iterating on these until the number of basic blocks stays the same. * txr.1: Documented.
*	compiler: optimization improvements	Kaz Kylheku	2023-04-07	1	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/optimize.tl (basic-blocks peephole-block): Drop the code argument, and operate on bl.insns, which is stored back. Perform the renames in the rename list after the peephole pass. (basic-blocks rename): New method. (basic-blocks do-peephole-block): Implementation of peephole-block, under a new name. The local function called rename is removed; calls to it go to the new rename method. (basic-blocks peephole): Simplify code around calls to peephole-block; we no longer have to pass bl.insns to it, capture the return value and store it back into bl.insns. * stdlib/compiler.tl (opt-level): Initial value changes from 6 to 7. (compiler optimize): At optimization level 6, we now do another jump threading pass, and peephole, like at levels 4 and 5. The peephole optimizations at level 5 make it possible to coalesce some basic blocks in some cases, and that opens up the possibility for more reductions. The previous level 6 optimizations are moved to level 7. * txr.1: Updated documentation of optimization levels, and default value of opt-level. * stdlib/doc-syms.tl: Updated.
*	compiler/doc: document compiler-opts and enable unused warning	Kaz Kylheku	2023-03-23	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (sys:env shadow-fun): Also diagnose if a global macro is shadowed. * txr.1: Documented compiler-opts structure, compiler-opts variable and with-compiler-opts macro. * stdlib/doc-syms.tl: Updated.
*	compiler: dohash: source location propagation	Kaz Kylheku	2023-03-22	1	-8/+10
\| \| \| \|	* stdlib/compiler.tl (expand-dohash): Add missing rlcp.
*	compiler: forward source location for defun and defmacro	Kaz Kylheku	2023-03-22	1	-11/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* stdlib/compiler.tl (expand-defun): Sprinkling of rlcp to pass source location info to the generated lambda, and to the sys:define-method call. (expand-defmacro): bugfix here: in with-gensyms we shadowed the form parameter, and then passed that as both form arguments to expand-bind-mac-params. We rename the gensym to mform, and then for the error-form, we pass the original form, quoted as necessary and with source location info. Thus, now source location info flows from the original defmacro form to the generated let* which binds the destructured parameters.