| Commit message (Collapse) | Author | Age | Files | Lines |
|
It happens in the wild that sometimes JSON-like data
must be processed in which strings are delimited by
single quotes rather than double quotes. The data
is valid JavaScript syntax, so JS people don't even notice
anything is wrong.
* parser.c (struct parser): New member, json_quote_char.
This helps the scanner keep track of which closing character
it is expecting.
* parser.c (parser_common_init): Initialize json_quote_char.
* parser.l (JPUNC, NJPUNC): Include single quote (ASCII
apostrophe) in JPUNC, and exclude it from NJPUNC.
(grammar): When we see either a double quote or single
quote in JLIT mode, we return it as itself if that
character is the delimiter for the currently scanned
string. Otherwise we return it as a LITCHAR, which gets
accumulated by the parser into the current string.
Include the double. When we see either a double quote or
single quote, we transition to the JLIT state. The parser
will check whether a single quoted literal is allowed.
We allow \' escapes in a single-quote literal
unconditionally. We allow them in a double-quoted literal
also, but only in read bad JSON mode.
* parser.y (json_val): Recognize single-quoted literals,
but generate an error unless in read bad JSON mode.
Also, error production for unterminated single quote
only diagnosed that way in read bad JSON mode, otherwise
rejected as invalid JSON.
* tests/010/json.tl: New tests.
* txr.1: Documented.
* lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
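The delimiter tracking described above can be sketched in plain C. This is only an illustration, not TXR's scanner: scan_quoted and its quote_char variable are hypothetical names, with quote_char standing in for the role the commit gives to json_quote_char. Escape handling is reduced to "take the next character literally", and the read-bad-JSON mode check is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: copy the body of a string literal that starts
 * at src[0], which must be ' or ". The opening quote is remembered;
 * only that same character closes the literal, and the other quote
 * is accumulated as an ordinary character.
 * Returns the body length, or -1 if the literal is unterminated or
 * does not fit into out.
 */
int scan_quoted(const char *src, char *out, size_t outsize)
{
  char quote_char;
  size_t n = 0;

  assert (src != NULL && out != NULL && outsize > 0);
  assert (src[0] == '"' || src[0] == '\'');

  quote_char = *src++;

  for (; *src != 0; src++) {
    if (*src == quote_char) {
      out[n] = 0;
      return (int) n;
    }
    if (*src == '\\' && src[1] != 0)
      src++;                 /* simplified: escaped character taken as-is */
    if (n + 1 >= outsize)
      return -1;
    out[n++] = *src;
  }

  return -1;                 /* ran off the end: unterminated literal */
}
```

Scanning the input 'it"s' with this function yields the four-character body it"s, because the double quote is not the active delimiter.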
|
* parser.l (BUFLIT): Instead of scanning a hexadecimal
digit and using strtol, we scan three separate cases,
and do a very simple subtraction in each one.
TXR Lisp .tlo files are full of large buffer literals,
so this affects loading speed.
* lex.yy.c.shipped: Regenerated.
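A minimal sketch of the three-case conversion, written as an ordinary C function rather than as lex rules. The names hex_digit_value and hex_pair_value are hypothetical, and the code assumes an ASCII-compatible character set in which a-f and A-F are contiguous.

```c
#include <assert.h>

/*
 * Value of one hexadecimal digit, or -1 if ch is not a hex digit.
 * Each of the three ranges needs only a comparison and a subtraction.
 */
int hex_digit_value(int ch)
{
  if (ch >= '0' && ch <= '9')
    return ch - '0';
  if (ch >= 'a' && ch <= 'f')
    return ch - 'a' + 10;
  if (ch >= 'A' && ch <= 'F')
    return ch - 'A' + 10;
  return -1;
}

/* Hypothetical helper: combine two hex digits into one byte value. */
int hex_pair_value(int hi, int lo)
{
  int h = hex_digit_value(hi);
  int l = hex_digit_value(lo);

  assert (h >= 0 && l >= 0);
  return h * 16 + l;
}
```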
|
We use Flex buffer cycling to avoid a memcpy.
* parser.h (struct shadow_context): Moved here out of
parser.c. New members bs and scanner: it will keep track
of the flex buffer directly rather than copying it.
(scanner_get_buffered_bytes): Declaration updated.
(scanner_free_buffer_bytes): Declared.
* parser.l (scanner_get_buffered_bytes): Reimplemented with
different interface. We switch the scanner to a new, empty
buffer, which liberates the previous one, allowing us to
take ownership. We store the scanner and that buffer into
the context, and set up the buf, index and size to reference
into the buffer. We no longer have to mess with yy_hold_char;
it is restored into the buffer by the yy_switch_to_buffer
operation.
(scanner_free_buffer_bytes): New function.
* parser.c (struct shadow_context): Removed from here.
(shadow_detach): Call scanner_free_buffer_bytes.
* lex.yy.c: Regenerated.
|
After parsing out of a stream, we attach a shadow
stream which temporarily patches the operations
to make it appear that bytes that were taken into
the lexer have been pushed back. This lets us
call an ordinary input operation after a parsing
operation to read the data immediately following the
parsed-out construct. (Almost: there is still the
issue of the parser consuming one token of lookahead
in some situations.)
* stream.h (struct strm_base): New member shadow_obj.
This is a context pointer used by the shadow stream
operations.
(generic_fill_buf): Declare previously internal
function.
* stream.c (strm_base_init): Initialize shadow_obj
to null.
(generic_fill_buf): Function changed to external linkage.
Also, reloads the ops pointer from the stream on
each loop iteration. This is because it can change;
part of the buffer may be filled by shadow_get_byte,
which can detach the shadow operations, so then
the rest of the buffer is filled by something else
like stdio_get_byte.
(generic_get_line): Reload ops in the loop, like
in generic_fill_buf, for the same reason.
* parser.l: Include <stddef.h> for ptrdiff_t.
(scanner_has_buffered_bytes, scanner_get_buffered_bytes):
New functions.
* parser.c (SHADOW_TAB_SIZE): New preprocessor symbol.
(shadow_tab): New static array.
(struct shadow_context, struct shadow_ungetch): New
struct types.
(lisp_parse_impl): After calling parse, call
parser_shadow_stream_attach to attach the shadow stream
context and operations onto the stream.
(shadow_detach, shadow_destroy_op, shadow_mark_op,
shadow_put_string, shadow_put_char, shadow_put_byte,
shadow_get_char_callback, shadow_get_char,
shadow_unget_char_callback, shadow_unget_char,
shadow_get_byte, shadow_unget_byte, shadow_put_buf,
shadow_close, shadow_flush, shadow_seek, shadow_truncate):
New static functions.
(shadow_ops_template): New static structure.
(customize_shad_ops): New static function.
(parser_shadow_stream_attach): New function.
(parser_free_all): New function.
* parser.h (scanner_has_buffered_bytes,
scanner_get_buffered_bytes, parser_shadow_stream_attach,
parser_free_all): Declared.
* txr.c (free_all): Call parser_free_all.
* tests/018/streams.tl: New test cases.
* lex.yy.c.shipped: Regenerated.
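The reason generic_fill_buf must reload the ops pointer on every iteration can be shown with a cut-down model. Everything here is hypothetical and much simpler than the real stream layer: the struct layouts, plain_get_byte, stream_init and fill_buf are illustration names, and only the get_byte operation is modeled.

```c
#include <assert.h>
#include <stddef.h>

struct stream;

/* Hypothetical, cut-down operations table: one operation only. */
struct stream_ops {
  int (*get_byte)(struct stream *);
};

struct stream {
  const struct stream_ops *ops;        /* current operations; may change */
  const struct stream_ops *saved_ops;  /* operations restored on detach */
  const unsigned char *pushed;         /* bytes the lexer over-read */
  size_t npushed;
  const unsigned char *data;           /* the underlying input */
  size_t ndata;
};

static int plain_get_byte(struct stream *s)
{
  if (s->ndata == 0)
    return -1;
  s->ndata--;
  return *s->data++;
}

static int shadow_get_byte(struct stream *s)
{
  int b;

  assert (s->npushed > 0);
  b = *s->pushed++;
  if (--s->npushed == 0)
    s->ops = s->saved_ops;   /* last pushed-back byte: detach the shadow */
  return b;
}

static const struct stream_ops plain_ops = { plain_get_byte };
static const struct stream_ops shadow_ops = { shadow_get_byte };

void stream_init(struct stream *s, const char *pushed, size_t npushed,
                 const char *data, size_t ndata)
{
  s->saved_ops = &plain_ops;
  s->ops = npushed > 0 ? &shadow_ops : &plain_ops;
  s->pushed = (const unsigned char *) pushed;
  s->npushed = npushed;
  s->data = (const unsigned char *) data;
  s->ndata = ndata;
}

int stream_is_shadowed(const struct stream *s)
{
  return s->ops == &shadow_ops;
}

/* Fill buf with up to size bytes; returns the number of bytes stored. */
size_t fill_buf(struct stream *s, unsigned char *buf, size_t size)
{
  size_t i;

  for (i = 0; i < size; i++) {
    const struct stream_ops *ops = s->ops;   /* reloaded each iteration */
    int b = ops->get_byte(s);
    if (b < 0)
      break;
    buf[i] = (unsigned char) b;
  }

  return i;
}
```

If fill_buf cached s->ops before the loop, it would keep calling shadow_get_byte after the pushed-back bytes ran out. With the reload, a stream holding pushed-back "ab" over underlying "cde" yields all five bytes in one call.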
|
This patch adds support to quasiliterals to have the inserted
items formatted via a format conversion specifier, for example
@~3,3a:abc is @abc modified by ~3,3a format conversion.
When the inserted value is a list, the conversion is distributed
over the elements individually. Otherwise it applies to the
entire item.
* eval.c (fmt_tostring, fmt_cat): Take additional format
string argument. If it isn't nil, then do the string
conversion via the fmt1 function rather than tostring.
(do_format_field): Take format string argument, and
pass down to fmt_cat.
(format_field): Take format string argument and pass down
to do_format_field.
(fmt_simple, fmt_flex): Pass nil format string argument to
fmt_tostring.
(fmt_simple_fmstr, fmt_flex_fmstr): New static functions,
like fmt_simple and fmt_flex but with format string arg.
Used as run-time support for compiler-generated quasilit code
for cases when format conversion specifier is present.
(subst_vars): Extract the new format string from each
variable item. Pass it down to fmt_tostring, format_field
and fmt_cat.
(eval_init): Register sys:fmt-simple-fmstr and sys:flex-fmstr
intrinsics.
* eval.h (format_field): Declaration updated.
* lib.c (out_quasi_str_sym): Take format string argument.
If it is present, output it after the @, followed by
a colon, to reproduce the read notation.
(out_quasi_str): Pass down the format string, taken
from the fourth element of a sys:var item of the quasiliteral.
For simple symbolic items, pass down nil.
* match.c (tx_subst_vars): Pass nil as new argument of
format_field. The output variables of the TXR Pattern
language do not exhibit this feature.
* parser.l (FMT): New pattern for matching the format
string part.
(grammar): The rule which recognizes @ in quasiliterals
optionally scans the format notation, and turns it
into a string attached to the token's semantic value,
which is now of type val (see parser.y remarks).
* parser.y (tokens): The '@' token's %type changed
from lineno to val so it can carry the format string.
(q_var): If format string is present in the @ symbol,
then include it as the fourth element of the sys:var
form. This rule handles braced items.
(meta): We can no longer get the line number from the @
item, so we get it from n_expr.
(quasi_item): Similar to q_var change here. This handles
@ followed by unbraced items: symbols and other expressions.
* stdlib/compiler.tl (expand-quasi-mods): Take format
string argument. When the format string is present,
then generate code which uses the new alternative
run-time support functions, and passes them the format
string as an argument.
(expand-quasi-args): Extend the sys:var match to extract
the format string if it is present. Pass it down to
expand-quasi-mods.
* stdlib/match.tl (expand-quasi-match): Add an error case
diagnosing the situation when the program tries to use
a format-conversion-endowed item in a quasilit pattern.
* stream.[ch] (fmt1): New function.
* tests/012/quasi.tl: New tests.
* txr.1: Documented.
* lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
|
Introducing a relaxation in the obj.slot.(method arg)
syntax. There can be whitespace to the right of the dot,
for splitting across multiple lines, as:
obj.
slot.
(method arg)
* parser.l (OREFDOT): Allow optional whitespace
to the right of .?
* parser.y (n_expr): Add a n_expr LAMBDOT n_expr
phrase, with same semantic rule as n_expr '.' n_expr.
We cannot add optional whitespace after . in
the lexer because that is ambiguous with LAMBDOT.
* tests/012/syntax.tl: New test cases.
* txr.1: Documented.
* lex.yy.c.shipped, y.tab.c.shipped: Regenerated.
|
We allow the | character to be an identifier or constituent
of identifiers. This is going to be immediately useful
in infix syntax to provide the C-like operators.
* parser.l (BSCHR, NSCHR, ID_END): Add | character.
* genvim.txr (glyph, iskeyword): Add | character.
* lex.yy.c.shipped,
* tl.vim,
* txr.vim: Regenerated.
* txr.1: Documented.
|
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl,
stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl,
stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl,
stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl,
stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl,
stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl,
stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl,
stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl,
stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl,
stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl,
stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl,
stdlib/vm-param.tl, stdlib/with-resources.tl,
stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright bumped to 2025.
|
* parser.h (struct parser): New member, read_json_int.
* parser.c (read_json_int_s): New symbol variable
for *read-json-int* symbol.
(parser_common_init): Look up value of *read-json-int*
and store in read_json_int struct member.
(parse_init): Initialize read_json_int_s with interned
symbol and also register the dynamic variable.
* parser.l (grammar): Extend the {JNUM} rule to check
the read_json_int flag and produce an integer value if
the lexeme does not contain a decimal point, e or E.
* tests/010/json.tl: New tests.
* txr.1: Documented.
* lex.yy.c.shipped: Regenerated.
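The lexeme test described for the {JNUM} rule amounts to a one-line check. A minimal sketch, assuming the lexeme is available as a null-terminated char string; json_lexeme_is_integer is a hypothetical name, not a function in parser.l.

```c
#include <assert.h>
#include <string.h>

/*
 * Nonzero if a JSON number lexeme has no decimal point and no
 * exponent marker, so that it can be read as an integer.
 */
int json_lexeme_is_integer(const char *lexeme)
{
  assert (lexeme != NULL);
  return strpbrk(lexeme, ".eE") == NULL;
}
```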
|
* lib.c (int_str_wc): New function, made out of int_str.
This can be used by the parser to work with a wchar_t *
string without having to create a string object.
(int_str): Implemented in terms of int_str_wc.
* parser.l (grammar): Remove string_own calls from numerous
rule bodies that use int_str to return a number.
These rules now capture the wchar_t string, pass it to
int_str_wc and then immediately free it. Whereas string_own
allocates an extra object and leaves it to the garbage
collector.
* lex.yy.c.shipped: Regenerated.
|
* Makefile (shipped): Copy the shipped materials unconditionally,
rather than checking if they are different. If a patch exists
for a shipped file, then apply it.
* lex.yy.c.shipped, y.tab.c.shipped: Updated.
* lex.yy.c.patch, y.tab.c.patch: New files.
|
I've run into situations in which I wanted a comment in a
big JSON quasiliteral to explain some embedded piece of code.
We support only semicolon comments, and no #; ignore notation.
* parser.l (grammar): Recognize Lisp comments in the JSON
state also. That does it.
* tests/010/json.tl: One modest little test.
* txr.1: Documented.
* lex.yy.c.shipped: Regenerated.
|
@(push) is like @(output), but feeds back into input.
Use carefully.
* parser.y (PUSH): New token.
(output_push): New nonterminal symbol.
(output_clause): Handle OUTPUT or PUSH via output_push.
Some logic moved to output_helper.
(output_helper): New function. Transforms both @(output)
and @(push) directives. Checks both for valid keywords;
push has only :filter.
* parser.l (grammar): Recognize @(push similarly to other
directives.
* lib.[ch] (push_s): New symbol variable.
* match.c (v_output_keys): Internal linkage changes to external.
(v_push): New function.
(v_parallel): We must fix the max_line algorithm not to
use an initial value of zero, because lines can go negative
thanks to @(push). We end up rejecting the pushed data.
(v_collect): We can no longer assert that the data line
number doesn't retreat.
(dir_tables_init): Register push directive in table of
vertical directives.
* match.h (append_k, continue_k, finish_k): Existing symbol
variables declared.
(v_output_keys): Declared.
* y.tab.c.shipped,
* y.tab.h.shipped,
* lex.yy.c.shipped: Updated.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
|
* parser.l (YY_FATAL_ERROR): New macro.
(lex_irrecovarable_error): New function.
(parser_l_init): Take address of yy_fatal_error and cast to
void, to suppress warning that the function is unused.
* lex.yy.c.shipped: Updated.
|
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/compiler.tl,
stdlib/constfun.tl, stdlib/conv.tl, stdlib/copy-file.tl,
stdlib/debugger.tl, stdlib/defset.tl, stdlib/doloop.tl,
stdlib/each-prod.tl, stdlib/error.tl, stdlib/except.tl,
stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl,
stdlib/hash.tl, stdlib/ifa.tl, stdlib/keyparams.tl,
stdlib/match.tl, stdlib/op.tl, stdlib/optimize.tl,
stdlib/package.tl, stdlib/param.tl, stdlib/path-test.tl,
stdlib/pic.tl, stdlib/place.tl, stdlib/pmac.tl,
stdlib/quips.tl, stdlib/save-exe.tl, stdlib/socket.tl,
stdlib/stream-wrap.tl, stdlib/struct.tl, stdlib/tagbody.tl,
stdlib/termios.tl, stdlib/trace.tl, stdlib/txr-case.tl,
stdlib/type.tl, stdlib/vm-param.tl, stdlib/with-resources.tl,
stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright year bumped to 2023.
|
* parser.l (remove_char): New static function.
(DIGSEP, XDIGSEP, NUMSEP, FLOSEP, XNUMSEP, ONUMSEP,
BNUMSEP, ONUM, BNUM): New named lex patterns.
(FLODOT): Use DIGSEP instead of DIG.
(ONUM): Use ODIG instead of [0-7].
(BNUM): Use BDIG instead of [0-1].
(grammar): New rule for producing NUMBER from decimal
token with commas based on NUMSEP instead of NUM.
This is a copy and paste so that the NUM rule doesn't
deal with the comma removal, not to slow it down.
For the octal, binary and hex, we just switch to
BNUMSEP, ONUMSEP and XNUMSEP, so they all go through
one case.
Floating point numbers are also handled with a copy
pasted case using FLOSEP.
* tests/012/syntax.tl: New test cases.
* txr.1: Documented.
* genvim.txr (alpha-noe, digsep, hexsep, octsep, binsep): New
variables.
(txr_pnum, txr_xnum, txr_onum, txr_bnum, txr_num): Integrate
separating commas. Some bugs fixed in txr_num, some simplifications,
better txr_badnum pattern.
* lex.yy.c.shipped: Updated.
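Stripping the separating commas before numeric conversion can be done in place. The following is a sketch only: the signature of the actual remove_char in parser.l is not shown in this log, so strip_char and its return value are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Remove every occurrence of ch from the null-terminated string str,
 * in place. Returns the new length.
 */
size_t strip_char(char *str, char ch)
{
  char *dst = str;
  const char *src;

  assert (str != NULL);

  for (src = str; *src != 0; src++)
    if (*src != ch)
      *dst++ = *src;

  *dst = 0;
  return (size_t) (dst - str);
}
```

Applied to the token text 1,234,567 with ch set to a comma, this leaves 1234567, which can then go through the ordinary integer conversion.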
|
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, buf.c, buf.h, cadr.c,
cadr.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h,
combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h,
ffi.c, ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h,
glob.c, glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, lisplib.c, lisplib.h, match.c, match.h,
parser.c, parser.h, parser.l, parser.y, protsym.c, psquare.h,
rand.c, rand.h, regex.c, regex.h, signal.c, signal.h,
socket.c, socket.h, stdlib/arith-each.tl, stdlib/asm.tl,
stdlib/awk.tl, stdlib/build.tl, stdlib/cadr.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl,
stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.1,
txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h,
vmop.h, win/cleansvg.txr, y.tab.c.shipped: Copyright year
bumped to 2022.
|
* parser.l (NJPUNC): This inverted class lexical category must
exclude the carriage return character \r, otherwise it matches
it. The JSON keywords true, false and null are recognized as
sequences of NJPUNC. If we don't exclude \r from NJPUNC, it
looks like a symbol constituent, comprising an unrecognized
JSON keyword.
* lex.yy.c.shipped: Updated.
|
* Makefile, alloca.h, args.c, args.h, arith.c, arith.h, buf.c,
buf.h, chksum.c, chksum.h, chksums/crc32.c, chksums/crc32.h,
combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, ffi.c,
ffi.h, filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c,
glob.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S, lib.c,
lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c,
parser.h, parser.l, parser.y, rand.c, rand.h, regex.c,
regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/asm.tl, stdlib/awk.tl, stdlib/build.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/debugger.tl, stdlib/defset.tl,
stdlib/doloop.tl, stdlib/each-prod.tl, stdlib/error.tl,
stdlib/except.tl, stdlib/ffi.tl, stdlib/getopts.tl,
stdlib/getput.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/match.tl, stdlib/op.tl,
stdlib/optimize.tl, stdlib/package.tl, stdlib/param.tl,
stdlib/path-test.tl, stdlib/pic.tl, stdlib/place.tl,
stdlib/pmac.tl, stdlib/quips.tl, stdlib/save-exe.tl,
stdlib/socket.tl, stdlib/stream-wrap.tl, stdlib/struct.tl,
stdlib/tagbody.tl, stdlib/termios.tl, stdlib/trace.tl,
stdlib/txr-case.tl, stdlib/type.tl, stdlib/vm-param.tl,
stdlib/with-resources.tl, stdlib/with-stream.tl,
stdlib/yield.tl, stream.c, stream.h, struct.c, struct.h,
strudel.c, strudel.h, sysif.c, sysif.h, syslog.c, syslog.h,
termios.c, termios.h, time.c, time.h, tree.c, tree.h, txr.c,
txr.h, unwind.c, unwind.h, utf8.c, utf8.h, vm.c, vm.h, vmop.h:
License reformatted.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
|
The make_hash function now takes the hash_weak_opt_t
enumeration instead of a pair of flags.
* hash.c (do_make_hash): Take enum argument instead of pair of
flags. Just store the option; nothing to calculate.
(weak_opt_from_flags): New static function.
(tweak_hash): Function removed.
(make_seeded_hash): Adjust to new do_make_hash interface with
help from weak_opt_from_flags.
(make_hash, make_eq_hash): Take enum argument instead of pair
of flags.
(hashv): Calculate hash_weak_opt_t enum from the extracted
flags, pass down to make_eq_hash or make_hash.
* hash.h (tweak_hash): Declaration removed.
(make_hash, make_eq_hash): Declarations updated.
* eval.c (me_case, expand_switch): Update make_hash
calls to new style.
(eval_init): Update make_hash calls and get rid of tweak_hash
calls. This renders the tweak_hash function unused.
* ffi.c (make_ffi_type_enum, ffi_init): Update make_hash calls
to new style.
* filter.c (make_trie, trie_add, filter_init): Likewise.
* lib.c (make_package_common, obj_init, obj_print): Likewise.
* lisplib.c (lisplib_init): Likewise.
* match.c (dir_tables_init): Likewise.
* parser.c (parser_circ_def, repl, parse_init): Likewise.
* parser.l (parser_l_init): Likewise.
* struct.c (struct_init, get_slot_syms): Likewise.
* sysif.c (get_env_hash): Likewise.
* lex.yy.c.shipped, y.tab.c.shipped: Updated.
|
Each time the scanner processes a floating-point token,
it allocates a string object, just so it can call flo_str.
The object is then garbage. Let's stop doing that.
* lib.c (flo_str_utf8): New function, closely based on flo_str.
Takes a char * string.
* lib.h (flo_str_utf8): Declared.
* parser.l (out_of_range_float): Take the token as a const
char * string instead of a Lisp string, so we can just pass
yytext to this function.
(grammar): Use flo_str_utf8 instead of flo_str, and pass yytext
to out_of_range_float.
* lex.yy.c.shipped: Updated.
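Converting a floating-point token straight from a char string, with no intermediate string object, might look like the sketch below. It is not flo_str_utf8: parse_float_token is a hypothetical name, and the use of strtod with an errno check for the out-of-range case is an assumption about one reasonable way to do it.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/*
 * Convert the whole of text to a double.
 * Returns 1 on success with the value stored in *out,
 * 0 if text is not entirely a number,
 * -1 if the value overflows the range of double.
 */
int parse_float_token(const char *text, double *out)
{
  char *end;
  double d;

  assert (text != NULL && out != NULL);

  errno = 0;
  d = strtod(text, &end);

  if (end == text || *end != 0)
    return 0;
  if (errno == ERANGE && (d == HUGE_VAL || d == -HUGE_VAL))
    return -1;

  *out = d;
  return 1;
}
```

A scanner rule could pass yytext directly to such a function and report the out-of-range case using the same pointer.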
|
The big comment I added above end_of_json_unquote summarizes
the issue. This issue has been uncovered by some test cases in
a JSON test suite, not yet committed.
* parser.l <JMARKER>: New start condition. Used as a reliable
marker in the start condition stack, based on which
end_of_json_unquote can intelligently fix up the stack.
(JSON): In the transitions to the quasiquote scanning NESTED
state, push the JMARKER start condition before NESTED.
(JMARKER): The lexer should never read input in the JMARKER
state. If we don't put in a rule to catch this, if that ever
happens, the lexer will just copy the source code to standard
output. I ran into this during debugging.
(end_of_json_unquote): Rewrite the start condition stack
intelligently based on what the Lisp lookahead token has done
to it, so parsing can smoothly continue.
* lex.yy.c.shipped: Regenerated.
|
* eval.c (eval_init): get-json intrinsic registered.
* parser.c (prime_parser): Handle prime_json.
(lisp_parse_impl): Take enum prime_parser argument
directly instead of the interactive flag.
(lisp_parse, nread, iread): Pass appropriate prime_parser
value instead of the original flag.
(get_json): New function. Like nread, but passes prime_json.
* parser.h (enum prime_parser): New constant, prime_json.
(get_json): Declared.
* parser.l (prime_scanner): Handle prime_json.
* parser.y (SECRET_ESCAPE_J): New terminal symbol.
(spec): New productions around SECRET_ESCAPE_J for parsing
JSON.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
* txr.1: Documented.
* share/txr/stdlib/doc-syms.tl: Updated.
|
The JSON null will map to the Lisp null symbol. I thought
about using : but that could cause surprises; like when it's
passed to functions as an optional argument, it will trigger
the default value.
* parser.l (JSON): Add rules for producing null keyword.
* txr.1: Documented.
* lex.yy.c.shipped: Updated.
|
* parser.l <JLIT>: Convert \u0000 sequence to U+DC00
code point, the pseudo-null. Also include JLIT
in the rule for catching bad bytes that are not
matched by {UANYN}.
* txr.1: Document this treatment as extensions to JSON.
* lex.yy.c.shipped: Updated.
|
Because #J<json> produces the (json ...) form that translates
into quote, ^#J<json> yields a quasiquote around a quote. This
has some disadvantages, because it requires an explicit eval
in some situations to do what the programmer wants.
Here, we introduce an alternative: the syntax #J^<json> will
produce a quasiquote instead of a quote.
The new translation scheme is
#J X -> (json quote <X>)
#J^ X -> (json sys:qquote <X>)
where <X> denotes the Lisp object translation of JSON syntax X.
* parser.c (me_json): The quote symbol is now already in the
json form, so all that is left to do here is to take the cdr
to pop off the json symbol.
* parser.l (JPUNC, NJPUNC): Allow ^ to be a punctuator in
JSON mode.
* parser.y (json): For regular #J, generate the new (json
quote ...) syntax. Implement #J^ which sets up the nonzero
quasi_level around the processing of the JSON syntax, so that
everything is in a quasiquote, finally producing the
(json sys:qquote ...) syntax.
* lex.yy.c.shipped, y.tab.c.shipped: Updated.
|
* parser.h (end_of_json_unquote): Declared.
* parser.l (JPUNC, NJPUNC): Add ~ and * characters to set of
JSON punctuators.
(grammar): Allow closing brace character in NESTED, SPECIAL
and QSPECIAL states to be a token. This is because it occurs
as a lookahead character in this situation #J{"foo":~expr}.
The lexer switches from the JSON to the NESTED start state
when it scans the ~ token, so that expr is treated as Lisp.
But then } is consumed as a lookahead token by the parser in
that same mode; when we pop back to JSON mode, the } token has
already been scanned in NESTED mode.
We add two new rules in JSON mode to the lexer to recognize
the ~ unquote and ~* splicing unquote. Both have to push the
NESTED start condition.
(end_of_json_unquote): New function.
* parser.y (JSPLICE): New token.
(json_val): Logic for unquoting. The array and hash rules must
now be prepared to deal with json_vals and json_pairs
producing a list object instead of a hash or vector. That is
the signal that the data contains active quasiquotes and must
be translated to the special literal syntax for quasiquoted
vectors and hashes. Here we also add the rules for ~ and ~*
unquoting syntax, including managing the lexer's transition
back to the JSON start condition.
(json_vals, json_pairs): We add the logic here to recognize
unquotes in quasiquoting state. This is more clever than the
way it is done in the Lisp areas of the grammar. If no
quasiquotes occur, we construct a vector or hash,
respectively, and add to it. If unquotes occur and if we
are nested in a quasiquote, we switch the object to a list,
and continue it that way.
(yybadtoken): Handle JSPLICE.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
|
* parser.l (HASH_N_EQUALS, HASH_N_HASH): Recognize these
tokens in the JSON start state also.
* parser.y (json_val): Add the circular syntax, exactly like
it is done for n_expr and i_expr. And it works!
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped: Updated.
|
(needs buffer literal error message cleanup)
* parser.c (json_s): New symbol variable.
(is_balanced_line): Follow braces out of initial state.
This concession allows the listener to accept input
like #J{"a":"b"}.
(me_json): New static function (macro expander). The #J X
syntax produces a (json Y) form, with the JSON syntax X
translated to a Lisp object Y. If that is evaluated,
this macro translates it to (quote Y).
(parse_init): initialize json_s variable with interned symbol,
and register the json macro.
* parser.h (json_s): Declared.
(end_of_json): Declared.
* parser.l (num_esc): Treat u escape sequences in the same way
as x. This function can then be used for handling the \u
escapes in JSON string literals.
(DIG19, JNUM, JPUNC, NJPUNC): New lex named patterns.
(JSON, JLIT): New lex start conditions.
(grammar): Recognize #J syntax, mapping to HASH_J token,
which transitions into JSON start state.
In JSON start state, handle all the elements: numbers,
keywords, arrays and objects. Transition into JLIT state.
In JLIT start state, handle all the elements of JSON string
literals, including surrogate pair escapes.
JSON literals share the {UANY} fallback pattern with
other literals.
(end_of_json): New function.
* parser.y (HASH_J, JSKW): New token symbols.
(json, json_val, json_vals, json_pairs): New nonterminal
symbols, and rules.
(i_expr, n_expr): Generate json nonterminal, to hook the
stuff into the grammar.
(yybadtoken): Handle JSKW and HASH_J tokens.
* lex.yy.c.shipped, y.tab.c.shipped, y.tab.h.shipped:
Updated.
|
* parser.l (BUFLIT): When reporting a bad character, do not
show it in the form of an escape sequence.
* lex.yy.c.shipped: Updated.
|
This is fairly obscure. A repro test case is a file which
contains:
3"foo"
When the 3 is parsed, the " is also scanned as a lookahead
token, and when that happens, the lexer shifts into the
STRLIT state. At that point the parse job finishes for
that top-level form.
The next time the parser is called, it will prime the token
stream by pushing the " token into it. But, the lex state is
not put into the STRLIT state. The result is that the parser
obtains the " token, and then foo is lexically analyzed in the
wrong state as a symbol. A syntax error occurs: symbol token
in the middle of a string literal, instead of just a sequence
of LITCHAR tokens, as expected.
What we can do is associate a lex state with pushback tokens.
If a pushback token has a nonzero lex state which is different
from the current YYSTATE, then when that pushback token is
consumed, we push that state also.
* parser.h (struct yy_token): New member, yy_lex_state.
* parser.c (parser_common_init): Initialize the new
yy_lex_state member of every token member of the parser
structure.
* parser.l (yylex): When feeding a pushed token to the parser,
if that token has a nonzero state, and the state is different
from YYSTATE, we push that state. So for instance a pushed
back " token will carry the STRLIT state, which is different
from the NESTED state that will be in effect at the start of
the parse job, and so it will be pushed, as if the " character
had been scanned. Also, when we call the real yylex_impl,
when we are storing the recently seen token in recent_tok, we
also store the current YYSTATE along with it. That's how
tokens get associated with a state. The artificial tokens that
are used for priming parsing like SECRET_ESCAPE_E are never
associated with a nonzero state.
* tests/012/syntax.tl: Some test cases that didn't pass
before this.
* lex.yy.c.shipped: Regenerated.
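
The mechanism described above can be sketched roughly as follows. This is
a minimal illustration, not the actual parser.l code: the names
struct tok, struct scanner, push_state and next_token are hypothetical,
and a single pushback slot stands in for the real token queue.

```c
#include <assert.h>

/* Hypothetical lexer states; the names follow the commit message. */
enum lex_state { ST_NONE = 0, ST_NESTED, ST_STRLIT };

#define MAX_DEPTH 8

struct tok {
  int type;              /* token code; 0 means the slot is empty */
  enum lex_state state;  /* lex state recorded with the token, or ST_NONE */
};

struct scanner {
  enum lex_state stack[MAX_DEPTH];
  int depth;             /* stack[depth - 1] is the current state */
  struct tok pushback;   /* one pushback slot, for simplicity */
};

static enum lex_state cur_state(const struct scanner *s)
{
  assert(s->depth > 0);
  return s->stack[s->depth - 1];
}

static void push_state(struct scanner *s, enum lex_state st)
{
  assert(s->depth < MAX_DEPTH);
  s->stack[s->depth++] = st;
}

/* Feed a pushed-back token to the parser. If the token carries a
   nonzero state that differs from the current one, push that state,
   as if the token had just been scanned. Real scanning is omitted. */
static int next_token(struct scanner *s)
{
  if (s->pushback.type != 0) {
    struct tok t = s->pushback;
    s->pushback.type = 0;
    s->pushback.state = ST_NONE;
    if (t.state != ST_NONE && t.state != cur_state(s))
      push_state(s, t.state);
    return t.type;
  }
  return 0;
}
```

With a pushed-back " token that carries ST_STRLIT and a scanner that
starts in ST_NESTED, next_token returns the " token and leaves the
scanner in ST_STRLIT, so the following characters are scanned as
string literal content.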
|
|
|
|
|
|
|
|
|
|
| |
* parser.l (grammar): Just like we do in SREGEX, allow an
arbitrary byte in REGEX, mapping it to the DCxx range.
Do the same inside string literals of all types.
* lex.yy.c.shipped: Updated.
* tests/012/parse.tl: New tests.
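
As a rough illustration of the byte mapping, assuming that "the DCxx
range" means the code points U+DC00 through U+DCFF, a byte can be
carried in the low eight bits of such a code point. The function names
below are hypothetical and are not taken from the TXR sources.

```c
#include <wchar.h>

/* Map a raw byte to a code point in the range U+DC00..U+DCFF
   (assumed meaning of "DCxx"). */
static wchar_t byte_to_dc(unsigned char byte)
{
  return (wchar_t) (0xDC00 | byte);
}

/* Recover the original byte from a DCxx code point, or return -1 if
   the character lies outside that range. */
static int dc_to_byte(wchar_t ch)
{
  return (ch >= 0xDC00 && ch <= 0xDCFF) ? (int) (ch & 0xFF) : -1;
}
```

Because the mapping is reversible, a byte that passes through in this
form can be written back out unchanged.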
|
|
|
|
|
|
|
|
|
|
|
| |
This picks up the changes introduced by the previous
three commits.
* lex.yy.c.shipped: Updated.
* y.tab.c.shipped: Likewise.
* y.tab.h.shipped: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Whereas @a..@b parses and transforms to (rcons @a @b),
@(a)..@(a) goes to @(rcons a @(a)).
* parser.l (grammar): Under 248 compatibility or lower, the @
character now produces the OLD_AT token. Otherwise it produces
the '@' character, as before.
* parser.y (OLD_AT): New token replaces the '@' at the old
low precedence position. '@' is now at the highest precedence,
together with OLD_DOTDOT. (We don't care about interactions
between '@' and OLD_DOTDOT, because OLD_DOTDOT only exists in
185 compatibility, in which '@' is OLD_AT).
(meta): The two rules have to be unfortunately duplicated for
OLD_AT, since there is no BNF OR operator in Yacc.
* txr.1: Compat note added.
* lex.yy.c.shipped: Updated.
* y.tab.c.shipped, y.tab.h.shipped: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* METALICENSE: 2020 copyrights bumped to 2021. Added note
about SHA-256 routines from Colin Percival.
* LICENSE, LICENSE-CYG, Makefile, alloca.h, args.c, args.h,
arith.c, arith.h, buf.c, buf.h, cadr.c, cadr.h, chksum.c,
chksum.h, chksums/crc32.c, chksums/crc32.h, combi.c, combi.h,
configure, debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h,
filter.c, filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h,
hash.c, hash.h, itypes.c, itypes.h, jmp.S, lex.yy.c.shipped,
lib.c, lib.h, linenoise/linenoise.c, linenoise/linenoise.h,
lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, rand.c, rand.h, regex.c,
regex.h, share/txr/stdlib/asm.tl, share/txr/stdlib/awk.tl,
share/txr/stdlib/build.tl, share/txr/stdlib/cadr.tl,
share/txr/stdlib/compiler.tl, share/txr/stdlib/conv.tl,
share/txr/stdlib/copy-file.tl, share/txr/stdlib/debugger.tl,
share/txr/stdlib/defset.tl, share/txr/stdlib/doloop.tl,
share/txr/stdlib/each-prod.tl, share/txr/stdlib/error.tl,
share/txr/stdlib/except.tl, share/txr/stdlib/ffi.tl,
share/txr/stdlib/getopts.tl, share/txr/stdlib/getput.tl,
share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl,
share/txr/stdlib/keyparams.tl, share/txr/stdlib/op.tl,
share/txr/stdlib/package.tl, share/txr/stdlib/param.tl,
share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl,
share/txr/stdlib/pmac.tl, share/txr/stdlib/quips.tl,
share/txr/stdlib/save-exe.tl, share/txr/stdlib/socket.tl,
share/txr/stdlib/stream-wrap.tl, share/txr/stdlib/struct.tl,
share/txr/stdlib/tagbody.tl, share/txr/stdlib/termios.tl,
share/txr/stdlib/trace.tl, share/txr/stdlib/txr-case.tl,
share/txr/stdlib/type.tl, share/txr/stdlib/vm-param.tl,
share/txr/stdlib/with-resources.tl,
share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl,
signal.c, signal.h, socket.c, socket.h, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright year bumped to 2021.
|
|
|
|
|
| |
* lex.yy.c.shipped (YY_DECL): Fix some bad indentation, most
likely caused by using a mixture of tabs and spaces.
|
|
* Makefile (BS_LIC_FROM, BS_LIC_TO): Variables removed.
(y.tab.c): Remove all filtering hacks. Don't remove the
license from y.tab.c. Don't remove yyparse declaration from
y.tab.h. Provide a pattern rule for producing any missing
file X from X.shipped. That's how y.tab.c and y.tab.h
get produced from y.tab.c.shipped and y.tab.h.shipped,
respectively, in user mode.
* y.tab.c.shipped, y.tab.h.shipped: New files, generated
using Bison 2.5.
|