## What is `egawk`?

`egawk` is Enhanced GNU Awk. It is a fork of GNU Awk with some enhancements
designed and implemented by Kaz Kylheku.

**NOTE:** If you have problems with `egawk` or questions about it, please do
not contact the GNU Awk maintainers or the `bug-gawk` mailing list at
`gnu.org`. Only contact those people if you have a bug that you can reproduce
with mainline GNU Awk. Always remember to try to produce a minimal sample of
code, and any required input, to reproduce the problem.

## The `@let` statement

The `@let` statement in Enhanced GNU Awk provides block-scoped lexical
variables. The syntax looks like this:

    ::awk
    @let (x = 1, y = 3, z = f(x, y))
      print z

The token sequence `@` and `let` introduces the statement. This is followed by
a list of variable bindings in parentheses. That list is then followed by a
statement. The statement is executed in a scope in which the variables are
visible.

As the above example shows, the bindings are established sequentially, which
is why `z` can be initialized using an expression which depends on `x` and `y`.

A `@let` variable need not have an initializer:

    ::awk
    @let (a, b)
      print a == 0 && b == "" # prints 1

Variables without an initializer are reliably initialized to the Awk null
value: the same value that is exhibited by ordinary Awk variables that have
not been assigned. This value compares equal to both 0 and the empty string
`""` under the `==` operator.

The scope of a `@let` variable begins immediately after its binding, including
its initializing expression, if any. The following is possible:

    ::awk
    function f(x)
    {
      @let (x = x + 1)
        return x
    }

Here `x` is initialized with an expression that uses `x`. That expression
still refers to the previously visible `x`; the scope of the newly introduced
`x` begins after that initializing expression. The new `x` shadows the
previous `x`.

## Restrictions

`@let` variables may not have the same names as Awk's special variables such
as `NF`, `FS` and whatnot.
Inside a function, a `@let` variable must not have the same name as the
function. Lastly, variables may not use namespace prefixes: `foo::bar` cannot
be used as a `@let` variable name.

These restrictions are not new; mainline GNU Awk's function parameters have
the same restrictions.

`@let` may appear inside functions, as well as outside of functions in the
action bodies of pattern rules, and in the `BEGIN` and `END` blocks:

    ::awk
    BEGIN {
      @let (x = 3)
        ...
    }

    /^id=/ {
      @let (id = ...)
        ...
    }

## Rationale

Why not JavaScript-like syntax?

    ::js
    {
      let x = 3
      ...
    }

The reason is that this syntax is not friendly toward macros. The motivation
for `egawk` comes from the [`cppawk`](https://www.kylheku.com/cgit/cppawk)
project.

With `@let`, this sort of thing is possible:

    ::c
    #define repeat(n) @let (__c, __n = (n)) for (__c = 0; __c < __n; __c++)

Here, the expansion of `repeat(42)` produces the structure
`@let (...) for (...)` which just requires the addition of a statement to
produce a complete construct:

    ::c
    repeat(42) {
      print "hello"
    }

The JavaScript-style syntax doesn't make this possible. We would have to rely
on the feature of declaring variables inside the `for`:

    ::js
    for (let __c = 0, __n = (n); __c < __n; __c++)

This is not attractive because it requires us to inject the `let` syntax into
the phrase structure of every statement type: `if`, `while`, `switch`. The
selected design, by contrast, blends easily with any statement, like a prefix:

    ::awk
    @let (x = 3) return x

    @let (x = c / 2) switch (x) { }

The `@` prefix in `@let` follows a convention established by GNU Awk. GNU Awk
has extensions like `@include` for including files, and `@fun(arg)` for
indirect function calls.

## Compatibility

If you have GNU Awk code that uses `let` as the name of an indirect function,
`egawk` interprets that as the start of a let statement. It's possible that no
syntax error will take place, only different behavior.
This GNU Awk program produces the output `42`, because `@let()` means "call
the function whose name is stored in the `let` variable":

    ::awk
    function f()
    {
      print 42
    }

    BEGIN {
      let = "f";
      @let();
    }

When executed with `egawk`, it produces no output, because `@let();` looks
like an empty let statement. The superfluous semicolon satisfies its need for
a statement and so everything parses.

## Implementation Notes

The implementation of `@let` differs inside functions versus outside.

`@let` statements outside of a function are compiled to code which uses
hidden, global variables. These variables have numbered names similar to
`$let0001`. When the GNU Awk `-d-` option is used to dump the symbol table,
these names show up in it.

Inside a function, `@let` is compiled to code which assumes that the variables
are allocated in the function's local frame. Unmodified, upstream GNU Awk has
a parameter frame which is entirely dedicated to parameter passing. Local
variables are simulated by defining additional parameters, which is a standard
Awk idiom. Enhanced GNU Awk separates the frame into a parameter area and a
locals-only area that is off-limits to the parameter passing mechanism. The
compiler extends this locals-only area to accommodate all the `@let` variables
that occur in the function.

Whether inside or outside a function, `@let` statements allocate variables in
a stack-like fashion. Whenever a `@let` scope terminates, the compiler
releases the storage locations used for that `@let`, allowing them to be
re-used for a subsequent `@let`. Thus this program allocates exactly two
hidden global variables:

    ::awk
    BEGIN {
      @let (a, b);
      @let (c, d);
      @let (e, f);
    }

This one allocates three, because `z` remains in scope while each inner pair
is live:

    ::awk
    BEGIN {
      @let (z) {
        @let (a, b);
        @let (c, d);
        @let (e, f);
      }
    }

In order to support the dynamic (compile-time) extension of the local frame
with new local variables, I changed the representation of the function
parameter frame.
Upstream `gawk` has it as a dynamic array of `NODE` objects; I made it a
dynamic array of `NODE *` pointers to individually allocated `NODE` objects.
Gawk's fixed array cannot be reallocated to fit the exact size, because the
`NODE` addresses would change after they have been inserted into generated
bytecode. I have an idea for solving that, which could restore the original
representation.

Enhanced GNU Awk adds one bytecode instruction called `Op_clear_var`. This is
necessary to reset lexical variables. Recall from the above paragraphs that
lexical variables whose scopes do not overlap are allocated in the same
storage. This causes several problems.

When a new variable is allocated in the space of an old one, the space
contains the prior value. That "garbage" must be cleared out. (Contrast that
with the C language, in which uninitialized block-scope locals appear to have
garbage values, taking on whatever bits happen to be in the memory.)

There is another problem though, which affects even initialized variables.
Awk does not like it when a variable that holds an array is used as a scalar,
or vice versa:

    ::awk
    x[3] = 42
    x = "abc"         # error: array used as scalar

    y = 3.14
    y["foo"] = "bar"  # error: scalar used as array

In standard Awk, and in GNU Awk, there is nothing that a program can do to
change the variable `x` such that it forgets it was an array, or to change `y`
to forget that it was a scalar and work as an array. The new `Op_clear_var`
opcode used by the `@let` implementation in `egawk` solves this problem,
thanks to its access to the internal representation of a variable.

## Credits

The `@let` syntax is inspired by Lisp:

    ::lisp
    (let* ((x 1)
           (y 2))
      ...)
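
For readers who do not know Lisp, the point of the `let*` comparison is that each binding can refer to the ones before it. As a rough sketch of what `@let` replaces, the snippet below does the same sequential setup in portable Awk, using the extra-parameter idiom for locals mentioned under Implementation Notes. It deliberately avoids `@let` so that it runs under any POSIX `awk`. The shell wrapper `let_star_demo` and the Awk function `demo` are names made up for this illustration; they are not part of `egawk`.

```shell
# Hypothetical helper for illustration only; not part of egawk.
let_star_demo() {
  awk '
  # x and y are not real arguments: by the usual Awk convention,
  # parameters after the extra spaces serve as local variables.
  function demo(    x, y)
  {
    x = 1
    y = x + 1        # refers to the earlier local, as a let* binding would
    return x + y
  }

  BEGIN { print demo() }
  '
}

let_star_demo
```

Under `egawk`, the body of `demo` could presumably be written as `@let (x = 1, y = x + 1) return x + y`, with no extra parameters in the function signature.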