author Kaz Kylheku <kaz@kylheku.com> 2022-04-09 11:55:51 -0700
committer Kaz Kylheku <kaz@kylheku.com> 2022-04-09 11:55:51 -0700
commit 5a0d83f4b42b9ca28cc6b8dd190a570c47b203c8
tree 2ef4893a8f91d411f48f527877b35f180597d2ed
parent b6fd48c9891858d9f84ee49b6735be5db950a8a0
Feature: @let statement provides block-scoped locals
This is a feature which builds on the @local work, providing a statement which looks like this:

  @let (var1 [= init1], var2 [= init2], ...) statement

The variables are bound, given their optional values or else left undefined, and the statement is executed in the scope in which they exist. These variables may have the same name as existing block-scoped locals, function-wide @local locals, or function parameters, in which case shadowing takes place.

* awk.h (NODETYPE): New enum symbol Node_alias. This marks entries in the new symbol table alias_table, which is needed for compiling @let statements that occur outside of functions.
(NODE): New union member sub.nodep.n. Member sub.nodep.rn is moved into this union, becoming sub.nodep.n.rn. New member sub.nodep.n.rpn, of type NODE **. This is now the implementation of fparms.
(nxparam): New macro, denoting sub.nodep.x.extra. This is used as the link pointer for maintaining a stack of lexicals while compiling a function.
(let_alias): New macro, denoting sub.nodep.l.lptr. This is the new Node_alias type's semantic payload: the node being aliased.
(frame_cnt): New macro for sub.nodep.reserved. This is used to keep track of a function's frame size (number of local variables). If there are no local variables other than function parameters, then f->frame_cnt == f->param_cnt. Otherwise f->frame_cnt > f->param_cnt. Note that traditional Awk local variables, which are indistinguishable from parameters, are part of param_cnt. We are talking about the new style local variables here.
(fparms): Now sub.nodep.n.rpn.
(for_array, xarray): Macros follow the move of sub.nodep.rn to sub.nodep.n.rn.
(OPCODE): New opcode, Op_clear_var, for clearing a variable to the null value.
(make_params): Return NODE **, rather than NODE *.
(extend_locals, install_let, install_global_let, remove_let): Functions declared.

* awkgram.y (in_function): Global variable changes type from bool to INSTRUCTION *.
It still functions as a Boolean, indicating that the parser is processing a function, but it also gives the instruction node.
(in_loop): New static variable: loop nesting count. If this is positive, we are parsing a construct located inside a loop.
(let_free): New static variable. This is the free list of lexical variables. When, during the compilation of a function, a let block is done, the lexical variable nodes are returned to the free list. The next time a let block opens, it can allocate nodes from that list. Thanks to this recycling list, lexical variables with non-overlapping scopes will map to the same underlying locations in the function's param/local frame.
(let_stack): New static variable. When a new let variable is introduced into the scope, it is pushed onto this stack. The @let construct remembers the original stack top. When it's done compiling, it pops the stack back down to the original top, and returns all the entries thus removed to the let_free list.
(LEX_LOCAL, LEX_LET): New terminal symbols for the param keyword in the @ param notation.
(function_prologue): Reset let_stack and let_free to NULL when starting to compile a function. Store the function node $1 into in_function.
(statement): For all the looping constructs, decrement the in_loop nesting count that is incremented by the lexical analyzer. New '@' LEX_LET '(' ... phrase structure here. Add new production for '@' LEX_LOCAL.
(local_var_list_opt, local_var_list): New nonterminal symbols.
(let_var_list_opt, let_var_list): New nonterminal symbols.
(simple_variable): New production for '@' LEX_LOCAL ':' NAME. This is a copy of the NAME production, with the additional logic that when in_function is true, the symbol is added as a local to the current function via add_local before being processed as a variable reference. So it is as if it had been a parameter all along.
(tokentab): Register "let" with the LEX_LET token number, attributing it as a special symbol that is a Gawk extension, much like "include" and other "@" items. Same for "local" and LEX_LOCAL.
(yylex): Recognize "let" and "local" only after "@". Increment in_loop nesting counter for for, while and do.
(parms_shadow): Check the entire frame, not just the parameters, for shadowing. Update to NODE ** fparms representation.
(mk_function): Updated due to rename of remove_params to remove_locals.
(install_function): Pass the new third parameter to make_params in order to receive the allocation size of the parameter vector. This is installed into f->param_alloc. The new add_local function makes use of this to diagnose it if there is no room to add parameters. Update to NODE ** fparms, and initialize frame_cnt equal to param_cnt.
(add_local): New static function for adding a parameter to the function currently being compiled. This has to perform diagnostics on the local variable, similar to the checks done on a parameter. The extend_locals function in symbol.c is relied upon to do the reallocation to add the parameter and also register it in the symbol table. We strdup the parm->lextok parameter name, because that will later be freed during parsing.
(add_let): New function, closely based on add_local, but with these differences: there is no duplicate check, since shadowing is allowed, and extend_locals is only used if the let_free list is empty.
(check_params): A few parameter checks are moved into the new check_param function. This was because they were shared with an earlier implementation of add_local. The function then got cloned into check_local in order to have different wording.
(check_param): New static function, split off from check_params.
(check_local): New static function, closely based on check_param, but using different wording to distinguish locals from params.
Uses NULL value of fname to disable checks; this is for when we are compiling a let that isn't in a function.
(gensym): New static function. Used for generating anonymous globals. Let variables outside of functions are implemented as aliases for anonymous globals. The globals are allocated/freed in a stack-like manner, exactly like frame locations are allocated for let variables inside a function.

* command.y (variable_generator): Update to NODE ** fparms.

* debug.c (do_info): Report the locals as if they were parameters, so that they are visible under debugging. Update parameter access to reflect NODE ** representation.
(find_param): Find a local variable too. Update to NODE ** fparms.
(print_function, print_memory, print_instruction): Adjust to NODE ** fparms. In print_instruction, handle new Op_clear_var.
(do_eval): Follow rename of remove_params to remove_locals. Append all the locals to the frame stack. Update to NODE ** fparms.
(parse_condition): Follow rename of remove_params to remove_locals.

* eval.c (nodetypes): Entry for Node_alias.
(optypes): Entry for Op_clear_var.
(setup_frame): Allocate the full locals frame (f->frame_cnt), but check the arguments against only the param count (f->param_cnt). Update to NODE ** fparms.
(restore_frame): Destroy all the locals, not just parameters.

* interpret.h (r_interpret): Implement Op_clear_var opcode.

* profile.c (func_params): Change to NODE ** type.
(pprint, pp_func): Update to NODE ** fparms. In pprint, handle Op_clear_var.

* symbol.c (alias_table): New static variable.
(TAB_PARAM, TAB_ALIAS, TAB_GLOBAL, TAB_FUNC, TAB_SYM, TAB_COUNT): New enum symbols. These identify indices into the tables array.
(tables): New static array variable: replaces the automatic array used in the lookup function which is initialized on every lookup call. This array is no longer terminated by a null pointer.
(init_symbol_table): Initialize alias_table. Initialize entries of tables.
(lookup): Local array tables and its initialization removed. Iterate over the tables while the index is below TAB_COUNT rather than by looking for a null pointer. Check for the global_table by comparing the index to TAB_GLOBAL rather than the table pointer to global_table. Interestingly, gawk already has support for placing multiple NODE objects in the param_table under the same name with LIFO discipline, i.e. shadowing. Each node in the param_table is understood to have a stack of duplicates using the dup_ent link pointer. The only thing is, the lookup() function ignores this. We fix it so that if there is a stack of duplicates, it substitutes the top indicated by dup_ent. We implement the alias table: if lookup finds an entry in this table, we don't return that entry, but what it aliases for, indicated by n->let_alias. The alias table thus implements symbol macros; Node_alias nodes are never seen at run-time because aliases resolve during compiling, and that's why we don't check for them in any of the run-time code, like eval.c, debug.c, profile.c or interpret.h.
(make_params): Returns NODE ** now and allocates an array of pointers, using individual getnode calls to allocate the NODE objects, rather than allocating params as a contiguous block of NODE objects.
(extend_locals): New function. This reallocates the param array to hold one more parameter, on the assumption that the caller's lcount parameter is one greater than the current size. The parameter is initialized and registered in the symbol table.
(install_params): Update to NODE ** fparms.
(remove_params): Renamed to remove_locals. Refers to frame_cnt rather than param_cnt. Symbol logic moved to remove_common.
(remove_common): New static function, factored out from remove_locals.
(install_let): New public wrapper around install: lets us install a node under a specific alias name.
(install_global_let): New function: this is the public interface for installing an entry in the alias table.
(remove_let): New function: essentially a wrapper around remove_common. Previously, this functionality was only exposed to other modules via install_params and remove_locals, which take a function object from which they retrieve an array of locals. For the let construct, we need more flexibility to bind and unbind individual symbols, not everything at once.
(destroy_symbol): Refer to frame_cnt rather than param_cnt.
(install): Handle nodes of NODETYPE Node_alias, by selecting the alias_table. The alias_table supports duplicate entries via the dup_ent mechanism, just like param_table.
(check_param_names): Refer to frame_cnt rather than param_cnt and update to NODE ** representation of fparms.

* test/let[1-6].awk, test/let[1-6].ok: New files.

* test/Makefile.am, test/Makefile.in, test/Maketests: Rules for new tests.

* doc/gawktexi.in, doc/gawk.texi: Documented.
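
The let_stack and let_free entries above describe how sibling @let blocks end up sharing frame locations. The following is a minimal, standalone sketch of that recycling discipline, not the code in awkgram.y: struct slot, bind_let, end_scope and MAX_SLOTS are hypothetical names, the frame is a fixed array rather than a reallocated NODE ** vector, and frame_cnt starts at zero here instead of at param_cnt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gawk's NODE; only the fields this sketch needs. */
struct slot {
    int frame_index;     /* position in the function's frame */
    struct slot *next;   /* plays the role of the nxparam link */
};

#define MAX_SLOTS 16     /* assumption: fixed capacity keeps the sketch short */

static struct slot slots[MAX_SLOTS];
static int frame_cnt;            /* frame locations handed out so far */
static struct slot *let_stack;   /* variables currently in scope */
static struct slot *let_free;    /* locations whose scope has ended */

/* Bind one variable: reuse a freed location if any, else grow the frame. */
static struct slot *bind_let(void)
{
    struct slot *s;

    if (let_free != NULL) {
        s = let_free;                    /* pop from the free list */
        let_free = s->next;
    } else {
        assert(frame_cnt < MAX_SLOTS);
        s = &slots[frame_cnt];           /* fresh frame location */
        s->frame_index = frame_cnt++;
    }
    s->next = let_stack;                 /* push onto the scope stack */
    let_stack = s;
    return s;
}

/* Close a scope: pop back to the saved top, recycling each location. */
static void end_scope(struct slot *old_top)
{
    while (let_stack != old_top) {
        struct slot *s = let_stack;
        let_stack = s->next;             /* pop from the scope stack */
        s->next = let_free;              /* push onto the free list */
        let_free = s;
    }
}
```

A caller saves let_stack before binding, binds the variables of one block, and passes the saved top to end_scope when the block is done. A later block then draws from the free list first, so the frame only grows when more variables are live at once than ever before.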
-rw-r--r-- awk.h 30
-rw-r--r-- awkgram.y 284
-rw-r--r-- command.y 2
-rw-r--r-- debug.c 27
-rw-r--r-- doc/gawk.texi 185
-rw-r--r-- doc/gawktexi.in 185
-rw-r--r-- eval.c 19
-rw-r--r-- interpret.h 20
-rw-r--r-- profile.c 14
-rw-r--r-- symbol.c 190
-rw-r--r-- test/Makefile.am 3
-rw-r--r-- test/Makefile.in 3
-rw-r--r-- test/Maketests 30
-rw-r--r-- test/let1.awk 39
-rw-r--r-- test/let1.ok 5
-rw-r--r-- test/let2.awk 3
-rw-r--r-- test/let2.ok 2
-rw-r--r-- test/let3.awk 3
-rw-r--r-- test/let3.ok 2
-rw-r--r-- test/let4.awk 3
-rw-r--r-- test/let4.ok 2
-rw-r--r-- test/let5.awk 3
-rw-r--r-- test/let5.ok 2
-rw-r--r-- test/let6.awk 3
-rw-r--r-- test/let6.ok 2
25 files changed, 952 insertions, 109 deletions
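
The commit message's notes on lookup describe two behaviours: a newer entry shadows an older one of the same name, and an alias entry is never returned itself, only the variable it stands for. The sketch below illustrates both under heavy simplification and is not the symbol.c code: it uses one linked list instead of gawk's separate hash tables, replaces the shadowed entry in place rather than following a dup_ent pointer from a base entry, and every name in it (struct sym, install_sym, remove_sym, lookup_sym) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum kind { KIND_VAR, KIND_ALIAS };

/* Hypothetical, simplified symbol entry; not gawk's NODE. */
struct sym {
    const char *name;
    enum kind kind;
    struct sym *target;    /* KIND_ALIAS only: aliased variable (cf. let_alias) */
    struct sym *shadowed;  /* older entry of the same name, if any */
    struct sym *next;      /* next entry in the table */
};

static struct sym *table;

/* Install an entry; an existing entry of the same name becomes shadowed. */
static void install_sym(struct sym *s)
{
    struct sym **pp;

    for (pp = &table; *pp != NULL; pp = &(*pp)->next) {
        if (strcmp((*pp)->name, s->name) == 0) {
            s->shadowed = *pp;           /* remember what we cover */
            s->next = (*pp)->next;
            *pp = s;                     /* take its place in the list */
            return;
        }
    }
    s->shadowed = NULL;
    s->next = table;
    table = s;
}

/* Remove an entry installed earlier, uncovering whatever it shadowed. */
static void remove_sym(struct sym *s)
{
    struct sym **pp;

    for (pp = &table; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == s) {
            if (s->shadowed != NULL) {
                s->shadowed->next = s->next;
                *pp = s->shadowed;
            } else {
                *pp = s->next;
            }
            return;
        }
    }
}

/* Look a name up; aliases are resolved, so callers never see them. */
static struct sym *lookup_sym(const char *name)
{
    struct sym *s;

    for (s = table; s != NULL; s = s->next) {
        if (strcmp(s->name, name) == 0)
            return s->kind == KIND_ALIAS ? s->target : s;
    }
    return NULL;
}
```

Because resolution happens inside the lookup, code that consumes the result only ever sees ordinary variables, which matches the message's point that run-time code needs no Node_alias handling.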
diff --git a/awk.h b/awk.h
index 732aec04..df00f99f 100644
--- a/awk.h
+++ b/awk.h
@@ -259,6 +259,7 @@ typedef enum nodevals {
Node_var_array, /* array is ptr to elements, table_size num of eles */
Node_var_new, /* newly created variable, may become an array */
Node_param_list, /* lnode is a variable, rnode is more list */
+ Node_alias, /* entry in alias_table */
Node_func, /* lnode is param. list, rnode is body */
Node_ext_func, /* extension function, code_ptr is builtin code */
Node_builtin_func, /* built-in function, main use is for FUNCTAB */
@@ -362,7 +363,10 @@ typedef struct exp_node {
} x;
char *name;
size_t reserved;
- struct exp_node *rn;
+ union {
+ struct exp_node *rn;
+ struct exp_node **rpn;
+ } n;
unsigned long cnt;
enum reflagvals {
CONSTANT = 1,
@@ -480,12 +484,19 @@ typedef struct exp_node {
/* Node_param_list */
#define param vname
#define dup_ent sub.nodep.r.rptr
+#define nxparam sub.nodep.x.extra /* Compile-time linked list of lexicals */
/* Node_param_list, Node_func */
-#define param_cnt sub.nodep.l.ll
+#define param_cnt sub.nodep.l.ll /* Number of locals that are params */
+
+/* Node_alias */
+#define let_alias sub.nodep.l.lptr /* Alias in alias table */
+
+/* Node_func */
+#define frame_cnt sub.nodep.reserved /* No locals allocated at run-time */
/* Node_func */
-#define fparms sub.nodep.rn
+#define fparms sub.nodep.n.rpn
#define code_ptr sub.nodep.r.iptr
/* Node_regex, Node_dynregex */
@@ -533,7 +544,7 @@ typedef struct exp_node {
#define for_list sub.nodep.r.av
#define for_list_size sub.nodep.reflags
#define cur_idx sub.nodep.l.ll
-#define for_array sub.nodep.rn
+#define for_array sub.nodep.n.rn
/* Node_frame: */
#define stack sub.nodep.r.av
@@ -554,7 +565,7 @@ typedef struct exp_node {
#define table_size sub.nodep.reflags
#define array_size sub.nodep.cnt
#define array_capacity sub.nodep.reserved
-#define xarray sub.nodep.rn
+#define xarray sub.nodep.n.rn
#define parent_array sub.nodep.x.extra
#define ainit array_funcs->init
@@ -628,6 +639,7 @@ typedef enum opcodeval {
/* assignments */
Op_assign,
Op_store_var, /* simple variable assignment optimization */
+ Op_clear_var, /* clear simple var to undefined state */
Op_store_sub, /* array[subscript] assignment optimization */
Op_store_field, /* $n assignment optimization */
Op_assign_times,
@@ -1777,9 +1789,13 @@ extern void destroy_symbol(NODE *r);
extern void release_symbols(NODE *symlist, int keep_globals);
extern void append_symbol(NODE *r);
extern NODE *lookup(const char *name);
-extern NODE *make_params(char **pnames, int pcount);
+extern NODE **make_params(char **pnames, int pcount);
+NODE **extend_locals(NODE **parms, const char *pname, int lcount);
extern void install_params(NODE *func);
-extern void remove_params(NODE *func);
+extern void remove_locals(NODE *func);
+extern void install_let(NODE *let, const char *alias);
+extern NODE *install_global_let(const char *alias, NODE *anon_global);
+extern void remove_let(NODE *let);
extern void release_all_vars(void);
extern int foreach_func(NODE **table, int (*)(INSTRUCTION *, void *), void *);
extern INSTRUCTION *bcalloc(OPCODE op, int size, int srcline);
diff --git a/awkgram.y b/awkgram.y
index 08ee381c..44a4a48c 100644
--- a/awkgram.y
+++ b/awkgram.y
@@ -44,7 +44,11 @@ static int yylex(void);
int yyparse(void);
static INSTRUCTION *snode(INSTRUCTION *subn, INSTRUCTION *op);
static char **check_params(char *fname, int pcount, INSTRUCTION *list);
+static void check_param(const char *fname, const char *name, INSTRUCTION *parm);
+static void check_local(const char *fname, const char *name, INSTRUCTION *local);
+static char *gensym(const char *prefix);
static int install_function(char *fname, INSTRUCTION *fi, INSTRUCTION *plist);
+static bool add_let(INSTRUCTION *fi, INSTRUCTION *parm);
static NODE *mk_rexp(INSTRUCTION *exp);
static void param_sanity(INSTRUCTION *arglist);
static int parms_shadow(INSTRUCTION *pc, bool *shadow);
@@ -120,7 +124,10 @@ static enum {
FUNC_BODY,
DONT_CHECK
} want_param_names = DONT_CHECK; /* ditto */
-static bool in_function; /* parsing kludge */
+static INSTRUCTION *in_function; /* parsing kludge */
+static int in_loop; /* parsing kludge */
+static NODE *let_free; /* free list of lexical vars */
+static NODE *let_stack; /* stack of allocated lexicals */
static int rule = 0;
const char *const ruletab[] = {
@@ -205,7 +212,7 @@ extern double fmod(double x, double y);
%token LEX_AND LEX_OR INCREMENT DECREMENT
%token LEX_BUILTIN LEX_LENGTH
%token LEX_EOF
-%token LEX_INCLUDE LEX_EVAL LEX_LOAD LEX_NAMESPACE
+%token LEX_INCLUDE LEX_EVAL LEX_LOAD LEX_NAMESPACE LEX_LET
%token NEWLINE
/* Lowest to highest */
@@ -531,9 +538,11 @@ function_prologue
$1->source_file = source;
$1->comment = func_comment;
+ /* Clear out lexical allocator, just in case */
+ let_stack = let_free = NULL;
if (install_function($2->lextok, $1, $5) < 0)
YYABORT;
- in_function = true;
+ in_function = $1;
$2->lextok = NULL;
bcfree($2);
/* $5 already free'd in install_function */
@@ -782,6 +791,8 @@ statement
INSTRUCTION *ip, *tbreak, *tcont;
+ in_loop--;
+
tbreak = instruction(Op_no_op);
add_lint($3, LINT_assign_in_cond);
tcont = $3->nexti;
@@ -832,6 +843,8 @@ statement
INSTRUCTION *ip, *tbreak, *tcont;
+ in_loop--;
+
tbreak = instruction(Op_no_op);
tcont = $6->nexti;
add_lint($6, LINT_assign_in_cond);
@@ -871,6 +884,8 @@ statement
INSTRUCTION *ip;
char *var_name = $3->lextok;
+ in_loop--;
+
if ($8 != NULL
&& $8->lasti->opcode == Op_K_delete
&& $8->lasti->expr_count == 1
@@ -994,6 +1009,8 @@ regular_loop:
}
| LEX_FOR '(' opt_simple_stmt semi opt_nls exp semi opt_nls opt_simple_stmt r_paren opt_nls statement
{
+ in_loop--;
+
if ($5 != NULL) {
merge_comments($5, NULL);
$1->comment = $5;
@@ -1016,6 +1033,8 @@ regular_loop:
}
| LEX_FOR '(' opt_simple_stmt semi opt_nls semi opt_nls opt_simple_stmt r_paren opt_nls statement
{
+ in_loop--;
+
if ($5 != NULL) {
merge_comments($5, NULL);
$1->comment = $5;
@@ -1035,6 +1054,45 @@ regular_loop:
break_allowed--;
continue_allowed--;
}
+ | '@' LEX_LET '('
+ {
+ /* Trick: remember current let stack top in LEX_LET token,
+ * which is an INSTRUCTION of Op_type_sym. All the lets get
+ * pushed onto this stack. If we know the old top, we can then
+ * tear them down.
+ */
+ $2->memory = let_stack;
+ }
+ let_var_list_opt r_paren opt_nls statement
+ {
+ NODE *old_let_stack = $2->memory;
+
+ if ($7 != NULL) {
+ merge_comments($7, NULL);
+ $2->comment = $7;
+ }
+
+ if ($5 == NULL)
+ $$ = $8;
+ else if ($8 == NULL)
+ $$ = $5;
+ else
+ $$ = list_merge($5, $8);
+
+ /* Let block is processed; remove the variables */
+ while (let_stack != old_let_stack) {
+ NODE *let = let_stack;
+ /* pop from let stack */
+ let_stack = let->nxparam;
+ /* push onto free list */
+ let->nxparam = let_free;
+ let_free = let;
+ /* scrub from symbol table */
+ remove_let(let);
+ }
+
+ yyerrok;
+ }
| non_compound_stmt
{
if (do_pretty_print)
@@ -1535,6 +1593,80 @@ param_list
{ $$ = $1; }
;
+let_var_list_opt
+ : /* empty */
+ { $$ = NULL; }
+ | let_var_list
+ { $$ = $1; }
+ ;
+
+let_var_list
+ : NAME
+ {
+ bool is_reused_location = add_let(in_function, $1);
+
+ /* If we are not in a loop, and the variable is using
+ a fresh location, then we can count on that being
+ clear. Otherwise we have to generate code to clear it */
+ if (!in_loop && !is_reused_location) {
+ $$ = NULL;
+ } else {
+ $1->opcode = Op_clear_var;
+ $1->memory = variable($1->source_line, $1->lextok,
+ Node_var_new);
+ $$ = list_create($1);
+ }
+ }
+ | let_var_list comma NAME
+ {
+ bool is_reused_location = add_let(in_function, $3);
+
+ /* If we are not in a loop, and the variable is using
+ a fresh location, then we can count on that being
+ clear. Otherwise we have to generate code to clear it */
+ if (!in_loop && !is_reused_location) {
+ $$ = $1;
+ } else {
+ $3->opcode = Op_clear_var;
+ $3->memory = variable($3->source_line, $3->lextok,
+ Node_var_new);
+
+ if ($1 == NULL)
+ $$ = list_create($3);
+ else
+ $$ = list_append($1, $3);
+ }
+ }
+ | NAME ASSIGN exp
+ {
+ add_let(in_function, $1);
+ $1->opcode = Op_push;
+ $1->memory = variable($1->source_line, $1->lextok, Node_var_new);
+ $$ = list_append(mk_assignment(list_create($1), $3, $2),
+ instruction(Op_pop));
+
+ }
+ | let_var_list comma NAME ASSIGN exp
+ {
+ INSTRUCTION *assn;
+ add_let(in_function, $3);
+ $3->opcode = Op_push;
+ $3->memory = variable($3->source_line, $3->lextok, Node_var_new);
+ assn = list_append(mk_assignment(list_create($3), $5, $4),
+ instruction(Op_pop));
+ if ($1 == NULL)
+ $$ = assn;
+ else
+ $$ = list_merge($1, assn);
+ }
+ | error
+ { $$ = NULL; }
+ | let_var_list error
+ { $$ = $1; }
+ | let_var_list comma error
+ { $$ = $1; }
+ ;
+
/* optional expression, as in for loop */
opt_exp
: /* empty */
@@ -2325,6 +2457,7 @@ static const struct token tokentab[] = {
#endif
{"isarray", Op_builtin, LEX_BUILTIN, GAWKX|A(1), do_isarray, 0},
{"length", Op_builtin, LEX_LENGTH, A(0)|A(1), do_length, 0},
+{"let", Op_symbol, LEX_LET, GAWKX, 0, 0},
{"load", Op_symbol, LEX_LOAD, GAWKX, 0, 0},
{"log", Op_builtin, LEX_BUILTIN, A(1), do_log, MPF(log)},
{"lshift", Op_builtin, LEX_BUILTIN, GAWKX|A(2), do_lshift, MPF(lshift)},
@@ -4375,6 +4508,7 @@ retry:
switch (class) {
case LEX_EVAL:
case LEX_INCLUDE:
+ case LEX_LET:
case LEX_LOAD:
case LEX_NAMESPACE:
if (lasttok != '@')
@@ -4464,6 +4598,8 @@ retry:
case LEX_FOR:
case LEX_WHILE:
case LEX_DO:
+ in_loop++;
+ /* falltrhough */
case LEX_SWITCH:
if (! do_pretty_print)
return lasttok = class;
@@ -4869,9 +5005,9 @@ snode(INSTRUCTION *subn, INSTRUCTION *r)
static int
parms_shadow(INSTRUCTION *pc, bool *shadow)
{
- int pcount, i;
+ int pcount, lcount, i;
bool ret = false;
- NODE *func, *fp;
+ NODE *func, **fp;
char *fname;
func = pc->func_body;
@@ -4884,8 +5020,9 @@ parms_shadow(INSTRUCTION *pc, bool *shadow)
#endif
pcount = func->param_cnt;
+ lcount = func->frame_cnt;
- if (pcount == 0) /* no args, no problem */
+ if (lcount == 0) /* no locals, no problem */
return 0;
source = pc->source_file;
@@ -4894,11 +5031,12 @@ parms_shadow(INSTRUCTION *pc, bool *shadow)
* Use warning() and not lintwarn() so that can warn
* about all shadowed parameters.
*/
- for (i = 0; i < pcount; i++) {
- if (lookup(fp[i].param) != NULL) {
- warning(
- _("function `%s': parameter `%s' shadows global variable"),
- fname, fp[i].param);
+ for (i = 0; i < lcount; i++) {
+ if (lookup(fp[i]->param) != NULL) {
+ warning((i < pcount)
+ ? _("function `%s': parameter `%s' shadows global variable")
+ : _("function `%s': local `%s' shadows global variable"),
+ fname, fp[i]->param);
ret = true;
}
}
@@ -5046,8 +5184,8 @@ mk_function(INSTRUCTION *fi, INSTRUCTION *def)
/* update lint table info */
func_use(thisfunc->vname, FUNC_DEFINE);
- /* remove params from symbol table */
- remove_params(thisfunc);
+ /* remove params/locals from symbol table */
+ remove_locals(thisfunc);
return fi;
}
@@ -5078,7 +5216,12 @@ install_function(char *fname, INSTRUCTION *fi, INSTRUCTION *plist)
}
fi->func_body = f;
- f->param_cnt = pcount;
+ /*
+ * param_cnt and frame_cnt stay the same if there are no @local
+ * variables. add_let increments frame_cnt, and frame_cnt
+ * is what is allocated when a function is invoked.
+ */
+ f->frame_cnt = f->param_cnt = pcount;
f->code_ptr = fi;
f->fparms = NULL;
if (pcount > 0) {
@@ -5091,6 +5234,59 @@ install_function(char *fname, INSTRUCTION *fi, INSTRUCTION *plist)
return 0;
}
+static bool
+add_let(INSTRUCTION *fi, INSTRUCTION *local)
+{
+ NODE *f = fi != NULL ? fi->func_body : NULL;
+ const char *fname = f != NULL ? f->vname : NULL;
+ const char *name = estrdup(local->lextok, strlen(local->lextok));
+
+ /* Basic checks:*/
+ check_local(fname, name, local);
+
+ /* No duplicate check for lexicals */
+
+ /*
+ * Try to get lexical from the free list.
+ */
+ if (let_free) {
+ /* pop let from stack */
+ NODE *let = let_free;
+ let_free = let_free->nxparam;
+ /* register in param or alias table under the given name */
+ install_let(let, name);
+ /* push onto let stack */
+ let->nxparam = let_stack;
+ let_stack = let;
+ return true; /* Reused frame slot */
+ } else if (f != NULL) { /* allocate new local in function */
+ NODE **parms = f->fparms;
+ int lcount = f->frame_cnt, i;
+ NODE *let;
+
+ /* Reallocate the function's param vector to accommodate
+ * the new one, or allocate if null.
+ */
+ lcount++;
+ f->fparms = extend_locals(parms, name, lcount);
+ f->frame_cnt = lcount;
+
+ let = f->fparms[lcount - 1];
+ let->nxparam = let_stack;
+ let_stack = let;
+
+ return false; /* Fresh, not re-used frame slot */
+ } else { /* allocate new let outside of function as alias for anon global */
+ char *var = gensym("let");
+ NODE *anon_global = variable(local->source_line, var, Node_var_new);
+ NODE *let = install_global_let(name, anon_global);
+
+ let->nxparam = let_stack;
+ let_stack = let;
+
+ return false;
+ }
+}
/* check_params --- build a list of function parameter names after
* making sure that the names are valid and there are no duplicates.
@@ -5113,18 +5309,7 @@ check_params(char *fname, int pcount, INSTRUCTION *list)
name = p->lextok;
p->lextok = NULL;
- if (strcmp(name, fname) == 0) {
- /* check for function foo(foo) { ... }. bleah. */
- error_ln(p->source_line,
- _("function `%s': cannot use function name as parameter name"), fname);
- } else if (is_std_var(name)) {
- error_ln(p->source_line,
- _("function `%s': cannot use special variable `%s' as a function parameter"),
- fname, name);
- } else if (strchr(name, ':') != NULL)
- error_ln(p->source_line,
- _("function `%s': parameter `%s' cannot contain a namespace"),
- fname, name);
+ check_param(fname, name, p);
/* check for duplicate parameters */
for (j = 0; j < i; j++) {
@@ -5143,6 +5328,53 @@ check_params(char *fname, int pcount, INSTRUCTION *list)
return pnames;
}
+/* check_param --- perform basic checks on one parameter.
+ */
+static void
+check_param(const char *fname, const char *name, INSTRUCTION *parm)
+{
+ if (strcmp(name, fname) == 0) {
+ /* check for function foo(foo) { ... }. bleah. */
+ error_ln(parm->source_line,
+ _("function `%s': cannot use function name as parameter name"), fname);
+ } else if (is_std_var(name)) {
+ error_ln(parm->source_line,
+ _("function `%s': cannot use special variable `%s' as a function parameter"),
+ fname, name);
+ } else if (strchr(name, ':') != NULL) {
+ error_ln(parm->source_line,
+ _("function `%s': parameter `%s' cannot contain a namespace"),
+ fname, name);
+ }
+}
+
+/* check_local == like check_param but with wording about locals
+ */
+static void
+check_local(const char *fname, const char *name, INSTRUCTION *local)
+{
+ if (fname && strcmp(name, fname) == 0) {
+ /* check for function foo(foo) { ... }. bleah. */
+ error_ln(local->source_line,
+ _("function `%s': cannot use function name as local variable name"), fname);
+ } else if (is_std_var(name)) {
+ error_ln(local->source_line,
+ _("cannot use special variable `%s' as a local variable"), name);
+ } else if (strchr(name, ':') != NULL) {
+ error_ln(local->source_line,
+ _("local variable `%s' cannot contain a namespace"), name);
+ }
+}
+
+static char *
+gensym(const char *prefix)
+{
+ char buf[64];
+ static unsigned int gensym_counter;
+
+ size_t len = snprintf(buf, sizeof buf, "$%s%04d", prefix, ++gensym_counter);
+ return estrdup(buf, len);
+}
#ifdef HASHSIZE
undef HASHSIZE
diff --git a/command.y b/command.y
index 18980d38..b43f54bb 100644
--- a/command.y
+++ b/command.y
@@ -1697,7 +1697,7 @@ variable_generator(const char *text, int state)
idx = 0;
break;
}
- name = func->fparms[idx++].param;
+ name = func->fparms[idx++]->param;
if (strncmp(name, text, textlen) == 0)
return estrdup(name, strlen(name));
}
diff --git a/debug.c b/debug.c
index 2849a4c1..b343e427 100644
--- a/debug.c
+++ b/debug.c
@@ -841,7 +841,7 @@ do_info(CMDARG *arg, int cmd ATTRIBUTE_UNUSED)
return false;
}
- pcount = func->param_cnt; /* # of defined params */
+ pcount = func->frame_cnt; /* # of defined params/locals */
pc = (INSTRUCTION *) f->reti; /* Op_func_call instruction */
arg_count = (pc + 1)->expr_count; /* # of arguments supplied */
@@ -861,7 +861,7 @@ do_info(CMDARG *arg, int cmd ATTRIBUTE_UNUSED)
r = f->stack[i];
if (r->type == Node_array_ref)
r = r->orig_array;
- fprintf(out_fp, "%s = ", func->fparms[i].param);
+ fprintf(out_fp, "%s = ", func->fparms[i]->param);
print_symbol(r, true);
}
if (to < from)
@@ -998,7 +998,7 @@ find_frame(long num)
return fcall_list[num];
}
-/* find_param --- find a function parameter in a given frame number */
+/* find_param --- find a function parameter/local in a given frame number */
static NODE *
find_param(const char *name, long num, char **pname)
@@ -1018,9 +1018,9 @@ find_param(const char *name, long num, char **pname)
int i, pcount;
func = f->func_node;
- pcount = func->param_cnt;
+ pcount = func->frame_cnt;
for (i = 0; i < pcount; i++) {
- fparam = func->fparms[i].param;
+ fparam = func->fparms[i]->param;
if (strcmp(name, fparam) == 0) {
r = f->stack[i];
if (r->type == Node_array_ref)
@@ -1917,7 +1917,7 @@ print_function(INSTRUCTION *pc, void *x)
print_func(fp, "%s(", func->vname);
for (i = 0; i < pcount; i++) {
- print_func(fp, "%s", func->fparms[i].param);
+ print_func(fp, "%s", func->fparms[i]->param);
if (i < pcount - 1)
print_func(fp, ", ");
}
@@ -3744,7 +3744,7 @@ print_memory(NODE *m, NODE *func, Func_print print_func, FILE *fp)
case Node_param_list:
assert(func != NULL);
- print_func(fp, "%s", func->fparms[m->param_cnt].param);
+ print_func(fp, "%s", func->fparms[m->param_cnt]->param);
break;
case Node_var:
@@ -3781,7 +3781,7 @@ print_instruction(INSTRUCTION *pc, Func_print print_func, FILE *fp, int in_dump)
int j;
print_func(fp, "\n\t# Function: %s (", func->vname);
for (j = 0; j < pcount; j++) {
- print_func(fp, "%s", func->fparms[j].param);
+ print_func(fp, "%s", func->fparms[j]->param);
if (j < pcount - 1)
print_func(fp, ", ");
}
@@ -3990,7 +3990,7 @@ print_instruction(INSTRUCTION *pc, Func_print print_func, FILE *fp, int in_dump)
case Op_arrayfor_incr:
print_func(fp, "[array_var = %s] [target_jmp = " PTRFMT "]\n",
pc->array_var->type == Node_param_list ?
- func->fparms[pc->array_var->param_cnt].param : pc->array_var->vname,
+ func->fparms[pc->array_var->param_cnt]->param : pc->array_var->vname,
pc->target_jmp);
break;
@@ -4132,6 +4132,7 @@ print_instruction(INSTRUCTION *pc, Func_print print_func, FILE *fp, int in_dump)
case Op_quotient_i:
case Op_mod_i:
case Op_assign_concat:
+ case Op_clear_var:
print_memory(pc->memory, func, print_func, fp);
/* fall through */
default:
@@ -5614,7 +5615,7 @@ do_eval(CMDARG *arg, int cmd ATTRIBUTE_UNUSED)
do_flags &= DO_MPFR; // preserve this flag only
ret = parse_program(&code, true);
do_flags = save_flags;
- remove_params(this_func);
+ remove_locals(this_func);
if (ret != 0) {
pop_context(); /* switch to prev context */
free_context(ctxt, false /* keep_globals */);
@@ -5650,7 +5651,7 @@ do_eval(CMDARG *arg, int cmd ATTRIBUTE_UNUSED)
t->opcode = Op_stop;
/* add or append eval locals to the current frame stack */
- ecount = f->param_cnt; /* eval local count */
+ ecount = f->frame_cnt; /* eval local count */
pcount = this_func->param_cnt;
if (ecount > 0) {
@@ -5663,7 +5664,7 @@ do_eval(CMDARG *arg, int cmd ATTRIBUTE_UNUSED)
for (i = 0; i < ecount; i++) {
NODE *np;
- np = f->fparms + i;
+ np = f->fparms[i];
np->param_cnt += pcount; /* appending eval locals: fixup param_cnt */
getnode(r);
@@ -5806,7 +5807,7 @@ parse_condition(int type, int num, char *expr)
do_flags = false;
ret = parse_program(&code, true);
do_flags = save_flags;
- remove_params(this_func);
+ remove_locals(this_func);
pop_context();
if (ret != 0 || invalid_symbol) {
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 68b52536..dea485c6 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -783,6 +783,7 @@ particular records in a file and perform operations upon them.
* Function Calling:: Calling user-defined functions.
* Calling A Function:: Don't use spaces.
* Variable Scope:: Controlling variable scope.
+* Local Variables:: Enhanced Awk (@command{egawk}) local variables.
* Pass By Value/Reference:: Passing parameters.
* Function Caveats:: Other points to know about functions.
* Return Statement:: Specifying the value a function
@@ -21464,6 +21465,7 @@ the function.
@menu
* Calling A Function:: Don't use spaces.
* Variable Scope:: Controlling variable scope.
+* Local Variables:: Enhanced Awk (@command{egawk}) local variables.
* Pass By Value/Reference:: Passing parameters.
* Function Caveats:: Other points to know about functions.
@end menu
@@ -21502,8 +21504,11 @@ there is no way to make a variable local to a @code{@{} @dots{} @code{@}} block
good practice to do so whenever a variable is needed only in that
function.
-To make a variable local to a function, simply declare the variable as
-an argument after the actual function arguments
+Enhanced GNU Awk (@command{egawk}) has language extensions in this area,
+described in @ref{Local Variables}.
+
+In standard @command{awk}, to make a variable local to a function, simply declare the
+variable as an argument after the actual function arguments
(@pxref{Definition Syntax}).
Look at the following example, where variable
@code{i} is a global variable used by both functions @code{foo()} and
@@ -21628,6 +21633,182 @@ At level 2, index 1 is not found in a
At level 2, index 2 is found in a
@end example
+@node Local Variables
+@subsubsection @command{egawk} Local Variable Extension
+@cindex @code{@@let} statement
+This @value{SECTION} describes an extension specific to a custom
+version of GNU Awk called Enhanced GNU Awk, which is installed
+under the command name @command{egawk}.
+
+As documented in @ref{Variable Scope}, function-wide local variables
+are defined as function parameters in standard @command{awk}. The
+language does not distinguish parameters used as local variables
+from true parameters that receive arguments. This is only a programmer
+convention, which is enforced by discipline and the use of traditional
+annotation devices, such as visually separating the parameters intended
+for argument passing from the parameters intended to serve as local
+variables.
+
+@command{egawk} provides a language extension in this area, allowing
+the programmer to specify block-scoped local variables which do
+not appear in the parameter list and cannot receive arguments.
+
+The extension takes the form of the @code{@@let}
+statement.
+
+The @code{@@let} statement is introduced by the @code{@@} symbol
+followed by the special keyword @code{let}. These tokens are
+then followed by a comma-separated list of variable declarators,
+enclosed in parentheses. After the parentheses comes a required statement.
+The list of variables may be empty.
+
+The statement is executed in a scope in which the specified variables are
+visible, in addition to any other variables that were previously visible that
+do not have the same names. When the statement terminates, the variables
+specified in that statement disappear.
+
+Declarators consist of variable names, optionally initialized by expressions.
+The initializing expressions are indicated by the @code{=} sign:
+
+@example
+function fun(x)
+{
+ ...
+ @let (a, b = 3, ir2 = 0.707) {
+ ...
+ }
+ ...
+}
+@end example
+
+Local variables introduced by @code{@@let} may have the same names as global
+variables, or, in a function, the parameter names of the enclosing function.
+In this situation, over their scope, the @code{@@let} variables are visible,
+hiding the same-named parameters or variables. This is called @emph{shadowing}.
+
+Shadowing also takes place among same-named @code{@@let} variables,
+which occurs when a variable name is repeated in the same @code{@@let}
+construct, or in two different @code{@@let} constructs which are nested.
+
+A @code{@@let} variable may not have the same name as the enclosing
+function, or the same name as an Awk special variable such as @code{NF}.
+A name with a namespace prefix such as @code{awk::score} also may not be used
+as a local variable.
+
+The @code{@@let} construct may be used outside or inside of a function.
+The semantics is identical, but the implementation is different.
+Inside a function, the construct allocates variables from the local variable
+frame of the function invocation. Outside of a function, it allocates
+anonymous variables from the global namespace. These hidden variables
+can be seen in the output of the @code{-d} option, having numbered names
+which look like @code{$let0001}. This is an implementation detail that may
+change in the future.
+
+A local variable that has no initializing expression has the empty numeric
+string value, just like a regular Awk variable that has not been assigned: it
+compares equal to the empty string as well as to zero.
+
+In the following example, the function's first reference to @code{accum} is a
+reference to the global variable. The second reference is local.
+
+@example
+function fun()
+{
+ accum = 42
+ @let (accum) {
+ print "fun: accum = ", accum
+ accum = 43
+ }
+}
+
+BEGIN { fun(); print "BEGIN: accum = ", accum }
+@end example
+
+The output is:
+
+@example
+fun: accum =
+BEGIN: accum = 42
+@end example
+
+Within the @code{@@let} statement inside the function, @code{accum} no longer
+appears to have a defined value, even though @code{accum} was just assigned the
+value 42. This is because @code{@@let} has introduced a local variable
+unrelated to any global variable, and that variable is not initialized.
+
+The @code{print} statement in the @code{BEGIN} block confirms that
+assigning the value 43 to the local @code{accum} had no effect on the global
+@code{accum}.
+
+The scope of a local variable begins from its point of declaration, just
+after the initializing expression, if any. The initializing expression
+is evaluated in a scope in which the variable is not yet visible.
+
+@example
+function helper()
+{
+ print "helper: level =", level
+}
+
+function main()
+{
+ @let (level = level + 1) {
+ print "main: level =", level
+ helper()
+ }
+}
+
+BEGIN {
+ level = 0
+ main()
+}
+@end example
+
+The output is:
+
+@example
+main: level = 1
+helper: level = 0
+@end example
+
+In this example, the function @code{main} locally shadows the global
+variable @code{level}, giving the local @code{level} a value which is one greater
+than the global @code{level}.
+
+This local variable is lexically scoped; when @code{main} invokes
+@code{helper}, it is evident that @code{helper} is, again, referring to the
+global @code{level} variable; the @code{helper} function has no visibility
+into the scope of the caller, @code{main}.
+
+Because a local variable's scope begins immediately after its declaration,
+within a single @code{@@let} statement, the initializing expressions of
+later variables are evaluated in the scope of earlier variables. Furthermore,
+later variables may repeat the names of earlier variables. These later
+variables are new variables which shadow the earlier ones.
+
+The following statement makes sense:
+
+@example
+BEGIN {
+ @let (x = 0, x = x + 1, x = x + 1, x = x + 1)
+ print x
+}
+@end example
+
+Output:
+
+@example
+3
+@end example
+
+While the variable initializations may resemble the steps of an
+imperative program which assigns four successive values to a single
+accumulator, that is not the case; four different variables named
+@code{x} are defined here, each one shadowing the preceding one.
+The @code{print} statement is then executed in the scope of the
+rightmost @code{x}. The initializing expressions @code{x + 1}
+have the previous @code{x} still in scope.
+
@node Pass By Value/Reference
@subsubsection Passing Function Arguments by Value Or by Reference
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index bfefda24..651bd8d2 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -778,6 +778,7 @@ particular records in a file and perform operations upon them.
* Function Calling:: Calling user-defined functions.
* Calling A Function:: Don't use spaces.
* Variable Scope:: Controlling variable scope.
+* Local Variables:: Enhanced Awk (@command{egawk}) local variables.
* Pass By Value/Reference:: Passing parameters.
* Function Caveats:: Other points to know about functions.
* Return Statement:: Specifying the value a function
@@ -20376,6 +20377,7 @@ the function.
@menu
* Calling A Function:: Don't use spaces.
* Variable Scope:: Controlling variable scope.
+* Local Variables:: Enhanced Awk (@command{egawk}) local variables.
* Pass By Value/Reference:: Passing parameters.
* Function Caveats:: Other points to know about functions.
@end menu
@@ -20414,8 +20416,11 @@ there is no way to make a variable local to a @code{@{} @dots{} @code{@}} block
good practice to do so whenever a variable is needed only in that
function.
-To make a variable local to a function, simply declare the variable as
-an argument after the actual function arguments
+Enhanced GNU Awk (@command{egawk}) has language extensions in this area,
+described in @ref{Local Variables}.
+
+In standard @command{awk}, to make a variable local to a function, simply declare the
+variable as an argument after the actual function arguments
(@pxref{Definition Syntax}).
Look at the following example, where variable
@code{i} is a global variable used by both functions @code{foo()} and
@@ -20540,6 +20545,182 @@ At level 2, index 1 is not found in a
At level 2, index 2 is found in a
@end example
+@node Local Variables
+@subsubsection @command{egawk} Local Variable Extension
+@cindex @code{@@let} statement
+This @value{SECTION} describes an extension specific to a custom
+version of GNU Awk called Enhanced GNU Awk, which is installed
+under the command name @command{egawk}.
+
+As documented in @ref{Variable Scope}, function-wide local variables
+are defined as function parameters in standard @command{awk}. The
+language does not distinguish parameters used as local variables
+from true parameters that receive arguments. This is only a programmer
+convention, which is enforced by discipline and the use of traditional
+annotation devices, such as visually separating the parameters intended
+for argument passing from the parameters intended to serve as local
+variables.
+
+@command{egawk} provides a language extension in this area, allowing
+the programmer to specify block-scoped local variables which do
+not appear in the parameter list and cannot receive arguments.
+
+The extension takes the form of the @code{@@let}
+statement.
+
+The @code{@@let} statement is introduced by the @code{@@} symbol
+followed by the special keyword @code{let}. These tokens are
+then followed by a comma-separated list of variable declarators,
+enclosed in parentheses. After the parentheses comes a required statement.
+The list of variables may be empty.
+
+The statement is executed in a scope in which the specified variables are
+visible, in addition to any other variables that were previously visible that
+do not have the same names. When the statement terminates, the variables
+specified in that statement disappear.
+
+Declarators consist of variable names, optionally initialized by expressions.
+The initializing expressions are indicated by the @code{=} sign:
+
+@example
+function fun(x)
+{
+ ...
+ @let (a, b = 3, ir2 = 0.707) {
+ ...
+ }
+ ...
+}
+@end example
+
+Local variables introduced by @code{@@let} may have the same names as global
+variables, or, in a function, the parameter names of the enclosing function.
+In this situation, over their scope, the @code{@@let} variables are visible,
+hiding the same-named parameters or variables. This is called @emph{shadowing}.
+
+Shadowing also takes place among same-named @code{@@let} variables,
+which occurs when a variable name is repeated in the same @code{@@let}
+construct, or in two different @code{@@let} constructs which are nested.
+
+A @code{@@let} variable may not have the same name as the enclosing
+function, or the same name as an Awk special variable such as @code{NF}.
+A name with a namespace prefix such as @code{awk::score} also may not be used
+as a local variable.
+
+The @code{@@let} construct may be used outside or inside of a function.
+The semantics is identical, but the implementation is different.
+Inside a function, the construct allocates variables from the local variable
+frame of the function invocation. Outside of a function, it allocates
+anonymous variables from the global namespace. These hidden variables
+can be seen in the output of the @code{-d} option, having numbered names
+which look like @code{$let0001}. This is an implementation detail that may
+change in the future.
+
+A local variable that has no initializing expression has the empty numeric
+string value, just like a regular Awk variable that has not been assigned: it
+compares equal to the empty string as well as to zero.
+
+In the following example, the function's first reference to @code{accum} is a
+reference to the global variable. The second reference is local.
+
+@example
+function fun()
+{
+ accum = 42
+ @let (accum) {
+ print "fun: accum = ", accum
+ accum = 43
+ }
+}
+
+BEGIN { fun(); print "BEGIN: accum = ", accum }
+@end example
+
+The output is:
+
+@example
+fun: accum =
+BEGIN: accum = 42
+@end example
+
+Within the @code{@@let} statement inside the function, @code{accum} no longer
+appears to have a defined value, even though @code{accum} was just assigned the
+value 42. This is because @code{@@let} has introduced a local variable
+unrelated to any global variable, and that variable is not initialized.
+
+The @code{print} statement in the @code{BEGIN} block confirms that
+assigning the value 43 to the local @code{accum} had no effect on the global
+@code{accum}.
+
+The scope of a local variable begins from its point of declaration, just
+after the initializing expression, if any. The initializing expression
+is evaluated in a scope in which the variable is not yet visible.
+
+@example
+function helper()
+{
+ print "helper: level =", level
+}
+
+function main()
+{
+ @let (level = level + 1) {
+ print "main: level =", level
+ helper()
+ }
+}
+
+BEGIN {
+ level = 0
+ main()
+}
+@end example
+
+The output is:
+
+@example
+main: level = 1
+helper: level = 0
+@end example
+
+In this example, the function @code{main} locally shadows the global
+variable @code{level}, giving the local @code{level} a value which is one greater
+than the global @code{level}.
+
+This local variable is lexically scoped; when @code{main} invokes
+@code{helper}, it is evident that @code{helper} is, again, referring to the
+global @code{level} variable; the @code{helper} function has no visibility
+into the scope of the caller, @code{main}.
+
+Because a local variable's scope begins immediately after its declaration,
+within a single @code{@@let} statement, the initializing expressions of
+later variables are evaluated in the scope of earlier variables. Furthermore,
+later variables may repeat the names of earlier variables. These later
+variables are new variables which shadow the earlier ones.
+
+The following statement makes sense:
+
+@example
+BEGIN {
+ @let (x = 0, x = x + 1, x = x + 1, x = x + 1)
+ print x
+}
+@end example
+
+Output:
+
+@example
+3
+@end example
+
+While the variable initializations may resemble the steps of an
+imperative program which assigns four successive values to a single
+accumulator, that is not the case; four different variables named
+@code{x} are defined here, each one shadowing the preceding one.
+The @code{print} statement is then executed in the scope of the
+rightmost @code{x}. The initializing expressions @code{x + 1}
+have the previous @code{x} still in scope.
+
@node Pass By Value/Reference
@subsubsection Passing Function Arguments by Value Or by Reference
diff --git a/eval.c b/eval.c
index c6f8bcb9..7d8e2847 100644
--- a/eval.c
+++ b/eval.c
@@ -239,6 +239,7 @@ static const char *const nodetypes[] = {
"Node_var_array",
"Node_var_new",
"Node_param_list",
+ "Node_alias",
"Node_func",
"Node_ext_func",
"Node_builtin_func",
@@ -291,6 +292,7 @@ static struct optypetab {
{ "Op_not", "! " },
{ "Op_assign", " = " },
{ "Op_store_var", " = " },
+ { "Op_clear_var", " = <undefined>" },
{ "Op_store_sub", " = " },
{ "Op_store_field", " = " },
{ "Op_assign_times", " *= " },
@@ -1262,17 +1264,18 @@ static INSTRUCTION *
setup_frame(INSTRUCTION *pc)
{
NODE *r = NULL;
- NODE *m, *f, *fp;
+ NODE *m, *f, **fp;
NODE **sp = NULL;
- int pcount, arg_count, i, j;
+ int pcount, lcount, arg_count, i, j;
f = pc->func_body;
pcount = f->param_cnt;
+ lcount = f->frame_cnt;
fp = f->fparms;
arg_count = (pc + 1)->expr_count;
- if (pcount > 0) {
- ezalloc(sp, NODE **, pcount * sizeof(NODE *), "setup_frame");
+ if (lcount > 0) {
+ ezalloc(sp, NODE **, lcount * sizeof(NODE *), "setup_frame");
}
/* check for extra args */
@@ -1287,7 +1290,7 @@ setup_frame(INSTRUCTION *pc)
} while (--arg_count > pcount);
}
- for (i = 0, j = arg_count - 1; i < pcount; i++, j--) {
+ for (i = 0, j = arg_count - 1; i < lcount; i++, j--) {
getnode(r);
memset(r, 0, sizeof(NODE));
sp[i] = r;
@@ -1295,7 +1298,7 @@ setup_frame(INSTRUCTION *pc)
if (i >= arg_count) {
/* local variable */
r->type = Node_var_new;
- r->vname = fp[i].param;
+ r->vname = fp[i]->param;
continue;
}
@@ -1347,7 +1350,7 @@ setup_frame(INSTRUCTION *pc)
default:
cant_happen("unexpected parameter type %s", nodetype2str(m->type));
}
- r->vname = fp[i].param;
+ r->vname = fp[i]->param;
}
stack_adj(-arg_count); /* adjust stack pointer */
@@ -1390,7 +1393,7 @@ restore_frame(NODE *fp)
INSTRUCTION *ri;
func = frame_ptr->func_node;
- n = func->param_cnt;
+ n = func->frame_cnt;
sp = frame_ptr->stack;
for (; n > 0; n--) {
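
Note on the eval.c hunks above: the frame is now sized by frame_cnt rather than param_cnt, so slots past the declared parameters exist for @let locals and start out unset. The following is a minimal standalone sketch of that layout, not the gawk code itself. The names slot_kind, slot and build_frame are hypothetical, the values are plain ints rather than NODEs, and arguments are copied in order rather than popped from the evaluation stack as setup_frame does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gawk's node types; not the real NODE. */
enum slot_kind { SLOT_ARG, SLOT_UNSET };

struct slot {
    enum slot_kind kind;
    int value;              /* meaningful only for SLOT_ARG */
};

/*
 * Build a frame of frame_cnt slots. The first arg_count slots receive
 * the caller's arguments; every remaining slot (unfilled parameters
 * and block-scoped locals) starts out unset.
 * Returns NULL if frame_cnt is 0 or allocation fails.
 */
struct slot *build_frame(const int *args, int arg_count, int frame_cnt)
{
    struct slot *frame;
    int i;

    assert(arg_count >= 0 && arg_count <= frame_cnt);

    if (frame_cnt <= 0)
        return NULL;

    frame = calloc((size_t) frame_cnt, sizeof *frame);
    if (frame == NULL)
        return NULL;

    for (i = 0; i < frame_cnt; i++) {
        if (i < arg_count) {
            frame[i].kind = SLOT_ARG;
            frame[i].value = args[i];
        } else {
            frame[i].kind = SLOT_UNSET;
        }
    }
    return frame;
}
```

In this model a function with one parameter and two locals would be built with frame_cnt == 3, which mirrors the commit's statement that frame_cnt exceeds param_cnt only when new-style locals are present.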
diff --git a/interpret.h b/interpret.h
index ca67e966..ef087951 100644
--- a/interpret.h
+++ b/interpret.h
@@ -742,6 +742,26 @@ mod:
}
break;
+ case Op_clear_var:
+ /*
+ * Clear variable to the undefined value
+ * that is equal to 0 and "" represented by
+ * the global Nnull_string.
+ * This is used by @let to initialize
+ * locals, which may re-use previously
+ * initialized frame locations.
+ */
+ lhs = get_lhs(pc->memory, false);
+
+ /*
+ * If it's already clear, nothing to do
+ */
+ if (*lhs != Nnull_string) {
+ unref(*lhs);
+ *lhs = dupnode(Nnull_string);
+ }
+ break;
+
case Op_store_field:
{
/* field assignment optimization,
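
Note on Op_clear_var above: the opcode resets a variable to the shared undefined value, and is needed because a @let local may reuse a frame slot that an earlier local already initialized. The sketch below models that with a toy reference-counted value. It assumes nothing about gawk's NODE beyond what the hunk shows; value, null_value, value_new, value_dup, value_unref and clear_slot are all hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted value; gawk's NODE is far richer. */
struct value {
    int refs;
    int num;
};

/* Shared "undefined" singleton, playing the role of Nnull_string. */
struct value null_value = { 1, 0 };

struct value *value_new(int num)
{
    struct value *v = malloc(sizeof *v);

    if (v == NULL)
        abort();
    v->refs = 1;
    v->num = num;
    return v;
}

/* Take another reference to v. */
struct value *value_dup(struct value *v)
{
    v->refs++;
    return v;
}

/* Drop one reference; free heap values when the count reaches zero. */
void value_unref(struct value *v)
{
    assert(v->refs > 0);
    if (--v->refs == 0 && v != &null_value)
        free(v);
}

/* Reset a slot to the shared undefined value, releasing the old one. */
void clear_slot(struct value **slot)
{
    /* If it is already clear, there is nothing to do. */
    if (*slot != &null_value) {
        value_unref(*slot);
        *slot = value_dup(&null_value);
    }
}
```

Clearing an already clear slot is a no-op, so the singleton's reference count does not grow each time the same @let is entered.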
diff --git a/profile.c b/profile.c
index 15b33721..22ec4554 100644
--- a/profile.c
+++ b/profile.c
@@ -59,7 +59,7 @@ static void just_dump(int signum);
/* pretty printing related functions and variables */
static NODE *pp_stack = NULL;
-static NODE *func_params; /* function parameters */
+static NODE **func_params; /* function parameters */
static FILE *prof_fp; /* where to send the profile */
static long indent_level = 0;
@@ -345,6 +345,7 @@ pprint(INSTRUCTION *startp, INSTRUCTION *endp, int flags)
if (pc->initval != NULL)
pp_push(Op_push_i, pp_node(pc->initval), CAN_FREE, pc->comment);
/* fall through */
+ case Op_clear_var:
case Op_store_sub:
case Op_assign_concat:
case Op_push_lhs:
@@ -356,7 +357,7 @@ pprint(INSTRUCTION *startp, INSTRUCTION *endp, int flags)
m = pc->memory;
switch (m->type) {
case Node_param_list:
- pp_push(pc->opcode, func_params[m->param_cnt].param, DONT_FREE, pc->comment);
+ pp_push(pc->opcode, func_params[m->param_cnt]->param, DONT_FREE, pc->comment);
break;
case Node_var:
@@ -383,6 +384,11 @@ pprint(INSTRUCTION *startp, INSTRUCTION *endp, int flags)
fprintf(prof_fp, "%s%s%s", t2->pp_str, op2str(pc->opcode), t1->pp_str);
goto cleanup;
+ case Op_clear_var:
+ t2 = pp_pop(); /* l.h.s. */
+ fprintf(prof_fp, "%s%s", t2->pp_str, op2str(pc->opcode));
+ goto cleanup;
+
case Op_store_sub:
t1 = pp_pop(); /* array */
tmp = pp_list(pc->expr_count, op2str(Op_subscript), ", "); /*subscript*/
@@ -973,7 +979,7 @@ cleanup:
array = t1->pp_str;
m = ip1->forloop_cond->array_var;
if (m->type == Node_param_list)
- item = func_params[m->param_cnt].param;
+ item = func_params[m->param_cnt]->param;
else
item = m->vname;
indent(ip1->forloop_body->exec_count);
@@ -2000,7 +2006,7 @@ pp_func(INSTRUCTION *pc, void *data ATTRIBUTE_UNUSED)
pcount = func->param_cnt;
func_params = func->fparms;
for (j = 0; j < pcount; j++) {
- fprintf(prof_fp, "%s", func_params[j].param);
+ fprintf(prof_fp, "%s", func_params[j]->param);
if (j < pcount - 1)
fprintf(prof_fp, ", ");
}
diff --git a/symbol.c b/symbol.c
index 78b29bba..69f0a664 100644
--- a/symbol.c
+++ b/symbol.c
@@ -44,9 +44,21 @@ static NODE *get_name_from_awk_ns(const char *name);
static AWK_CONTEXT *curr_ctxt = NULL;
static int ctxt_level;
-static NODE *global_table, *param_table;
+static NODE *global_table, *param_table, *alias_table;
NODE *symbol_table, *func_table;
+/* Table search sequence used by lookup */
+enum {
+ TAB_PARAM,
+ TAB_ALIAS,
+ TAB_GLOBAL,
+ TAB_FUNC,
+ TAB_SYM,
+ TAB_COUNT
+};
+
+static NODE *tables[TAB_COUNT];
+
/* Use a flag to avoid a strcmp() call inside install() */
static bool installing_specials = false;
@@ -63,11 +75,22 @@ init_symbol_table()
memset(param_table, '\0', sizeof(NODE));
null_array(param_table);
+ getnode(alias_table);
+ memset(alias_table, '\0', sizeof(NODE));
+ null_array(alias_table);
+
installing_specials = true;
func_table = install_symbol(estrdup("FUNCTAB", 7), Node_var_array);
symbol_table = install_symbol(estrdup("SYMTAB", 6), Node_var_array);
installing_specials = false;
+
+ /* ``It's turtles, all the way down.'' */
+ tables[TAB_PARAM] = param_table; /* parameters shadow everything else */
+ tables[TAB_ALIAS] = alias_table; /* symbol macros for anonymous globals */
+ tables[TAB_GLOBAL] = global_table; /* SYMTAB and FUNCTAB found first, can't be redefined */
+ tables[TAB_FUNC] = func_table; /* then functions */
+ tables[TAB_SYM] = symbol_table; /* then globals */
}
/*
@@ -93,24 +116,16 @@ lookup(const char *name)
{
NODE *n;
NODE *tmp;
- NODE *tables[5]; /* manual init below, for z/OS */
int i;
- /* ``It's turtles, all the way down.'' */
- tables[0] = param_table; /* parameters shadow everything */
- tables[1] = global_table; /* SYMTAB and FUNCTAB found first, can't be redefined */
- tables[2] = func_table; /* then functions */
- tables[3] = symbol_table; /* then globals */
- tables[4] = NULL;
-
tmp = get_name_from_awk_ns(name);
n = NULL;
- for (i = 0; tables[i] != NULL; i++) {
+ for (i = 0; i < TAB_COUNT; i++) {
if (assoc_empty(tables[i]))
continue;
- if ((do_posix || do_traditional) && tables[i] == global_table)
+ if ((do_posix || do_traditional) && i == TAB_GLOBAL)
continue;
n = in_array(tables[i], tmp);
@@ -121,28 +136,71 @@ lookup(const char *name)
unref(tmp);
if (n == NULL || n->type == Node_val) /* non-variable in SYMTAB */
return NULL;
+ /* If the name is found in the param or alias table and has multiple
+ * entries, then select the most recently pushed one in place of n.
+ */
+ if ((i == TAB_PARAM || i == TAB_ALIAS) && n->dup_ent)
+ n = n->dup_ent;
+
+ /* Alias resolution */
+ if (i == TAB_ALIAS)
+ n = n->let_alias;
+
+ /* Simple entry: return the entry itself.
+ */
return n; /* new place */
}
/* make_params --- allocate function parameters for the symbol table */
-NODE *
+NODE **
make_params(char **pnames, int pcount)
{
- NODE *p, *parms;
+ NODE **pp, **parms;
int i;
if (pcount <= 0 || pnames == NULL)
return NULL;
- ezalloc(parms, NODE *, pcount * sizeof(NODE), "make_params");
+ emalloc(parms, NODE **, pcount * sizeof(NODE *), "make_params");
- for (i = 0, p = parms; i < pcount; i++, p++) {
+ for (i = 0, pp = parms; i < pcount; i++, pp++) {
+ NODE *p;
+ getnode(p);
+ memset(p, 0, sizeof *p);
p->type = Node_param_list;
p->param = pnames[i]; /* shadows pname and vname */
p->param_cnt = i;
+ *pp = p;
+ }
+
+ return parms;
+}
+
+/* extend_locals --- add a parameter to an existing param vector */
+
+NODE **extend_locals(NODE **parms, const char *pname, int lcount)
+{
+ NODE *p;
+
+ if (parms == NULL) {
+ emalloc(parms, NODE **, lcount * sizeof(NODE *), "extend_locals");
+ } else {
+ erealloc(parms, NODE **, lcount * sizeof(NODE *), "extend_locals");
}
+ getnode(p);
+ memset(p, 0, sizeof *p);
+
+ p->type = Node_param_list;
+ p->param = (char *) pname;
+ p->param_cnt = lcount - 1;
+
+ parms[lcount - 1] = p;
+
+ /* Unlike make_params, this also installs. */
+ install(pname, p, Node_param_list);
+
return parms;
}
@@ -152,7 +210,7 @@ void
install_params(NODE *func)
{
int i, pcount;
- NODE *parms;
+ NODE **parms;
if (func == NULL)
return;
@@ -164,7 +222,7 @@ install_params(NODE *func)
return;
for (i = 0; i < pcount; i++)
- (void) install(parms[i].param, parms + i, Node_param_list);
+ (void) install(parms[i]->param, parms[i], Node_param_list);
}
@@ -172,10 +230,35 @@ install_params(NODE *func)
* remove_params --- remove function parameters out of the symbol table.
*/
+static void
+remove_common(NODE *p)
+{
+ NODE *tmp;
+ NODE *tmp2;
+ NODE *table;
+
+ assert(p->type == Node_param_list || p->type == Node_alias);
+
+ if (p->type == Node_param_list)
+ table = param_table;
+ else
+ table = alias_table;
+
+ tmp = make_string(p->vname, strlen(p->vname));
+ tmp2 = in_array(table, tmp);
+
+ if (tmp2 != NULL && tmp2->dup_ent != NULL)
+ tmp2->dup_ent = tmp2->dup_ent->dup_ent;
+ else
+ (void) assoc_remove(table, tmp);
+
+ unref(tmp);
+}
+
void
-remove_params(NODE *func)
+remove_locals(NODE *func)
{
- NODE *parms, *p;
+ NODE **parms, *p;
int i, pcount;
if (func == NULL)
@@ -183,29 +266,46 @@ remove_params(NODE *func)
assert(func->type == Node_func);
- if ( (pcount = func->param_cnt) <= 0
+ if ( (pcount = func->frame_cnt) <= 0
|| (parms = func->fparms) == NULL)
return;
- for (i = pcount - 1; i >= 0; i--) {
- NODE *tmp;
- NODE *tmp2;
+ for (i = pcount - 1; i >= 0; i--)
+ remove_common(parms[i]);
- p = parms + i;
- assert(p->type == Node_param_list);
- tmp = make_string(p->vname, strlen(p->vname));
- tmp2 = in_array(param_table, tmp);
- if (tmp2 != NULL && tmp2->dup_ent != NULL)
- tmp2->dup_ent = tmp2->dup_ent->dup_ent;
- else
- (void) assoc_remove(param_table, tmp);
+ assoc_clear(param_table); /* shazzam! */
+}
- unref(tmp);
- }
+void
+install_let(NODE *let, const char *alias)
+{
+ free(let->param);
+ let->param = (char *) alias;
+ install(alias, let, let->type);
+}
- assoc_clear(param_table); /* shazzam! */
+NODE *
+install_global_let(const char *alias, NODE *anon_global)
+{
+ NODE *let;
+
+ getnode(let);
+ memset(let, 0, sizeof *let);
+
+ let->type = Node_alias;
+ let->param = (char *) alias;
+ let->let_alias = anon_global;
+
+ install(alias, let, Node_alias);
+
+ return let;
}
+void
+remove_let(NODE *let)
+{
+ remove_common(let);
+}
/* remove_symbol --- remove a symbol from the symbol table */
@@ -239,14 +339,14 @@ destroy_symbol(NODE *r)
switch (r->type) {
case Node_func:
- if (r->param_cnt > 0) {
+ if (r->frame_cnt > 0) {
NODE *n;
int i;
- int pcount = r->param_cnt;
+ int pcount = r->frame_cnt;
/* function parameters of type Node_param_list */
for (i = 0; i < pcount; i++) {
- n = r->fparms + i;
+ n = r->fparms[i];
efree(n->param);
}
efree(r->fparms);
@@ -315,6 +415,8 @@ install(const char *name, NODE *parm, NODETYPE type)
|| type == Node_ext_func
|| type == Node_builtin_func) {
table = func_table;
+ } else if (type == Node_alias) {
+ table = alias_table;
} else if (installing_specials) {
table = global_table;
}
@@ -330,7 +432,7 @@ install(const char *name, NODE *parm, NODETYPE type)
var_count++; /* total, includes Node_func */
}
- if (type == Node_param_list) {
+ if (type == Node_param_list || type == Node_alias) {
prev = in_array(table, n_name);
if (prev == NULL)
goto simple;
@@ -686,22 +788,22 @@ check_param_names(void)
for (i = 0; i < max; i += 2) {
f = list[i+1];
- if (f->type == Node_builtin_func || f->param_cnt == 0)
+ if (f->type == Node_builtin_func || f->frame_cnt == 0)
continue;
- /* loop over each param in function i */
- for (j = 0; j < f->param_cnt; j++) {
+ /* loop over each param/local in function i */
+ for (j = 0; j < f->frame_cnt; j++) {
/* compare to function names */
/* use a fake node to avoid malloc/free of make_string */
- n.stptr = f->fparms[j].param;
- n.stlen = strlen(f->fparms[j].param);
+ n.stptr = f->fparms[j]->param;
+ n.stlen = strlen(f->fparms[j]->param);
if (in_array(func_table, & n)) {
error(
_("function `%s': cannot use function `%s' as a parameter name"),
list[i]->stptr,
- f->fparms[j].param);
+ f->fparms[j]->param);
result = false;
}
}
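
Note on the symbol.c changes above: install() pushes a new entry for a name that is already present in the parameter or alias table, lookup() prefers the most recently pushed entry, and remove_common() pops it again. That is what lets a @let variable shadow a parameter or an earlier @let variable of the same name. The sketch below illustrates only the push, lookup and pop discipline. It swaps in a single newest-first linked list for gawk's hash table with per-name dup_ent chains, and binding, scope, scope_push, scope_pop and scope_lookup are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical binding record; `older` links to the previous binding. */
struct binding {
    const char *name;
    int slot;               /* frame slot the name resolves to */
    struct binding *older;  /* next older binding, or NULL */
};

/* One scope table, kept as a list with the newest binding first. */
struct scope {
    struct binding *top;
};

/* Enter a binding; it shadows any older binding with the same name. */
void scope_push(struct scope *s, const char *name, int slot)
{
    struct binding *b;

    assert(name != NULL);
    b = malloc(sizeof *b);
    if (b == NULL)
        abort();
    b->name = name;
    b->slot = slot;
    b->older = s->top;
    s->top = b;
}

/* Leave the newest binding's scope; a no-op on an empty table. */
void scope_pop(struct scope *s)
{
    struct binding *b = s->top;

    if (b == NULL)
        return;
    s->top = b->older;
    free(b);
}

/* Return the slot of the newest binding for name, or -1 if unbound. */
int scope_lookup(const struct scope *s, const char *name)
{
    const struct binding *b;

    for (b = s->top; b != NULL; b = b->older)
        if (strcmp(b->name, name) == 0)
            return b->slot;
    return -1;
}
```

Pushing "x" twice and then looking it up yields the second slot; after one pop the first binding is visible again, which is the behavior the documentation's repeated-x example relies on.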
diff --git a/test/Makefile.am b/test/Makefile.am
index e6652965..6fc0b281 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -1422,7 +1422,8 @@ BASIC_TESTS = \
getline4 getline5 getlnbuf getnr2tb getnr2tm gsubasgn gsubtest \
gsubtst2 gsubtst3 gsubtst4 gsubtst5 gsubtst6 gsubtst7 gsubtst8 \
hex hex2 hsprint inpref inputred intest intprec iobug1 leaddig \
- leadnl litoct longsub longwrds manglprm math membug1 memleak \
+ leadnl litoct let1 let2 let3 let4 let5 let6 \
+ longsub longwrds manglprm math membug1 memleak \
messages minusstr mmap8k nasty nasty2 negexp negrange nested \
nfldstr nfloop nfneg nfset nlfldsep nlinstr nlstrina noeffect \
nofile nofmtch noloop1 noloop2 nonl noparms nors nulinsrc \
diff --git a/test/Makefile.in b/test/Makefile.in
index ed60771d..87921f1b 100644
--- a/test/Makefile.in
+++ b/test/Makefile.in
@@ -1688,7 +1688,8 @@ BASIC_TESTS = \
getline4 getline5 getlnbuf getnr2tb getnr2tm gsubasgn gsubtest \
gsubtst2 gsubtst3 gsubtst4 gsubtst5 gsubtst6 gsubtst7 gsubtst8 \
hex hex2 hsprint inpref inputred intest intprec iobug1 leaddig \
- leadnl litoct longsub longwrds manglprm math membug1 memleak \
+ leadnl litoct let1 let2 let3 let4 let5 let6 \
+ longsub longwrds manglprm math membug1 memleak \
messages minusstr mmap8k nasty nasty2 negexp negrange nested \
nfldstr nfloop nfneg nfset nlfldsep nlinstr nlstrina noeffect \
nofile nofmtch noloop1 noloop2 nonl noparms nors nulinsrc \
diff --git a/test/Maketests b/test/Maketests
index d21d4c6c..8d698604 100644
--- a/test/Maketests
+++ b/test/Maketests
@@ -548,6 +548,36 @@ litoct:
@-AWKPATH="$(srcdir)" $(AWK) -f $@.awk --traditional < "$(srcdir)"/$@.in >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
@-$(CMP) "$(srcdir)"/$@.ok _$@ && rm -f _$@
+let1:
+ @echo $@
+ @-AWKPATH="$(srcdir)" $(AWK) -f $@.awk >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
+ @-$(CMP) "$(srcdir)"/$@.ok _$@ && rm -f _$@
+
+let2:
+ @echo $@
+ @-AWKPATH="$(srcdir)" $(AWK) -f $@.awk >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
+ @-$(CMP) "$(srcdir)"/$@.ok _$@ && rm -f _$@
+
+let3:
+ @echo $@
+ @-AWKPATH="$(srcdir)" $(AWK) -f $@.awk >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
+ @-$(CMP) "$(srcdir)"/$@.ok _$@ && rm -f _$@
+
+let4:
+ @echo $@
+ @-AWKPATH="$(srcdir)" $(AWK) -f $@.awk >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
+ @-$(CMP) "$(srcdir)"/$@.ok _$@ && rm -f _$@
+
+let5:
+ @echo $@
+ @-AWKPATH="$(srcdir)" $(AWK) -f $@.awk >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
+ @-$(CMP) "$(srcdir)"/$@.ok _$@ && rm -f _$@
+
+let6:
+ @echo $@
+ @-AWKPATH="$(srcdir)" $(AWK) -f $@.awk >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
+ @-$(CMP) "$(srcdir)"/$@.ok _$@ && rm -f _$@
+
longsub:
@echo $@
@-AWKPATH="$(srcdir)" $(AWK) -f $@.awk < "$(srcdir)"/$@.in >_$@ 2>&1 || echo EXIT CODE: $$? >>_$@
diff --git a/test/let1.awk b/test/let1.awk
new file mode 100644
index 00000000..c7e98468
--- /dev/null
+++ b/test/let1.awk
@@ -0,0 +1,39 @@
+function f0()
+{
+ @let (l0)
+ return l0
+}
+
+BEGIN {
+ l0 = "abc"
+ r0 = f0(42);
+ print r0 == 0 && r0 == ""
+}
+
+function f1(l1)
+{
+ @let (l1 = 42)
+ ;
+ return l1;
+}
+
+BEGIN {
+ print "f1", f1(3)
+}
+
+function f2(l2)
+{
+ @let (x = l2 + 1, x = x + 1, x = x + 1)
+ @let (x = x + 1, x = x + 1)
+ return l2 "-" x
+}
+
+BEGIN {
+ print "f2", f2(3)
+}
+
+BEGIN {
+ @let (l3 = 3, x = l3 + 1, x = x + 1, x = x + 1)
+ @let (x = x + 1, x = x + 1)
+ print "b3", l3 "-" x
+}
diff --git a/test/let1.ok b/test/let1.ok
new file mode 100644
index 00000000..153d9d96
--- /dev/null
+++ b/test/let1.ok
@@ -0,0 +1,5 @@
+gawk: let1.awk:9: warning: function `f0' called with more arguments than declared
+1
+f1 3
+f2 3-8
+b3 3-8
diff --git a/test/let2.awk b/test/let2.awk
new file mode 100644
index 00000000..da041c57
--- /dev/null
+++ b/test/let2.awk
@@ -0,0 +1,3 @@
+BEGIN {
+ @let (NR);
+}
diff --git a/test/let2.ok b/test/let2.ok
new file mode 100644
index 00000000..a97a22a4
--- /dev/null
+++ b/test/let2.ok
@@ -0,0 +1,2 @@
+gawk: let2.awk:2: error: cannot use special variable `NR' as a local variable
+EXIT CODE: 1
diff --git a/test/let3.awk b/test/let3.awk
new file mode 100644
index 00000000..5ecd0df8
--- /dev/null
+++ b/test/let3.awk
@@ -0,0 +1,3 @@
+BEGIN {
+ @let (awk::foo = 3);
+}
diff --git a/test/let3.ok b/test/let3.ok
new file mode 100644
index 00000000..bbad587c
--- /dev/null
+++ b/test/let3.ok
@@ -0,0 +1,2 @@
+gawk: let3.awk:2: error: local variable `awk::foo' cannot contain a namespace
+EXIT CODE: 1
diff --git a/test/let4.awk b/test/let4.awk
new file mode 100644
index 00000000..46c400a3
--- /dev/null
+++ b/test/let4.awk
@@ -0,0 +1,3 @@
+function fun() {
+ @let (fun = 42);
+}
diff --git a/test/let4.ok b/test/let4.ok
new file mode 100644
index 00000000..2a60aae6
--- /dev/null
+++ b/test/let4.ok
@@ -0,0 +1,2 @@
+gawk: let4.awk:2: error: function `fun': cannot use function name as local variable name
+EXIT CODE: 1
diff --git a/test/let5.awk b/test/let5.awk
new file mode 100644
index 00000000..a90f2e39
--- /dev/null
+++ b/test/let5.awk
@@ -0,0 +1,3 @@
+function fun() {
+ @let (awk::foo = 42);
+}
diff --git a/test/let5.ok b/test/let5.ok
new file mode 100644
index 00000000..e86742e8
--- /dev/null
+++ b/test/let5.ok
@@ -0,0 +1,2 @@
+gawk: let5.awk:2: error: local variable `awk::foo' cannot contain a namespace
+EXIT CODE: 1
diff --git a/test/let6.awk b/test/let6.awk
new file mode 100644
index 00000000..990c9647
--- /dev/null
+++ b/test/let6.awk
@@ -0,0 +1,3 @@
+function fun() {
+ @let (NR = 42);
+}
diff --git a/test/let6.ok b/test/let6.ok
new file mode 100644
index 00000000..c7fade5e
--- /dev/null
+++ b/test/let6.ok
@@ -0,0 +1,2 @@
+gawk: let6.awk:2: error: cannot use special variable `NR' as a local variable
+EXIT CODE: 1