From 1bba6cbbfce254ef9cf277502b8209fb03dad458 Mon Sep 17 00:00:00 2001 From: Kaz Kylheku Date: Tue, 20 Jul 2021 06:38:12 -0700 Subject: hash: revert bad fix in weak processing. This commit reverts the April 11, 2020 commit 89059a932cbfc030abc3ebf63582766b177300b2, subject line "hash: bugfix: spurious retention in weak processing". That commit is a regression. This revert requires a follow-up; the commit was trying to fix an issue which now reappears. It will have to be fixed differently. The regression is that in a situation in which data is referenced through two or more dependent weak tables, entries that are reachable can spontaneously disappear from downstream tables. Suppose H0 and H1 are weak-key tables. Suppose the program holds a datum K0, which is the only reference through which D1 is reached, in the following chain: K0 -> [H0] -> K1 -> [H1] -> D1 K0 is a key in hash table H0, which has weak keys. The the associated value K1 is a key in H1, which then references D1. H0 holds the only reference to K1, and H1 holds the only reference to D1. During the first GC marking phase, because we do not mark any part of a table which has weak keys, when we process H0 we do not mark the value K1. Thus K1 looks unreachable. In the second weak hash processing pass, because K1 was treated as unreachable, the entry in H1 is incorrectly expired. This issue affects TXR's origin_hash and form_to_ln_hash, which are in this relationship. The problem was uncovered in recent tags.tl work, manifesting as a spurious disappearance of line number info when processing .txr files. The line number info disappeared for entries which were traced through the origin_hash via the macro-ancestor function (see the unexpand function in tags.tl). * hash.c (hash_mark): Only skip marking tables that have both weak keys and values. For weak key tables, mark all the values and vice versa: for weak value tables, mark the keys. * txr.1: Text resembling original text restored. --- hash.c | 42 +++++++++++++++++++++++++++++------------- txr.1 | 12 +++++------- 2 files changed, 34 insertions(+), 20 deletions(-) diff --git a/hash.c b/hash.c index f900806e..b1d4f748 100644 --- a/hash.c +++ b/hash.c @@ -601,30 +601,46 @@ static void hash_mark(val hash) gc_mark(h->userdata); + if (h->count == 0) { + gc_mark(h->table); + return; + } + /* Use counts will be re-calculated by a scan of the hash iterators which are still reachable. */ h->usecount = 0; switch (h->flags) { + cnum i; + val iter; case hash_weak_none: - /* If the hash is not weak, we can simply mark the table - vector and we are done. */ gc_mark(h->table); - break; + return; case hash_weak_keys: + /* Mark values only. Don't mark the table. */ + for (i = 0; i < h->modulus; i++) { + for (iter = h->table->v.vec[i]; iter; iter = us_cdr(iter)) { + val entry = us_car(iter); + gc_mark(us_cdr(entry)); + } + } + break; case hash_weak_vals: - case hash_weak_both: - /* If the hash is weak, we don't touch it at this time, - but add it to the list of reachable weak hashes, - unless it is empty. */ - if (h->count > 0) { - h->next = reachable_weak_hashes; - reachable_weak_hashes = h; - } else { - gc_mark(h->table); - break; + /* Mark keys only. Don't mark the table. */ + for (i = 0; i < h->modulus; i++) { + for (iter = h->table->v.vec[i]; iter; iter = us_cdr(iter)) { + val entry = us_car(iter); + gc_mark(us_car(entry)); + } } + break; + case hash_weak_both: + /* mark nothing */ + break; } + + h->next = reachable_weak_hashes; + reachable_weak_hashes = h; } static struct cobj_ops hash_ops = cobj_ops_init(hash_equal_op, diff --git a/txr.1 b/txr.1 index 9c475232..e4c55f60 100644 --- a/txr.1 +++ b/txr.1 @@ -51061,13 +51061,11 @@ the weak references "lapse" in some way, which depends on what kind they are. Hash-table weak references lapse by entry removal. When an object used as a key in in one or more weak-key hash tables becomes unreachable, those hash entries disappear. This happens even if the values are themselves reachable. -That is what it means that. -Vice versa, when an object appearing as a value in -one or more hash-table entries in weak-value hash tables becomes unreachable, -those entries disappear, even if the keys are reachable. When a hash table has -both weak keys and weak values, then its entries are removed when either keys -or values become unreachable. In other words, both the key and value must be -reachable in order to retain the entry. +Vice versa, when an object appearing as a value in one or more weak-value hash +tables becomes unreachable, those entries disappear, even if the keys are +reachable. When a hash table has both weak keys and weak values, then its +entries are removed when either keys or values become unreachable. In other +words, both the key and value must be reachable in order to retain the entry. An open traversal of a hash table is performed by the .code maphash -- cgit v1.2.3