author | Arnold D. Robbins <arnold@skeeve.com> | 2016-07-24 21:44:27 +0300 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2016-07-24 21:44:27 +0300 |
commit | 92ec6835548d3612bd8f0e6a2b05adf4afb1c581 (patch) | |
tree | 9ff7c80f3819e5b8967d65f84cf2ea0cc8e7ff5e | |
parent | 140a4a886edc231f1c5f02c6cd4c82effe58139e (diff) | |
Use dfa even in multibyte locales.
-rw-r--r-- | ChangeLog | 7 |
-rw-r--r-- | re.c | 12 |
2 files changed, 13 insertions, 6 deletions
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2016-07-24 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+	* re.c (research): Now that the dfa matcher correctly runs even
+	in multibyte locales, try it if even if need_start is true.
+	However, if start > 0, avoid dfa matcher, since it can't handle
+	the case where the search starts in the middle of a string.
+
 2016-07-23 Andrew J. Schorr <aschorr@telemetry-investments.com>
 
 	* builtin.c (do_print): Improve logic for formatting
--- a/re.c
+++ b/re.c
@@ -266,17 +266,17 @@ research(Regexp *rp, char *str, int start,
 		rp->pat.not_bol = 1;
 
 	/*
-	 * Always do dfa search if can; if it fails, then even if
-	 * need_start is true, we won't bother with the regex search.
+	 * Always do dfa search if can; if it fails, we won't bother
+	 * with the regex search.
 	 *
 	 * The dfa matcher doesn't have a no_bol flag, so don't bother
 	 * trying it in that case.
 	 *
-	 * 7/2008: Skip the dfa matcher if need_start. The dfa matcher
-	 * has bugs in certain multibyte cases and it's too difficult
-	 * to try to special case things.
+	 * 7/2016: The dfa matcher can't handle a case where searching
+	 * starts in the middle of a string, so don't bother trying it
+	 * in that case.
 	 */
-	if (rp->dfa && ! no_bol && ! need_start) {
+	if (rp->dfa && ! no_bol && start == 0) {
 		char save;
 		size_t count = 0;
 		struct dfa *superset = dfasuperset(rp->dfareg);
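
The effect of the changed condition can be sketched in isolation. The snippet below is an illustrative reduction, not code from gawk: the helper names `try_dfa_old` and `try_dfa_new` are hypothetical, and the parameters `have_dfa`, `no_bol`, `need_start`, and `start` stand in for `rp->dfa` and the corresponding values in `research()`. It only models which calls would reach the dfa matcher before and after this commit.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not in gawk): the guard used before this commit.
 * The dfa matcher was skipped whenever the caller needed the match start.
 */
bool try_dfa_old(bool have_dfa, bool no_bol, bool need_start)
{
	return have_dfa && !no_bol && !need_start;
}

/*
 * Hypothetical helper (not in gawk): the guard used after this commit.
 * need_start no longer matters; the dfa matcher is skipped only when
 * the search begins in the middle of the string (start > 0).
 */
bool try_dfa_new(bool have_dfa, bool no_bol, int start)
{
	return have_dfa && !no_bol && start == 0;
}
```

Under these assumptions, a search from offset 0 with `need_start` set now tries the dfa matcher, where the old guard skipped it. Any search with `start > 0` now skips the dfa matcher regardless of `need_start`.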