author | Arnold D. Robbins <arnold@skeeve.com> | 2016-02-20 21:07:13 +0200 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2016-02-20 21:07:13 +0200 |
commit | 090687d94ad8a411c5e2cc434345e843ad381082 (patch) | |
tree | 36f02d720c26513123b9c77c1540765af4b98dd7 /doc/gawk.texi | |
parent | 7744707de0e95e1e0009204a7d4886d69db24530 (diff) | |
Doc update: Unicode in bracket expressions.
Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- | doc/gawk.texi | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 211a0a7e..dcd49e6e 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -5618,6 +5618,15 @@ set. For example, @samp{[0-9]} is equivalent to @samp{[0123456789]}.
 standard and @command{gawk} have changed over time. This is mainly
 of historical interest.)
 
+With the increasing popularity of the
+@uref{http://www.unicode.org, Unicode character standard},
+there is an additional wrinkle to consider. Octal and hexadecimal
+escape sequences inside bracket expressions are taken to represent
+only single-byte characters (characters whose values fit within
+the range 0--256). To match a range of characters where the endpoints
+of the range are larger than 256, enter the multibyte encodings of
+the characters directly.
+
 @cindex @code{\} (backslash), in bracket expressions
 @cindex backslash (@code{\}), in bracket expressions
 @cindex @code{^} (caret), in bracket expressions
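
As a rough illustration of the distinction the added paragraph draws, the Python sketch below (not part of this commit, and not gawk code) checks whether a character's value fits in one byte and shows the multibyte encoding of characters that do not. It assumes UTF-8 as the encoding, which the patch itself does not name; the helper names `fits_single_byte_value` and `utf8_bytes` are hypothetical.

```python
def fits_single_byte_value(ch: str) -> bool:
    """True if the character's code point can be held in one byte (0..255)."""
    return ord(ch) <= 0xFF


def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a character (assumed encoding)."""
    return ch.encode("utf-8")


# Greek alpha and omega have code points above 255, so a single
# octal or hexadecimal byte escape cannot name them; their UTF-8
# encodings are two bytes each.
for ch in ["A", "\u03b1", "\u03c9"]:
    print(
        f"U+{ord(ch):04X}",
        "one byte value" if fits_single_byte_value(ch) else "needs multibyte",
        utf8_bytes(ch).hex(" "),
    )
```

Under these assumptions, a range whose endpoints are such characters would be written with the characters themselves between the brackets rather than with byte escapes, as the patched text recommends.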