Fix doc on ranges and locales.

author: Arnold D. Robbins <arnold@skeeve.com> 2012-07-20 12:26:59 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2012-07-20 12:26:59 +0300
commit: 7bfc288d27bacb715ff63dbf71be53304917685a (patch)
tree: f575046eebd32bb710198e45072ec30e71255e7f /doc/gawk.texi
parent: 4fe1f4ac1aa0e4b99c9abb26794fc0d10ebb77c6 (diff)
1 files changed, 20 insertions, 6 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index fb17b716..bf30d012 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -66,6 +66,15 @@
 @set DARKCORNER (d.c.)
 @set COMMONEXT (c.e.)
 @end ifdocbook
+@ifxml
+@set DOCUMENT book
+@set CHAPTER chapter
+@set APPENDIX appendix
+@set SECTION section
+@set SUBSECTION subsection
+@set DARKCORNER (d.c.)
+@set COMMONEXT (c.e.)
+@end ifxml
 @ifplaintext
 @set DOCUMENT book
 @set CHAPTER chapter
@@ -27062,7 +27071,7 @@ Almost all introductory Unix literature explained range expressions
 as working in this fashion, and in particular, would teach that the
 ``correct'' way to match lowercase letters was with @samp{[a-z]}, and
 that @samp{[A-Z]} was the ``correct'' way to match uppercase letters.
-And indeed, this was true.
+And indeed, this was true.@footnote{And Life was good.}
 
 The 1993 POSIX standard introduced the idea of locales (@pxref{Locales}).
 Since many locales include other letters besides the plain twenty-six
@@ -27080,12 +27089,14 @@ But outside those locales, the ordering was defined to be based on
 In many locales, @samp{A} and @samp{a} are both less than @samp{B}.
 In other words, these locales sort characters in dictionary order,
 and @samp{[a-dx-z]} is typically not equivalent to @samp{[abcdxyz]};
-instead it might be equivalent to @samp{[aBbCcdXxYyz]}, for example.
+instead it might be equivalent to @samp{[aBbCcDdXxYyZz]}, for example.
+(And to make things worse, on other systems, it might be equivalent to
+@samp{[aAbBcCdDxXyYz]}.)
 
 This point needs to be emphasized: Much literature teaches that you should
 use @samp{[a-z]} to match a lowercase character.  But on systems with
 non-ASCII locales, this also matched all of the uppercase characters
-except @samp{Z}!  This was a continuous cause of confusion, even well
+except @samp{A} or @samp{Z}!  This was a continuous cause of confusion, even well
 into the twenty-first century.
 
 To demonstrate these issues, the following example uses the @code{sub()}
@@ -27121,13 +27132,16 @@ the @command{gawk} maintainer grew weary of trying to explain that
 @command{gawk} was being nicely standards-compliant, and that the issue
 was in the user's locale.  During the development of version 4.0,
 he modified @command{gawk} to always treat ranges in the original,
-pre-POSIX fashion, unless @option{--posix} was used (@pxref{Options}).
+pre-POSIX fashion, unless @option{--posix} was used (@pxref{Options}).@footnote{And
+thus was born the Campaign for Rational Range Interpretation (or RRI). A number
+of GNU tools, such as @command{grep} and @command{sed}, have either
+implemented this change, or will soon.  Thanks to Karl Berry for coining the phrase
+``Rational Range Interpretation.''}
 
 Fortunately, shortly before the final release of @command{gawk} 4.0,
 the maintainer learned that the 2008 standard had changed the
 definition of ranges, such that outside the @code{"C"} and @code{"POSIX"}
-locales, the meaning of range expressions was
-@emph{undefined}.@footnote{See
+locales, the meaning of range expressions was @emph{undefined}.@footnote{See
 @uref{http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05, the standard}
 and
 @uref{http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html#tag_21_09_03_05, its rationale}.}
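
The patched text says that in dictionary-order locales @samp{[a-z]} matches every uppercase letter except @samp{A} or @samp{Z}, depending on the system. The following is a minimal sketch of why, not gawk's implementation: it expands a range by slicing a collation sequence. The three orderings and the helper names (`expand_range`, `expand_bracket`) are hypothetical and chosen only for illustration; real locales define their own collation tables.

```python
import string

UPPER = string.ascii_uppercase
LOWER = string.ascii_lowercase

# Hypothetical collation sequences (assumptions for illustration only).
ASCII_ORDER = UPPER + LOWER                                    # A..Z then a..z
UPPER_FIRST = "".join(u + l for u, l in zip(UPPER, LOWER))     # AaBbCc...Zz
LOWER_FIRST = "".join(l + u for u, l in zip(UPPER, LOWER))     # aAbBcC...zZ


def expand_range(lo, hi, order):
    """Return every character that collates from lo through hi in `order`."""
    start, end = order.index(lo), order.index(hi)
    if start > end:
        raise ValueError(f"invalid range {lo}-{hi} in this ordering")
    return order[start:end + 1]


def expand_bracket(ranges, order):
    """Expand a bracket expression given as a list of (lo, hi) pairs."""
    return "".join(expand_range(lo, hi, order) for lo, hi in ranges)


def missing_uppercase(order):
    """Uppercase letters that [a-z] does NOT match under `order`."""
    matched = set(expand_range("a", "z", order))
    return [c for c in UPPER if c not in matched]


if __name__ == "__main__":
    for name, order in [("ASCII", ASCII_ORDER),
                        ("upper-first", UPPER_FIRST),
                        ("lower-first", LOWER_FIRST)]:
        print(name, "[a-z] misses uppercase:", "".join(missing_uppercase(order)))
```

Under the ASCII-style ordering the range contains no uppercase letters at all, while each interleaved ordering leaves out exactly one uppercase letter. That is the behavior the Rational Range Interpretation change avoids by always using the first ordering unless @option{--posix} is given.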