author | Arnold D. Robbins <arnold@skeeve.com> | 2010-11-25 21:22:47 +0200 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2010-11-25 21:22:47 +0200 |
commit | 286748e1a8500f647c3bccfb467b02bf3a37f398 (patch) | |
tree | 6385bb2f1ee6c0837204edfd307babceeae7f89a /doc/gawk.texi | |
parent | 50d4a80f67e5bcbf3902138d85a25f6a90847d31 (diff) | |
Add POSIX string comparison with strcoll.
Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- | doc/gawk.texi | 31 |
1 files changed, 29 insertions, 2 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 28692a39..59770d5f 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -417,6 +417,7 @@ particular records in a file and perform operations upon them.
 with @samp{<}, etc.
 * Variable Typing:: String type versus numeric type.
 * Comparison Operators:: The comparison operators.
+* POSIX String Comparison:: String comparison with POSIX rules.
 * Boolean Ops:: Combining comparison expressions using
 boolean operators @samp{||} (``or''),
 @samp{&&} (``and'') and @samp{!} (``not'').
@@ -8938,6 +8939,7 @@ compares variables.
 @menu
 * Variable Typing:: String type versus numeric type.
 * Comparison Operators:: The comparison operators.
+* POSIX String Comparison:: String comparison with POSIX rules.
 @end menu
 
 @node Variable Typing
@@ -9154,8 +9156,8 @@ the longer one. Thus, @code{"abc"} is less than @code{"abcd"}.
 
 @cindex troubleshooting, @code{==} operator
 It is very easy to accidentally mistype the @samp{==} operator and
-leave off one of the @samp{=} characters. The result is still valid @command{awk}
-code, but the program does not do what is intended:
+leave off one of the @samp{=} characters. The result is still valid
+@command{awk} code, but the program does not do what is intended:
 
 @example
 if (a = b) # oops! should be a == b
@@ -9258,6 +9260,31 @@ One special place where @code{/foo/} is @emph{not} an abbreviation for
 @samp{!~}.
 @xref{Using Constant Regexps},
 where this is discussed in more detail.
+
+@node POSIX String Comparison
+@subsubsection String comparison with POSIX rules.
+
+The POSIX standard says that string comparison is performed based
+on the locale's collating order. This is usually very different
+from the results obtained when doing straight character-by-character
+comparison.@footnote{Technically, string comparison is supposed
+to behave the same way as if the strings are compared with the C
+@code{strcoll()} function.}
+
+Because this behavior differs considerably from existing practice,
+@command{gawk} only implements it when in POSIX mode (@pxref{Options}).
+Here is an example to illustrate the difference, in a @code{en_US.UTF-8}
+locale:
+
+@example
+$ @kbd{gawk 'BEGIN @{ printf("ABC < abc = %s\n",}
+> @kbd{("ABC" < "abc" ? "TRUE" : "FALSE")) @}'}
+@print{} ABC < abc = TRUE
+$ @kbd{gawk --posix 'BEGIN @{ printf("ABC < abc = %s\n",}
+> @kbd{("ABC" < "abc" ? "TRUE" : "FALSE")) @}'}
+@print{} ABC < abc = FALSE
+@end example
+
 @c ENDOFRANGE comex
 @c ENDOFRANGE excom
 @c ENDOFRANGE vartypc
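
The distinction the new manual section draws, between character-by-character comparison and comparison by the locale's collating order, can be tried outside of gawk. The following is only a minimal sketch in Python, not gawk's implementation: the helper names `byte_less` and `collate_less` are hypothetical, and it assumes that Python's `locale.strcoll()` is a fair stand-in for the C `strcoll()` function named in the footnote. Whether the `en_US.UTF-8` locale is installed is also an assumption, so the script falls back to the `C` locale if it is not.

```python
import locale


def byte_less(a: str, b: str) -> bool:
    """Plain code-point comparison, roughly what the non-POSIX example shows."""
    return a < b


def collate_less(a: str, b: str) -> bool:
    """Comparison by the current LC_COLLATE order, via strcoll()."""
    return locale.strcoll(a, b) < 0


if __name__ == "__main__":
    # Assumption: en_US.UTF-8 may not exist on this system; fall back to "C",
    # where strcoll() orders strings the same way as plain comparison.
    try:
        locale.setlocale(locale.LC_COLLATE, "en_US.UTF-8")
    except locale.Error:
        locale.setlocale(locale.LC_COLLATE, "C")

    print("code-point order: ABC < abc =", byte_less("ABC", "abc"))
    print("collating order:  ABC < abc =", collate_less("ABC", "abc"))
```

The first line always reports `True`, because `A` has a smaller code point than `a`. The second line depends on the collation data of whichever locale was actually selected, which is the same reason the patch limits the behavior to POSIX mode.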