Diffstat (limited to 'doc')
-rw-r--r-- | doc/ChangeLog | 10
-rw-r--r-- | doc/gawk.1 | 19
-rw-r--r-- | doc/gawk.info | 794
-rw-r--r-- | doc/gawk.texi | 104
4 files changed, 534 insertions, 393 deletions
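The sort-specification rules this patch documents for PROCINFO["sorted_in"] (one to three words separated by single spaces, given in any order, each word truncatable, with missing parts defaulting to ascending, index, and string) can be modeled with a short sketch. The Python below is illustrative only: parse_sorted_in and the WORDS table are hypothetical names, and this is not gawk's parser. Raising ValueError for malformed values, and accepting "unsorted" only in untruncated form, are assumptions that the patch text does not settle.

```python
# Illustrative model of the PROCINFO["sorted_in"] word rules described in
# this patch's gawk.texi text. Hypothetical names; not gawk's real parser.
WORDS = {
    "direction": ("ascending", "descending"),
    "key": ("index", "value"),
    "mode": ("string", "number"),
}


def parse_sorted_in(spec):
    """Return (direction, key, mode), or None when order is arbitrary."""
    # The empty string is documented as equivalent to "unsorted".
    if spec in ("", "unsorted"):
        return None
    chosen = {}
    # Words are separated by exactly one space, so split on " " rather than
    # on arbitrary whitespace; a doubled space yields an empty word.
    for word in spec.split(" "):
        matches = [
            (part, full)
            for part, options in WORDS.items()
            for full in options
            if word and full.startswith(word)
        ]
        if len(matches) != 1:
            # Assumption: this model rejects unknown words outright.
            raise ValueError("unrecognized word: %r" % word)
        part, full = matches[0]
        if part in chosen:
            raise ValueError("%s given twice" % part)
        chosen[part] = full
    direction = chosen.get("direction", "ascending")
    key = chosen.get("key", "index")
    if key == "value":
        # Comparison mode is only valid when sorting by index.
        if "mode" in chosen:
            raise ValueError("comparison mode applies to index sorts only")
        return (direction, key, None)
    return (direction, key, chosen.get("mode", "string"))


print(parse_sorted_in("a i s"))             # ('ascending', 'index', 'string')
print(parse_sorted_in("descending value"))  # ('descending', 'value', None)
```

Because the six keywords start with six different letters, a one-letter truncation such as `a i s` is never ambiguous, which is why the model can resolve each word by simple prefix matching.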
diff --git a/doc/ChangeLog b/doc/ChangeLog index 9c2ad6ec..2f25cb6a 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,3 +1,13 @@ +Sun Mar 27 21:10:55 2011 Pat Rankin <rankin@pactechdata.com> + + * gawk.texi (Builit-in Variables: PROCINFO array, Scanning All + Elements of an Array: `for' statement): Update the documentation + for PROCINFO["sorted_in"]; add "ascending index number", + "descending index string", "ascending value", and "descending + value" as supported sort orderings. + * gawk.1 (PROCINFO array): Update PROCINFO["sorted_in"] to + reflect that the value matters, and list the supported sort orders. + Tue Feb 15 17:11:26 2011 Pat Rankin <rankin@pactechdata.com> * gawk.texi (Builit-in Variables: PROCINFO array, Scanning All @@ -1082,13 +1082,20 @@ system call. \fBPROCINFO["sorted_in"]\fP If this element exists in .BR PROCINFO , -.IR "no matter what its value" , -then -.I gawk -will cause array +then its value controls the order in which array elements +are traversed in .B for -loops -to traverse the array indices in sorted order. +loops. +Supported values are +\fB"ascending index string"\fR, +\fB"ascending index number"\fR, +\fB"ascending value"\fR, +\fB"descending index string"\fR, +\fB"descending index number"\fR, +\fB"descending value"\fR, and +\fB"unsorted"\fR. +The order specification words can be truncated, or omitted (provided +that at least one is present), or given in any order. .TP \fBPROCINFO["version"]\fP the version of diff --git a/doc/gawk.info b/doc/gawk.info index 714df826..f084fa3f 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -3633,7 +3633,7 @@ better performance when reading records. Otherwise, `gawk' has to make several function calls, _per input character_, to find the record terminator. - According to POSIX, string conmparison is also affected by locales + According to POSIX, string comparison is also affected by locales (similar to regular expressions). The details are presented in *note POSIX String Comparison::. 
@@ -9408,18 +9408,32 @@ with a pound sign (`#'). `PROCINFO["sorted_in"]' If this element exists in `PROCINFO', its value controls the - order in which array indices will be processed by `for(i in - arr) ...' loops. A value of `"ascending index string"', - which may be shortened to `"ascending index"' or just - `"ascending"', will result in either case sensitive or case - insensitive ascending order depending upon the value of - `IGNORECASE'. A value of `"descending index string"', which - may be shortened in a similar manner, will result in the - opposite order. The value `"unsorted"' is also recognized, - yielding the default result of arbitrary order. Any other - value will be ignored, and warned about (at the time of first - `for(in in arr) ...' execution) when lint checking is enabled. - *Note Scanning an Array::, for more information. + order in which array indices will be processed by `for (index + in array) ...' loops. The value should contain one to three + words; separate pairs of words by a single space. One word + controls sort direction, "ascending" or "descending;" another + controls the sort key, "index" or "value;" and the remaining + one, which is only valid for sorting by index, is comparison + mode, "string" or "number." When two or three words are + present, they may be specified in any order, so `ascending + index string' and `string ascending index' are equivalent. + Also, each word may be truncated, so `asc index str' and `a i + s' are also equivalent. Note that a separating space is + required even when the words have been shortened down to one + letter each. + + You can omit direction and/or key type and/or comparison + mode. Provided that at least one is present, missing parts + of a sort specification default to `ascending', `index', and + (for indices only) `string', respectively. An empty string, + `""', is the same as `unsorted' and will cause `for (index in + array) ...' to process the indices in arbitrary order. 
+ Another thing to note is that the array sorting takes place + at the time `for (... in ...)' is about to start executing, + so changing the value of `PROCINFO["sorted_in"]' during loop + execution does not have any effect on the order in which any + remaining array elements get processed. *Note Scanning an + Array::, for more information. `PROCINFO["strftime"]' The default time format string for `strftime()'. Assigning a @@ -9938,15 +9952,43 @@ produce strange results. It is best to avoid such things. As an extension, `gawk' makes it possible for you to loop over the elements of an array in order, based on the value of -`PROCINFO["sorted_in"]' (*note Auto-set::). At present two sorting -options are available: `"ascending index string"' and `"descending -index string"'. They can be shortened by omitting `string' or `index -string'. The value `"unsorted"' can be used as an explicit "no-op" and -yields the same result as when `PROCINFO["sorted_in"]' has no value at -all. If the index strings contain letters, the value of `IGNORECASE' -affects the order of the result. This extension is disabled in POSIX -mode, since the `PROCINFO' array is not special in that case. For -example: +`PROCINFO["sorted_in"]' (*note Auto-set::). Several sorting options +are available: + +`"ascending index string"' + Order by indices compared as strings, the most basic sort. + (Internally, array indices are always strings, so with `a[2*5] = 1' + the index is actually `"10"' rather than numeric 10.) + +`"ascending index number"' + Order by indices but force them to be treated as numbers in the + process. Any index with non-numeric value will end up positioned + as if it were 0. + +`"ascending value"' + Order by element values rather than by indices. Comparisons are + done as numeric when both values being compared are numeric, or + done as strings when either or both aren't numeric. Sub-arrays, + if present, come out last. 
+ +`"descending index string"' + Reverse order from the most basic sort. + +`"descending index number"' + Numeric indices ordered from high to low. + +`"descending value"' + Element values ordered from high to low. Sub-arrays, if present, + come out first. + +`"unsorted"' + Array elements are processed in arbitrary order, the normal `awk' + behavior. + + Portions of the sort specification string may be truncated or +omitted. The default is `ascending' for direction, `index' for sort +key type, and (when sorting by index only) `string' for comparison mode. +For example: $ gawk 'BEGIN { > a[4] = 4 @@ -9957,7 +9999,7 @@ example: -| 4 4 -| 3 3 $ gawk 'BEGIN { - > PROCINFO["sorted_in"] = "ascending index" + > PROCINFO["sorted_in"] = "asc index" > a[4] = 4 > a[3] = 3 > for (i in a) @@ -9971,6 +10013,26 @@ array has been reported to add 15% to 20% overhead to the execution time of `awk' programs. For this reason, sorted array traversal is not the default. + When sorting an array by element values, if a value happens to be a +sub-array then it is considered to be greater than any string or +numeric value, regardless of what the sub-array itself contains, and +all sub-arrays are treated as being equal to each other. Their order +relative to each other is determined by their index strings. + + Sorting by array element values (for values other than sub-arrays) +always uses basic `awk' comparison mode: if both values happen to be +numbers then they're compared as numbers, otherwise they're compared as +strings. + + When string comparisons are made during a sort, either for element +values where one or both aren't numbers or for element indices handled +as strings, the value of `IGNORECASE' controls whether the comparisons +treat corresponding upper and lower case letters as equivalent or +distinct. + + This sorting extension is disabled in POSIX mode, since the +`PROCINFO' array is not special in that case. 
+ File: gawk.info, Node: Delete, Next: Numeric Array Subscripts, Prev: Array Basics, Up: Arrays @@ -24437,7 +24499,7 @@ Index (line 67) * advanced features, data files as single record: Records. (line 175) * advanced features, fixed-width data: Constant Size. (line 9) -* advanced features, FNR/NR variables: Auto-set. (line 215) +* advanced features, FNR/NR variables: Auto-set. (line 229) * advanced features, gawk: Advanced Features. (line 6) * advanced features, gawk, network programming: TCP/IP Networking. (line 6) @@ -24937,7 +24999,7 @@ Index (line 47) * dark corner, FILENAME variable <1>: Auto-set. (line 92) * dark corner, FILENAME variable: Getline Notes. (line 19) -* dark corner, FNR/NR variables: Auto-set. (line 215) +* dark corner, FNR/NR variables: Auto-set. (line 229) * dark corner, format-control characters: Control Letters. (line 18) * dark corner, FS as null string: Single Character Fields. (line 20) @@ -25136,7 +25198,7 @@ Index * differences in awk and gawk, regular expressions: Case-sensitivity. (line 26) * differences in awk and gawk, RS/RT variables: Records. (line 167) -* differences in awk and gawk, RT variable: Auto-set. (line 204) +* differences in awk and gawk, RT variable: Auto-set. (line 218) * differences in awk and gawk, single-character fields: Single Character Fields. (line 6) * differences in awk and gawk, split() function: String Functions. @@ -25417,7 +25479,7 @@ Index * floating-point, numbers, AWKNUM internal type: Internals. (line 19) * FNR variable <1>: Auto-set. (line 102) * FNR variable: Records. (line 6) -* FNR variable, changing: Auto-set. (line 215) +* FNR variable, changing: Auto-set. (line 229) * for statement: For Statement. (line 6) * for statement, in arrays: Scanning an Array. (line 20) * force_number() internal function: Internals. (line 27) @@ -25591,7 +25653,7 @@ Index * gawk, regular expressions, operators: GNU Regexp Operators. (line 6) * gawk, regular expressions, precedence: Regexp Operators. 
(line 157) -* gawk, RT variable in <1>: Auto-set. (line 204) +* gawk, RT variable in <1>: Auto-set. (line 218) * gawk, RT variable in <2>: Getline/Variable/File. (line 10) * gawk, RT variable in <3>: Multiple Line. (line 129) @@ -26038,7 +26100,7 @@ Index * not Boolean-logic operator: Boolean Ops. (line 6) * NR variable <1>: Auto-set. (line 118) * NR variable: Records. (line 6) -* NR variable, changing: Auto-set. (line 215) +* NR variable, changing: Auto-set. (line 229) * null strings <1>: Basic Data Typing. (line 50) * null strings <2>: Truth Values. (line 6) * null strings <3>: Regexp Field Splitting. @@ -26470,7 +26532,7 @@ Index * right angle bracket (>), >> operator (I/O): Redirection. (line 50) * right shift, bitwise: Bitwise Functions. (line 32) * Ritchie, Dennis: Basic Data Typing. (line 74) -* RLENGTH variable: Auto-set. (line 191) +* RLENGTH variable: Auto-set. (line 205) * RLENGTH variable, match() function and: String Functions. (line 205) * Robbins, Arnold <1>: Future Extensions. (line 6) * Robbins, Arnold <2>: Bugs. (line 32) @@ -26495,9 +26557,9 @@ Index * RS variable: Records. (line 20) * RS variable, multiline records and: Multiple Line. (line 17) * rshift() function (gawk): Bitwise Functions. (line 51) -* RSTART variable: Auto-set. (line 197) +* RSTART variable: Auto-set. (line 211) * RSTART variable, match() function and: String Functions. (line 205) -* RT variable <1>: Auto-set. (line 204) +* RT variable <1>: Auto-set. (line 218) * RT variable <2>: Getline/Variable/File. (line 10) * RT variable <3>: Multiple Line. 
(line 129) @@ -27008,339 +27070,339 @@ Ref: Case-sensitivity-Footnote-2154459 Node: Leftmost Longest154567 Node: Computed Regexps155768 Node: Locales159194 -Node: Reading Files162902 -Node: Records164843 -Ref: Records-Footnote-1173517 -Node: Fields173554 -Ref: Fields-Footnote-1176587 -Node: Nonconstant Fields176673 -Node: Changing Fields178875 -Node: Field Separators184853 -Node: Default Field Splitting187482 -Node: Regexp Field Splitting188599 -Node: Single Character Fields191941 -Node: Command Line Field Separator193000 -Node: Field Splitting Summary196441 -Ref: Field Splitting Summary-Footnote-1199633 -Node: Constant Size199734 -Node: Splitting By Content204318 -Ref: Splitting By Content-Footnote-1208044 -Node: Multiple Line208084 -Ref: Multiple Line-Footnote-1213931 -Node: Getline214110 -Node: Plain Getline216338 -Node: Getline/Variable218427 -Node: Getline/File219568 -Node: Getline/Variable/File220890 -Ref: Getline/Variable/File-Footnote-1222489 -Node: Getline/Pipe222576 -Node: Getline/Variable/Pipe225136 -Node: Getline/Coprocess226243 -Node: Getline/Variable/Coprocess227486 -Node: Getline Notes228200 -Node: Getline Summary230142 -Ref: table-getline-variants230485 -Node: Command line directories231341 -Node: Printing231966 -Node: Print233597 -Node: Print Examples234934 -Node: Output Separators237718 -Node: OFMT239478 -Node: Printf240836 -Node: Basic Printf241742 -Node: Control Letters243281 -Node: Format Modifiers247093 -Node: Printf Examples253102 -Node: Redirection255817 -Node: Special Files262801 -Node: Special FD263334 -Ref: Special FD-Footnote-1266958 -Node: Special Network267032 -Node: Special Caveats267882 -Node: Close Files And Pipes268678 -Ref: Close Files And Pipes-Footnote-1275701 -Ref: Close Files And Pipes-Footnote-2275849 -Node: Expressions275999 -Node: Values277068 -Node: Constants277744 -Node: Scalar Constants278424 -Ref: Scalar Constants-Footnote-1279283 -Node: Nondecimal-numbers279465 -Node: Regexp Constants282524 -Node: Using Constant 
Regexps282999 -Node: Variables286054 -Node: Using Variables286709 -Node: Assignment Options288433 -Node: Conversion290305 -Ref: table-locale-affects295681 -Ref: Conversion-Footnote-1296305 -Node: All Operators296414 -Node: Arithmetic Ops297044 -Node: Concatenation299549 -Ref: Concatenation-Footnote-1302342 -Node: Assignment Ops302462 -Ref: table-assign-ops307450 -Node: Increment Ops308858 -Node: Truth Values and Conditions312328 -Node: Truth Values313411 -Node: Typing and Comparison314460 -Node: Variable Typing315249 -Ref: Variable Typing-Footnote-1319146 -Node: Comparison Operators319268 -Ref: table-relational-ops319678 -Node: POSIX String Comparison323227 -Ref: POSIX String Comparison-Footnote-1324183 -Node: Boolean Ops324321 -Ref: Boolean Ops-Footnote-1328399 -Node: Conditional Exp328490 -Node: Function Calls330222 -Node: Precedence333816 -Node: Patterns and Actions337469 -Node: Pattern Overview338523 -Node: Regexp Patterns340189 -Node: Expression Patterns340732 -Node: Ranges344306 -Node: BEGIN/END347272 -Node: Using BEGIN/END348034 -Ref: Using BEGIN/END-Footnote-1350765 -Node: I/O And BEGIN/END350871 -Node: BEGINFILE/ENDFILE353153 -Node: Empty355984 -Node: Using Shell Variables356300 -Node: Action Overview358585 -Node: Statements360942 -Node: If Statement362796 -Node: While Statement364295 -Node: Do Statement366339 -Node: For Statement367495 -Node: Switch Statement370647 -Node: Break Statement372744 -Node: Continue Statement374734 -Node: Next Statement376521 -Node: Nextfile Statement378911 -Node: Exit Statement381387 -Node: Built-in Variables383803 -Node: User-modified384898 -Ref: User-modified-Footnote-1392924 -Node: Auto-set392986 -Ref: Auto-set-Footnote-1402861 -Node: ARGC and ARGV403066 -Node: Arrays406917 -Node: Array Basics408488 -Node: Array Intro409199 -Node: Reference to Elements413517 -Node: Assigning Elements415787 -Node: Array Example416278 -Node: Scanning an Array418010 -Node: Delete421542 -Ref: Delete-Footnote-1423977 -Node: Numeric Array 
Subscripts424034 -Node: Uninitialized Subscripts426217 -Node: Multi-dimensional427845 -Node: Multi-scanning430936 -Node: Array Sorting432520 -Ref: Array Sorting-Footnote-1435614 -Node: Arrays of Arrays435808 -Node: Functions440381 -Node: Built-in441203 -Node: Calling Built-in442281 -Node: Numeric Functions444269 -Ref: Numeric Functions-Footnote-1448034 -Ref: Numeric Functions-Footnote-2448391 -Ref: Numeric Functions-Footnote-3448439 -Node: String Functions448708 -Ref: String Functions-Footnote-1471210 -Ref: String Functions-Footnote-2471339 -Ref: String Functions-Footnote-3471587 -Node: Gory Details471674 -Ref: table-sub-escapes473353 -Ref: table-posix-sub474667 -Ref: table-gensub-escapes475580 -Node: I/O Functions476751 -Ref: I/O Functions-Footnote-1483406 -Node: Time Functions483553 -Ref: Time Functions-Footnote-1494448 -Ref: Time Functions-Footnote-2494516 -Ref: Time Functions-Footnote-3494674 -Ref: Time Functions-Footnote-4494785 -Ref: Time Functions-Footnote-5494897 -Ref: Time Functions-Footnote-6495124 -Node: Bitwise Functions495390 -Ref: table-bitwise-ops495948 -Ref: Bitwise Functions-Footnote-1500108 -Node: Type Functions500292 -Node: I18N Functions500762 -Node: User-defined502389 -Node: Definition Syntax503193 -Ref: Definition Syntax-Footnote-1508103 -Node: Function Example508172 -Node: Function Caveats510766 -Node: Calling A Function511187 -Node: Variable Scope512302 -Node: Pass By Value/Reference514277 -Node: Return Statement517717 -Node: Dynamic Typing520698 -Node: Indirect Calls521433 -Node: Internationalization531118 -Node: I18N and L10N532544 -Node: Explaining gettext533230 -Ref: Explaining gettext-Footnote-1538296 -Ref: Explaining gettext-Footnote-2538480 -Node: Programmer i18n538645 -Node: Translator i18n542845 -Node: String Extraction543638 -Ref: String Extraction-Footnote-1544599 -Node: Printf Ordering544685 -Ref: Printf Ordering-Footnote-1547469 -Node: I18N Portability547533 -Ref: I18N Portability-Footnote-1549982 -Node: I18N Example550045 -Ref: 
I18N Example-Footnote-1552680 -Node: Gawk I18N552752 -Node: Advanced Features553369 -Node: Nondecimal Data554688 -Node: Two-way I/O556269 -Ref: Two-way I/O-Footnote-1561703 -Node: TCP/IP Networking561773 -Node: Profiling564617 -Node: Library Functions572091 -Ref: Library Functions-Footnote-1575196 -Node: Library Names575367 -Ref: Library Names-Footnote-1578838 -Ref: Library Names-Footnote-2579058 -Node: General Functions579144 -Node: Nextfile Function580207 -Node: Strtonum Function584588 -Node: Assert Function587544 -Node: Round Function590870 -Node: Cliff Random Function592413 -Node: Ordinal Functions593429 -Ref: Ordinal Functions-Footnote-1596499 -Ref: Ordinal Functions-Footnote-2596751 -Node: Join Function596960 -Ref: Join Function-Footnote-1598731 -Node: Gettimeofday Function598931 -Node: Data File Management602646 -Node: Filetrans Function603278 -Node: Rewind Function607514 -Node: File Checking608967 -Node: Empty Files610061 -Node: Ignoring Assigns612291 -Node: Getopt Function613844 -Ref: Getopt Function-Footnote-1625148 -Node: Passwd Functions625351 -Ref: Passwd Functions-Footnote-1634326 -Node: Group Functions634414 -Node: Walking Arrays642498 -Node: Sample Programs644067 -Node: Running Examples644732 -Node: Clones645460 -Node: Cut Program646684 -Node: Egrep Program656533 -Ref: Egrep Program-Footnote-1664304 -Node: Id Program664414 -Node: Split Program668030 -Ref: Split Program-Footnote-1671549 -Node: Tee Program671677 -Node: Uniq Program674480 -Node: Wc Program681903 -Ref: Wc Program-Footnote-1686167 -Node: Miscellaneous Programs686367 -Node: Dupword Program687555 -Node: Alarm Program689586 -Node: Translate Program694343 -Ref: Translate Program-Footnote-1698722 -Ref: Translate Program-Footnote-2698950 -Node: Labels Program699084 -Ref: Labels Program-Footnote-1702455 -Node: Word Sorting702539 -Node: History Sorting706422 -Node: Extract Program708260 -Ref: Extract Program-Footnote-1715741 -Node: Simple Sed715869 -Node: Igawk Program718931 -Ref: Igawk 
Program-Footnote-1733963 -Ref: Igawk Program-Footnote-2734164 -Node: Anagram Program734302 -Node: Signature Program737400 -Node: Debugger738503 -Node: Debugging739414 -Node: Debugging Concepts739728 -Node: Debugging Terms741584 -Node: Awk Debugging744129 -Node: Sample dgawk session745021 -Node: dgawk invocation745513 -Node: Finding The Bug746695 -Node: List of Debugger Commands753180 -Node: Breakpoint Control754491 -Node: Dgawk Execution Control757967 -Node: Viewing And Changing Data761318 -Node: Dgawk Stack764627 -Node: Dgawk Info766087 -Node: Miscellaneous Dgawk Commands770035 -Node: Readline Support775466 -Node: Dgawk Limitations776293 -Node: Language History778432 -Node: V7/SVR3.1779864 -Node: SVR4782159 -Node: POSIX783601 -Node: BTL784599 -Node: POSIX/GNU785333 -Node: Common Extensions790519 -Node: Contributors791620 -Node: Installation795655 -Node: Gawk Distribution796549 -Node: Getting797033 -Node: Extracting797859 -Node: Distribution contents799550 -Node: Unix Installation804568 -Node: Quick Installation805185 -Node: Additional Configuration Options807147 -Node: Configuration Philosophy808624 -Node: Non-Unix Installation810966 -Node: PC Installation811424 -Node: PC Binary Installation812723 -Node: PC Compiling814571 -Node: PC Testing817515 -Node: PC Using818691 -Node: Cygwin822876 -Node: MSYS823873 -Node: VMS Installation824387 -Node: VMS Compilation824993 -Ref: VMS Compilation-Footnote-1826000 -Node: VMS Installation Details826058 -Node: VMS Running827693 -Node: VMS Old Gawk829300 -Node: Bugs829774 -Node: Other Versions833639 -Node: Notes838918 -Node: Compatibility Mode839610 -Node: Additions840393 -Node: Accessing The Source841205 -Node: Adding Code842628 -Node: New Ports848176 -Node: Dynamic Extensions852289 -Node: Internals853665 -Node: Plugin License862781 -Node: Sample Library863415 -Node: Internal File Description864101 -Node: Internal File Ops867808 -Ref: Internal File Ops-Footnote-1872576 -Node: Using Internal File Ops872724 -Node: Future 
Extensions875101 -Node: Basic Concepts877605 -Node: Basic High Level878362 -Ref: Basic High Level-Footnote-1882397 -Node: Basic Data Typing882582 -Node: Floating Point Issues887107 -Node: String Conversion Precision888190 -Ref: String Conversion Precision-Footnote-1889884 -Node: Unexpected Results889993 -Node: POSIX Floating Point Problems891819 -Ref: POSIX Floating Point Problems-Footnote-1895521 -Node: Glossary895559 -Node: Copying919702 -Node: GNU Free Documentation License957259 -Node: Index982396 +Node: Reading Files162901 +Node: Records164842 +Ref: Records-Footnote-1173516 +Node: Fields173553 +Ref: Fields-Footnote-1176586 +Node: Nonconstant Fields176672 +Node: Changing Fields178874 +Node: Field Separators184852 +Node: Default Field Splitting187481 +Node: Regexp Field Splitting188598 +Node: Single Character Fields191940 +Node: Command Line Field Separator192999 +Node: Field Splitting Summary196440 +Ref: Field Splitting Summary-Footnote-1199632 +Node: Constant Size199733 +Node: Splitting By Content204317 +Ref: Splitting By Content-Footnote-1208043 +Node: Multiple Line208083 +Ref: Multiple Line-Footnote-1213930 +Node: Getline214109 +Node: Plain Getline216337 +Node: Getline/Variable218426 +Node: Getline/File219567 +Node: Getline/Variable/File220889 +Ref: Getline/Variable/File-Footnote-1222488 +Node: Getline/Pipe222575 +Node: Getline/Variable/Pipe225135 +Node: Getline/Coprocess226242 +Node: Getline/Variable/Coprocess227485 +Node: Getline Notes228199 +Node: Getline Summary230141 +Ref: table-getline-variants230484 +Node: Command line directories231340 +Node: Printing231965 +Node: Print233596 +Node: Print Examples234933 +Node: Output Separators237717 +Node: OFMT239477 +Node: Printf240835 +Node: Basic Printf241741 +Node: Control Letters243280 +Node: Format Modifiers247092 +Node: Printf Examples253101 +Node: Redirection255816 +Node: Special Files262800 +Node: Special FD263333 +Ref: Special FD-Footnote-1266957 +Node: Special Network267031 +Node: Special Caveats267881 
+Node: Close Files And Pipes268677 +Ref: Close Files And Pipes-Footnote-1275700 +Ref: Close Files And Pipes-Footnote-2275848 +Node: Expressions275998 +Node: Values277067 +Node: Constants277743 +Node: Scalar Constants278423 +Ref: Scalar Constants-Footnote-1279282 +Node: Nondecimal-numbers279464 +Node: Regexp Constants282523 +Node: Using Constant Regexps282998 +Node: Variables286053 +Node: Using Variables286708 +Node: Assignment Options288432 +Node: Conversion290304 +Ref: table-locale-affects295680 +Ref: Conversion-Footnote-1296304 +Node: All Operators296413 +Node: Arithmetic Ops297043 +Node: Concatenation299548 +Ref: Concatenation-Footnote-1302341 +Node: Assignment Ops302461 +Ref: table-assign-ops307449 +Node: Increment Ops308857 +Node: Truth Values and Conditions312327 +Node: Truth Values313410 +Node: Typing and Comparison314459 +Node: Variable Typing315248 +Ref: Variable Typing-Footnote-1319145 +Node: Comparison Operators319267 +Ref: table-relational-ops319677 +Node: POSIX String Comparison323226 +Ref: POSIX String Comparison-Footnote-1324182 +Node: Boolean Ops324320 +Ref: Boolean Ops-Footnote-1328398 +Node: Conditional Exp328489 +Node: Function Calls330221 +Node: Precedence333815 +Node: Patterns and Actions337468 +Node: Pattern Overview338522 +Node: Regexp Patterns340188 +Node: Expression Patterns340731 +Node: Ranges344305 +Node: BEGIN/END347271 +Node: Using BEGIN/END348033 +Ref: Using BEGIN/END-Footnote-1350764 +Node: I/O And BEGIN/END350870 +Node: BEGINFILE/ENDFILE353152 +Node: Empty355983 +Node: Using Shell Variables356299 +Node: Action Overview358584 +Node: Statements360941 +Node: If Statement362795 +Node: While Statement364294 +Node: Do Statement366338 +Node: For Statement367494 +Node: Switch Statement370646 +Node: Break Statement372743 +Node: Continue Statement374733 +Node: Next Statement376520 +Node: Nextfile Statement378910 +Node: Exit Statement381386 +Node: Built-in Variables383802 +Node: User-modified384897 +Ref: User-modified-Footnote-1392923 +Node: 
Auto-set392985 +Ref: Auto-set-Footnote-1403716 +Node: ARGC and ARGV403921 +Node: Arrays407772 +Node: Array Basics409343 +Node: Array Intro410054 +Node: Reference to Elements414372 +Node: Assigning Elements416642 +Node: Array Example417133 +Node: Scanning an Array418865 +Node: Delete424132 +Ref: Delete-Footnote-1426567 +Node: Numeric Array Subscripts426624 +Node: Uninitialized Subscripts428807 +Node: Multi-dimensional430435 +Node: Multi-scanning433526 +Node: Array Sorting435110 +Ref: Array Sorting-Footnote-1438204 +Node: Arrays of Arrays438398 +Node: Functions442971 +Node: Built-in443793 +Node: Calling Built-in444871 +Node: Numeric Functions446859 +Ref: Numeric Functions-Footnote-1450624 +Ref: Numeric Functions-Footnote-2450981 +Ref: Numeric Functions-Footnote-3451029 +Node: String Functions451298 +Ref: String Functions-Footnote-1473800 +Ref: String Functions-Footnote-2473929 +Ref: String Functions-Footnote-3474177 +Node: Gory Details474264 +Ref: table-sub-escapes475943 +Ref: table-posix-sub477257 +Ref: table-gensub-escapes478170 +Node: I/O Functions479341 +Ref: I/O Functions-Footnote-1485996 +Node: Time Functions486143 +Ref: Time Functions-Footnote-1497038 +Ref: Time Functions-Footnote-2497106 +Ref: Time Functions-Footnote-3497264 +Ref: Time Functions-Footnote-4497375 +Ref: Time Functions-Footnote-5497487 +Ref: Time Functions-Footnote-6497714 +Node: Bitwise Functions497980 +Ref: table-bitwise-ops498538 +Ref: Bitwise Functions-Footnote-1502698 +Node: Type Functions502882 +Node: I18N Functions503352 +Node: User-defined504979 +Node: Definition Syntax505783 +Ref: Definition Syntax-Footnote-1510693 +Node: Function Example510762 +Node: Function Caveats513356 +Node: Calling A Function513777 +Node: Variable Scope514892 +Node: Pass By Value/Reference516867 +Node: Return Statement520307 +Node: Dynamic Typing523288 +Node: Indirect Calls524023 +Node: Internationalization533708 +Node: I18N and L10N535134 +Node: Explaining gettext535820 +Ref: Explaining gettext-Footnote-1540886 
+Ref: Explaining gettext-Footnote-2541070 +Node: Programmer i18n541235 +Node: Translator i18n545435 +Node: String Extraction546228 +Ref: String Extraction-Footnote-1547189 +Node: Printf Ordering547275 +Ref: Printf Ordering-Footnote-1550059 +Node: I18N Portability550123 +Ref: I18N Portability-Footnote-1552572 +Node: I18N Example552635 +Ref: I18N Example-Footnote-1555270 +Node: Gawk I18N555342 +Node: Advanced Features555959 +Node: Nondecimal Data557278 +Node: Two-way I/O558859 +Ref: Two-way I/O-Footnote-1564293 +Node: TCP/IP Networking564363 +Node: Profiling567207 +Node: Library Functions574681 +Ref: Library Functions-Footnote-1577786 +Node: Library Names577957 +Ref: Library Names-Footnote-1581428 +Ref: Library Names-Footnote-2581648 +Node: General Functions581734 +Node: Nextfile Function582797 +Node: Strtonum Function587178 +Node: Assert Function590134 +Node: Round Function593460 +Node: Cliff Random Function595003 +Node: Ordinal Functions596019 +Ref: Ordinal Functions-Footnote-1599089 +Ref: Ordinal Functions-Footnote-2599341 +Node: Join Function599550 +Ref: Join Function-Footnote-1601321 +Node: Gettimeofday Function601521 +Node: Data File Management605236 +Node: Filetrans Function605868 +Node: Rewind Function610104 +Node: File Checking611557 +Node: Empty Files612651 +Node: Ignoring Assigns614881 +Node: Getopt Function616434 +Ref: Getopt Function-Footnote-1627738 +Node: Passwd Functions627941 +Ref: Passwd Functions-Footnote-1636916 +Node: Group Functions637004 +Node: Walking Arrays645088 +Node: Sample Programs646657 +Node: Running Examples647322 +Node: Clones648050 +Node: Cut Program649274 +Node: Egrep Program659123 +Ref: Egrep Program-Footnote-1666894 +Node: Id Program667004 +Node: Split Program670620 +Ref: Split Program-Footnote-1674139 +Node: Tee Program674267 +Node: Uniq Program677070 +Node: Wc Program684493 +Ref: Wc Program-Footnote-1688757 +Node: Miscellaneous Programs688957 +Node: Dupword Program690145 +Node: Alarm Program692176 +Node: Translate Program696933 
+Ref: Translate Program-Footnote-1701312 +Ref: Translate Program-Footnote-2701540 +Node: Labels Program701674 +Ref: Labels Program-Footnote-1705045 +Node: Word Sorting705129 +Node: History Sorting709012 +Node: Extract Program710850 +Ref: Extract Program-Footnote-1718331 +Node: Simple Sed718459 +Node: Igawk Program721521 +Ref: Igawk Program-Footnote-1736553 +Ref: Igawk Program-Footnote-2736754 +Node: Anagram Program736892 +Node: Signature Program739990 +Node: Debugger741093 +Node: Debugging742004 +Node: Debugging Concepts742318 +Node: Debugging Terms744174 +Node: Awk Debugging746719 +Node: Sample dgawk session747611 +Node: dgawk invocation748103 +Node: Finding The Bug749285 +Node: List of Debugger Commands755770 +Node: Breakpoint Control757081 +Node: Dgawk Execution Control760557 +Node: Viewing And Changing Data763908 +Node: Dgawk Stack767217 +Node: Dgawk Info768677 +Node: Miscellaneous Dgawk Commands772625 +Node: Readline Support778056 +Node: Dgawk Limitations778883 +Node: Language History781022 +Node: V7/SVR3.1782454 +Node: SVR4784749 +Node: POSIX786191 +Node: BTL787189 +Node: POSIX/GNU787923 +Node: Common Extensions793109 +Node: Contributors794210 +Node: Installation798245 +Node: Gawk Distribution799139 +Node: Getting799623 +Node: Extracting800449 +Node: Distribution contents802140 +Node: Unix Installation807158 +Node: Quick Installation807775 +Node: Additional Configuration Options809737 +Node: Configuration Philosophy811214 +Node: Non-Unix Installation813556 +Node: PC Installation814014 +Node: PC Binary Installation815313 +Node: PC Compiling817161 +Node: PC Testing820105 +Node: PC Using821281 +Node: Cygwin825466 +Node: MSYS826463 +Node: VMS Installation826977 +Node: VMS Compilation827583 +Ref: VMS Compilation-Footnote-1828590 +Node: VMS Installation Details828648 +Node: VMS Running830283 +Node: VMS Old Gawk831890 +Node: Bugs832364 +Node: Other Versions836229 +Node: Notes841508 +Node: Compatibility Mode842200 +Node: Additions842983 +Node: Accessing The 
Source843795 +Node: Adding Code845218 +Node: New Ports850766 +Node: Dynamic Extensions854879 +Node: Internals856255 +Node: Plugin License865371 +Node: Sample Library866005 +Node: Internal File Description866691 +Node: Internal File Ops870398 +Ref: Internal File Ops-Footnote-1875166 +Node: Using Internal File Ops875314 +Node: Future Extensions877691 +Node: Basic Concepts880195 +Node: Basic High Level880952 +Ref: Basic High Level-Footnote-1884987 +Node: Basic Data Typing885172 +Node: Floating Point Issues889697 +Node: String Conversion Precision890780 +Ref: String Conversion Precision-Footnote-1892474 +Node: Unexpected Results892583 +Node: POSIX Floating Point Problems894409 +Ref: POSIX Floating Point Problems-Footnote-1898111 +Node: Glossary898149 +Node: Copying922292 +Node: GNU Free Documentation License959849 +Node: Index984986 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 7c63476f..0b410fc1 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -5138,7 +5138,7 @@ will give you much better performance when reading records. Otherwise, @command{gawk} has to make several function calls, @emph{per input character}, to find the record terminator. -According to POSIX, string conmparison is also affected by locales +According to POSIX, string comparison is also affected by locales (similar to regular expressions). The details are presented in @ref{POSIX String Comparison}. @@ -12756,17 +12756,30 @@ The parent process ID of the current process. @item PROCINFO["sorted_in"] If this element exists in @code{PROCINFO}, its value controls the order in which array indices will be processed by -@samp{for(i in arr) @dots{}} loops. -A value of @code{"ascending index string"}, which may be shortened to -@code{"ascending index"} or just @code{"ascending"}, will result in either -case sensitive or case insensitive ascending order depending upon -the value of @code{IGNORECASE}. 
-A value of @code{"descending index string"}, which may be shortened in
-a similar manner, will result in the opposite order.
-The value @code{"unsorted"} is also recognized, yielding the default
-result of arbitrary order. Any other value will be ignored, and
-warned about (at the time of first @samp{for(in in arr) @dots{}}
-execution) when lint checking is enabled.
+@samp{for (index in array) @dots{}} loops.
+The value should contain one to three words; separate pairs of words
+by a single space.
+One word controls sort direction, ``ascending'' or ``descending;''
+another controls the sort key, ``index'' or ``value;'' and the remaining
+one, which is only valid for sorting by index, is comparison mode,
+``string'' or ``number.'' When two or three words are present, they may
+be specified in any order, so @samp{ascending index string} and
+@samp{string ascending index} are equivalent. Also, each word may
+be truncated, so @samp{asc index str} and @samp{a i s} are also
+equivalent. Note that a separating space is required even when the
+words have been shortened down to one letter each.
+
+You can omit direction and/or key type and/or comparison mode. Provided
+that at least one is present, missing parts of a sort specification
+default to @samp{ascending}, @samp{index}, and (for indices only) @samp{string},
+respectively.
+An empty string, @code{""}, is the same as @samp{unsorted} and will cause
+@samp{for (index in array) @dots{}} to process the indices in
+arbitrary order. Another thing to note is that the array sorting
+takes place at the time @samp{for (@dots{} in @dots{})} is about to
+start executing, so changing the value of @code{PROCINFO["sorted_in"]}
+during loop execution does not have any effect on the order in which any
+remaining array elements get processed.
 @xref{Scanning an Array}, for more information.
 @item PROCINFO["strftime"]
@@ -13439,14 +13452,43 @@ strange results. It is best to avoid such things.
 As an extension, @command{gawk} makes it possible for you to
 loop over the elements of an array in order, based on the value of
 @code{PROCINFO["sorted_in"]} (@pxref{Auto-set}).
-At present two sorting options are available: @code{"ascending
-index string"} and @code{"descending index string"}. They can be
-shortened by omitting @samp{string} or @samp{index string}. The value
-@code{"unsorted"} can be used as an explicit ``no-op'' and yields the same
-result as when @code{PROCINFO["sorted_in"]} has no value at all. If the
-index strings contain letters, the value of @code{IGNORECASE} affects
-the order of the result. This extension is disabled in POSIX mode,
-since the @code{PROCINFO} array is not special in that case. For example:
+Several sorting options are available:
+
+@table @code
+@item "ascending index string"
+Order by indices compared as strings, the most basic sort.
+(Internally, array indices are always strings, so with @code{a[2*5] = 1}
+the index is actually @code{"10"} rather than numeric 10.)
+
+@item "ascending index number"
+Order by indices but force them to be treated as numbers in the process.
+Any index with non-numeric value will end up positioned as if it were 0.
+
+@item "ascending value"
+Order by element values rather than by indices. Comparisons are done
+as numeric when both values being compared are numeric, or done as
+strings when either or both aren't numeric. Sub-arrays, if present,
+come out last.
+
+@item "descending index string"
+Reverse order from the most basic sort.
+
+@item "descending index number"
+Numeric indices ordered from high to low.
+
+@item "descending value"
+Element values ordered from high to low. Sub-arrays, if present,
+come out first.
+
+@item "unsorted"
+Array elements are processed in arbitrary order, the normal @command{awk}
+behavior.
+@end table
+
+Portions of the sort specification string may be truncated or omitted.
+The default is @samp{ascending} for direction, @samp{index} for sort key type,
+and (when sorting by index only) @samp{string} for comparison mode.
+For example:
 @example
 $ @kbd{gawk 'BEGIN @{}
@@ -13458,7 +13500,7 @@ $ @kbd{gawk 'BEGIN @{}
 @print{} 4 4
 @print{} 3 3
 $ @kbd{gawk 'BEGIN @{}
-> @kbd{ PROCINFO["sorted_in"] = "ascending index"}
+> @kbd{ PROCINFO["sorted_in"] = "asc index"}
 > @kbd{ a[4] = 4}
 > @kbd{ a[3] = 3}
 > @kbd{ for (i in a)}
@@ -13476,6 +13518,26 @@ sorted array traversal is not the default.
 @c maintainers believe that only the people who wish to use a
 @c feature should have to pay for it.
+When sorting an array by element values, if a value happens to be
+a sub-array then it is considered to be greater than any string or
+numeric value, regardless of what the sub-array itself contains,
+and all sub-arrays are treated as being equal to each other. Their
+order relative to each other is determined by their index strings.
+
+Sorting by array element values (for values other than sub-arrays)
+always uses basic @command{awk} comparison mode: if both values
+happen to be numbers then they're compared as numbers, otherwise
+they're compared as strings.
+
+When string comparisons are made during a sort, either for element
+values where one or both aren't numbers or for element indices
+handled as strings, the value of @code{IGNORECASE} controls whether
+the comparisons treat corresponding upper and lower case letters as
+equivalent or distinct.
+
+This sorting extension is disabled in POSIX mode,
+since the @code{PROCINFO} array is not special in that case.
+
 @node Delete
 @section The @code{delete} Statement
 @cindex @code{delete} statement
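
Reviewer note: the word rules that the gawk.texi hunks above document for PROCINFO["sorted_in"] (one to three words separated by single spaces, in any order, each word truncatable, with defaults of ascending, index, and string) can be modeled with the rough Python sketch below. It is only an illustration of the documented rules, not gawk's actual parser. The function name parse_sort_spec is hypothetical, and three behaviors are assumptions the patch text does not settle: "unsorted" is accepted only when spelled out in full, a comparison mode combined with "value" is rejected, and unrecognized or repeated words raise an error (the older text only says such values are ignored, with a lint warning).

```python
# Hypothetical model of the PROCINFO["sorted_in"] word rules in the patch.
DIRECTIONS = ("ascending", "descending")
KEYS = ("index", "value")
MODES = ("string", "number")


def parse_sort_spec(spec):
    """Return (direction, key, mode), or None for arbitrary (unsorted) order."""
    # Documented: an empty string is the same as "unsorted".
    # Assumption: "unsorted" itself must be spelled out in full.
    if spec == "" or spec == "unsorted":
        return None

    # Documented: one to three words, separated by a single space.
    words = spec.split(" ")
    if not 1 <= len(words) <= 3 or "" in words:
        raise ValueError("expected one to three single-spaced words")

    slots = (("direction", DIRECTIONS), ("key", KEYS), ("mode", MODES))
    chosen = {}
    for word in words:
        for slot, candidates in slots:
            # Documented: each word may be truncated, so match by prefix.
            # The six words start with distinct letters, so a prefix
            # can match at most one candidate.
            matches = [c for c in candidates if c.startswith(word)]
            if matches:
                if slot in chosen:
                    raise ValueError("duplicate " + slot + ": " + word)
                chosen[slot] = matches[0]
                break
        else:
            # Assumption: unknown words are an error in this model.
            raise ValueError("unrecognized word: " + word)

    # Documented defaults for omitted parts.
    direction = chosen.get("direction", "ascending")
    key = chosen.get("key", "index")
    if key == "value":
        # Documented: comparison mode is only valid for sorting by index.
        # Assumption: this model rejects it rather than ignoring it.
        if "mode" in chosen:
            raise ValueError("comparison mode applies to index sorts only")
        return (direction, key, None)
    return (direction, key, chosen.get("mode", "string"))
```

For instance, parse_sort_spec("a i s") and parse_sort_spec("string ascending index") both resolve to the same triple as the fully spelled "ascending index string", which matches the equivalences the patch text lists.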