-rw-r--r--  awklib/eg/prog/anagram.awk    |   6
-rw-r--r--  awklib/eg/prog/extract.awk    |   2
-rw-r--r--  awklib/eg/prog/translate.awk  |   2
-rw-r--r--  doc/ChangeLog                 |   4
-rw-r--r--  doc/gawk.info                 | 586
-rw-r--r--  doc/gawk.texi                 |  97
-rw-r--r--  doc/gawktexi.in               |  97
7 files changed, 387 insertions, 407 deletions
diff --git a/awklib/eg/prog/anagram.awk b/awklib/eg/prog/anagram.awk
index 7ca14559..df2768d9 100644
--- a/awklib/eg/prog/anagram.awk
+++ b/awklib/eg/prog/anagram.awk
@@ -1,5 +1,5 @@
-# anagram.awk --- An implementation of the anagram finding algorithm
-# from Jon Bentley's "Programming Pearls", 2nd edition.
+# anagram.awk --- An implementation of the anagram-finding algorithm
+# from Jon Bentley's "Programming Pearls," 2nd edition.
 # Addison Wesley, 2000, ISBN 0-201-65788-0.
 # Column 2, Problem C, section 2.8, pp 18-20.
 #
@@ -21,7 +21,7 @@
     key = word2key($1) # Build signature
     data[key][$1] = $1 # Store word with signature
 }
-# word2key --- split word apart into letters, sort, joining back together
+# word2key --- split word apart into letters, sort, and join back together
 
 function word2key(word, a, i, n, result)
 {
diff --git a/awklib/eg/prog/extract.awk b/awklib/eg/prog/extract.awk
index 24f40ce5..f5dfcf40 100644
--- a/awklib/eg/prog/extract.awk
+++ b/awklib/eg/prog/extract.awk
@@ -1,4 +1,4 @@
-# extract.awk --- extract files and run programs from texinfo files
+# extract.awk --- extract files and run programs from Texinfo files
 #
 # Arnold Robbins, arnold@skeeve.com, Public Domain
 # May 1993
diff --git a/awklib/eg/prog/translate.awk b/awklib/eg/prog/translate.awk
index cf7f3897..e7403717 100644
--- a/awklib/eg/prog/translate.awk
+++ b/awklib/eg/prog/translate.awk
@@ -4,7 +4,7 @@
 # August 1989
 # February 2009 - bug fix
 
-# Bugs: does not handle things like: tr A-Z a-z, it has
+# Bugs: does not handle things like tr A-Z a-z; it has
 # to be spelled out. However, if `to' is shorter than `from',
 # the last character in `to' is used for the rest of `from'.
 
diff --git a/doc/ChangeLog b/doc/ChangeLog
index b4eb2c70..95022eec 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2015-02-04 Arnold D. Robbins <arnold@skeeve.com>
+
+	* gawktexi.in: O'Reilly fixes.
+
 2015-02-02 Arnold D. Robbins <arnold@skeeve.com>
 
 	* gawktexi.in: O'Reilly fixes.
diff --git a/doc/gawk.info b/doc/gawk.info index bebdf4bf..8e6a4f29 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -16441,7 +16441,7 @@ you. to replace the installed versions on your system. Nor may all of these programs be fully compliant with the most recent POSIX standard. This is not a problem; their purpose is to illustrate `awk' language -programming for "real world" tasks. +programming for "real-world" tasks. The programs are presented in alphabetical order. @@ -16467,7 +16467,7 @@ separated by TABs by default, but you may supply a command-line option to change the field "delimiter" (i.e., the field-separator character). `cut''s definition of fields is less general than `awk''s. - A common use of `cut' might be to pull out just the login name of + A common use of `cut' might be to pull out just the login names of logged-on users from the output of `who'. For example, the following pipeline generates a sorted, unique list of the logged-on users: @@ -16876,7 +16876,7 @@ unsuccessful match. If the line does not match, the `next' statement just moves on to the next record. A number of additional tests are made, but they are only done if we -are not counting lines. First, if the user only wants exit status +are not counting lines. First, if the user only wants the exit status (`no_print' is true), then it is enough to know that _one_ line in this file matched, and we can skip on to the next file with `nextfile'. Similarly, if we are only printing file names, we can print the file @@ -16910,7 +16910,7 @@ line is printed, with a leading file name and colon if necessary: } The `END' rule takes care of producing the correct exit status. If -there are no matches, the exit status is one; otherwise it is zero: +there are no matches, the exit status is one; otherwise, it is zero: END { exit (total == 0) @@ -16952,7 +16952,8 @@ a more palatable output than just individual numbers. Here is a simple version of `id' written in `awk'. 
It uses the user database library functions (*note Passwd Functions::) and the group -database library functions (*note Group Functions::): +database library functions (*note Group Functions::) from *note Library +Functions::. The program is fairly straightforward. All the work is done in the `BEGIN' rule. The user and group ID numbers are obtained from @@ -17049,8 +17050,8 @@ is as follows:(1) By default, the output files are named `xaa', `xab', and so on. Each file has 1,000 lines in it, with the likely exception of the last file. To change the number of lines in each file, supply a number on the -command line preceded with a minus (e.g., `-500' for files with 500 -lines in them instead of 1,000). To change the name of the output +command line preceded with a minus sign (e.g., `-500' for files with +500 lines in them instead of 1,000). To change the names of the output files to something like `myfileaa', `myfileab', and so on, supply an additional argument that specifies the file name prefix. @@ -17687,7 +17688,7 @@ checking and setting of defaults: the delay, the count, and the message to print. If the user supplied a message without the ASCII BEL character (known as the "alert" character, `"\a"'), then it is added to the message. (On many systems, printing the ASCII BEL generates an -audible alert. Thus when the alarm goes off, the system calls attention +audible alert. Thus, when the alarm goes off, the system calls attention to itself in case the user is not looking at the computer.) Just for a change, this program uses a `switch' statement (*note Switch Statement::), but the processing could be done with a series of @@ -17819,7 +17820,7 @@ the "from" list. Once upon a time, a user proposed adding a transliteration function to `gawk'. The following program was written to prove that character transliteration could be done with a user-level function. 
This program -is not as complete as the system `tr' utility but it does most of the +is not as complete as the system `tr' utility, but it does most of the job. The `translate' program was written long before `gawk' acquired the @@ -17829,13 +17830,13 @@ and `gsub()' built-in functions (*note String Functions::). There are two functions. The first, `stranslate()', takes three arguments: `from' - A list of characters from which to translate. + A list of characters from which to translate `to' - A list of characters to which to translate. + A list of characters to which to translate `target' - The string on which to do the translation. + The string on which to do the translation Associative arrays make the translation part fairly easy. `t_ar' holds the "to" characters, indexed by the "from" characters. Then a @@ -17843,7 +17844,7 @@ simple loop goes through `from', one character at a time. For each character in `from', if the character appears in `target', it is replaced with the corresponding `to' character. - The `translate()' function calls `stranslate()' using `$0' as the + The `translate()' function calls `stranslate()', using `$0' as the target. The main program sets two global variables, `FROM' and `TO', from the command line, and then changes `ARGV' so that `awk' reads from the standard input. @@ -17852,7 +17853,7 @@ the standard input. record: # translate.awk --- do tr-like stuff - # Bugs: does not handle things like: tr A-Z a-z, it has + # Bugs: does not handle things like tr A-Z a-z; it has # to be spelled out. However, if `to' is shorter than `from', # the last character in `to' is used for the rest of `from'. @@ -17930,13 +17931,13 @@ File: gawk.info, Node: Labels Program, Next: Word Sorting, Prev: Translate Pr 11.3.4 Printing Mailing Labels ------------------------------ -Here is a "real world"(1) program. This script reads lists of names and +Here is a "real-world"(1) program. This script reads lists of names and addresses and generates mailing labels. 
Each page of labels has 20 labels on it, two across and 10 down. The addresses are guaranteed to be no more than five lines of data. Each address is separated from the next by a blank line. - The basic idea is to read 20 labels worth of data. Each line of + The basic idea is to read 20 labels' worth of data. Each line of each label is stored in the `line' array. The single rule takes care of filling the `line' array and printing the page when 20 labels have been read. @@ -17948,13 +17949,13 @@ splits records at blank lines (*note Records::). It sets `MAXLINES' to Most of the work is done in the `printpage()' function. The label lines are stored sequentially in the `line' array. But they have to -print horizontally; `line[1]' next to `line[6]', `line[2]' next to +print horizontally: `line[1]' next to `line[6]', `line[2]' next to `line[7]', and so on. Two loops accomplish this. The outer loop, controlled by `i', steps through every 10 lines of data; this is each row of labels. The inner loop, controlled by `j', goes through the -lines within the row. As `j' goes from 0 to 4, `i+j' is the `j'-th -line in the row, and `i+j+5' is the entry next to it. The output ends -up looking something like this: +lines within the row. As `j' goes from 0 to 4, `i+j' is the `j'th line +in the row, and `i+j+5' is the entry next to it. The output ends up +looking something like this: line 1 line 6 line 2 line 7 @@ -18057,8 +18058,8 @@ a useful format. printf "%s\t%d\n", word, freq[word] } - The program relies on `awk''s default field splitting mechanism to -break each line up into "words," and uses an associative array named + The program relies on `awk''s default field-splitting mechanism to +break each line up into "words" and uses an associative array named `freq', indexed by each word, to count the number of times the word occurs. In the `END' rule, it prints the counts. 
@@ -18144,7 +18145,7 @@ File: gawk.info, Node: History Sorting, Next: Extract Program, Prev: Word Sor 11.3.6 Removing Duplicates from Unsorted Text --------------------------------------------- -The `uniq' program (*note Uniq Program::), removes duplicate lines from +The `uniq' program (*note Uniq Program::) removes duplicate lines from _sorted_ data. Suppose, however, you need to remove duplicate lines from a data @@ -18197,7 +18198,7 @@ hand. Here we present a program that can extract parts of a Texinfo input file into separate files. This Info file is written in Texinfo -(http://www.gnu.org/software/texinfo/), the GNU project's document +(http://www.gnu.org/software/texinfo/), the GNU Project's document formatting language. A single Texinfo source file can be used to produce both printed documentation, with TeX, and online documentation. (The Texinfo language is described fully, starting with *note @@ -18238,7 +18239,7 @@ them in a standard directory where `gawk' can find them. The Texinfo file looks something like this: ... - This program has a @code{BEGIN} rule, + This program has a @code{BEGIN} rule that prints a nice message: @example @@ -18263,7 +18264,7 @@ upper- and lowercase letters in the directives won't matter. given (`NF' is at least three) and also checking that the command exits with a zero exit status, signifying OK: - # extract.awk --- extract files and run programs from texinfo files + # extract.awk --- extract files and run programs from Texinfo files BEGIN { IGNORECASE = 1 } @@ -18290,11 +18291,11 @@ The variable `e' is used so that the rule fits nicely on the screen. file name is given in the directive. If the file named is not the current file, then the current file is closed. Keeping the current file open until a new file is encountered allows the use of the `>' -redirection for printing the contents, keeping open file management +redirection for printing the contents, keeping open-file management simple. The `for' loop does the work. 
It reads lines using `getline' (*note -Getline::). For an unexpected end of file, it calls the +Getline::). For an unexpected end-of-file, it calls the `unexpected_eof()' function. If the line is an "endfile" line, then it breaks out of the loop. If the line is an `@group' or `@end group' line, then it ignores it and goes on to the next line. Similarly, @@ -18384,10 +18385,10 @@ File: gawk.info, Node: Simple Sed, Next: Igawk Program, Prev: Extract Program 11.3.8 A Simple Stream Editor ----------------------------- -The `sed' utility is a stream editor, a program that reads a stream of -data, makes changes to it, and passes it on. It is often used to make -global changes to a large file or to a stream of data generated by a -pipeline of commands. Although `sed' is a complicated program in its +The `sed' utility is a "stream editor", a program that reads a stream +of data, makes changes to it, and passes it on. It is often used to +make global changes to a large file or to a stream of data generated by +a pipeline of commands. Although `sed' is a complicated program in its own right, its most common use is to perform global substitutions in the middle of a pipeline: @@ -18501,7 +18502,7 @@ include a library function twice. `igawk' should behave just like `gawk' externally. This means it should accept all of `gawk''s command-line arguments, including the -ability to have multiple source files specified via `-f', and the +ability to have multiple source files specified via `-f' and the ability to mix command-line and library source files. The program is written using the POSIX Shell (`sh') command @@ -18531,8 +18532,8 @@ language.(1) It works as follows: file names). 
This program uses shell variables extensively: for storing -command-line arguments, the text of the `awk' program that will expand -the user's program, for the user's original program, and for the +command-line arguments and the text of the `awk' program that will +expand the user's program, for the user's original program, and for the expanded program. Doing so removes some potential problems that might arise were we to use temporary files instead, at the cost of making the script somewhat more complicated. @@ -18790,7 +18791,7 @@ It's done in these steps: The last step is to call `gawk' with the expanded program, along with the original options and command-line arguments that the user -supplied. +supplied: eval gawk $opts -- '"$processed_program"' '"$@"' @@ -18853,15 +18854,15 @@ One word is an anagram of another if both words contain the same letters Column 2, Problem C, of Jon Bentley's `Programming Pearls', Second Edition, presents an elegant algorithm. The idea is to give words that are anagrams a common signature, sort all the words together by their -signature, and then print them. Dr. Bentley observes that taking the -letters in each word and sorting them produces that common signature. +signatures, and then print them. Dr. Bentley observes that taking the +letters in each word and sorting them produces those common signatures. The following program uses arrays of arrays to bring together words with the same signature and array sorting to print the words in sorted order: - # anagram.awk --- An implementation of the anagram finding algorithm - # from Jon Bentley's "Programming Pearls", 2nd edition. + # anagram.awk --- An implementation of the anagram-finding algorithm + # from Jon Bentley's "Programming Pearls," 2nd edition. # Addison Wesley, 2000, ISBN 0-201-65788-0. # Column 2, Problem C, section 2.8, pp 18-20. 
@@ -18881,7 +18882,7 @@ signature; the second dimension is the word itself: apart into individual letters, sorts the letters, and then joins them back together: - # word2key --- split word apart into letters, sort, joining back together + # word2key --- split word apart into letters, sort, and join back together function word2key(word, a, i, n, result) { @@ -18979,12 +18980,13 @@ File: gawk.info, Node: Programs Summary, Next: Programs Exercises, Prev: Misc characters. The ability to use `split()' with the empty string as the separator can considerably simplify such tasks. - * The library functions from *note Library Functions::, proved their - usefulness for a number of real (if small) programs. + * The examples here demonstrate the usefulness of the library + functions from *note Library Functions:: for a number of real (if + small) programs. * Besides reinventing POSIX wheels, other programs solved a - selection of interesting problems, such as finding duplicates - words in text, printing mailing labels, and finding anagrams. + selection of interesting problems, such as finding duplicate words + in text, printing mailing labels, and finding anagrams. @@ -33119,7 +33121,7 @@ Index * hyphen (-), in bracket expressions: Bracket Expressions. (line 17) * i debugger command (alias for info): Debugger Info. (line 13) * id utility: Id Program. (line 6) -* id.awk program: Id Program. (line 30) +* id.awk program: Id Program. (line 31) * if statement: If Statement. (line 6) * if statement, actions, changing: Ranges. (line 25) * if statement, use of regexps in: Regexp Usage. 
(line 19) @@ -34783,251 +34785,251 @@ Node: Sample Programs675577 Node: Running Examples676347 Node: Clones677075 Node: Cut Program678299 -Node: Egrep Program688018 -Ref: Egrep Program-Footnote-1695516 -Node: Id Program695626 -Node: Split Program699271 -Ref: Split Program-Footnote-1702719 -Node: Tee Program702847 -Node: Uniq Program705636 -Node: Wc Program713055 -Ref: Wc Program-Footnote-1717305 -Node: Miscellaneous Programs717399 -Node: Dupword Program718612 -Node: Alarm Program720643 -Node: Translate Program725447 -Ref: Translate Program-Footnote-1730012 -Node: Labels Program730282 -Ref: Labels Program-Footnote-1733633 -Node: Word Sorting733717 -Node: History Sorting737788 -Node: Extract Program739624 -Node: Simple Sed747149 -Node: Igawk Program750217 -Ref: Igawk Program-Footnote-1764541 -Ref: Igawk Program-Footnote-2764742 -Ref: Igawk Program-Footnote-3764864 -Node: Anagram Program764979 -Node: Signature Program768036 -Node: Programs Summary769283 -Node: Programs Exercises770476 -Ref: Programs Exercises-Footnote-1774607 -Node: Advanced Features774698 -Node: Nondecimal Data776646 -Node: Array Sorting778236 -Node: Controlling Array Traversal778933 -Ref: Controlling Array Traversal-Footnote-1787266 -Node: Array Sorting Functions787384 -Ref: Array Sorting Functions-Footnote-1791273 -Node: Two-way I/O791469 -Ref: Two-way I/O-Footnote-1796414 -Ref: Two-way I/O-Footnote-2796600 -Node: TCP/IP Networking796682 -Node: Profiling799555 -Node: Advanced Features Summary807102 -Node: Internationalization809035 -Node: I18N and L10N810515 -Node: Explaining gettext811201 -Ref: Explaining gettext-Footnote-1816226 -Ref: Explaining gettext-Footnote-2816410 -Node: Programmer i18n816575 -Ref: Programmer i18n-Footnote-1821441 -Node: Translator i18n821490 -Node: String Extraction822284 -Ref: String Extraction-Footnote-1823415 -Node: Printf Ordering823501 -Ref: Printf Ordering-Footnote-1826287 -Node: I18N Portability826351 -Ref: I18N Portability-Footnote-1828806 -Node: I18N Example828869 
-Ref: I18N Example-Footnote-1831672 -Node: Gawk I18N831744 -Node: I18N Summary832382 -Node: Debugger833721 -Node: Debugging834743 -Node: Debugging Concepts835184 -Node: Debugging Terms837037 -Node: Awk Debugging839609 -Node: Sample Debugging Session840503 -Node: Debugger Invocation841023 -Node: Finding The Bug842407 -Node: List of Debugger Commands848882 -Node: Breakpoint Control850215 -Node: Debugger Execution Control853911 -Node: Viewing And Changing Data857275 -Node: Execution Stack860653 -Node: Debugger Info862290 -Node: Miscellaneous Debugger Commands866307 -Node: Readline Support871336 -Node: Limitations872228 -Node: Debugging Summary874342 -Node: Arbitrary Precision Arithmetic875510 -Node: Computer Arithmetic876926 -Ref: table-numeric-ranges880524 -Ref: Computer Arithmetic-Footnote-1881383 -Node: Math Definitions881440 -Ref: table-ieee-formats884728 -Ref: Math Definitions-Footnote-1885332 -Node: MPFR features885437 -Node: FP Math Caution887108 -Ref: FP Math Caution-Footnote-1888158 -Node: Inexactness of computations888527 -Node: Inexact representation889486 -Node: Comparing FP Values890843 -Node: Errors accumulate891925 -Node: Getting Accuracy893358 -Node: Try To Round896020 -Node: Setting precision896919 -Ref: table-predefined-precision-strings897603 -Node: Setting the rounding mode899392 -Ref: table-gawk-rounding-modes899756 -Ref: Setting the rounding mode-Footnote-1903211 -Node: Arbitrary Precision Integers903390 -Ref: Arbitrary Precision Integers-Footnote-1906376 -Node: POSIX Floating Point Problems906525 -Ref: POSIX Floating Point Problems-Footnote-1910398 -Node: Floating point summary910436 -Node: Dynamic Extensions912630 -Node: Extension Intro914182 -Node: Plugin License915448 -Node: Extension Mechanism Outline916245 -Ref: figure-load-extension916673 -Ref: figure-register-new-function918153 -Ref: figure-call-new-function919157 -Node: Extension API Description921143 -Node: Extension API Functions Introduction922593 -Node: General Data Types927417 -Ref: 
General Data Types-Footnote-1933156 -Node: Memory Allocation Functions933455 -Ref: Memory Allocation Functions-Footnote-1936294 -Node: Constructor Functions936390 -Node: Registration Functions938124 -Node: Extension Functions938809 -Node: Exit Callback Functions941106 -Node: Extension Version String942354 -Node: Input Parsers943019 -Node: Output Wrappers952898 -Node: Two-way processors957413 -Node: Printing Messages959617 -Ref: Printing Messages-Footnote-1960693 -Node: Updating `ERRNO'960845 -Node: Requesting Values961585 -Ref: table-value-types-returned962313 -Node: Accessing Parameters963270 -Node: Symbol Table Access964501 -Node: Symbol table by name965015 -Node: Symbol table by cookie966996 -Ref: Symbol table by cookie-Footnote-1971140 -Node: Cached values971203 -Ref: Cached values-Footnote-1974702 -Node: Array Manipulation974793 -Ref: Array Manipulation-Footnote-1975891 -Node: Array Data Types975928 -Ref: Array Data Types-Footnote-1978583 -Node: Array Functions978675 -Node: Flattening Arrays982529 -Node: Creating Arrays989421 -Node: Extension API Variables994192 -Node: Extension Versioning994828 -Node: Extension API Informational Variables996729 -Node: Extension API Boilerplate997794 -Node: Finding Extensions1001603 -Node: Extension Example1002163 -Node: Internal File Description1002935 -Node: Internal File Ops1007002 -Ref: Internal File Ops-Footnote-11018672 -Node: Using Internal File Ops1018812 -Ref: Using Internal File Ops-Footnote-11021195 -Node: Extension Samples1021468 -Node: Extension Sample File Functions1022994 -Node: Extension Sample Fnmatch1030632 -Node: Extension Sample Fork1032123 -Node: Extension Sample Inplace1033338 -Node: Extension Sample Ord1035013 -Node: Extension Sample Readdir1035849 -Ref: table-readdir-file-types1036725 -Node: Extension Sample Revout1037536 -Node: Extension Sample Rev2way1038126 -Node: Extension Sample Read write array1038866 -Node: Extension Sample Readfile1040806 -Node: Extension Sample Time1041901 -Node: Extension 
Sample API Tests1043250 -Node: gawkextlib1043741 -Node: Extension summary1046399 -Node: Extension Exercises1050088 -Node: Language History1050810 -Node: V7/SVR3.11052466 -Node: SVR41054647 -Node: POSIX1056092 -Node: BTL1057481 -Node: POSIX/GNU1058215 -Node: Feature History1063779 -Node: Common Extensions1076877 -Node: Ranges and Locales1078201 -Ref: Ranges and Locales-Footnote-11082819 -Ref: Ranges and Locales-Footnote-21082846 -Ref: Ranges and Locales-Footnote-31083080 -Node: Contributors1083301 -Node: History summary1088842 -Node: Installation1090212 -Node: Gawk Distribution1091158 -Node: Getting1091642 -Node: Extracting1092465 -Node: Distribution contents1094100 -Node: Unix Installation1099817 -Node: Quick Installation1100434 -Node: Additional Configuration Options1102858 -Node: Configuration Philosophy1104596 -Node: Non-Unix Installation1106965 -Node: PC Installation1107423 -Node: PC Binary Installation1108742 -Node: PC Compiling1110590 -Ref: PC Compiling-Footnote-11113611 -Node: PC Testing1113720 -Node: PC Using1114896 -Node: Cygwin1119011 -Node: MSYS1119834 -Node: VMS Installation1120334 -Node: VMS Compilation1121126 -Ref: VMS Compilation-Footnote-11122348 -Node: VMS Dynamic Extensions1122406 -Node: VMS Installation Details1124090 -Node: VMS Running1126342 -Node: VMS GNV1129178 -Node: VMS Old Gawk1129912 -Node: Bugs1130382 -Node: Other Versions1134265 -Node: Installation summary1140689 -Node: Notes1141745 -Node: Compatibility Mode1142610 -Node: Additions1143392 -Node: Accessing The Source1144317 -Node: Adding Code1145752 -Node: New Ports1151909 -Node: Derived Files1156391 -Ref: Derived Files-Footnote-11161866 -Ref: Derived Files-Footnote-21161900 -Ref: Derived Files-Footnote-31162496 -Node: Future Extensions1162610 -Node: Implementation Limitations1163216 -Node: Extension Design1164464 -Node: Old Extension Problems1165618 -Ref: Old Extension Problems-Footnote-11167135 -Node: Extension New Mechanism Goals1167192 -Ref: Extension New Mechanism 
Goals-Footnote-11170552 -Node: Extension Other Design Decisions1170741 -Node: Extension Future Growth1172849 -Node: Old Extension Mechanism1173685 -Node: Notes summary1175447 -Node: Basic Concepts1176633 -Node: Basic High Level1177314 -Ref: figure-general-flow1177586 -Ref: figure-process-flow1178185 -Ref: Basic High Level-Footnote-11181414 -Node: Basic Data Typing1181599 -Node: Glossary1184927 -Node: Copying1216856 -Node: GNU Free Documentation License1254412 -Node: Index1279548 +Node: Egrep Program688019 +Ref: Egrep Program-Footnote-1695522 +Node: Id Program695632 +Node: Split Program699308 +Ref: Split Program-Footnote-1702762 +Node: Tee Program702890 +Node: Uniq Program705679 +Node: Wc Program713098 +Ref: Wc Program-Footnote-1717348 +Node: Miscellaneous Programs717442 +Node: Dupword Program718655 +Node: Alarm Program720686 +Node: Translate Program725491 +Ref: Translate Program-Footnote-1730054 +Node: Labels Program730324 +Ref: Labels Program-Footnote-1733675 +Node: Word Sorting733759 +Node: History Sorting737829 +Node: Extract Program739664 +Node: Simple Sed747188 +Node: Igawk Program750258 +Ref: Igawk Program-Footnote-1764584 +Ref: Igawk Program-Footnote-2764785 +Ref: Igawk Program-Footnote-3764907 +Node: Anagram Program765022 +Node: Signature Program768083 +Node: Programs Summary769330 +Node: Programs Exercises770550 +Ref: Programs Exercises-Footnote-1774681 +Node: Advanced Features774772 +Node: Nondecimal Data776720 +Node: Array Sorting778310 +Node: Controlling Array Traversal779007 +Ref: Controlling Array Traversal-Footnote-1787340 +Node: Array Sorting Functions787458 +Ref: Array Sorting Functions-Footnote-1791347 +Node: Two-way I/O791543 +Ref: Two-way I/O-Footnote-1796488 +Ref: Two-way I/O-Footnote-2796674 +Node: TCP/IP Networking796756 +Node: Profiling799629 +Node: Advanced Features Summary807176 +Node: Internationalization809109 +Node: I18N and L10N810589 +Node: Explaining gettext811275 +Ref: Explaining gettext-Footnote-1816300 +Ref: Explaining 
gettext-Footnote-2816484 +Node: Programmer i18n816649 +Ref: Programmer i18n-Footnote-1821515 +Node: Translator i18n821564 +Node: String Extraction822358 +Ref: String Extraction-Footnote-1823489 +Node: Printf Ordering823575 +Ref: Printf Ordering-Footnote-1826361 +Node: I18N Portability826425 +Ref: I18N Portability-Footnote-1828880 +Node: I18N Example828943 +Ref: I18N Example-Footnote-1831746 +Node: Gawk I18N831818 +Node: I18N Summary832456 +Node: Debugger833795 +Node: Debugging834817 +Node: Debugging Concepts835258 +Node: Debugging Terms837111 +Node: Awk Debugging839683 +Node: Sample Debugging Session840577 +Node: Debugger Invocation841097 +Node: Finding The Bug842481 +Node: List of Debugger Commands848956 +Node: Breakpoint Control850289 +Node: Debugger Execution Control853985 +Node: Viewing And Changing Data857349 +Node: Execution Stack860727 +Node: Debugger Info862364 +Node: Miscellaneous Debugger Commands866381 +Node: Readline Support871410 +Node: Limitations872302 +Node: Debugging Summary874416 +Node: Arbitrary Precision Arithmetic875584 +Node: Computer Arithmetic877000 +Ref: table-numeric-ranges880598 +Ref: Computer Arithmetic-Footnote-1881457 +Node: Math Definitions881514 +Ref: table-ieee-formats884802 +Ref: Math Definitions-Footnote-1885406 +Node: MPFR features885511 +Node: FP Math Caution887182 +Ref: FP Math Caution-Footnote-1888232 +Node: Inexactness of computations888601 +Node: Inexact representation889560 +Node: Comparing FP Values890917 +Node: Errors accumulate891999 +Node: Getting Accuracy893432 +Node: Try To Round896094 +Node: Setting precision896993 +Ref: table-predefined-precision-strings897677 +Node: Setting the rounding mode899466 +Ref: table-gawk-rounding-modes899830 +Ref: Setting the rounding mode-Footnote-1903285 +Node: Arbitrary Precision Integers903464 +Ref: Arbitrary Precision Integers-Footnote-1906450 +Node: POSIX Floating Point Problems906599 +Ref: POSIX Floating Point Problems-Footnote-1910472 +Node: Floating point summary910510 +Node: 
Dynamic Extensions912704 +Node: Extension Intro914256 +Node: Plugin License915522 +Node: Extension Mechanism Outline916319 +Ref: figure-load-extension916747 +Ref: figure-register-new-function918227 +Ref: figure-call-new-function919231 +Node: Extension API Description921217 +Node: Extension API Functions Introduction922667 +Node: General Data Types927491 +Ref: General Data Types-Footnote-1933230 +Node: Memory Allocation Functions933529 +Ref: Memory Allocation Functions-Footnote-1936368 +Node: Constructor Functions936464 +Node: Registration Functions938198 +Node: Extension Functions938883 +Node: Exit Callback Functions941180 +Node: Extension Version String942428 +Node: Input Parsers943093 +Node: Output Wrappers952972 +Node: Two-way processors957487 +Node: Printing Messages959691 +Ref: Printing Messages-Footnote-1960767 +Node: Updating `ERRNO'960919 +Node: Requesting Values961659 +Ref: table-value-types-returned962387 +Node: Accessing Parameters963344 +Node: Symbol Table Access964575 +Node: Symbol table by name965089 +Node: Symbol table by cookie967070 +Ref: Symbol table by cookie-Footnote-1971214 +Node: Cached values971277 +Ref: Cached values-Footnote-1974776 +Node: Array Manipulation974867 +Ref: Array Manipulation-Footnote-1975965 +Node: Array Data Types976002 +Ref: Array Data Types-Footnote-1978657 +Node: Array Functions978749 +Node: Flattening Arrays982603 +Node: Creating Arrays989495 +Node: Extension API Variables994266 +Node: Extension Versioning994902 +Node: Extension API Informational Variables996803 +Node: Extension API Boilerplate997868 +Node: Finding Extensions1001677 +Node: Extension Example1002237 +Node: Internal File Description1003009 +Node: Internal File Ops1007076 +Ref: Internal File Ops-Footnote-11018746 +Node: Using Internal File Ops1018886 +Ref: Using Internal File Ops-Footnote-11021269 +Node: Extension Samples1021542 +Node: Extension Sample File Functions1023068 +Node: Extension Sample Fnmatch1030706 +Node: Extension Sample Fork1032197 +Node: 
Extension Sample Inplace1033412 +Node: Extension Sample Ord1035087 +Node: Extension Sample Readdir1035923 +Ref: table-readdir-file-types1036799 +Node: Extension Sample Revout1037610 +Node: Extension Sample Rev2way1038200 +Node: Extension Sample Read write array1038940 +Node: Extension Sample Readfile1040880 +Node: Extension Sample Time1041975 +Node: Extension Sample API Tests1043324 +Node: gawkextlib1043815 +Node: Extension summary1046473 +Node: Extension Exercises1050162 +Node: Language History1050884 +Node: V7/SVR3.11052540 +Node: SVR41054721 +Node: POSIX1056166 +Node: BTL1057555 +Node: POSIX/GNU1058289 +Node: Feature History1063853 +Node: Common Extensions1076951 +Node: Ranges and Locales1078275 +Ref: Ranges and Locales-Footnote-11082893 +Ref: Ranges and Locales-Footnote-21082920 +Ref: Ranges and Locales-Footnote-31083154 +Node: Contributors1083375 +Node: History summary1088916 +Node: Installation1090286 +Node: Gawk Distribution1091232 +Node: Getting1091716 +Node: Extracting1092539 +Node: Distribution contents1094174 +Node: Unix Installation1099891 +Node: Quick Installation1100508 +Node: Additional Configuration Options1102932 +Node: Configuration Philosophy1104670 +Node: Non-Unix Installation1107039 +Node: PC Installation1107497 +Node: PC Binary Installation1108816 +Node: PC Compiling1110664 +Ref: PC Compiling-Footnote-11113685 +Node: PC Testing1113794 +Node: PC Using1114970 +Node: Cygwin1119085 +Node: MSYS1119908 +Node: VMS Installation1120408 +Node: VMS Compilation1121200 +Ref: VMS Compilation-Footnote-11122422 +Node: VMS Dynamic Extensions1122480 +Node: VMS Installation Details1124164 +Node: VMS Running1126416 +Node: VMS GNV1129252 +Node: VMS Old Gawk1129986 +Node: Bugs1130456 +Node: Other Versions1134339 +Node: Installation summary1140763 +Node: Notes1141819 +Node: Compatibility Mode1142684 +Node: Additions1143466 +Node: Accessing The Source1144391 +Node: Adding Code1145826 +Node: New Ports1151983 +Node: Derived Files1156465 +Ref: Derived 
Files-Footnote-11161940 +Ref: Derived Files-Footnote-21161974 +Ref: Derived Files-Footnote-31162570 +Node: Future Extensions1162684 +Node: Implementation Limitations1163290 +Node: Extension Design1164538 +Node: Old Extension Problems1165692 +Ref: Old Extension Problems-Footnote-11167209 +Node: Extension New Mechanism Goals1167266 +Ref: Extension New Mechanism Goals-Footnote-11170626 +Node: Extension Other Design Decisions1170815 +Node: Extension Future Growth1172923 +Node: Old Extension Mechanism1173759 +Node: Notes summary1175521 +Node: Basic Concepts1176707 +Node: Basic High Level1177388 +Ref: figure-general-flow1177660 +Ref: figure-process-flow1178259 +Ref: Basic High Level-Footnote-11181488 +Node: Basic Data Typing1181673 +Node: Glossary1185001 +Node: Copying1216930 +Node: GNU Free Documentation License1254486 +Node: Index1279622 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index abc9fa9c..9f06740c 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -23148,10 +23148,10 @@ in this @value{CHAPTER}. The second presents @command{awk} versions of several common POSIX utilities. These are programs that you are hopefully already familiar with, -and therefore, whose problems are understood. +and therefore whose problems are understood. By reimplementing these programs in @command{awk}, you can focus on the @command{awk}-related aspects of solving -the programming problem. +the programming problems. The third is a grab bag of interesting programs. These solve a number of different data-manipulation and management @@ -23211,7 +23211,7 @@ It should be noted that these programs are not necessarily intended to replace the installed versions on your system. Nor may all of these programs be fully compliant with the most recent POSIX standard. This is not a problem; their -purpose is to illustrate @command{awk} language programming for ``real world'' +purpose is to illustrate @command{awk} language programming for ``real-world'' tasks. 
The programs are presented in alphabetical order. @@ -23240,7 +23240,7 @@ but you may supply a command-line option to change the field @dfn{delimiter} (i.e., the field-separator character). @command{cut}'s definition of fields is less general than @command{awk}'s. -A common use of @command{cut} might be to pull out just the login name of +A common use of @command{cut} might be to pull out just the login names of logged-on users from the output of @command{who}. For example, the following pipeline generates a sorted, unique list of the logged-on users: @@ -23749,7 +23749,7 @@ successful or unsuccessful match. If the line does not match, the @code{next} statement just moves on to the next record. A number of additional tests are made, but they are only done if we -are not counting lines. First, if the user only wants exit status +are not counting lines. First, if the user only wants the exit status (@code{no_print} is true), then it is enough to know that @emph{one} line in this file matched, and we can skip on to the next file with @code{nextfile}. Similarly, if we are only printing @value{FN}s, we can @@ -23790,7 +23790,7 @@ if necessary: @end example The @code{END} rule takes care of producing the correct exit status. If -there are no matches, the exit status is one; otherwise it is zero: +there are no matches, the exit status is one; otherwise, it is zero: @example @c file eg/prog/egrep.awk @@ -23842,7 +23842,8 @@ Here is a simple version of @command{id} written in @command{awk}. It uses the user database library functions (@pxref{Passwd Functions}) and the group database library functions -(@pxref{Group Functions}): +(@pxref{Group Functions}) +from @ref{Library Functions}. The program is fairly straightforward. All the work is done in the @code{BEGIN} rule. The user and group ID numbers are obtained from @@ -23969,8 +23970,8 @@ By default, the output files are named @file{xaa}, @file{xab}, and so on. 
Each file has 1,000 lines in it, with the likely exception of the last file. To change the number of lines in each file, supply a number on the command line -preceded with a minus (e.g., @samp{-500} for files with 500 lines in them -instead of 1,000). To change the name of the output files to something like +preceded with a minus sign (e.g., @samp{-500} for files with 500 lines in them +instead of 1,000). To change the names of the output files to something like @file{myfileaa}, @file{myfileab}, and so on, supply an additional argument that specifies the @value{FN} prefix. @@ -24809,7 +24810,7 @@ checking and setting of defaults: the delay, the count, and the message to print. If the user supplied a message without the ASCII BEL character (known as the ``alert'' character, @code{"\a"}), then it is added to the message. (On many systems, printing the ASCII BEL generates an -audible alert. Thus when the alarm goes off, the system calls attention +audible alert. Thus, when the alarm goes off, the system calls attention to itself in case the user is not looking at the computer.) Just for a change, this program uses a @code{switch} statement (@pxref{Switch Statement}), but the processing could be done with a series of @@ -24978,7 +24979,7 @@ to @command{gawk}. @c at least theoretically The following program was written to prove that character transliteration could be done with a user-level -function. This program is not as complete as the system @command{tr} utility +function. This program is not as complete as the system @command{tr} utility, but it does most of the job. The @command{translate} program was written long before @command{gawk} @@ -24990,13 +24991,13 @@ takes three arguments: @table @code @item from -A list of characters from which to translate. +A list of characters from which to translate @item to -A list of characters to which to translate. +A list of characters to which to translate @item target -The string on which to do the translation. 
+The string on which to do the translation @end table Associative arrays make the translation part fairly easy. @code{t_ar} holds @@ -25005,7 +25006,7 @@ loop goes through @code{from}, one character at a time. For each character in @code{from}, if the character appears in @code{target}, it is replaced with the corresponding @code{to} character. -The @code{translate()} function calls @code{stranslate()} using @code{$0} +The @code{translate()} function calls @code{stranslate()}, using @code{$0} as the target. The main program sets two global variables, @code{FROM} and @code{TO}, from the command line, and then changes @code{ARGV} so that @command{awk} reads from the standard input. @@ -25027,7 +25028,7 @@ Finally, the processing rule simply calls @code{translate()} for each record: @c endfile @end ignore @c file eg/prog/translate.awk -# Bugs: does not handle things like: tr A-Z a-z, it has +# Bugs: does not handle things like tr A-Z a-z; it has # to be spelled out. However, if `to' is shorter than `from', # the last character in `to' is used for the rest of `from'. @@ -25103,7 +25104,7 @@ for inspiration. @cindex printing, mailing labels @cindex mailing labels@comma{} printing -Here is a ``real world''@footnote{``Real world'' is defined as +Here is a ``real-world''@footnote{``Real world'' is defined as ``a program actually used to get something done.''} program. This script reads lists of names and @@ -25112,7 +25113,7 @@ on it, two across and 10 down. The addresses are guaranteed to be no more than five lines of data. Each address is separated from the next by a blank line. -The basic idea is to read 20 labels worth of data. Each line of each label +The basic idea is to read 20 labels' worth of data. Each line of each label is stored in the @code{line} array. The single rule takes care of filling the @code{line} array and printing the page when 20 labels have been read. 
@@ -25135,12 +25136,12 @@ of lines on the page Most of the work is done in the @code{printpage()} function. The label lines are stored sequentially in the @code{line} array. But they -have to print horizontally; @code{line[1]} next to @code{line[6]}, +have to print horizontally: @code{line[1]} next to @code{line[6]}, @code{line[2]} next to @code{line[7]}, and so on. Two loops accomplish this. The outer loop, controlled by @code{i}, steps through every 10 lines of data; this is each row of labels. The inner loop, controlled by @code{j}, goes through the lines within the row. -As @code{j} goes from 0 to 4, @samp{i+j} is the @code{j}-th line in +As @code{j} goes from 0 to 4, @samp{i+j} is the @code{j}th line in the row, and @samp{i+j+5} is the entry next to it. The output ends up looking something like this: @@ -25258,8 +25259,8 @@ END @{ @} @end example -The program relies on @command{awk}'s default field splitting -mechanism to break each line up into ``words,'' and uses an +The program relies on @command{awk}'s default field-splitting +mechanism to break each line up into ``words'' and uses an associative array named @code{freq}, indexed by each word, to count the number of times the word occurs. In the @code{END} rule, it prints the counts. @@ -25364,7 +25365,7 @@ to use the @command{sort} program. @cindex lines, duplicate@comma{} removing The @command{uniq} program -(@pxref{Uniq Program}), +(@pxref{Uniq Program}) removes duplicate lines from @emph{sorted} data. Suppose, however, you need to remove duplicate lines from a @value{DF} but @@ -25451,7 +25452,7 @@ Texinfo input file into separate files. @cindex Texinfo This @value{DOCUMENT} is written in @uref{http://www.gnu.org/software/texinfo/, Texinfo}, -the GNU project's document formatting language. +the GNU Project's document formatting language. A single Texinfo source file can be used to produce both printed documentation, with @TeX{}, and online documentation. 
@ifnotinfo @@ -25510,7 +25511,7 @@ The Texinfo file looks something like this: @example @dots{} -This program has a @@code@{BEGIN@} rule, +This program has a @@code@{BEGIN@} rule that prints a nice message: @@example @@ -25539,7 +25540,7 @@ exits with a zero exit status, signifying OK: @cindex @code{extract.awk} program @example @c file eg/prog/extract.awk -# extract.awk --- extract files and run programs from texinfo files +# extract.awk --- extract files and run programs from Texinfo files @c endfile @ignore @c file eg/prog/extract.awk @@ -25580,12 +25581,12 @@ The second rule handles moving data into files. It verifies that a @value{FN} is given in the directive. If the file named is not the current file, then the current file is closed. Keeping the current file open until a new file is encountered allows the use of the @samp{>} -redirection for printing the contents, keeping open file management +redirection for printing the contents, keeping open-file management simple. The @code{for} loop does the work. It reads lines using @code{getline} (@pxref{Getline}). -For an unexpected end of file, it calls the @code{@w{unexpected_eof()}} +For an unexpected end-of-file, it calls the @code{@w{unexpected_eof()}} function. If the line is an ``endfile'' line, then it breaks out of the loop. If the line is an @samp{@@group} or @samp{@@end group} line, then it @@ -25687,7 +25688,7 @@ END @{ @cindex @command{sed} utility @cindex stream editors -The @command{sed} utility is a stream editor, a program that reads a +The @command{sed} utility is a @dfn{stream editor}, a program that reads a stream of data, makes changes to it, and passes it on. It is often used to make global changes to a large file or to a stream of data generated by a pipeline of commands. @@ -25832,7 +25833,7 @@ includes don't accidentally include a library function twice. @command{igawk} should behave just like @command{gawk} externally. 
This means it should accept all of @command{gawk}'s command-line arguments, including the ability to have multiple source files specified via -@option{-f}, and the ability to mix command-line and library source files. +@option{-f} and the ability to mix command-line and library source files. The program is written using the POSIX Shell (@command{sh}) command language.@footnote{Fully explaining the @command{sh} language is beyond @@ -25871,7 +25872,7 @@ Run the expanded program with @command{gawk} and any other original command-line arguments that the user supplied (such as the @value{DF} names). @end enumerate -This program uses shell variables extensively: for storing command-line arguments, +This program uses shell variables extensively: for storing command-line arguments and the text of the @command{awk} program that will expand the user's program, for the user's original program, and for the expanded program. Doing so removes some potential problems that might arise were we to use temporary files instead, @@ -26188,22 +26189,7 @@ Save the results of this processing in the shell variable The last step is to call @command{gawk} with the expanded program, along with the original -options and command-line arguments that the user supplied. - -@c this causes more problems than it solves, so leave it out. -@ignore -The special file @file{/dev/null} is passed as a @value{DF} to @command{gawk} -to handle an interesting case. Suppose that the user's program only has -a @code{BEGIN} rule and there are no @value{DF}s to read. -The program should exit without reading any @value{DF}s. -However, suppose that an included library file defines an @code{END} -rule of its own. In this case, @command{gawk} will hang, reading standard -input. In order to avoid this, @file{/dev/null} is explicitly added to the -command line. Reading from @file{/dev/null} always returns an immediate -end of file indication. - -@c Hmm. Add /dev/null if $# is 0? Still messes up ARGV. Sigh. 
-@end ignore +options and command-line arguments that the user supplied: @example @c file eg/prog/igawk.sh @@ -26269,8 +26255,8 @@ the same letters Column 2, Problem C, of Jon Bentley's @cite{Programming Pearls}, Second Edition, presents an elegant algorithm. The idea is to give words that are anagrams a common signature, sort all the words together by their -signature, and then print them. Dr.@: Bentley observes that taking the -letters in each word and sorting them produces that common signature. +signatures, and then print them. Dr.@: Bentley observes that taking the +letters in each word and sorting them produces those common signatures. The following program uses arrays of arrays to bring together words with the same signature and array sorting to print the words @@ -26279,8 +26265,8 @@ in sorted order: @cindex @code{anagram.awk} program @example @c file eg/prog/anagram.awk -# anagram.awk --- An implementation of the anagram finding algorithm -# from Jon Bentley's "Programming Pearls", 2nd edition. +# anagram.awk --- An implementation of the anagram-finding algorithm +# from Jon Bentley's "Programming Pearls," 2nd edition. # Addison Wesley, 2000, ISBN 0-201-65788-0. # Column 2, Problem C, section 2.8, pp 18-20. @c endfile @@ -26328,7 +26314,7 @@ sorts the letters, and then joins them back together: @example @c file eg/prog/anagram.awk -# word2key --- split word apart into letters, sort, joining back together +# word2key --- split word apart into letters, sort, and join back together function word2key(word, a, i, n, result) @{ @@ -26523,12 +26509,13 @@ characters. The ability to use @code{split()} with the empty string as the separator can considerably simplify such tasks. @item -The library functions from @ref{Library Functions}, proved their -usefulness for a number of real (if small) programs. +The examples here demonstrate the usefulness of the library +functions from @ref{Library Functions} +for a number of real (if small) programs. 
@item Besides reinventing POSIX wheels, other programs solved a selection of -interesting problems, such as finding duplicates words in text, printing +interesting problems, such as finding duplicate words in text, printing mailing labels, and finding anagrams. @end itemize diff --git a/doc/gawktexi.in b/doc/gawktexi.in index 2e2d0e04..cfddbd16 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -22239,10 +22239,10 @@ in this @value{CHAPTER}. The second presents @command{awk} versions of several common POSIX utilities. These are programs that you are hopefully already familiar with, -and therefore, whose problems are understood. +and therefore whose problems are understood. By reimplementing these programs in @command{awk}, you can focus on the @command{awk}-related aspects of solving -the programming problem. +the programming problems. The third is a grab bag of interesting programs. These solve a number of different data-manipulation and management @@ -22302,7 +22302,7 @@ It should be noted that these programs are not necessarily intended to replace the installed versions on your system. Nor may all of these programs be fully compliant with the most recent POSIX standard. This is not a problem; their -purpose is to illustrate @command{awk} language programming for ``real world'' +purpose is to illustrate @command{awk} language programming for ``real-world'' tasks. The programs are presented in alphabetical order. @@ -22331,7 +22331,7 @@ but you may supply a command-line option to change the field @dfn{delimiter} (i.e., the field-separator character). @command{cut}'s definition of fields is less general than @command{awk}'s. -A common use of @command{cut} might be to pull out just the login name of +A common use of @command{cut} might be to pull out just the login names of logged-on users from the output of @command{who}. 
For example, the following pipeline generates a sorted, unique list of the logged-on users: @@ -22840,7 +22840,7 @@ successful or unsuccessful match. If the line does not match, the @code{next} statement just moves on to the next record. A number of additional tests are made, but they are only done if we -are not counting lines. First, if the user only wants exit status +are not counting lines. First, if the user only wants the exit status (@code{no_print} is true), then it is enough to know that @emph{one} line in this file matched, and we can skip on to the next file with @code{nextfile}. Similarly, if we are only printing @value{FN}s, we can @@ -22881,7 +22881,7 @@ if necessary: @end example The @code{END} rule takes care of producing the correct exit status. If -there are no matches, the exit status is one; otherwise it is zero: +there are no matches, the exit status is one; otherwise, it is zero: @example @c file eg/prog/egrep.awk @@ -22933,7 +22933,8 @@ Here is a simple version of @command{id} written in @command{awk}. It uses the user database library functions (@pxref{Passwd Functions}) and the group database library functions -(@pxref{Group Functions}): +(@pxref{Group Functions}) +from @ref{Library Functions}. The program is fairly straightforward. All the work is done in the @code{BEGIN} rule. The user and group ID numbers are obtained from @@ -23060,8 +23061,8 @@ By default, the output files are named @file{xaa}, @file{xab}, and so on. Each file has 1,000 lines in it, with the likely exception of the last file. To change the number of lines in each file, supply a number on the command line -preceded with a minus (e.g., @samp{-500} for files with 500 lines in them -instead of 1,000). To change the name of the output files to something like +preceded with a minus sign (e.g., @samp{-500} for files with 500 lines in them +instead of 1,000). 
To change the names of the output files to something like @file{myfileaa}, @file{myfileab}, and so on, supply an additional argument that specifies the @value{FN} prefix. @@ -23900,7 +23901,7 @@ checking and setting of defaults: the delay, the count, and the message to print. If the user supplied a message without the ASCII BEL character (known as the ``alert'' character, @code{"\a"}), then it is added to the message. (On many systems, printing the ASCII BEL generates an -audible alert. Thus when the alarm goes off, the system calls attention +audible alert. Thus, when the alarm goes off, the system calls attention to itself in case the user is not looking at the computer.) Just for a change, this program uses a @code{switch} statement (@pxref{Switch Statement}), but the processing could be done with a series of @@ -24069,7 +24070,7 @@ to @command{gawk}. @c at least theoretically The following program was written to prove that character transliteration could be done with a user-level -function. This program is not as complete as the system @command{tr} utility +function. This program is not as complete as the system @command{tr} utility, but it does most of the job. The @command{translate} program was written long before @command{gawk} @@ -24081,13 +24082,13 @@ takes three arguments: @table @code @item from -A list of characters from which to translate. +A list of characters from which to translate @item to -A list of characters to which to translate. +A list of characters to which to translate @item target -The string on which to do the translation. +The string on which to do the translation @end table Associative arrays make the translation part fairly easy. @code{t_ar} holds @@ -24096,7 +24097,7 @@ loop goes through @code{from}, one character at a time. For each character in @code{from}, if the character appears in @code{target}, it is replaced with the corresponding @code{to} character. 
-The @code{translate()} function calls @code{stranslate()} using @code{$0} +The @code{translate()} function calls @code{stranslate()}, using @code{$0} as the target. The main program sets two global variables, @code{FROM} and @code{TO}, from the command line, and then changes @code{ARGV} so that @command{awk} reads from the standard input. @@ -24118,7 +24119,7 @@ Finally, the processing rule simply calls @code{translate()} for each record: @c endfile @end ignore @c file eg/prog/translate.awk -# Bugs: does not handle things like: tr A-Z a-z, it has +# Bugs: does not handle things like tr A-Z a-z; it has # to be spelled out. However, if `to' is shorter than `from', # the last character in `to' is used for the rest of `from'. @@ -24194,7 +24195,7 @@ for inspiration. @cindex printing, mailing labels @cindex mailing labels@comma{} printing -Here is a ``real world''@footnote{``Real world'' is defined as +Here is a ``real-world''@footnote{``Real world'' is defined as ``a program actually used to get something done.''} program. This script reads lists of names and @@ -24203,7 +24204,7 @@ on it, two across and 10 down. The addresses are guaranteed to be no more than five lines of data. Each address is separated from the next by a blank line. -The basic idea is to read 20 labels worth of data. Each line of each label +The basic idea is to read 20 labels' worth of data. Each line of each label is stored in the @code{line} array. The single rule takes care of filling the @code{line} array and printing the page when 20 labels have been read. @@ -24226,12 +24227,12 @@ of lines on the page Most of the work is done in the @code{printpage()} function. The label lines are stored sequentially in the @code{line} array. But they -have to print horizontally; @code{line[1]} next to @code{line[6]}, +have to print horizontally: @code{line[1]} next to @code{line[6]}, @code{line[2]} next to @code{line[7]}, and so on. Two loops accomplish this. 
The outer loop, controlled by @code{i}, steps through every 10 lines of data; this is each row of labels. The inner loop, controlled by @code{j}, goes through the lines within the row. -As @code{j} goes from 0 to 4, @samp{i+j} is the @code{j}-th line in +As @code{j} goes from 0 to 4, @samp{i+j} is the @code{j}th line in the row, and @samp{i+j+5} is the entry next to it. The output ends up looking something like this: @@ -24349,8 +24350,8 @@ END @{ @} @end example -The program relies on @command{awk}'s default field splitting -mechanism to break each line up into ``words,'' and uses an +The program relies on @command{awk}'s default field-splitting +mechanism to break each line up into ``words'' and uses an associative array named @code{freq}, indexed by each word, to count the number of times the word occurs. In the @code{END} rule, it prints the counts. @@ -24455,7 +24456,7 @@ to use the @command{sort} program. @cindex lines, duplicate@comma{} removing The @command{uniq} program -(@pxref{Uniq Program}), +(@pxref{Uniq Program}) removes duplicate lines from @emph{sorted} data. Suppose, however, you need to remove duplicate lines from a @value{DF} but @@ -24542,7 +24543,7 @@ Texinfo input file into separate files. @cindex Texinfo This @value{DOCUMENT} is written in @uref{http://www.gnu.org/software/texinfo/, Texinfo}, -the GNU project's document formatting language. +the GNU Project's document formatting language. A single Texinfo source file can be used to produce both printed documentation, with @TeX{}, and online documentation. 
@ifnotinfo @@ -24601,7 +24602,7 @@ The Texinfo file looks something like this: @example @dots{} -This program has a @@code@{BEGIN@} rule, +This program has a @@code@{BEGIN@} rule that prints a nice message: @@example @@ -24630,7 +24631,7 @@ exits with a zero exit status, signifying OK: @cindex @code{extract.awk} program @example @c file eg/prog/extract.awk -# extract.awk --- extract files and run programs from texinfo files +# extract.awk --- extract files and run programs from Texinfo files @c endfile @ignore @c file eg/prog/extract.awk @@ -24671,12 +24672,12 @@ The second rule handles moving data into files. It verifies that a @value{FN} is given in the directive. If the file named is not the current file, then the current file is closed. Keeping the current file open until a new file is encountered allows the use of the @samp{>} -redirection for printing the contents, keeping open file management +redirection for printing the contents, keeping open-file management simple. The @code{for} loop does the work. It reads lines using @code{getline} (@pxref{Getline}). -For an unexpected end of file, it calls the @code{@w{unexpected_eof()}} +For an unexpected end-of-file, it calls the @code{@w{unexpected_eof()}} function. If the line is an ``endfile'' line, then it breaks out of the loop. If the line is an @samp{@@group} or @samp{@@end group} line, then it @@ -24778,7 +24779,7 @@ END @{ @cindex @command{sed} utility @cindex stream editors -The @command{sed} utility is a stream editor, a program that reads a +The @command{sed} utility is a @dfn{stream editor}, a program that reads a stream of data, makes changes to it, and passes it on. It is often used to make global changes to a large file or to a stream of data generated by a pipeline of commands. @@ -24923,7 +24924,7 @@ includes don't accidentally include a library function twice. @command{igawk} should behave just like @command{gawk} externally. 
This means it should accept all of @command{gawk}'s command-line arguments, including the ability to have multiple source files specified via -@option{-f}, and the ability to mix command-line and library source files. +@option{-f} and the ability to mix command-line and library source files. The program is written using the POSIX Shell (@command{sh}) command language.@footnote{Fully explaining the @command{sh} language is beyond @@ -24962,7 +24963,7 @@ Run the expanded program with @command{gawk} and any other original command-line arguments that the user supplied (such as the @value{DF} names). @end enumerate -This program uses shell variables extensively: for storing command-line arguments, +This program uses shell variables extensively: for storing command-line arguments and the text of the @command{awk} program that will expand the user's program, for the user's original program, and for the expanded program. Doing so removes some potential problems that might arise were we to use temporary files instead, @@ -25279,22 +25280,7 @@ Save the results of this processing in the shell variable The last step is to call @command{gawk} with the expanded program, along with the original -options and command-line arguments that the user supplied. - -@c this causes more problems than it solves, so leave it out. -@ignore -The special file @file{/dev/null} is passed as a @value{DF} to @command{gawk} -to handle an interesting case. Suppose that the user's program only has -a @code{BEGIN} rule and there are no @value{DF}s to read. -The program should exit without reading any @value{DF}s. -However, suppose that an included library file defines an @code{END} -rule of its own. In this case, @command{gawk} will hang, reading standard -input. In order to avoid this, @file{/dev/null} is explicitly added to the -command line. Reading from @file{/dev/null} always returns an immediate -end of file indication. - -@c Hmm. Add /dev/null if $# is 0? Still messes up ARGV. Sigh. 
-@end ignore +options and command-line arguments that the user supplied: @example @c file eg/prog/igawk.sh @@ -25360,8 +25346,8 @@ the same letters Column 2, Problem C, of Jon Bentley's @cite{Programming Pearls}, Second Edition, presents an elegant algorithm. The idea is to give words that are anagrams a common signature, sort all the words together by their -signature, and then print them. Dr.@: Bentley observes that taking the -letters in each word and sorting them produces that common signature. +signatures, and then print them. Dr.@: Bentley observes that taking the +letters in each word and sorting them produces those common signatures. The following program uses arrays of arrays to bring together words with the same signature and array sorting to print the words @@ -25370,8 +25356,8 @@ in sorted order: @cindex @code{anagram.awk} program @example @c file eg/prog/anagram.awk -# anagram.awk --- An implementation of the anagram finding algorithm -# from Jon Bentley's "Programming Pearls", 2nd edition. +# anagram.awk --- An implementation of the anagram-finding algorithm +# from Jon Bentley's "Programming Pearls," 2nd edition. # Addison Wesley, 2000, ISBN 0-201-65788-0. # Column 2, Problem C, section 2.8, pp 18-20. @c endfile @@ -25419,7 +25405,7 @@ sorts the letters, and then joins them back together: @example @c file eg/prog/anagram.awk -# word2key --- split word apart into letters, sort, joining back together +# word2key --- split word apart into letters, sort, and join back together function word2key(word, a, i, n, result) @{ @@ -25614,12 +25600,13 @@ characters. The ability to use @code{split()} with the empty string as the separator can considerably simplify such tasks. @item -The library functions from @ref{Library Functions}, proved their -usefulness for a number of real (if small) programs. +The examples here demonstrate the usefulness of the library +functions from @DBREF{Library Functions} +for a number of real (if small) programs. 
@item Besides reinventing POSIX wheels, other programs solved a selection of -interesting problems, such as finding duplicates words in text, printing +interesting problems, such as finding duplicate words in text, printing mailing labels, and finding anagrams. @end itemize