author | Arnold D. Robbins <arnold@skeeve.com> | 2014-04-30 21:30:19 +0300 |
---|---|---|
committer | Arnold D. Robbins <arnold@skeeve.com> | 2014-04-30 21:30:19 +0300 |
commit | 46a649988936bfcb2fd1f79dcb580b4a17c46dae (patch) | |
tree | 54feebaa2c2ffe0719c19d26d2a03371bb31ae09 | |
parent | b9e3fbd2db8410b4717d8589d6e315a617c62b4a (diff) | |
parent | 0ceab11e44cac45f8653fa79510726cc121719f4 (diff) | |
Merge branch 'gawk-4.1-stable'
-rw-r--r-- | doc/ChangeLog | 2 | ||||
-rw-r--r-- | doc/gawk.info | 1082 | ||||
-rw-r--r-- | doc/gawk.texi | 318 | ||||
-rw-r--r-- | doc/gawktexi.in | 312 |
4 files changed, 919 insertions, 795 deletions
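Note on the "Using Shell Variables" hunk in doc/gawk.info: the revised text recommends passing a shell variable to `awk` through variable assignment and then matching it as a dynamic regexp, but the diff context stops before the manual's own example. The following is a minimal sketch of that technique, not the manual's text. The function name `count_matches` and the sample input are assumptions made for illustration only.

```shell
# count_matches PATTERN
# Hypothetical helper (not from the gawk manual): reads lines on stdin and
# reports how many match PATTERN, treating PATTERN as a dynamic regexp.
count_matches() {
    # -v assigns the shell value to the awk variable "pat" before the
    # program runs, so the program text itself can stay single-quoted.
    # "nmatches + 0" forces a numeric 0 when no line matched.
    awk -v pat="$1" '$0 ~ pat { nmatches++ } END { print nmatches + 0, "found" }'
}

# Sample input (an assumption): two of the three lines contain "foo".
printf 'foo bar\nbaz\nfood\n' | count_matches 'foo'
```

Because the value is assigned with `-v`, `awk` processes escape sequences in it, so a pattern that contains backslashes may need them doubled.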
diff --git a/doc/ChangeLog b/doc/ChangeLog index 2721f289..3bd47bb7 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,6 +1,8 @@ 2014-04-30 Arnold D. Robbins <arnold@skeeve.com> * gawktexi.in: Editing progress. Through Chapter 5. + * gawktexi.in: Editing progress. Through Chapter 6 and into + Chapter 7. 2014-04-29 Arnold D. Robbins <arnold@skeeve.com> diff --git a/doc/gawk.info b/doc/gawk.info index db86bf3a..dc0a7591 100644 --- a/doc/gawk.info +++ b/doc/gawk.info @@ -6921,7 +6921,8 @@ codes. (1) The internal representation of all numbers, including integers, uses double precision floating-point numbers. On most modern systems, -these are in IEEE 754 standard format. +these are in IEEE 754 standard format. *Note Arbitrary Precision +Arithmetic::, for much more information. File: gawk.info, Node: Nondecimal-numbers, Next: Regexp Constants, Prev: Scalar Constants, Up: Constants @@ -7054,8 +7055,8 @@ the contents of the current input record. Constant regular expressions are also used as the first argument for the `gensub()', `sub()', and `gsub()' functions, as the second argument -of the `match()' function, and as the third argument of the -`patsplit()' function (*note String Functions::). Modern +of the `match()' function, and as the third argument of the `split()' +and `patsplit()' functions (*note String Functions::). Modern implementations of `awk', including `gawk', allow the third argument of `split()' to be a regexp constant, but some older implementations do not. (d.c.) This can lead to confusion when attempting to use regexp @@ -7248,29 +7249,28 @@ use when printing numbers with `print'. `CONVFMT' was introduced in order to separate the semantics of conversion from the semantics of printing. Both `CONVFMT' and `OFMT' have the same default value: `"%.6g"'. In the vast majority of cases, old `awk' programs do not -change their behavior. 
However, these semantics for `OFMT' are -something to keep in mind if you must port your new-style program to -older implementations of `awk'. We recommend that instead of changing -your programs, just port `gawk' itself. *Note Print::, for more -information on the `print' statement. - - And, once again, where you are can matter when it comes to converting -between numbers and strings. In *note Locales::, we mentioned that the -local character set and language (the locale) can affect how `gawk' -matches characters. The locale also affects numeric formats. In -particular, for `awk' programs, it affects the decimal point character. -The `"C"' locale, and most English-language locales, use the period -character (`.') as the decimal point. However, many (if not most) -European and non-English locales use the comma (`,') as the decimal -point character. +change their behavior. *Note Print::, for more information on the +`print' statement. + + Where you are can matter when it comes to converting between numbers +and strings. The local character set and language--the "locale"--can +affect numeric formats. In particular, for `awk' programs, it affects +the decimal point character and the thousands-separator character. The +`"C"' locale, and most English-language locales, use the period +character (`.') as the decimal point and don't have a thousands +separator. However, many (if not most) European and non-English +locales use the comma (`,') as the decimal point character. European +locales often use either a space or a period as the thousands +separator, if they have one. The POSIX standard says that `awk' always uses the period as the decimal point when reading the `awk' program source code, and for command-line variable assignments (*note Other Arguments::). However, when interpreting input data, for `print' and `printf' output, and for number to string conversion, the local decimal point character is used. -(d.c.) 
Here are some examples indicating the difference in behavior, -on a GNU/Linux system: +(d.c.) In all cases, numbers in source code and in input data cannot +have a thousands separator. Here are some examples indicating the +difference in behavior, on a GNU/Linux system: $ export POSIXLY_CORRECT=1 Force POSIX behavior $ gawk 'BEGIN { printf "%g\n", 3.1415927 }' @@ -7475,9 +7475,9 @@ example: print (a " " (a = "panic")) } -It is not defined whether the assignment to `a' happens before or after -the value of `a' is retrieved for producing the concatenated value. -The result could be either `don't panic', or `panic panic'. +It is not defined whether the second assignment to `a' happens before +or after the value of `a' is retrieved for producing the concatenated +value. The result could be either `don't panic', or `panic panic'. The precedence of concatenation, when mixed with other operators, is often counter-intuitive. Consider this example: @@ -7550,9 +7550,9 @@ that the assignment stores in the specified variable, field, or array element. (Such values are called "rvalues".) It is important to note that variables do _not_ have permanent types. -A variable's type is simply the type of whatever value it happens to -hold at the moment. In the following program fragment, the variable -`foo' has a numeric value at first, and a string value later on: +A variable's type is simply the type of whatever value was last assigned +to it. In the following program fragment, the variable `foo' has a +numeric value at first, and a string value later on: foo = 1 print foo @@ -7625,9 +7625,10 @@ The indices of `bar' are practically guaranteed to be different, because the `rand()' function haven't been covered yet. *Note Arrays::, and see *note Numeric Functions::, for more information). This example illustrates an important fact about assignment operators: the lefthand -expression is only evaluated _once_. 
It is up to the implementation as -to which expression is evaluated first, the lefthand or the righthand. -Consider this example: +expression is only evaluated _once_. + + It is up to the implementation as to which expression is evaluated +first, the lefthand or the righthand. Consider this example: i = 1 a[i += 2] = i + 1 @@ -7640,14 +7641,14 @@ converted to a number. Operator Effect -------------------------------------------------------------------------- -LVALUE `+=' INCREMENT Adds INCREMENT to the value of LVALUE. -LVALUE `-=' DECREMENT Subtracts DECREMENT from the value of LVALUE. -LVALUE `*=' Multiplies the value of LVALUE by COEFFICIENT. +LVALUE `+=' INCREMENT Add INCREMENT to the value of LVALUE. +LVALUE `-=' DECREMENT Subtract DECREMENT from the value of LVALUE. +LVALUE `*=' Multiply the value of LVALUE by COEFFICIENT. COEFFICIENT -LVALUE `/=' DIVISOR Divides the value of LVALUE by DIVISOR. -LVALUE `%=' MODULUS Sets LVALUE to its remainder by MODULUS. +LVALUE `/=' DIVISOR Divide the value of LVALUE by DIVISOR. +LVALUE `%=' MODULUS Set LVALUE to its remainder by MODULUS. LVALUE `^=' POWER -LVALUE `**=' POWER Raises LVALUE to the power POWER. (c.e.) +LVALUE `**=' POWER Raise LVALUE to the power POWER. (c.e.) Table 6.2: Arithmetic Assignment Operators @@ -7670,8 +7671,8 @@ A workaround is: awk '/[=]=/' /dev/null - `gawk' does not have this problem, nor do the other freely available -versions described in *note Other Versions::. + `gawk' does not have this problem; Brian Kernighan's `awk' and +`mawk' also do not (*note Other Versions::). File: gawk.info, Node: Increment Ops, Prev: Assignment Ops, Up: All Operators @@ -7686,13 +7687,13 @@ they are convenient abbreviations for very common operations. The operator used for adding one is written `++'. It can be used to increment a variable either before or after taking its value. To -pre-increment a variable `v', write `++v'. This adds one to the value -of `v'--that new value is also the value of the expression. 
(The +"pre-increment" a variable `v', write `++v'. This adds one to the +value of `v'--that new value is also the value of the expression. (The assignment expression `v += 1' is completely equivalent.) Writing the -`++' after the variable specifies post-increment. This increments the -variable value just the same; the difference is that the value of the -increment expression itself is the variable's _old_ value. Thus, if -`foo' has the value four, then the expression `foo++' has the value +`++' after the variable specifies "post-increment". This increments +the variable value just the same; the difference is that the value of +the increment expression itself is the variable's _old_ value. Thus, +if `foo' has the value four, then the expression `foo++' has the value four, but it changes the value of `foo' to five. In other words, the operator returns the old value of the variable, but with the side effect of incrementing it. @@ -7839,10 +7840,12 @@ The 1992 POSIX standard introduced the concept of a "numeric string", which is simply a string that looks like a number--for example, `" +2"'. This concept is used for determining the type of a variable. The type of the variable is important because the types of two variables -determine how they are compared. The various versions of the POSIX -standard did not get the rules quite right for several editions. -Fortunately, as of at least the 2008 standard (and possibly earlier), -the standard has been fixed, and variable typing follows these rules:(1) +determine how they are compared. + + The various versions of the POSIX standard did not get the rules +quite right for several editions. Fortunately, as of at least the 2008 +standard (and possibly earlier), the standard has been fixed, and +variable typing follows these rules:(1) * A numeric constant or the result of a numeric operation has the NUMERIC attribute. @@ -7899,10 +7902,9 @@ comparison is performed. 
characters, and so is first and foremost of STRING type; input strings that look numeric are additionally given the STRNUM attribute. Thus, the six-character input string ` +3.14' receives the STRNUM attribute. -In contrast, the eight-character literal `" +3.14"' appearing in -program text is a string constant. The following examples print `1' -when the comparison between the two different constants is true, `0' -otherwise: +In contrast, the eight characters `" +3.14"' appearing in program text +comprise a string constant. The following examples print `1' when the +comparison between the two different constants is true, `0' otherwise: $ echo ' +3.14' | gawk '{ print $0 == " +3.14" }' True -| 1 @@ -8045,9 +8047,10 @@ File: gawk.info, Node: POSIX String Comparison, Prev: Comparison Operators, U .......................................... The POSIX standard says that string comparison is performed based on -the locale's collating order. This is usually very different from the -results obtained when doing straight character-by-character -comparison.(1) +the locale's "collating order". This is the order in which characters +sort, as defined by the locale (for more discussion, *note Ranges and +Locales::). This order is usually very different from the results +obtained when doing straight character-by-character comparison.(1) Because this behavior differs considerably from existing practice, `gawk' only implements it when in POSIX mode (*note Options::). Here @@ -8199,7 +8202,7 @@ not. *Note Arrays::, for more information about arrays. continued simply by putting a newline after either character. However, putting a newline in front of either character does not work without using backslash continuation (*note Statements/Lines::). If `--posix' -is specified (*note Options::), then this extension is disabled. +is specified (*note Options::), this extension is disabled. 
File: gawk.info, Node: Function Calls, Next: Precedence, Prev: Truth Values and Conditions, Up: Expressions @@ -8216,6 +8219,8 @@ available in every `awk' program. The `sqrt()' function is one of these. *Note Built-in::, for a list of built-in functions and their descriptions. In addition, you can define functions for use in your program. *Note User-defined::, for instructions on how to do this. +Finally, `gawk' lets you write functions in C or C++ that may be called +from your program: see *note Dynamic Extensions::. The way to use a function is with a "function call" expression, which consists of the function name followed immediately by a list of @@ -8255,11 +8260,12 @@ which is a way to choose the function to call at runtime, instead of when you write the source code to your program. We defer discussion of this feature until later; see *note Indirect Calls::. - Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of `sqrt(ARGUMENT)' is the square root of ARGUMENT. -The following program reads numbers, one number per line, and prints the -square root of each one: + Like every other expression, the function call has a value, often +called the "return value", which is computed by the function based on +the arguments you give it. In this example, the return value of +`sqrt(ARGUMENT)' is the square root of ARGUMENT. The following program +reads numbers, one number per line, and prints the square root of each +one: $ awk '{ print "The square root of", $1, "is", sqrt($1) }' 1 @@ -8472,10 +8478,10 @@ summary of the types of `awk' patterns: A single expression. It matches when its value is nonzero (if a number) or non-null (if a string). (*Note Expression Patterns::.) -`PAT1, PAT2' +`BEGPAT, ENDPAT' A pair of patterns separated by a comma, specifying a range of records. 
The range includes both the initial record that matches - PAT1 and the final record that matches PAT2. (*Note Ranges::.) + BEGPAT and the final record that matches ENDPAT. (*Note Ranges::.) `BEGIN' `END' @@ -8485,7 +8491,7 @@ summary of the types of `awk' patterns: `BEGINFILE' `ENDFILE' Special patterns for you to supply startup or cleanup actions to be - done on a per file basis. (*Note BEGINFILE/ENDFILE::.) + done on a per-file basis. (*Note BEGINFILE/ENDFILE::.) `EMPTY' The empty pattern matches every input record. (*Note Empty::.) @@ -8605,7 +8611,7 @@ record. When a record matches BEGPAT, the range pattern is "turned on" and the range pattern matches this record as well. As long as the range pattern stays turned on, it automatically matches every input record read. The range pattern also matches ENDPAT against every input -record; when this succeeds, the range pattern is turned off again for +record; when this succeeds, the range pattern is "turned off" again for the following record. Then the range pattern goes back to checking BEGPAT against each record. @@ -8737,10 +8743,10 @@ File: gawk.info, Node: I/O And BEGIN/END, Prev: Using BEGIN/END, Up: BEGIN/EN 7.1.4.2 Input/Output from `BEGIN' and `END' Rules ................................................. -There are several (sometimes subtle) points to remember when doing I/O -from a `BEGIN' or `END' rule. The first has to do with the value of -`$0' in a `BEGIN' rule. Because `BEGIN' rules are executed before any -input is read, there simply is no input record, and therefore no +There are several (sometimes subtle) points to be aware of when doing +I/O from a `BEGIN' or `END' rule. The first has to do with the value +of `$0' in a `BEGIN' rule. Because `BEGIN' rules are executed before +any input is read, there simply is no input record, and therefore no fields, when executing `BEGIN' rules. References to `$0' and the fields yield a null string or zero, depending upon the context. 
One way to give `$0' a real value is to execute a `getline' command without a @@ -8808,10 +8814,10 @@ tasks that would otherwise be difficult or impossible to perform: entirely. Otherwise, `gawk' exits with the usual fatal error. * If you have written extensions that modify the record handling (by - inserting an "input parser"), you can invoke them at this point, - before `gawk' has started processing the file. (This is a _very_ - advanced feature, currently used only by the `gawkextlib' project - (http://gawkextlib.sourceforge.net).) + inserting an "input parser," *note Input Parsers::), you can invoke + them at this point, before `gawk' has started processing the file. + (This is a _very_ advanced feature, currently used only by the + `gawkextlib' project (http://gawkextlib.sourceforge.net).) The `ENDFILE' rule is called when `gawk' has finished processing the last record in an input file. For the last input file, it will be @@ -8863,15 +8869,15 @@ to get the value of the shell variable into the body of the `awk' program. The most common method is to use shell quoting to substitute the -variable's value into the program inside the script. For example, in -the following program: +variable's value into the program inside the script. For example, +consider the following program: printf "Enter search pattern: " read pattern awk "/$pattern/ "'{ nmatches++ } END { print nmatches, "found" }' /path/to/data -the `awk' program consists of two pieces of quoted text that are +The `awk' program consists of two pieces of quoted text that are concatenated together to form the program. The first part is double-quoted, which allows substitution of the `pattern' shell variable inside the quotes. The second part is single-quoted. @@ -8883,7 +8889,7 @@ quotes when reading the program. A better method is to use `awk''s variable assignment feature (*note Assignment Options::) to assign the shell variable's value to an `awk' -variable's value. 
Then use dynamic regexps to match the pattern (*note +variable. Then use dynamic regexps to match the pattern (*note Computed Regexps::). The following shows how to redo the previous example using this technique: @@ -8945,9 +8951,9 @@ Control statements well as a few special ones (*note Statements::). Compound statements - Consist of one or more statements enclosed in curly braces. A - compound statement is used in order to put several statements - together in the body of an `if', `while', `do', or `for' statement. + Enclose one or more statements in curly braces. A compound + statement is used in order to put several statements together in + the body of an `if', `while', `do', or `for' statement. Input statements Use the `getline' command (*note Getline::). Also supplied in @@ -9210,7 +9216,8 @@ File: gawk.info, Node: Switch Statement, Next: Break Statement, Prev: For Sta 7.4.5 The `switch' Statement ---------------------------- -This minor node describes a `gawk'-specific feature. +This minor node describes a `gawk'-specific feature. If `gawk' is in +compatibility mode (*note Options::), it is not available. The `switch' statement allows the evaluation of an expression and the execution of statements based on a `case' match. Case statements @@ -9261,9 +9268,6 @@ is executed and then falls through into the `default' section, executing its `print' statement. In turn, the -1 case will also be executed since the `default' does not halt execution. - This `switch' statement is a `gawk' extension. If `gawk' is in -compatibility mode (*note Options::), it is not available. 
- File: gawk.info, Node: Break Statement, Next: Continue Statement, Prev: Switch Statement, Up: Statements @@ -9276,15 +9280,15 @@ divisor of any integer, and also identifies prime numbers: # find smallest divisor of num { - num = $1 - for (div = 2; div * div <= num; div++) { - if (num % div == 0) - break - } - if (num % div == 0) - printf "Smallest divisor of %d is %d\n", num, div - else - printf "%d is prime\n", num + num = $1 + for (div = 2; div * div <= num; div++) { + if (num % div == 0) + break + } + if (num % div == 0) + printf "Smallest divisor of %d is %d\n", num, div + else + printf "%d is prime\n", num } When the remainder is zero in the first `if' statement, `awk' @@ -9299,17 +9303,17 @@ Statement::.) # find smallest divisor of num { - num = $1 - for (div = 2; ; div++) { - if (num % div == 0) { - printf "Smallest divisor of %d is %d\n", num, div - break - } - if (div * div > num) { - printf "%d is prime\n", num - break + num = $1 + for (div = 2; ; div++) { + if (num % div == 0) { + printf "Smallest divisor of %d is %d\n", num, div + break + } + if (div * div > num) { + printf "%d is prime\n", num + break + } } - } } The `break' statement is also used to break out of the `switch' @@ -9420,7 +9424,7 @@ rules. *Note BEGINFILE/ENDFILE::. According to the POSIX standard, the behavior is undefined if the `next' statement is used in a `BEGIN' or `END' rule. `gawk' treats it -as a syntax error. Although POSIX permits it, some other `awk' +as a syntax error. Although POSIX permits it, most other `awk' implementations don't allow the `next' statement inside function bodies (*note User-defined::). Just as with any other `next' statement, a `next' statement inside a function body reads the next record and @@ -9524,12 +9528,12 @@ with a nonzero status. 
An `awk' program can do this using an `exit' statement with a nonzero argument, as shown in the following example: BEGIN { - if (("date" | getline date_now) <= 0) { - print "Can't get system date" > "/dev/stderr" - exit 1 - } - print "current date is", date_now - close("date") + if (("date" | getline date_now) <= 0) { + print "Can't get system date" > "/dev/stderr" + exit 1 + } + print "current date is", date_now + close("date") } NOTE: For full portability, exit values should be between zero and @@ -27806,7 +27810,7 @@ Item Limit -------------------------------------------------------------------------- Characters in a character 2^(number of bits per byte) class -Length of input record `MAX_INT ' +Length of input record `MAX_INT' Length of output record Unlimited Length of source line Unlimited Number of fields in a record `MAX_LONG' @@ -27819,9 +27823,9 @@ Number of pipe redirections min(number of processes per user, number of open files) Numeric values Double-precision floating point (if not using MPFR) -Size of a field `MAX_INT ' -Size of a literal string `MAX_INT ' -Size of a printf string `MAX_INT ' +Size of a field `MAX_INT' +Size of a literal string `MAX_INT' +Size of a printf string `MAX_INT' File: gawk.info, Node: Extension Design, Next: Old Extension Mechanism, Prev: Implementation Limitations, Up: Notes @@ -30138,7 +30142,7 @@ Index * $ (dollar sign), regexp operator: Regexp Operators. (line 35) * % (percent sign), % operator: Precedence. (line 55) * % (percent sign), %= operator <1>: Precedence. (line 95) -* % (percent sign), %= operator: Assignment Ops. (line 129) +* % (percent sign), %= operator: Assignment Ops. (line 130) * & (ampersand), && operator <1>: Precedence. (line 86) * & (ampersand), && operator: Boolean Ops. (line 57) * & (ampersand), gsub()/gensub()/sub() functions and: Gory Details. @@ -30159,9 +30163,9 @@ Index * * (asterisk), ** operator <1>: Precedence. (line 49) * * (asterisk), ** operator: Arithmetic Ops. 
(line 81) * * (asterisk), **= operator <1>: Precedence. (line 95) -* * (asterisk), **= operator: Assignment Ops. (line 129) +* * (asterisk), **= operator: Assignment Ops. (line 130) * * (asterisk), *= operator <1>: Precedence. (line 95) -* * (asterisk), *= operator: Assignment Ops. (line 129) +* * (asterisk), *= operator: Assignment Ops. (line 130) * + (plus sign), + operator: Precedence. (line 52) * + (plus sign), ++ operator <1>: Precedence. (line 46) * + (plus sign), ++ operator: Increment Ops. (line 11) @@ -30173,7 +30177,7 @@ Index * - (hyphen), -- operator <1>: Precedence. (line 46) * - (hyphen), -- operator: Increment Ops. (line 48) * - (hyphen), -= operator <1>: Precedence. (line 95) -* - (hyphen), -= operator: Assignment Ops. (line 129) +* - (hyphen), -= operator: Assignment Ops. (line 130) * - (hyphen), filenames beginning with: Options. (line 59) * - (hyphen), in bracket expressions: Bracket Expressions. (line 17) * --assign option: Options. (line 32) @@ -30269,11 +30273,11 @@ Index * / (forward slash) to enclose regular expressions: Regexp. (line 10) * / (forward slash), / operator: Precedence. (line 55) * / (forward slash), /= operator <1>: Precedence. (line 95) -* / (forward slash), /= operator: Assignment Ops. (line 129) +* / (forward slash), /= operator: Assignment Ops. (line 130) * / (forward slash), /= operator, vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * / (forward slash), patterns and: Expression Patterns. (line 24) -* /= operator vs. /=.../ regexp constant: Assignment Ops. (line 147) +* /= operator vs. /=.../ regexp constant: Assignment Ops. (line 148) * /dev/... special files: Special FD. (line 46) * /dev/fd/N special files (gawk): Special FD. (line 46) * /inet/... special files (gawk): TCP/IP Networking. (line 6) @@ -30363,7 +30367,7 @@ Index * \ (backslash), regexp operator: Regexp Operators. (line 18) * ^ (caret), ^ operator: Precedence. (line 49) * ^ (caret), ^= operator <1>: Precedence. 
(line 95) -* ^ (caret), ^= operator: Assignment Ops. (line 129) +* ^ (caret), ^= operator: Assignment Ops. (line 130) * ^ (caret), in bracket expressions: Bracket Expressions. (line 17) * ^ (caret), in FS: Regexp Field Splitting. (line 59) @@ -30408,7 +30412,7 @@ Index * amazing awk assembler (aaa): Glossary. (line 12) * amazingly workable formatter (awf): Glossary. (line 25) * ambiguity, syntactic: /= operator vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * ampersand (&), && operator <1>: Precedence. (line 86) * ampersand (&), && operator: Boolean Ops. (line 57) * ampersand (&), gsub()/gensub()/sub() functions and: Gory Details. @@ -30440,7 +30444,7 @@ Index * arguments, command-line <2>: Auto-set. (line 11) * arguments, command-line: Other Arguments. (line 6) * arguments, command-line, invoking awk: Command Line. (line 6) -* arguments, in function calls: Function Calls. (line 16) +* arguments, in function calls: Function Calls. (line 18) * arguments, processing: Getopt Function. (line 6) * ARGV array, indexing into: Other Arguments. (line 12) * arithmetic operators: Arithmetic Ops. (line 6) @@ -30519,14 +30523,14 @@ Index * asterisk (*), ** operator <1>: Precedence. (line 49) * asterisk (*), ** operator: Arithmetic Ops. (line 81) * asterisk (*), **= operator <1>: Precedence. (line 95) -* asterisk (*), **= operator: Assignment Ops. (line 129) +* asterisk (*), **= operator: Assignment Ops. (line 130) * asterisk (*), *= operator <1>: Precedence. (line 95) -* asterisk (*), *= operator: Assignment Ops. (line 129) +* asterisk (*), *= operator: Assignment Ops. (line 130) * atan2: Numeric Functions. (line 11) * automatic displays, in debugger: Debugger Info. (line 24) * awf (amazingly workable formatter) program: Glossary. (line 25) * awk debugging, enabling: Options. (line 108) -* awk language, POSIX version: Assignment Ops. (line 136) +* awk language, POSIX version: Assignment Ops. (line 137) * awk profiling, enabling: Options. 
(line 242) * awk programs <1>: Two Rules. (line 6) * awk programs <2>: Executable Scripts. (line 6) @@ -30772,7 +30776,7 @@ Index * call stack, display in debugger: Execution Stack. (line 13) * caret (^), ^ operator: Precedence. (line 49) * caret (^), ^= operator <1>: Precedence. (line 95) -* caret (^), ^= operator: Assignment Ops. (line 129) +* caret (^), ^= operator: Assignment Ops. (line 130) * caret (^), in bracket expressions: Bracket Expressions. (line 17) * caret (^), regexp operator <1>: GNU Regexp Operators. (line 59) @@ -30853,7 +30857,7 @@ Index * commenting: Comments. (line 6) * commenting, backslash continuation and: Statements/Lines. (line 76) * common extensions, ** operator: Arithmetic Ops. (line 30) -* common extensions, **= operator: Assignment Ops. (line 136) +* common extensions, **= operator: Assignment Ops. (line 137) * common extensions, /dev/stderr special file: Special FD. (line 46) * common extensions, /dev/stdin special file: Special FD. (line 46) * common extensions, /dev/stdout special file: Special FD. (line 46) @@ -30957,7 +30961,7 @@ Index * dark corner: Conventions. (line 38) * dark corner, "0" is actually true: Truth Values. (line 24) * dark corner, /= operator vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * dark corner, ^, in FS: Regexp Field Splitting. (line 59) * dark corner, array subscripts: Uninitialized Subscripts. @@ -30983,14 +30987,14 @@ Index * dark corner, input files: awk split records. (line 110) * dark corner, invoking awk: Command Line. (line 16) * dark corner, length() function: String Functions. (line 180) -* dark corner, locale's decimal point character: Conversion. (line 77) +* dark corner, locale's decimal point character: Conversion. (line 75) * dark corner, multiline records: Multiple Line. (line 35) * dark corner, NF variable, decrementing: Changing Fields. (line 107) * dark corner, OFMT variable: OFMT. (line 27) * dark corner, regexp constants: Using Constant Regexps. 
(line 6) * dark corner, regexp constants, /= operator and: Assignment Ops. - (line 147) + (line 148) * dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps. (line 43) * dark corner, split() function: String Functions. (line 359) @@ -31374,7 +31378,7 @@ Index * extensions, Brian Kernighan's awk <1>: Common Extensions. (line 6) * extensions, Brian Kernighan's awk: BTL. (line 6) * extensions, common, ** operator: Arithmetic Ops. (line 30) -* extensions, common, **= operator: Assignment Ops. (line 136) +* extensions, common, **= operator: Assignment Ops. (line 137) * extensions, common, /dev/stderr special file: Special FD. (line 46) * extensions, common, /dev/stdin special file: Special FD. (line 46) * extensions, common, /dev/stdout special file: Special FD. (line 46) @@ -31532,9 +31536,9 @@ Index * forward slash (/) to enclose regular expressions: Regexp. (line 10) * forward slash (/), / operator: Precedence. (line 55) * forward slash (/), /= operator <1>: Precedence. (line 95) -* forward slash (/), /= operator: Assignment Ops. (line 129) +* forward slash (/), /= operator: Assignment Ops. (line 130) * forward slash (/), /= operator, vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * forward slash (/), patterns and: Expression Patterns. (line 24) * FPAT variable <1>: User-modified. (line 45) * FPAT variable: Splitting By Content. @@ -31812,7 +31816,7 @@ Index * hyphen (-), -- operator <1>: Precedence. (line 46) * hyphen (-), -- operator: Increment Ops. (line 48) * hyphen (-), -= operator <1>: Precedence. (line 95) -* hyphen (-), -= operator: Assignment Ops. (line 129) +* hyphen (-), -= operator: Assignment Ops. (line 130) * hyphen (-), filenames beginning with: Options. (line 59) * hyphen (-), in bracket expressions: Bracket Expressions. (line 17) * i debugger command (alias for info): Debugger Info. (line 13) @@ -32294,7 +32298,7 @@ Index * PC operating systems, gawk on, installing: PC Installation. 
(line 6) * percent sign (%), % operator: Precedence. (line 55) * percent sign (%), %= operator <1>: Precedence. (line 95) -* percent sign (%), %= operator: Assignment Ops. (line 129) +* percent sign (%), %= operator: Assignment Ops. (line 130) * period (.), regexp operator: Regexp Operators. (line 44) * Perl: Future Extensions. (line 6) * Peters, Arno: Contributors. (line 85) @@ -32317,7 +32321,7 @@ Index * portability: Escape Sequences. (line 94) * portability, #! (executable scripts): Executable Scripts. (line 33) * portability, ** operator and: Arithmetic Ops. (line 81) -* portability, **= operator and: Assignment Ops. (line 142) +* portability, **= operator and: Assignment Ops. (line 143) * portability, ARGV variable: Executable Scripts. (line 42) * portability, backslash continuation and: Statements/Lines. (line 30) * portability, backslash in escape sequences: Escape Sequences. @@ -32354,10 +32358,10 @@ Index * positional specifiers, printf statement, mixing with regular formats: Printf Ordering. (line 57) * positive zero: Unexpected Results. (line 34) -* POSIX awk <1>: Assignment Ops. (line 136) +* POSIX awk <1>: Assignment Ops. (line 137) * POSIX awk: This Manual. (line 14) * POSIX awk, ** operator and: Precedence. (line 98) -* POSIX awk, **= operator and: Assignment Ops. (line 142) +* POSIX awk, **= operator and: Assignment Ops. (line 143) * POSIX awk, < operator and: Getline/File. (line 26) * POSIX awk, arithmetic operators and: Arithmetic Ops. (line 30) * POSIX awk, backslashes in string constants: Escape Sequences. @@ -32547,7 +32551,7 @@ Index (line 102) * regexp constants <2>: Regexp Constants. (line 6) * regexp constants: Regexp Usage. (line 57) -* regexp constants, /=.../, /= operator and: Assignment Ops. (line 147) +* regexp constants, /=.../, /= operator and: Assignment Ops. (line 148) * regexp constants, as patterns: Expression Patterns. (line 34) * regexp constants, in gawk: Using Constant Regexps. 
(line 28) @@ -32744,7 +32748,7 @@ Index * side effects, conditional expressions: Conditional Exp. (line 22) * side effects, decrement/increment operators: Increment Ops. (line 11) * side effects, FILENAME variable: Getline Notes. (line 19) -* side effects, function calls: Function Calls. (line 54) +* side effects, function calls: Function Calls. (line 56) * side effects, statements: Action Overview. (line 32) * sidebar, A Constant's Base Does Not Affect Its Value: Nondecimal-numbers. (line 64) @@ -32770,7 +32774,7 @@ Index * sidebar, So Why Does gawk have BEGINFILE and ENDFILE?: Filetrans Function. (line 83) * sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops. - (line 145) + (line 146) * sidebar, Understanding $0: Changing Fields. (line 134) * sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps. (line 57) @@ -32918,7 +32922,7 @@ Index * switch statement: Switch Statement. (line 6) * SYMTAB array: Auto-set. (line 283) * syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops. - (line 147) + (line 148) * system: I/O Functions. (line 72) * systime: Time Functions. (line 66) * t debugger command (alias for tbreak): Breakpoint Control. (line 90) @@ -32990,7 +32994,7 @@ Index * troubleshooting, fatal errors, printf format strings: Format Modifiers. (line 159) * troubleshooting, fflush() function: I/O Functions. (line 60) -* troubleshooting, function call syntax: Function Calls. (line 28) +* troubleshooting, function call syntax: Function Calls. (line 30) * troubleshooting, gawk: Compatibility Mode. (line 6) * troubleshooting, gawk, bug reports: Bugs. (line 9) * troubleshooting, gawk, fatal errors, function arguments: Calling Built-in. 
@@ -33318,400 +33322,400 @@ Node: Values299487 Node: Constants300163 Node: Scalar Constants300843 Ref: Scalar Constants-Footnote-1301702 -Node: Nondecimal-numbers301884 -Node: Regexp Constants304884 -Node: Using Constant Regexps305359 -Node: Variables308414 -Node: Using Variables309069 -Node: Assignment Options310793 -Node: Conversion312668 -Ref: table-locale-affects318168 -Ref: Conversion-Footnote-1318792 -Node: All Operators318901 -Node: Arithmetic Ops319531 -Node: Concatenation322036 -Ref: Concatenation-Footnote-1324824 -Node: Assignment Ops324944 -Ref: table-assign-ops329932 -Node: Increment Ops331263 -Node: Truth Values and Conditions334697 -Node: Truth Values335780 -Node: Typing and Comparison336829 -Node: Variable Typing337622 -Ref: Variable Typing-Footnote-1341519 -Node: Comparison Operators341641 -Ref: table-relational-ops342051 -Node: POSIX String Comparison345599 -Ref: POSIX String Comparison-Footnote-1346555 -Node: Boolean Ops346693 -Ref: Boolean Ops-Footnote-1350763 -Node: Conditional Exp350854 -Node: Function Calls352586 -Node: Precedence356180 -Node: Locales359849 -Node: Patterns and Actions360938 -Node: Pattern Overview361992 -Node: Regexp Patterns363661 -Node: Expression Patterns364204 -Node: Ranges367985 -Node: BEGIN/END371089 -Node: Using BEGIN/END371851 -Ref: Using BEGIN/END-Footnote-1374587 -Node: I/O And BEGIN/END374693 -Node: BEGINFILE/ENDFILE376975 -Node: Empty379889 -Node: Using Shell Variables380206 -Node: Action Overview382491 -Node: Statements384848 -Node: If Statement386702 -Node: While Statement388201 -Node: Do Statement390245 -Node: For Statement391401 -Node: Switch Statement394553 -Node: Break Statement396707 -Node: Continue Statement398697 -Node: Next Statement400490 -Node: Nextfile Statement402880 -Node: Exit Statement405535 -Node: Built-in Variables407951 -Node: User-modified409046 -Ref: User-modified-Footnote-1417404 -Node: Auto-set417466 -Ref: Auto-set-Footnote-1430925 -Ref: Auto-set-Footnote-2431130 -Node: ARGC and ARGV431186 
-Node: Arrays435040 -Node: Array Basics436545 -Node: Array Intro437371 -Node: Reference to Elements441688 -Node: Assigning Elements443958 -Node: Array Example444449 -Node: Scanning an Array446181 -Node: Controlling Scanning448495 -Ref: Controlling Scanning-Footnote-1453582 -Node: Delete453898 -Ref: Delete-Footnote-1456663 -Node: Numeric Array Subscripts456720 -Node: Uninitialized Subscripts458903 -Node: Multidimensional460530 -Node: Multiscanning463623 -Node: Arrays of Arrays465212 -Node: Functions469852 -Node: Built-in470671 -Node: Calling Built-in471749 -Node: Numeric Functions473737 -Ref: Numeric Functions-Footnote-1477571 -Ref: Numeric Functions-Footnote-2477928 -Ref: Numeric Functions-Footnote-3477976 -Node: String Functions478245 -Ref: String Functions-Footnote-1501248 -Ref: String Functions-Footnote-2501377 -Ref: String Functions-Footnote-3501625 -Node: Gory Details501712 -Ref: table-sub-escapes503391 -Ref: table-sub-posix-92504745 -Ref: table-sub-proposed506096 -Ref: table-posix-sub507450 -Ref: table-gensub-escapes508995 -Ref: Gory Details-Footnote-1510171 -Ref: Gory Details-Footnote-2510222 -Node: I/O Functions510373 -Ref: I/O Functions-Footnote-1517369 -Node: Time Functions517516 -Ref: Time Functions-Footnote-1528509 -Ref: Time Functions-Footnote-2528577 -Ref: Time Functions-Footnote-3528735 -Ref: Time Functions-Footnote-4528846 -Ref: Time Functions-Footnote-5528958 -Ref: Time Functions-Footnote-6529185 -Node: Bitwise Functions529451 -Ref: table-bitwise-ops530013 -Ref: Bitwise Functions-Footnote-1534258 -Node: Type Functions534442 -Node: I18N Functions535593 -Node: User-defined537245 -Node: Definition Syntax538049 -Ref: Definition Syntax-Footnote-1542963 -Node: Function Example543032 -Ref: Function Example-Footnote-1545681 -Node: Function Caveats545703 -Node: Calling A Function546221 -Node: Variable Scope547176 -Node: Pass By Value/Reference550139 -Node: Return Statement553647 -Node: Dynamic Typing556628 -Node: Indirect Calls557559 -Node: Library 
Functions567246 -Ref: Library Functions-Footnote-1570759 -Ref: Library Functions-Footnote-2570902 -Node: Library Names571073 -Ref: Library Names-Footnote-1574546 -Ref: Library Names-Footnote-2574766 -Node: General Functions574852 -Node: Strtonum Function575880 -Node: Assert Function578810 -Node: Round Function582136 -Node: Cliff Random Function583677 -Node: Ordinal Functions584693 -Ref: Ordinal Functions-Footnote-1587770 -Ref: Ordinal Functions-Footnote-2588022 -Node: Join Function588233 -Ref: Join Function-Footnote-1590004 -Node: Getlocaltime Function590204 -Node: Readfile Function593945 -Node: Data File Management595784 -Node: Filetrans Function596416 -Node: Rewind Function600485 -Node: File Checking601872 -Node: Empty Files602966 -Node: Ignoring Assigns605196 -Node: Getopt Function606750 -Ref: Getopt Function-Footnote-1618053 -Node: Passwd Functions618256 -Ref: Passwd Functions-Footnote-1627234 -Node: Group Functions627322 -Node: Walking Arrays635406 -Node: Sample Programs637542 -Node: Running Examples638216 -Node: Clones638944 -Node: Cut Program640168 -Node: Egrep Program650019 -Ref: Egrep Program-Footnote-1657792 -Node: Id Program657902 -Node: Split Program661551 -Ref: Split Program-Footnote-1665070 -Node: Tee Program665198 -Node: Uniq Program668001 -Node: Wc Program675430 -Ref: Wc Program-Footnote-1679696 -Ref: Wc Program-Footnote-2679896 -Node: Miscellaneous Programs679988 -Node: Dupword Program681176 -Node: Alarm Program683207 -Node: Translate Program688014 -Ref: Translate Program-Footnote-1692401 -Ref: Translate Program-Footnote-2692649 -Node: Labels Program692783 -Ref: Labels Program-Footnote-1696154 -Node: Word Sorting696238 -Node: History Sorting700122 -Node: Extract Program701961 -Ref: Extract Program-Footnote-1709464 -Node: Simple Sed709592 -Node: Igawk Program712654 -Ref: Igawk Program-Footnote-1727825 -Ref: Igawk Program-Footnote-2728026 -Node: Anagram Program728164 -Node: Signature Program731232 -Node: Advanced Features732332 -Node: Nondecimal 
Data734218 -Node: Array Sorting735801 -Node: Controlling Array Traversal736498 -Node: Array Sorting Functions744782 -Ref: Array Sorting Functions-Footnote-1748651 -Node: Two-way I/O748845 -Ref: Two-way I/O-Footnote-1754277 -Node: TCP/IP Networking754359 -Node: Profiling757203 -Node: Internationalization764706 -Node: I18N and L10N766131 -Node: Explaining gettext766817 -Ref: Explaining gettext-Footnote-1771885 -Ref: Explaining gettext-Footnote-2772069 -Node: Programmer i18n772234 -Node: Translator i18n776461 -Node: String Extraction777255 -Ref: String Extraction-Footnote-1778216 -Node: Printf Ordering778302 -Ref: Printf Ordering-Footnote-1781084 -Node: I18N Portability781148 -Ref: I18N Portability-Footnote-1783597 -Node: I18N Example783660 -Ref: I18N Example-Footnote-1786298 -Node: Gawk I18N786370 -Node: Debugger786991 -Node: Debugging787962 -Node: Debugging Concepts788395 -Node: Debugging Terms790251 -Node: Awk Debugging792848 -Node: Sample Debugging Session793740 -Node: Debugger Invocation794260 -Node: Finding The Bug795593 -Node: List of Debugger Commands802080 -Node: Breakpoint Control803414 -Node: Debugger Execution Control807078 -Node: Viewing And Changing Data810438 -Node: Execution Stack813794 -Node: Debugger Info815261 -Node: Miscellaneous Debugger Commands819255 -Node: Readline Support824433 -Node: Limitations825264 -Node: Arbitrary Precision Arithmetic827516 -Ref: Arbitrary Precision Arithmetic-Footnote-1829165 -Node: General Arithmetic829313 -Node: Floating Point Issues831033 -Node: String Conversion Precision831914 -Ref: String Conversion Precision-Footnote-1833619 -Node: Unexpected Results833728 -Node: POSIX Floating Point Problems835881 -Ref: POSIX Floating Point Problems-Footnote-1839706 -Node: Integer Programming839744 -Node: Floating-point Programming841483 -Ref: Floating-point Programming-Footnote-1847814 -Ref: Floating-point Programming-Footnote-2848084 -Node: Floating-point Representation848348 -Node: Floating-point Context849513 -Ref: 
table-ieee-formats850352 -Node: Rounding Mode851736 -Ref: table-rounding-modes852215 -Ref: Rounding Mode-Footnote-1855230 -Node: Gawk and MPFR855409 -Node: Arbitrary Precision Floats856818 -Ref: Arbitrary Precision Floats-Footnote-1859261 -Node: Setting Precision859577 -Ref: table-predefined-precision-strings860263 -Node: Setting Rounding Mode862408 -Ref: table-gawk-rounding-modes862812 -Node: Floating-point Constants863999 -Node: Changing Precision865428 -Ref: Changing Precision-Footnote-1866825 -Node: Exact Arithmetic866999 -Node: Arbitrary Precision Integers870137 -Ref: Arbitrary Precision Integers-Footnote-1873152 -Node: Dynamic Extensions873299 -Node: Extension Intro874757 -Node: Plugin License876022 -Node: Extension Mechanism Outline876707 -Ref: load-extension877124 -Ref: load-new-function878602 -Ref: call-new-function879597 -Node: Extension API Description881612 -Node: Extension API Functions Introduction882899 -Node: General Data Types887826 -Ref: General Data Types-Footnote-1893521 -Node: Requesting Values893820 -Ref: table-value-types-returned894557 -Node: Memory Allocation Functions895511 -Ref: Memory Allocation Functions-Footnote-1898257 -Node: Constructor Functions898353 -Node: Registration Functions900111 -Node: Extension Functions900796 -Node: Exit Callback Functions903098 -Node: Extension Version String904347 -Node: Input Parsers904997 -Node: Output Wrappers914754 -Node: Two-way processors919264 -Node: Printing Messages921472 -Ref: Printing Messages-Footnote-1922549 -Node: Updating `ERRNO'922701 -Node: Accessing Parameters923440 -Node: Symbol Table Access924670 -Node: Symbol table by name925184 -Node: Symbol table by cookie927160 -Ref: Symbol table by cookie-Footnote-1931292 -Node: Cached values931355 -Ref: Cached values-Footnote-1934845 -Node: Array Manipulation934936 -Ref: Array Manipulation-Footnote-1936034 -Node: Array Data Types936073 -Ref: Array Data Types-Footnote-1938776 -Node: Array Functions938868 -Node: Flattening Arrays942704 -Node: 
Creating Arrays949556 -Node: Extension API Variables954281 -Node: Extension Versioning954917 -Node: Extension API Informational Variables956818 -Node: Extension API Boilerplate957904 -Node: Finding Extensions961708 -Node: Extension Example962268 -Node: Internal File Description962998 -Node: Internal File Ops967089 -Ref: Internal File Ops-Footnote-1978598 -Node: Using Internal File Ops978738 -Ref: Using Internal File Ops-Footnote-1981085 -Node: Extension Samples981351 -Node: Extension Sample File Functions982875 -Node: Extension Sample Fnmatch991362 -Node: Extension Sample Fork993131 -Node: Extension Sample Inplace994344 -Node: Extension Sample Ord996122 -Node: Extension Sample Readdir996958 -Node: Extension Sample Revout998490 -Node: Extension Sample Rev2way999083 -Node: Extension Sample Read write array999773 -Node: Extension Sample Readfile1001656 -Node: Extension Sample API Tests1002756 -Node: Extension Sample Time1003281 -Node: gawkextlib1004645 -Node: Language History1007426 -Node: V7/SVR3.11009019 -Node: SVR41011339 -Node: POSIX1012781 -Node: BTL1014167 -Node: POSIX/GNU1014901 -Node: Feature History1020500 -Node: Common Extensions1033476 -Node: Ranges and Locales1034788 -Ref: Ranges and Locales-Footnote-11039405 -Ref: Ranges and Locales-Footnote-21039432 -Ref: Ranges and Locales-Footnote-31039666 -Node: Contributors1039887 -Node: Installation1045268 -Node: Gawk Distribution1046162 -Node: Getting1046646 -Node: Extracting1047472 -Node: Distribution contents1049164 -Node: Unix Installation1054885 -Node: Quick Installation1055502 -Node: Additional Configuration Options1057948 -Node: Configuration Philosophy1059684 -Node: Non-Unix Installation1062038 -Node: PC Installation1062496 -Node: PC Binary Installation1063795 -Node: PC Compiling1065643 -Node: PC Testing1068587 -Node: PC Using1069763 -Node: Cygwin1073931 -Node: MSYS1074740 -Node: VMS Installation1075254 -Node: VMS Compilation1076050 -Ref: VMS Compilation-Footnote-11077302 -Node: VMS Dynamic Extensions1077360 
-Node: VMS Installation Details1078733 -Node: VMS Running1080984 -Node: VMS GNV1083818 -Node: VMS Old Gawk1084541 -Node: Bugs1085011 -Node: Other Versions1088929 -Node: Notes1095013 -Node: Compatibility Mode1095813 -Node: Additions1096596 -Node: Accessing The Source1097523 -Node: Adding Code1098963 -Node: New Ports1105008 -Node: Derived Files1109143 -Ref: Derived Files-Footnote-11114464 -Ref: Derived Files-Footnote-21114498 -Ref: Derived Files-Footnote-31115098 -Node: Future Extensions1115196 -Node: Implementation Limitations1115779 -Node: Extension Design1117031 -Node: Old Extension Problems1118185 -Ref: Old Extension Problems-Footnote-11119693 -Node: Extension New Mechanism Goals1119750 -Ref: Extension New Mechanism Goals-Footnote-11123115 -Node: Extension Other Design Decisions1123301 -Node: Extension Future Growth1125407 -Node: Old Extension Mechanism1126243 -Node: Basic Concepts1127983 -Node: Basic High Level1128664 -Ref: figure-general-flow1128936 -Ref: figure-process-flow1129535 -Ref: Basic High Level-Footnote-11132764 -Node: Basic Data Typing1132949 -Node: Glossary1136304 -Node: Copying1161535 -Node: GNU Free Documentation License1199091 -Node: Index1224227 +Node: Nondecimal-numbers301952 +Node: Regexp Constants304952 +Node: Using Constant Regexps305427 +Node: Variables308497 +Node: Using Variables309152 +Node: Assignment Options310876 +Node: Conversion312751 +Ref: table-locale-affects318187 +Ref: Conversion-Footnote-1318811 +Node: All Operators318920 +Node: Arithmetic Ops319550 +Node: Concatenation322055 +Ref: Concatenation-Footnote-1324851 +Node: Assignment Ops324971 +Ref: table-assign-ops329954 +Node: Increment Ops331271 +Node: Truth Values and Conditions334709 +Node: Truth Values335792 +Node: Typing and Comparison336841 +Node: Variable Typing337634 +Ref: Variable Typing-Footnote-1341534 +Node: Comparison Operators341656 +Ref: table-relational-ops342066 +Node: POSIX String Comparison345614 +Ref: POSIX String Comparison-Footnote-1346698 +Node: Boolean 
Ops346836 +Ref: Boolean Ops-Footnote-1350906 +Node: Conditional Exp350997 +Node: Function Calls352724 +Node: Precedence356482 +Node: Locales360151 +Node: Patterns and Actions361240 +Node: Pattern Overview362294 +Node: Regexp Patterns363971 +Node: Expression Patterns364514 +Node: Ranges368295 +Node: BEGIN/END371401 +Node: Using BEGIN/END372163 +Ref: Using BEGIN/END-Footnote-1374899 +Node: I/O And BEGIN/END375005 +Node: BEGINFILE/ENDFILE377290 +Node: Empty380226 +Node: Using Shell Variables380543 +Node: Action Overview382826 +Node: Statements385171 +Node: If Statement387025 +Node: While Statement388524 +Node: Do Statement390568 +Node: For Statement391724 +Node: Switch Statement394876 +Node: Break Statement396979 +Node: Continue Statement399034 +Node: Next Statement400827 +Node: Nextfile Statement403217 +Node: Exit Statement405872 +Node: Built-in Variables408274 +Node: User-modified409369 +Ref: User-modified-Footnote-1417727 +Node: Auto-set417789 +Ref: Auto-set-Footnote-1431248 +Ref: Auto-set-Footnote-2431453 +Node: ARGC and ARGV431509 +Node: Arrays435363 +Node: Array Basics436868 +Node: Array Intro437694 +Node: Reference to Elements442011 +Node: Assigning Elements444281 +Node: Array Example444772 +Node: Scanning an Array446504 +Node: Controlling Scanning448818 +Ref: Controlling Scanning-Footnote-1453905 +Node: Delete454221 +Ref: Delete-Footnote-1456986 +Node: Numeric Array Subscripts457043 +Node: Uninitialized Subscripts459226 +Node: Multidimensional460853 +Node: Multiscanning463946 +Node: Arrays of Arrays465535 +Node: Functions470175 +Node: Built-in470994 +Node: Calling Built-in472072 +Node: Numeric Functions474060 +Ref: Numeric Functions-Footnote-1477894 +Ref: Numeric Functions-Footnote-2478251 +Ref: Numeric Functions-Footnote-3478299 +Node: String Functions478568 +Ref: String Functions-Footnote-1501571 +Ref: String Functions-Footnote-2501700 +Ref: String Functions-Footnote-3501948 +Node: Gory Details502035 +Ref: table-sub-escapes503714 +Ref: 
table-sub-posix-92505068 +Ref: table-sub-proposed506419 +Ref: table-posix-sub507773 +Ref: table-gensub-escapes509318 +Ref: Gory Details-Footnote-1510494 +Ref: Gory Details-Footnote-2510545 +Node: I/O Functions510696 +Ref: I/O Functions-Footnote-1517692 +Node: Time Functions517839 +Ref: Time Functions-Footnote-1528832 +Ref: Time Functions-Footnote-2528900 +Ref: Time Functions-Footnote-3529058 +Ref: Time Functions-Footnote-4529169 +Ref: Time Functions-Footnote-5529281 +Ref: Time Functions-Footnote-6529508 +Node: Bitwise Functions529774 +Ref: table-bitwise-ops530336 +Ref: Bitwise Functions-Footnote-1534581 +Node: Type Functions534765 +Node: I18N Functions535916 +Node: User-defined537568 +Node: Definition Syntax538372 +Ref: Definition Syntax-Footnote-1543286 +Node: Function Example543355 +Ref: Function Example-Footnote-1546004 +Node: Function Caveats546026 +Node: Calling A Function546544 +Node: Variable Scope547499 +Node: Pass By Value/Reference550462 +Node: Return Statement553970 +Node: Dynamic Typing556951 +Node: Indirect Calls557882 +Node: Library Functions567569 +Ref: Library Functions-Footnote-1571082 +Ref: Library Functions-Footnote-2571225 +Node: Library Names571396 +Ref: Library Names-Footnote-1574869 +Ref: Library Names-Footnote-2575089 +Node: General Functions575175 +Node: Strtonum Function576203 +Node: Assert Function579133 +Node: Round Function582459 +Node: Cliff Random Function584000 +Node: Ordinal Functions585016 +Ref: Ordinal Functions-Footnote-1588093 +Ref: Ordinal Functions-Footnote-2588345 +Node: Join Function588556 +Ref: Join Function-Footnote-1590327 +Node: Getlocaltime Function590527 +Node: Readfile Function594268 +Node: Data File Management596107 +Node: Filetrans Function596739 +Node: Rewind Function600808 +Node: File Checking602195 +Node: Empty Files603289 +Node: Ignoring Assigns605519 +Node: Getopt Function607073 +Ref: Getopt Function-Footnote-1618376 +Node: Passwd Functions618579 +Ref: Passwd Functions-Footnote-1627557 +Node: Group 
Functions627645 +Node: Walking Arrays635729 +Node: Sample Programs637865 +Node: Running Examples638539 +Node: Clones639267 +Node: Cut Program640491 +Node: Egrep Program650342 +Ref: Egrep Program-Footnote-1658115 +Node: Id Program658225 +Node: Split Program661874 +Ref: Split Program-Footnote-1665393 +Node: Tee Program665521 +Node: Uniq Program668324 +Node: Wc Program675753 +Ref: Wc Program-Footnote-1680019 +Ref: Wc Program-Footnote-2680219 +Node: Miscellaneous Programs680311 +Node: Dupword Program681499 +Node: Alarm Program683530 +Node: Translate Program688337 +Ref: Translate Program-Footnote-1692724 +Ref: Translate Program-Footnote-2692972 +Node: Labels Program693106 +Ref: Labels Program-Footnote-1696477 +Node: Word Sorting696561 +Node: History Sorting700445 +Node: Extract Program702284 +Ref: Extract Program-Footnote-1709787 +Node: Simple Sed709915 +Node: Igawk Program712977 +Ref: Igawk Program-Footnote-1728148 +Ref: Igawk Program-Footnote-2728349 +Node: Anagram Program728487 +Node: Signature Program731555 +Node: Advanced Features732655 +Node: Nondecimal Data734541 +Node: Array Sorting736124 +Node: Controlling Array Traversal736821 +Node: Array Sorting Functions745105 +Ref: Array Sorting Functions-Footnote-1748974 +Node: Two-way I/O749168 +Ref: Two-way I/O-Footnote-1754600 +Node: TCP/IP Networking754682 +Node: Profiling757526 +Node: Internationalization765029 +Node: I18N and L10N766454 +Node: Explaining gettext767140 +Ref: Explaining gettext-Footnote-1772208 +Ref: Explaining gettext-Footnote-2772392 +Node: Programmer i18n772557 +Node: Translator i18n776784 +Node: String Extraction777578 +Ref: String Extraction-Footnote-1778539 +Node: Printf Ordering778625 +Ref: Printf Ordering-Footnote-1781407 +Node: I18N Portability781471 +Ref: I18N Portability-Footnote-1783920 +Node: I18N Example783983 +Ref: I18N Example-Footnote-1786621 +Node: Gawk I18N786693 +Node: Debugger787314 +Node: Debugging788285 +Node: Debugging Concepts788718 +Node: Debugging Terms790574 +Node: Awk 
Debugging793171 +Node: Sample Debugging Session794063 +Node: Debugger Invocation794583 +Node: Finding The Bug795916 +Node: List of Debugger Commands802403 +Node: Breakpoint Control803737 +Node: Debugger Execution Control807401 +Node: Viewing And Changing Data810761 +Node: Execution Stack814117 +Node: Debugger Info815584 +Node: Miscellaneous Debugger Commands819578 +Node: Readline Support824756 +Node: Limitations825587 +Node: Arbitrary Precision Arithmetic827839 +Ref: Arbitrary Precision Arithmetic-Footnote-1829488 +Node: General Arithmetic829636 +Node: Floating Point Issues831356 +Node: String Conversion Precision832237 +Ref: String Conversion Precision-Footnote-1833942 +Node: Unexpected Results834051 +Node: POSIX Floating Point Problems836204 +Ref: POSIX Floating Point Problems-Footnote-1840029 +Node: Integer Programming840067 +Node: Floating-point Programming841806 +Ref: Floating-point Programming-Footnote-1848137 +Ref: Floating-point Programming-Footnote-2848407 +Node: Floating-point Representation848671 +Node: Floating-point Context849836 +Ref: table-ieee-formats850675 +Node: Rounding Mode852059 +Ref: table-rounding-modes852538 +Ref: Rounding Mode-Footnote-1855553 +Node: Gawk and MPFR855732 +Node: Arbitrary Precision Floats857141 +Ref: Arbitrary Precision Floats-Footnote-1859584 +Node: Setting Precision859900 +Ref: table-predefined-precision-strings860586 +Node: Setting Rounding Mode862731 +Ref: table-gawk-rounding-modes863135 +Node: Floating-point Constants864322 +Node: Changing Precision865751 +Ref: Changing Precision-Footnote-1867148 +Node: Exact Arithmetic867322 +Node: Arbitrary Precision Integers870460 +Ref: Arbitrary Precision Integers-Footnote-1873475 +Node: Dynamic Extensions873622 +Node: Extension Intro875080 +Node: Plugin License876345 +Node: Extension Mechanism Outline877030 +Ref: load-extension877447 +Ref: load-new-function878925 +Ref: call-new-function879920 +Node: Extension API Description881935 +Node: Extension API Functions Introduction883222 
+Node: General Data Types888149 +Ref: General Data Types-Footnote-1893844 +Node: Requesting Values894143 +Ref: table-value-types-returned894880 +Node: Memory Allocation Functions895834 +Ref: Memory Allocation Functions-Footnote-1898580 +Node: Constructor Functions898676 +Node: Registration Functions900434 +Node: Extension Functions901119 +Node: Exit Callback Functions903421 +Node: Extension Version String904670 +Node: Input Parsers905320 +Node: Output Wrappers915077 +Node: Two-way processors919587 +Node: Printing Messages921795 +Ref: Printing Messages-Footnote-1922872 +Node: Updating `ERRNO'923024 +Node: Accessing Parameters923763 +Node: Symbol Table Access924993 +Node: Symbol table by name925507 +Node: Symbol table by cookie927483 +Ref: Symbol table by cookie-Footnote-1931615 +Node: Cached values931678 +Ref: Cached values-Footnote-1935168 +Node: Array Manipulation935259 +Ref: Array Manipulation-Footnote-1936357 +Node: Array Data Types936396 +Ref: Array Data Types-Footnote-1939099 +Node: Array Functions939191 +Node: Flattening Arrays943027 +Node: Creating Arrays949879 +Node: Extension API Variables954604 +Node: Extension Versioning955240 +Node: Extension API Informational Variables957141 +Node: Extension API Boilerplate958227 +Node: Finding Extensions962031 +Node: Extension Example962591 +Node: Internal File Description963321 +Node: Internal File Ops967412 +Ref: Internal File Ops-Footnote-1978921 +Node: Using Internal File Ops979061 +Ref: Using Internal File Ops-Footnote-1981408 +Node: Extension Samples981674 +Node: Extension Sample File Functions983198 +Node: Extension Sample Fnmatch991685 +Node: Extension Sample Fork993454 +Node: Extension Sample Inplace994667 +Node: Extension Sample Ord996445 +Node: Extension Sample Readdir997281 +Node: Extension Sample Revout998813 +Node: Extension Sample Rev2way999406 +Node: Extension Sample Read write array1000096 +Node: Extension Sample Readfile1001979 +Node: Extension Sample API Tests1003079 +Node: Extension Sample 
Time1003604 +Node: gawkextlib1004968 +Node: Language History1007749 +Node: V7/SVR3.11009342 +Node: SVR41011662 +Node: POSIX1013104 +Node: BTL1014490 +Node: POSIX/GNU1015224 +Node: Feature History1020823 +Node: Common Extensions1033799 +Node: Ranges and Locales1035111 +Ref: Ranges and Locales-Footnote-11039728 +Ref: Ranges and Locales-Footnote-21039755 +Ref: Ranges and Locales-Footnote-31039989 +Node: Contributors1040210 +Node: Installation1045591 +Node: Gawk Distribution1046485 +Node: Getting1046969 +Node: Extracting1047795 +Node: Distribution contents1049487 +Node: Unix Installation1055208 +Node: Quick Installation1055825 +Node: Additional Configuration Options1058271 +Node: Configuration Philosophy1060007 +Node: Non-Unix Installation1062361 +Node: PC Installation1062819 +Node: PC Binary Installation1064118 +Node: PC Compiling1065966 +Node: PC Testing1068910 +Node: PC Using1070086 +Node: Cygwin1074254 +Node: MSYS1075063 +Node: VMS Installation1075577 +Node: VMS Compilation1076373 +Ref: VMS Compilation-Footnote-11077625 +Node: VMS Dynamic Extensions1077683 +Node: VMS Installation Details1079056 +Node: VMS Running1081307 +Node: VMS GNV1084141 +Node: VMS Old Gawk1084864 +Node: Bugs1085334 +Node: Other Versions1089252 +Node: Notes1095336 +Node: Compatibility Mode1096136 +Node: Additions1096919 +Node: Accessing The Source1097846 +Node: Adding Code1099286 +Node: New Ports1105331 +Node: Derived Files1109466 +Ref: Derived Files-Footnote-11114787 +Ref: Derived Files-Footnote-21114821 +Ref: Derived Files-Footnote-31115421 +Node: Future Extensions1115519 +Node: Implementation Limitations1116102 +Node: Extension Design1117350 +Node: Old Extension Problems1118504 +Ref: Old Extension Problems-Footnote-11120012 +Node: Extension New Mechanism Goals1120069 +Ref: Extension New Mechanism Goals-Footnote-11123434 +Node: Extension Other Design Decisions1123620 +Node: Extension Future Growth1125726 +Node: Old Extension Mechanism1126562 +Node: Basic Concepts1128302 +Node: Basic High 
Level1128983 +Ref: figure-general-flow1129255 +Ref: figure-process-flow1129854 +Ref: Basic High Level-Footnote-11133083 +Node: Basic Data Typing1133268 +Node: Glossary1136623 +Node: Copying1161854 +Node: GNU Free Documentation License1199410 +Node: Index1224546 End Tag Table diff --git a/doc/gawk.texi b/doc/gawk.texi index 32fce6be..bce3f203 100644 --- a/doc/gawk.texi +++ b/doc/gawk.texi @@ -9874,9 +9874,9 @@ have different forms, but are stored identically internally. A @dfn{numeric constant} stands for a number. This number can be an integer, a decimal fraction, or a number in scientific (exponential) notation.@footnote{The internal representation of all numbers, -including integers, uses double precision -floating-point numbers. -On most modern systems, these are in IEEE 754 standard format.} +including integers, uses double precision floating-point numbers. +On most modern systems, these are in IEEE 754 standard format. +@xref{Arbitrary Precision Arithmetic}, for much more information.} Here are some examples of numeric constants that all have the same value: @@ -10118,7 +10118,7 @@ upon the contents of the current input record. Constant regular expressions are also used as the first argument for the @code{gensub()}, @code{sub()}, and @code{gsub()} functions, as the second argument of the @code{match()} function, -and as the third argument of the @code{patsplit()} function +and as the third argument of the @code{split()} and @code{patsplit()} functions (@pxref{String Functions}). Modern implementations of @command{awk}, including @command{gawk}, allow the third argument of @code{split()} to be a regexp constant, but some @@ -10360,32 +10360,28 @@ specifies the output format to use when printing numbers with @code{print}. conversion from the semantics of printing. Both @code{CONVFMT} and @code{OFMT} have the same default value: @code{"%.6g"}. In the vast majority of cases, old @command{awk} programs do not change their behavior. 
-However, these semantics for @code{OFMT} are something to keep in mind if you must -port your new-style program to older implementations of @command{awk}. -We recommend -that instead of changing your programs, just port @command{gawk} itself. -@xref{Print}, -for more information on the @code{print} statement. - -And, once again, where you are can matter when it comes to converting -between numbers and strings. In @ref{Locales}, we mentioned that -the local character set and language (the locale) can affect how -@command{gawk} matches characters. The locale also affects numeric -formats. In particular, for @command{awk} programs, it affects the -decimal point character. The @code{"C"} locale, and most English-language -locales, use the period character (@samp{.}) as the decimal point. -However, many (if not most) European and non-English locales use the comma -(@samp{,}) as the decimal point character. +@xref{Print}, for more information on the @code{print} statement. + +Where you are can matter when it comes to converting between numbers and +strings. The local character set and language---the @dfn{locale}---can +affect numeric formats. In particular, for @command{awk} programs, +it affects the decimal point character and the thousands-separator +character. The @code{"C"} locale, and most English-language locales, +use the period character (@samp{.}) as the decimal point and don't +have a thousands separator. However, many (if not most) European and +non-English locales use the comma (@samp{,}) as the decimal point +character. European locales often use either a space or a period as +the thousands separator, if they have one. @cindex dark corner, locale's decimal point character The POSIX standard says that @command{awk} always uses the period as the decimal -point when reading the @command{awk} program source code, and for command-line -variable assignments (@pxref{Other Arguments}). 
-However, when interpreting input data, for @code{print} and @code{printf} output, -and for number to string conversion, the local decimal point character is used. -@value{DARKCORNER} -Here are some examples indicating the difference in behavior, -on a GNU/Linux system: +point when reading the @command{awk} program source code, and for +command-line variable assignments (@pxref{Other Arguments}). However, +when interpreting input data, for @code{print} and @code{printf} output, +and for number to string conversion, the local decimal point character +is used. @value{DARKCORNER} In all cases, numbers in source code and +in input data cannot have a thousands separator. Here are some examples +indicating the difference in behavior, on a GNU/Linux system: @example $ @kbd{export POSIXLY_CORRECT=1} @ii{Force POSIX behavior} @@ -10400,7 +10396,7 @@ $ @kbd{echo 4,321 | LC_ALL=en_DK.utf-8 gawk '@{ print $1 + 1 @}'} @end example @noindent -The @samp{en_DK.utf-8} locale is for English in Denmark, where the comma acts as +The @code{en_DK.utf-8} locale is for English in Denmark, where the comma acts as the decimal point separator. In the normal @code{"C"} locale, @command{gawk} treats @samp{4,321} as @samp{4}, while in the Danish locale, it's treated as the full number, 4.321. @@ -10547,7 +10543,7 @@ b * int(a / b) + (a % b) == a @end example One possibly undesirable effect of this definition of remainder is that -@code{@var{x} % @var{y}} is negative if @var{x} is negative. Thus: +@samp{@var{x} % @var{y}} is negative if @var{x} is negative. Thus: @example -17 % 8 = -1 @@ -10641,7 +10637,7 @@ BEGIN @{ @end example @noindent -It is not defined whether the assignment to @code{a} happens +It is not defined whether the second assignment to @code{a} happens before or after the value of @code{a} is retrieved for producing the concatenated value. The result could be either @samp{don't panic}, or @samp{panic panic}. @@ -10763,8 +10759,8 @@ element. (Such values are called @dfn{rvalues}.) 
@cindex variables, types of It is important to note that variables do @emph{not} have permanent types. -A variable's type is simply the type of whatever value it happens -to hold at the moment. In the following program fragment, the variable +A variable's type is simply the type of whatever value was last assigned +to it. In the following program fragment, the variable @code{foo} has a numeric value at first, and a string value later on: @example @@ -10865,6 +10861,7 @@ The indices of @code{bar} are practically guaranteed to be different, because and see @ref{Numeric Functions}, for more information). This example illustrates an important fact about assignment operators: the lefthand expression is only evaluated @emph{once}. + It is up to the implementation as to which expression is evaluated first, the lefthand or the righthand. Consider this example: @@ -10897,17 +10894,17 @@ to a number. @caption{Arithmetic Assignment Operators} @multitable @columnfractions .30 .70 @headitem Operator @tab Effect -@item @var{lvalue} @code{+=} @var{increment} @tab Adds @var{increment} to the value of @var{lvalue}. -@item @var{lvalue} @code{-=} @var{decrement} @tab Subtracts @var{decrement} from the value of @var{lvalue}. -@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiplies the value of @var{lvalue} by @var{coefficient}. -@item @var{lvalue} @code{/=} @var{divisor} @tab Divides the value of @var{lvalue} by @var{divisor}. -@item @var{lvalue} @code{%=} @var{modulus} @tab Sets @var{lvalue} to its remainder by @var{modulus}. +@item @var{lvalue} @code{+=} @var{increment} @tab Add @var{increment} to the value of @var{lvalue}. +@item @var{lvalue} @code{-=} @var{decrement} @tab Subtract @var{decrement} from the value of @var{lvalue}. +@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiply the value of @var{lvalue} by @var{coefficient}. +@item @var{lvalue} @code{/=} @var{divisor} @tab Divide the value of @var{lvalue} by @var{divisor}. 
+@item @var{lvalue} @code{%=} @var{modulus} @tab Set @var{lvalue} to its remainder by @var{modulus}. @cindex common extensions, @code{**=} operator @cindex extensions, common@comma{} @code{**=} operator @cindex @command{awk} language, POSIX version @cindex POSIX @command{awk} @item @var{lvalue} @code{^=} @var{power} @tab -@item @var{lvalue} @code{**=} @var{power} @tab Raises @var{lvalue} to the power @var{power}. @value{COMMONEXT} +@item @var{lvalue} @code{**=} @var{power} @tab Raise @var{lvalue} to the power @var{power}. @value{COMMONEXT} @end multitable @end float @@ -10957,10 +10954,8 @@ A workaround is: awk '/[=]=/' /dev/null @end example -@command{gawk} does not have this problem, -nor do the other -freely available versions described in -@ref{Other Versions}. +@command{gawk} does not have this problem; Brian Kernighan's @command{awk} +and @command{mawk} also do not (@pxref{Other Versions}). @docbook </sidebar> @@ -11005,10 +11000,8 @@ A workaround is: awk '/[=]=/' /dev/null @end example -@command{gawk} does not have this problem, -nor do the other -freely available versions described in -@ref{Other Versions}. +@command{gawk} does not have this problem; Brian Kernighan's @command{awk} +and @command{mawk} also do not (@pxref{Other Versions}). @end cartouche @end ifnotdocbook @c ENDOFRANGE exas @@ -11033,11 +11026,10 @@ are convenient abbreviations for very common operations. @cindex side effects, decrement/increment operators The operator used for adding one is written @samp{++}. It can be used to increment a variable either before or after taking its value. -To pre-increment a variable @code{v}, write @samp{++v}. This adds +To @dfn{pre-increment} a variable @code{v}, write @samp{++v}. This adds one to the value of @code{v}---that new value is also the value of the -expression. (The assignment expression @samp{v += 1} is completely -equivalent.) -Writing the @samp{++} after the variable specifies post-increment. This +expression. 
(The assignment expression @samp{v += 1} is completely equivalent.) +Writing the @samp{++} after the variable specifies @dfn{post-increment}. This increments the variable value just the same; the difference is that the value of the increment expression itself is the variable's @emph{old} value. Thus, if @code{foo} has the value four, then the expression @samp{foo++} @@ -11049,7 +11041,18 @@ The post-increment @samp{foo++} is nearly the same as writing @samp{(foo += 1) - 1}. It is not perfectly equivalent because all numbers in @command{awk} are floating-point---in floating-point, @samp{foo + 1 - 1} does not necessarily equal @code{foo}. But the difference is minute as -long as you stick to numbers that are fairly small (less than 10e12). +long as you stick to numbers that are fairly small (less than +@iftex +@math{10^12}). +@end iftex +@ifnottex +@ifnotdocbook +10e12). +@end ifnotdocbook +@end ifnottex +@docbook +10<superscript>12</superscript>). @c +@end docbook @cindex @code{$} (dollar sign), incrementing fields and arrays @cindex dollar sign (@code{$}), incrementing fields and arrays @@ -11295,6 +11298,7 @@ like a number---for example, @code{@w{" +2"}}. This concept is used for determining the type of a variable. The type of the variable is important because the types of two variables determine how they are compared. + The various versions of the POSIX standard did not get the rules quite right for several editions. 
Fortunately, as of at least the 2008 standard (and possibly earlier), the standard has been fixed, @@ -11388,6 +11392,7 @@ STRNUM &&string &numeric &numeric\cr }}} @end tex @ifnottex +@ifnotdocbook @display +---------------------------------------------- | STRING NUMERIC STRNUM @@ -11400,7 +11405,51 @@ NUMERIC | string numeric numeric STRNUM | string numeric numeric --------+---------------------------------------------- @end display +@end ifnotdocbook @end ifnottex +@docbook +<informaltable> +<tgroup cols="4"> +<colspec colname="1" align="left"/> +<colspec colname="2" align="left"/> +<colspec colname="3" align="left"/> +<colspec colname="4" align="left"/> +<thead> +<row> +<entry/> +<entry>STRING</entry> +<entry>NUMERIC</entry> +<entry>STRNUM</entry> +</row> +</thead> + +<tbody> +<row> +<entry><emphasis role="bold">STRING</emphasis></entry> +<entry>string</entry> +<entry>string</entry> +<entry>string</entry> +</row> + +<row> +<entry><emphasis role="bold">NUMERIC</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +<row> +<entry><emphasis role="bold">STRNUM</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +</tbody> +</tgroup> +</informaltable> + +@end docbook The basic idea is that user input that looks numeric---and @emph{only} user input---should be treated as numeric, even though it is actually @@ -11419,8 +11468,8 @@ This point bears additional emphasis: All user input is made of characters, and so is first and foremost of @var{string} type; input strings that look numeric are additionally given the @var{strnum} attribute. Thus, the six-character input string @w{@samp{ +3.14}} receives the -@var{strnum} attribute. In contrast, the eight-character literal -@w{@code{" +3.14"}} appearing in program text is a string constant. +@var{strnum} attribute. In contrast, the eight characters +@w{@code{" +3.14"}} appearing in program text comprise a string constant. 
The following examples print @samp{1} when the comparison between the two different constants is true, @samp{0} otherwise: @@ -11606,7 +11655,9 @@ where this is discussed in more detail. @subsubsection String Comparison With POSIX Rules The POSIX standard says that string comparison is performed based -on the locale's collating order. This is usually very different +on the locale's @dfn{collating order}. This is the order in which +characters sort, as defined by the locale (for more discussion, +@pxref{Ranges and Locales}). This order is usually very different from the results obtained when doing straight character-by-character comparison.@footnote{Technically, string comparison is supposed to behave the same way as if the strings are compared with the C @@ -11614,7 +11665,7 @@ to behave the same way as if the strings are compared with the C Because this behavior differs considerably from existing practice, @command{gawk} only implements it when in POSIX mode (@pxref{Options}). -Here is an example to illustrate the difference, in an @samp{en_US.UTF-8} +Here is an example to illustrate the difference, in an @code{en_US.UTF-8} locale: @example @@ -11830,7 +11881,7 @@ However, putting a newline in front of either character does not work without using backslash continuation (@pxref{Statements/Lines}). If @option{--posix} is specified -(@pxref{Options}), then this extension is disabled. +(@pxref{Options}), this extension is disabled. @node Function Calls @section Function Calls @@ -11849,6 +11900,8 @@ functions and their descriptions. In addition, you can define functions for use in your program. @xref{User-defined}, for instructions on how to do this. +Finally, @command{gawk} lets you write functions in C or C++ +that may be called from your program: see @ref{Dynamic Extensions}. @cindex arguments, in function calls The way to use a function is with a @dfn{function call} expression, @@ -11899,12 +11952,12 @@ when you write the source code to your program. 
We defer discussion of this feature until later; see @ref{Indirect Calls}. @cindex side effects, function calls -Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of @samp{sqrt(@var{argument})} is the square root of -@var{argument}. -The following program reads numbers, one number per line, and prints the -square root of each one: +Like every other expression, the function call has a value, often +called the @dfn{return value}, which is computed by the function +based on the arguments you give it. In this example, the return value +of @samp{sqrt(@var{argument})} is the square root of @var{argument}. +The following program reads numbers, one number per line, and prints +the square root of each one: @example $ @kbd{awk '@{ print "The square root of", $1, "is", sqrt($1) @}'} @@ -12219,10 +12272,10 @@ A single expression. It matches when its value is nonzero (if a number) or non-null (if a string). (@xref{Expression Patterns}.) -@item @var{pat1}, @var{pat2} +@item @var{begpat}, @var{endpat} A pair of patterns separated by a comma, specifying a range of records. -The range includes both the initial record that matches @var{pat1} and -the final record that matches @var{pat2}. +The range includes both the initial record that matches @var{begpat} and +the final record that matches @var{endpat}. (@xref{Ranges}.) @item BEGIN @@ -12234,7 +12287,7 @@ Special patterns for you to supply startup or cleanup actions for your @item BEGINFILE @itemx ENDFILE Special patterns for you to supply startup or cleanup actions to be -done on a per file basis. +done on a per-file basis. (@xref{BEGINFILE/ENDFILE}.) @item @var{empty} @@ -12395,7 +12448,7 @@ input record. When a record matches @var{begpat}, the range pattern is @dfn{turned on} and the range pattern matches this record as well. As long as the range pattern stays turned on, it automatically matches every input record read. 
The range pattern also matches @var{endpat} against every -input record; when this succeeds, the range pattern is turned off again +input record; when this succeeds, the range pattern is @dfn{turned off} again for the following record. Then the range pattern goes back to checking @var{begpat} against each record. @@ -12549,7 +12602,7 @@ rule checks the @code{FNR} and @code{NR} variables. @subsubsection Input/Output from @code{BEGIN} and @code{END} Rules @cindex input/output, from @code{BEGIN} and @code{END} -There are several (sometimes subtle) points to remember when doing I/O +There are several (sometimes subtle) points to be aware of when doing I/O from a @code{BEGIN} or @code{END} rule. The first has to do with the value of @code{$0} in a @code{BEGIN} rule. Because @code{BEGIN} rules are executed before any input is read, @@ -12610,8 +12663,19 @@ This @value{SECTION} describes a @command{gawk}-specific feature. Two special kinds of rule, @code{BEGINFILE} and @code{ENDFILE}, give you ``hooks'' into @command{gawk}'s command-line file processing loop. -As with the @code{BEGIN} and @code{END} rules (@pxref{BEGIN/END}), all -@code{BEGINFILE} rules in a program are merged, in the order they are +As with the @code{BEGIN} and @code{END} rules +@ifnottex +@ifnotdocbook +(@pxref{BEGIN/END}), +@end ifnotdocbook +@end ifnottex +@iftex +(see the previous section), +@end iftex +@ifdocbook +(see the previous section), +@end ifdocbook +all @code{BEGINFILE} rules in a program are merged, in the order they are read by @command{gawk}, and all @code{ENDFILE} rules are merged as well. The body of the @code{BEGINFILE} rules is executed just before @@ -12639,10 +12703,11 @@ the file entirely. Otherwise, @command{gawk} exits with the usual fatal error. @item -If you have written extensions that modify the record handling (by inserting -an ``input parser''), you can invoke them at this point, before @command{gawk} -has started processing the file. 
(This is a @emph{very} advanced feature, -currently used only by the @uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) +If you have written extensions that modify the record handling (by +inserting an ``input parser,'' @pxref{Input Parsers}), you can invoke +them at this point, before @command{gawk} has started processing the file. +(This is a @emph{very} advanced feature, currently used only by the +@uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) @end itemize The @code{ENDFILE} rule is called when @command{gawk} has finished processing @@ -12725,7 +12790,7 @@ into the body of the @command{awk} program. @cindex shells, quoting The most common method is to use shell quoting to substitute the variable's value into the program inside the script. -For example, in the following program: +For example, consider the following program: @example printf "Enter search pattern: " @@ -12735,7 +12800,7 @@ awk "/$pattern/ "'@{ nmatches++ @} @end example @noindent -the @command{awk} program consists of two pieces of quoted text +The @command{awk} program consists of two pieces of quoted text that are concatenated together to form the program. The first part is double-quoted, which allows substitution of the @code{pattern} shell variable inside the quotes. @@ -12749,8 +12814,8 @@ match up the quotes when reading the program. A better method is to use @command{awk}'s variable assignment feature (@pxref{Assignment Options}) -to assign the shell variable's value to an @command{awk} variable's -value. Then use dynamic regexps to match the pattern +to assign the shell variable's value to an @command{awk} variable. +Then use dynamic regexps to match the pattern (@pxref{Computed Regexps}). 
The following shows how to redo the previous example using this technique: @@ -12803,7 +12868,7 @@ function @var{name}(@var{args}) @{ @dots{} @} @cindex @code{;} (semicolon), separating statements in actions @cindex semicolon (@code{;}), separating statements in actions An action consists of one or more @command{awk} @dfn{statements}, enclosed -in curly braces (@samp{@{@dots{}@}}). Each statement specifies one +in curly braces (@samp{@{@r{@dots{}}@}}). Each statement specifies one thing to do. The statements are separated by newlines or semicolons. The curly braces around an action must be used even if the action contains only one statement, or if it contains no statements at @@ -12833,10 +12898,9 @@ programs. The @command{awk} language gives you C-like constructs special ones (@pxref{Statements}). @item Compound statements -Consist of one or more statements enclosed in -curly braces. A compound statement is used in order to put several -statements together in the body of an @code{if}, @code{while}, @code{do}, -or @code{for} statement. +Enclose one or more statements in curly braces. A compound statement +is used in order to put several statements together in the body of an +@code{if}, @code{while}, @code{do}, or @code{for} statement. @item Input statements Use the @code{getline} command @@ -13170,6 +13234,8 @@ for more information on this version of the @code{for} loop. @cindex @code{default} keyword This @value{SECTION} describes a @command{gawk}-specific feature. +If @command{gawk} is in compatibility mode (@pxref{Options}), +it is not available. The @code{switch} statement allows the evaluation of an expression and the execution of statements based on a @code{case} match. Case statements @@ -13226,11 +13292,6 @@ the @code{print} statement is executed and then falls through into the the @minus{}1 case will also be executed since the @code{default} does not halt execution. -This @code{switch} statement is a @command{gawk} extension. 
-If @command{gawk} is in compatibility mode -(@pxref{Options}), -it is not available. - @node Break Statement @subsection The @code{break} Statement @cindex @code{break} statement @@ -13245,15 +13306,15 @@ numbers: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; div * div <= num; div++) @{ - if (num % div == 0) - break - @} - if (num % div == 0) - printf "Smallest divisor of %d is %d\n", num, div - else - printf "%d is prime\n", num + num = $1 + for (div = 2; div * div <= num; div++) @{ + if (num % div == 0) + break + @} + if (num % div == 0) + printf "Smallest divisor of %d is %d\n", num, div + else + printf "%d is prime\n", num @} @end example @@ -13271,17 +13332,17 @@ an @code{if}: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; ; div++) @{ - if (num % div == 0) @{ - printf "Smallest divisor of %d is %d\n", num, div - break - @} - if (div * div > num) @{ - printf "%d is prime\n", num - break + num = $1 + for (div = 2; ; div++) @{ + if (num % div == 0) @{ + printf "Smallest divisor of %d is %d\n", num, div + break + @} + if (div * div > num) @{ + printf "%d is prime\n", num + break + @} @} - @} @} @end example @@ -13430,16 +13491,14 @@ The @code{next} statement is not allowed inside @code{BEGINFILE} and @cindex POSIX @command{awk}, @code{next}/@code{nextfile} statements and @cindex @code{next} statement, user-defined functions and @cindex functions, user-defined, @code{next}/@code{nextfile} statements and -According to the POSIX standard, the behavior is undefined if -the @code{next} statement is used in a @code{BEGIN} or @code{END} rule. -@command{gawk} treats it as a syntax error. -Although POSIX permits it, -some other @command{awk} implementations don't allow the @code{next} -statement inside function bodies -(@pxref{User-defined}). -Just as with any other @code{next} statement, a @code{next} statement inside a -function body reads the next record and starts processing it with the -first rule in the program. 
+According to the POSIX standard, the behavior is undefined if the +@code{next} statement is used in a @code{BEGIN} or @code{END} rule. +@command{gawk} treats it as a syntax error. Although POSIX permits it, +most other @command{awk} implementations don't allow the @code{next} +statement inside function bodies (@pxref{User-defined}). Just as with any +other @code{next} statement, a @code{next} statement inside a function +body reads the next record and starts processing it with the first rule +in the program. @node Nextfile Statement @subsection The @code{nextfile} Statement @@ -13550,8 +13609,7 @@ status code for the @command{awk} process. If no argument is supplied, In the case where an argument is supplied to a first @code{exit} statement, and then @code{exit} is called a second time from an @code{END} rule with no argument, -@command{awk} uses the previously supplied exit value. -@value{DARKCORNER} +@command{awk} uses the previously supplied exit value. @value{DARKCORNER} @xref{Exit Status}, for more information. @cindex programming conventions, @code{exit} statement @@ -13563,12 +13621,12 @@ in the following example: @example BEGIN @{ - if (("date" | getline date_now) <= 0) @{ - print "Can't get system date" > "/dev/stderr" - exit 1 - @} - print "current date is", date_now - close("date") + if (("date" | getline date_now) <= 0) @{ + print "Can't get system date" > "/dev/stderr" + exit 1 + @} + print "current date is", date_now + close("date") @} @end example @@ -35095,7 +35153,7 @@ it on your system). @cindex Unicode Similar considerations apply to other ranges. For example, @samp{["-/]} is perfectly valid in ASCII, but is not valid in many Unicode locales, -such as @samp{en_US.UTF-8}. +such as @code{en_US.UTF-8}. Early versions of @command{gawk} used regexp matching code that was not locale aware, so ranges had their traditional interpretation. @@ -37545,7 +37603,7 @@ different limits. 
@multitable @columnfractions .40 .60 @headitem Item @tab Limit @item Characters in a character class @tab 2^(number of bits per byte) -@item Length of input record @tab @code{MAX_INT } +@item Length of input record @tab @code{MAX_INT} @item Length of output record @tab Unlimited @item Length of source line @tab Unlimited @item Number of fields in a record @tab @code{MAX_LONG} @@ -37554,9 +37612,9 @@ different limits. @item Number of input records total @tab @code{MAX_LONG} @item Number of pipe redirections @tab min(number of processes per user, number of open files) @item Numeric values @tab Double-precision floating point (if not using MPFR) -@item Size of a field @tab @code{MAX_INT } -@item Size of a literal string @tab @code{MAX_INT } -@item Size of a printf string @tab @code{MAX_INT } +@item Size of a field @tab @code{MAX_INT} +@item Size of a literal string @tab @code{MAX_INT} +@item Size of a printf string @tab @code{MAX_INT} @end multitable @node Extension Design diff --git a/doc/gawktexi.in b/doc/gawktexi.in index b7b0ee7d..ac3d0afe 100644 --- a/doc/gawktexi.in +++ b/doc/gawktexi.in @@ -9393,9 +9393,9 @@ have different forms, but are stored identically internally. A @dfn{numeric constant} stands for a number. This number can be an integer, a decimal fraction, or a number in scientific (exponential) notation.@footnote{The internal representation of all numbers, -including integers, uses double precision -floating-point numbers. -On most modern systems, these are in IEEE 754 standard format.} +including integers, uses double precision floating-point numbers. +On most modern systems, these are in IEEE 754 standard format. +@xref{Arbitrary Precision Arithmetic}, for much more information.} Here are some examples of numeric constants that all have the same value: @@ -9608,7 +9608,7 @@ upon the contents of the current input record. 
Constant regular expressions are also used as the first argument for the @code{gensub()}, @code{sub()}, and @code{gsub()} functions, as the second argument of the @code{match()} function, -and as the third argument of the @code{patsplit()} function +and as the third argument of the @code{split()} and @code{patsplit()} functions (@pxref{String Functions}). Modern implementations of @command{awk}, including @command{gawk}, allow the third argument of @code{split()} to be a regexp constant, but some @@ -9850,32 +9850,28 @@ specifies the output format to use when printing numbers with @code{print}. conversion from the semantics of printing. Both @code{CONVFMT} and @code{OFMT} have the same default value: @code{"%.6g"}. In the vast majority of cases, old @command{awk} programs do not change their behavior. -However, these semantics for @code{OFMT} are something to keep in mind if you must -port your new-style program to older implementations of @command{awk}. -We recommend -that instead of changing your programs, just port @command{gawk} itself. -@xref{Print}, -for more information on the @code{print} statement. - -And, once again, where you are can matter when it comes to converting -between numbers and strings. In @ref{Locales}, we mentioned that -the local character set and language (the locale) can affect how -@command{gawk} matches characters. The locale also affects numeric -formats. In particular, for @command{awk} programs, it affects the -decimal point character. The @code{"C"} locale, and most English-language -locales, use the period character (@samp{.}) as the decimal point. -However, many (if not most) European and non-English locales use the comma -(@samp{,}) as the decimal point character. +@xref{Print}, for more information on the @code{print} statement. + +Where you are can matter when it comes to converting between numbers and +strings. The local character set and language---the @dfn{locale}---can +affect numeric formats. 
In particular, for @command{awk} programs, +it affects the decimal point character and the thousands-separator +character. The @code{"C"} locale, and most English-language locales, +use the period character (@samp{.}) as the decimal point and don't +have a thousands separator. However, many (if not most) European and +non-English locales use the comma (@samp{,}) as the decimal point +character. European locales often use either a space or a period as +the thousands separator, if they have one. @cindex dark corner, locale's decimal point character The POSIX standard says that @command{awk} always uses the period as the decimal -point when reading the @command{awk} program source code, and for command-line -variable assignments (@pxref{Other Arguments}). -However, when interpreting input data, for @code{print} and @code{printf} output, -and for number to string conversion, the local decimal point character is used. -@value{DARKCORNER} -Here are some examples indicating the difference in behavior, -on a GNU/Linux system: +point when reading the @command{awk} program source code, and for +command-line variable assignments (@pxref{Other Arguments}). However, +when interpreting input data, for @code{print} and @code{printf} output, +and for number to string conversion, the local decimal point character +is used. @value{DARKCORNER} In all cases, numbers in source code and +in input data cannot have a thousands separator. Here are some examples +indicating the difference in behavior, on a GNU/Linux system: @example $ @kbd{export POSIXLY_CORRECT=1} @ii{Force POSIX behavior} @@ -9890,7 +9886,7 @@ $ @kbd{echo 4,321 | LC_ALL=en_DK.utf-8 gawk '@{ print $1 + 1 @}'} @end example @noindent -The @samp{en_DK.utf-8} locale is for English in Denmark, where the comma acts as +The @code{en_DK.utf-8} locale is for English in Denmark, where the comma acts as the decimal point separator. 
In the normal @code{"C"} locale, @command{gawk} treats @samp{4,321} as @samp{4}, while in the Danish locale, it's treated as the full number, 4.321. @@ -10037,7 +10033,7 @@ b * int(a / b) + (a % b) == a @end example One possibly undesirable effect of this definition of remainder is that -@code{@var{x} % @var{y}} is negative if @var{x} is negative. Thus: +@samp{@var{x} % @var{y}} is negative if @var{x} is negative. Thus: @example -17 % 8 = -1 @@ -10131,7 +10127,7 @@ BEGIN @{ @end example @noindent -It is not defined whether the assignment to @code{a} happens +It is not defined whether the second assignment to @code{a} happens before or after the value of @code{a} is retrieved for producing the concatenated value. The result could be either @samp{don't panic}, or @samp{panic panic}. @@ -10253,8 +10249,8 @@ element. (Such values are called @dfn{rvalues}.) @cindex variables, types of It is important to note that variables do @emph{not} have permanent types. -A variable's type is simply the type of whatever value it happens -to hold at the moment. In the following program fragment, the variable +A variable's type is simply the type of whatever value was last assigned +to it. In the following program fragment, the variable @code{foo} has a numeric value at first, and a string value later on: @example @@ -10355,6 +10351,7 @@ The indices of @code{bar} are practically guaranteed to be different, because and see @ref{Numeric Functions}, for more information). This example illustrates an important fact about assignment operators: the lefthand expression is only evaluated @emph{once}. + It is up to the implementation as to which expression is evaluated first, the lefthand or the righthand. Consider this example: @@ -10387,17 +10384,17 @@ to a number. @caption{Arithmetic Assignment Operators} @multitable @columnfractions .30 .70 @headitem Operator @tab Effect -@item @var{lvalue} @code{+=} @var{increment} @tab Adds @var{increment} to the value of @var{lvalue}. 
-@item @var{lvalue} @code{-=} @var{decrement} @tab Subtracts @var{decrement} from the value of @var{lvalue}. -@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiplies the value of @var{lvalue} by @var{coefficient}. -@item @var{lvalue} @code{/=} @var{divisor} @tab Divides the value of @var{lvalue} by @var{divisor}. -@item @var{lvalue} @code{%=} @var{modulus} @tab Sets @var{lvalue} to its remainder by @var{modulus}. +@item @var{lvalue} @code{+=} @var{increment} @tab Add @var{increment} to the value of @var{lvalue}. +@item @var{lvalue} @code{-=} @var{decrement} @tab Subtract @var{decrement} from the value of @var{lvalue}. +@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiply the value of @var{lvalue} by @var{coefficient}. +@item @var{lvalue} @code{/=} @var{divisor} @tab Divide the value of @var{lvalue} by @var{divisor}. +@item @var{lvalue} @code{%=} @var{modulus} @tab Set @var{lvalue} to its remainder by @var{modulus}. @cindex common extensions, @code{**=} operator @cindex extensions, common@comma{} @code{**=} operator @cindex @command{awk} language, POSIX version @cindex POSIX @command{awk} @item @var{lvalue} @code{^=} @var{power} @tab -@item @var{lvalue} @code{**=} @var{power} @tab Raises @var{lvalue} to the power @var{power}. @value{COMMONEXT} +@item @var{lvalue} @code{**=} @var{power} @tab Raise @var{lvalue} to the power @var{power}. @value{COMMONEXT} @end multitable @end float @@ -10442,10 +10439,8 @@ A workaround is: awk '/[=]=/' /dev/null @end example -@command{gawk} does not have this problem, -nor do the other -freely available versions described in -@ref{Other Versions}. +@command{gawk} does not have this problem; Brian Kernighan's @command{awk} +and @command{mawk} also do not (@pxref{Other Versions}). @end sidebar @c ENDOFRANGE exas @c ENDOFRANGE opas @@ -10469,11 +10464,10 @@ are convenient abbreviations for very common operations. @cindex side effects, decrement/increment operators The operator used for adding one is written @samp{++}. 
It can be used to increment a variable either before or after taking its value. -To pre-increment a variable @code{v}, write @samp{++v}. This adds +To @dfn{pre-increment} a variable @code{v}, write @samp{++v}. This adds one to the value of @code{v}---that new value is also the value of the -expression. (The assignment expression @samp{v += 1} is completely -equivalent.) -Writing the @samp{++} after the variable specifies post-increment. This +expression. (The assignment expression @samp{v += 1} is completely equivalent.) +Writing the @samp{++} after the variable specifies @dfn{post-increment}. This increments the variable value just the same; the difference is that the value of the increment expression itself is the variable's @emph{old} value. Thus, if @code{foo} has the value four, then the expression @samp{foo++} @@ -10485,7 +10479,18 @@ The post-increment @samp{foo++} is nearly the same as writing @samp{(foo += 1) - 1}. It is not perfectly equivalent because all numbers in @command{awk} are floating-point---in floating-point, @samp{foo + 1 - 1} does not necessarily equal @code{foo}. But the difference is minute as -long as you stick to numbers that are fairly small (less than 10e12). +long as you stick to numbers that are fairly small (less than +@iftex +@math{10^12}). +@end iftex +@ifnottex +@ifnotdocbook +10e12). +@end ifnotdocbook +@end ifnottex +@docbook +10<superscript>12</superscript>). @c +@end docbook @cindex @code{$} (dollar sign), incrementing fields and arrays @cindex dollar sign (@code{$}), incrementing fields and arrays @@ -10673,6 +10678,7 @@ like a number---for example, @code{@w{" +2"}}. This concept is used for determining the type of a variable. The type of the variable is important because the types of two variables determine how they are compared. + The various versions of the POSIX standard did not get the rules quite right for several editions. 
Fortunately, as of at least the 2008 standard (and possibly earlier), the standard has been fixed, @@ -10766,6 +10772,7 @@ STRNUM &&string &numeric &numeric\cr }}} @end tex @ifnottex +@ifnotdocbook @display +---------------------------------------------- | STRING NUMERIC STRNUM @@ -10778,7 +10785,51 @@ NUMERIC | string numeric numeric STRNUM | string numeric numeric --------+---------------------------------------------- @end display +@end ifnotdocbook @end ifnottex +@docbook +<informaltable> +<tgroup cols="4"> +<colspec colname="1" align="left"/> +<colspec colname="2" align="left"/> +<colspec colname="3" align="left"/> +<colspec colname="4" align="left"/> +<thead> +<row> +<entry/> +<entry>STRING</entry> +<entry>NUMERIC</entry> +<entry>STRNUM</entry> +</row> +</thead> + +<tbody> +<row> +<entry><emphasis role="bold">STRING</emphasis></entry> +<entry>string</entry> +<entry>string</entry> +<entry>string</entry> +</row> + +<row> +<entry><emphasis role="bold">NUMERIC</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +<row> +<entry><emphasis role="bold">STRNUM</emphasis></entry> +<entry>string</entry> +<entry>numeric</entry> +<entry>numeric</entry> +</row> + +</tbody> +</tgroup> +</informaltable> + +@end docbook The basic idea is that user input that looks numeric---and @emph{only} user input---should be treated as numeric, even though it is actually @@ -10797,8 +10848,8 @@ This point bears additional emphasis: All user input is made of characters, and so is first and foremost of @var{string} type; input strings that look numeric are additionally given the @var{strnum} attribute. Thus, the six-character input string @w{@samp{ +3.14}} receives the -@var{strnum} attribute. In contrast, the eight-character literal -@w{@code{" +3.14"}} appearing in program text is a string constant. +@var{strnum} attribute. In contrast, the eight characters +@w{@code{" +3.14"}} appearing in program text comprise a string constant. 
The following examples print @samp{1} when the comparison between the two different constants is true, @samp{0} otherwise: @@ -10984,7 +11035,9 @@ where this is discussed in more detail. @subsubsection String Comparison With POSIX Rules The POSIX standard says that string comparison is performed based -on the locale's collating order. This is usually very different +on the locale's @dfn{collating order}. This is the order in which +characters sort, as defined by the locale (for more discussion, +@pxref{Ranges and Locales}). This order is usually very different from the results obtained when doing straight character-by-character comparison.@footnote{Technically, string comparison is supposed to behave the same way as if the strings are compared with the C @@ -10992,7 +11045,7 @@ to behave the same way as if the strings are compared with the C Because this behavior differs considerably from existing practice, @command{gawk} only implements it when in POSIX mode (@pxref{Options}). -Here is an example to illustrate the difference, in an @samp{en_US.UTF-8} +Here is an example to illustrate the difference, in an @code{en_US.UTF-8} locale: @example @@ -11208,7 +11261,7 @@ However, putting a newline in front of either character does not work without using backslash continuation (@pxref{Statements/Lines}). If @option{--posix} is specified -(@pxref{Options}), then this extension is disabled. +(@pxref{Options}), this extension is disabled. @node Function Calls @section Function Calls @@ -11227,6 +11280,8 @@ functions and their descriptions. In addition, you can define functions for use in your program. @xref{User-defined}, for instructions on how to do this. +Finally, @command{gawk} lets you write functions in C or C++ +that may be called from your program: see @ref{Dynamic Extensions}. @cindex arguments, in function calls The way to use a function is with a @dfn{function call} expression, @@ -11277,12 +11332,12 @@ when you write the source code to your program. 
We defer discussion of this feature until later; see @ref{Indirect Calls}. @cindex side effects, function calls -Like every other expression, the function call has a value, which is -computed by the function based on the arguments you give it. In this -example, the value of @samp{sqrt(@var{argument})} is the square root of -@var{argument}. -The following program reads numbers, one number per line, and prints the -square root of each one: +Like every other expression, the function call has a value, often +called the @dfn{return value}, which is computed by the function +based on the arguments you give it. In this example, the return value +of @samp{sqrt(@var{argument})} is the square root of @var{argument}. +The following program reads numbers, one number per line, and prints +the square root of each one: @example $ @kbd{awk '@{ print "The square root of", $1, "is", sqrt($1) @}'} @@ -11597,10 +11652,10 @@ A single expression. It matches when its value is nonzero (if a number) or non-null (if a string). (@xref{Expression Patterns}.) -@item @var{pat1}, @var{pat2} +@item @var{begpat}, @var{endpat} A pair of patterns separated by a comma, specifying a range of records. -The range includes both the initial record that matches @var{pat1} and -the final record that matches @var{pat2}. +The range includes both the initial record that matches @var{begpat} and +the final record that matches @var{endpat}. (@xref{Ranges}.) @item BEGIN @@ -11612,7 +11667,7 @@ Special patterns for you to supply startup or cleanup actions for your @item BEGINFILE @itemx ENDFILE Special patterns for you to supply startup or cleanup actions to be -done on a per file basis. +done on a per-file basis. (@xref{BEGINFILE/ENDFILE}.) @item @var{empty} @@ -11773,7 +11828,7 @@ input record. When a record matches @var{begpat}, the range pattern is @dfn{turned on} and the range pattern matches this record as well. As long as the range pattern stays turned on, it automatically matches every input record read. 
The range pattern also matches @var{endpat} against every -input record; when this succeeds, the range pattern is turned off again +input record; when this succeeds, the range pattern is @dfn{turned off} again for the following record. Then the range pattern goes back to checking @var{begpat} against each record. @@ -11927,7 +11982,7 @@ rule checks the @code{FNR} and @code{NR} variables. @subsubsection Input/Output from @code{BEGIN} and @code{END} Rules @cindex input/output, from @code{BEGIN} and @code{END} -There are several (sometimes subtle) points to remember when doing I/O +There are several (sometimes subtle) points to be aware of when doing I/O from a @code{BEGIN} or @code{END} rule. The first has to do with the value of @code{$0} in a @code{BEGIN} rule. Because @code{BEGIN} rules are executed before any input is read, @@ -11988,8 +12043,19 @@ This @value{SECTION} describes a @command{gawk}-specific feature. Two special kinds of rule, @code{BEGINFILE} and @code{ENDFILE}, give you ``hooks'' into @command{gawk}'s command-line file processing loop. -As with the @code{BEGIN} and @code{END} rules (@pxref{BEGIN/END}), all -@code{BEGINFILE} rules in a program are merged, in the order they are +As with the @code{BEGIN} and @code{END} rules +@ifnottex +@ifnotdocbook +(@pxref{BEGIN/END}), +@end ifnotdocbook +@end ifnottex +@iftex +(see the previous section), +@end iftex +@ifdocbook +(see the previous section), +@end ifdocbook +all @code{BEGINFILE} rules in a program are merged, in the order they are read by @command{gawk}, and all @code{ENDFILE} rules are merged as well. The body of the @code{BEGINFILE} rules is executed just before @@ -12017,10 +12083,11 @@ the file entirely. Otherwise, @command{gawk} exits with the usual fatal error. @item -If you have written extensions that modify the record handling (by inserting -an ``input parser''), you can invoke them at this point, before @command{gawk} -has started processing the file. 
(This is a @emph{very} advanced feature, -currently used only by the @uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) +If you have written extensions that modify the record handling (by +inserting an ``input parser,'' @pxref{Input Parsers}), you can invoke +them at this point, before @command{gawk} has started processing the file. +(This is a @emph{very} advanced feature, currently used only by the +@uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.) @end itemize The @code{ENDFILE} rule is called when @command{gawk} has finished processing @@ -12103,7 +12170,7 @@ into the body of the @command{awk} program. @cindex shells, quoting The most common method is to use shell quoting to substitute the variable's value into the program inside the script. -For example, in the following program: +For example, consider the following program: @example printf "Enter search pattern: " @@ -12113,7 +12180,7 @@ awk "/$pattern/ "'@{ nmatches++ @} @end example @noindent -the @command{awk} program consists of two pieces of quoted text +The @command{awk} program consists of two pieces of quoted text that are concatenated together to form the program. The first part is double-quoted, which allows substitution of the @code{pattern} shell variable inside the quotes. @@ -12127,8 +12194,8 @@ match up the quotes when reading the program. A better method is to use @command{awk}'s variable assignment feature (@pxref{Assignment Options}) -to assign the shell variable's value to an @command{awk} variable's -value. Then use dynamic regexps to match the pattern +to assign the shell variable's value to an @command{awk} variable. +Then use dynamic regexps to match the pattern (@pxref{Computed Regexps}). 
The following shows how to redo the previous example using this technique: @@ -12181,7 +12248,7 @@ function @var{name}(@var{args}) @{ @dots{} @} @cindex @code{;} (semicolon), separating statements in actions @cindex semicolon (@code{;}), separating statements in actions An action consists of one or more @command{awk} @dfn{statements}, enclosed -in curly braces (@samp{@{@dots{}@}}). Each statement specifies one +in curly braces (@samp{@{@r{@dots{}}@}}). Each statement specifies one thing to do. The statements are separated by newlines or semicolons. The curly braces around an action must be used even if the action contains only one statement, or if it contains no statements at @@ -12211,10 +12278,9 @@ programs. The @command{awk} language gives you C-like constructs special ones (@pxref{Statements}). @item Compound statements -Consist of one or more statements enclosed in -curly braces. A compound statement is used in order to put several -statements together in the body of an @code{if}, @code{while}, @code{do}, -or @code{for} statement. +Enclose one or more statements in curly braces. A compound statement +is used in order to put several statements together in the body of an +@code{if}, @code{while}, @code{do}, or @code{for} statement. @item Input statements Use the @code{getline} command @@ -12548,6 +12614,8 @@ for more information on this version of the @code{for} loop. @cindex @code{default} keyword This @value{SECTION} describes a @command{gawk}-specific feature. +If @command{gawk} is in compatibility mode (@pxref{Options}), +it is not available. The @code{switch} statement allows the evaluation of an expression and the execution of statements based on a @code{case} match. Case statements @@ -12604,11 +12672,6 @@ the @code{print} statement is executed and then falls through into the the @minus{}1 case will also be executed since the @code{default} does not halt execution. -This @code{switch} statement is a @command{gawk} extension. 
-If @command{gawk} is in compatibility mode -(@pxref{Options}), -it is not available. - @node Break Statement @subsection The @code{break} Statement @cindex @code{break} statement @@ -12623,15 +12686,15 @@ numbers: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; div * div <= num; div++) @{ - if (num % div == 0) - break - @} - if (num % div == 0) - printf "Smallest divisor of %d is %d\n", num, div - else - printf "%d is prime\n", num + num = $1 + for (div = 2; div * div <= num; div++) @{ + if (num % div == 0) + break + @} + if (num % div == 0) + printf "Smallest divisor of %d is %d\n", num, div + else + printf "%d is prime\n", num @} @end example @@ -12649,17 +12712,17 @@ an @code{if}: @example # find smallest divisor of num @{ - num = $1 - for (div = 2; ; div++) @{ - if (num % div == 0) @{ - printf "Smallest divisor of %d is %d\n", num, div - break - @} - if (div * div > num) @{ - printf "%d is prime\n", num - break + num = $1 + for (div = 2; ; div++) @{ + if (num % div == 0) @{ + printf "Smallest divisor of %d is %d\n", num, div + break + @} + if (div * div > num) @{ + printf "%d is prime\n", num + break + @} @} - @} @} @end example @@ -12808,16 +12871,14 @@ The @code{next} statement is not allowed inside @code{BEGINFILE} and @cindex POSIX @command{awk}, @code{next}/@code{nextfile} statements and @cindex @code{next} statement, user-defined functions and @cindex functions, user-defined, @code{next}/@code{nextfile} statements and -According to the POSIX standard, the behavior is undefined if -the @code{next} statement is used in a @code{BEGIN} or @code{END} rule. -@command{gawk} treats it as a syntax error. -Although POSIX permits it, -some other @command{awk} implementations don't allow the @code{next} -statement inside function bodies -(@pxref{User-defined}). -Just as with any other @code{next} statement, a @code{next} statement inside a -function body reads the next record and starts processing it with the -first rule in the program. 
+According to the POSIX standard, the behavior is undefined if the +@code{next} statement is used in a @code{BEGIN} or @code{END} rule. +@command{gawk} treats it as a syntax error. Although POSIX permits it, +most other @command{awk} implementations don't allow the @code{next} +statement inside function bodies (@pxref{User-defined}). Just as with any +other @code{next} statement, a @code{next} statement inside a function +body reads the next record and starts processing it with the first rule +in the program. @node Nextfile Statement @subsection The @code{nextfile} Statement @@ -12928,8 +12989,7 @@ status code for the @command{awk} process. If no argument is supplied, In the case where an argument is supplied to a first @code{exit} statement, and then @code{exit} is called a second time from an @code{END} rule with no argument, -@command{awk} uses the previously supplied exit value. -@value{DARKCORNER} +@command{awk} uses the previously supplied exit value. @value{DARKCORNER} @xref{Exit Status}, for more information. @cindex programming conventions, @code{exit} statement @@ -12941,12 +13001,12 @@ in the following example: @example BEGIN @{ - if (("date" | getline date_now) <= 0) @{ - print "Can't get system date" > "/dev/stderr" - exit 1 - @} - print "current date is", date_now - close("date") + if (("date" | getline date_now) <= 0) @{ + print "Can't get system date" > "/dev/stderr" + exit 1 + @} + print "current date is", date_now + close("date") @} @end example @@ -34237,7 +34297,7 @@ it on your system). @cindex Unicode Similar considerations apply to other ranges. For example, @samp{["-/]} is perfectly valid in ASCII, but is not valid in many Unicode locales, -such as @samp{en_US.UTF-8}. +such as @code{en_US.UTF-8}. Early versions of @command{gawk} used regexp matching code that was not locale aware, so ranges had their traditional interpretation. @@ -36687,7 +36747,7 @@ different limits. 
@multitable @columnfractions .40 .60 @headitem Item @tab Limit @item Characters in a character class @tab 2^(number of bits per byte) -@item Length of input record @tab @code{MAX_INT } +@item Length of input record @tab @code{MAX_INT} @item Length of output record @tab Unlimited @item Length of source line @tab Unlimited @item Number of fields in a record @tab @code{MAX_LONG} @@ -36696,9 +36756,9 @@ different limits. @item Number of input records total @tab @code{MAX_LONG} @item Number of pipe redirections @tab min(number of processes per user, number of open files) @item Numeric values @tab Double-precision floating point (if not using MPFR) -@item Size of a field @tab @code{MAX_INT } -@item Size of a literal string @tab @code{MAX_INT } -@item Size of a printf string @tab @code{MAX_INT } +@item Size of a field @tab @code{MAX_INT} +@item Size of a literal string @tab @code{MAX_INT} +@item Size of a printf string @tab @code{MAX_INT} @end multitable @node Extension Design |
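Note on the increment-operator hunk above: the patched text distinguishes pre-increment (`++v`, whose value is the new value) from post-increment (`v++`, whose value is the old value). A minimal sketch of that difference follows, assuming any POSIX-compatible `awk` is on the `PATH`. The function name `incr_demo` and the variable names are hypothetical, chosen only for illustration; they do not appear in the manual.

```shell
#!/bin/sh
# Hypothetical helper (not from the gawk manual): shows the value of the
# increment expression itself versus the value left in the variable.
incr_demo() {
    awk 'BEGIN {
        foo = 4
        post = foo++        # expression yields the OLD value (4); foo becomes 5
        after_post = foo    # 5
        pre = ++foo         # expression yields the NEW value (6); foo becomes 6
        print post, after_post, pre, foo
    }'
}

incr_demo    # prints: 4 5 6 6
```

The sketch uses a `BEGIN` rule so that no input file is needed, and it sticks to small integers, where the floating-point caveat mentioned in the same hunk does not come into play.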