Diffstat (limited to 'doc')
-rw-r--r-- doc/ChangeLog   |    6
-rw-r--r-- doc/gawk.info   | 1413
-rw-r--r-- doc/gawk.texi   |  215
-rw-r--r-- doc/gawktexi.in |  215
4 files changed, 1067 insertions, 782 deletions
diff --git a/doc/ChangeLog b/doc/ChangeLog
index ad5aca41..764d093a 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,9 @@
+2017-05-22 Arnold D. Robbins <arnold@skeeve.com>
+
+ * gawktexi.in: Document FIELDWIDTHS much better, including how
+ it works in corner cases. Some general organizational improvements
+ in this chunk of text.
+
2017-04-23 Arnold D. Robbins <arnold@skeeve.com>
* gawktexi.in: Improve documentation of --source option.
diff --git a/doc/gawk.info b/doc/gawk.info
index 5111770d..14d34a98 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -196,7 +196,13 @@ in (a) below. A copy of the license is included in the section entitled
field.
* Field Splitting Summary:: Some final points and a summary table.
* Constant Size:: Reading constant width data.
+* Fixed width data:: Processing fixed-width data.
+* Skipping intervening:: Skipping intervening fields.
+* Allowing trailing data:: Capturing optional trailing data.
+* Fields with fixed data:: Field values with fixed-width data.
* Splitting By Content:: Defining Fields By Content
+* Testing field creation:: Checking how 'gawk' is
+ splitting records.
* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program
control using the 'getline'
@@ -4228,6 +4234,8 @@ be named on the 'awk' command line (*note Getline::).
* Field Separators:: The field separator and how to change it.
* Constant Size:: Reading constant width data.
* Splitting By Content:: Defining Fields By Content
+* Testing field creation:: Checking how 'gawk' is splitting
+ records.
* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program control
using the 'getline' function.
@@ -5124,10 +5132,25 @@ This minor node discusses an advanced feature of 'gawk'. If you are a
novice 'awk' user, you might want to skip it on the first reading.
'gawk' provides a facility for dealing with fixed-width fields with
-no distinctive field separator. For example, data of this nature arises
-in the input for old Fortran programs where numbers are run together, or
-in the output of programs that did not anticipate the use of their
-output as input for other programs.
+no distinctive field separator. We discuss this feature in the
+following nodes.
+
+* Menu:
+
+* Fixed width data:: Processing fixed-width data.
+* Skipping intervening:: Skipping intervening fields.
+* Allowing trailing data:: Capturing optional trailing data.
+* Fields with fixed data:: Field values with fixed-width data.
+
+
+File: gawk.info, Node: Fixed width data, Next: Skipping intervening, Up: Constant Size
+
+4.6.1 Processing Fixed-Width Data
+---------------------------------
+
+An example of fixed-width data would be the input for old Fortran
+programs where numbers are run together, or the output of programs that
+did not anticipate the use of their output as input for other programs.
An example of the latter is a table where all the columns are lined
up by the use of a variable number of spaces and _empty fields are just
@@ -5141,12 +5164,11 @@ by assigning a string containing space-separated numbers to the built-in
variable 'FIELDWIDTHS'. Each number specifies the width of the field,
_including_ columns between fields. If you want to ignore the columns
between fields, you can specify the width as a separate field that is
-subsequently ignored. Or, starting in version 4.2, each field width may
-optionally be preceded by a colon-separated value specifying the number
-of characters to skip before the field starts. It is a fatal error to
-supply a field width that has a negative value. The following data is
-the output of the Unix 'w' utility. It is useful to illustrate the use
-of 'FIELDWIDTHS':
+subsequently ignored. It is a fatal error to supply a field width that
+has a negative value.
+
+ The following data is the output of the Unix 'w' utility. It is
+useful to illustrate the use of 'FIELDWIDTHS':
10:06pm up 21 days, 14:04, 23 users
User tty login idle JCPU PCPU what
@@ -5169,7 +5191,7 @@ calculated idle time:
sub(/^ +/, "", idle) # strip leading spaces
if (idle == "")
idle = 0
- if (idle ~ /:/) {
+ if (idle ~ /:/) { # hh:mm
split(idle, t, ":")
idle = t[1] * 60 + t[2]
}
@@ -5193,11 +5215,31 @@ calculated idle time:
brent ttyp0 286
dave ttyq4 1296000
- Starting in version 4.2, this program could be rewritten to specify
-'FIELDWIDTHS' like so:
+ Another (possibly more practical) example of fixed-width input data
+is the input from a deck of balloting cards. In some parts of the
+United States, voters mark their choices by punching holes in computer
+cards. These cards are then processed to count the votes for any
+particular candidate or on any particular issue. Because a voter may
+choose not to vote on some issue, any column on the card may be empty.
+An 'awk' program for processing such data could use the 'FIELDWIDTHS'
+feature to simplify reading the data. (Of course, getting 'gawk' to run
+on a system with card readers is another story!)
+
+
+File: gawk.info, Node: Skipping intervening, Next: Allowing trailing data, Prev: Fixed width data, Up: Constant Size
+
+4.6.2 Skipping Intervening Fields
+---------------------------------
+
+Starting in version 4.2, each field width may optionally be preceded by
+a colon-separated value specifying the number of characters to skip
+before the field starts. Thus, the preceding program could be rewritten
+to specify 'FIELDWIDTHS' like so:
+
BEGIN { FIELDWIDTHS = "8 1:5 4:7 6 1:6 1:6 2:33" }
+
This strips away some of the white space separating the fields. With
-such a change, the program would produce the following results:
+such a change, the program produces the following results:
hzang ttyV3 50
eklye ttyV5 0
@@ -5207,39 +5249,68 @@ such a change, the program would produce the following results:
brent ttyp0 286
dave ttyq4 1296000
- Another (possibly more practical) example of fixed-width input data
-is the input from a deck of balloting cards. In some parts of the
-United States, voters mark their choices by punching holes in computer
-cards. These cards are then processed to count the votes for any
-particular candidate or on any particular issue. Because a voter may
-choose not to vote on some issue, any column on the card may be empty.
-An 'awk' program for processing such data could use the 'FIELDWIDTHS'
-feature to simplify reading the data. (Of course, getting 'gawk' to run
-on a system with card readers is another story!)
+
+File: gawk.info, Node: Allowing trailing data, Next: Fields with fixed data, Prev: Skipping intervening, Up: Constant Size
- Assigning a value to 'FS' causes 'gawk' to use 'FS' for field
-splitting again. Use 'FS = FS' to make this happen, without having to
-know the current value of 'FS'. In order to tell which kind of field
-splitting is in effect, use 'PROCINFO["FS"]' (*note Auto-set::). The
-value is '"FS"' if regular field splitting is being used, or
-'"FIELDWIDTHS"' if fixed-width field splitting is being used:
+4.6.3 Capturing Optional Trailing Data
+--------------------------------------
- if (PROCINFO["FS"] == "FS")
- REGULAR FIELD SPLITTING ...
- else if (PROCINFO["FS"] == "FIELDWIDTHS")
- FIXED-WIDTH FIELD SPLITTING ...
- else if (PROCINFO["FS"] == "FPAT")
- CONTENT-BASED FIELD SPLITTING ... (see next minor node)
- else
- API INPUT PARSER FIELD SPLITTING ... (advanced feature)
+There are times when fixed-width data may be followed by additional data
+that has no fixed length. Such data may or may not be present, but if
+it is, it should be possible to get at it from an 'awk' program.
- This information is useful when writing a function that needs to
-temporarily change 'FS' or 'FIELDWIDTHS', read some records, and then
-restore the original settings (*note Passwd Functions:: for an example
-of such a function).
+ Starting with version 4.2, in order to provide a way to say "anything
+else in the record after the defined fields," 'gawk' allows you to add a
+final '*' character to the value of 'FIELDWIDTHS'. There can only be
+one such character, and it must be the final non-whitespace character in
+'FIELDWIDTHS'. For example:
+
+ $ cat fw.awk Show the program
+ -| BEGIN { FIELDWIDTHS = "2 2 *" }
+ -| { print NF, $1, $2, $3 }
+ $ cat fw.in Show sample input
+ -| 1234abcdefghi
+ $ gawk -f fw.awk fw.in Run the program
+ -| 3 12 34 abcdefghi

-File: gawk.info, Node: Splitting By Content, Next: Multiple Line, Prev: Constant Size, Up: Reading Files
+File: gawk.info, Node: Fields with fixed data, Prev: Allowing trailing data, Up: Constant Size
+
+4.6.4 Field Values With Fixed-Width Data
+----------------------------------------
+
+So far, so good. But what happens if there isn't as much data as there
+should be based on the contents of 'FIELDWIDTHS'? Or, what happens if
+there is more data than expected?
+
+ For many years, what happens in these cases was not well defined.
+Starting with version 4.2, the rules are as follows:
+
+Enough data for some fields
+ For example, if 'FIELDWIDTHS' is set to '"2 3 4"' and the input
+ record is 'aabbb'. In this case, 'NF' is set to two.
+
+Not enough data for a field
+ For example, if 'FIELDWIDTHS' is set to '"2 3 4"' and the input
+ record is 'aab'. In this case, 'NF' is set to two and '$2' has the
+ value '"b"'. The idea is that even though there aren't as many
+ characters as were expected, there are some, so the data should be
+ made available to the program.
+
+Too much data
+ For example, if 'FIELDWIDTHS' is set to '"2 3 4"' and the input
+ record is 'aabbbccccddd'. In this case, 'NF' is set to three and
+ the extra characters ('ddd') are ignored. If you want 'gawk' to
+ capture the extra characters, supply a final '*' in the value of
+ 'FIELDWIDTHS'.
+
+Too much data, but with '*' supplied
+ For example, if 'FIELDWIDTHS' is set to '"2 3 4 *"' and the input
+ record is 'aabbbccccddd'. In this case, 'NF' is set to four, and
+ '$4' has the value '"ddd"'.
+
+
+File: gawk.info, Node: Splitting By Content, Next: Testing field creation, Prev: Constant Size, Up: Reading Files
4.7 Defining Fields by Content
==============================
@@ -5315,9 +5386,7 @@ would be to remove the quotes when they occur, with something like this:
affects field splitting with 'FPAT'.
Assigning a value to 'FPAT' overrides field splitting with 'FS' and
-with 'FIELDWIDTHS'. Similar to 'FIELDWIDTHS', the value of
-'PROCINFO["FS"]' will be '"FPAT"' if content-based field splitting is
-being used.
+with 'FIELDWIDTHS'.
NOTE: Some programs export CSV data that contains embedded newlines
between the double quotes. 'gawk' provides no way to deal with
@@ -5335,23 +5404,53 @@ contain at least one character. A straightforward modification
Finally, the 'patsplit()' function makes the same functionality
available for splitting regular strings (*note String Functions::).
- To recap, 'gawk' provides three independent methods to split input
+ ---------- Footnotes ----------
+
+ (1) The CSV format lacked a formal standard definition for many
+years. RFC 4180 (http://www.ietf.org/rfc/rfc4180.txt) standardizes the
+most common practices.
+
+
+File: gawk.info, Node: Testing field creation, Next: Multiple Line, Prev: Splitting By Content, Up: Reading Files
+
+4.8 Checking How 'gawk' Is Splitting Records
+============================================
+
+As we've seen, 'gawk' provides three independent methods to split input
records into fields. The mechanism used is based on which of the three
variables--'FS', 'FIELDWIDTHS', or 'FPAT'--was last assigned to. In
addition, an API input parser may choose to override the record parsing
mechanism; please refer to *note Input Parsers:: for further information
about this feature.
- ---------- Footnotes ----------
+ To restore normal field splitting after using 'FIELDWIDTHS' and/or
+'FPAT', simply assign a value to 'FS'. You can use 'FS = FS' to do
+this, without having to know the current value of 'FS'.
- (1) The CSV format lacked a formal standard definition for many
-years. RFC 4180 (http://www.ietf.org/rfc/rfc4180.txt) standardizes the
-most common practices.
+ In order to tell which kind of field splitting is in effect, use
+'PROCINFO["FS"]' (*note Auto-set::). The value is '"FS"' if regular
+field splitting is being used, '"FIELDWIDTHS"' if fixed-width field
+splitting is being used, or '"FPAT"' if content-based field splitting is
+being used:
+
+ if (PROCINFO["FS"] == "FS")
+ REGULAR FIELD SPLITTING ...
+ else if (PROCINFO["FS"] == "FIELDWIDTHS")
+ FIXED-WIDTH FIELD SPLITTING ...
+ else if (PROCINFO["FS"] == "FPAT")
+ CONTENT-BASED FIELD SPLITTING
+ else
+ API INPUT PARSER FIELD SPLITTING ... (advanced feature)
+
+ This information is useful when writing a function that needs to
+temporarily change 'FS' or 'FIELDWIDTHS', read some records, and then
+restore the original settings (*note Passwd Functions:: for an example
+of such a function).

-File: gawk.info, Node: Multiple Line, Next: Getline, Prev: Splitting By Content, Up: Reading Files
+File: gawk.info, Node: Multiple Line, Next: Getline, Prev: Testing field creation, Up: Reading Files
-4.8 Multiple-Line Records
+4.9 Multiple-Line Records
=========================
In some databases, a single line cannot conveniently hold all the
@@ -5491,8 +5590,8 @@ separator of a single space: 'FS = " "'.

File: gawk.info, Node: Getline, Next: Read Timeout, Prev: Multiple Line, Up: Reading Files
-4.9 Explicit Input with 'getline'
-=================================
+4.10 Explicit Input with 'getline'
+==================================
So far we have been getting our input data from 'awk''s main input
stream--either the standard input (usually your keyboard, sometimes the
@@ -5543,8 +5642,8 @@ represents a shell command.

File: gawk.info, Node: Plain Getline, Next: Getline/Variable, Up: Getline
-4.9.1 Using 'getline' with No Arguments
----------------------------------------
+4.10.1 Using 'getline' with No Arguments
+----------------------------------------
The 'getline' command can be used without arguments to read input from
the current input file. All it does in this case is read the next input
@@ -5604,8 +5703,8 @@ the value of '$0'.

File: gawk.info, Node: Getline/Variable, Next: Getline/File, Prev: Plain Getline, Up: Getline
-4.9.2 Using 'getline' into a Variable
--------------------------------------
+4.10.2 Using 'getline' into a Variable
+--------------------------------------
You can use 'getline VAR' to read the next record from 'awk''s input
into the variable VAR. No other processing is done. For example,
@@ -5645,8 +5744,8 @@ fields, so the values of the fields (including '$0') and the value of

File: gawk.info, Node: Getline/File, Next: Getline/Variable/File, Prev: Getline/Variable, Up: Getline
-4.9.3 Using 'getline' from a File
----------------------------------
+4.10.3 Using 'getline' from a File
+----------------------------------
Use 'getline < FILE' to read the next record from FILE. Here, FILE is a
string-valued expression that specifies the file name. '< FILE' is
@@ -5678,8 +5777,8 @@ portable to all 'awk' implementations.

File: gawk.info, Node: Getline/Variable/File, Next: Getline/Pipe, Prev: Getline/File, Up: Getline
-4.9.4 Using 'getline' into a Variable from a File
--------------------------------------------------
+4.10.4 Using 'getline' into a Variable from a File
+--------------------------------------------------
Use 'getline VAR < FILE' to read input from the file FILE, and put it in
the variable VAR. As earlier, FILE is a string-valued expression that
@@ -5722,8 +5821,8 @@ regular expression.

File: gawk.info, Node: Getline/Pipe, Next: Getline/Variable/Pipe, Prev: Getline/Variable/File, Up: Getline
-4.9.5 Using 'getline' from a Pipe
----------------------------------
+4.10.5 Using 'getline' from a Pipe
+----------------------------------
Omniscience has much to recommend it. Failing that, attention to
details would be useful.
@@ -5792,8 +5891,8 @@ you want your program to be portable to all 'awk' implementations.

File: gawk.info, Node: Getline/Variable/Pipe, Next: Getline/Coprocess, Prev: Getline/Pipe, Up: Getline
-4.9.6 Using 'getline' into a Variable from a Pipe
--------------------------------------------------
+4.10.6 Using 'getline' into a Variable from a Pipe
+--------------------------------------------------
When you use 'COMMAND | getline VAR', the output of COMMAND is sent
through a pipe to 'getline' and into the variable VAR. For example, the
@@ -5819,8 +5918,8 @@ to other 'awk' implementations.

File: gawk.info, Node: Getline/Coprocess, Next: Getline/Variable/Coprocess, Prev: Getline/Variable/Pipe, Up: Getline
-4.9.7 Using 'getline' from a Coprocess
---------------------------------------
+4.10.7 Using 'getline' from a Coprocess
+---------------------------------------
Reading input into 'getline' from a pipe is a one-way operation. The
command that is started with 'COMMAND | getline' only sends data _to_
@@ -5849,8 +5948,8 @@ coprocesses are discussed in more detail.

File: gawk.info, Node: Getline/Variable/Coprocess, Next: Getline Notes, Prev: Getline/Coprocess, Up: Getline
-4.9.8 Using 'getline' into a Variable from a Coprocess
-------------------------------------------------------
+4.10.8 Using 'getline' into a Variable from a Coprocess
+-------------------------------------------------------
When you use 'COMMAND |& getline VAR', the output from the coprocess
COMMAND is sent through a two-way pipe to 'getline' and into the
@@ -5867,8 +5966,8 @@ coprocesses are discussed in more detail.

File: gawk.info, Node: Getline Notes, Next: Getline Summary, Prev: Getline/Variable/Coprocess, Up: Getline
-4.9.9 Points to Remember About 'getline'
-----------------------------------------
+4.10.9 Points to Remember About 'getline'
+-----------------------------------------
Here are some miscellaneous points about 'getline' that you should bear
in mind:
@@ -5927,8 +6026,8 @@ in mind:

File: gawk.info, Node: Getline Summary, Prev: Getline Notes, Up: Getline
-4.9.10 Summary of 'getline' Variants
-------------------------------------
+4.10.10 Summary of 'getline' Variants
+-------------------------------------
*note Table 4.1: table-getline-variants. summarizes the eight variants
of 'getline', listing which predefined variables are set by each one,
@@ -5955,7 +6054,7 @@ Table 4.1: 'getline' variants and what they set

File: gawk.info, Node: Read Timeout, Next: Retrying Input, Prev: Getline, Up: Reading Files
-4.10 Reading Input with a Timeout
+4.11 Reading Input with a Timeout
=================================
This minor node describes a feature that is specific to 'gawk'.
@@ -6049,7 +6148,7 @@ can block indefinitely until some other process opens it for writing.

File: gawk.info, Node: Retrying Input, Next: Command-line directories, Prev: Read Timeout, Up: Reading Files
-4.11 Retrying Reads After Certain Input Errors
+4.12 Retrying Reads After Certain Input Errors
==============================================
This minor node describes a feature that is specific to 'gawk'.
@@ -6076,7 +6175,7 @@ configured to behave in a non-blocking fashion.

File: gawk.info, Node: Command-line directories, Next: Input Summary, Prev: Retrying Input, Up: Reading Files
-4.12 Directories on the Command Line
+4.13 Directories on the Command Line
====================================
According to the POSIX standard, files named on the 'awk' command line
@@ -6099,7 +6198,7 @@ usable data from an 'awk' program.

File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-line directories, Up: Reading Files
-4.13 Summary
+4.14 Summary
============
* Input is split into records based on the value of 'RS'. The
@@ -6171,7 +6270,7 @@ File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-li

File: gawk.info, Node: Input Exercises, Prev: Input Summary, Up: Reading Files
-4.14 Exercises
+4.15 Exercises
==============
1. Using the 'FIELDWIDTHS' variable (*note Constant Size::), write a
@@ -33833,7 +33932,7 @@ Index
* fields, separating <1>: Field Separators. (line 15)
* fields, single-character: Single Character Fields.
(line 6)
-* FIELDWIDTHS variable: Constant Size. (line 22)
+* FIELDWIDTHS variable: Fixed width data. (line 17)
* FIELDWIDTHS variable <1>: User-modified. (line 37)
* file descriptors: Special FD. (line 6)
* file inclusion, @include directive: Include Files. (line 8)
@@ -34042,7 +34141,7 @@ Index
* gawk, features, adding: Adding Code. (line 6)
* gawk, features, advanced: Advanced Features. (line 6)
* gawk, field separators and: User-modified. (line 74)
-* gawk, FIELDWIDTHS variable in: Constant Size. (line 22)
+* gawk, FIELDWIDTHS variable in: Fixed width data. (line 17)
* gawk, FIELDWIDTHS variable in <1>: User-modified. (line 37)
* gawk, file names in: Special Files. (line 6)
* gawk, format-control characters: Control Letters. (line 18)
@@ -34093,7 +34192,8 @@ Index
* gawk, RT variable in <2>: Auto-set. (line 296)
* gawk, See Also awk: Preface. (line 34)
* gawk, source code, obtaining: Getting. (line 6)
-* gawk, splitting fields and: Constant Size. (line 103)
+* gawk, splitting fields and: Testing field creation.
+ (line 6)
* gawk, string-translation functions: I18N Functions. (line 6)
* gawk, SYMTAB array in: Auto-set. (line 300)
* gawk, TEXTDOMAIN variable in: User-modified. (line 155)
@@ -35400,8 +35500,8 @@ Index
* troubleshooting, backslash before nonspecial character: Escape Sequences.
(line 108)
* troubleshooting, division: Arithmetic Ops. (line 44)
-* troubleshooting, fatal errors, field widths, specifying: Constant Size.
- (line 22)
+* troubleshooting, fatal errors, field widths, specifying: Fixed width data.
+ (line 17)
* troubleshooting, fatal errors, printf format strings: Format Modifiers.
(line 157)
* troubleshooting, fflush() function: I/O Functions. (line 63)
@@ -35525,7 +35625,7 @@ Index
* Vinschen, Corinna: Acknowledgments. (line 60)
* w debugger command (alias for watch): Viewing And Changing Data.
(line 66)
-* w utility: Constant Size. (line 22)
+* w utility: Fixed width data. (line 17)
* wait() extension function: Extension Sample Fork.
(line 22)
* waitpid() extension function: Extension Sample Fork.
@@ -35580,574 +35680,579 @@ Index

Tag Table:
Node: Top1200
-Node: Foreword342794
-Node: Foreword447236
-Node: Preface48768
-Ref: Preface-Footnote-151627
-Ref: Preface-Footnote-251734
-Ref: Preface-Footnote-351968
-Node: History52110
-Node: Names54462
-Ref: Names-Footnote-155556
-Node: This Manual55703
-Ref: This Manual-Footnote-162188
-Node: Conventions62288
-Node: Manual History64642
-Ref: Manual History-Footnote-167637
-Ref: Manual History-Footnote-267678
-Node: How To Contribute67752
-Node: Acknowledgments68403
-Node: Getting Started73289
-Node: Running gawk75728
-Node: One-shot76918
-Node: Read Terminal78181
-Node: Long80174
-Node: Executable Scripts81687
-Ref: Executable Scripts-Footnote-184482
-Node: Comments84585
-Node: Quoting87069
-Node: DOS Quoting92586
-Node: Sample Data Files94641
-Node: Very Simple97236
-Node: Two Rules102138
-Node: More Complex104023
-Node: Statements/Lines106889
-Ref: Statements/Lines-Footnote-1111348
-Node: Other Features111613
-Node: When112549
-Ref: When-Footnote-1114303
-Node: Intro Summary114368
-Node: Invoking Gawk115252
-Node: Command Line116766
-Node: Options117564
-Ref: Options-Footnote-1134183
-Ref: Options-Footnote-2134413
-Node: Other Arguments134438
-Node: Naming Standard Input137385
-Node: Environment Variables138478
-Node: AWKPATH Variable139036
-Ref: AWKPATH Variable-Footnote-1142447
-Ref: AWKPATH Variable-Footnote-2142481
-Node: AWKLIBPATH Variable142742
-Node: Other Environment Variables143999
-Node: Exit Status147820
-Node: Include Files148497
-Node: Loading Shared Libraries152092
-Node: Obsolete153520
-Node: Undocumented154212
-Node: Invoking Summary154509
-Node: Regexp156169
-Node: Regexp Usage157623
-Node: Escape Sequences159660
-Node: Regexp Operators165892
-Ref: Regexp Operators-Footnote-1173308
-Ref: Regexp Operators-Footnote-2173455
-Node: Bracket Expressions173553
-Ref: table-char-classes176029
-Node: Leftmost Longest179166
-Node: Computed Regexps180469
-Node: GNU Regexp Operators183896
-Node: Case-sensitivity187575
-Ref: Case-sensitivity-Footnote-1190462
-Ref: Case-sensitivity-Footnote-2190697
-Node: Regexp Summary190805
-Node: Reading Files192271
-Node: Records194434
-Node: awk split records195167
-Node: gawk split records200098
-Ref: gawk split records-Footnote-1204638
-Node: Fields204675
-Node: Nonconstant Fields207416
-Ref: Nonconstant Fields-Footnote-1209652
-Node: Changing Fields209856
-Node: Field Separators215784
-Node: Default Field Splitting218482
-Node: Regexp Field Splitting219600
-Node: Single Character Fields222953
-Node: Command Line Field Separator224013
-Node: Full Line Fields227231
-Ref: Full Line Fields-Footnote-1228753
-Ref: Full Line Fields-Footnote-2228799
-Node: Field Splitting Summary228900
-Node: Constant Size230974
-Node: Splitting By Content236283
-Ref: Splitting By Content-Footnote-1240423
-Node: Multiple Line240586
-Ref: Multiple Line-Footnote-1246468
-Node: Getline246647
-Node: Plain Getline249114
-Node: Getline/Variable251753
-Node: Getline/File252902
-Node: Getline/Variable/File254288
-Ref: Getline/Variable/File-Footnote-1255891
-Node: Getline/Pipe255979
-Node: Getline/Variable/Pipe258684
-Node: Getline/Coprocess259817
-Node: Getline/Variable/Coprocess261082
-Node: Getline Notes261822
-Node: Getline Summary264617
-Ref: table-getline-variants265039
-Node: Read Timeout265787
-Ref: Read Timeout-Footnote-1269693
-Node: Retrying Input269751
-Node: Command-line directories270950
-Node: Input Summary271856
-Node: Input Exercises275028
-Node: Printing275756
-Node: Print277590
-Node: Print Examples279047
-Node: Output Separators281827
-Node: OFMT283844
-Node: Printf285200
-Node: Basic Printf285985
-Node: Control Letters287559
-Node: Format Modifiers291547
-Node: Printf Examples297562
-Node: Redirection300048
-Node: Special FD306889
-Ref: Special FD-Footnote-1310057
-Node: Special Files310131
-Node: Other Inherited Files310748
-Node: Special Network311749
-Node: Special Caveats312609
-Node: Close Files And Pipes313558
-Ref: table-close-pipe-return-values320465
-Ref: Close Files And Pipes-Footnote-1321248
-Ref: Close Files And Pipes-Footnote-2321396
-Node: Nonfatal321548
-Node: Output Summary323873
-Node: Output Exercises325095
-Node: Expressions325774
-Node: Values326962
-Node: Constants327640
-Node: Scalar Constants328331
-Ref: Scalar Constants-Footnote-1329195
-Node: Nondecimal-numbers329445
-Node: Regexp Constants332446
-Node: Using Constant Regexps332972
-Node: Standard Regexp Constants333594
-Node: Strong Regexp Constants336782
-Node: Variables339740
-Node: Using Variables340397
-Node: Assignment Options342307
-Node: Conversion344180
-Node: Strings And Numbers344704
-Ref: Strings And Numbers-Footnote-1347767
-Node: Locale influences conversions347876
-Ref: table-locale-affects350634
-Node: All Operators351252
-Node: Arithmetic Ops351881
-Node: Concatenation354387
-Ref: Concatenation-Footnote-1357234
-Node: Assignment Ops357341
-Ref: table-assign-ops362332
-Node: Increment Ops363645
-Node: Truth Values and Conditions367105
-Node: Truth Values368179
-Node: Typing and Comparison369227
-Node: Variable Typing370047
-Ref: Variable Typing-Footnote-1376510
-Ref: Variable Typing-Footnote-2376582
-Node: Comparison Operators376659
-Ref: table-relational-ops377078
-Node: POSIX String Comparison380573
-Ref: POSIX String Comparison-Footnote-1382268
-Ref: POSIX String Comparison-Footnote-2382407
-Node: Boolean Ops382491
-Ref: Boolean Ops-Footnote-1386973
-Node: Conditional Exp387065
-Node: Function Calls388801
-Node: Precedence392678
-Node: Locales396337
-Node: Expressions Summary397969
-Node: Patterns and Actions400542
-Node: Pattern Overview401662
-Node: Regexp Patterns403339
-Node: Expression Patterns403881
-Node: Ranges407662
-Node: BEGIN/END410770
-Node: Using BEGIN/END411531
-Ref: Using BEGIN/END-Footnote-1414267
-Node: I/O And BEGIN/END414373
-Node: BEGINFILE/ENDFILE416687
-Node: Empty419594
-Node: Using Shell Variables419911
-Node: Action Overview422185
-Node: Statements424510
-Node: If Statement426358
-Node: While Statement427853
-Node: Do Statement429881
-Node: For Statement431029
-Node: Switch Statement434187
-Node: Break Statement436573
-Node: Continue Statement438665
-Node: Next Statement440492
-Node: Nextfile Statement442875
-Node: Exit Statement445527
-Node: Built-in Variables447930
-Node: User-modified449063
-Node: Auto-set456830
-Ref: Auto-set-Footnote-1471558
-Ref: Auto-set-Footnote-2471764
-Node: ARGC and ARGV471820
-Node: Pattern Action Summary476033
-Node: Arrays478463
-Node: Array Basics479792
-Node: Array Intro480636
-Ref: figure-array-elements482611
-Ref: Array Intro-Footnote-1485315
-Node: Reference to Elements485443
-Node: Assigning Elements487907
-Node: Array Example488398
-Node: Scanning an Array490157
-Node: Controlling Scanning493179
-Ref: Controlling Scanning-Footnote-1498578
-Node: Numeric Array Subscripts498894
-Node: Uninitialized Subscripts501078
-Node: Delete502697
-Ref: Delete-Footnote-1505449
-Node: Multidimensional505506
-Node: Multiscanning508601
-Node: Arrays of Arrays510192
-Node: Arrays Summary514959
-Node: Functions517052
-Node: Built-in518090
-Node: Calling Built-in519171
-Node: Numeric Functions521167
-Ref: Numeric Functions-Footnote-1526112
-Ref: Numeric Functions-Footnote-2526469
-Ref: Numeric Functions-Footnote-3526517
-Node: String Functions526789
-Ref: String Functions-Footnote-1550447
-Ref: String Functions-Footnote-2550575
-Ref: String Functions-Footnote-3550823
-Node: Gory Details550910
-Ref: table-sub-escapes552701
-Ref: table-sub-proposed554220
-Ref: table-posix-sub555583
-Ref: table-gensub-escapes557124
-Ref: Gory Details-Footnote-1557947
-Node: I/O Functions558101
-Ref: table-system-return-values564683
-Ref: I/O Functions-Footnote-1566663
-Ref: I/O Functions-Footnote-2566811
-Node: Time Functions566931
-Ref: Time Functions-Footnote-1577598
-Ref: Time Functions-Footnote-2577666
-Ref: Time Functions-Footnote-3577824
-Ref: Time Functions-Footnote-4577935
-Ref: Time Functions-Footnote-5578047
-Ref: Time Functions-Footnote-6578274
-Node: Bitwise Functions578540
-Ref: table-bitwise-ops579134
-Ref: Bitwise Functions-Footnote-1585167
-Ref: Bitwise Functions-Footnote-2585340
-Node: Type Functions585531
-Node: I18N Functions588206
-Node: User-defined589857
-Node: Definition Syntax590662
-Ref: Definition Syntax-Footnote-1596349
-Node: Function Example596420
-Ref: Function Example-Footnote-1599342
-Node: Function Caveats599364
-Node: Calling A Function599882
-Node: Variable Scope600840
-Node: Pass By Value/Reference603834
-Node: Return Statement607333
-Node: Dynamic Typing610312
-Node: Indirect Calls611242
-Ref: Indirect Calls-Footnote-1621493
-Node: Functions Summary621621
-Node: Library Functions624326
-Ref: Library Functions-Footnote-1627933
-Ref: Library Functions-Footnote-2628076
-Node: Library Names628247
-Ref: Library Names-Footnote-1631707
-Ref: Library Names-Footnote-2631930
-Node: General Functions632016
-Node: Strtonum Function633119
-Node: Assert Function636141
-Node: Round Function639467
-Node: Cliff Random Function641008
-Node: Ordinal Functions642024
-Ref: Ordinal Functions-Footnote-1645087
-Ref: Ordinal Functions-Footnote-2645339
-Node: Join Function645549
-Ref: Join Function-Footnote-1647319
-Node: Getlocaltime Function647519
-Node: Readfile Function651261
-Node: Shell Quoting653233
-Node: Data File Management654634
-Node: Filetrans Function655266
-Node: Rewind Function659362
-Node: File Checking661268
-Ref: File Checking-Footnote-1662602
-Node: Empty Files662803
-Node: Ignoring Assigns664782
-Node: Getopt Function666332
-Ref: Getopt Function-Footnote-1677801
-Node: Passwd Functions678001
-Ref: Passwd Functions-Footnote-1686840
-Node: Group Functions686928
-Ref: Group Functions-Footnote-1694826
-Node: Walking Arrays695033
-Node: Library Functions Summary698041
-Node: Library Exercises699447
-Node: Sample Programs699912
-Node: Running Examples700682
-Node: Clones701410
-Node: Cut Program702634
-Node: Egrep Program712563
-Ref: Egrep Program-Footnote-1720075
-Node: Id Program720185
-Node: Split Program723865
-Ref: Split Program-Footnote-1727324
-Node: Tee Program727453
-Node: Uniq Program730243
-Node: Wc Program737669
-Ref: Wc Program-Footnote-1741924
-Node: Miscellaneous Programs742018
-Node: Dupword Program743231
-Node: Alarm Program745261
-Node: Translate Program750116
-Ref: Translate Program-Footnote-1754681
-Node: Labels Program754951
-Ref: Labels Program-Footnote-1758302
-Node: Word Sorting758386
-Node: History Sorting762458
-Node: Extract Program764293
-Node: Simple Sed771822
-Node: Igawk Program774896
-Ref: Igawk Program-Footnote-1789227
-Ref: Igawk Program-Footnote-2789429
-Ref: Igawk Program-Footnote-3789551
-Node: Anagram Program789666
-Node: Signature Program792728
-Node: Programs Summary793975
-Node: Programs Exercises795189
-Ref: Programs Exercises-Footnote-1799318
-Node: Advanced Features799409
-Node: Nondecimal Data801399
-Node: Array Sorting802990
-Node: Controlling Array Traversal803690
-Ref: Controlling Array Traversal-Footnote-1812057
-Node: Array Sorting Functions812175
-Ref: Array Sorting Functions-Footnote-1817266
-Node: Two-way I/O817462
-Ref: Two-way I/O-Footnote-1824013
-Ref: Two-way I/O-Footnote-2824200
-Node: TCP/IP Networking824282
-Node: Profiling827400
-Ref: Profiling-Footnote-1836072
-Node: Advanced Features Summary836395
-Node: Internationalization838239
-Node: I18N and L10N839719
-Node: Explaining gettext840406
-Ref: Explaining gettext-Footnote-1846298
-Ref: Explaining gettext-Footnote-2846483
-Node: Programmer i18n846648
-Ref: Programmer i18n-Footnote-1851597
-Node: Translator i18n851646
-Node: String Extraction852440
-Ref: String Extraction-Footnote-1853572
-Node: Printf Ordering853658
-Ref: Printf Ordering-Footnote-1856444
-Node: I18N Portability856508
-Ref: I18N Portability-Footnote-1858964
-Node: I18N Example859027
-Ref: I18N Example-Footnote-1861833
-Node: Gawk I18N861906
-Node: I18N Summary862551
-Node: Debugger863892
-Node: Debugging864914
-Node: Debugging Concepts865355
-Node: Debugging Terms867164
-Node: Awk Debugging869739
-Node: Sample Debugging Session870645
-Node: Debugger Invocation871179
-Node: Finding The Bug872565
-Node: List of Debugger Commands879043
-Node: Breakpoint Control880376
-Node: Debugger Execution Control884070
-Node: Viewing And Changing Data887432
-Node: Execution Stack890806
-Node: Debugger Info892443
-Node: Miscellaneous Debugger Commands896514
-Node: Readline Support901602
-Node: Limitations902498
-Node: Debugging Summary904607
-Node: Arbitrary Precision Arithmetic905886
-Node: Computer Arithmetic907302
-Ref: table-numeric-ranges910893
-Ref: Computer Arithmetic-Footnote-1911615
-Node: Math Definitions911672
-Ref: table-ieee-formats914986
-Ref: Math Definitions-Footnote-1915589
-Node: MPFR features915694
-Node: FP Math Caution917411
-Ref: FP Math Caution-Footnote-1918483
-Node: Inexactness of computations918852
-Node: Inexact representation919812
-Node: Comparing FP Values921172
-Node: Errors accumulate922254
-Node: Getting Accuracy923687
-Node: Try To Round926397
-Node: Setting precision927296
-Ref: table-predefined-precision-strings927993
-Node: Setting the rounding mode929823
-Ref: table-gawk-rounding-modes930197
-Ref: Setting the rounding mode-Footnote-1933605
-Node: Arbitrary Precision Integers933784
-Ref: Arbitrary Precision Integers-Footnote-1938701
-Node: POSIX Floating Point Problems938850
-Ref: POSIX Floating Point Problems-Footnote-1942732
-Node: Floating point summary942770
-Node: Dynamic Extensions944960
-Node: Extension Intro946513
-Node: Plugin License947779
-Node: Extension Mechanism Outline948576
-Ref: figure-load-extension949015
-Ref: figure-register-new-function950580
-Ref: figure-call-new-function951672
-Node: Extension API Description953734
-Node: Extension API Functions Introduction955376
-Node: General Data Types960710
-Ref: General Data Types-Footnote-1967915
-Node: Memory Allocation Functions968214
-Ref: Memory Allocation Functions-Footnote-1971059
-Node: Constructor Functions971158
-Node: Registration Functions974157
-Node: Extension Functions974842
-Node: Exit Callback Functions980055
-Node: Extension Version String981305
-Node: Input Parsers981968
-Node: Output Wrappers994675
-Node: Two-way processors999187
-Node: Printing Messages1001452
-Ref: Printing Messages-Footnote-11002623
-Node: Updating ERRNO1002776
-Node: Requesting Values1003515
-Ref: table-value-types-returned1004252
-Node: Accessing Parameters1005188
-Node: Symbol Table Access1006423
-Node: Symbol table by name1006935
-Node: Symbol table by cookie1008724
-Ref: Symbol table by cookie-Footnote-11012909
-Node: Cached values1012973
-Ref: Cached values-Footnote-11016509
-Node: Array Manipulation1016600
-Ref: Array Manipulation-Footnote-11017691
-Node: Array Data Types1017728
-Ref: Array Data Types-Footnote-11020386
-Node: Array Functions1020478
-Node: Flattening Arrays1024877
-Node: Creating Arrays1031818
-Node: Redirection API1036587
-Node: Extension API Variables1039429
-Node: Extension Versioning1040062
-Ref: gawk-api-version1040499
-Node: Extension API Informational Variables1042227
-Node: Extension API Boilerplate1043291
-Node: Changes from API V11047153
-Node: Finding Extensions1047813
-Node: Extension Example1048372
-Node: Internal File Description1049170
-Node: Internal File Ops1053250
-Ref: Internal File Ops-Footnote-11064650
-Node: Using Internal File Ops1064790
-Ref: Using Internal File Ops-Footnote-11067173
-Node: Extension Samples1067447
-Node: Extension Sample File Functions1068976
-Node: Extension Sample Fnmatch1076625
-Node: Extension Sample Fork1078112
-Node: Extension Sample Inplace1079330
-Node: Extension Sample Ord1082540
-Node: Extension Sample Readdir1083376
-Ref: table-readdir-file-types1084265
-Node: Extension Sample Revout1085070
-Node: Extension Sample Rev2way1085659
-Node: Extension Sample Read write array1086399
-Node: Extension Sample Readfile1088341
-Node: Extension Sample Time1089436
-Node: Extension Sample API Tests1090784
-Node: gawkextlib1091276
-Node: Extension summary1093723
-Node: Extension Exercises1097425
-Node: Language History1098923
-Node: V7/SVR3.11100579
-Node: SVR41102731
-Node: POSIX1104165
-Node: BTL1105544
-Node: POSIX/GNU1106273
-Node: Feature History1112165
-Node: Common Extensions1126535
-Node: Ranges and Locales1127818
-Ref: Ranges and Locales-Footnote-11132434
-Ref: Ranges and Locales-Footnote-21132461
-Ref: Ranges and Locales-Footnote-31132696
-Node: Contributors1132917
-Node: History summary1138477
-Node: Installation1139857
-Node: Gawk Distribution1140801
-Node: Getting1141285
-Node: Extracting1142246
-Node: Distribution contents1143884
-Node: Unix Installation1150226
-Node: Quick Installation1150908
-Node: Shell Startup Files1153322
-Node: Additional Configuration Options1154411
-Node: Configuration Philosophy1156400
-Node: Non-Unix Installation1158769
-Node: PC Installation1159229
-Node: PC Binary Installation1160067
-Node: PC Compiling1160502
-Node: PC Using1161619
-Node: Cygwin1164664
-Node: MSYS1165434
-Node: VMS Installation1165935
-Node: VMS Compilation1166726
-Ref: VMS Compilation-Footnote-11167955
-Node: VMS Dynamic Extensions1168013
-Node: VMS Installation Details1169698
-Node: VMS Running1171951
-Node: VMS GNV1176230
-Node: VMS Old Gawk1176965
-Node: Bugs1177436
-Node: Bug address1178099
-Node: Usenet1180496
-Node: Maintainers1181273
-Node: Other Versions1182649
-Node: Installation summary1189233
-Node: Notes1190268
-Node: Compatibility Mode1191133
-Node: Additions1191915
-Node: Accessing The Source1192840
-Node: Adding Code1194275
-Node: New Ports1200493
-Node: Derived Files1204981
-Ref: Derived Files-Footnote-11210466
-Ref: Derived Files-Footnote-21210501
-Ref: Derived Files-Footnote-31211099
-Node: Future Extensions1211213
-Node: Implementation Limitations1211871
-Node: Extension Design1213054
-Node: Old Extension Problems1214208
-Ref: Old Extension Problems-Footnote-11215726
-Node: Extension New Mechanism Goals1215783
-Ref: Extension New Mechanism Goals-Footnote-11219147
-Node: Extension Other Design Decisions1219336
-Node: Extension Future Growth1221449
-Node: Old Extension Mechanism1222285
-Node: Notes summary1224048
-Node: Basic Concepts1225230
-Node: Basic High Level1225911
-Ref: figure-general-flow1226193
-Ref: figure-process-flow1226878
-Ref: Basic High Level-Footnote-11230179
-Node: Basic Data Typing1230364
-Node: Glossary1233692
-Node: Copying1265639
-Node: GNU Free Documentation License1303178
-Node: Index1328296
+Node: Foreword343204
+Node: Foreword447646
+Node: Preface49178
+Ref: Preface-Footnote-152037
+Ref: Preface-Footnote-252144
+Ref: Preface-Footnote-352378
+Node: History52520
+Node: Names54872
+Ref: Names-Footnote-155966
+Node: This Manual56113
+Ref: This Manual-Footnote-162598
+Node: Conventions62698
+Node: Manual History65052
+Ref: Manual History-Footnote-168047
+Ref: Manual History-Footnote-268088
+Node: How To Contribute68162
+Node: Acknowledgments68813
+Node: Getting Started73699
+Node: Running gawk76138
+Node: One-shot77328
+Node: Read Terminal78591
+Node: Long80584
+Node: Executable Scripts82097
+Ref: Executable Scripts-Footnote-184892
+Node: Comments84995
+Node: Quoting87479
+Node: DOS Quoting92996
+Node: Sample Data Files95051
+Node: Very Simple97646
+Node: Two Rules102548
+Node: More Complex104433
+Node: Statements/Lines107299
+Ref: Statements/Lines-Footnote-1111758
+Node: Other Features112023
+Node: When112959
+Ref: When-Footnote-1114713
+Node: Intro Summary114778
+Node: Invoking Gawk115662
+Node: Command Line117176
+Node: Options117974
+Ref: Options-Footnote-1134593
+Ref: Options-Footnote-2134823
+Node: Other Arguments134848
+Node: Naming Standard Input137795
+Node: Environment Variables138888
+Node: AWKPATH Variable139446
+Ref: AWKPATH Variable-Footnote-1142857
+Ref: AWKPATH Variable-Footnote-2142891
+Node: AWKLIBPATH Variable143152
+Node: Other Environment Variables144409
+Node: Exit Status148230
+Node: Include Files148907
+Node: Loading Shared Libraries152502
+Node: Obsolete153930
+Node: Undocumented154622
+Node: Invoking Summary154919
+Node: Regexp156579
+Node: Regexp Usage158033
+Node: Escape Sequences160070
+Node: Regexp Operators166302
+Ref: Regexp Operators-Footnote-1173718
+Ref: Regexp Operators-Footnote-2173865
+Node: Bracket Expressions173963
+Ref: table-char-classes176439
+Node: Leftmost Longest179576
+Node: Computed Regexps180879
+Node: GNU Regexp Operators184306
+Node: Case-sensitivity187985
+Ref: Case-sensitivity-Footnote-1190872
+Ref: Case-sensitivity-Footnote-2191107
+Node: Regexp Summary191215
+Node: Reading Files192681
+Node: Records194950
+Node: awk split records195683
+Node: gawk split records200614
+Ref: gawk split records-Footnote-1205154
+Node: Fields205191
+Node: Nonconstant Fields207932
+Ref: Nonconstant Fields-Footnote-1210168
+Node: Changing Fields210372
+Node: Field Separators216300
+Node: Default Field Splitting218998
+Node: Regexp Field Splitting220116
+Node: Single Character Fields223469
+Node: Command Line Field Separator224529
+Node: Full Line Fields227747
+Ref: Full Line Fields-Footnote-1229269
+Ref: Full Line Fields-Footnote-2229315
+Node: Field Splitting Summary229416
+Node: Constant Size231490
+Node: Fixed width data232222
+Node: Skipping intervening235689
+Node: Allowing trailing data236487
+Node: Fields with fixed data237524
+Node: Splitting By Content239042
+Ref: Splitting By Content-Footnote-1242692
+Node: Testing field creation242855
+Node: Multiple Line244476
+Ref: Multiple Line-Footnote-1250360
+Node: Getline250539
+Node: Plain Getline253008
+Node: Getline/Variable255649
+Node: Getline/File256800
+Node: Getline/Variable/File258188
+Ref: Getline/Variable/File-Footnote-1259793
+Node: Getline/Pipe259881
+Node: Getline/Variable/Pipe262588
+Node: Getline/Coprocess263723
+Node: Getline/Variable/Coprocess264990
+Node: Getline Notes265732
+Node: Getline Summary268529
+Ref: table-getline-variants268953
+Node: Read Timeout269701
+Ref: Read Timeout-Footnote-1273607
+Node: Retrying Input273665
+Node: Command-line directories274864
+Node: Input Summary275770
+Node: Input Exercises278942
+Node: Printing279670
+Node: Print281504
+Node: Print Examples282961
+Node: Output Separators285741
+Node: OFMT287758
+Node: Printf289114
+Node: Basic Printf289899
+Node: Control Letters291473
+Node: Format Modifiers295461
+Node: Printf Examples301476
+Node: Redirection303962
+Node: Special FD310803
+Ref: Special FD-Footnote-1313971
+Node: Special Files314045
+Node: Other Inherited Files314662
+Node: Special Network315663
+Node: Special Caveats316523
+Node: Close Files And Pipes317472
+Ref: table-close-pipe-return-values324379
+Ref: Close Files And Pipes-Footnote-1325162
+Ref: Close Files And Pipes-Footnote-2325310
+Node: Nonfatal325462
+Node: Output Summary327787
+Node: Output Exercises329009
+Node: Expressions329688
+Node: Values330876
+Node: Constants331554
+Node: Scalar Constants332245
+Ref: Scalar Constants-Footnote-1333109
+Node: Nondecimal-numbers333359
+Node: Regexp Constants336360
+Node: Using Constant Regexps336886
+Node: Standard Regexp Constants337508
+Node: Strong Regexp Constants340696
+Node: Variables343654
+Node: Using Variables344311
+Node: Assignment Options346221
+Node: Conversion348094
+Node: Strings And Numbers348618
+Ref: Strings And Numbers-Footnote-1351681
+Node: Locale influences conversions351790
+Ref: table-locale-affects354548
+Node: All Operators355166
+Node: Arithmetic Ops355795
+Node: Concatenation358301
+Ref: Concatenation-Footnote-1361148
+Node: Assignment Ops361255
+Ref: table-assign-ops366246
+Node: Increment Ops367559
+Node: Truth Values and Conditions371019
+Node: Truth Values372093
+Node: Typing and Comparison373141
+Node: Variable Typing373961
+Ref: Variable Typing-Footnote-1380424
+Ref: Variable Typing-Footnote-2380496
+Node: Comparison Operators380573
+Ref: table-relational-ops380992
+Node: POSIX String Comparison384487
+Ref: POSIX String Comparison-Footnote-1386182
+Ref: POSIX String Comparison-Footnote-2386321
+Node: Boolean Ops386405
+Ref: Boolean Ops-Footnote-1390887
+Node: Conditional Exp390979
+Node: Function Calls392715
+Node: Precedence396592
+Node: Locales400251
+Node: Expressions Summary401883
+Node: Patterns and Actions404456
+Node: Pattern Overview405576
+Node: Regexp Patterns407253
+Node: Expression Patterns407795
+Node: Ranges411576
+Node: BEGIN/END414684
+Node: Using BEGIN/END415445
+Ref: Using BEGIN/END-Footnote-1418181
+Node: I/O And BEGIN/END418287
+Node: BEGINFILE/ENDFILE420601
+Node: Empty423508
+Node: Using Shell Variables423825
+Node: Action Overview426099
+Node: Statements428424
+Node: If Statement430272
+Node: While Statement431767
+Node: Do Statement433795
+Node: For Statement434943
+Node: Switch Statement438101
+Node: Break Statement440487
+Node: Continue Statement442579
+Node: Next Statement444406
+Node: Nextfile Statement446789
+Node: Exit Statement449441
+Node: Built-in Variables451844
+Node: User-modified452977
+Node: Auto-set460744
+Ref: Auto-set-Footnote-1475472
+Ref: Auto-set-Footnote-2475678
+Node: ARGC and ARGV475734
+Node: Pattern Action Summary479947
+Node: Arrays482377
+Node: Array Basics483706
+Node: Array Intro484550
+Ref: figure-array-elements486525
+Ref: Array Intro-Footnote-1489229
+Node: Reference to Elements489357
+Node: Assigning Elements491821
+Node: Array Example492312
+Node: Scanning an Array494071
+Node: Controlling Scanning497093
+Ref: Controlling Scanning-Footnote-1502492
+Node: Numeric Array Subscripts502808
+Node: Uninitialized Subscripts504992
+Node: Delete506611
+Ref: Delete-Footnote-1509363
+Node: Multidimensional509420
+Node: Multiscanning512515
+Node: Arrays of Arrays514106
+Node: Arrays Summary518873
+Node: Functions520966
+Node: Built-in522004
+Node: Calling Built-in523085
+Node: Numeric Functions525081
+Ref: Numeric Functions-Footnote-1530026
+Ref: Numeric Functions-Footnote-2530383
+Ref: Numeric Functions-Footnote-3530431
+Node: String Functions530703
+Ref: String Functions-Footnote-1554361
+Ref: String Functions-Footnote-2554489
+Ref: String Functions-Footnote-3554737
+Node: Gory Details554824
+Ref: table-sub-escapes556615
+Ref: table-sub-proposed558134
+Ref: table-posix-sub559497
+Ref: table-gensub-escapes561038
+Ref: Gory Details-Footnote-1561861
+Node: I/O Functions562015
+Ref: table-system-return-values568597
+Ref: I/O Functions-Footnote-1570577
+Ref: I/O Functions-Footnote-2570725
+Node: Time Functions570845
+Ref: Time Functions-Footnote-1581512
+Ref: Time Functions-Footnote-2581580
+Ref: Time Functions-Footnote-3581738
+Ref: Time Functions-Footnote-4581849
+Ref: Time Functions-Footnote-5581961
+Ref: Time Functions-Footnote-6582188
+Node: Bitwise Functions582454
+Ref: table-bitwise-ops583048
+Ref: Bitwise Functions-Footnote-1589081
+Ref: Bitwise Functions-Footnote-2589254
+Node: Type Functions589445
+Node: I18N Functions592120
+Node: User-defined593771
+Node: Definition Syntax594576
+Ref: Definition Syntax-Footnote-1600263
+Node: Function Example600334
+Ref: Function Example-Footnote-1603256
+Node: Function Caveats603278
+Node: Calling A Function603796
+Node: Variable Scope604754
+Node: Pass By Value/Reference607748
+Node: Return Statement611247
+Node: Dynamic Typing614226
+Node: Indirect Calls615156
+Ref: Indirect Calls-Footnote-1625407
+Node: Functions Summary625535
+Node: Library Functions628240
+Ref: Library Functions-Footnote-1631847
+Ref: Library Functions-Footnote-2631990
+Node: Library Names632161
+Ref: Library Names-Footnote-1635621
+Ref: Library Names-Footnote-2635844
+Node: General Functions635930
+Node: Strtonum Function637033
+Node: Assert Function640055
+Node: Round Function643381
+Node: Cliff Random Function644922
+Node: Ordinal Functions645938
+Ref: Ordinal Functions-Footnote-1649001
+Ref: Ordinal Functions-Footnote-2649253
+Node: Join Function649463
+Ref: Join Function-Footnote-1651233
+Node: Getlocaltime Function651433
+Node: Readfile Function655175
+Node: Shell Quoting657147
+Node: Data File Management658548
+Node: Filetrans Function659180
+Node: Rewind Function663276
+Node: File Checking665182
+Ref: File Checking-Footnote-1666516
+Node: Empty Files666717
+Node: Ignoring Assigns668696
+Node: Getopt Function670246
+Ref: Getopt Function-Footnote-1681715
+Node: Passwd Functions681915
+Ref: Passwd Functions-Footnote-1690754
+Node: Group Functions690842
+Ref: Group Functions-Footnote-1698740
+Node: Walking Arrays698947
+Node: Library Functions Summary701955
+Node: Library Exercises703361
+Node: Sample Programs703826
+Node: Running Examples704596
+Node: Clones705324
+Node: Cut Program706548
+Node: Egrep Program716477
+Ref: Egrep Program-Footnote-1723989
+Node: Id Program724099
+Node: Split Program727779
+Ref: Split Program-Footnote-1731238
+Node: Tee Program731367
+Node: Uniq Program734157
+Node: Wc Program741583
+Ref: Wc Program-Footnote-1745838
+Node: Miscellaneous Programs745932
+Node: Dupword Program747145
+Node: Alarm Program749175
+Node: Translate Program754030
+Ref: Translate Program-Footnote-1758595
+Node: Labels Program758865
+Ref: Labels Program-Footnote-1762216
+Node: Word Sorting762300
+Node: History Sorting766372
+Node: Extract Program768207
+Node: Simple Sed775736
+Node: Igawk Program778810
+Ref: Igawk Program-Footnote-1793141
+Ref: Igawk Program-Footnote-2793343
+Ref: Igawk Program-Footnote-3793465
+Node: Anagram Program793580
+Node: Signature Program796642
+Node: Programs Summary797889
+Node: Programs Exercises799103
+Ref: Programs Exercises-Footnote-1803232
+Node: Advanced Features803323
+Node: Nondecimal Data805313
+Node: Array Sorting806904
+Node: Controlling Array Traversal807604
+Ref: Controlling Array Traversal-Footnote-1815971
+Node: Array Sorting Functions816089
+Ref: Array Sorting Functions-Footnote-1821180
+Node: Two-way I/O821376
+Ref: Two-way I/O-Footnote-1827927
+Ref: Two-way I/O-Footnote-2828114
+Node: TCP/IP Networking828196
+Node: Profiling831314
+Ref: Profiling-Footnote-1839986
+Node: Advanced Features Summary840309
+Node: Internationalization842153
+Node: I18N and L10N843633
+Node: Explaining gettext844320
+Ref: Explaining gettext-Footnote-1850212
+Ref: Explaining gettext-Footnote-2850397
+Node: Programmer i18n850562
+Ref: Programmer i18n-Footnote-1855511
+Node: Translator i18n855560
+Node: String Extraction856354
+Ref: String Extraction-Footnote-1857486
+Node: Printf Ordering857572
+Ref: Printf Ordering-Footnote-1860358
+Node: I18N Portability860422
+Ref: I18N Portability-Footnote-1862878
+Node: I18N Example862941
+Ref: I18N Example-Footnote-1865747
+Node: Gawk I18N865820
+Node: I18N Summary866465
+Node: Debugger867806
+Node: Debugging868828
+Node: Debugging Concepts869269
+Node: Debugging Terms871078
+Node: Awk Debugging873653
+Node: Sample Debugging Session874559
+Node: Debugger Invocation875093
+Node: Finding The Bug876479
+Node: List of Debugger Commands882957
+Node: Breakpoint Control884290
+Node: Debugger Execution Control887984
+Node: Viewing And Changing Data891346
+Node: Execution Stack894720
+Node: Debugger Info896357
+Node: Miscellaneous Debugger Commands900428
+Node: Readline Support905516
+Node: Limitations906412
+Node: Debugging Summary908521
+Node: Arbitrary Precision Arithmetic909800
+Node: Computer Arithmetic911216
+Ref: table-numeric-ranges914807
+Ref: Computer Arithmetic-Footnote-1915529
+Node: Math Definitions915586
+Ref: table-ieee-formats918900
+Ref: Math Definitions-Footnote-1919503
+Node: MPFR features919608
+Node: FP Math Caution921325
+Ref: FP Math Caution-Footnote-1922397
+Node: Inexactness of computations922766
+Node: Inexact representation923726
+Node: Comparing FP Values925086
+Node: Errors accumulate926168
+Node: Getting Accuracy927601
+Node: Try To Round930311
+Node: Setting precision931210
+Ref: table-predefined-precision-strings931907
+Node: Setting the rounding mode933737
+Ref: table-gawk-rounding-modes934111
+Ref: Setting the rounding mode-Footnote-1937519
+Node: Arbitrary Precision Integers937698
+Ref: Arbitrary Precision Integers-Footnote-1942615
+Node: POSIX Floating Point Problems942764
+Ref: POSIX Floating Point Problems-Footnote-1946646
+Node: Floating point summary946684
+Node: Dynamic Extensions948874
+Node: Extension Intro950427
+Node: Plugin License951693
+Node: Extension Mechanism Outline952490
+Ref: figure-load-extension952929
+Ref: figure-register-new-function954494
+Ref: figure-call-new-function955586
+Node: Extension API Description957648
+Node: Extension API Functions Introduction959290
+Node: General Data Types964624
+Ref: General Data Types-Footnote-1971829
+Node: Memory Allocation Functions972128
+Ref: Memory Allocation Functions-Footnote-1974973
+Node: Constructor Functions975072
+Node: Registration Functions978071
+Node: Extension Functions978756
+Node: Exit Callback Functions983969
+Node: Extension Version String985219
+Node: Input Parsers985882
+Node: Output Wrappers998589
+Node: Two-way processors1003101
+Node: Printing Messages1005366
+Ref: Printing Messages-Footnote-11006537
+Node: Updating ERRNO1006690
+Node: Requesting Values1007429
+Ref: table-value-types-returned1008166
+Node: Accessing Parameters1009102
+Node: Symbol Table Access1010337
+Node: Symbol table by name1010849
+Node: Symbol table by cookie1012638
+Ref: Symbol table by cookie-Footnote-11016823
+Node: Cached values1016887
+Ref: Cached values-Footnote-11020423
+Node: Array Manipulation1020514
+Ref: Array Manipulation-Footnote-11021605
+Node: Array Data Types1021642
+Ref: Array Data Types-Footnote-11024300
+Node: Array Functions1024392
+Node: Flattening Arrays1028791
+Node: Creating Arrays1035732
+Node: Redirection API1040501
+Node: Extension API Variables1043343
+Node: Extension Versioning1043976
+Ref: gawk-api-version1044413
+Node: Extension API Informational Variables1046141
+Node: Extension API Boilerplate1047205
+Node: Changes from API V11051067
+Node: Finding Extensions1051727
+Node: Extension Example1052286
+Node: Internal File Description1053084
+Node: Internal File Ops1057164
+Ref: Internal File Ops-Footnote-11068564
+Node: Using Internal File Ops1068704
+Ref: Using Internal File Ops-Footnote-11071087
+Node: Extension Samples1071361
+Node: Extension Sample File Functions1072890
+Node: Extension Sample Fnmatch1080539
+Node: Extension Sample Fork1082026
+Node: Extension Sample Inplace1083244
+Node: Extension Sample Ord1086454
+Node: Extension Sample Readdir1087290
+Ref: table-readdir-file-types1088179
+Node: Extension Sample Revout1088984
+Node: Extension Sample Rev2way1089573
+Node: Extension Sample Read write array1090313
+Node: Extension Sample Readfile1092255
+Node: Extension Sample Time1093350
+Node: Extension Sample API Tests1094698
+Node: gawkextlib1095190
+Node: Extension summary1097637
+Node: Extension Exercises1101339
+Node: Language History1102837
+Node: V7/SVR3.11104493
+Node: SVR41106645
+Node: POSIX1108079
+Node: BTL1109458
+Node: POSIX/GNU1110187
+Node: Feature History1116079
+Node: Common Extensions1130449
+Node: Ranges and Locales1131732
+Ref: Ranges and Locales-Footnote-11136348
+Ref: Ranges and Locales-Footnote-21136375
+Ref: Ranges and Locales-Footnote-31136610
+Node: Contributors1136831
+Node: History summary1142391
+Node: Installation1143771
+Node: Gawk Distribution1144715
+Node: Getting1145199
+Node: Extracting1146160
+Node: Distribution contents1147798
+Node: Unix Installation1154140
+Node: Quick Installation1154822
+Node: Shell Startup Files1157236
+Node: Additional Configuration Options1158325
+Node: Configuration Philosophy1160314
+Node: Non-Unix Installation1162683
+Node: PC Installation1163143
+Node: PC Binary Installation1163981
+Node: PC Compiling1164416
+Node: PC Using1165533
+Node: Cygwin1168578
+Node: MSYS1169348
+Node: VMS Installation1169849
+Node: VMS Compilation1170640
+Ref: VMS Compilation-Footnote-11171869
+Node: VMS Dynamic Extensions1171927
+Node: VMS Installation Details1173612
+Node: VMS Running1175865
+Node: VMS GNV1180144
+Node: VMS Old Gawk1180879
+Node: Bugs1181350
+Node: Bug address1182013
+Node: Usenet1184410
+Node: Maintainers1185187
+Node: Other Versions1186563
+Node: Installation summary1193147
+Node: Notes1194182
+Node: Compatibility Mode1195047
+Node: Additions1195829
+Node: Accessing The Source1196754
+Node: Adding Code1198189
+Node: New Ports1204407
+Node: Derived Files1208895
+Ref: Derived Files-Footnote-11214380
+Ref: Derived Files-Footnote-21214415
+Ref: Derived Files-Footnote-31215013
+Node: Future Extensions1215127
+Node: Implementation Limitations1215785
+Node: Extension Design1216968
+Node: Old Extension Problems1218122
+Ref: Old Extension Problems-Footnote-11219640
+Node: Extension New Mechanism Goals1219697
+Ref: Extension New Mechanism Goals-Footnote-11223061
+Node: Extension Other Design Decisions1223250
+Node: Extension Future Growth1225363
+Node: Old Extension Mechanism1226199
+Node: Notes summary1227962
+Node: Basic Concepts1229144
+Node: Basic High Level1229825
+Ref: figure-general-flow1230107
+Ref: figure-process-flow1230792
+Ref: Basic High Level-Footnote-11234093
+Node: Basic Data Typing1234278
+Node: Glossary1237606
+Node: Copying1269553
+Node: GNU Free Documentation License1307092
+Node: Index1332210

End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 353a0c9d..5b9eeed7 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -568,7 +568,13 @@ particular records in a file and perform operations upon them.
field.
* Field Splitting Summary:: Some final points and a summary table.
* Constant Size:: Reading constant width data.
+* Fixed width data:: Processing fixed-width data.
+* Skipping intervening:: Skipping intervening fields.
+* Allowing trailing data:: Capturing optional trailing data.
+* Fields with fixed data:: Field values with fixed-width data.
* Splitting By Content:: Defining Fields By Content
+* Testing field creation:: Checking how @command{gawk} is
+ splitting records.
* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program
control using the @code{getline}
@@ -6431,6 +6437,8 @@ used with it do not have to be named on the @command{awk} command line
* Field Separators:: The field separator and how to change it.
* Constant Size:: Reading constant width data.
* Splitting By Content:: Defining Fields By Content
+* Testing field creation:: Checking how @command{gawk} is splitting
+ records.
* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program control
using the @code{getline} function.
@@ -7756,18 +7764,30 @@ feature of @command{gawk}. If you are a novice @command{awk} user,
you might want to skip it on the first reading.
@command{gawk} provides a facility for dealing with fixed-width fields
-with no distinctive field separator. For example, data of this nature
-arises in the input for old Fortran programs where numbers are run
-together, or in the output of programs that did not anticipate the use
-of their output as input for other programs.
-
-An example of the latter is a table where all the columns are lined up by
-the use of a variable number of spaces and @emph{empty fields are just
-spaces}. Clearly, @command{awk}'s normal field splitting based on @code{FS}
-does not work well in this case. Although a portable @command{awk} program
-can use a series of @code{substr()} calls on @code{$0}
-(@pxref{String Functions}),
-this is awkward and inefficient for a large number of fields.
+with no distinctive field separator. We discuss this feature in
+the following @value{SUBSECTION}s.
+
+@menu
+* Fixed width data:: Processing fixed-width data.
+* Skipping intervening:: Skipping intervening fields.
+* Allowing trailing data:: Capturing optional trailing data.
+* Fields with fixed data:: Field values with fixed-width data.
+@end menu
+
+@node Fixed width data
+@subsection Processing Fixed-Width Data
+
+An example of fixed-width data would be the input for old Fortran programs
+where numbers are run together, or the output of programs that did not
+anticipate the use of their output as input for other programs.
+
+An example of the latter is a table where all the columns are lined up
+by the use of a variable number of spaces and @emph{empty fields are
+just spaces}. Clearly, @command{awk}'s normal field splitting based
+on @code{FS} does not work well in this case. Although a portable
+@command{awk} program can use a series of @code{substr()} calls on
+@code{$0} (@pxref{String Functions}), this is awkward and inefficient
+for a large number of fields.
@cindex troubleshooting, fatal errors, field widths@comma{} specifying
@cindex @command{w} utility
@@ -7775,14 +7795,12 @@ this is awkward and inefficient for a large number of fields.
@cindex @command{gawk}, @code{FIELDWIDTHS} variable in
The splitting of an input record into fixed-width fields is specified by
assigning a string containing space-separated numbers to the built-in
-variable @code{FIELDWIDTHS}. Each number specifies the width of the field,
-@emph{including} columns between fields. If you want to ignore the columns
-between fields, you can specify the width as a separate field that is
-subsequently ignored.
-Or, starting in @value{PVERSION} 4.2, each field width may optionally be
-preceded by a colon-separated value specifying the number of characters to skip
-before the field starts.
-It is a fatal error to supply a field width that has a negative value.
+variable @code{FIELDWIDTHS}. Each number specifies the width of the
+field, @emph{including} columns between fields. If you want to ignore
+the columns between fields, you can specify the width as a separate
+field that is subsequently ignored. It is a fatal error to supply a
+field width that has a negative value.
+
The following data is the output of the Unix @command{w} utility. It is useful
to illustrate the use of @code{FIELDWIDTHS}:
@@ -7812,7 +7830,7 @@ NR > 2 @{
sub(/^ +/, "", idle) # strip leading spaces
if (idle == "")
idle = 0
- if (idle ~ /:/) @{
+ if (idle ~ /:/) @{ # hh:mm
split(idle, t, ":")
idle = t[1] * 60 + t[2]
@}
@@ -7841,13 +7859,30 @@ brent ttyp0 286
dave ttyq4 1296000
@end example
-Starting in @value{PVERSION} 4.2, this program could be rewritten to
-specify @code{FIELDWIDTHS} like so:
+Another (possibly more practical) example of fixed-width input data
+is the input from a deck of balloting cards. In some parts of
+the United States, voters mark their choices by punching holes in computer
+cards. These cards are then processed to count the votes for any particular
+candidate or on any particular issue. Because a voter may choose not to
+vote on some issue, any column on the card may be empty. An @command{awk}
+program for processing such data could use the @code{FIELDWIDTHS} feature
+to simplify reading the data. (Of course, getting @command{gawk} to run on
+a system with card readers is another story!)
+
+@node Skipping intervening
+@subsection Skipping Intervening Fields
+
+Starting in @value{PVERSION} 4.2, each field width may optionally be
+preceded by a colon-separated value specifying the number of characters
+to skip before the field starts. Thus, the preceding program could be
+rewritten to specify @code{FIELDWIDTHS} like so:
+
@example
BEGIN @{ FIELDWIDTHS = "8 1:5 4:7 6 1:6 1:6 2:33" @}
@end example
+
This strips away some of the white space separating the fields. With such
-a change, the program would produce the following results:
+a change, the program produces the following results:
@example
hzang ttyV3 50
@@ -7859,42 +7894,65 @@ brent ttyp0 286
dave ttyq4 1296000
@end example
-Another (possibly more practical) example of fixed-width input data
-is the input from a deck of balloting cards. In some parts of
-the United States, voters mark their choices by punching holes in computer
-cards. These cards are then processed to count the votes for any particular
-candidate or on any particular issue. Because a voter may choose not to
-vote on some issue, any column on the card may be empty. An @command{awk}
-program for processing such data could use the @code{FIELDWIDTHS} feature
-to simplify reading the data. (Of course, getting @command{gawk} to run on
-a system with card readers is another story!)
+@node Allowing trailing data
+@subsection Capturing Optional Trailing Data
-@cindex @command{gawk}, splitting fields and
-Assigning a value to @code{FS} causes @command{gawk} to use
-@code{FS} for field splitting again. Use @samp{FS = FS} to make this happen,
-without having to know the current value of @code{FS}.
-In order to tell which kind of field splitting is in effect,
-use @code{PROCINFO["FS"]}
-(@pxref{Auto-set}).
-The value is @code{"FS"} if regular field splitting is being used,
-or @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
+There are times when fixed-width data may be followed by additional data
+that has no fixed length. Such data may or may not be present, but if
+it is, it should be possible to get at it from an @command{awk} program.
+
+Starting with version 4.2, in order to provide a way to say ``anything
+else in the record after the defined fields,'' @command{gawk}
+allows you to add a final @samp{*} character to the value of
+@code{FIELDWIDTHS}. There can only be one such character, and it must
+be the final non-whitespace character in @code{FIELDWIDTHS}.
+For example:
@example
-if (PROCINFO["FS"] == "FS")
- @var{regular field splitting} @dots{}
-else if (PROCINFO["FS"] == "FIELDWIDTHS")
- @var{fixed-width field splitting} @dots{}
-else if (PROCINFO["FS"] == "FPAT")
- @var{content-based field splitting} @dots{} @ii{(see next @value{SECTION})}
-else
- @var{API input parser field splitting} @dots{} @ii{(advanced feature)}
+$ @kbd{cat fw.awk} @ii{Show the program}
+@print{} BEGIN @{ FIELDWIDTHS = "2 2 *" @}
+@print{} @{ print NF, $1, $2, $3 @}
+$ @kbd{cat fw.in} @ii{Show sample input}
+@print{} 1234abcdefghi
+$ @kbd{gawk -f fw.awk fw.in} @ii{Run the program}
+@print{} 3 12 34 abcdefghi
@end example
-This information is useful when writing a function
-that needs to temporarily change @code{FS} or @code{FIELDWIDTHS},
-read some records, and then restore the original settings
-(@pxref{Passwd Functions}
-for an example of such a function).
+@node Fields with fixed data
+@subsection Field Values With Fixed-Width Data
+
+So far, so good. But what happens if there isn't as much data as there
+should be based on the contents of @code{FIELDWIDTHS}? Or, what happens
+if there is more data than expected?
+
+For many years, what happened in these cases was not well defined. Starting
+with version 4.2, the rules are as follows:
+
+@table @asis
+@item Enough data for some fields
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4"} and the
+input record is @samp{aabbb}. In this case, @code{NF} is set to two.
+
+@item Not enough data for a field
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4"} and the
+input record is @samp{aab}. In this case, @code{NF} is set to two and
+@code{$2} has the value @code{"b"}. The idea is that even though there
+aren't as many characters as were expected, there are some, so the data
+should be made available to the program.
+
+@item Too much data
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4"} and the
+input record is @samp{aabbbccccddd}. In this case, @code{NF} is set to
+three and the extra characters (@samp{ddd}) are ignored. If you want
+@command{gawk} to capture the extra characters, supply a final @samp{*}
+in the value of @code{FIELDWIDTHS}.
+
+@item Too much data, but with @samp{*} supplied
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4 *"} and the
+input record is @samp{aabbbccccddd}. In this case, @code{NF} is set to
+four, and @code{$4} has the value @code{"ddd"}.
+
+@end table
@node Splitting By Content
@section Defining Fields by Content
@@ -7995,8 +8053,6 @@ affects field splitting with @code{FPAT}.
Assigning a value to @code{FPAT} overrides field splitting
with @code{FS} and with @code{FIELDWIDTHS}.
-Similar to @code{FIELDWIDTHS}, the value of @code{PROCINFO["FS"]}
-will be @code{"FPAT"} if content-based field splitting is being used.
@quotation NOTE
Some programs export CSV data that contains embedded newlines between
@@ -8023,13 +8079,44 @@ FPAT = "([^,]*)|(\"[^\"]+\")"
Finally, the @code{patsplit()} function makes the same functionality
available for splitting regular strings (@pxref{String Functions}).
-To recap, @command{gawk} provides three independent methods
-to split input records into fields.
-The mechanism used is based on which of the three
-variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
-last assigned to. In addition, an API input parser may choose to
-override the record parsing mechanism; please refer to @ref{Input Parsers}
-for further information about this feature.
+
+@node Testing field creation
+@section Checking How @command{gawk} Is Splitting Records
+
+@cindex @command{gawk}, splitting fields and
+As we've seen, @command{gawk} provides three independent methods to split
+input records into fields. The mechanism used is based on which of the
+three variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
+last assigned to. In addition, an API input parser may choose to override
+the record parsing mechanism; please refer to @ref{Input Parsers} for
+further information about this feature.
+
+To restore normal field splitting after using @code{FIELDWIDTHS}
+and/or @code{FPAT}, simply assign a value to @code{FS}.
+You can use @samp{FS = FS} to do this,
+without having to know the current value of @code{FS}.
+
+In order to tell which kind of field splitting is in effect,
+use @code{PROCINFO["FS"]} (@pxref{Auto-set}).
+The value is @code{"FS"} if regular field splitting is being used,
+@code{"FIELDWIDTHS"} if fixed-width field splitting is being used,
+or @code{"FPAT"} if content-based field splitting is being used:
+
+@example
+if (PROCINFO["FS"] == "FS")
+ @var{regular field splitting} @dots{}
+else if (PROCINFO["FS"] == "FIELDWIDTHS")
+ @var{fixed-width field splitting} @dots{}
+else if (PROCINFO["FS"] == "FPAT")
+ @var{content-based field splitting} @dots{}
+else
+ @var{API input parser field splitting} @dots{} @ii{(advanced feature)}
+@end example
+
+This information is useful when writing a function that needs to
+temporarily change @code{FS} or @code{FIELDWIDTHS}, read some records,
+and then restore the original settings (@pxref{Passwd Functions} for an
+example of such a function).
@node Multiple Line
@section Multiple-Line Records
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index d5707932..1e1b1340 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -563,7 +563,13 @@ particular records in a file and perform operations upon them.
field.
* Field Splitting Summary:: Some final points and a summary table.
* Constant Size:: Reading constant width data.
+* Fixed width data:: Processing fixed-width data.
+* Skipping intervening:: Skipping intervening fields.
+* Allowing trailing data:: Capturing optional trailing data.
+* Fields with fixed data:: Field values with fixed-width data.
* Splitting By Content:: Defining Fields By Content
+* Testing field creation:: Checking how @command{gawk} is
+ splitting records.
* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program
control using the @code{getline}
@@ -6215,6 +6221,8 @@ used with it do not have to be named on the @command{awk} command line
* Field Separators:: The field separator and how to change it.
* Constant Size:: Reading constant width data.
* Splitting By Content:: Defining Fields By Content
+* Testing field creation:: Checking how @command{gawk} is splitting
+ records.
* Multiple Line:: Reading multiline records.
* Getline:: Reading files under explicit program control
using the @code{getline} function.
@@ -7356,18 +7364,30 @@ feature of @command{gawk}. If you are a novice @command{awk} user,
you might want to skip it on the first reading.
@command{gawk} provides a facility for dealing with fixed-width fields
-with no distinctive field separator. For example, data of this nature
-arises in the input for old Fortran programs where numbers are run
-together, or in the output of programs that did not anticipate the use
-of their output as input for other programs.
-
-An example of the latter is a table where all the columns are lined up by
-the use of a variable number of spaces and @emph{empty fields are just
-spaces}. Clearly, @command{awk}'s normal field splitting based on @code{FS}
-does not work well in this case. Although a portable @command{awk} program
-can use a series of @code{substr()} calls on @code{$0}
-(@pxref{String Functions}),
-this is awkward and inefficient for a large number of fields.
+with no distinctive field separator. We discuss this feature in
+the following @value{SUBSECTION}s.
+
+@menu
+* Fixed width data:: Processing fixed-width data.
+* Skipping intervening:: Skipping intervening fields.
+* Allowing trailing data:: Capturing optional trailing data.
+* Fields with fixed data:: Field values with fixed-width data.
+@end menu
+
+@node Fixed width data
+@subsection Processing Fixed-Width Data
+
+An example of fixed-width data would be the input for old Fortran programs
+where numbers are run together, or the output of programs that did not
+anticipate the use of their output as input for other programs.
+
+An example of the latter is a table where all the columns are lined up
+by the use of a variable number of spaces and @emph{empty fields are
+just spaces}. Clearly, @command{awk}'s normal field splitting based
+on @code{FS} does not work well in this case. Although a portable
+@command{awk} program can use a series of @code{substr()} calls on
+@code{$0} (@pxref{String Functions}), this is awkward and inefficient
+for a large number of fields.
@cindex troubleshooting, fatal errors, field widths@comma{} specifying
@cindex @command{w} utility
@@ -7375,14 +7395,12 @@ this is awkward and inefficient for a large number of fields.
@cindex @command{gawk}, @code{FIELDWIDTHS} variable in
The splitting of an input record into fixed-width fields is specified by
assigning a string containing space-separated numbers to the built-in
-variable @code{FIELDWIDTHS}. Each number specifies the width of the field,
-@emph{including} columns between fields. If you want to ignore the columns
-between fields, you can specify the width as a separate field that is
-subsequently ignored.
-Or, starting in @value{PVERSION} 4.2, each field width may optionally be
-preceded by a colon-separated value specifying the number of characters to skip
-before the field starts.
-It is a fatal error to supply a field width that has a negative value.
+variable @code{FIELDWIDTHS}. Each number specifies the width of the
+field, @emph{including} columns between fields. If you want to ignore
+the columns between fields, you can specify the width as a separate
+field that is subsequently ignored. It is a fatal error to supply a
+field width that has a negative value.
+
The following data is the output of the Unix @command{w} utility. It is useful
to illustrate the use of @code{FIELDWIDTHS}:
@@ -7412,7 +7430,7 @@ NR > 2 @{
sub(/^ +/, "", idle) # strip leading spaces
if (idle == "")
idle = 0
- if (idle ~ /:/) @{
+ if (idle ~ /:/) @{ # hh:mm
split(idle, t, ":")
idle = t[1] * 60 + t[2]
@}
@@ -7441,13 +7459,30 @@ brent ttyp0 286
dave ttyq4 1296000
@end example
-Starting in @value{PVERSION} 4.2, this program could be rewritten to
-specify @code{FIELDWIDTHS} like so:
+Another (possibly more practical) example of fixed-width input data
+is the input from a deck of balloting cards. In some parts of
+the United States, voters mark their choices by punching holes in computer
+cards. These cards are then processed to count the votes for any particular
+candidate or on any particular issue. Because a voter may choose not to
+vote on some issue, any column on the card may be empty. An @command{awk}
+program for processing such data could use the @code{FIELDWIDTHS} feature
+to simplify reading the data. (Of course, getting @command{gawk} to run on
+a system with card readers is another story!)
+
+@node Skipping intervening
+@subsection Skipping Intervening Fields
+
+Starting in @value{PVERSION} 4.2, each field width may optionally be
+preceded by a colon-separated value specifying the number of characters
+to skip before the field starts. Thus, the preceding program could be
+rewritten to specify @code{FIELDWIDTHS} like so:
+
@example
BEGIN @{ FIELDWIDTHS = "8 1:5 4:7 6 1:6 1:6 2:33" @}
@end example
+
This strips away some of the white space separating the fields. With such
-a change, the program would produce the following results:
+a change, the program produces the following results:
@example
hzang ttyV3 50
@@ -7459,42 +7494,65 @@ brent ttyp0 286
dave ttyq4 1296000
@end example
-Another (possibly more practical) example of fixed-width input data
-is the input from a deck of balloting cards. In some parts of
-the United States, voters mark their choices by punching holes in computer
-cards. These cards are then processed to count the votes for any particular
-candidate or on any particular issue. Because a voter may choose not to
-vote on some issue, any column on the card may be empty. An @command{awk}
-program for processing such data could use the @code{FIELDWIDTHS} feature
-to simplify reading the data. (Of course, getting @command{gawk} to run on
-a system with card readers is another story!)
+@node Allowing trailing data
+@subsection Capturing Optional Trailing Data
-@cindex @command{gawk}, splitting fields and
-Assigning a value to @code{FS} causes @command{gawk} to use
-@code{FS} for field splitting again. Use @samp{FS = FS} to make this happen,
-without having to know the current value of @code{FS}.
-In order to tell which kind of field splitting is in effect,
-use @code{PROCINFO["FS"]}
-(@pxref{Auto-set}).
-The value is @code{"FS"} if regular field splitting is being used,
-or @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
+There are times when fixed-width data may be followed by additional data
+that has no fixed length. Such data may or may not be present, but if
+it is, it should be possible to get at it from an @command{awk} program.
+
+Starting with version 4.2, in order to provide a way to say ``anything
+else in the record after the defined fields,'' @command{gawk}
+allows you to add a final @samp{*} character to the value of
+@code{FIELDWIDTHS}. There can only be one such character, and it must
+be the final non-whitespace character in @code{FIELDWIDTHS}.
+For example:
@example
-if (PROCINFO["FS"] == "FS")
- @var{regular field splitting} @dots{}
-else if (PROCINFO["FS"] == "FIELDWIDTHS")
- @var{fixed-width field splitting} @dots{}
-else if (PROCINFO["FS"] == "FPAT")
- @var{content-based field splitting} @dots{} @ii{(see next @value{SECTION})}
-else
- @var{API input parser field splitting} @dots{} @ii{(advanced feature)}
+$ @kbd{cat fw.awk} @ii{Show the program}
+@print{} BEGIN @{ FIELDWIDTHS = "2 2 *" @}
+@print{} @{ print NF, $1, $2, $3 @}
+$ @kbd{cat fw.in} @ii{Show sample input}
+@print{} 1234abcdefghi
+$ @kbd{gawk -f fw.awk fw.in} @ii{Run the program}
+@print{} 3 12 34 abcdefghi
@end example
-This information is useful when writing a function
-that needs to temporarily change @code{FS} or @code{FIELDWIDTHS},
-read some records, and then restore the original settings
-(@pxref{Passwd Functions}
-for an example of such a function).
+@node Fields with fixed data
+@subsection Field Values With Fixed-Width Data
+
+So far, so good. But what happens if there isn't as much data as there
+should be based on the contents of @code{FIELDWIDTHS}? Or, what happens
+if there is more data than expected?
+
+For many years, what happened in these cases was not well defined. Starting
+with version 4.2, the rules are as follows:
+
+@table @asis
+@item Enough data for some fields
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4"} and the
+input record is @samp{aabbb}. In this case, @code{NF} is set to two.
+
+@item Not enough data for a field
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4"} and the
+input record is @samp{aab}. In this case, @code{NF} is set to two and
+@code{$2} has the value @code{"b"}. The idea is that even though there
+aren't as many characters as were expected, there are some, so the data
+should be made available to the program.
+
+@item Too much data
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4"} and the
+input record is @samp{aabbbccccddd}. In this case, @code{NF} is set to
+three and the extra characters (@samp{ddd}) are ignored. If you want
+@command{gawk} to capture the extra characters, supply a final @samp{*}
+in the value of @code{FIELDWIDTHS}.
+
+@item Too much data, but with @samp{*} supplied
+For example, suppose @code{FIELDWIDTHS} is set to @code{"2 3 4 *"} and the
+input record is @samp{aabbbccccddd}. In this case, @code{NF} is set to
+four, and @code{$4} has the value @code{"ddd"}.
+
+@end table
@node Splitting By Content
@section Defining Fields by Content
@@ -7595,8 +7653,6 @@ affects field splitting with @code{FPAT}.
Assigning a value to @code{FPAT} overrides field splitting
with @code{FS} and with @code{FIELDWIDTHS}.
-Similar to @code{FIELDWIDTHS}, the value of @code{PROCINFO["FS"]}
-will be @code{"FPAT"} if content-based field splitting is being used.
@quotation NOTE
Some programs export CSV data that contains embedded newlines between
@@ -7623,13 +7679,44 @@ FPAT = "([^,]*)|(\"[^\"]+\")"
Finally, the @code{patsplit()} function makes the same functionality
available for splitting regular strings (@pxref{String Functions}).
-To recap, @command{gawk} provides three independent methods
-to split input records into fields.
-The mechanism used is based on which of the three
-variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
-last assigned to. In addition, an API input parser may choose to
-override the record parsing mechanism; please refer to @ref{Input Parsers}
-for further information about this feature.
+
+@node Testing field creation
+@section Checking How @command{gawk} Is Splitting Records
+
+@cindex @command{gawk}, splitting fields and
+As we've seen, @command{gawk} provides three independent methods to split
+input records into fields. The mechanism used is based on which of the
+three variables---@code{FS}, @code{FIELDWIDTHS}, or @code{FPAT}---was
+last assigned to. In addition, an API input parser may choose to override
+the record parsing mechanism; please refer to @ref{Input Parsers} for
+further information about this feature.
+
+To restore normal field splitting after using @code{FIELDWIDTHS}
+and/or @code{FPAT}, simply assign a value to @code{FS}.
+You can use @samp{FS = FS} to do this,
+without having to know the current value of @code{FS}.
+
+In order to tell which kind of field splitting is in effect,
+use @code{PROCINFO["FS"]} (@pxref{Auto-set}).
+The value is @code{"FS"} if regular field splitting is being used,
+@code{"FIELDWIDTHS"} if fixed-width field splitting is being used,
+or @code{"FPAT"} if content-based field splitting is being used:
+
+@example
+if (PROCINFO["FS"] == "FS")
+ @var{regular field splitting} @dots{}
+else if (PROCINFO["FS"] == "FIELDWIDTHS")
+ @var{fixed-width field splitting} @dots{}
+else if (PROCINFO["FS"] == "FPAT")
+ @var{content-based field splitting} @dots{}
+else
+ @var{API input parser field splitting} @dots{} @ii{(advanced feature)}
+@end example
+
+This information is useful when writing a function that needs to
+temporarily change @code{FS} or @code{FIELDWIDTHS}, read some records,
+and then restore the original settings (@pxref{Passwd Functions} for an
+example of such a function).
@node Multiple Line
@section Multiple-Line Records