Now at gawk 2.10.

author: Arnold D. Robbins <arnold@skeeve.com> 2010-07-02 15:53:23 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2010-07-02 15:53:23 +0300
commit: f3d9dd233ac07f764a554528c85be3768a1d1ddb (patch)
tree: f190ab7e0188c66eba76a74b8717e3ad7b16ef04 /gawk-info-2
parent: 0f1b7311fbc0e61e3e12194ce3e8484aaa4b7fe6 (diff)
1 files changed, 1265 insertions, 0 deletions
diff --git a/gawk-info-2 b/gawk-info-2
new file mode 100644
index 00000000..a228c5b9
--- /dev/null
+++ b/gawk-info-2
@@ -0,0 +1,1265 @@
+Info file gawk-info, produced by Makeinfo, -*- Text -*- from input
+file gawk.texinfo.
+
+This file documents `awk', a program that you can use to select
+particular records in a file and perform operations upon them.
+
+Copyright (C) 1989 Free Software Foundation, Inc.
+
+Permission is granted to make and distribute verbatim copies of this
+manual provided the copyright notice and this permission notice are
+preserved on all copies.
+
+Permission is granted to copy and distribute modified versions of
+this manual under the conditions for verbatim copying, provided that
+the entire resulting derived work is distributed under the terms of a
+permission notice identical to this one.
+
+Permission is granted to copy and distribute translations of this
+manual into another language, under the above conditions for modified
+versions, except that this permission notice may be stated in a
+translation approved by the Foundation.
+
+
+
+File: gawk-info,  Node: Fields,  Next: Non-Constant Fields,  Prev: Records,  Up: Reading Files
+
+Examining Fields
+================
+
+When `awk' reads an input record, the record is automatically
+separated or "parsed" by the interpreter into pieces called "fields".
+By default, fields are separated by whitespace, like words in a line.
+Whitespace in `awk' means any string of one or more spaces and/or
+tabs; other characters such as newline, formfeed, and so on, that are
+considered whitespace by other languages are *not* considered
+whitespace by `awk'.
+
+The purpose of fields is to make it more convenient for you to refer
+to these pieces of the record.  You don't have to use them--you can
+operate on the whole record if you wish--but fields are what make
+simple `awk' programs so powerful.
+
+To refer to a field in an `awk' program, you use a dollar sign, `$',
+followed by the number of the field you want.  Thus, `$1' refers to
+the first field, `$2' to the second, and so on.  For example, suppose
+the following is a line of input:
+
+     This seems like a pretty nice example.
+
+Here the first field, or `$1', is `This'; the second field, or `$2',
+is `seems'; and so on.  Note that the last field, `$7', is
+`example.'.  Because there is no space between the `e' and the `.',
+the period is considered part of the seventh field.
+
+No matter how many fields there are, the last field in a record can
+be represented by `$NF'.  So, in the example above, `$NF' would be
+the same as `$7', which is `example.'.  Why this works is explained
+below (*note Non-Constant Fields::.).  If you try to refer to a field
+beyond the last one, such as `$8' when the record has only 7 fields,
+you get the empty string.
+
+Plain `NF', with no `$', is a special variable whose value is the
+number of fields in the current record.
+
+`$0', which looks like an attempt to refer to the zeroth field, is a
+special case: it represents the whole input record.  This is what you
+would use when you aren't interested in fields.
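
As a minimal sketch of the points above, the following shell session
feeds the sample line to `awk' and captures `NF', `$NF', and a field
beyond the last one.  The shell variable names (`line', `count', `last',
`beyond') are illustrative only and do not come from the manual.

```shell
# Sample record from the text; it has seven whitespace-separated fields.
line='This seems like a pretty nice example.'

# NF is the number of fields in the current record.
count=$(printf '%s\n' "$line" | awk '{ print NF }')

# $NF is the last field; the trailing period belongs to it.
last=$(printf '%s\n' "$line" | awk '{ print $NF }')

# $8 is beyond the last field, so it is the empty string.
beyond=$(printf '%s\n' "$line" | awk '{ print "[" $8 "]" }')

printf '%s\n' "$count" "$last" "$beyond"
```

The brackets around `$8' are there only to make the empty string visible
in the output.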
+
+Here are some more examples:
+
+     awk '$1 ~ /foo/ { print $0 }' BBS-list
+
+This example contains the "matching" operator `~' (*note Comparison
+Ops::.).  Using this operator, all records in the file `BBS-list'
+whose first field contains the string `foo' are printed.
+
+By contrast, the following example:
+
+     awk '/foo/ { print $1, $NF }' BBS-list
+
+looks for the string `foo' in *the entire record* and prints the
+first field and the last field for each input record containing the
+pattern.
+
+The following program will search the system password file, and print
+the entries for users who have no password.
+
+     awk -F: '$2 == ""' /etc/passwd
+
+This program uses the `-F' option on the command line to set the field
+separator.  (Fields in `/etc/passwd' are separated by colons.  The
+second field represents a user's encrypted password, but if the field
+is empty, that user has no password.)
+
+
+
+File: gawk-info,  Node: Non-Constant Fields,  Next: Changing Fields,  Prev: Fields,  Up: Reading Files
+
+Non-constant Field Numbers
+==========================
+
+The number of a field does not need to be a constant.  Any expression
+in the `awk' language can be used after a `$' to refer to a field. 
+The `awk' utility evaluates the expression and uses the "numeric
+value" as a field number.  Consider this example:
+
+     awk '{ print $NR }'
+
+Recall that `NR' is the number of records read so far: 1 in the first
+record, 2 in the second, etc.  So this example will print the first
+field of the first record, the second field of the second record, and
+so on.  For the twentieth record, field number 20 will be printed;
+most likely this will make a blank line, because the record will not
+have 20 fields.
+
+Here is another example of using expressions as field numbers:
+
+     awk '{ print $(2*2) }' BBS-list
+
+The `awk' language must evaluate the expression `(2*2)' and use its
+value as the field number to print.  The `*' sign represents
+multiplication, so the expression `2*2' evaluates to 4.  This
+example, then, prints the hours of operation (the fourth field) for
+every line of the file `BBS-list'.
+
+When you use non-constant field numbers, you may ask for a field
+with a negative number.  This always results in an empty string, just
+like a field whose number is too large for the input record.  For
+example, `$(1-4)' would try to examine field number -3; it would
+result in an empty string.
+
+If the field number you compute is zero, you get the entire record.
+
+The number of fields in the current record is stored in the special
+variable `NF' (*note Special::.).  The expression `$NF' is not a
+special feature: it is the direct consequence of evaluating `NF' and
+using its value as a field number.
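
The sketch below illustrates computed field numbers on a made-up
three-field record (the input `10 20 30' and the shell variable names
are assumptions for illustration).  It also shows why the parentheses
matter: `$' binds more tightly than `-', so `$NF-1' subtracts one from
the value of the last field rather than selecting the field before it.

```shell
# $(NF-1) selects the next-to-last field.
second_last=$(printf '10 20 30\n' | awk '{ print $(NF-1) }')

# $NF-1 means ($NF) - 1: the last field's value minus one.
minus_one=$(printf '10 20 30\n' | awk '{ print $NF-1 }')

# A computed field number of zero yields the whole record.
whole=$(printf '10 20 30\n' | awk '{ print $(2-2) }')

printf '%s\n' "$second_last" "$minus_one" "$whole"
```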
+
+
+
+File: gawk-info,  Node: Changing Fields,  Next: Field Separators,  Prev: Non-Constant Fields,  Up: Reading Files
+
+Changing the Contents of a Field
+================================
+
+You can change the contents of a field as seen by `awk' within an
+`awk' program; this changes what `awk' perceives as the current input
+record.  (The actual input is untouched: `awk' never modifies the
+input file.)
+
+Look at this example:
+
+     awk '{ $3 = $2 - 10; print $2, $3 }' inventory-shipped
+
+The `-' sign represents subtraction, so this program reassigns field
+three, `$3', to be the value of field two minus ten, `$2 - 10'. 
+(*Note Arithmetic Ops::.)  Then field two, and the new value for
+field three, are printed.
+
+In order for this to work, the text in field `$2' must make sense as
+a number; the string of characters must be converted to a number in
+order for the computer to do arithmetic on it.  The number resulting
+from the subtraction is converted back to a string of characters
+which then becomes field 3.  *Note Conversion::.
+
+When you change the value of a field (as perceived by `awk'), the
+text of the input record is recalculated to contain the new field
+where the old one was.  `$0' will from that time on reflect the
+altered field.  Thus,
+
+     awk '{ $2 = $2 - 10; print $0 }' inventory-shipped
+
+will print a copy of the input file, with 10 subtracted from the
+second field of each line.
+
+You can also assign contents to fields that are out of range.  For
+example:
+
+     awk '{ $6 = ($5 + $4 + $3 + $2)/4 ; print $6 }' inventory-shipped
+
+We've just created `$6', whose value is the average of fields `$2',
+`$3', `$4', and `$5'.  The `+' sign represents addition, and the `/'
+sign represents division.  For the file `inventory-shipped', `$6'
+represents the average number of parcels shipped for a particular
+month.
+
+Creating a new field changes what `awk' interprets as the current
+input record.  The value of `$0' will be recomputed.  This
+recomputation affects and is affected by features not yet discussed,
+in particular, the "Output Field Separator", `OFS', which is used to
+separate the fields (*note Output Separators::.), and `NF' (the
+number of fields; *note Fields::.).  For example, the value of `NF'
+will be set to the number of the highest out-of-range field you
+create.
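
A small sketch of this recomputation, using a made-up three-field record
and illustrative shell variable names: assigning to `$6' raises `NF' to
6, and `$0' is rebuilt with `OFS' between every pair of fields,
including the empty fields four and five.  `OFS' is set to `-' here only
so that the empty fields are easy to see.

```shell
# Assigning to field 6 of a three-field record extends NF to 6.
nf=$(printf 'a b c\n' | awk '{ $6 = "f"; print NF }')

# $0 is rebuilt from all six fields, joined by OFS.
rec=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } { $6 = "f"; print $0 }')

printf '%s\n' "$nf" "$rec"
```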
+
+Note, however, that merely *referencing* an out-of-range field will
+*not* change the value of either `$0' or `NF'.  Referencing an
+out-of-range field merely produces a null string.  For example:
+
+     if ($(NF+1) != "")
+         print "can't happen"
+     else
+         print "everything is normal"
+
+should print `everything is normal'.  (*Note If::, for more
+information about `awk''s `if-else' statements.)
+
+
+
+File: gawk-info,  Node: Field Separators,  Next: Multiple,  Prev: Changing Fields,  Up: Reading Files
+
+Specifying How Fields Are Separated
+===================================
+
+You can change the way `awk' splits a record into fields by changing
+the value of the "field separator".  The field separator is
+represented by the special variable `FS' in an `awk' program, and can
+be set by `-F' on the command line.  The `awk' language scans each
+input line for the field separator character to determine the
+positions of fields within that line.  Shell programmers take note! 
+`awk' uses the variable `FS', not `IFS'.
+
+The default value of the field separator is a string containing a
+single space.  This value is actually a special case; as you know, by
+default, fields are separated by whitespace sequences, not by single
+spaces: two spaces in a row do not delimit an empty field. 
+``Whitespace'' is defined as sequences of one or more spaces or tab
+characters.
+
+You change the value of `FS' by "assigning" it a new value.  You can
+do this using the special `BEGIN' pattern (*note BEGIN/END::.).  This
+pattern allows you to change the value of `FS' before any input is
+read.  The new value of `FS' is enclosed in quotation marks.  For example,
+set the value of `FS' to the string `","':
+
+     awk 'BEGIN { FS = "," } ; { print $2 }'
+
+and use the input line:
+
+     John Q. Smith, 29 Oak St., Walamazoo, MI 42139
+
+This `awk' program will extract the string `29 Oak St.'.
+
+Sometimes your input data will contain separator characters that
+don't separate fields the way you thought they would.  For instance,
+the person's name in the example we've been using might have a title
+or suffix attached, such as `John Q. Smith, LXIX'.  If you assigned
+`FS' to be `,' then:
+
+     awk 'BEGIN { FS = "," } ; { print $2 }'
+
+would extract `LXIX', instead of `29 Oak St.'.  If you were expecting
+the program to print the address, you would be surprised.  So, choose
+your data layout and separator characters carefully to prevent
+problems like this from happening.
+
+You can assign `FS' to be a series of characters.  For example, the
+assignment:
+
+     FS = ", \t"
+
+makes every area of an input line that consists of a comma followed
+by a space and a tab, into a field separator.  (`\t' stands for a tab.)
+
+If `FS' is any single character other than a blank, then that
+character is used as the field separator, and two successive
+occurrences of that character do delimit an empty field.
+
+If you assign `FS' to a string longer than one character, that string
+is evaluated as a "regular expression" (*note Regexp::.).  The value
+of the regular expression is used as a field separator.
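
The three cases just described can be compared side by side.  The sketch
below uses made-up input lines and illustrative shell variable names;
the regular expression `", *"' (a comma followed by any number of
spaces) is an assumption chosen for the example, not a value taken from
the manual.

```shell
# Default FS: a run of spaces is one separator, so there are 2 fields.
default_nf=$(printf 'a  b\n' | awk '{ print NF }')

# Single-character FS: two commas in a row delimit an empty field.
comma_nf=$(printf 'a,,b\n' | awk -F, '{ print NF }')

# Multi-character FS is a regular expression: comma plus optional spaces.
regex_second=$(printf 'a, b,  c\n' | awk 'BEGIN { FS = ", *" } { print $2 }')

printf '%s\n' "$default_nf" "$comma_nf" "$regex_second"
```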
+
+`FS' can be set on the command line.  You use the `-F' argument to do
+so.  For example:
+
+     awk -F, 'PROGRAM' INPUT-FILES
+
+sets `FS' to be the `,' character.  Notice that the argument uses a
+capital `F'.  Contrast this with `-f', which specifies a file
+containing an `awk' program.  Case is significant in command options:
+the `-F' and `-f' options have nothing to do with each other.  You
+can use both options at the same time to set the `FS' variable *and*
+get an `awk' program from a file.
+
+As a special case, if the argument to `-F' is `t', then `FS' is set
+to the tab character.  (This is because if you type `-F\t', without
+the quotes, at the shell, the `\' gets deleted, so `awk' figures that
+you really want your fields to be separated with tabs, and not `t's. 
+Use `FS="t"' if you really do want to separate your fields with `t's.)
+
+For example, let's use an `awk' program file called `baud.awk' that
+contains the pattern `/300/', and the action `print $1'.  We'll use
+the operating system utility `cat' to ``look'' at our program:
+
+     % cat baud.awk
+     /300/   { print $1 }
+
+Let's also set `FS' to be the `-' character.  We will apply all this
+information to the file `BBS-list'.  This `awk' program will now
+print a list of the names of the bulletin boards that operate at 300
+baud and the first three digits of their phone numbers.
+
+     awk -F- -f baud.awk BBS-list
+
+produces this output:
+
+     aardvark     555
+     alpo
+     barfly       555
+     bites        555
+     camelot      555
+     core         555
+     fooey        555
+     foot         555
+     macfoo       555
+     sdace        555
+     sabafoo      555
+
+Note the second line of output.  If you check the original file, you
+will see that the second line looked like this:
+
+     alpo-net     555-3412     2400/1200/300     A
+
+The `-' as part of the system's name was used as the field separator,
+instead of the `-' in the phone number that was originally intended. 
+This demonstrates why you have to be careful in choosing your field
+and record separators.
+
+
+
+File: gawk-info,  Node: Multiple,  Next: Assignment Options,  Prev: Field Separators,  Up: Reading Files
+
+Multiple-Line Records
+=====================
+
+In some data bases, a single line cannot conveniently hold all the
+information in one entry.  Then you will want to use multi-line
+records.
+
+The first step in doing this is to choose your data format: when
+records are not defined as single lines, how will you want to define
+them?  What should separate records?
+
+One technique is to use an unusual character or string to separate
+records.  For example, you could use the formfeed character (written
+`\f' in `awk', as in C) to separate them, making each record a page
+of the file.  To do this, just set the variable `RS' to `"\f"' (a
+string containing the formfeed character), or whatever string you
+prefer to use.
+
+Another technique is to have blank lines separate records.  By a
+special dispensation, a null string as the value of `RS' indicates
+that records are separated by one or more blank lines.  If you set
+`RS' to the null string, a record will always end at the first blank
+line encountered.  And the next record won't start until the first
+nonblank line that follows--no matter how many blank lines appear in
+a row, they will be considered one record separator.
+
+The second step is to separate the fields in the record.  One way to
+do this is to put each field on a separate line: to do this, just set
+the variable `FS' to the string `"\n"'.  (This simple regular
+expression matches a single newline.)  Another idea is to divide each
+of the lines into fields in the normal manner; the regular expression
+`"[ \t\n]+"' will do this nicely by treating the newlines inside the
+record just like spaces.
+
+When `RS' is set to the null string, the newline character *always*
+acts as a field separator.  This is in addition to whatever value
+`FS' has.  The probable reason for this rule is so that you get
+rational behavior in the default case (i.e. `FS == " "').  This can
+be a problem if you really don't want the newline character to
+separate fields, since there is no way to do that.  However, you can
+work around this by using the `split' function to manually break up
+your data (*note String Functions::.).
+
+Here is how to use records separated by blank lines and break each
+line into fields normally:
+
+     awk 'BEGIN { RS = ""; FS = "[ \t\n]+" } ; { print $0 }' BBS-list
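
For the other layout mentioned above, one field per line, a minimal
sketch follows.  The two address records and the shell variable names
are invented for illustration.  Records are separated by blank lines
(two in a row here, to show that they count as one separator) and `FS'
is set to a newline.

```shell
# Two records, each with one field per line, separated by blank lines.
data='Jane Doe
123 Main St.


John Smith
456 Oak St.'

# With RS = "", runs of blank lines separate records.
count=$(printf '%s\n' "$data" | awk 'BEGIN { RS = "" } END { print NR }')

# With FS = "\n", each line of a record is one field.
first=$(printf '%s\n' "$data" |
    awk 'BEGIN { RS = ""; FS = "\n" } NR == 1 { print $1 " / " $2 }')

printf '%s\n' "$count" "$first"
```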
+
+
+
+File: gawk-info,  Node: Assignment Options,  Next: Getline,  Prev: Multiple,  Up: Reading Files
+
+Assigning Variables on the Command Line
+=======================================
+
+You can include variable "assignments" among the file names on the
+command line used to invoke `awk' (*note Command Line::.).  Such
+assignments have the form:
+
+     VARIABLE=TEXT
+
+and allow you to change variables either at the beginning of the
+`awk' run or in between input files.  The variable assignment is
+performed at a time determined by its position among the input file
+arguments: after the processing of the preceding input file argument.
+For example:
+
+     awk '{ print $n }' n=4 inventory-shipped n=2 BBS-list
+
+prints the value of field number `n' for all input records.  Before
+the first file is read, the command line sets the variable `n' equal
+to 4.  This causes the fourth field of the file `inventory-shipped'
+to be printed.  After the first file has finished, but before the
+second file is started, `n' is set to 2, so that the second field of
+the file `BBS-list' will be printed.
+
+Command line arguments are made available for explicit examination by
+the `awk' program in an array named `ARGV' (*note Special::.).
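
The effect of the position of each assignment can be tried with two
throwaway files.  In this sketch the file names `first' and `second',
their contents, and the shell variable names are all made up, and
`mktemp -d' is assumed to be available for creating a scratch
directory.

```shell
# Create two small input files in a scratch directory.
dir=$(mktemp -d)
printf 'a b c\n' > "$dir/first"
printf 'x y z\n' > "$dir/second"

# n=3 applies while reading the first file, n=1 while reading the second.
picked=$(awk '{ print $n }' n=3 "$dir/first" n=1 "$dir/second")

rm -r "$dir"
printf '%s\n' "$picked"
```

The third field of the first file is printed, then the first field of
the second file.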
+
+
+
+File: gawk-info,  Node: Getline,  Prev: Assignment Options,  Up: Reading Files
+
+Explicit Input with `getline'
+=============================
+
+So far we have been getting our input from `awk''s main input
+stream--either the standard input (usually your terminal) or the
+files specified on the command line.  The `awk' language has a
+special built-in function called `getline' that can be used to read
+input under your explicit control.
+
+This command is quite complex and should *not* be used by beginners. 
+The command (and its variations) is covered here because this is the
+section about input.  The examples that follow the explanation of the
+`getline' command include material that has not been covered yet. 
+Therefore, come back and attempt the `getline' command *after* you
+have reviewed the rest of this manual and have a good knowledge of
+how `awk' works.
+
+When retrieving input, `getline' returns a 1 if it found a record,
+and a 0 if the end of the file was encountered.  If there was some
+error in getting a record, such as a file that could not be opened,
+then `getline' returns a -1.
+
+In the following examples, COMMAND stands for a string value that
+represents a shell command.
+
+`getline'
+     The `getline' function can be used by itself, in an `awk'
+     program, to read input from the current input.  All it does in
+     this case is read the next input record and split it up into
+     fields.  This is useful if you've finished processing the
+     current record, but you want to do some special processing
+     *right now* on the next record.  Here's an example:
+
+          awk '{
+               if (t = index($0, "/*")) {
+                    if(t > 1)
+                         tmp = substr($0, 1, t - 1)
+                    else
+                         tmp = ""
+                    u = index(substr($0, t + 2), "*/")
+                    while (! u) {
+                         getline
+                         t = -1
+                         u = index($0, "*/")
+                    }
+                    if(u <= length($0) - 2)
+                         $0 = tmp substr($0, t + u + 3)
+                    else
+                         $0 = tmp
+               }
+               print $0
+          }'
+
+     This `awk' program deletes all comments, `/* ...  */', from the
+     input.  By replacing the `print $0' with other statements, you
+     could perform more complicated processing on the de-commented
+     input, such as searching it for matches of a regular expression.
+
+     This form of the `getline' command sets `NF' (the number of
+     fields; *note Fields::.), `NR' (the number of records read so
+     far), the `FNR' variable (*note Records::.), and the value of
+     `$0'.
+
+     *Note:* The new value of `$0' will be used in testing the
+     patterns of any subsequent rules. The original value of `$0'
+     that triggered the rule which executed `getline' is lost.  By
+     contrast, the `next' statement reads a new record but
+     immediately begins processing it normally, starting with the
+     first rule in the program.  *Note Next::.
+
+`getline VAR'
+     This form of `getline' reads a record into the variable VAR. 
+     This is useful when you want your program to read the next
+     record from the input file, but you don't want to subject the
+     record to the normal input processing.
+
+     For example, suppose the next line is a comment, or a special
+     string, and you want to read it, but you must make certain that
+     it won't accidentally trigger any rules.  This version of
+     `getline' will allow you to read that line and store it in a
+     variable so that the main read-a-line-and-check-each-rule
+     loop of `awk' never sees it.
+
+     The following example swaps every two lines of input.  For
+     example, given:
+
+          wan
+          tew
+          free
+          phore
+
+     it outputs:
+
+          tew
+          wan
+          phore
+          free
+
+     Here's the program:
+
+          awk '{
+               if ((getline tmp) > 0) {
+                    print tmp
+                    print $0
+               } else
+                    print $0
+          }'
+
+     The `getline' function used in this way sets only `NR' and `FNR'
+     (and of course, VAR).  The record is not split into fields, so
+     the values of the fields (including `$0') and the value of `NF'
+     do not change.
+
+`getline < FILE'
+     This form of the `getline' function takes its input from the
+     file FILE.  Here FILE is a string-valued expression that
+     specifies the file name.
+
+     This form is useful if you want to read your input from a
+     particular file, instead of from the main input stream.  For
+     example, the following program reads its input record from the
+     file `foo.input' when it encounters a first field with a value
+     equal to 10 in the current input file.
+
+          awk '{
+          if ($1 == 10) {
+               getline < "foo.input"
+               print
+          } else
+               print
+          }'
+
+     Since the main input stream is not used, the values of `NR' and
+     `FNR' are not changed.  But the record read is split into fields
+     in the normal manner, so the values of `$0' and other fields are
+     changed.  So is the value of `NF'.
+
+     This does not cause the record to be tested against all the
+     patterns in the `awk' program, in the way that would happen if
+     the record were read normally by the main processing loop of
+     `awk'.  However, the new record is tested against any subsequent
+     rules, just as when `getline' is used without a redirection.
+
+`getline VAR < FILE'
+     This form of the `getline' function takes its input from the
+     file FILE and puts it in the variable VAR.  As above, FILE is a
+     string-valued expression that specifies the file to read from.
+
+     In this version of `getline', none of the built-in variables
+     are changed, and the record is not split into fields.  The only
+     variable changed is VAR.
+
+     For example, the following program copies all the input files to
+     the output, except for records that say `@include FILENAME'. 
+     Such a record is replaced by the contents of the file FILENAME.
+
+          awk '{
+               if (NF == 2 && $1 == "@include") {
+                    while ((getline line < $2) > 0)
+                         print line
+                    close($2)
+               } else
+                    print
+          }'
+
+     Note here how the name of the extra input file is not built into
+     the program; it is taken from the data, from the second field on
+     the `@include' line.
+
+     The `close' command is used to ensure that if two identical
+     `@include' lines appear in the input, the entire specified file
+     is included twice.  *Note Close Input::.
+
+     One deficiency of this program is that it does not process
+     nested `@include' statements the way a true macro preprocessor
+     would.
+
+`COMMAND | getline'
+     You can "pipe" the output of a command into `getline'.  A pipe
+     is simply a way to link the output of one program to the input
+     of another.  In this case, the string COMMAND is run as a shell
+     command and its output is piped into `awk' to be used as input. 
+     This form of `getline' reads one record from the pipe.
+
+     For example, the following program copies input to output,
+     except for lines that begin with `@execute', which are replaced
+     by the output produced by running the rest of the line as a
+     shell command:
+
+          awk '{
+               if ($1 == "@execute") {
+                    tmp = substr($0, 10)
+                    while ((tmp | getline) > 0)
+                         print
+                    close(tmp)
+               } else
+                    print
+          }'
+
+     The `close' command is used to ensure that if two identical
+     `@execute' lines appear in the input, the command is run again
+     for each one.  *Note Close Input::.
+
+     Given the input:
+
+          foo
+          bar
+          baz
+          @execute who
+          bletch
+
+     the program might produce:
+
+          foo
+          bar
+          baz
+          hack     ttyv0   Jul 13 14:22
+          hack     ttyp0   Jul 13 14:23     (gnu:0)
+          hack     ttyp1   Jul 13 14:23     (gnu:0)
+          hack     ttyp2   Jul 13 14:23     (gnu:0)
+          hack     ttyp3   Jul 13 14:23     (gnu:0)
+          bletch
+
+     Notice that this program ran the command `who' and printed the
+     result.  (If you try this program yourself, you will get
+     different results, showing who is logged in on your system.)
+
+     This variation of `getline' splits the record into fields, sets
+     the value of `NF' and recomputes the value of `$0'.  The values
+     of `NR' and `FNR' are not changed.
+
+`COMMAND | getline VAR'
+     The output of the command COMMAND is sent through a pipe to
+     `getline' and into the variable VAR.  For example, the following
+     program reads the current date and time into the variable
+     `current_time', using the utility called `date', and then prints
+     it.
+
+          awk 'BEGIN {
+               "date" | getline current_time
+               close("date")
+               print "Report printed on " current_time
+          }'
+
+     In this version of `getline', none of the built-in variables
+     are changed, and the record is not split into fields.
+
+
+
+File: gawk-info,  Node: Close Input,  Up: Getline
+
+Closing Input Files
+-------------------
+
+If the same file name or the same shell command is used with
+`getline' more than once during the execution of the `awk' program,
+the file is opened (or the command is executed) only the first time. 
+At that time, the first record of input is read from that file or
+command.  The next time the same file or command is used in
+`getline', another record is read from it, and so on.
+
+What this implies is that if you want to start reading the same file
+again from the beginning, or if you want to rerun a shell command
+(rather than reading more output from the command), you must take
+special steps.  What you can do is use the `close' statement:
+
+     close (FILENAME)
+
+This statement closes a file or pipe, represented here by FILENAME. 
+The string value of FILENAME must be the same value as the string
+used to open the file or pipe to begin with.
+
+Once this statement is executed, the next `getline' from that file or
+command will reopen the file or rerun the command.
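
The difference `close' makes can be seen with a command that produces a
single line of output.  This is a sketch only: the command
`echo hello' and the variable names are chosen for illustration.
Without `close', the second `getline' finds the pipe already at end of
file and leaves its variable empty; with `close', the command is run
again.

```shell
# With close(): the command is rerun, so both reads succeed.
with_close=$(awk 'BEGIN {
    cmd = "echo hello"
    cmd | getline first
    close(cmd)
    cmd | getline second
    print "[" first "][" second "]"
}')

# Without close(): the second getline hits end of file and reads nothing.
without_close=$(awk 'BEGIN {
    cmd = "echo hello"
    cmd | getline first
    cmd | getline second
    print "[" first "][" second "]"
}')

printf '%s\n' "$with_close" "$without_close"
```

Note that the string passed to `close' is the same string that was used
on the left of the pipe, which is why the command is kept in a variable.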
+
+
+
+File: gawk-info,  Node: Printing,  Next: One-liners,  Prev: Reading Files,  Up: Top
+
+Printing Output
+***************
+
+One of the most common things that actions do is to output or "print"
+some or all of the input.  For simple output, use the `print'
+statement.  For fancier formatting use the `printf' statement.  Both
+are described in this chapter.
+
+* Menu:
+
+* Print::              The `print' statement.
+* Print Examples::     Simple examples of `print' statements.
+* Output Separators::  The output separators and how to change them.
+
+* Redirection::        How to redirect output to multiple files and pipes.
+* Close Output::       How to close output files and pipes.
+
+* Printf::             The `printf' statement.
+
+ 
+
+File: gawk-info,  Node: Print,  Next: Print Examples,  Up: Printing
+
+The `print' Statement
+=====================
+
+The `print' statement does output with simple, standardized
+formatting.  You specify only the strings or numbers to be printed,
+in a list separated by commas.  They are output, separated by single
+spaces, followed by a newline.  The statement looks like this:
+
+     print ITEM1, ITEM2, ...
+
+The entire list of items may optionally be enclosed in parentheses. 
+The parentheses are necessary if any of the item expressions uses a
+relational operator; otherwise it could be confused with a
+redirection (*note Redirection::.).  The relational operators are
+`==', `!=', `<', `>', `>=', `<=', `~' and `!~' (*note Comparison
+Ops::.).
+
+The items printed can be constant strings or numbers, fields of the
+current record (such as `$1'), variables, or any `awk' expressions. 
+The `print' statement is completely general for computing *what*
+values to print.  With one exception (*note Output Separators::.),
+what you can't do is specify *how* to print them--how many columns to
+use, whether to use exponential notation or not, and so on.  For
+that, you need the `printf' statement (*note Printf::.).
+
+To print a fixed piece of text, write a string constant as one item,
+such as `"Hello there"'.  If you forget to use the double-quote
+characters, your text will be taken as an `awk' expression, and you
+will probably get an error.  Keep in mind that a space will be
+printed between any two items.
+
+The simple statement `print' with no items is equivalent to `print
+$0': it prints the entire current record.  To print a blank line, use
+`print ""', where `""' is the null, or empty, string.
+
+Most often, each `print' statement makes one line of output.  But it
+isn't limited to one line.  If an item value is a string that
+contains a newline, the newline is output along with the rest of the
+string.  A single `print' can make any number of lines this way.
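
Two of the points above lend themselves to a quick check: a string
containing a newline produces more than one line of output, and a
relational operator inside parentheses is treated as a comparison
rather than a redirection.  The strings and shell variable names in
this sketch are illustrative only.

```shell
# One print statement, two lines of output.
lines=$(awk 'BEGIN { print "first\nsecond" }')

# The parentheses make `>' a comparison; 2 > 1 is true, so 1 is printed.
truth=$(awk 'BEGIN { print (2 > 1) }')

printf '%s\n' "$lines" "$truth"
```

Without the parentheses, `print 2 > 1' would instead write `2' to a
file named `1'.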
+
+
+
+File: gawk-info,  Node: Print Examples,  Next: Output Separators,  Prev: Print,  Up: Printing
+
+Examples of `print' Statements
+==============================
+
+Here is an example that prints the first two fields of each input
+record, with a space between them:
+
+     awk '{ print $1, $2 }' inventory-shipped
+
+Its output looks like this:
+
+     Jan 13
+     Feb 15
+     Mar 15
+     ...
+
+A common mistake in using the `print' statement is to omit the comma
+between two items.  This often has the effect of making the items run
+together in the output, with no space.  The reason for this is that
+juxtaposing two string expressions in `awk' means to concatenate
+them.  For example, without the comma:
+
+     awk '{ print $1 $2 }' inventory-shipped
+
+prints:
+
+     Jan13
+     Feb15
+     Mar15
+     ...
+
+Neither example's output makes much sense to someone unfamiliar with
+the file `inventory-shipped'.  A heading line at the beginning would
+make it clearer.  Let's add some headings to our table of months
+(`$1') and green crates shipped (`$2').  We do this using the BEGIN
+pattern (*note BEGIN/END::.) to cause the headings to be printed only
+once:
+
+     awk 'BEGIN {  print "Month Crates"
+                   print "---- -----" }
+                {  print $1, $2 }' inventory-shipped
+
+Did you already guess what will happen?  This program prints the
+following:
+
+     Month Crates
+     ---- -----
+     Jan 13
+     Feb 15
+     Mar 15
+     ...
+
+The headings and the table data don't line up!  We can fix this by
+printing some spaces between the two fields:
+
+     awk 'BEGIN { print "Month Crates"
+                  print "---- -----" }
+                { print $1, "     ", $2 }' inventory-shipped
+
+You can imagine that this way of lining up columns can get pretty
+complicated when you have many columns to fix.  Counting spaces for
+two or three columns can be simple, but more than this and you can
+get ``lost'' quite easily.  This is why the `printf' statement was
+created (*note Printf::.); one of its specialties is lining up
+columns of data.
+
+
+
+File: gawk-info,  Node: Output Separators,  Next: Redirection,  Prev: Print Examples,  Up: Printing
+
+Output Separators
+=================
+
+As mentioned previously, a `print' statement contains a list of
+items, separated by commas.  In the output, the items are normally
+separated by single spaces.  But they do not have to be spaces; a
+single space is only the default.  You can specify any string of
+characters to use as the "output field separator", by setting the
+special variable `OFS'.  The initial value of this variable is the
+string `" "'.
+
+The output from an entire `print' statement is called an "output
+record".  Each `print' statement outputs one output record and then
+outputs a string called the "output record separator".  The special
+variable `ORS' specifies this string.  The initial value of the
+variable is the string `"\n"' containing a newline character; thus,
+normally each `print' statement makes a separate line.
+
+You can change how output fields and records are separated by
+assigning new values to the variables `OFS' and/or `ORS'.  The usual
+place to do this is in the `BEGIN' rule (*note BEGIN/END::.), so that
+it happens before any input is processed.  You may also do this with
+assignments on the command line, before the names of your input files.
+
+The following example prints the first and second fields of each
+input record separated by a semicolon, with a blank line added after
+each line:
+
+     awk 'BEGIN { OFS = ";"; ORS = "\n\n" }
+                { print $1, $2 }'  BBS-list
+
+If the value of `ORS' does not contain a newline, all your output
+will be run together on a single line, unless you output newlines
+some other way.
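+
+As a rough illustration of that last point, here is a sketch in which
+`ORS' contains no newline.  The input lines are made up for the
+example:

```shell
# ORS is ";" with no newline, so both records land on one output line.
out=$(printf 'a b\nc d\n' | awk '
    BEGIN { OFS = "-"; ORS = ";" }
          { print $1, $2 }
')
printf '%s\n' "$out"
```

+The output is the single line `a-b;c-d;'.  The only newline in it is
+the one the shell's `printf' adds at the end.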
+
+
+
+File: gawk-info,  Node: Redirection,  Next: Printf,  Prev: Output Separators,  Up: Printing
+
+Redirecting Output of `print' and `printf'
+==========================================
+
+So far we have been dealing only with output that prints to the
+standard output, usually your terminal.  Both `print' and `printf'
+can be told to send their output to other places.  This is called
+"redirection".
+
+A redirection appears after the `print' or `printf' statement. 
+Redirections in `awk' are written just like redirections in shell
+commands, except that they are written inside the `awk' program.
+
+Here are the three forms of output redirection.  They are all shown
+for the `print' statement, but they work for `printf' also.
+
+`print ITEMS > OUTPUT-FILE'
+     This type of redirection prints the items onto the output file
+     OUTPUT-FILE.  The file name OUTPUT-FILE can be any expression. 
+     Its value is changed to a string and then used as a filename
+     (*note Expressions::.).
+
+     When this type of redirection is used, the OUTPUT-FILE is erased
+     before the first output is written to it.  Subsequent writes do
+     not erase OUTPUT-FILE, but append to it.  If OUTPUT-FILE does
+     not exist, then it is created.
+
+     For example, here is how one `awk' program can write a list of
+     BBS names to a file `name-list' and a list of phone numbers to a
+     file `phone-list'.  Each output file contains one name or number
+     per line.
+
+          awk '{ print $2 > "phone-list"
+                 print $1 > "name-list" }' BBS-list
+
+`print ITEMS >> OUTPUT-FILE'
+     This type of redirection prints the items onto the output file
+     OUTPUT-FILE.  The difference between this and the single-`>'
+     redirection is that the old contents (if any) of OUTPUT-FILE are
+     not erased.  Instead, the `awk' output is appended to the file.
+
+`print ITEMS | COMMAND'
+     It is also possible to send output through a "pipe" instead of
+     into a file.   This type of redirection opens a pipe to COMMAND
+     and writes the values of ITEMS through this pipe, to another
+     process created to execute COMMAND.
+
+     The redirection argument COMMAND is actually an `awk'
+     expression.  Its value is converted to a string, whose contents
+     give the shell command to be run.
+
+     For example, this produces two files, one unsorted list of BBS
+     names and one list sorted in reverse alphabetical order:
+
+          awk '{ print $1 > "names.unsorted"
+                 print $1 | "sort -r > names.sorted" }' BBS-list
+
+     Here the unsorted list is written with an ordinary redirection
+     while the sorted list is written by piping through the `sort'
+     utility.
+
+     Here is an example that uses redirection to mail a message to a
+     mailing list `bug-system'.  This might be useful when trouble is
+     encountered in an `awk' script run periodically for system
+     maintenance.
+
+          print "Awk script failed:", $0 | "mail bug-system"
+          print "processing record number", FNR, "of", FILENAME  | "mail bug-system"
+          close ("mail bug-system")
+
+     We use a `close' statement here because it's a good idea to
+     close the pipe as soon as all the intended output has been sent
+     to it.  *Note Close Output::, for more information on this.
+
+Redirecting output using `>', `>>', or `|' asks the system to open a
+file or pipe only if the particular FILE or COMMAND you've specified
+has not already been written to by your program.
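+
+The difference between `>' and `>>' can be seen in the following
+sketch.  The scratch directory, the file name `log', and the input
+lines are all assumptions made for the example:

```shell
# Work in a scratch directory so the sketch leaves no files behind.
dir=$(mktemp -d)
(
    cd "$dir" || exit 1
    # ">" erases log once, on the first write; the second record appends.
    printf 'one\ntwo\n' | awk '{ print $1 > "log" }'
    # A later run using ">>" keeps the old contents and adds to them.
    printf 'three\n' | awk '{ print $1 >> "log" }'
)
out=$(cat "$dir/log")
rm -rf "$dir"
printf '%s\n' "$out"
```

+Had the second program used `>' instead, `log' would have been erased
+again and would contain only `three'.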
+
+
+
+File: gawk-info,  Node: Close Output,  Up: Redirection
+
+Closing Output Files and Pipes
+------------------------------
+
+When a file or pipe is opened, the filename or command associated
+with it is remembered by `awk' and subsequent writes to the same file
+or command are appended to the previous writes.  The file or pipe
+stays open until `awk' exits.  This is usually convenient.
+
+Sometimes there is a reason to close an output file or pipe earlier
+than that.  To do this, use the `close' command, as follows:
+
+     close (FILENAME)
+
+or
+
+     close (COMMAND)
+
+The argument FILENAME or COMMAND can be any expression.  Its value
+must exactly equal the string used to open the file or pipe to begin
+with--for example, if you open a pipe with this:
+
+     print $1 | "sort -r > names.sorted"
+
+then you must close it with this:
+
+     close ("sort -r > names.sorted")
+
+Here are some reasons why you might need to close an output file:
+
+   * To write a file and read it back later on in the same `awk'
+     program.  Close the file when you are finished writing it; then
+     you can start reading it with `getline' (*note Getline::.).
+
+   * To write numerous files, successively, in the same `awk'
+     program.  If you don't close the files, eventually you will
+     exceed the system limit on the number of open files in one
+     process.  So close each one when you are finished writing it.
+
+   * To make a command finish.  When you redirect output through a
+     pipe, the command reading the pipe normally continues to try to
+     read input as long as the pipe is open.  Often this means the
+     command cannot really do its work until the pipe is closed.  For
+     example, if you redirect output to the `mail' program, the
+     message will not actually be sent until the pipe is closed.
+
+   * To run the same subprogram a second time, with the same arguments.
+     This is not the same thing as giving more input to the first run!
+
+     For example, suppose you pipe output to the `mail' program.  If
+     you output several lines redirected to this pipe without closing
+     it, they make a single message of several lines.  By contrast,
+     if you close the pipe after each line of output, then each line
+     makes a separate message.
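+
+The last case can be sketched with `sort' standing in for `mail', so
+that the effect is visible on the terminal.  The input lines and the
+variable name `group' are made up for the example:

```shell
# Each group of records (keyed on $1) goes through its own run of sort.
out=$(printf 'A pear\nA apple\nB fig\nB date\n' | awk '
    $1 != group { if (group != "") close("sort")   # end the previous run
                  group = $1 }
                { print $2 | "sort" }
')
printf '%s\n' "$out"
```

+Because the pipe is closed when the group changes, the two groups are
+sorted separately: `apple', `pear', then `date', `fig'.  Without the
+`close', all four lines would go to one `sort' and come out as a
+single sorted list.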
+
+
+
+File: gawk-info,  Node: Printf,  Prev: Redirection,  Up: Printing
+
+Using `printf' Statements For Fancier Printing
+==============================================
+
+If you want more precise control over the output format than `print'
+gives you, use `printf'.  With `printf' you can specify the width to
+use for each item, and you can specify various stylistic choices for
+numbers (such as what radix to use, whether to print an exponent,
+whether to print a sign, and how many digits to print after the
+decimal point).  You do this by specifying a "format string".
+
+* Menu:
+
+* Basic Printf::       Syntax of the `printf' statement.
+* Format-Control::     Format-control letters.
+* Modifiers::          Format-specification modifiers.
+* Printf Examples::    Several examples.
+
+ 
+
+File: gawk-info,  Node: Basic Printf,  Next: Format-Control,  Up: Printf
+
+Introduction to the `printf' Statement
+--------------------------------------
+
+The `printf' statement looks like this:
+
+     printf FORMAT, ITEM1, ITEM2, ...
+
+The entire list of items may optionally be enclosed in parentheses.
+The parentheses are necessary if any of the item expressions uses a
+relational operator; otherwise it could be confused with a
+redirection (*note Redirection::.).  The relational operators are
+`==', `!=', `<', `>', `>=', `<=', `~' and `!~' (*note Comparison
+Ops::.).
+
+The difference between `printf' and `print' is the argument FORMAT. 
+This is an expression whose value is taken as a string; its job is to
+say how to output each of the other arguments.  It is called the
+"format string".
+
+The format string is essentially the same as in the C library
+function `printf'.  Most of FORMAT is text to be output verbatim. 
+Scattered among this text are "format specifiers", one per item. 
+Each format specifier says to output the next item at that place in
+the format.
+
+The `printf' statement does not automatically append a newline to its
+output.  It outputs nothing but what the format specifies.  So if you
+want a newline, you must include one in the format.  The output
+separator variables `OFS' and `ORS' have no effect on `printf'
+statements.
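+
+Here is a small sketch of these rules, using a made-up one-line input
+and deliberately odd values for `OFS' and `ORS':

```shell
out=$(printf 'x y\n' | awk '
    BEGIN { OFS = "-"; ORS = "!\n" }
    {
        printf "%s %s", $1, $2    # no newline yet; OFS and ORS are ignored
        printf "|\n"              # the newline comes from the format string
        print $1, $2              # print, by contrast, uses OFS and ORS
    }
')
printf '%s\n' "$out"
```

+The two `printf' statements together produce the line `x y|', while
+the `print' statement produces `x-y!'.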
+
+
+
+File: gawk-info,  Node: Format-Control,  Next: Modifiers,  Prev: Basic Printf,  Up: Printf
+
+Format-Control Characters
+-------------------------
+
+A format specifier starts with the character `%' and ends with a
+"format-control letter"; it tells the `printf' statement how to
+output one item.  (If you actually want to output a `%', write `%%'.)
+The format-control letter specifies what kind of value to print.
+The rest of the format specifier is made up of optional "modifiers"
+which are parameters such as the field width to use.
+
+Here is a list of them:
+
+`c'
+     This prints a number as an ASCII character.  Thus, `printf "%c",
+     65' outputs the letter `A'.  The output for a string value is
+     the first character of the string.
+
+`d'
+     This prints a decimal integer.
+
+`e'
+     This prints a number in scientific (exponential) notation.  For
+     example,
+
+          printf "%4.3e", 1950
+
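+     prints `1.950e+03'.  The `.3' asks for three digits after the
+     decimal point, giving four significant figures in all.  The `4'
+     is a minimum field width, which has no effect here because the
+     output is already wider than that.  The `4.3' are "modifiers",
+     discussed below.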
+
+`f'
+     This prints a number in floating point notation.
+
+`g'
+     This prints either scientific notation or floating point
+     notation, whichever is shorter.
+
+`o'
+     This prints an unsigned octal integer.
+
+`s'
+     This prints a string.
+
+`x'
+     This prints an unsigned hexadecimal integer.
+
+`%'
+     This isn't really a format-control letter, but it does have a
+     meaning when used after a `%': the sequence `%%' outputs one
+     `%'.  It does not consume an argument.
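+
+The following sketch exercises each letter once.  The values are
+arbitrary, and the `%c' result assumes the ASCII character set:

```shell
out=$(awk 'BEGIN {
    # character, decimal, octal, hexadecimal, string, literal percent
    printf "%c|%d|%o|%x|%s|%%\n", 65, 42.9, 8, 255, "text"
    # the same number in exponential, floating point, and "g" notation
    printf "%e|%f|%g\n", 1950, 1950, 1950
}')
printf '%s\n' "$out"
```

+The first line of output is `A|42|10|ff|text|%' and the second is
+`1.950000e+03|1950.000000|1950'.  Note that `%d' drops the fractional
+part of 42.9.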
+
+
+
+File: gawk-info,  Node: Modifiers,  Next: Printf Examples,  Prev: Format-Control,  Up: Printf
+
+Modifiers for `printf' Formats
+------------------------------
+
+A format specification can also include "modifiers" that can control
+how much of the item's value is printed and how much space it gets. 
+The modifiers come between the `%' and the format-control letter.
+Here are the possible modifiers, in the order in which they may appear:
+
+`-'
+     The minus sign, used before the width modifier, says to
+     left-justify the argument within its specified width.  Normally
+     the argument is printed right-justified in the specified width.
+
+`WIDTH'
+     This is a number representing the desired width of a field. 
+     Inserting any number between the `%' sign and the format control
+     character forces the field to be expanded to this width.  The
+     default way to do this is to pad with spaces on the left.
+
+`.PREC'
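+     This is a number that specifies the precision to use when
+     printing.  For the `e' and `f' formats, this specifies the number
+     of digits you want printed to the right of the decimal point.
+     For the `s' format, it specifies the maximum number of
+     characters from the string that are printed.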
+
+The C library `printf''s dynamic WIDTH and PREC capability (for
+example, `"%*.*s"') is not supported.  However, it can be easily
+simulated using concatenation to dynamically build the format string.
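+
+A minimal sketch of that simulation follows.  The variable names
+`width' and `format' and the values printed are chosen only for
+illustration:

```shell
out=$(awk 'BEGIN {
    width = 8                                   # chosen at run time
    format = "%-" width "s|%" width ".2f|\n"    # becomes "%-8s|%8.2f|\n"
    printf format, "abc", 3.14159
}')
printf '%s\n' "$out"
```

+This prints `abc     |    3.14|': the string is left-justified in 8
+columns and the number is right-justified in 8 columns with two
+digits after the decimal point.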
+
+
+
+File: gawk-info,  Node: Printf Examples,  Prev: Modifiers,  Up: Printf
+
+Examples of Using `printf'
+--------------------------
+
+Here is how to use `printf' to make an aligned table:
+
+     awk '{ printf "%-10s %s\n", $1, $2 }' BBS-list
+
+prints the names of bulletin boards (`$1') of the file `BBS-list' as
+a string of 10 characters, left justified.  It also prints the phone
+numbers (`$2') afterward on the line.  This will produce an aligned
+two-column table of names and phone numbers, like so:
+
+     aardvark   555-5553
+     alpo-net   555-3412
+     barfly     555-7685
+     bites      555-1675
+     camelot    555-0542
+     core       555-2912
+     fooey      555-1234
+     foot       555-6699
+     macfoo     555-6480
+     sdace      555-3430
+     sabafoo    555-2127
+
+Did you notice that we did not specify that the phone numbers be
+printed as numbers?  They had to be printed as strings because the
+numbers are separated by a dash.  If we had tried to print the phone
+numbers as numbers, all we would have gotten would have been the
+first three digits, `555'.  This would have been pretty confusing.
+
+We did not specify a width for the phone numbers because they are the
+last things on their lines.  We don't need to put spaces after them.
+
+We could make our table look even nicer by adding headings to the
+tops of the columns.  To do this, use the BEGIN pattern (*note
+BEGIN/END::.) to cause the header to be printed only once, at the
+beginning of the `awk' program:
+
+     awk 'BEGIN { print "Name       Number"
+                  print "---        -----" }
+           { printf "%-10s %s\n", $1, $2 }' BBS-list
+
+Did you notice that we mixed `print' and `printf' statements in the
+above example?  We could have used just `printf' statements to get
+the same results:
+
+     awk 'BEGIN { printf "%-10s %s\n", "Name", "Number"
+                  printf "%-10s %s\n", "---", "-----" }
+          { printf "%-10s %s\n", $1, $2 }' BBS-list
+
+By outputting each column heading with the same format specification
+used for the elements of the column, we have made sure that the
+headings will be aligned just like the columns.
+
+The fact that the same format specification is used can be emphasized
+by storing it in a variable, like so:
+
+     awk 'BEGIN { format = "%-10s %s\n"
+                  printf format, "Name", "Number"
+                  printf format, "---", "-----" }
+          { printf format, $1, $2 }' BBS-list
+
+See if you can use the `printf' statement to line up the headings and
+table data for our `inventory-shipped' example covered earlier in the
+section on the `print' statement (*note Print::.).
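+
+One possible answer is sketched below.  The column widths are a
+guess, and the three input lines stand in for the first two fields of
+`inventory-shipped':

```shell
out=$(printf 'Jan 13\nFeb 15\nMar 15\n' | awk '
    BEGIN { format = "%-5s %6s\n"
            printf format, "Month", "Crates"
            printf format, "-----", "------" }
          { printf format, $1, $2 }
')
printf '%s\n' "$out"
```

+Here the month names are left-justified under `Month' and the crate
+counts are right-justified under `Crates', which is the usual layout
+for a column of numbers.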
+
+
+
+File: gawk-info,  Node: One-liners,  Next: Patterns,  Prev: Printing,  Up: Top
+
+Useful ``One-liners''
+*********************
+
+Useful `awk' programs are often short, just a line or two.  Here is a
+collection of useful, short programs to get you started.  Some of
+these programs contain constructs that haven't been covered yet.  The
+description of the program will give you a good idea of what is going
+on, but please read the rest of the manual to become an `awk' expert!
+
+`awk '{ num_fields = num_fields + NF }
+       END { print num_fields }''
+     This program prints the total number of fields in all input lines.
+
+`awk 'length($0) > 80''
+     This program prints every line longer than 80 characters.  The
+     sole rule has a relational expression as its pattern, and has no
+     action (so the default action, printing the record, is used).
+
+`awk 'NF > 0''
+     This program prints every line that has at least one field. 
+     This is an easy way to delete blank lines from a file (or
+     rather, to create a new file similar to the old file but from
+     which the blank lines have been deleted).
+
+`awk '{ if (NF > 0) print }''
+     This program also prints every line that has at least one field.
+     Here we allow the rule to match every line, then decide in the
+     action whether to print.
+
+`awk 'BEGIN { for (i = 1; i <= 7; i++)
+                print int(101 * rand()) }''
+     This program prints 7 random numbers from 0 to 100, inclusive.
+
+`ls -l FILES | awk '{ x += $4 } ; END { print "total bytes: " x }''
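+     This program prints the total number of bytes used by FILES.  It
+     assumes that the file size is the fourth field of the `ls -l'
+     output.  On systems where `ls -l' also lists the group, the size
+     is the fifth field, so use `$5' instead.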
+
+`expand FILE | awk '{ if (x < length()) x = length() }
+                    END { print "maximum line length is " x }''
+     This program prints the maximum line length of FILE.  The input
+     is piped through the `expand' program to change tabs into
+     spaces, so the widths compared are actually the right-margin
+     columns.
+
+