-rw-r--r--  ChangeLog          4
-rw-r--r--  FUTURES           20
-rw-r--r--  TODO.xgawk        11
-rw-r--r--  doc/ChangeLog      5
-rw-r--r--  doc/awkcard.in    46
-rw-r--r--  doc/gawk.1        29
-rw-r--r--  doc/gawk.info   2204
-rw-r--r--  doc/gawk.texi   1449
8 files changed, 1504 insertions, 2264 deletions
diff --git a/ChangeLog b/ChangeLog
index ca35fafa..86011883 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2012-08-10 Arnold D. Robbins <arnold@skeeve.com>
+
+ * FUTURES, TODO.xgawk: Updates.
+
2012-08-08 Arnold D. Robbins <arnold@skeeve.com>
* configure.ac: Add -DNDEBUG to remove asserts if not developing.
diff --git a/FUTURES b/FUTURES
index 62225b12..8e927aaa 100644
--- a/FUTURES
+++ b/FUTURES
@@ -13,17 +13,17 @@ For 4.1
=======
DONE: Merge gawk/pgawk/dgawk into one executable
- Consider removing use of and/or need for the protos.h file.
-
- Consider moving var_value info into Node_var itself
- to reduce memory usage.
-
DONE: Merge xmlgawk -l feature
- Merge xmlgawk XML extensions
+ DONE: Merge xmlgawk XML extensions (via source forge project that
+ works with new API)
DONE: Integrate MPFR to provide high precision arithmetic.
+ DONE: Implement designed API for loadable modules
+
+ DONE: Redo the loadable modules interface from the awk level.
+
Continue code reviews / code cleanup
Consider making gawk output +nan for NaN values so that it
@@ -31,16 +31,16 @@ For 4.1
For 4.2
=======
- Implement designed API for loadable modules
- Redo the loadable modules interface from the awk level.
+ Consider removing use of and/or need for the protos.h file.
+
+ Consider moving var_value info into Node_var itself
+ to reduce memory usage.
Rework management of array index storage. (Partially DONE.)
DBM storage of awk arrays. Try to allow multiple dbm packages.
- ? Move the loadable modules interface to libtool.
-
? Add an optional base to strtonum, allowing 2-36.
? Optional third argument for index indicating where to start the
diff --git a/TODO.xgawk b/TODO.xgawk
index d11fad6d..e0913514 100644
--- a/TODO.xgawk
+++ b/TODO.xgawk
@@ -3,11 +3,6 @@ To-do list for xgawk enhancements:
- Attempting to load the same file with -f and -i (or @include) should
be a fatal error.
-- Review open hook implementation.
- * Mostly done.
- * Still to go: Rework iop_alloc, interaction with open hooks, and
- skipping command line directories.
-
Low priority:
- Enhance extension/fork.c waitpid to allow the caller to specify the options.
@@ -140,3 +135,9 @@ Done:
- MPFR. This is probably not useful now that MPFR support has been
integrated into gawk. Are there any users who need this extension?
+
+- Review open hook implementation.
+ * Mostly done.
+ * Still to go: Rework iop_alloc, interaction with open hooks, and
+ skipping command line directories.
+
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 65907bc1..32ef1a1c 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,8 @@
+2012-08-10 Arnold D. Robbins <arnold@skeeve.com>
+
+ * awkcard.in, gawk.1, gawk.texi: Updated. Mostly for new API stuff
+ but also some other things.
+
2012-08-01 Arnold D. Robbins <arnold@skeeve.com>
* Makefile.am (install-data-hook): Install a dgawk.1 link to the
diff --git a/doc/awkcard.in b/doc/awkcard.in
index d0c1578a..9615b58e 100644
--- a/doc/awkcard.in
+++ b/doc/awkcard.in
@@ -271,6 +271,8 @@ for localization.
.TI "\*(FC\-h\*(FR, \*(FC\-\^\-help\*(FR
Print a short summary of the available
options on \*(FCstdout\*(FR, then exit zero.
+.TI "\*(FC\-i \*(FIfile\*(FR, \*(FC\-\^\-include \*(FIfile\*(FR
+Include library AWK code in \*(FIfile\*(FR.
.TI "\*(FC\-l \*(FIlib\*(FR, \*(FC\-\^\-load \*(FIlib\*(FR
Load dynamic extension \*(FIlib\fP.
.TI "\*(FC\-L \*(FR[\*(FC\*(FIvalue\*(FR], \*(FC\-\^\-lint\*(FR[\*(FC=\*(FIvalue\*(FR]
@@ -300,13 +302,7 @@ Send profiling data to \*(FIfile\*(FR
The profile contains execution counts in the left margin
of each statement in the program.
.TI "\*(FC\-P\*(FR, \*(FC\-\^\-posix\*(FR
-Disable common and GNU extensions.
-.TI "\*(FC\-r\*(FR, \*(FC\-\^\-re\-interval\*(FR
-Enable \*(FIinterval expressions\*(FR.\*(CB
-... in regular
-... expression matching (see \fHRegular
-... Expressions\fP below). Useful if
-... \*(FC\-\^\-traditional\*(FR is specified
+Disable common and GNU extensions.\*(CB
.in -4n
.EB "\s+2\f(HBCOMMAND LINE ARGUMENTS (\*(GK\f(HB)\*(FR\s0"
@@ -318,6 +314,12 @@ Enable \*(FIinterval expressions\*(FR.\*(CB
.ES
.fi
.in +4n
+.TI "\*(FC\-r\*(FR, \*(FC\-\^\-re\-interval\*(FR
+Enable \*(FIinterval expressions\*(FR.
+... in regular
+... expression matching (see \fHRegular
+... Expressions\fP below). Useful if
+... \*(FC\-\^\-traditional\*(FR is specified
.TI "\*(FC\-S\*(FR, \*(FC\-\^\-sandbox\*(FR
Disable the \*(FCsystem()\*(FR function,
input redirection with \*(FCgetline\*(FR,
@@ -342,7 +344,7 @@ options are passed on to the AWK program in
\*(FCARGV\*(FR
for processing.\*(CB
.EB "\s+2\f(HBCOMMAND LINE ARGUMENTS (\*(GK\f(HB)\*(FR\s0"
-
+.sp .4
.\"
.\"
.\" --- Command Line Arguments (mawk)
@@ -454,7 +456,7 @@ The program text is read as if all the \*(FIprog-file\*(FR(s)
\*(CBand command line
source texts\*(CD had been concatenated.
.sp
-\*(GK includes files named on \*(FC@include\*(FR lines.
+\*(CB\*(GK includes files named on \*(FC@include\*(FR lines.
Nested includes are allowed.\*(CD
.sp .5
AWK programs execute in the following order.
@@ -1141,7 +1143,10 @@ The default path is
If a file name given to the \*(FC\-f\fP option contains a ``/'' character,
no path search is performed.
.sp .5
-.PP
+The variable \*(FCAWKLIBPATH\fP
+specifies the search path for dynamic extensions to use
+with \*(FC@load\fP and the \*(FC\-l\fP option.
+.sp .5
For socket communication,
\*(FCGAWK_SOCK_RETRIES\fP
controls the number of retries, and
@@ -1151,6 +1156,10 @@ The interval is in milliseconds. On systems that do not support
\*(FIusleep\fP(3),
the value is rounded up to an integral number of seconds.
.sp .5
+The value of \*(FCGAWK_READ_TIMEOUT\fP specifies the time, in milliseconds,
+for \*(GK to
+wait for input before returning with an error.
+.sp .5
If \*(FCPOSIXLY_CORRECT\fP exists
.\" in the environment,
then \*(GK
@@ -1845,16 +1854,15 @@ Return the bitwise XOR of the arguments.\*(CB
.fi
.in +.2i
.ti -.2i
-\*(CD\*(FCextension(\*(FIlib\*(FC, \*(FIfunc\*(FC)\*(FR
+\*(CD\*(FC@load "\*(FIextension\*(FC"\*(FR
.br
-Dynamically load the shared library
-\*(FIlib\*(FR
-and call
-\*(FIfunc\*(FR
-in it to initialize the library.
+Dynamically load the named \*(FIextension\*(FR.
This adds new built-in functions to \*(GK.
-It returns the value returned by
-\*(FIfunc\*(FR.\*(CB
+.\" The extension should use the API defined by the
+.\" \*(FCgawkapi.h\*(FR header file, as documented in
+.\" the full manual.
+The extension is loaded during the parsing of the program.
+See the manual for details.\*(CB
.in -.2i
.EB "\s+2\f(HBDYNAMIC EXTENSIONS (\*(GK\f(HB)\*(FR\s0"
.BT
@@ -1955,7 +1963,7 @@ maintains it.\*(CX
.ES
.fi
\*(CDCopyright \(co 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
-2007, 2009, 2010, 2011 Free Software Foundation, Inc.
+2007, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
.sp .5
Permission is granted to make and distribute verbatim copies of this
reference card provided the copyright notice and this permission notice
diff --git a/doc/gawk.1 b/doc/gawk.1
index c0a0a413..494ab16d 100644
--- a/doc/gawk.1
+++ b/doc/gawk.1
@@ -14,7 +14,7 @@
. if \w'\(rq' .ds rq "\(rq
. \}
.\}
-.TH GAWK 1 "Nov 10 2011" "Free Software Foundation" "Utility Commands"
+.TH GAWK 1 "Aug 09 2012" "Free Software Foundation" "Utility Commands"
.SH NAME
gawk \- pattern scanning and processing language
.SH SYNOPSIS
@@ -3181,24 +3181,11 @@ may be used in place of
.SH DYNAMICALLY LOADING NEW FUNCTIONS
You can dynamically add new built-in functions to the running
.I gawk
-interpreter.
+interpreter with the
+.B @load
+statement.
The full details are beyond the scope of this manual page;
-see \*(EP for the details.
-.PP
-.TP 8
-\fBextension(\fIobject\fB, \fIfunction\fB)\fR
-Dynamically link the shared object file named by
-.IR object ,
-and invoke
-.I function
-in that object, to perform initialization.
-These should both be provided as strings.
-Return the value returned by
-.IR function .
-.PP
-Using this feature at the C level is not pretty, but
-it is unlikely to go away. Additional mechanisms may
-be added at some point.
+see \*(EP.
.SH SIGNALS
The
.I gawk
@@ -3727,7 +3714,7 @@ status is 2. On non-POSIX systems, this value may be mapped to
.SH VERSION INFORMATION
This man page documents
.IR gawk ,
-version 4.0.
+version 4.1.
.SH AUTHORS
The original version of \*(UX
.I awk
@@ -3805,6 +3792,7 @@ While the
developers occasionally read this newsgroup, posting bug reports there
is an unreliable way to report bugs. Instead, please use the electronic mail
addresses given above.
+Really.
.PP
If you're using a GNU/Linux or BSD-based system,
you may wish to submit a bug report to the vendor of your distribution.
@@ -3824,6 +3812,7 @@ are surprisingly difficult to diagnose in the completely general case,
and the effort to do so really is not worth it.
.SH SEE ALSO
.IR egrep (1),
+.IR sed (1),
.IR getpid (2),
.IR getppid (2),
.IR getpgrp (2),
@@ -3839,7 +3828,7 @@ Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger,
Addison-Wesley, 1988. ISBN 0-201-07981-X.
.PP
\*(EP,
-Edition 4.0, shipped with the
+Edition 4.1, shipped with the
.I gawk
source.
The current version of this document is available online at
diff --git a/doc/gawk.info b/doc/gawk.info
index bcc773d6..65bf903c 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -97,12 +97,14 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Sample Programs:: Many `awk' programs with complete
explanations.
* Debugger:: The `gawk' debugger.
+* Dynamic Extensions:: Adding new built-in functions to
+ `gawk'.
* Language History:: The evolution of the `awk'
language.
* Installation:: Installing `gawk' under various
operating systems.
-* Notes:: Notes about `gawk' extensions and
- possible future work.
+* Notes:: Notes about adding things to `gawk'
+ and possible future work.
* Basic Concepts:: A very quick introduction to programming
concepts.
* Glossary:: An explanation of some unfamiliar terms.
@@ -359,21 +361,22 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* I18N Portability:: `awk'-level portability issues.
* I18N Example:: A simple i18n example.
* Gawk I18N:: `gawk' is also internationalized.
-* Floating-point Programming:: Effective floating-point programming.
-* Floating-point Representation:: Binary floating-point representation.
-* Floating-point Context:: Floating-point context.
-* Rounding Mode:: Floating-point rounding mode.
-* Arbitrary Precision Floats:: Arbitrary precision floating-point
- arithmetic with `gawk'.
-* Setting Precision:: Setting the working precision.
-* Setting Rounding Mode:: Setting the rounding mode.
-* Floating-point Constants:: Representing floating-point constants.
-* Changing Precision:: Changing the precision of a number.
-* Exact Arithmetic:: Exact arithmetic with floating-point numbers.
-* Integer Programming:: Effective integer programming.
-* Arbitrary Precision Integers:: Arbitrary precision integer
- arithmetic with `gawk'.
-* MPFR and GMP Libraries:: Information about the MPFR and GMP libraries.
+* Floating-point Programming:: Effective Floating-point Programming.
+* Floating-point Representation:: Binary Floating-point Representation.
+* Floating-point Context:: Floating-point Context.
+* Rounding Mode:: Floating-point Rounding Mode.
+* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
+ Arithmetic with `gawk'.
+* Setting Precision:: Setting the Working Precision.
+* Setting Rounding Mode:: Setting the Rounding Mode.
+* Floating-point Constants:: Representing Floating-point Constants.
+* Changing Precision:: Changing the Precision of a Number.
+* Exact Arithmetic:: Exact Arithmetic with Floating-point
+ Numbers.
+* Integer Programming:: Effective Integer Programming.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
+ `gawk'.
+* MPFR and GMP Libraries ::
* Nondecimal Data:: Allowing nondecimal input data.
* Array Sorting:: Facilities for controlling array traversal
and sorting arrays.
@@ -438,14 +441,14 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Anagram Program:: Finding anagrams from a dictionary.
* Signature Program:: People do amazing things with too much time
on their hands.
-* Debugging:: Introduction to `gawk' Debugger.
+* Debugging:: Introduction to `gawk' debugger.
* Debugging Concepts:: Debugging in General.
* Debugging Terms:: Additional Debugging Concepts.
* Awk Debugging:: Awk Debugging.
-* Sample Debugging Session:: Sample Debugging Session.
+* Sample Debugging Session:: Sample debugging session.
* Debugger Invocation:: How to Start the Debugger.
* Finding The Bug:: Finding the Bug.
-* List of Debugger Commands:: Main Commands.
+* List of Debugger Commands:: Main debugger commands.
* Breakpoint Control:: Control of Breakpoints.
* Debugger Execution Control:: Control of Execution.
* Viewing And Changing Data:: Viewing and Changing Data.
@@ -453,8 +456,13 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Debugger Info:: Obtaining Information about the Program and
the Debugger State.
* Miscellaneous Debugger Commands:: Miscellaneous Commands.
-* Readline Support:: Readline Support.
-* Limitations:: Limitations and Future Plans.
+* Readline Support:: Readline support.
+* Limitations:: Limitations and future plans.
+* Plugin License:: A note about licensing.
+* Sample Library:: A example of new functions.
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
* V7/SVR3.1:: The major changes between V7 and System V
Release 3.1.
* SVR4:: Minor changes between System V Releases 3.1
@@ -505,16 +513,6 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
`gawk'.
* New Ports:: Porting `gawk' to a new operating
system.
-* Dynamic Extensions:: Adding new built-in functions to
- `gawk'.
-* Internals:: A brief look at some `gawk'
- internals.
-* Plugin License:: A note about licensing.
-* Loading Extensions:: How to load dynamic extensions.
-* Sample Library:: A example of new functions.
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
* Future Extensions:: New features that may be implemented one
day.
* Basic High Level:: The high level view.
@@ -883,8 +881,8 @@ non-POSIX systems. It also describes how to report bugs in `gawk' and
where to get other freely available `awk' implementations.
*note Notes::, describes how to disable `gawk''s extensions, as well
-as how to contribute new code to `gawk', how to write extension
-libraries, and some possible future directions for `gawk' development.
+as how to contribute new code to `gawk', and some possible future
+directions for `gawk' development.
*note Basic Concepts::, provides some very cursory background
material for those who are completely unfamiliar with computer
@@ -2594,8 +2592,8 @@ A number of environment variables influence how `gawk' behaves.
* AWKPATH Variable:: Searching directories for `awk'
programs.
-* AWKLIBPATH Variable:: Searching directories for `awk'
- shared libraries.
+* AWKLIBPATH Variable:: Searching directories for `awk' shared
+ libraries.
* Other Environment Variables:: The environment variables.

@@ -3737,7 +3735,6 @@ have to be named on the `awk' command line (*note Getline::).
* Getline:: Reading files under explicit program control
using the `getline' function.
* Read Timeout:: Reading input with a timeout.
-
* Command line directories:: What happens if you put a directory on the
command line.
@@ -8520,10 +8517,10 @@ would otherwise be difficult or impossible to perform:
entirely. Otherwise, `gawk' exits with the usual fatal error.
* If you have written extensions that modify the record handling (by
- inserting an "open hook"), you can invoke them at this point,
+ inserting an "input parser"), you can invoke them at this point,
before `gawk' has started processing the file. (This is a _very_
- advanced feature, currently used only by the XMLgawk project
- (http://xmlgawk.sourceforge.net).)
+ advanced feature, currently used only by the `gawkextlib' project
+ (http://gawkextlib.sourceforge.net).)
The `ENDFILE' rule is called when `gawk' has finished processing the
last record in an input file. For the last input file, it will be
@@ -13771,21 +13768,22 @@ numbers.
* Menu:
-* Floating-point Programming:: Effective Floating-point Programming.
-* Floating-point Representation:: Binary Floating-point Representation.
-* Floating-point Context:: Floating-point Context.
-* Rounding Mode:: Floating-point Rounding Mode.
-* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
- Arithmetic with `gawk'.
-* Setting Precision:: Setting the Working Precision.
-* Setting Rounding Mode:: Setting the Rounding Mode.
-* Floating-point Constants:: Representing Floating-point Constants.
-* Changing Precision:: Changing the Precision of a Number.
-* Exact Arithmetic:: Exact Arithmetic with Floating-point Numbers.
-* Integer Programming:: Effective Integer Programming.
-* Arbitrary Precision Integers:: Arbitrary Precision Integer
- Arithmetic with `gawk'.
-* MPFR and GMP Libraries:: Information About the MPFR and GMP Libraries.
+* Floating-point Programming:: Effective Floating-point Programming.
+* Floating-point Representation:: Binary Floating-point Representation.
+* Floating-point Context:: Floating-point Context.
+* Rounding Mode:: Floating-point Rounding Mode.
+* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
+ Arithmetic with `gawk'.
+* Setting Precision:: Setting the Working Precision.
+* Setting Rounding Mode:: Setting the Rounding Mode.
+* Floating-point Constants:: Representing Floating-point Constants.
+* Changing Precision:: Changing the Precision of a Number.
+* Exact Arithmetic:: Exact Arithmetic with Floating-point
+ Numbers.
+* Integer Programming:: Effective Integer Programming.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
+ `gawk'.
+* MPFR and GMP Libraries ::
---------- Footnotes ----------
@@ -19689,7 +19687,7 @@ supplies the following copyright terms:
We leave it to you to determine what the program does.

-File: gawk.info, Node: Debugger, Next: Language History, Prev: Sample Programs, Up: Top
+File: gawk.info, Node: Debugger, Next: Dynamic Extensions, Prev: Sample Programs, Up: Top
15 Debugging `awk' Programs
***************************
@@ -20741,7 +20739,400 @@ features may be added, and of course feel free to try to add them
yourself!

-File: gawk.info, Node: Language History, Next: Installation, Prev: Debugger, Up: Top
+File: gawk.info, Node: Dynamic Extensions, Next: Language History, Prev: Debugger, Up: Top
+
+16 Writing Extensions for `gawk'
+********************************
+
+This chapter is a placeholder, pending a rewrite for the new API. Some
+of the old bits remain, since they can be partially reused.
+
+ It is possible to add new built-in functions to `gawk' using
+dynamically loaded libraries. This facility is available on systems
+(such as GNU/Linux) that support the C `dlopen()' and `dlsym()'
+functions. This major node describes how to write and use dynamically
+loaded extensions for `gawk'. Experience with programming in C or C++
+is necessary when reading this minor node.
+
+ NOTE: When `--sandbox' is specified, extensions are disabled
+ (*note Options::.
+
+* Menu:
+
+* Plugin License:: A note about licensing.
+* Sample Library:: A example of new functions.
+
+
+File: gawk.info, Node: Plugin License, Next: Sample Library, Up: Dynamic Extensions
+
+16.1 Extension Licensing
+========================
+
+Every dynamic extension should define the global symbol
+`plugin_is_GPL_compatible' to assert that it has been licensed under a
+GPL-compatible license. If this symbol does not exist, `gawk' will
+emit a fatal error and exit.
+
+ The declared type of the symbol should be `int'. It does not need
+to be in any allocated section, though. The code merely asserts that
+the symbol exists in the global scope. Something like this is enough:
+
+ int plugin_is_GPL_compatible;
+
+
+File: gawk.info, Node: Sample Library, Prev: Plugin License, Up: Dynamic Extensions
+
+16.2 Example: Directory and File Operation Built-ins
+====================================================
+
+Two useful functions that are not in `awk' are `chdir()' (so that an
+`awk' program can change its directory) and `stat()' (so that an `awk'
+program can gather information about a file). This minor node
+implements these functions for `gawk' in an external extension library.
+
+* Menu:
+
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
+
+
+File: gawk.info, Node: Internal File Description, Next: Internal File Ops, Up: Sample Library
+
+16.2.1 Using `chdir()' and `stat()'
+-----------------------------------
+
+This minor node shows how to use the new functions at the `awk' level
+once they've been integrated into the running `gawk' interpreter.
+Using `chdir()' is very straightforward. It takes one argument, the new
+directory to change to:
+
+ ...
+ newdir = "/home/arnold/funstuff"
+ ret = chdir(newdir)
+ if (ret < 0) {
+ printf("could not change to %s: %s\n",
+ newdir, ERRNO) > "/dev/stderr"
+ exit 1
+ }
+ ...
+
+ The return value is negative if the `chdir' failed, and `ERRNO'
+(*note Built-in Variables::) is set to a string indicating the error.
+
+ Using `stat()' is a bit more complicated. The C `stat()' function
+fills in a structure that has a fair amount of information. The right
+way to model this in `awk' is to fill in an associative array with the
+appropriate information:
+
+ file = "/home/arnold/.profile"
+ fdata[1] = "x" # force `fdata' to be an array
+ ret = stat(file, fdata)
+ if (ret < 0) {
+ printf("could not stat %s: %s\n",
+ file, ERRNO) > "/dev/stderr"
+ exit 1
+ }
+ printf("size of %s is %d bytes\n", file, fdata["size"])
+
+ The `stat()' function always clears the data array, even if the
+`stat()' fails. It fills in the following elements:
+
+`"name"'
+ The name of the file that was `stat()''ed.
+
+`"dev"'
+`"ino"'
+ The file's device and inode numbers, respectively.
+
+`"mode"'
+ The file's mode, as a numeric value. This includes both the file's
+ type and its permissions.
+
+`"nlink"'
+ The number of hard links (directory entries) the file has.
+
+`"uid"'
+`"gid"'
+ The numeric user and group ID numbers of the file's owner.
+
+`"size"'
+ The size in bytes of the file.
+
+`"blocks"'
+ The number of disk blocks the file actually occupies. This may not
+ be a function of the file's size if the file has holes.
+
+`"atime"'
+`"mtime"'
+`"ctime"'
+ The file's last access, modification, and inode update times,
+ respectively. These are numeric timestamps, suitable for
+ formatting with `strftime()' (*note Built-in::).
+
+`"pmode"'
+ The file's "printable mode." This is a string representation of
+ the file's type and permissions, such as what is produced by `ls
+ -l'--for example, `"drwxr-xr-x"'.
+
+`"type"'
+ A printable string representation of the file's type. The value
+ is one of the following:
+
+ `"blockdev"'
+ `"chardev"'
+ The file is a block or character device ("special file").
+
+ `"directory"'
+ The file is a directory.
+
+ `"fifo"'
+ The file is a named-pipe (also known as a FIFO).
+
+ `"file"'
+ The file is just a regular file.
+
+ `"socket"'
+ The file is an `AF_UNIX' ("Unix domain") socket in the
+ filesystem.
+
+ `"symlink"'
+ The file is a symbolic link.
+
+ Several additional elements may be present depending upon the
+operating system and the type of the file. You can test for them in
+your `awk' program by using the `in' operator (*note Reference to
+Elements::):
+
+`"blksize"'
+ The preferred block size for I/O to the file. This field is not
+ present on all POSIX-like systems in the C `stat' structure.
+
+`"linkval"'
+ If the file is a symbolic link, this element is the name of the
+ file the link points to (i.e., the value of the link).
+
+`"rdev"'
+`"major"'
+`"minor"'
+ If the file is a block or character device file, then these values
+ represent the numeric device number and the major and minor
+ components of that number, respectively.
+
+
+File: gawk.info, Node: Internal File Ops, Next: Using Internal File Ops, Prev: Internal File Description, Up: Sample Library
+
+16.2.2 C Code for `chdir()' and `stat()'
+----------------------------------------
+
+Here is the C code for these extensions. They were written for
+GNU/Linux. The code needs some more work for complete portability to
+other POSIX-compliant systems:(1)
+
+ #include "awk.h"
+
+ #include <sys/sysmacros.h>
+
+ int plugin_is_GPL_compatible;
+
+ /* do_chdir --- provide dynamically loaded chdir() builtin for gawk */
+
+ static NODE *
+ do_chdir(int nargs)
+ {
+ NODE *newdir;
+ int ret = -1;
+
+ if (do_lint && nargs != 1)
+ lintwarn("chdir: called with incorrect number of arguments");
+
+ newdir = get_scalar_argument(0, FALSE);
+
+ The file includes the `"awk.h"' header file for definitions for the
+`gawk' internals. It includes `<sys/sysmacros.h>' for access to the
+`major()' and `minor'() macros.
+
+ By convention, for an `awk' function `foo', the function that
+implements it is called `do_foo'. The function should take a `int'
+argument, usually called `nargs', that represents the number of defined
+arguments for the function. The `newdir' variable represents the new
+directory to change to, retrieved with `get_scalar_argument()'. Note
+that the first argument is numbered zero.
+
+ This code actually accomplishes the `chdir()'. It first forces the
+argument to be a string and passes the string value to the `chdir()'
+system call. If the `chdir()' fails, `ERRNO' is updated.
+
+ (void) force_string(newdir);
+ ret = chdir(newdir->stptr);
+ if (ret < 0)
+ update_ERRNO_int(errno);
+
+ Finally, the function returns the return value to the `awk' level:
+
+ return make_number((AWKNUM) ret);
+ }
+
+ The `stat()' built-in is more involved. First comes a function that
+turns a numeric mode into a printable representation (e.g., 644 becomes
+`-rw-r--r--'). This is omitted here for brevity:
+
+ /* format_mode --- turn a stat mode field into something readable */
+
+ static char *
+ format_mode(unsigned long fmode)
+ {
+ ...
+ }
+
+ Next comes the `do_stat()' function. It starts with variable
+declarations and argument checking:
+
+ /* do_stat --- provide a stat() function for gawk */
+
+ static NODE *
+ do_stat(int nargs)
+ {
+ NODE *file, *array, *tmp;
+ struct stat sbuf;
+ int ret;
+ NODE **aptr;
+ char *pmode; /* printable mode */
+ char *type = "unknown";
+
+ if (do_lint && nargs > 2)
+ lintwarn("stat: called with too many arguments");
+
+ Then comes the actual work. First, the function gets the arguments.
+Then, it always clears the array. The code use `lstat()' (instead of
+`stat()') to get the file information, in case the file is a symbolic
+link. If there's an error, it sets `ERRNO' and returns:
+
+ /* file is first arg, array to hold results is second */
+ file = get_scalar_argument(0, FALSE);
+ array = get_array_argument(1, FALSE);
+
+ /* empty out the array */
+ assoc_clear(array);
+
+ /* lstat the file, if error, set ERRNO and return */
+ (void) force_string(file);
+ ret = lstat(file->stptr, & sbuf);
+ if (ret < 0) {
+ update_ERRNO_int(errno);
+ return make_number((AWKNUM) ret);
+ }
+
+ Now comes the tedious part: filling in the array. Only a few of the
+calls are shown here, since they all follow the same pattern:
+
+ /* fill in the array */
+ aptr = assoc_lookup(array, tmp = make_string("name", 4));
+ *aptr = dupnode(file);
+ unref(tmp);
+
+ aptr = assoc_lookup(array, tmp = make_string("mode", 4));
+ *aptr = make_number((AWKNUM) sbuf.st_mode);
+ unref(tmp);
+
+ aptr = assoc_lookup(array, tmp = make_string("pmode", 5));
+ pmode = format_mode(sbuf.st_mode);
+ *aptr = make_string(pmode, strlen(pmode));
+ unref(tmp);
+
+ When done, return the `lstat()' return value:
+
+
+ return make_number((AWKNUM) ret);
+ }
+
+ Finally, it's necessary to provide the "glue" that loads the new
+function(s) into `gawk'. By convention, each library has a routine
+named `dl_load()' that does the job. The simplest way is to use the
+`dl_load_func' macro in `gawkapi.h'.
+
+ And that's it! As an exercise, consider adding functions to
+implement system calls such as `chown()', `chmod()', and `umask()'.
+
+ ---------- Footnotes ----------
+
+ (1) This version is edited slightly for presentation. See
+`extension/filefuncs.c' in the `gawk' distribution for the complete
+version.
+
+
+File: gawk.info, Node: Using Internal File Ops, Prev: Internal File Ops, Up: Sample Library
+
+16.2.3 Integrating the Extensions
+---------------------------------
+
+Now that the code is written, it must be possible to add it at runtime
+to the running `gawk' interpreter. First, the code must be compiled.
+Assuming that the functions are in a file named `filefuncs.c', and IDIR
+is the location of the `gawk' include files, the following steps create
+a GNU/Linux shared library:
+
+ $ gcc -fPIC -shared -DHAVE_CONFIG_H -c -O -g -IIDIR filefuncs.c
+ $ ld -o filefuncs.so -shared filefuncs.o
+
+ Once the library exists, it is loaded by calling the `extension()'
+built-in function. This function takes two arguments: the name of the
+library to load and the name of a function to call when the library is
+first loaded. This function adds the new functions to `gawk'. It
+returns the value returned by the initialization function within the
+shared library:
+
+ # file testff.awk
+ BEGIN {
+ extension("./filefuncs.so", "dl_load")
+
+ chdir(".") # no-op
+
+ data[1] = 1 # force `data' to be an array
+ print "Info for testff.awk"
+ ret = stat("testff.awk", data)
+ print "ret =", ret
+ for (i in data)
+ printf "data[\"%s\"] = %s\n", i, data[i]
+ print "testff.awk modified:",
+ strftime("%m %d %y %H:%M:%S", data["mtime"])
+
+ print "\nInfo for JUNK"
+ ret = stat("JUNK", data)
+ print "ret =", ret
+ for (i in data)
+ printf "data[\"%s\"] = %s\n", i, data[i]
+ print "JUNK modified:", strftime("%m %d %y %H:%M:%S", data["mtime"])
+ }
+
+ Here are the results of running the program:
+
+ $ gawk -f testff.awk
+ -| Info for testff.awk
+ -| ret = 0
+ -| data["size"] = 607
+ -| data["ino"] = 14945891
+ -| data["name"] = testff.awk
+ -| data["pmode"] = -rw-rw-r--
+ -| data["nlink"] = 1
+ -| data["atime"] = 1293993369
+ -| data["mtime"] = 1288520752
+ -| data["mode"] = 33204
+ -| data["blksize"] = 4096
+ -| data["dev"] = 2054
+ -| data["type"] = file
+ -| data["gid"] = 500
+ -| data["uid"] = 500
+ -| data["blocks"] = 8
+ -| data["ctime"] = 1290113572
+ -| testff.awk modified: 10 31 10 12:25:52
+ -|
+ -| Info for JUNK
+ -| ret = -1
+ -| JUNK modified: 01 01 70 02:00:00
+
+
+File: gawk.info, Node: Language History, Next: Installation, Prev: Dynamic Extensions, Up: Top
Appendix A The Evolution of the `awk' Language
**********************************************
@@ -21030,9 +21421,6 @@ the current version of `gawk'.
- The `bindtextdomain()', `dcgettext()' and `dcngettext()'
functions for internationalization (*note Programmer i18n::).
- - The `extension()' built-in function and the ability to add
- new functions dynamically (*note Dynamic Extensions::).
-
- The `fflush()' function from Brian Kernighan's version of
`awk' (*note I/O Functions::).
@@ -21051,11 +21439,13 @@ the current version of `gawk'.
search for the `-l' command-line option (*note Options::).
- The ability to use GNU-style long-named options that start
- with `--' and the `--characters-as-bytes', `--compat',
- `--dump-variables', `--exec', `--gen-pot', `--lint',
- `--lint-old', `--non-decimal-data', `--posix', `--profile',
- `--re-interval', `--sandbox', `--source', `--traditional', and
- `--use-lc-numeric' options (*note Options::).
+ with `--' and the `--bignum', `--characters-as-bytes',
+ `--copyright', `--debug', `--dump-variables', `--exec',
+ `--gen-pot', `--include', `--lint', `--lint-old', `--load',
+ `--non-decimal-data', `--optimize', `--posix',
+ `--pretty-print', `--profile', `--re-interval', `--sandbox',
+ `--source', `--traditional', and `--use-lc-numeric' options
+ (*note Options::).
* Support for the following obsolete systems was removed from the
code and the documentation for `gawk' version 4.0:
@@ -21277,7 +21667,7 @@ Info file, in approximate chronological order:
various PC platforms.
* Christos Zoulas provided the `extension()' built-in function for
- dynamically adding new modules.
+ dynamically adding new modules. (This was removed in `gawk' 4.1.)
* Ju"rgen Kahrs contributed the initial version of the TCP/IP
networking code and documentation, and motivated the inclusion of
@@ -22400,8 +22790,6 @@ and maintainers of `gawk'. Everything in it applies specifically to
* Compatibility Mode:: How to disable certain `gawk'
extensions.
* Additions:: Making Additions To `gawk'.
-* Dynamic Extensions:: Adding new built-in functions to
- `gawk'.
* Future Extensions:: New features that may be implemented one day.

@@ -22428,7 +22816,7 @@ for the casual user. It probably has not even been compiled into your
version of `gawk', since it slows down execution.

-File: gawk.info, Node: Additions, Next: Dynamic Extensions, Prev: Compatibility Mode, Up: Notes
+File: gawk.info, Node: Additions, Next: Future Extensions, Prev: Compatibility Mode, Up: Notes
C.2 Making Additions to `gawk'
==============================
@@ -22597,9 +22985,10 @@ possible to include your changes:
7. Submit changes as unified diffs. Use `diff -u -r -N' to compare
the original `gawk' source tree with your version. I recommend
- using the GNU version of `diff'. Send the output produced by
- either run of `diff' to me when you submit your changes. (*Note
- Bugs::, for the electronic mail information.)
+ using the GNU version of `diff' or, best of all, `git diff' or
+ `git format-patch'. Send the output produced by whichever tool
+ you use to me when you submit your changes. (*Note Bugs::, for
+ the electronic mail information.)
Using this format makes it easy for me to apply your changes to the
master version of the `gawk' source code (using `patch'). If I
@@ -22698,661 +23087,9 @@ code that is already there.
style and brace layout that suits your taste.

-File: gawk.info, Node: Dynamic Extensions, Next: Future Extensions, Prev: Additions, Up: Notes
-
-C.3 Adding New Built-in Functions to `gawk'
-===========================================
-
- Danger Will Robinson! Danger!!
- Warning! Warning!
- The Robot
-
- It is possible to add new built-in functions to `gawk' using
-dynamically loaded libraries. This facility is available on systems
-(such as GNU/Linux) that support the C `dlopen()' and `dlsym()'
-functions. This minor node describes how to write and use dynamically
-loaded extensions for `gawk'. Experience with programming in C or C++
-is necessary when reading this minor node.
-
- CAUTION: The facilities described in this minor node are very much
- subject to change in a future `gawk' release. Be aware that you
- may have to re-do everything, at some future time.
-
- If you have written your own dynamic extensions, be sure to
- recompile them for each new `gawk' release. There is no guarantee
- of binary compatibility between different releases, nor will there
- ever be such a guarantee.
-
- NOTE: When `--sandbox' is specified, extensions are disabled
- (*note Options::.
-
-* Menu:
-
-* Internals:: A brief look at some `gawk' internals.
-* Plugin License:: A note about licensing.
-* Loading Extensions:: How to load dynamic extensions.
-* Sample Library:: A example of new functions.
-
-
-File: gawk.info, Node: Internals, Next: Plugin License, Up: Dynamic Extensions
-
-C.3.1 A Minimal Introduction to `gawk' Internals
-------------------------------------------------
-
-The truth is that `gawk' was not designed for simple extensibility.
-The facilities for adding functions using shared libraries work, but
-are something of a "bag on the side." Thus, this tour is brief and
-simplistic; would-be `gawk' hackers are encouraged to spend some time
-reading the source code before trying to write extensions based on the
-material presented here. Of particular note are the files `awk.h',
-`builtin.c', and `eval.c'. Reading `awkgram.y' in order to see how the
-parse tree is built would also be of use.
-
- With the disclaimers out of the way, the following types, structure
-members, functions, and macros are declared in `awk.h' and are of use
-when writing extensions. The next minor node shows how they are used:
-
-`AWKNUM'
- An `AWKNUM' is the internal type of `awk' floating-point numbers.
- Typically, it is a C `double'.
-
-`NODE'
- Just about everything is done using objects of type `NODE'. These
- contain both strings and numbers, as well as variables and arrays.
-
-`AWKNUM force_number(NODE *n)'
- This macro forces a value to be numeric. It returns the actual
- numeric value contained in the node. It may end up calling an
- internal `gawk' function.
-
-`void force_string(NODE *n)'
- This macro guarantees that a `NODE''s string value is current. It
- may end up calling an internal `gawk' function. It also
- guarantees that the string is zero-terminated.
-
-`void force_wstring(NODE *n)'
- Similarly, this macro guarantees that a `NODE''s wide-string value
- is current. It may end up calling an internal `gawk' function.
- It also guarantees that the wide string is zero-terminated.
-
-`nargs'
- Inside an extension function, this is the actual number of
- parameters passed to the current function.
-
-`n->stptr'
-`n->stlen'
- The data and length of a `NODE''s string value, respectively. The
- string is _not_ guaranteed to be zero-terminated. If you need to
- pass the string value to a C library function, save the value in
- `n->stptr[n->stlen]', assign `'\0'' to it, call the routine, and
- then restore the value.
-
-`n->wstptr'
-`n->wstlen'
- The data and length of a `NODE''s wide-string value, respectively.
- Use `force_wstring()' to make sure these values are current.
-
-`n->type'
- The type of the `NODE'. This is a C `enum'. Values should be one
- of `Node_var', `Node_var_new', or `Node_var_array' for function
- parameters.
-
-`n->vname'
- The "variable name" of a node. This is not of much use inside
- externally written extensions.
-
-`void assoc_clear(NODE *n)'
- Clears the associative array pointed to by `n'. Make sure that
- `n->type == Node_var_array' first.
-
-`NODE **assoc_lookup(NODE *symbol, NODE *subs)'
- Finds, and installs if necessary, array elements. `symbol' is the
- array, `subs' is the subscript. This is usually a value created
- with `make_string()' (see below).
-
-`NODE *make_string(char *s, size_t len)'
- Take a C string and turn it into a pointer to a `NODE' that can be
- stored appropriately. This is permanent storage; understanding of
- `gawk' memory management is helpful.
-
-`NODE *make_number(AWKNUM val)'
- Take an `AWKNUM' and turn it into a pointer to a `NODE' that can
- be stored appropriately. This is permanent storage; understanding
- of `gawk' memory management is helpful.
-
-`NODE *dupnode(NODE *n)'
- Duplicate a node. In most cases, this increments an internal
- reference count instead of actually duplicating the entire `NODE';
- understanding of `gawk' memory management is helpful.
-
-`void unref(NODE *n)'
- This macro releases the memory associated with a `NODE' allocated
- with `make_string()' or `make_number()'. Understanding of `gawk'
- memory management is helpful.
-
-`void make_builtin(const char *name, NODE *(*func)(NODE *), int count)'
- Register a C function pointed to by `func' as new built-in
- function `name'. `name' is a regular C string. `count' is the
- maximum number of arguments that the function takes. The function
- should be written in the following manner:
-
- /* do_xxx --- do xxx function for gawk */
-
- NODE *
- do_xxx(int nargs)
- {
- ...
- }
-
-`NODE *get_argument(int i)'
- This function is called from within a C extension function to get
- the `i'-th argument from the function call. The first argument is
- argument zero.
-
-`NODE *get_actual_argument(int i,'
-` int optional, int wantarray);'
- This function retrieves a particular argument `i'. `wantarray' is
- `TRUE' if the argument should be an array, `FALSE' otherwise. If
- `optional' is `TRUE', the argument need not have been supplied.
- If it wasn't, the return value is `NULL'. It is a fatal error if
- `optional' is `TRUE' but the argument was not provided.
-
-`get_scalar_argument(i, opt)'
- This is a convenience macro that calls `get_actual_argument()'.
-
-`get_array_argument(i, opt)'
- This is a convenience macro that calls `get_actual_argument()'.
-
-`void update_ERRNO_int(int errno_saved)'
- This function is called from within a C extension function to set
- the value of `gawk''s `ERRNO' variable, based on the error value
- provided as the argument. It is provided as a convenience.
-
-`void update_ERRNO_string(const char *string, enum errno_translate)'
- This function is called from within a C extension function to set
- the value of `gawk''s `ERRNO' variable to a given string. The
- second argument determines whether the string is translated before
- being installed into `ERRNO'. It is provided as a convenience.
-
-`void unset_ERRNO(void)'
- This function is called from within a C extension function to set
- the value of `gawk''s `ERRNO' variable to a null string. It is
- provided as a convenience.
-
-`void register_deferred_variable(const char *name, NODE *(*load_func)(void))'
- This function is called to register a function to be called when a
- reference to an undefined variable with the given name is
- encountered. The callback function will never be called if the
- variable exists already, so, unless the calling code is running at
- program startup, it should first check whether a variable of the
- given name already exists. The argument function must return a
- pointer to a `NODE' containing the newly created variable. This
- function is used to implement the builtin `ENVIRON' and `PROCINFO'
- arrays, so you can refer to them for examples.
-
-`void register_open_hook(void *(*open_func)(IOBUF *))'
- This function is called to register a function to be called
- whenever a new data file is opened, leading to the creation of an
- `IOBUF' structure in `iop_alloc()'. After creating the new
- `IOBUF', `iop_alloc()' will call (in reverse order of
- registration, so the last function registered is called first)
- each open hook until one returns non-`NULL'. If any hook returns
- a non-`NULL' value, that value is assigned to the `IOBUF''s
- `opaque' field (which will presumably point to a structure
- containing additional state associated with the input processing),
- and no further open hooks are called.
-
- The function called will most likely want to set the `IOBUF''s
- `get_record' method to indicate that future input records should
- be retrieved by calling that method instead of using the standard
- `gawk' input processing.
-
- And the function will also probably want to set the `IOBUF''s
- `close_func' method to be called when the file is closed to clean
- up any state associated with the input.
-
- Finally, hook functions should be prepared to receive an `IOBUF'
- structure where the `fd' field is set to `INVALID_HANDLE', meaning
- that `gawk' was not able to open the file itself. In this case,
- the hook function must be able to successfully open the file and
- place a valid file descriptor there.
-
- Currently, for example, the hook function facility is used to
- implement the XML parser shared library extension. For more info,
- please look in `awk.h' and in `io.c'.
-
- An argument that is supposed to be an array needs to be handled with
-some extra code, in case the array being passed in is actually from a
-function parameter.
-
- The following boilerplate code shows how to do this:
-
- NODE *the_arg;
-
- /* assume need 3rd arg, 0-based */
- the_arg = get_array_argument(2, FALSE);
-
- Again, you should spend time studying the `gawk' internals; don't
-just blindly copy this code.
-
-
-File: gawk.info, Node: Plugin License, Next: Loading Extensions, Prev: Internals, Up: Dynamic Extensions
-
-C.3.2 Extension Licensing
--------------------------
-
-Every dynamic extension should define the global symbol
-`plugin_is_GPL_compatible' to assert that it has been licensed under a
-GPL-compatible license. If this symbol does not exist, `gawk' will
-emit a fatal error and exit.
-
- The declared type of the symbol should be `int'. It does not need
-to be in any allocated section, though. The code merely asserts that
-the symbol exists in the global scope. Something like this is enough:
-
- int plugin_is_GPL_compatible;
-
-
-File: gawk.info, Node: Loading Extensions, Next: Sample Library, Prev: Plugin License, Up: Dynamic Extensions
-
-C.3.3 Loading a Dynamic Extension
----------------------------------
-
-There are two ways to load a dynamically linked library. The first is
-to use the builtin `extension()':
-
- extension(libname, init_func)
-
- where `libname' is the library to load, and `init_func' is the name
-of the initialization or bootstrap routine to run once loaded.
-
- The second method for dynamic loading of a library is to use the
-command line option `-l':
-
- $ gawk -l libname -f myprog
-
- This will work only if the initialization routine is named
-`dl_load()'.
-
- If you use `extension()', the library will be loaded at run time.
-This means that the functions are available only to the rest of your
-script. If you use the command line option `-l' instead, the library
-will be loaded before `gawk' starts compiling the actual program. The
-net effect is that you can use those functions anywhere in the program.
-
- `gawk' has a list of directories where it searches for libraries.
-By default, the list includes directories that depend upon how gawk was
-built and installed (*note AWKLIBPATH Variable::). If you want `gawk'
-to look for libraries in your private directory, you have to tell it.
-The way to do it is to set the `AWKLIBPATH' environment variable (*note
-AWKLIBPATH Variable::). `gawk' supplies the default shared library
-platform suffix if it is not present in the name of the library. If
-the name of your library is `mylib.so', you can simply type
-
- $ gawk -l mylib -f myprog
-
- and `gawk' will do everything necessary to load in your library, and
-then call your `dl_load()' routine.
-
- You can always specify the library using an absolute pathname, in
-which case `gawk' will not use `AWKLIBPATH' to search for it.
-
-
-File: gawk.info, Node: Sample Library, Prev: Loading Extensions, Up: Dynamic Extensions
-
-C.3.4 Example: Directory and File Operation Built-ins
------------------------------------------------------
-
-Two useful functions that are not in `awk' are `chdir()' (so that an
-`awk' program can change its directory) and `stat()' (so that an `awk'
-program can gather information about a file). This minor node
-implements these functions for `gawk' in an external extension library.
-
-* Menu:
-
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
-
-
-File: gawk.info, Node: Internal File Description, Next: Internal File Ops, Up: Sample Library
-
-C.3.4.1 Using `chdir()' and `stat()'
-....................................
-
-This minor node shows how to use the new functions at the `awk' level
-once they've been integrated into the running `gawk' interpreter.
-Using `chdir()' is very straightforward. It takes one argument, the new
-directory to change to:
-
- ...
- newdir = "/home/arnold/funstuff"
- ret = chdir(newdir)
- if (ret < 0) {
- printf("could not change to %s: %s\n",
- newdir, ERRNO) > "/dev/stderr"
- exit 1
- }
- ...
-
- The return value is negative if the `chdir' failed, and `ERRNO'
-(*note Built-in Variables::) is set to a string indicating the error.
-
- Using `stat()' is a bit more complicated. The C `stat()' function
-fills in a structure that has a fair amount of information. The right
-way to model this in `awk' is to fill in an associative array with the
-appropriate information:
-
- file = "/home/arnold/.profile"
- fdata[1] = "x" # force `fdata' to be an array
- ret = stat(file, fdata)
- if (ret < 0) {
- printf("could not stat %s: %s\n",
- file, ERRNO) > "/dev/stderr"
- exit 1
- }
- printf("size of %s is %d bytes\n", file, fdata["size"])
-
- The `stat()' function always clears the data array, even if the
-`stat()' fails. It fills in the following elements:
-
-`"name"'
- The name of the file that was `stat()''ed.
-
-`"dev"'
-`"ino"'
- The file's device and inode numbers, respectively.
-
-`"mode"'
- The file's mode, as a numeric value. This includes both the file's
- type and its permissions.
-
-`"nlink"'
- The number of hard links (directory entries) the file has.
-
-`"uid"'
-`"gid"'
- The numeric user and group ID numbers of the file's owner.
-
-`"size"'
- The size in bytes of the file.
-
-`"blocks"'
- The number of disk blocks the file actually occupies. This may not
- be a function of the file's size if the file has holes.
-
-`"atime"'
-`"mtime"'
-`"ctime"'
- The file's last access, modification, and inode update times,
- respectively. These are numeric timestamps, suitable for
- formatting with `strftime()' (*note Built-in::).
-
-`"pmode"'
- The file's "printable mode." This is a string representation of
- the file's type and permissions, such as what is produced by `ls
- -l'--for example, `"drwxr-xr-x"'.
-
-`"type"'
- A printable string representation of the file's type. The value
- is one of the following:
-
- `"blockdev"'
- `"chardev"'
- The file is a block or character device ("special file").
-
- `"directory"'
- The file is a directory.
-
- `"fifo"'
- The file is a named-pipe (also known as a FIFO).
-
- `"file"'
- The file is just a regular file.
-
- `"socket"'
- The file is an `AF_UNIX' ("Unix domain") socket in the
- filesystem.
-
- `"symlink"'
- The file is a symbolic link.
-
- Several additional elements may be present depending upon the
-operating system and the type of the file. You can test for them in
-your `awk' program by using the `in' operator (*note Reference to
-Elements::):
-
-`"blksize"'
- The preferred block size for I/O to the file. This field is not
- present on all POSIX-like systems in the C `stat' structure.
-
-`"linkval"'
- If the file is a symbolic link, this element is the name of the
- file the link points to (i.e., the value of the link).
-
-`"rdev"'
-`"major"'
-`"minor"'
- If the file is a block or character device file, then these values
- represent the numeric device number and the major and minor
- components of that number, respectively.
-
-
-File: gawk.info, Node: Internal File Ops, Next: Using Internal File Ops, Prev: Internal File Description, Up: Sample Library
-
-C.3.4.2 C Code for `chdir()' and `stat()'
-.........................................
-
-Here is the C code for these extensions. They were written for
-GNU/Linux. The code needs some more work for complete portability to
-other POSIX-compliant systems:(1)
-
- #include "awk.h"
-
- #include <sys/sysmacros.h>
-
- int plugin_is_GPL_compatible;
-
- /* do_chdir --- provide dynamically loaded chdir() builtin for gawk */
-
- static NODE *
- do_chdir(int nargs)
- {
- NODE *newdir;
- int ret = -1;
-
- if (do_lint && nargs != 1)
- lintwarn("chdir: called with incorrect number of arguments");
-
- newdir = get_scalar_argument(0, FALSE);
-
- The file includes the `"awk.h"' header file for definitions for the
-`gawk' internals. It includes `<sys/sysmacros.h>' for access to the
-`major()' and `minor'() macros.
-
- By convention, for an `awk' function `foo', the function that
-implements it is called `do_foo'. The function should take a `int'
-argument, usually called `nargs', that represents the number of defined
-arguments for the function. The `newdir' variable represents the new
-directory to change to, retrieved with `get_scalar_argument()'. Note
-that the first argument is numbered zero.
-
- This code actually accomplishes the `chdir()'. It first forces the
-argument to be a string and passes the string value to the `chdir()'
-system call. If the `chdir()' fails, `ERRNO' is updated.
-
- (void) force_string(newdir);
- ret = chdir(newdir->stptr);
- if (ret < 0)
- update_ERRNO_int(errno);
-
- Finally, the function returns the return value to the `awk' level:
-
- return make_number((AWKNUM) ret);
- }
-
- The `stat()' built-in is more involved. First comes a function that
-turns a numeric mode into a printable representation (e.g., 644 becomes
-`-rw-r--r--'). This is omitted here for brevity:
-
- /* format_mode --- turn a stat mode field into something readable */
-
- static char *
- format_mode(unsigned long fmode)
- {
- ...
- }
-
- Next comes the `do_stat()' function. It starts with variable
-declarations and argument checking:
-
- /* do_stat --- provide a stat() function for gawk */
-
- static NODE *
- do_stat(int nargs)
- {
- NODE *file, *array, *tmp;
- struct stat sbuf;
- int ret;
- NODE **aptr;
- char *pmode; /* printable mode */
- char *type = "unknown";
-
- if (do_lint && nargs > 2)
- lintwarn("stat: called with too many arguments");
-
- Then comes the actual work. First, the function gets the arguments.
-Then, it always clears the array. The code use `lstat()' (instead of
-`stat()') to get the file information, in case the file is a symbolic
-link. If there's an error, it sets `ERRNO' and returns:
-
- /* file is first arg, array to hold results is second */
- file = get_scalar_argument(0, FALSE);
- array = get_array_argument(1, FALSE);
-
- /* empty out the array */
- assoc_clear(array);
-
- /* lstat the file, if error, set ERRNO and return */
- (void) force_string(file);
- ret = lstat(file->stptr, & sbuf);
- if (ret < 0) {
- update_ERRNO_int(errno);
- return make_number((AWKNUM) ret);
- }
-
- Now comes the tedious part: filling in the array. Only a few of the
-calls are shown here, since they all follow the same pattern:
-
- /* fill in the array */
- aptr = assoc_lookup(array, tmp = make_string("name", 4));
- *aptr = dupnode(file);
- unref(tmp);
-
- aptr = assoc_lookup(array, tmp = make_string("mode", 4));
- *aptr = make_number((AWKNUM) sbuf.st_mode);
- unref(tmp);
-
- aptr = assoc_lookup(array, tmp = make_string("pmode", 5));
- pmode = format_mode(sbuf.st_mode);
- *aptr = make_string(pmode, strlen(pmode));
- unref(tmp);
-
- When done, return the `lstat()' return value:
-
-
- return make_number((AWKNUM) ret);
- }
-
- Finally, it's necessary to provide the "glue" that loads the new
-function(s) into `gawk'. By convention, each library has a routine
-named `dl_load()' that does the job. The simplest way is to use the
-`dl_load_func' macro in `gawkapi.h'.
-
- And that's it! As an exercise, consider adding functions to
-implement system calls such as `chown()', `chmod()', and `umask()'.
-
- ---------- Footnotes ----------
-
- (1) This version is edited slightly for presentation. See
-`extension/filefuncs.c' in the `gawk' distribution for the complete
-version.
-
-
-File: gawk.info, Node: Using Internal File Ops, Prev: Internal File Ops, Up: Sample Library
+File: gawk.info, Node: Future Extensions, Prev: Additions, Up: Notes
-C.3.4.3 Integrating the Extensions
-..................................
-
-Now that the code is written, it must be possible to add it at runtime
-to the running `gawk' interpreter. First, the code must be compiled.
-Assuming that the functions are in a file named `filefuncs.c', and IDIR
-is the location of the `gawk' include files, the following steps create
-a GNU/Linux shared library:
-
- $ gcc -fPIC -shared -DHAVE_CONFIG_H -c -O -g -IIDIR filefuncs.c
- $ ld -o filefuncs.so -shared filefuncs.o
-
- Once the library exists, it is loaded by calling the `extension()'
-built-in function. This function takes two arguments: the name of the
-library to load and the name of a function to call when the library is
-first loaded. This function adds the new functions to `gawk'. It
-returns the value returned by the initialization function within the
-shared library:
-
- # file testff.awk
- BEGIN {
- extension("./filefuncs.so", "dl_load")
-
- chdir(".") # no-op
-
- data[1] = 1 # force `data' to be an array
- print "Info for testff.awk"
- ret = stat("testff.awk", data)
- print "ret =", ret
- for (i in data)
- printf "data[\"%s\"] = %s\n", i, data[i]
- print "testff.awk modified:",
- strftime("%m %d %y %H:%M:%S", data["mtime"])
-
- print "\nInfo for JUNK"
- ret = stat("JUNK", data)
- print "ret =", ret
- for (i in data)
- printf "data[\"%s\"] = %s\n", i, data[i]
- print "JUNK modified:", strftime("%m %d %y %H:%M:%S", data["mtime"])
- }
-
- Here are the results of running the program:
-
- $ gawk -f testff.awk
- -| Info for testff.awk
- -| ret = 0
- -| data["size"] = 607
- -| data["ino"] = 14945891
- -| data["name"] = testff.awk
- -| data["pmode"] = -rw-rw-r--
- -| data["nlink"] = 1
- -| data["atime"] = 1293993369
- -| data["mtime"] = 1288520752
- -| data["mode"] = 33204
- -| data["blksize"] = 4096
- -| data["dev"] = 2054
- -| data["type"] = file
- -| data["gid"] = 500
- -| data["uid"] = 500
- -| data["blocks"] = 8
- -| data["ctime"] = 1290113572
- -| testff.awk modified: 10 31 10 12:25:52
- -|
- -| Info for JUNK
- -| ret = -1
- -| JUNK modified: 01 01 70 02:00:00
-
-
-File: gawk.info, Node: Future Extensions, Prev: Dynamic Extensions, Up: Notes
-
-C.4 Probable Future Extensions
+C.3 Probable Future Extensions
==============================
AWK is a language similar to PERL, only considerably more elegant.
@@ -23369,12 +23106,9 @@ well.
Following is a list of probable future changes visible at the `awk'
language level:
-Loadable module interface
- It is not clear that the `awk'-level interface to the modules
- facility is as good as it should be. The interface needs to be
- redesigned, particularly taking namespace issues into account, as
- well as possibly including issues such as library search path order
- and versioning.
+Databases
+ It may be possible to map a GDBM/NDBM/SDBM file into an `awk'
+ array.
`RECLEN' variable for fixed-length records
Along with `FIELDWIDTHS', this would speed up the processing of
@@ -23382,30 +23116,12 @@ Loadable module interface
`"RECLEN"', depending upon which kind of record processing is in
effect.
-Databases
- It may be possible to map a GDBM/NDBM/SDBM file into an `awk'
- array.
-
More `lint' warnings
There are more things that could be checked for portability.
Following is a list of probable improvements that will make `gawk''s
source code easier to work with:
-Loadable module mechanics
- The current extension mechanism works (*note Dynamic Extensions::),
- but is rather primitive. It requires a fair amount of manual work
- to create and integrate a loadable module. Nor is the current
- mechanism as portable as might be desired. The GNU `libtool'
- package provides a number of features that would make using
- loadable modules much easier. `gawk' should be changed to use
- `libtool'.
-
-Loadable module internals
- The API to its internals that `gawk' "exports" should be revised.
- Too many things are needlessly exposed. A new API should be
- designed and implemented to make module writing easier.
-
Better array subscript management
`gawk''s management of array subscript storage could use revamping,
so that using the same value to index multiple arrays only stores
@@ -25955,7 +25671,7 @@ Index
* Ada programming language: Glossary. (line 20)
* adding, features to gawk: Adding Code. (line 6)
* adding, fields: Changing Fields. (line 53)
-* adding, functions to gawk: Dynamic Extensions. (line 10)
+* adding, functions to gawk: Dynamic Extensions. (line 9)
* advanced features, buffering: I/O Functions. (line 98)
* advanced features, close() function: Close Files And Pipes.
(line 131)
@@ -26014,18 +25730,15 @@ Index
* arguments, command-line, invoking awk: Command Line. (line 6)
* arguments, in function calls: Function Calls. (line 16)
* arguments, processing: Getopt Function. (line 6)
-* arguments, retrieving: Internals. (line 111)
* arithmetic operators: Arithmetic Ops. (line 6)
* arrays: Arrays. (line 6)
* arrays, as parameters to functions: Pass By Value/Reference.
(line 47)
* arrays, associative: Array Intro. (line 50)
-* arrays, associative, clearing: Internals. (line 68)
* arrays, associative, library functions and: Library Names. (line 57)
* arrays, deleting entire contents: Delete. (line 39)
* arrays, elements, assigning: Assigning Elements. (line 6)
* arrays, elements, deleting: Delete. (line 6)
-* arrays, elements, installing: Internals. (line 72)
* arrays, elements, order of: Scanning an Array. (line 48)
* arrays, elements, referencing: Reference to Elements.
(line 6)
@@ -26064,8 +25777,6 @@ Index
* assignment operators, evaluation order: Assignment Ops. (line 111)
* assignment operators, lvalues/rvalues: Assignment Ops. (line 32)
* assignments as filenames: Ignoring Assigns. (line 6)
-* assoc_clear() internal function: Internals. (line 68)
-* assoc_lookup() internal function: Internals. (line 72)
* associative arrays: Array Intro. (line 50)
* asterisk (*), * operator, as multiplication operator: Precedence.
(line 55)
@@ -26134,10 +25845,8 @@ Index
* awk, versions of, See Also Brian Kernighan's awk <1>: Other Versions.
(line 13)
* awk, versions of, See Also Brian Kernighan's awk: BTL. (line 6)
-* awk.h file (internal): Internals. (line 15)
* awka compiler for awk: Other Versions. (line 55)
* AWKLIBPATH environment variable: AWKLIBPATH Variable. (line 6)
-* AWKNUM internal type: Internals. (line 19)
* AWKPATH environment variable <1>: PC Using. (line 11)
* AWKPATH environment variable: AWKPATH Variable. (line 6)
* awkprof.out file: Profiling. (line 6)
@@ -26339,7 +26048,6 @@ Index
* close() function, two-way pipes and: Two-way I/O. (line 77)
* Close, Diane <1>: Contributors. (line 21)
* Close, Diane: Manual History. (line 41)
-* close_func() input method: Internals. (line 157)
* collating elements: Bracket Expressions. (line 69)
* collating symbols: Bracket Expressions. (line 76)
* Colombo, Antonio: Acknowledgments. (line 60)
@@ -26719,7 +26427,6 @@ Index
* DuBois, John: Acknowledgments. (line 60)
* dump debugger command: Miscellaneous Debugger Commands.
(line 9)
-* dupnode() internal function: Internals. (line 87)
* dupword.awk program: Dupword Program. (line 31)
* e debugger command (alias for enable): Breakpoint Control. (line 73)
* EBCDIC: Ordinal Functions. (line 45)
@@ -26760,7 +26467,6 @@ Index
* endgrent() user-defined function: Group Functions. (line 218)
* endpwent() function (C library): Passwd Functions. (line 210)
* endpwent() user-defined function: Passwd Functions. (line 213)
-* ENVIRON array <1>: Internals. (line 146)
* ENVIRON array: Auto-set. (line 60)
* environment variables: Auto-set. (line 60)
* epoch, definition of: Glossary. (line 239)
@@ -26769,11 +26475,10 @@ Index
* equals sign (=), == operator: Comparison Operators.
(line 11)
* EREs (Extended Regular Expressions): Bracket Expressions. (line 24)
-* ERRNO variable <1>: Internals. (line 130)
-* ERRNO variable <2>: TCP/IP Networking. (line 54)
-* ERRNO variable <3>: Auto-set. (line 73)
-* ERRNO variable <4>: BEGINFILE/ENDFILE. (line 26)
-* ERRNO variable <5>: Close Files And Pipes.
+* ERRNO variable <1>: TCP/IP Networking. (line 54)
+* ERRNO variable <2>: Auto-set. (line 73)
+* ERRNO variable <3>: BEGINFILE/ENDFILE. (line 26)
+* ERRNO variable <4>: Close Files And Pipes.
(line 139)
* ERRNO variable: Getline. (line 19)
* error handling: Special FD. (line 16)
@@ -26818,7 +26523,6 @@ Index
(line 9)
* expressions, selecting: Conditional Exp. (line 6)
* Extended Regular Expressions (EREs): Bracket Expressions. (line 24)
-* eXtensible Markup Language (XML): Internals. (line 157)
* extension() function (gawk): Using Internal File Ops.
(line 15)
* extensions, Brian Kernighan's awk <1>: Other Versions. (line 13)
@@ -26958,15 +26662,11 @@ Index
(line 6)
* floating-point, numbers <1>: Unexpected Results. (line 6)
* floating-point, numbers: Basic Data Typing. (line 21)
-* floating-point, numbers, AWKNUM internal type: Internals. (line 19)
* FNR variable <1>: Auto-set. (line 103)
* FNR variable: Records. (line 6)
* FNR variable, changing: Auto-set. (line 225)
* for statement: For Statement. (line 6)
* for statement, in arrays: Scanning an Array. (line 20)
-* force_number() internal function: Internals. (line 27)
-* force_string() internal function: Internals. (line 32)
-* force_wstring() internal function: Internals. (line 37)
* format specifiers, mixing regular with positional specifiers: Printf Ordering.
(line 57)
* format specifiers, printf statement: Control Letters. (line 6)
@@ -27014,7 +26714,7 @@ Index
(line 47)
* functions, built-in <1>: Functions. (line 6)
* functions, built-in: Function Calls. (line 10)
-* functions, built-in, adding to gawk: Dynamic Extensions. (line 10)
+* functions, built-in, adding to gawk: Dynamic Extensions. (line 9)
* functions, built-in, evaluation order: Calling Built-in. (line 30)
* functions, defining: Definition Syntax. (line 6)
* functions, library: Library Functions. (line 6)
@@ -27042,7 +26742,6 @@ Index
* functions, names of <1>: Definition Syntax. (line 20)
* functions, names of: Arrays. (line 18)
* functions, recursive: Definition Syntax. (line 73)
-* functions, return values, setting: Internals. (line 130)
* functions, string-translation: I18N Functions. (line 6)
* functions, undefined: Pass By Value/Reference.
(line 71)
@@ -27096,8 +26795,7 @@ Index
* gawk, FPAT variable in: Splitting By Content.
(line 26)
* gawk, function arguments and: Calling Built-in. (line 16)
-* gawk, functions, adding: Dynamic Extensions. (line 10)
-* gawk, functions, loading: Loading Extensions. (line 6)
+* gawk, functions, adding: Dynamic Extensions. (line 9)
* gawk, hexadecimal numbers and: Nondecimal-numbers. (line 42)
* gawk, IGNORECASE variable in <1>: Array Sorting Functions.
(line 81)
@@ -27112,7 +26810,6 @@ Index
* gawk, implementation issues, limits: Getline Notes. (line 14)
* gawk, implementation issues, pipes: Redirection. (line 135)
* gawk, installing: Installation. (line 6)
-* gawk, internals: Internals. (line 6)
* gawk, internationalization and, See internationalization: Internationalization.
(line 13)
* gawk, interpreter, adding code to: Using Internal File Ops.
@@ -27158,11 +26855,6 @@ Index
* gensub() function (gawk): Using Constant Regexps.
(line 43)
* gensub() function (gawk), escape processing: Gory Details. (line 6)
-* get_actual_argument() internal function: Internals. (line 116)
-* get_argument() internal function: Internals. (line 111)
-* get_array_argument() internal macro: Internals. (line 127)
-* get_record() input method: Internals. (line 157)
-* get_scalar_argument() internal macro: Internals. (line 124)
* getaddrinfo() function (C library): TCP/IP Networking. (line 38)
* getgrent() function (C library): Group Functions. (line 6)
* getgrent() user-defined function: Group Functions. (line 6)
@@ -27323,37 +27015,6 @@ Index
* integers: Basic Data Typing. (line 21)
* integers, unsigned: Basic Data Typing. (line 30)
* interacting with other programs: I/O Functions. (line 63)
-* internal constant, INVALID_HANDLE: Internals. (line 157)
-* internal function, assoc_clear(): Internals. (line 68)
-* internal function, assoc_lookup(): Internals. (line 72)
-* internal function, dupnode(): Internals. (line 87)
-* internal function, force_number(): Internals. (line 27)
-* internal function, force_string(): Internals. (line 32)
-* internal function, force_wstring(): Internals. (line 37)
-* internal function, get_actual_argument(): Internals. (line 116)
-* internal function, get_argument(): Internals. (line 111)
-* internal function, iop_alloc(): Internals. (line 157)
-* internal function, make_builtin(): Internals. (line 97)
-* internal function, make_number(): Internals. (line 82)
-* internal function, make_string(): Internals. (line 77)
-* internal function, register_deferred_variable(): Internals. (line 146)
-* internal function, register_open_hook(): Internals. (line 157)
-* internal function, unref(): Internals. (line 92)
-* internal function, unset_ERRNO(): Internals. (line 141)
-* internal function, update_ERRNO_int(): Internals. (line 130)
-* internal function, update_ERRNO_string(): Internals. (line 135)
-* internal macro, get_array_argument(): Internals. (line 127)
-* internal macro, get_scalar_argument(): Internals. (line 124)
-* internal structure, IOBUF: Internals. (line 157)
-* internal type, AWKNUM: Internals. (line 19)
-* internal type, NODE: Internals. (line 23)
-* internal variable, nargs: Internals. (line 42)
-* internal variable, stlen: Internals. (line 46)
-* internal variable, stptr: Internals. (line 46)
-* internal variable, type: Internals. (line 59)
-* internal variable, vname: Internals. (line 64)
-* internal variable, wstlen: Internals. (line 54)
-* internal variable, wstptr: Internals. (line 54)
* internationalization <1>: I18N and L10N. (line 6)
* internationalization: I18N Functions. (line 6)
* internationalization, localization <1>: Internationalization.
@@ -27373,10 +27034,7 @@ Index
* interpreted programs <1>: Glossary. (line 361)
* interpreted programs: Basic High Level. (line 14)
* interval expressions: Regexp Operators. (line 116)
-* INVALID_HANDLE internal constant: Internals. (line 157)
* inventory-shipped file: Sample Data Files. (line 32)
-* IOBUF internal structure: Internals. (line 157)
-* iop_alloc() internal function: Internals. (line 157)
* isarray() function (gawk): Type Functions. (line 11)
* ISO: Glossary. (line 372)
* ISO 8859-1: Glossary. (line 141)
@@ -27482,7 +27140,6 @@ Index
* Linux: Manual History. (line 28)
* list debugger command: Miscellaneous Debugger Commands.
(line 74)
-* loading extension: Loading Extensions. (line 6)
* loading, library: Options. (line 173)
* local variables: Variable Scope. (line 6)
* locale categories: Explaining gettext. (line 80)
@@ -27502,15 +27159,11 @@ Index
* loops, count for header: Profiling. (line 123)
* loops, exiting: Break Statement. (line 6)
* loops, See Also while statement: While Statement. (line 6)
-* Lost In Space: Dynamic Extensions. (line 6)
* ls utility: More Complex. (line 15)
* lshift() function (gawk): Bitwise Functions. (line 46)
* lvalues/rvalues: Assignment Ops. (line 32)
* mailing labels, printing: Labels Program. (line 6)
* mailing list, GNITS: Acknowledgments. (line 52)
-* make_builtin() internal function: Internals. (line 97)
-* make_number() internal function: Internals. (line 82)
-* make_string() internal function: Internals. (line 77)
* mark parity: Ordinal Functions. (line 45)
* marked string extraction (internationalization): String Extraction.
(line 6)
@@ -27525,7 +27178,6 @@ Index
* matching, null strings: Gory Details. (line 164)
* mawk program: Other Versions. (line 35)
* McPhee, Patrick: Contributors. (line 100)
-* memory, releasing: Internals. (line 92)
* message object files: Explaining gettext. (line 41)
* message object files, converting from portable object files: I18N Example.
(line 62)
@@ -27551,7 +27203,6 @@ Index
* namespace issues <1>: Library Names. (line 6)
* namespace issues: Arrays. (line 18)
* namespace issues, functions: Definition Syntax. (line 20)
-* nargs internal variable: Internals. (line 42)
* nawk utility: Names. (line 17)
* negative zero: Unexpected Results. (line 28)
* NetBSD: Glossary. (line 611)
@@ -27592,8 +27243,6 @@ Index
* ni debugger command (alias for nexti): Debugger Execution Control.
(line 49)
* noassign.awk program: Ignoring Assigns. (line 15)
-* NODE internal type: Internals. (line 23)
-* nodes, duplicating: Internals. (line 87)
* not Boolean-logic operator: Boolean Ops. (line 6)
* NR variable <1>: Auto-set. (line 119)
* NR variable: Records. (line 6)
@@ -27614,7 +27263,6 @@ Index
* number sign (#), #! (executable scripts), portability issues with: Executable Scripts.
(line 6)
* number sign (#), commenting: Comments. (line 6)
-* numbers: Internals. (line 82)
* numbers, as array subscripts: Numeric Array Subscripts.
(line 6)
* numbers, as values of characters: Ordinal Functions. (line 6)
@@ -27624,16 +27272,13 @@ Index
* numbers, converting: Conversion. (line 6)
* numbers, converting, to strings: User-modified. (line 28)
* numbers, floating-point: Basic Data Typing. (line 21)
-* numbers, floating-point, AWKNUM internal type: Internals. (line 19)
* numbers, hexadecimal: Nondecimal-numbers. (line 6)
-* numbers, NODE internal type: Internals. (line 23)
* numbers, octal: Nondecimal-numbers. (line 6)
* numbers, random: Numeric Functions. (line 64)
* numbers, rounding: Round Function. (line 6)
* numeric, constants: Scalar Constants. (line 6)
* numeric, output format: OFMT. (line 6)
* numeric, strings: Variable Typing. (line 6)
-* numeric, values: Internals. (line 27)
* o debugger command (alias for option): Debugger Info. (line 57)
* oawk utility: Names. (line 17)
* obsolete features: Obsolete. (line 6)
@@ -27717,7 +27362,6 @@ Index
(line 36)
* P1003.1 POSIX standard: Glossary. (line 454)
* P1003.2 POSIX standard: Glossary. (line 454)
-* parameters, number of: Internals. (line 42)
* parentheses () <1>: Profiling. (line 138)
* parentheses (): Regexp Operators. (line 79)
* password file: Passwd Functions. (line 16)
@@ -27880,13 +27524,12 @@ Index
* private variables: Library Names. (line 11)
* processes, two-way communications with: Two-way I/O. (line 23)
* processing data: Basic High Level. (line 6)
-* PROCINFO array <1>: Internals. (line 146)
-* PROCINFO array <2>: Id Program. (line 15)
-* PROCINFO array <3>: Group Functions. (line 6)
-* PROCINFO array <4>: Passwd Functions. (line 6)
-* PROCINFO array <5>: Two-way I/O. (line 116)
-* PROCINFO array <6>: Time Functions. (line 46)
-* PROCINFO array <7>: Auto-set. (line 124)
+* PROCINFO array <1>: Id Program. (line 15)
+* PROCINFO array <2>: Group Functions. (line 6)
+* PROCINFO array <3>: Passwd Functions. (line 6)
+* PROCINFO array <4>: Two-way I/O. (line 116)
+* PROCINFO array <5>: Time Functions. (line 46)
+* PROCINFO array <6>: Auto-set. (line 124)
* PROCINFO array: Obsolete. (line 11)
* profiling awk programs: Profiling. (line 6)
* profiling awk programs, dynamically: Profiling. (line 171)
@@ -27976,8 +27619,6 @@ Index
* regexp constants, slashes vs. quotes: Computed Regexps. (line 28)
* regexp constants, vs. string constants: Computed Regexps. (line 38)
* regexp, See regular expressions: Regexp. (line 6)
-* register_deferred_variable() internal function: Internals. (line 146)
-* register_open_hook() internal function: Internals. (line 157)
* regular expressions: Regexp. (line 6)
* regular expressions as field separators: Field Separators. (line 50)
* regular expressions, anchors in: Regexp Operators. (line 22)
@@ -28046,8 +27687,6 @@ Index
* Robbins, Miriam <1>: Passwd Functions. (line 90)
* Robbins, Miriam <2>: Getline/Pipe. (line 36)
* Robbins, Miriam: Acknowledgments. (line 83)
-* Robinson, Will: Dynamic Extensions. (line 6)
-* robot, the: Dynamic Extensions. (line 6)
* Rommel, Kai Uwe: Contributors. (line 43)
* round() user-defined function: Round Function. (line 16)
* rounding mode, floating-point: Rounding Mode. (line 6)
@@ -28209,8 +27848,6 @@ Index
(line 68)
* stepi debugger command: Debugger Execution Control.
(line 76)
-* stlen internal variable: Internals. (line 46)
-* stptr internal variable: Internals. (line 46)
* stream editors <1>: Simple Sed. (line 6)
* stream editors: Field Splitting Summary.
(line 47)
@@ -28221,7 +27858,6 @@ Index
(line 6)
* string operators: Concatenation. (line 9)
* string-matching operators: Regexp Usage. (line 19)
-* strings: Internals. (line 77)
* strings, converting <1>: Bitwise Functions. (line 109)
* strings, converting: Conversion. (line 6)
* strings, converting, numbers to: User-modified. (line 28)
@@ -28230,7 +27866,6 @@ Index
* strings, for localization: Programmer i18n. (line 14)
* strings, length of: Scalar Constants. (line 20)
* strings, merging arrays into: Join Function. (line 6)
-* strings, NODE internal type: Internals. (line 23)
* strings, null: Regexp Field Splitting.
(line 43)
* strings, numeric: Variable Typing. (line 6)
@@ -28352,7 +27987,6 @@ Index
* trunc-mod operation: Arithmetic Ops. (line 66)
* truth values: Truth Values. (line 6)
* type conversion: Conversion. (line 21)
-* type internal variable: Internals. (line 59)
* u debugger command (alias for until): Debugger Execution Control.
(line 83)
* undefined functions: Pass By Value/Reference.
@@ -28378,16 +28012,12 @@ Index
(line 72)
* Unix, awk scripts and: Executable Scripts. (line 6)
* UNIXROOT variable, on OS/2 systems: PC Using. (line 17)
-* unref() internal function: Internals. (line 92)
-* unset_ERRNO() internal function: Internals. (line 141)
* unsigned integers: Basic Data Typing. (line 30)
* until debugger command: Debugger Execution Control.
(line 83)
* unwatch debugger command: Viewing And Changing Data.
(line 84)
* up debugger command: Execution Stack. (line 33)
-* update_ERRNO_int() internal function: Internals. (line 130)
-* update_ERRNO_string() internal function: Internals. (line 135)
* user database, reading: Passwd Functions. (line 6)
* user-defined, functions: User-defined. (line 6)
* user-defined, functions, counts: Profiling. (line 129)
@@ -28438,7 +28068,6 @@ Index
* vertical bar (|), || operator <1>: Precedence. (line 89)
* vertical bar (|), || operator: Boolean Ops. (line 57)
* Vinschen, Corinna: Acknowledgments. (line 60)
-* vname internal variable: Internals. (line 64)
* w debugger command (alias for watch): Viewing And Changing Data.
(line 67)
* w utility: Constant Size. (line 22)
@@ -28472,11 +28101,8 @@ Index
* words, counting: Wc Program. (line 6)
* words, duplicate, searching for: Dupword Program. (line 6)
* words, usage counts, generating: Word Sorting. (line 6)
-* wstlen internal variable: Internals. (line 54)
-* wstptr internal variable: Internals. (line 54)
* xgawk: Other Versions. (line 120)
* xgettext utility: String Extraction. (line 13)
-* XML (eXtensible Markup Language): Internals. (line 157)
* XOR bitwise operation: Bitwise Functions. (line 6)
* xor() function (gawk): Bitwise Functions. (line 55)
* Yawitz, Efraim: Contributors. (line 106)
@@ -28514,442 +28140,440 @@ Index

Tag Table:
Node: Top1352
-Node: Foreword31758
-Node: Preface36103
-Ref: Preface-Footnote-139156
-Ref: Preface-Footnote-239262
-Node: History39494
-Node: Names41885
-Ref: Names-Footnote-143362
-Node: This Manual43434
-Ref: This Manual-Footnote-148372
-Node: Conventions48472
-Node: Manual History50606
-Ref: Manual History-Footnote-153876
-Ref: Manual History-Footnote-253917
-Node: How To Contribute53991
-Node: Acknowledgments55135
-Node: Getting Started59631
-Node: Running gawk62010
-Node: One-shot63196
-Node: Read Terminal64421
-Ref: Read Terminal-Footnote-166071
-Ref: Read Terminal-Footnote-266347
-Node: Long66518
-Node: Executable Scripts67894
-Ref: Executable Scripts-Footnote-169763
-Ref: Executable Scripts-Footnote-269865
-Node: Comments70412
-Node: Quoting72879
-Node: DOS Quoting77502
-Node: Sample Data Files78177
-Node: Very Simple81209
-Node: Two Rules85808
-Node: More Complex87955
-Ref: More Complex-Footnote-190885
-Node: Statements/Lines90970
-Ref: Statements/Lines-Footnote-195432
-Node: Other Features95697
-Node: When96625
-Node: Invoking Gawk98772
-Node: Command Line100233
-Node: Options101016
-Ref: Options-Footnote-1116414
-Node: Other Arguments116439
-Node: Naming Standard Input119097
-Node: Environment Variables120191
-Node: AWKPATH Variable120749
-Ref: AWKPATH Variable-Footnote-1123507
-Node: AWKLIBPATH Variable123767
-Node: Other Environment Variables124364
-Node: Exit Status126859
-Node: Include Files127534
-Node: Loading Shared Libraries131103
-Node: Obsolete132328
-Node: Undocumented133025
-Node: Regexp133268
-Node: Regexp Usage134657
-Node: Escape Sequences136683
-Node: Regexp Operators142446
-Ref: Regexp Operators-Footnote-1149826
-Ref: Regexp Operators-Footnote-2149973
-Node: Bracket Expressions150071
-Ref: table-char-classes151961
-Node: GNU Regexp Operators154484
-Node: Case-sensitivity158207
-Ref: Case-sensitivity-Footnote-1161175
-Ref: Case-sensitivity-Footnote-2161410
-Node: Leftmost Longest161518
-Node: Computed Regexps162719
-Node: Reading Files166129
-Node: Records168133
-Ref: Records-Footnote-1176807
-Node: Fields176844
-Ref: Fields-Footnote-1179877
-Node: Nonconstant Fields179963
-Node: Changing Fields182165
-Node: Field Separators188146
-Node: Default Field Splitting190775
-Node: Regexp Field Splitting191892
-Node: Single Character Fields195234
-Node: Command Line Field Separator196293
-Node: Field Splitting Summary199734
-Ref: Field Splitting Summary-Footnote-1202926
-Node: Constant Size203027
-Node: Splitting By Content207611
-Ref: Splitting By Content-Footnote-1211337
-Node: Multiple Line211377
-Ref: Multiple Line-Footnote-1217224
-Node: Getline217403
-Node: Plain Getline219619
-Node: Getline/Variable221708
-Node: Getline/File222849
-Node: Getline/Variable/File224171
-Ref: Getline/Variable/File-Footnote-1225770
-Node: Getline/Pipe225857
-Node: Getline/Variable/Pipe228417
-Node: Getline/Coprocess229524
-Node: Getline/Variable/Coprocess230767
-Node: Getline Notes231481
-Node: Getline Summary233423
-Ref: table-getline-variants233831
-Node: Read Timeout234687
-Ref: Read Timeout-Footnote-1238432
-Node: Command line directories238489
-Node: Printing239119
-Node: Print240750
-Node: Print Examples242087
-Node: Output Separators244871
-Node: OFMT246631
-Node: Printf247989
-Node: Basic Printf248895
-Node: Control Letters250434
-Node: Format Modifiers254246
-Node: Printf Examples260255
-Node: Redirection262970
-Node: Special Files269954
-Node: Special FD270487
-Ref: Special FD-Footnote-1274112
-Node: Special Network274186
-Node: Special Caveats275036
-Node: Close Files And Pipes275832
-Ref: Close Files And Pipes-Footnote-1282855
-Ref: Close Files And Pipes-Footnote-2283003
-Node: Expressions283153
-Node: Values284285
-Node: Constants284961
-Node: Scalar Constants285641
-Ref: Scalar Constants-Footnote-1286500
-Node: Nondecimal-numbers286682
-Node: Regexp Constants289741
-Node: Using Constant Regexps290216
-Node: Variables293271
-Node: Using Variables293926
-Node: Assignment Options295650
-Node: Conversion297522
-Ref: table-locale-affects302898
-Ref: Conversion-Footnote-1303522
-Node: All Operators303631
-Node: Arithmetic Ops304261
-Node: Concatenation306766
-Ref: Concatenation-Footnote-1309559
-Node: Assignment Ops309679
-Ref: table-assign-ops314667
-Node: Increment Ops316075
-Node: Truth Values and Conditions319545
-Node: Truth Values320628
-Node: Typing and Comparison321677
-Node: Variable Typing322466
-Ref: Variable Typing-Footnote-1326363
-Node: Comparison Operators326485
-Ref: table-relational-ops326895
-Node: POSIX String Comparison330444
-Ref: POSIX String Comparison-Footnote-1331400
-Node: Boolean Ops331538
-Ref: Boolean Ops-Footnote-1335616
-Node: Conditional Exp335707
-Node: Function Calls337439
-Node: Precedence341033
-Node: Locales344702
-Node: Patterns and Actions345791
-Node: Pattern Overview346845
-Node: Regexp Patterns348514
-Node: Expression Patterns349057
-Node: Ranges352742
-Node: BEGIN/END355708
-Node: Using BEGIN/END356470
-Ref: Using BEGIN/END-Footnote-1359201
-Node: I/O And BEGIN/END359307
-Node: BEGINFILE/ENDFILE361589
-Node: Empty364482
-Node: Using Shell Variables364798
-Node: Action Overview367083
-Node: Statements369440
-Node: If Statement371294
-Node: While Statement372793
-Node: Do Statement374837
-Node: For Statement375993
-Node: Switch Statement379145
-Node: Break Statement381242
-Node: Continue Statement383232
-Node: Next Statement385025
-Node: Nextfile Statement387415
-Node: Exit Statement389960
-Node: Built-in Variables392376
-Node: User-modified393471
-Ref: User-modified-Footnote-1401826
-Node: Auto-set401888
-Ref: Auto-set-Footnote-1411796
-Node: ARGC and ARGV412001
-Node: Arrays415852
-Node: Array Basics417357
-Node: Array Intro418183
-Node: Reference to Elements422501
-Node: Assigning Elements424771
-Node: Array Example425262
-Node: Scanning an Array426994
-Node: Controlling Scanning429308
-Ref: Controlling Scanning-Footnote-1434241
-Node: Delete434557
-Ref: Delete-Footnote-1436992
-Node: Numeric Array Subscripts437049
-Node: Uninitialized Subscripts439232
-Node: Multi-dimensional440860
-Node: Multi-scanning443954
-Node: Arrays of Arrays445545
-Node: Functions450190
-Node: Built-in451012
-Node: Calling Built-in452090
-Node: Numeric Functions454078
-Ref: Numeric Functions-Footnote-1457910
-Ref: Numeric Functions-Footnote-2458267
-Ref: Numeric Functions-Footnote-3458315
-Node: String Functions458584
-Ref: String Functions-Footnote-1482081
-Ref: String Functions-Footnote-2482210
-Ref: String Functions-Footnote-3482458
-Node: Gory Details482545
-Ref: table-sub-escapes484224
-Ref: table-sub-posix-92485578
-Ref: table-sub-proposed486921
-Ref: table-posix-sub488271
-Ref: table-gensub-escapes489817
-Ref: Gory Details-Footnote-1491024
-Ref: Gory Details-Footnote-2491075
-Node: I/O Functions491226
-Ref: I/O Functions-Footnote-1497881
-Node: Time Functions498028
-Ref: Time Functions-Footnote-1508920
-Ref: Time Functions-Footnote-2508988
-Ref: Time Functions-Footnote-3509146
-Ref: Time Functions-Footnote-4509257
-Ref: Time Functions-Footnote-5509369
-Ref: Time Functions-Footnote-6509596
-Node: Bitwise Functions509862
-Ref: table-bitwise-ops510420
-Ref: Bitwise Functions-Footnote-1514641
-Node: Type Functions514825
-Node: I18N Functions515295
-Node: User-defined516922
-Node: Definition Syntax517726
-Ref: Definition Syntax-Footnote-1522636
-Node: Function Example522705
-Node: Function Caveats525299
-Node: Calling A Function525720
-Node: Variable Scope526835
-Node: Pass By Value/Reference528810
-Node: Return Statement532250
-Node: Dynamic Typing535231
-Node: Indirect Calls535966
-Node: Internationalization545651
-Node: I18N and L10N547090
-Node: Explaining gettext547776
-Ref: Explaining gettext-Footnote-1552842
-Ref: Explaining gettext-Footnote-2553026
-Node: Programmer i18n553191
-Node: Translator i18n557391
-Node: String Extraction558184
-Ref: String Extraction-Footnote-1559145
-Node: Printf Ordering559231
-Ref: Printf Ordering-Footnote-1562015
-Node: I18N Portability562079
-Ref: I18N Portability-Footnote-1564528
-Node: I18N Example564591
-Ref: I18N Example-Footnote-1567226
-Node: Gawk I18N567298
-Node: Arbitrary Precision Arithmetic567915
-Ref: Arbitrary Precision Arithmetic-Footnote-1570790
-Node: Floating-point Programming570938
-Node: Floating-point Representation576208
-Node: Floating-point Context577312
-Ref: table-ieee-formats578147
-Node: Rounding Mode579517
-Ref: table-rounding-modes580144
-Ref: Rounding Mode-Footnote-1583267
-Node: Arbitrary Precision Floats583448
-Ref: Arbitrary Precision Floats-Footnote-1585489
-Node: Setting Precision585800
-Node: Setting Rounding Mode588558
-Node: Floating-point Constants589475
-Node: Changing Precision590894
-Ref: Changing Precision-Footnote-1592294
-Node: Exact Arithmetic592467
-Node: Integer Programming595480
-Node: Arbitrary Precision Integers597260
-Ref: Arbitrary Precision Integers-Footnote-1600284
-Node: MPFR and GMP Libraries600430
-Node: Advanced Features600815
-Node: Nondecimal Data602338
-Node: Array Sorting603921
-Node: Controlling Array Traversal604618
-Node: Array Sorting Functions612855
-Ref: Array Sorting Functions-Footnote-1616529
-Ref: Array Sorting Functions-Footnote-2616622
-Node: Two-way I/O616816
-Ref: Two-way I/O-Footnote-1622248
-Node: TCP/IP Networking622318
-Node: Profiling625162
-Node: Library Functions632616
-Ref: Library Functions-Footnote-1635623
-Node: Library Names635794
-Ref: Library Names-Footnote-1639265
-Ref: Library Names-Footnote-2639485
-Node: General Functions639571
-Node: Strtonum Function640524
-Node: Assert Function643454
-Node: Round Function646780
-Node: Cliff Random Function648323
-Node: Ordinal Functions649339
-Ref: Ordinal Functions-Footnote-1652409
-Ref: Ordinal Functions-Footnote-2652661
-Node: Join Function652870
-Ref: Join Function-Footnote-1654641
-Node: Getlocaltime Function654841
-Node: Data File Management658556
-Node: Filetrans Function659188
-Node: Rewind Function663327
-Node: File Checking664714
-Node: Empty Files665808
-Node: Ignoring Assigns668038
-Node: Getopt Function669591
-Ref: Getopt Function-Footnote-1680895
-Node: Passwd Functions681098
-Ref: Passwd Functions-Footnote-1690073
-Node: Group Functions690161
-Node: Walking Arrays698245
-Node: Sample Programs699814
-Node: Running Examples700479
-Node: Clones701207
-Node: Cut Program702431
-Node: Egrep Program712276
-Ref: Egrep Program-Footnote-1720049
-Node: Id Program720159
-Node: Split Program723775
-Ref: Split Program-Footnote-1727294
-Node: Tee Program727422
-Node: Uniq Program730225
-Node: Wc Program737654
-Ref: Wc Program-Footnote-1741920
-Ref: Wc Program-Footnote-2742120
-Node: Miscellaneous Programs742212
-Node: Dupword Program743400
-Node: Alarm Program745431
-Node: Translate Program750180
-Ref: Translate Program-Footnote-1754567
-Ref: Translate Program-Footnote-2754795
-Node: Labels Program754929
-Ref: Labels Program-Footnote-1758300
-Node: Word Sorting758384
-Node: History Sorting762268
-Node: Extract Program764107
-Ref: Extract Program-Footnote-1771590
-Node: Simple Sed771718
-Node: Igawk Program774780
-Ref: Igawk Program-Footnote-1789937
-Ref: Igawk Program-Footnote-2790138
-Node: Anagram Program790276
-Node: Signature Program793344
-Node: Debugger794444
-Node: Debugging795396
-Node: Debugging Concepts795829
-Node: Debugging Terms797685
-Node: Awk Debugging800282
-Node: Sample Debugging Session801174
-Node: Debugger Invocation801694
-Node: Finding The Bug803023
-Node: List of Debugger Commands809511
-Node: Breakpoint Control810845
-Node: Debugger Execution Control814509
-Node: Viewing And Changing Data817869
-Node: Execution Stack821225
-Node: Debugger Info822692
-Node: Miscellaneous Debugger Commands826673
-Node: Readline Support832118
-Node: Limitations832949
-Node: Language History835201
-Node: V7/SVR3.1836713
-Node: SVR4839034
-Node: POSIX840476
-Node: BTL841484
-Node: POSIX/GNU842218
-Node: Common Extensions847509
-Node: Ranges and Locales848616
-Ref: Ranges and Locales-Footnote-1853220
-Node: Contributors853441
-Node: Installation857702
-Node: Gawk Distribution858596
-Node: Getting859080
-Node: Extracting859906
-Node: Distribution contents861598
-Node: Unix Installation866820
-Node: Quick Installation867437
-Node: Additional Configuration Options869399
-Node: Configuration Philosophy870876
-Node: Non-Unix Installation873218
-Node: PC Installation873676
-Node: PC Binary Installation874975
-Node: PC Compiling876823
-Node: PC Testing879767
-Node: PC Using880943
-Node: Cygwin885128
-Node: MSYS886128
-Node: VMS Installation886642
-Node: VMS Compilation887245
-Ref: VMS Compilation-Footnote-1888252
-Node: VMS Installation Details888310
-Node: VMS Running889945
-Node: VMS Old Gawk891552
-Node: Bugs892026
-Node: Other Versions895878
-Node: Notes901193
-Node: Compatibility Mode901885
-Node: Additions902668
-Node: Accessing The Source903480
-Node: Adding Code904905
-Node: New Ports910872
-Node: Dynamic Extensions914985
-Node: Internals916425
-Node: Plugin License925247
-Node: Loading Extensions925885
-Node: Sample Library927726
-Node: Internal File Description928416
-Node: Internal File Ops932131
-Ref: Internal File Ops-Footnote-1936696
-Node: Using Internal File Ops936836
-Node: Future Extensions939214
-Node: Basic Concepts941718
-Node: Basic High Level942475
-Ref: Basic High Level-Footnote-1946510
-Node: Basic Data Typing946695
-Node: Floating Point Issues951220
-Node: String Conversion Precision952303
-Ref: String Conversion Precision-Footnote-1954003
-Node: Unexpected Results954112
-Node: POSIX Floating Point Problems955938
-Ref: POSIX Floating Point Problems-Footnote-1959643
-Node: Glossary959681
-Node: Copying984657
-Node: GNU Free Documentation License1022214
-Node: Index1047351
+Node: Foreword31579
+Node: Preface35924
+Ref: Preface-Footnote-138977
+Ref: Preface-Footnote-239083
+Node: History39315
+Node: Names41706
+Ref: Names-Footnote-143183
+Node: This Manual43255
+Ref: This Manual-Footnote-148159
+Node: Conventions48259
+Node: Manual History50393
+Ref: Manual History-Footnote-153663
+Ref: Manual History-Footnote-253704
+Node: How To Contribute53778
+Node: Acknowledgments54922
+Node: Getting Started59418
+Node: Running gawk61797
+Node: One-shot62983
+Node: Read Terminal64208
+Ref: Read Terminal-Footnote-165858
+Ref: Read Terminal-Footnote-266134
+Node: Long66305
+Node: Executable Scripts67681
+Ref: Executable Scripts-Footnote-169550
+Ref: Executable Scripts-Footnote-269652
+Node: Comments70199
+Node: Quoting72666
+Node: DOS Quoting77289
+Node: Sample Data Files77964
+Node: Very Simple80996
+Node: Two Rules85595
+Node: More Complex87742
+Ref: More Complex-Footnote-190672
+Node: Statements/Lines90757
+Ref: Statements/Lines-Footnote-195219
+Node: Other Features95484
+Node: When96412
+Node: Invoking Gawk98559
+Node: Command Line100020
+Node: Options100803
+Ref: Options-Footnote-1116201
+Node: Other Arguments116226
+Node: Naming Standard Input118884
+Node: Environment Variables119978
+Node: AWKPATH Variable120536
+Ref: AWKPATH Variable-Footnote-1123294
+Node: AWKLIBPATH Variable123554
+Node: Other Environment Variables124151
+Node: Exit Status126646
+Node: Include Files127321
+Node: Loading Shared Libraries130890
+Node: Obsolete132115
+Node: Undocumented132812
+Node: Regexp133055
+Node: Regexp Usage134444
+Node: Escape Sequences136470
+Node: Regexp Operators142233
+Ref: Regexp Operators-Footnote-1149613
+Ref: Regexp Operators-Footnote-2149760
+Node: Bracket Expressions149858
+Ref: table-char-classes151748
+Node: GNU Regexp Operators154271
+Node: Case-sensitivity157994
+Ref: Case-sensitivity-Footnote-1160962
+Ref: Case-sensitivity-Footnote-2161197
+Node: Leftmost Longest161305
+Node: Computed Regexps162506
+Node: Reading Files165916
+Node: Records167919
+Ref: Records-Footnote-1176593
+Node: Fields176630
+Ref: Fields-Footnote-1179663
+Node: Nonconstant Fields179749
+Node: Changing Fields181951
+Node: Field Separators187932
+Node: Default Field Splitting190561
+Node: Regexp Field Splitting191678
+Node: Single Character Fields195020
+Node: Command Line Field Separator196079
+Node: Field Splitting Summary199520
+Ref: Field Splitting Summary-Footnote-1202712
+Node: Constant Size202813
+Node: Splitting By Content207397
+Ref: Splitting By Content-Footnote-1211123
+Node: Multiple Line211163
+Ref: Multiple Line-Footnote-1217010
+Node: Getline217189
+Node: Plain Getline219405
+Node: Getline/Variable221494
+Node: Getline/File222635
+Node: Getline/Variable/File223957
+Ref: Getline/Variable/File-Footnote-1225556
+Node: Getline/Pipe225643
+Node: Getline/Variable/Pipe228203
+Node: Getline/Coprocess229310
+Node: Getline/Variable/Coprocess230553
+Node: Getline Notes231267
+Node: Getline Summary233209
+Ref: table-getline-variants233617
+Node: Read Timeout234473
+Ref: Read Timeout-Footnote-1238218
+Node: Command line directories238275
+Node: Printing238905
+Node: Print240536
+Node: Print Examples241873
+Node: Output Separators244657
+Node: OFMT246417
+Node: Printf247775
+Node: Basic Printf248681
+Node: Control Letters250220
+Node: Format Modifiers254032
+Node: Printf Examples260041
+Node: Redirection262756
+Node: Special Files269740
+Node: Special FD270273
+Ref: Special FD-Footnote-1273898
+Node: Special Network273972
+Node: Special Caveats274822
+Node: Close Files And Pipes275618
+Ref: Close Files And Pipes-Footnote-1282641
+Ref: Close Files And Pipes-Footnote-2282789
+Node: Expressions282939
+Node: Values284071
+Node: Constants284747
+Node: Scalar Constants285427
+Ref: Scalar Constants-Footnote-1286286
+Node: Nondecimal-numbers286468
+Node: Regexp Constants289527
+Node: Using Constant Regexps290002
+Node: Variables293057
+Node: Using Variables293712
+Node: Assignment Options295436
+Node: Conversion297308
+Ref: table-locale-affects302684
+Ref: Conversion-Footnote-1303308
+Node: All Operators303417
+Node: Arithmetic Ops304047
+Node: Concatenation306552
+Ref: Concatenation-Footnote-1309345
+Node: Assignment Ops309465
+Ref: table-assign-ops314453
+Node: Increment Ops315861
+Node: Truth Values and Conditions319331
+Node: Truth Values320414
+Node: Typing and Comparison321463
+Node: Variable Typing322252
+Ref: Variable Typing-Footnote-1326149
+Node: Comparison Operators326271
+Ref: table-relational-ops326681
+Node: POSIX String Comparison330230
+Ref: POSIX String Comparison-Footnote-1331186
+Node: Boolean Ops331324
+Ref: Boolean Ops-Footnote-1335402
+Node: Conditional Exp335493
+Node: Function Calls337225
+Node: Precedence340819
+Node: Locales344488
+Node: Patterns and Actions345577
+Node: Pattern Overview346631
+Node: Regexp Patterns348300
+Node: Expression Patterns348843
+Node: Ranges352528
+Node: BEGIN/END355494
+Node: Using BEGIN/END356256
+Ref: Using BEGIN/END-Footnote-1358987
+Node: I/O And BEGIN/END359093
+Node: BEGINFILE/ENDFILE361375
+Node: Empty364279
+Node: Using Shell Variables364595
+Node: Action Overview366880
+Node: Statements369237
+Node: If Statement371091
+Node: While Statement372590
+Node: Do Statement374634
+Node: For Statement375790
+Node: Switch Statement378942
+Node: Break Statement381039
+Node: Continue Statement383029
+Node: Next Statement384822
+Node: Nextfile Statement387212
+Node: Exit Statement389757
+Node: Built-in Variables392173
+Node: User-modified393268
+Ref: User-modified-Footnote-1401623
+Node: Auto-set401685
+Ref: Auto-set-Footnote-1411593
+Node: ARGC and ARGV411798
+Node: Arrays415649
+Node: Array Basics417154
+Node: Array Intro417980
+Node: Reference to Elements422298
+Node: Assigning Elements424568
+Node: Array Example425059
+Node: Scanning an Array426791
+Node: Controlling Scanning429105
+Ref: Controlling Scanning-Footnote-1434038
+Node: Delete434354
+Ref: Delete-Footnote-1436789
+Node: Numeric Array Subscripts436846
+Node: Uninitialized Subscripts439029
+Node: Multi-dimensional440657
+Node: Multi-scanning443751
+Node: Arrays of Arrays445342
+Node: Functions449987
+Node: Built-in450809
+Node: Calling Built-in451887
+Node: Numeric Functions453875
+Ref: Numeric Functions-Footnote-1457707
+Ref: Numeric Functions-Footnote-2458064
+Ref: Numeric Functions-Footnote-3458112
+Node: String Functions458381
+Ref: String Functions-Footnote-1481878
+Ref: String Functions-Footnote-2482007
+Ref: String Functions-Footnote-3482255
+Node: Gory Details482342
+Ref: table-sub-escapes484021
+Ref: table-sub-posix-92485375
+Ref: table-sub-proposed486718
+Ref: table-posix-sub488068
+Ref: table-gensub-escapes489614
+Ref: Gory Details-Footnote-1490821
+Ref: Gory Details-Footnote-2490872
+Node: I/O Functions491023
+Ref: I/O Functions-Footnote-1497678
+Node: Time Functions497825
+Ref: Time Functions-Footnote-1508717
+Ref: Time Functions-Footnote-2508785
+Ref: Time Functions-Footnote-3508943
+Ref: Time Functions-Footnote-4509054
+Ref: Time Functions-Footnote-5509166
+Ref: Time Functions-Footnote-6509393
+Node: Bitwise Functions509659
+Ref: table-bitwise-ops510217
+Ref: Bitwise Functions-Footnote-1514438
+Node: Type Functions514622
+Node: I18N Functions515092
+Node: User-defined516719
+Node: Definition Syntax517523
+Ref: Definition Syntax-Footnote-1522433
+Node: Function Example522502
+Node: Function Caveats525096
+Node: Calling A Function525517
+Node: Variable Scope526632
+Node: Pass By Value/Reference528607
+Node: Return Statement532047
+Node: Dynamic Typing535028
+Node: Indirect Calls535763
+Node: Internationalization545448
+Node: I18N and L10N546887
+Node: Explaining gettext547573
+Ref: Explaining gettext-Footnote-1552639
+Ref: Explaining gettext-Footnote-2552823
+Node: Programmer i18n552988
+Node: Translator i18n557188
+Node: String Extraction557981
+Ref: String Extraction-Footnote-1558942
+Node: Printf Ordering559028
+Ref: Printf Ordering-Footnote-1561812
+Node: I18N Portability561876
+Ref: I18N Portability-Footnote-1564325
+Node: I18N Example564388
+Ref: I18N Example-Footnote-1567023
+Node: Gawk I18N567095
+Node: Arbitrary Precision Arithmetic567712
+Ref: Arbitrary Precision Arithmetic-Footnote-1570464
+Node: Floating-point Programming570612
+Node: Floating-point Representation575882
+Node: Floating-point Context576986
+Ref: table-ieee-formats577821
+Node: Rounding Mode579191
+Ref: table-rounding-modes579818
+Ref: Rounding Mode-Footnote-1582941
+Node: Arbitrary Precision Floats583122
+Ref: Arbitrary Precision Floats-Footnote-1585163
+Node: Setting Precision585474
+Node: Setting Rounding Mode588232
+Node: Floating-point Constants589149
+Node: Changing Precision590568
+Ref: Changing Precision-Footnote-1591968
+Node: Exact Arithmetic592141
+Node: Integer Programming595154
+Node: Arbitrary Precision Integers596934
+Ref: Arbitrary Precision Integers-Footnote-1599958
+Node: MPFR and GMP Libraries600104
+Node: Advanced Features600489
+Node: Nondecimal Data602012
+Node: Array Sorting603595
+Node: Controlling Array Traversal604292
+Node: Array Sorting Functions612529
+Ref: Array Sorting Functions-Footnote-1616203
+Ref: Array Sorting Functions-Footnote-2616296
+Node: Two-way I/O616490
+Ref: Two-way I/O-Footnote-1621922
+Node: TCP/IP Networking621992
+Node: Profiling624836
+Node: Library Functions632290
+Ref: Library Functions-Footnote-1635297
+Node: Library Names635468
+Ref: Library Names-Footnote-1638939
+Ref: Library Names-Footnote-2639159
+Node: General Functions639245
+Node: Strtonum Function640198
+Node: Assert Function643128
+Node: Round Function646454
+Node: Cliff Random Function647997
+Node: Ordinal Functions649013
+Ref: Ordinal Functions-Footnote-1652083
+Ref: Ordinal Functions-Footnote-2652335
+Node: Join Function652544
+Ref: Join Function-Footnote-1654315
+Node: Getlocaltime Function654515
+Node: Data File Management658230
+Node: Filetrans Function658862
+Node: Rewind Function663001
+Node: File Checking664388
+Node: Empty Files665482
+Node: Ignoring Assigns667712
+Node: Getopt Function669265
+Ref: Getopt Function-Footnote-1680569
+Node: Passwd Functions680772
+Ref: Passwd Functions-Footnote-1689747
+Node: Group Functions689835
+Node: Walking Arrays697919
+Node: Sample Programs699488
+Node: Running Examples700153
+Node: Clones700881
+Node: Cut Program702105
+Node: Egrep Program711950
+Ref: Egrep Program-Footnote-1719723
+Node: Id Program719833
+Node: Split Program723449
+Ref: Split Program-Footnote-1726968
+Node: Tee Program727096
+Node: Uniq Program729899
+Node: Wc Program737328
+Ref: Wc Program-Footnote-1741594
+Ref: Wc Program-Footnote-2741794
+Node: Miscellaneous Programs741886
+Node: Dupword Program743074
+Node: Alarm Program745105
+Node: Translate Program749854
+Ref: Translate Program-Footnote-1754241
+Ref: Translate Program-Footnote-2754469
+Node: Labels Program754603
+Ref: Labels Program-Footnote-1757974
+Node: Word Sorting758058
+Node: History Sorting761942
+Node: Extract Program763781
+Ref: Extract Program-Footnote-1771264
+Node: Simple Sed771392
+Node: Igawk Program774454
+Ref: Igawk Program-Footnote-1789611
+Ref: Igawk Program-Footnote-2789812
+Node: Anagram Program789950
+Node: Signature Program793018
+Node: Debugger794118
+Node: Debugging795072
+Node: Debugging Concepts795505
+Node: Debugging Terms797361
+Node: Awk Debugging799958
+Node: Sample Debugging Session800850
+Node: Debugger Invocation801370
+Node: Finding The Bug802699
+Node: List of Debugger Commands809187
+Node: Breakpoint Control810521
+Node: Debugger Execution Control814185
+Node: Viewing And Changing Data817545
+Node: Execution Stack820901
+Node: Debugger Info822368
+Node: Miscellaneous Debugger Commands826349
+Node: Readline Support831794
+Node: Limitations832625
+Node: Dynamic Extensions834877
+Node: Plugin License835773
+Node: Sample Library836387
+Node: Internal File Description837071
+Node: Internal File Ops840784
+Ref: Internal File Ops-Footnote-1845347
+Node: Using Internal File Ops845487
+Node: Language History847863
+Node: V7/SVR3.1849385
+Node: SVR4851706
+Node: POSIX853148
+Node: BTL854156
+Node: POSIX/GNU854890
+Node: Common Extensions860146
+Node: Ranges and Locales861253
+Ref: Ranges and Locales-Footnote-1865857
+Node: Contributors866078
+Node: Installation870374
+Node: Gawk Distribution871268
+Node: Getting871752
+Node: Extracting872578
+Node: Distribution contents874270
+Node: Unix Installation879492
+Node: Quick Installation880109
+Node: Additional Configuration Options882071
+Node: Configuration Philosophy883548
+Node: Non-Unix Installation885890
+Node: PC Installation886348
+Node: PC Binary Installation887647
+Node: PC Compiling889495
+Node: PC Testing892439
+Node: PC Using893615
+Node: Cygwin897800
+Node: MSYS898800
+Node: VMS Installation899314
+Node: VMS Compilation899917
+Ref: VMS Compilation-Footnote-1900924
+Node: VMS Installation Details900982
+Node: VMS Running902617
+Node: VMS Old Gawk904224
+Node: Bugs904698
+Node: Other Versions908550
+Node: Notes913865
+Node: Compatibility Mode914452
+Node: Additions915235
+Node: Accessing The Source916046
+Node: Adding Code917471
+Node: New Ports923479
+Node: Future Extensions927592
+Node: Basic Concepts929079
+Node: Basic High Level929836
+Ref: Basic High Level-Footnote-1933871
+Node: Basic Data Typing934056
+Node: Floating Point Issues938581
+Node: String Conversion Precision939664
+Ref: String Conversion Precision-Footnote-1941364
+Node: Unexpected Results941473
+Node: POSIX Floating Point Problems943299
+Ref: POSIX Floating Point Problems-Footnote-1947004
+Node: Glossary947042
+Node: Copying972018
+Node: GNU Free Documentation License1009575
+Node: Index1034712

End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 12b77556..ceea9a92 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -295,12 +295,14 @@ particular records in a file and perform operations upon them.
* Sample Programs:: Many @command{awk} programs with complete
explanations.
* Debugger:: The @code{gawk} debugger.
+* Dynamic Extensions:: Adding new built-in functions to
+ @command{gawk}.
* Language History:: The evolution of the @command{awk}
language.
* Installation:: Installing @command{gawk} under various
operating systems.
-* Notes:: Notes about @command{gawk} extensions and
- possible future work.
+* Notes:: Notes about adding things to @command{gawk}
+ and possible future work.
* Basic Concepts:: A very quick introduction to programming
concepts.
* Glossary:: An explanation of some unfamiliar terms.
@@ -558,21 +560,22 @@ particular records in a file and perform operations upon them.
* I18N Portability:: @command{awk}-level portability issues.
* I18N Example:: A simple i18n example.
* Gawk I18N:: @command{gawk} is also internationalized.
-* Floating-point Programming:: Effective floating-point programming.
-* Floating-point Representation:: Binary floating-point representation.
-* Floating-point Context:: Floating-point context.
-* Rounding Mode:: Floating-point rounding mode.
-* Arbitrary Precision Floats:: Arbitrary precision floating-point
- arithmetic with @command{gawk}.
-* Setting Precision:: Setting the working precision.
-* Setting Rounding Mode:: Setting the rounding mode.
-* Floating-point Constants:: Representing floating-point constants.
-* Changing Precision:: Changing the precision of a number.
-* Exact Arithmetic:: Exact arithmetic with floating-point numbers.
-* Integer Programming:: Effective integer programming.
-* Arbitrary Precision Integers:: Arbitrary precision integer
- arithmetic with @command{gawk}.
-* MPFR and GMP Libraries:: Information about the MPFR and GMP libraries.
+* Floating-point Programming:: Effective Floating-point Programming.
+* Floating-point Representation:: Binary Floating-point Representation.
+* Floating-point Context:: Floating-point Context.
+* Rounding Mode:: Floating-point Rounding Mode.
+* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
+ Arithmetic with @command{gawk}.
+* Setting Precision:: Setting the Working Precision.
+* Setting Rounding Mode:: Setting the Rounding Mode.
+* Floating-point Constants:: Representing Floating-point Constants.
+* Changing Precision:: Changing the Precision of a Number.
+* Exact Arithmetic:: Exact Arithmetic with Floating-point
+ Numbers.
+* Integer Programming:: Effective Integer Programming.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
+ @command{gawk}.
+* MPFR and GMP Libraries ::
* Nondecimal Data:: Allowing nondecimal input data.
* Array Sorting:: Facilities for controlling array traversal
and sorting arrays.
@@ -637,14 +640,14 @@ particular records in a file and perform operations upon them.
* Anagram Program:: Finding anagrams from a dictionary.
* Signature Program:: People do amazing things with too much time
on their hands.
-* Debugging:: Introduction to @command{gawk} Debugger.
+* Debugging:: Introduction to @command{gawk} debugger.
* Debugging Concepts:: Debugging in General.
* Debugging Terms:: Additional Debugging Concepts.
* Awk Debugging:: Awk Debugging.
-* Sample Debugging Session:: Sample Debugging Session.
+* Sample Debugging Session:: Sample debugging session.
* Debugger Invocation:: How to Start the Debugger.
* Finding The Bug:: Finding the Bug.
-* List of Debugger Commands:: Main Commands.
+* List of Debugger Commands:: Main debugger commands.
* Breakpoint Control:: Control of Breakpoints.
* Debugger Execution Control:: Control of Execution.
* Viewing And Changing Data:: Viewing and Changing Data.
@@ -652,8 +655,13 @@ particular records in a file and perform operations upon them.
* Debugger Info:: Obtaining Information about the Program and
the Debugger State.
* Miscellaneous Debugger Commands:: Miscellaneous Commands.
-* Readline Support:: Readline Support.
-* Limitations:: Limitations and Future Plans.
+* Readline Support:: Readline support.
+* Limitations:: Limitations and future plans.
+* Plugin License:: A note about licensing.
+* Sample Library:: An example of new functions.
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
* V7/SVR3.1:: The major changes between V7 and System V
Release 3.1.
* SVR4:: Minor changes between System V Releases 3.1
@@ -704,16 +712,6 @@ particular records in a file and perform operations upon them.
@command{gawk}.
* New Ports:: Porting @command{gawk} to a new operating
system.
-* Dynamic Extensions:: Adding new built-in functions to
- @command{gawk}.
-* Internals:: A brief look at some @command{gawk}
- internals.
-* Plugin License:: A note about licensing.
-* Loading Extensions:: How to load dynamic extensions.
-* Sample Library:: A example of new functions.
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
* Future Extensions:: New features that may be implemented one
day.
* Basic High Level:: The high level view.
@@ -1206,8 +1204,7 @@ available @command{awk} implementations.
@ref{Notes},
describes how to disable @command{gawk}'s extensions, as
well as how to contribute new code to @command{gawk},
-how to write extension libraries, and some possible
-future directions for @command{gawk} development.
+and some possible future directions for @command{gawk} development.
@ref{Basic Concepts},
provides some very cursory background material for those who
@@ -3616,8 +3613,8 @@ behaves.
@menu
* AWKPATH Variable:: Searching directories for @command{awk}
programs.
-* AWKLIBPATH Variable:: Searching directories for @command{awk}
- shared libraries.
+* AWKLIBPATH Variable:: Searching directories for @command{awk} shared
+ libraries.
* Other Environment Variables:: The environment variables.
@end menu
@@ -5263,7 +5260,6 @@ used with it do not have to be named on the @command{awk} command line
* Getline:: Reading files under explicit program control
using the @code{getline} function.
* Read Timeout:: Reading input with a timeout.
-
* Command line directories:: What happens if you put a directory on the
command line.
@end menu
@@ -11565,9 +11561,9 @@ fatal error.
@item
If you have written extensions that modify the record handling (by inserting
-an ``open hook''), you can invoke them at this point, before @command{gawk}
+an ``input parser''), you can invoke them at this point, before @command{gawk}
has started processing the file. (This is a @emph{very} advanced feature,
-currently used only by the @uref{http://xmlgawk.sourceforge.net, XMLgawk project}.)
+currently used only by the @uref{http://gawkextlib.sourceforge.net, @code{gawkextlib} project}.)
@end itemize
The @code{ENDFILE} rule is called when @command{gawk} has finished processing
@@ -18508,21 +18504,22 @@ in general, and the limitations of doing arithmetic with ordinary
@command{gawk} numbers.
@menu
-* Floating-point Programming:: Effective Floating-point Programming.
-* Floating-point Representation:: Binary Floating-point Representation.
-* Floating-point Context:: Floating-point Context.
-* Rounding Mode:: Floating-point Rounding Mode.
-* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
- Arithmetic with @command{gawk}.
-* Setting Precision:: Setting the Working Precision.
-* Setting Rounding Mode:: Setting the Rounding Mode.
-* Floating-point Constants:: Representing Floating-point Constants.
-* Changing Precision:: Changing the Precision of a Number.
-* Exact Arithmetic:: Exact Arithmetic with Floating-point Numbers.
-* Integer Programming:: Effective Integer Programming.
-* Arbitrary Precision Integers:: Arbitrary Precision Integer
- Arithmetic with @command{gawk}.
-* MPFR and GMP Libraries:: Information About the MPFR and GMP Libraries.
+* Floating-point Programming:: Effective Floating-point Programming.
+* Floating-point Representation:: Binary Floating-point Representation.
+* Floating-point Context:: Floating-point Context.
+* Rounding Mode:: Floating-point Rounding Mode.
+* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
+ Arithmetic with @command{gawk}.
+* Setting Precision:: Setting the Working Precision.
+* Setting Rounding Mode:: Setting the Rounding Mode.
+* Floating-point Constants:: Representing Floating-point Constants.
+* Changing Precision:: Changing the Precision of a Number.
+* Exact Arithmetic:: Exact Arithmetic with Floating-point
+ Numbers.
+* Integer Programming:: Effective Integer Programming.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
+ @command{gawk}.
+* MPFR and GMP Libraries ::
@end menu
@node Floating-point Programming
@@ -27530,6 +27527,471 @@ The @command{gawk} debugger only accepts source supplied with the @option{-f} op
Look forward to a future release when these and other missing features may
be added, and of course feel free to try to add them yourself!
+@node Dynamic Extensions
+@chapter Writing Extensions for @command{gawk}
+
+This chapter is a placeholder, pending a rewrite for the new API.
+Some of the old bits remain, since they can be partially reused.
+
+
+@c STARTOFRANGE gladfgaw
+@cindex @command{gawk}, functions, adding
+@c STARTOFRANGE adfugaw
+@cindex adding, functions to @command{gawk}
+@c STARTOFRANGE fubadgaw
+@cindex functions, built-in, adding to @command{gawk}
+It is possible to add new built-in
+functions to @command{gawk} using dynamically loaded libraries. This
+facility is available on systems (such as GNU/Linux) that support
+the C @code{dlopen()} and @code{dlsym()} functions.
+This @value{CHAPTER} describes how to write and use dynamically
+loaded extensions for @command{gawk}.
+Experience with programming in
+C or C++ is necessary when reading this @value{CHAPTER}.
+
+@quotation NOTE
+When @option{--sandbox} is specified, extensions are disabled
+(@pxref{Options}).
+@end quotation
+
+@menu
+* Plugin License:: A note about licensing.
+* Sample Library:: An example of new functions.
+@end menu
+
+@node Plugin License
+@section Extension Licensing
+
+Every dynamic extension should define the global symbol
+@code{plugin_is_GPL_compatible} to assert that it has been licensed under
+a GPL-compatible license. If this symbol does not exist, @command{gawk}
+will emit a fatal error and exit.
+
+The declared type of the symbol should be @code{int}. It does not need
+to be in any allocated section, though. The code merely asserts that
+the symbol exists in the global scope. Something like this is enough:
+
+@example
+int plugin_is_GPL_compatible;
+@end example
+
+@node Sample Library
+@section Example: Directory and File Operation Built-ins
+@c STARTOFRANGE chdirg
+@cindex @code{chdir()} function@comma{} implementing in @command{gawk}
+@c STARTOFRANGE statg
+@cindex @code{stat()} function@comma{} implementing in @command{gawk}
+@c STARTOFRANGE filre
+@cindex files, information about@comma{} retrieving
+@c STARTOFRANGE dirch
+@cindex directories, changing
+
+Two useful functions that are not in @command{awk} are @code{chdir()}
+(so that an @command{awk} program can change its directory) and
+@code{stat()} (so that an @command{awk} program can gather information about
+a file).
+This @value{SECTION} implements these functions for @command{gawk} in an
+external extension library.
+
+@menu
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
+@end menu
+
+@node Internal File Description
+@subsection Using @code{chdir()} and @code{stat()}
+
+This @value{SECTION} shows how to use the new functions at the @command{awk}
+level once they've been integrated into the running @command{gawk}
+interpreter.
+Using @code{chdir()} is very straightforward. It takes one argument,
+the new directory to change to:
+
+@example
+@dots{}
+newdir = "/home/arnold/funstuff"
+ret = chdir(newdir)
+if (ret < 0) @{
+ printf("could not change to %s: %s\n",
+ newdir, ERRNO) > "/dev/stderr"
+ exit 1
+@}
+@dots{}
+@end example
+
+The return value is negative if the @code{chdir()} call fails,
+and @code{ERRNO}
+(@pxref{Built-in Variables})
+is set to a string indicating the error.
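At the C level, the behavior described above corresponds to the
@code{chdir()} system call returning @minus{}1 and setting @code{errno}.
The following is a minimal standalone sketch of that behavior, not code
from @command{gawk}; the helper name @code{try_chdir} is hypothetical, and
@code{strerror()} is used here only to illustrate where an error string
can come from:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * try_chdir --- hypothetical helper, not part of gawk.
 * Returns what chdir() returns: 0 on success, -1 on failure.
 * On failure, prints the strerror() text for errno to stderr.
 */
int
try_chdir(const char *newdir)
{
	int ret = chdir(newdir);

	if (ret < 0)
		fprintf(stderr, "could not change to %s: %s\n",
			newdir, strerror(errno));
	return ret;
}
```

A caller only needs to test for a negative result, just as the
@command{awk} fragment above does.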
+
+Using @code{stat()} is a bit more complicated.
+The C @code{stat()} function fills in a structure that has a fair
+amount of information.
+The right way to model this in @command{awk} is to fill in an associative
+array with the appropriate information:
+
+@c broke printf for page breaking
+@example
+file = "/home/arnold/.profile"
+fdata[1] = "x" # force `fdata' to be an array
+ret = stat(file, fdata)
+if (ret < 0) @{
+ printf("could not stat %s: %s\n",
+ file, ERRNO) > "/dev/stderr"
+ exit 1
+@}
+printf("size of %s is %d bytes\n", file, fdata["size"])
+@end example
+
+The @code{stat()} function always clears the data array, even if
+the @code{stat()} fails. It fills in the following elements:
+
+@table @code
+@item "name"
+The name of the file that was @code{stat()}'ed.
+
+@item "dev"
+@itemx "ino"
+The file's device and inode numbers, respectively.
+
+@item "mode"
+The file's mode, as a numeric value. This includes both the file's
+type and its permissions.
+
+@item "nlink"
+The number of hard links (directory entries) the file has.
+
+@item "uid"
+@itemx "gid"
+The numeric user and group ID numbers of the file's owner.
+
+@item "size"
+The size in bytes of the file.
+
+@item "blocks"
+The number of disk blocks the file actually occupies. This may not
+be a function of the file's size if the file has holes.
+
+@item "atime"
+@itemx "mtime"
+@itemx "ctime"
+The file's last access, modification, and inode update times,
+respectively. These are numeric timestamps, suitable for formatting
+with @code{strftime()}
+(@pxref{Built-in}).
+
+@item "pmode"
+The file's ``printable mode.'' This is a string representation of
+the file's type and permissions, such as what is produced by
+@samp{ls -l}---for example, @code{"drwxr-xr-x"}.
+
+@item "type"
+A printable string representation of the file's type. The value
+is one of the following:
+
+@table @code
+@item "blockdev"
+@itemx "chardev"
+The file is a block or character device (``special file'').
+
+@ignore
+@item "door"
+The file is a Solaris ``door'' (special file used for
+interprocess communications).
+@end ignore
+
+@item "directory"
+The file is a directory.
+
+@item "fifo"
+The file is a named pipe (also known as a FIFO).
+
+@item "file"
+The file is just a regular file.
+
+@item "socket"
+The file is an @code{AF_UNIX} (``Unix domain'') socket in the
+filesystem.
+
+@item "symlink"
+The file is a symbolic link.
+@end table
+@end table
+
+Several additional elements may be present depending upon the operating
+system and the type of the file. You can test for them in your @command{awk}
+program by using the @code{in} operator
+(@pxref{Reference to Elements}):
+
+@table @code
+@item "blksize"
+The preferred block size for I/O to the file. This field is not
+present on all POSIX-like systems in the C @code{stat} structure.
+
+@item "linkval"
+If the file is a symbolic link, this element is the name of the
+file the link points to (i.e., the value of the link).
+
+@item "rdev"
+@itemx "major"
+@itemx "minor"
+If the file is a block or character device file, then these values
+represent the numeric device number and the major and minor components
+of that number, respectively.
+@end table
+
+@node Internal File Ops
+@subsection C Code for @code{chdir()} and @code{stat()}
+
+Here is the C code for these extensions. They were written for
+GNU/Linux. The code needs some more work for complete portability
+to other POSIX-compliant systems:@footnote{This version is edited
+slightly for presentation. See
+@file{extension/filefuncs.c} in the @command{gawk} distribution
+for the complete version.}
+
+@c break line for page breaking
+@example
+#include "awk.h"
+
+#include <sys/sysmacros.h>
+
+int plugin_is_GPL_compatible;
+
+/* do_chdir --- provide dynamically loaded chdir() builtin for gawk */
+
+static NODE *
+do_chdir(int nargs)
+@{
+ NODE *newdir;
+ int ret = -1;
+
+ if (do_lint && nargs != 1)
+ lintwarn("chdir: called with incorrect number of arguments");
+
+ newdir = get_scalar_argument(0, FALSE);
+@end example
+
+The file includes the @code{"awk.h"} header file for definitions
+of the @command{gawk} internals. It includes @code{<sys/sysmacros.h>}
+for access to the @code{major()} and @code{minor()} macros.
+
+@cindex programming conventions, @command{gawk} internals
+By convention, for an @command{awk} function @code{foo}, the function that
+implements it is called @samp{do_foo}. The function should take
+an @samp{int} argument, usually called @code{nargs}, that
+represents the number of defined arguments for the function. The @code{newdir}
+variable represents the new directory to change to, retrieved
+with @code{get_scalar_argument()}. Note that the first argument is
+numbered zero.
+
+The next fragment performs the actual @code{chdir()}. It first forces
+the argument to be a string and passes the string value to the
+@code{chdir()} system call. If the @code{chdir()} fails, @code{ERRNO}
+is updated.
+
+@example
+ (void) force_string(newdir);
+ ret = chdir(newdir->stptr);
+ if (ret < 0)
+ update_ERRNO_int(errno);
+@end example
+
+Finally, the function returns the return value to the @command{awk} level:
+
+@example
+ return make_number((AWKNUM) ret);
+@}
+@end example
+
+The @code{stat()} built-in is more involved. First comes a function
+that turns a numeric mode into a printable representation
+(e.g., octal 644 becomes @samp{-rw-r--r--}). Its body is omitted here for brevity:
+
+@c break line for page breaking
+@example
+/* format_mode --- turn a stat mode field into something readable */
+
+static char *
+format_mode(unsigned long fmode)
+@{
+ @dots{}
+@}
+@end example
+
+Next comes the @code{do_stat()} function. It starts with
+variable declarations and argument checking:
+
+@ignore
+Changed message for page breaking. Used to be:
+ "stat: called with incorrect number of arguments (%d), should be 2",
+@end ignore
+@example
+/* do_stat --- provide a stat() function for gawk */
+
+static NODE *
+do_stat(int nargs)
+@{
+ NODE *file, *array, *tmp;
+ struct stat sbuf;
+ int ret;
+ NODE **aptr;
+ char *pmode; /* printable mode */
+ char *type = "unknown";
+
+ if (do_lint && nargs > 2)
+ lintwarn("stat: called with too many arguments");
+@end example
+
+Then comes the actual work. First, the function gets the arguments.
+Then, it always clears the array.
+The code uses @code{lstat()} (instead of @code{stat()})
+to get the file information,
+in case the file is a symbolic link.
+If there's an error, it sets @code{ERRNO} and returns:
+
+@c comment made multiline for page breaking
+@example
+ /* file is first arg, array to hold results is second */
+ file = get_scalar_argument(0, FALSE);
+ array = get_array_argument(1, FALSE);
+
+ /* empty out the array */
+ assoc_clear(array);
+
+ /* lstat the file, if error, set ERRNO and return */
+ (void) force_string(file);
+ ret = lstat(file->stptr, & sbuf);
+ if (ret < 0) @{
+ update_ERRNO_int(errno);
+ return make_number((AWKNUM) ret);
+ @}
+@end example
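The choice of @code{lstat()} matters because @code{stat()} follows a
symbolic link and reports on its target, whereas @code{lstat()} reports
on the link itself. The following standalone sketch illustrates the
point. The helper name @code{is_symlink} is hypothetical and is not part
of @command{gawk}:

```c
#include <sys/stat.h>

/*
 * is_symlink --- hypothetical helper, not part of gawk.
 * Returns 1 if `path' itself is a symbolic link, 0 if it is not,
 * and -1 if lstat() fails (errno is left as set by lstat()).
 */
int
is_symlink(const char *path)
{
	struct stat sbuf;

	if (lstat(path, &sbuf) < 0)
		return -1;
	return S_ISLNK(sbuf.st_mode) ? 1 : 0;
}
```

Had the helper called @code{stat()} instead, @code{S_ISLNK()} would
never be true, since the structure would describe the link's target.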
+
+Now comes the tedious part: filling in the array. Only a few of the
+calls are shown here, since they all follow the same pattern:
+
+@example
+ /* fill in the array */
+ aptr = assoc_lookup(array, tmp = make_string("name", 4));
+ *aptr = dupnode(file);
+ unref(tmp);
+
+ aptr = assoc_lookup(array, tmp = make_string("mode", 4));
+ *aptr = make_number((AWKNUM) sbuf.st_mode);
+ unref(tmp);
+
+ aptr = assoc_lookup(array, tmp = make_string("pmode", 5));
+ pmode = format_mode(sbuf.st_mode);
+ *aptr = make_string(pmode, strlen(pmode));
+ unref(tmp);
+@end example
+
+When done, the function returns the result of the @code{lstat()} call:
+
+@example
+
+ return make_number((AWKNUM) ret);
+@}
+@end example
+
+@cindex programming conventions, @command{gawk} internals
+Finally, it's necessary to provide the ``glue'' that loads the
+new function(s) into @command{gawk}. By convention, each library has
+a routine named @code{dl_load()} that does the job. The simplest way
+is to use the @code{dl_load_func} macro in @code{gawkapi.h}.
+
+And that's it! As an exercise, consider adding functions to
+implement system calls such as @code{chown()}, @code{chmod()},
+and @code{umask()}.
+
+@node Using Internal File Ops
+@subsection Integrating the Extensions
+
+@cindex @command{gawk}, interpreter@comma{} adding code to
+Now that the code is written, it must be possible to add it at
+runtime to the running @command{gawk} interpreter. First, the
+code must be compiled. Assuming that the functions are in
+a file named @file{filefuncs.c}, and @var{idir} is the location
+of the @command{gawk} include files,
+the following steps create
+a GNU/Linux shared library:
+
+@example
+$ @kbd{gcc -fPIC -shared -DHAVE_CONFIG_H -c -O -g -I@var{idir} filefuncs.c}
+$ @kbd{ld -o filefuncs.so -shared filefuncs.o}
+@end example
+
+@cindex @code{extension()} function (@command{gawk})
+Once the library exists, it is loaded by calling the @code{extension()}
+built-in function.
+@code{extension()} takes two arguments: the name of the
+library to load and the name of an initialization function to call when
+the library is first loaded. The initialization function adds the new
+functions to @command{gawk}. @code{extension()} returns the value
+returned by the initialization function within the shared library:
+
+@example
+# file testff.awk
+BEGIN @{
+    extension("./filefuncs.so", "dl_load")
+
+    chdir(".") # no-op
+
+    data[1] = 1 # force `data' to be an array
+    print "Info for testff.awk"
+    ret = stat("testff.awk", data)
+    print "ret =", ret
+    for (i in data)
+        printf "data[\"%s\"] = %s\n", i, data[i]
+    print "testff.awk modified:",
+        strftime("%m %d %y %H:%M:%S", data["mtime"])
+
+    print "\nInfo for JUNK"
+    ret = stat("JUNK", data)
+    print "ret =", ret
+    for (i in data)
+        printf "data[\"%s\"] = %s\n", i, data[i]
+    print "JUNK modified:", strftime("%m %d %y %H:%M:%S", data["mtime"])
+@}
+@end example
+
+Here are the results of running the program:
+
+@example
+$ @kbd{gawk -f testff.awk}
+@print{} Info for testff.awk
+@print{} ret = 0
+@print{} data["size"] = 607
+@print{} data["ino"] = 14945891
+@print{} data["name"] = testff.awk
+@print{} data["pmode"] = -rw-rw-r--
+@print{} data["nlink"] = 1
+@print{} data["atime"] = 1293993369
+@print{} data["mtime"] = 1288520752
+@print{} data["mode"] = 33204
+@print{} data["blksize"] = 4096
+@print{} data["dev"] = 2054
+@print{} data["type"] = file
+@print{} data["gid"] = 500
+@print{} data["uid"] = 500
+@print{} data["blocks"] = 8
+@print{} data["ctime"] = 1290113572
+@print{} testff.awk modified: 10 31 10 12:25:52
+@print{}
+@print{} Info for JUNK
+@print{} ret = -1
+@print{} JUNK modified: 01 01 70 02:00:00
+@end example
+@c ENDOFRANGE filre
+@c ENDOFRANGE dirch
+@c ENDOFRANGE statg
+@c ENDOFRANGE chdirg
+@c ENDOFRANGE gladfgaw
+@c ENDOFRANGE adfugaw
+@c ENDOFRANGE fubadgaw
+
@ignore
@c Try this
@iftex
@@ -28010,11 +28472,6 @@ functions for internationalization
(@pxref{Programmer i18n}).
@item
-The @code{extension()} built-in function and the ability to add
-new functions dynamically
-(@pxref{Dynamic Extensions}).
-
-@item
The @code{fflush()} function from Brian Kernighan's
version of @command{awk}
(@pxref{I/O Functions}).
@@ -28048,15 +28505,21 @@ the @option{-l} command-line option
@item
The ability to use GNU-style long-named options that start with @option{--}
and the
+@option{--bignum},
@option{--characters-as-bytes},
-@option{--compat},
+@option{--copyright},
+@option{--debug},
@option{--dump-variables},
@option{--exec},
@option{--gen-pot},
+@option{--include},
@option{--lint},
@option{--lint-old},
+@option{--load},
@option{--non-decimal-data},
+@option{--optimize},
@option{--posix},
+@option{--pretty-print},
@option{--profile},
@option{--re-interval},
@option{--sandbox},
@@ -28374,6 +28837,7 @@ the various PC platforms.
Christos Zoulas
provided the @code{extension()}
built-in function for dynamically adding new modules.
+(This was removed at @command{gawk} 4.1.)
@item
@cindex Kahrs, J@"urgen
@@ -29802,8 +30266,6 @@ maintainers of @command{gawk}. Everything in it applies specifically to
* Compatibility Mode:: How to disable certain @command{gawk}
extensions.
* Additions:: Making Additions To @command{gawk}.
-* Dynamic Extensions:: Adding new built-in functions to
- @command{gawk}.
* Future Extensions:: New features that may be implemented one day.
@end menu
@@ -30039,8 +30501,9 @@ You will also have to sign paperwork for your documentation changes.
Submit changes as unified diffs.
Use @samp{diff -u -r -N} to compare
the original @command{gawk} source tree with your version.
-I recommend using the GNU version of @command{diff}.
-Send the output produced by either run of @command{diff} to me when you
+I recommend using the GNU version of @command{diff}, or best of all,
+@samp{git diff} or @samp{git format-patch}.
+Send the output produced by @command{diff} to me when you
submit your changes.
(@xref{Bugs}, for the electronic mail
information.)
@@ -30166,838 +30629,6 @@ operating systems' code that is already there.
In the code that you supply and maintain, feel free to use a
coding style and brace layout that suits your taste.
-@node Dynamic Extensions
-@appendixsec Adding New Built-in Functions to @command{gawk}
-@cindex Robinson, Will
-@cindex robot, the
-@cindex Lost In Space
-@quotation
-@i{Danger Will Robinson! Danger!!@*
-Warning! Warning!}@*
-The Robot
-@end quotation
-
-@c STARTOFRANGE gladfgaw
-@cindex @command{gawk}, functions, adding
-@c STARTOFRANGE adfugaw
-@cindex adding, functions to @command{gawk}
-@c STARTOFRANGE fubadgaw
-@cindex functions, built-in, adding to @command{gawk}
-It is possible to add new built-in
-functions to @command{gawk} using dynamically loaded libraries. This
-facility is available on systems (such as GNU/Linux) that support
-the C @code{dlopen()} and @code{dlsym()} functions.
-This @value{SECTION} describes how to write and use dynamically
-loaded extensions for @command{gawk}.
-Experience with programming in
-C or C++ is necessary when reading this @value{SECTION}.
-
-@quotation CAUTION
-The facilities described in this @value{SECTION}
-are very much subject to change in a future @command{gawk} release.
-Be aware that you may have to re-do everything,
-at some future time.
-
-If you have written your own dynamic extensions,
-be sure to recompile them for each new @command{gawk} release.
-There is no guarantee of binary compatibility between different
-releases, nor will there ever be such a guarantee.
-@end quotation
-
-@quotation NOTE
-When @option{--sandbox} is specified, extensions are disabled
-(@pxref{Options}).
-@end quotation
-
-@menu
-* Internals:: A brief look at some @command{gawk} internals.
-* Plugin License:: A note about licensing.
-* Loading Extensions:: How to load dynamic extensions.
-* Sample Library:: An example of new functions.
-@end menu
-
-@node Internals
-@appendixsubsec A Minimal Introduction to @command{gawk} Internals
-@c STARTOFRANGE gawint
-@cindex @command{gawk}, internals
-
-The truth is that @command{gawk} was not designed for simple extensibility.
-The facilities for adding functions using shared libraries work, but
-are something of a ``bag on the side.'' Thus, this tour is
-brief and simplistic; would-be @command{gawk} hackers are encouraged to
-spend some time reading the source code before trying to write
-extensions based on the material presented here. Of particular note
-are the files @file{awk.h}, @file{builtin.c}, and @file{eval.c}.
-Reading @file{awkgram.y} in order to see how the parse tree is built
-would also be of use.
-
-@cindex @code{awk.h} file (internal)
-With the disclaimers out of the way, the following types, structure
-members, functions, and macros are declared in @file{awk.h} and are of
-use when writing extensions. The next @value{SECTION}
-shows how they are used:
-
-@table @code
-@cindex floating-point, numbers, @code{AWKNUM} internal type
-@cindex numbers, floating-point, @code{AWKNUM} internal type
-@cindex @code{AWKNUM} internal type
-@cindex internal type, @code{AWKNUM}
-@item AWKNUM
-An @code{AWKNUM} is the internal type of @command{awk}
-floating-point numbers. Typically, it is a C @code{double}.
-
-@cindex @code{NODE} internal type
-@cindex internal type, @code{NODE}
-@cindex strings, @code{NODE} internal type
-@cindex numbers, @code{NODE} internal type
-@item NODE
-Just about everything is done using objects of type @code{NODE}.
-These contain both strings and numbers, as well as variables and arrays.
-
-@cindex @code{force_number()} internal function
-@cindex internal function, @code{force_number()}
-@cindex numeric, values
-@item AWKNUM force_number(NODE *n)
-This macro forces a value to be numeric. It returns the actual
-numeric value contained in the node.
-It may end up calling an internal @command{gawk} function.
-
-@cindex @code{force_string()} internal function
-@cindex internal function, @code{force_string()}
-@item void force_string(NODE *n)
-This macro guarantees that a @code{NODE}'s string value is current.
-It may end up calling an internal @command{gawk} function.
-It also guarantees that the string is zero-terminated.
-
-@cindex @code{force_wstring()} internal function
-@cindex internal function, @code{force_wstring()}
-@item void force_wstring(NODE *n)
-Similarly, this
-macro guarantees that a @code{NODE}'s wide-string value is current.
-It may end up calling an internal @command{gawk} function.
-It also guarantees that the wide string is zero-terminated.
-
-@cindex parameters@comma{} number of
-@cindex @code{nargs} internal variable
-@cindex internal variable, @code{nargs}
-@item nargs
-Inside an extension function, this is the actual number of
-parameters passed to the current function.
-
-@cindex @code{stptr} internal variable
-@cindex internal variable, @code{stptr}
-@cindex @code{stlen} internal variable
-@cindex internal variable, @code{stlen}
-@item n->stptr
-@itemx n->stlen
-The data and length of a @code{NODE}'s string value, respectively.
-The string is @emph{not} guaranteed to be zero-terminated.
-If you need to pass the string value to a C library function, save
-the value in @code{n->stptr[n->stlen]}, assign @code{'\0'} to it,
-call the routine, and then restore the value.
-
-@cindex @code{wstptr} internal variable
-@cindex internal variable, @code{wstptr}
-@cindex @code{wstlen} internal variable
-@cindex internal variable, @code{wstlen}
-@item n->wstptr
-@itemx n->wstlen
-The data and length of a @code{NODE}'s wide-string value, respectively.
-Use @code{force_wstring()} to make sure these values are current.
-
-@cindex @code{type} internal variable
-@cindex internal variable, @code{type}
-@item n->type
-The type of the @code{NODE}. This is a C @code{enum}. Values should
-be one of @code{Node_var}, @code{Node_var_new}, or @code{Node_var_array}
-for function parameters.
-
-@cindex @code{vname} internal variable
-@cindex internal variable, @code{vname}
-@item n->vname
-The ``variable name'' of a node. This is not of much use inside
-externally written extensions.
-
-@cindex arrays, associative, clearing
-@cindex @code{assoc_clear()} internal function
-@cindex internal function, @code{assoc_clear()}
-@item void assoc_clear(NODE *n)
-Clears the associative array pointed to by @code{n}.
-Make sure that @samp{n->type == Node_var_array} first.
-
-@cindex arrays, elements, installing
-@cindex @code{assoc_lookup()} internal function
-@cindex internal function, @code{assoc_lookup()}
-@item NODE **assoc_lookup(NODE *symbol, NODE *subs)
-Finds, and installs if necessary, array elements.
-@code{symbol} is the array, @code{subs} is the subscript.
-This is usually a value created with @code{make_string()} (see below).
-
-@cindex strings
-@cindex @code{make_string()} internal function
-@cindex internal function, @code{make_string()}
-@item NODE *make_string(char *s, size_t len)
-Take a C string and turn it into a pointer to a @code{NODE} that
-can be stored appropriately. This is permanent storage; understanding
-of @command{gawk} memory management is helpful.
-
-@cindex numbers
-@cindex @code{make_number()} internal function
-@cindex internal function, @code{make_number()}
-@item NODE *make_number(AWKNUM val)
-Take an @code{AWKNUM} and turn it into a pointer to a @code{NODE} that
-can be stored appropriately. This is permanent storage; understanding
-of @command{gawk} memory management is helpful.
-
-
-@cindex nodes@comma{} duplicating
-@cindex @code{dupnode()} internal function
-@cindex internal function, @code{dupnode()}
-@item NODE *dupnode(NODE *n)
-Duplicate a node. In most cases, this increments an internal
-reference count instead of actually duplicating the entire @code{NODE};
-understanding of @command{gawk} memory management is helpful.
-
-@cindex memory, releasing
-@cindex @code{unref()} internal function
-@cindex internal function, @code{unref()}
-@item void unref(NODE *n)
-This macro releases the memory associated with a @code{NODE}
-allocated with @code{make_string()} or @code{make_number()}.
-Understanding of @command{gawk} memory management is helpful.
-
-@cindex @code{make_builtin()} internal function
-@cindex internal function, @code{make_builtin()}
-@item void make_builtin(const char *name, NODE *(*func)(NODE *), int count)
-Register a C function pointed to by @code{func} as new built-in
-function @code{name}. @code{name} is a regular C string. @code{count}
-is the maximum number of arguments that the function takes.
-The function should be written in the following manner:
-
-@example
-/* do_xxx --- do xxx function for gawk */
-
-NODE *
-do_xxx(int nargs)
-@{
-    @dots{}
-@}
-@end example
-
-@cindex arguments, retrieving
-@cindex @code{get_argument()} internal function
-@cindex internal function, @code{get_argument()}
-@item NODE *get_argument(int i)
-This function is called from within a C extension function to get
-the @code{i}-th argument from the function call.
-The first argument is argument zero.
-
-@cindex @code{get_actual_argument()} internal function
-@cindex internal function, @code{get_actual_argument()}
-@item NODE *get_actual_argument(int i,
-@itemx @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ int@ optional,@ int@ wantarray);
-This function retrieves a particular argument @code{i}. @code{wantarray} is @code{TRUE}
-if the argument should be an array, @code{FALSE} otherwise. If @code{optional} is
-@code{TRUE}, the argument need not have been supplied. If it wasn't, the return
-value is @code{NULL}. It is a fatal error if @code{optional} is @code{FALSE} but
-the argument was not provided.
-
-@cindex @code{get_scalar_argument()} internal macro
-@cindex internal macro, @code{get_scalar_argument()}
-@item get_scalar_argument(i, opt)
-This is a convenience macro that calls @code{get_actual_argument()}.
-
-@cindex @code{get_array_argument()} internal macro
-@cindex internal macro, @code{get_array_argument()}
-@item get_array_argument(i, opt)
-This is a convenience macro that calls @code{get_actual_argument()}.
-
-@cindex functions, return values@comma{} setting
-
-@cindex @code{ERRNO} variable
-@cindex @code{update_ERRNO_int()} internal function
-@cindex internal function, @code{update_ERRNO_int()}
-@item void update_ERRNO_int(int errno_saved)
-This function is called from within a C extension function to set
-the value of @command{gawk}'s @code{ERRNO} variable, based on the error
-value provided as the argument.
-It is provided as a convenience.
-
-@cindex @code{ERRNO} variable
-@cindex @code{update_ERRNO_string()} internal function
-@cindex internal function, @code{update_ERRNO_string()}
-@item void update_ERRNO_string(const char *string, enum errno_translate)
-This function is called from within a C extension function to set
-the value of @command{gawk}'s @code{ERRNO} variable to a given string.
-The second argument determines whether the string is translated before being
-installed into @code{ERRNO}. It is provided as a convenience.
-
-@cindex @code{ERRNO} variable
-@cindex @code{unset_ERRNO()} internal function
-@cindex internal function, @code{unset_ERRNO()}
-@item void unset_ERRNO(void)
-This function is called from within a C extension function to set
-the value of @command{gawk}'s @code{ERRNO} variable to a null string.
-It is provided as a convenience.
-
-@cindex @code{ENVIRON} array
-@cindex @code{PROCINFO} array
-@cindex @code{register_deferred_variable()} internal function
-@cindex internal function, @code{register_deferred_variable()}
-@item void register_deferred_variable(const char *name, NODE *(*load_func)(void))
-This function is called to register a function to be called when a
-reference to an undefined variable with the given name is encountered.
-The callback function will never be called if the variable exists already,
-so, unless the calling code is running at program startup, it should first
-check whether a variable of the given name already exists.
-The argument function must return a pointer to a @code{NODE} containing the
-newly created variable. This function is used to implement the builtin
-@code{ENVIRON} and @code{PROCINFO} arrays, so you can refer to them
-for examples.
-
-@cindex @code{IOBUF} internal structure
-@cindex internal structure, @code{IOBUF}
-@cindex @code{iop_alloc()} internal function
-@cindex internal function, @code{iop_alloc()}
-@cindex @code{get_record()} input method
-@cindex @code{close_func}() input method
-@cindex @code{INVALID_HANDLE} internal constant
-@cindex internal constant, @code{INVALID_HANDLE}
-@cindex XML (eXtensible Markup Language)
-@cindex eXtensible Markup Language (XML)
-@cindex @code{register_open_hook()} internal function
-@cindex internal function, @code{register_open_hook()}
-@item void register_open_hook(void *(*open_func)(IOBUF *))
-This function is called to register a function to be called whenever
-a new data file is opened, leading to the creation of an @code{IOBUF}
-structure in @code{iop_alloc()}. After creating the new @code{IOBUF},
-@code{iop_alloc()} will call (in reverse order of registration, so the last
-function registered is called first) each open hook until one returns
-non-@code{NULL}. If any hook returns a non-@code{NULL} value, that value is assigned
-to the @code{IOBUF}'s @code{opaque} field (which will presumably point
-to a structure containing additional state associated with the input
-processing), and no further open hooks are called.
-
-The function called will most likely want to set the @code{IOBUF}'s
-@code{get_record} method to indicate that future input records should
-be retrieved by calling that method instead of using the standard
-@command{gawk} input processing.
-
-And the function will also probably want to set the @code{IOBUF}'s
-@code{close_func} method to be called when the file is closed to clean
-up any state associated with the input.
-
-Finally, hook functions should be prepared to receive an @code{IOBUF}
-structure where the @code{fd} field is set to @code{INVALID_HANDLE},
-meaning that @command{gawk} was not able to open the file itself. In
-this case, the hook function must be able to successfully open the file
-and place a valid file descriptor there.
-
-Currently, for example, the hook function facility is used to implement
-the XML parser shared library extension. For more info, please look in
-@file{awk.h} and in @file{io.c}.
-@end table
-
-An argument that is supposed to be an array needs to be handled with
-some extra code, in case the array being passed in is actually
-from a function parameter.
-
-The following boilerplate code shows how to do this:
-
-@example
-NODE *the_arg;
-
-/* assume need 3rd arg, 0-based */
-the_arg = get_array_argument(2, FALSE);
-@end example
-
-Again, you should spend time studying the @command{gawk} internals;
-don't just blindly copy this code.
-@c ENDOFRANGE gawint
-
-@node Plugin License
-@appendixsubsec Extension Licensing
-
-Every dynamic extension should define the global symbol
-@code{plugin_is_GPL_compatible} to assert that it has been licensed under
-a GPL-compatible license. If this symbol does not exist, @command{gawk}
-will emit a fatal error and exit.
-
-The declared type of the symbol should be @code{int}. It does not need
-to be in any allocated section, though. The code merely asserts that
-the symbol exists in the global scope. Something like this is enough:
-
-@example
-int plugin_is_GPL_compatible;
-@end example
-
-@node Loading Extensions
-@appendixsubsec Loading a Dynamic Extension
-@cindex loading extension
-@cindex @command{gawk}, functions, loading
-There are two ways to load a dynamically linked library. The first is to use the
-builtin @code{extension()}:
-
-@example
-extension(libname, init_func)
-@end example
-
-where @file{libname} is the library to load, and @samp{init_func} is the
-name of the initialization or bootstrap routine to run once loaded.
-
-The second method for dynamic loading of a library is to use the
-command line option @option{-l}:
-
-@example
-$ @kbd{gawk -l libname -f myprog}
-@end example
-
-This will work only if the initialization routine is named @code{dl_load()}.
-
-If you use @code{extension()}, the library will be loaded
-at run time. This means that the functions are available only to the rest of
-your script. If you use the command line option @option{-l} instead,
-the library will be loaded before @command{gawk} starts compiling the
-actual program. The net effect is that you can use those functions
-anywhere in the program.
-
-@command{gawk} has a list of directories where it searches for libraries.
-By default, the list includes directories that depend upon how gawk was built
-and installed (@pxref{AWKLIBPATH Variable}). If you want @command{gawk}
-to look for libraries in your private directory, you have to tell it.
-The way to do it is to set the @env{AWKLIBPATH} environment variable
-(@pxref{AWKLIBPATH Variable}).
-@command{gawk} supplies the default shared library platform suffix if it is not
-present in the name of the library.
-If the name of your library is @file{mylib.so}, you can simply type
-
-@example
-$ @kbd{gawk -l mylib -f myprog}
-@end example
-
-and @command{gawk} will do everything necessary to load in your library,
-and then call your @code{dl_load()} routine.
-
-You can always specify the library using an absolute pathname, in which
-case @command{gawk} will not use @env{AWKLIBPATH} to search for it.
-
-@node Sample Library
-@appendixsubsec Example: Directory and File Operation Built-ins
-@c STARTOFRANGE chdirg
-@cindex @code{chdir()} function@comma{} implementing in @command{gawk}
-@c STARTOFRANGE statg
-@cindex @code{stat()} function@comma{} implementing in @command{gawk}
-@c STARTOFRANGE filre
-@cindex files, information about@comma{} retrieving
-@c STARTOFRANGE dirch
-@cindex directories, changing
-
-Two useful functions that are not in @command{awk} are @code{chdir()}
-(so that an @command{awk} program can change its directory) and
-@code{stat()} (so that an @command{awk} program can gather information about
-a file).
-This @value{SECTION} implements these functions for @command{gawk} in an
-external extension library.
-
-@menu
-* Internal File Description:: What the new functions will do.
-* Internal File Ops:: The code for internal file operations.
-* Using Internal File Ops:: How to use an external extension.
-@end menu
-
-@node Internal File Description
-@appendixsubsubsec Using @code{chdir()} and @code{stat()}
-
-This @value{SECTION} shows how to use the new functions at the @command{awk}
-level once they've been integrated into the running @command{gawk}
-interpreter.
-Using @code{chdir()} is very straightforward. It takes one argument,
-the new directory to change to:
-
-@example
-@dots{}
-newdir = "/home/arnold/funstuff"
-ret = chdir(newdir)
-if (ret < 0) @{
-    printf("could not change to %s: %s\n",
-                   newdir, ERRNO) > "/dev/stderr"
-    exit 1
-@}
-@dots{}
-@end example
-
-The return value is negative if the @code{chdir} failed,
-and @code{ERRNO}
-(@pxref{Built-in Variables})
-is set to a string indicating the error.
-
-Using @code{stat()} is a bit more complicated.
-The C @code{stat()} function fills in a structure that has a fair
-amount of information.
-The right way to model this in @command{awk} is to fill in an associative
-array with the appropriate information:
-
-@c broke printf for page breaking
-@example
-file = "/home/arnold/.profile"
-fdata[1] = "x" # force `fdata' to be an array
-ret = stat(file, fdata)
-if (ret < 0) @{
-    printf("could not stat %s: %s\n",
-             file, ERRNO) > "/dev/stderr"
-    exit 1
-@}
-printf("size of %s is %d bytes\n", file, fdata["size"])
-@end example
-
-The @code{stat()} function always clears the data array, even if
-the @code{stat()} fails. It fills in the following elements:
-
-@table @code
-@item "name"
-The name of the file that was @code{stat()}'ed.
-
-@item "dev"
-@itemx "ino"
-The file's device and inode numbers, respectively.
-
-@item "mode"
-The file's mode, as a numeric value. This includes both the file's
-type and its permissions.
-
-@item "nlink"
-The number of hard links (directory entries) the file has.
-
-@item "uid"
-@itemx "gid"
-The numeric user and group ID numbers of the file's owner.
-
-@item "size"
-The size in bytes of the file.
-
-@item "blocks"
-The number of disk blocks the file actually occupies. This may not
-be a function of the file's size if the file has holes.
-
-@item "atime"
-@itemx "mtime"
-@itemx "ctime"
-The file's last access, modification, and inode update times,
-respectively. These are numeric timestamps, suitable for formatting
-with @code{strftime()}
-(@pxref{Built-in}).
-
-@item "pmode"
-The file's ``printable mode.'' This is a string representation of
-the file's type and permissions, such as what is produced by
-@samp{ls -l}---for example, @code{"drwxr-xr-x"}.
-
-@item "type"
-A printable string representation of the file's type. The value
-is one of the following:
-
-@table @code
-@item "blockdev"
-@itemx "chardev"
-The file is a block or character device (``special file'').
-
-@ignore
-@item "door"
-The file is a Solaris ``door'' (special file used for
-interprocess communications).
-@end ignore
-
-@item "directory"
-The file is a directory.
-
-@item "fifo"
-The file is a named-pipe (also known as a FIFO).
-
-@item "file"
-The file is just a regular file.
-
-@item "socket"
-The file is an @code{AF_UNIX} (``Unix domain'') socket in the
-filesystem.
-
-@item "symlink"
-The file is a symbolic link.
-@end table
-@end table
-
-Several additional elements may be present depending upon the operating
-system and the type of the file. You can test for them in your @command{awk}
-program by using the @code{in} operator
-(@pxref{Reference to Elements}):
-
-@table @code
-@item "blksize"
-The preferred block size for I/O to the file. This field is not
-present on all POSIX-like systems in the C @code{stat} structure.
-
-@item "linkval"
-If the file is a symbolic link, this element is the name of the
-file the link points to (i.e., the value of the link).
-
-@item "rdev"
-@itemx "major"
-@itemx "minor"
-If the file is a block or character device file, then these values
-represent the numeric device number and the major and minor components
-of that number, respectively.
-@end table
-
-@node Internal File Ops
-@appendixsubsubsec C Code for @code{chdir()} and @code{stat()}
-
-Here is the C code for these extensions. They were written for
-GNU/Linux. The code needs some more work for complete portability
-to other POSIX-compliant systems:@footnote{This version is edited
-slightly for presentation. See
-@file{extension/filefuncs.c} in the @command{gawk} distribution
-for the complete version.}
-
-@c break line for page breaking
-@example
-#include "awk.h"
-
-#include <sys/sysmacros.h>
-
-int plugin_is_GPL_compatible;
-
-/* do_chdir --- provide dynamically loaded chdir() builtin for gawk */
-
-static NODE *
-do_chdir(int nargs)
-@{
-    NODE *newdir;
-    int ret = -1;
-
-    if (do_lint && nargs != 1)
-        lintwarn("chdir: called with incorrect number of arguments");
-
-    newdir = get_scalar_argument(0, FALSE);
-@end example
-
-The file includes the @code{"awk.h"} header file for definitions
-for the @command{gawk} internals. It includes @code{<sys/sysmacros.h>}
-for access to the @code{major()} and @code{minor()} macros.
-
-@cindex programming conventions, @command{gawk} internals
-By convention, for an @command{awk} function @code{foo}, the function that
-implements it is called @samp{do_foo}. The function should take
-an @samp{int} argument, usually called @code{nargs}, that
-represents the number of defined arguments for the function. The @code{newdir}
-variable represents the new directory to change to, retrieved
-with @code{get_scalar_argument()}. Note that the first argument is
-numbered zero.
-
-This code actually accomplishes the @code{chdir()}. It first forces
-the argument to be a string and passes the string value to the
-@code{chdir()} system call. If the @code{chdir()} fails, @code{ERRNO}
-is updated.
-
-@example
-    (void) force_string(newdir);
-    ret = chdir(newdir->stptr);
-    if (ret < 0)
-        update_ERRNO_int(errno);
-@end example
-
-Finally, the function returns the return value to the @command{awk} level:
-
-@example
-    return make_number((AWKNUM) ret);
-@}
-@end example
-
-The @code{stat()} built-in is more involved. First comes a function
-that turns a numeric mode into a printable representation
-(e.g., 644 becomes @samp{-rw-r--r--}). This is omitted here for brevity:
-
-@c break line for page breaking
-@example
-/* format_mode --- turn a stat mode field into something readable */
-
-static char *
-format_mode(unsigned long fmode)
-@{
-    @dots{}
-@}
-@end example
-
-Next comes the @code{do_stat()} function. It starts with
-variable declarations and argument checking:
-
-@ignore
-Changed message for page breaking. Used to be:
- "stat: called with incorrect number of arguments (%d), should be 2",
-@end ignore
-@example
-/* do_stat --- provide a stat() function for gawk */
-
-static NODE *
-do_stat(int nargs)
-@{
-    NODE *file, *array, *tmp;
-    struct stat sbuf;
-    int ret;
-    NODE **aptr;
-    char *pmode; /* printable mode */
-    char *type = "unknown";
-
-    if (do_lint && nargs > 2)
-        lintwarn("stat: called with too many arguments");
-@end example
-
-Then comes the actual work. First, the function gets the arguments.
-Then, it always clears the array.
-The code uses @code{lstat()} (instead of @code{stat()})
-to get the file information,
-in case the file is a symbolic link.
-If there's an error, it sets @code{ERRNO} and returns:
-
-@c comment made multiline for page breaking
-@example
-    /* file is first arg, array to hold results is second */
-    file = get_scalar_argument(0, FALSE);
-    array = get_array_argument(1, FALSE);
-
-    /* empty out the array */
-    assoc_clear(array);
-
-    /* lstat the file, if error, set ERRNO and return */
-    (void) force_string(file);
-    ret = lstat(file->stptr, & sbuf);
-    if (ret < 0) @{
-        update_ERRNO_int(errno);
-        return make_number((AWKNUM) ret);
-    @}
-@end example
-
-Now comes the tedious part: filling in the array. Only a few of the
-calls are shown here, since they all follow the same pattern:
-
-@example
-    /* fill in the array */
-    aptr = assoc_lookup(array, tmp = make_string("name", 4));
-    *aptr = dupnode(file);
-    unref(tmp);
-
-    aptr = assoc_lookup(array, tmp = make_string("mode", 4));
-    *aptr = make_number((AWKNUM) sbuf.st_mode);
-    unref(tmp);
-
-    aptr = assoc_lookup(array, tmp = make_string("pmode", 5));
-    pmode = format_mode(sbuf.st_mode);
-    *aptr = make_string(pmode, strlen(pmode));
-    unref(tmp);
-@end example
-
-When done, return the @code{lstat()} return value:
-
-@example
-
-    return make_number((AWKNUM) ret);
-@}
-@end example
-
-@cindex programming conventions, @command{gawk} internals
-Finally, it's necessary to provide the ``glue'' that loads the
-new function(s) into @command{gawk}. By convention, each library has
-a routine named @code{dl_load()} that does the job. The simplest way
-is to use the @code{dl_load_func} macro in @code{gawkapi.h}.
-
-And that's it! As an exercise, consider adding functions to
-implement system calls such as @code{chown()}, @code{chmod()},
-and @code{umask()}.
-
-@node Using Internal File Ops
-@appendixsubsubsec Integrating the Extensions
-
-@cindex @command{gawk}, interpreter@comma{} adding code to
-Now that the code is written, it must be possible to add it at
-runtime to the running @command{gawk} interpreter. First, the
-code must be compiled. Assuming that the functions are in
-a file named @file{filefuncs.c}, and @var{idir} is the location
-of the @command{gawk} include files,
-the following steps create
-a GNU/Linux shared library:
-
-@example
-$ @kbd{gcc -fPIC -shared -DHAVE_CONFIG_H -c -O -g -I@var{idir} filefuncs.c}
-$ @kbd{ld -o filefuncs.so -shared filefuncs.o}
-@end example
-
-@cindex @code{extension()} function (@command{gawk})
-Once the library exists, it is loaded by calling the @code{extension()}
-built-in function.
-This function takes two arguments: the name of the
-library to load and the name of a function to call when the library
-is first loaded. This function adds the new functions to @command{gawk}.
-It returns the value returned by the initialization function
-within the shared library:
-
-@example
-# file testff.awk
-BEGIN @{
-    extension("./filefuncs.so", "dl_load")
-
-    chdir(".") # no-op
-
-    data[1] = 1 # force `data' to be an array
-    print "Info for testff.awk"
-    ret = stat("testff.awk", data)
-    print "ret =", ret
-    for (i in data)
-        printf "data[\"%s\"] = %s\n", i, data[i]
-    print "testff.awk modified:",
-        strftime("%m %d %y %H:%M:%S", data["mtime"])
-
-    print "\nInfo for JUNK"
-    ret = stat("JUNK", data)
-    print "ret =", ret
-    for (i in data)
-        printf "data[\"%s\"] = %s\n", i, data[i]
-    print "JUNK modified:", strftime("%m %d %y %H:%M:%S", data["mtime"])
-@}
-@end example
-
-Here are the results of running the program:
-
-@example
-$ @kbd{gawk -f testff.awk}
-@print{} Info for testff.awk
-@print{} ret = 0
-@print{} data["size"] = 607
-@print{} data["ino"] = 14945891
-@print{} data["name"] = testff.awk
-@print{} data["pmode"] = -rw-rw-r--
-@print{} data["nlink"] = 1
-@print{} data["atime"] = 1293993369
-@print{} data["mtime"] = 1288520752
-@print{} data["mode"] = 33204
-@print{} data["blksize"] = 4096
-@print{} data["dev"] = 2054
-@print{} data["type"] = file
-@print{} data["gid"] = 500
-@print{} data["uid"] = 500
-@print{} data["blocks"] = 8
-@print{} data["ctime"] = 1290113572
-@print{} testff.awk modified: 10 31 10 12:25:52
-@print{}
-@print{} Info for JUNK
-@print{} ret = -1
-@print{} JUNK modified: 01 01 70 02:00:00
-@end example
-@c ENDOFRANGE filre
-@c ENDOFRANGE dirch
-@c ENDOFRANGE statg
-@c ENDOFRANGE chdirg
-@c ENDOFRANGE gladfgaw
-@c ENDOFRANGE adfugaw
-@c ENDOFRANGE fubadgaw
-
@node Future Extensions
@appendixsec Probable Future Extensions
@ignore
@@ -31055,12 +30686,8 @@ Following is a list of probable future changes visible at the
@c these are ordered by likelihood
@table @asis
-@item Loadable module interface
-It is not clear that the @command{awk}-level interface to the
-modules facility is as good as it should be. The interface needs to be
-redesigned, particularly taking namespace issues into account, as
-well as possibly including issues such as library search path order
-and versioning.
+@item Databases
+It may be possible to map a GDBM/NDBM/SDBM file into an @command{awk} array.
@item @code{RECLEN} variable for fixed-length records
Along with @code{FIELDWIDTHS}, this would speed up the processing of
@@ -31068,9 +30695,6 @@ fixed-length records.
@code{PROCINFO["RS"]} would be @code{"RS"} or @code{"RECLEN"},
depending upon which kind of record processing is in effect.
-@item Databases
-It may be possible to map a GDBM/NDBM/SDBM file into an @command{awk} array.
-
@item More @code{lint} warnings
There are more things that could be checked for portability.
@end table
@@ -31079,21 +30703,6 @@ Following is a list of probable improvements that will make @command{gawk}'s
source code easier to work with:
@table @asis
-@item Loadable module mechanics
-The current extension mechanism works
-(@pxref{Dynamic Extensions}),
-but is rather primitive. It requires a fair amount of manual work
-to create and integrate a loadable module.
-Nor is the current mechanism as portable as might be desired.
-The GNU @command{libtool} package provides a number of features that
-would make using loadable modules much easier.
-@command{gawk} should be changed to use @command{libtool}.
-
-@item Loadable module internals
-The API to its internals that @command{gawk} ``exports'' should be revised.
-Too many things are needlessly exposed. A new API should be designed
-and implemented to make module writing easier.
-
@item Better array subscript management
@command{gawk}'s management of array subscript storage could use revamping,
so that using the same value to index multiple arrays only