Add .info files back.

author: Arnold D. Robbins <arnold@skeeve.com> 2016-10-26 21:49:00 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2016-10-26 21:49:00 +0300
commit: e5abd6a16d42fc0f42277919a2d0a2c28476788c (patch)
tree: c620f8fd78ec318c7614efeb485c324be23689b7 /doc/gawkinet.info
parent: ff7329df359352a8f62a21bbb9fea47cba29b589 (diff)
download: egawk-e5abd6a16d42fc0f42277919a2d0a2c28476788c.tar.gz
egawk-e5abd6a16d42fc0f42277919a2d0a2c28476788c.tar.bz2
egawk-e5abd6a16d42fc0f42277919a2d0a2c28476788c.zip
1 files changed, 4406 insertions, 0 deletions
diff --git a/doc/gawkinet.info b/doc/gawkinet.info
new file mode 100644
index 00000000..d5a7abf8
--- /dev/null
+++ b/doc/gawkinet.info
@@ -0,0 +1,4406 @@
+This is gawkinet.info, produced by makeinfo version 6.1 from
+gawkinet.texi.
+
+This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
+4.1.4 (or later) version of the GNU implementation of AWK.
+
+
+   Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
+Foundation, Inc.
+
+
+   Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", the Front-Cover
+texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below).  A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+  a. "A GNU Manual"
+
+  b. "You have the freedom to copy and modify this GNU manual.  Buying
+     copies from the FSF supports it in developing GNU and promoting
+     software freedom."
+INFO-DIR-SECTION Network applications
+START-INFO-DIR-ENTRY
+* Gawkinet: (gawkinet).         TCP/IP Internetworking With 'gawk'.
+END-INFO-DIR-ENTRY
+
+   This file documents the networking features in GNU 'awk'.
+
+   This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
+4.1.4 (or later) version of the GNU implementation of AWK.
+
+
+   Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
+Foundation, Inc.
+
+
+   Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", the Front-Cover
+texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below).  A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+  a. "A GNU Manual"
+
+  b. "You have the freedom to copy and modify this GNU manual.  Buying
+     copies from the FSF supports it in developing GNU and promoting
+     software freedom."
+
+
+File: gawkinet.info,  Node: Top,  Next: Preface,  Prev: (dir),  Up: (dir)
+
+General Introduction
+********************
+
+This file documents the networking features in GNU Awk ('gawk') version
+4.0 and later.
+
+   This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
+4.1.4 (or later) version of the GNU implementation of AWK.
+
+
+   Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
+Foundation, Inc.
+
+
+   Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", the Front-Cover
+texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below).  A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+  a. "A GNU Manual"
+
+  b. "You have the freedom to copy and modify this GNU manual.  Buying
+     copies from the FSF supports it in developing GNU and promoting
+     software freedom."
+
+* Menu:
+
+* Preface::                          About this document.
+* Introduction::                     About networking.
+* Using Networking::                 Some examples.
+* Some Applications and Techniques:: More extended examples.
+* Links::                            Where to find the stuff mentioned in this
+                                     document.
+* GNU Free Documentation License::   The license for this document.
+* Index::                            The index.
+
+* Stream Communications::          Sending data streams.
+* Datagram Communications::        Sending self-contained messages.
+* The TCP/IP Protocols::           How these models work in the Internet.
+* Basic Protocols::                The basic protocols.
+* Ports::                          The idea behind ports.
+* Making Connections::             Making TCP/IP connections.
+* Gawk Special Files::             How to do 'gawk' networking.
+* Special File Fields::            The fields in the special file name.
+* Comparing Protocols::            Differences between the protocols.
+* File /inet/tcp::                 The TCP special file.
+* File /inet/udp::                 The UDP special file.
+* TCP Connecting::                 Making a TCP connection.
+* Troubleshooting::                Troubleshooting TCP/IP connections.
+* Interacting::                    Interacting with a service.
+* Setting Up::                     Setting up a service.
+* Email::                          Reading email.
+* Web page::                       Reading a Web page.
+* Primitive Service::              A primitive Web service.
+* Interacting Service::            A Web service with interaction.
+* CGI Lib::                        A simple CGI library.
+* Simple Server::                  A simple Web server.
+* Caveats::                        Network programming caveats.
+* Challenges::                     Where to go from here.
+* PANIC::                          An Emergency Web Server.
+* GETURL::                         Retrieving Web Pages.
+* REMCONF::                        Remote Configuration Of Embedded Systems.
+* URLCHK::                         Look For Changed Web Pages.
+* WEBGRAB::                        Extract Links From A Page.
+* STATIST::                        Graphing A Statistical Distribution.
+* MAZE::                           Walking Through A Maze In Virtual Reality.
+* MOBAGWHO::                       A Simple Mobile Agent.
+* STOXPRED::                       Stock Market Prediction As A Service.
+* PROTBASE::                       Searching Through A Protein Database.
+
+
+File: gawkinet.info,  Node: Preface,  Next: Introduction,  Prev: Top,  Up: Top
+
+Preface
+*******
+
+In May of 1997, Ju"rgen Kahrs felt the need for network access from
+'awk', and, with a little help from me, set about adding features to do
+this for 'gawk'.  At that time, he wrote the bulk of this Info file.
+
+   The code and documentation were added to the 'gawk' 3.1 development
+tree, and languished somewhat until I could finally get down to some
+serious work on that version of 'gawk'.  This finally happened in the
+middle of 2000.
+
+   Meantime, Ju"rgen wrote an article about the Internet special files
+and '|&' operator for 'Linux Journal', and made a networking patch for
+the production versions of 'gawk' available from his home page.  In
+August of 2000 (for 'gawk' 3.0.6), this patch also made it to the main
+GNU 'ftp' distribution site.
+
+   For release with 'gawk', I edited Ju"rgen's prose for English grammar
+and style, as he is not a native English speaker.  I also rearranged the
+material somewhat for what I felt was a better order of presentation,
+and (re)wrote some of the introductory material.
+
+   The majority of this document and the code are his work, and the high
+quality and interesting ideas speak for themselves.  It is my hope that
+these features will be of significant value to the 'awk' community.
+
+
+Arnold Robbins
+Nof Ayalon, ISRAEL
+March, 2001
+
+
+File: gawkinet.info,  Node: Introduction,  Next: Using Networking,  Prev: Preface,  Up: Top
+
+1 Networking Concepts
+*********************
+
+This major node provides a (necessarily) brief introduction to computer
+networking concepts.  For many applications of 'gawk' to TCP/IP
+networking, we hope that this is enough.  For more advanced tasks, you
+will need deeper background, and it may be necessary to switch to
+lower-level programming in C or C++.
+
+   There are two real-life models for the way computers send messages to
+each other over a network.  While the analogies are not perfect, they
+are close enough to convey the major concepts.  These two models are the
+phone system (reliable byte-stream communications), and the postal
+system (best-effort datagrams).
+
+* Menu:
+
+* Stream Communications::       Sending data streams.
+* Datagram Communications::     Sending self-contained messages.
+* The TCP/IP Protocols::        How these models work in the Internet.
+* Making Connections::          Making TCP/IP connections.
+
+
+File: gawkinet.info,  Node: Stream Communications,  Next: Datagram Communications,  Prev: Introduction,  Up: Introduction
+
+1.1 Reliable Byte-streams (Phone Calls)
+=======================================
+
+When you make a phone call, the following steps occur:
+
+  1. You dial a number.
+
+  2. The phone system connects to the called party, telling them there
+     is an incoming call.  (Their phone rings.)
+
+  3. The other party answers the call, or, in the case of a computer
+     network, refuses to answer the call.
+
+  4. Assuming the other party answers, the connection between you is now
+     a "duplex" (two-way), "reliable" (no data lost), sequenced (data
+     comes out in the order sent) data stream.
+
+  5. You and your friend may now talk freely, with the phone system
+     moving the data (your voices) from one end to the other.  From your
+     point of view, you have a direct end-to-end connection with the
+     person on the other end.
+
+   The same steps occur in a duplex reliable computer networking
+connection.  There is considerably more overhead in setting up the
+communications, but once it's done, data moves in both directions,
+reliably, in sequence.
+
+
+File: gawkinet.info,  Node: Datagram Communications,  Next: The TCP/IP Protocols,  Prev: Stream Communications,  Up: Introduction
+
+1.2 Best-effort Datagrams (Mailed Letters)
+==========================================
+
+Suppose you mail three different documents to your office on the other
+side of the country on two different days.  Doing so entails the
+following.
+
+  1. Each document travels in its own envelope.
+
+  2. Each envelope contains both the sender and the recipient address.
+
+  3. Each envelope may travel a different route to its destination.
+
+  4. The envelopes may arrive in a different order from the one in which
+     they were sent.
+
+  5. One or more may get lost in the mail.  (Although, fortunately, this
+     does not occur very often.)
+
+  6. In a computer network, one or more "packets" may also arrive
+     multiple times.  (This doesn't happen with the postal system!)
+
+   The important characteristics of datagram communications, like those
+of the postal system are thus:
+
+   * Delivery is "best effort;" the data may never get there.
+
+   * Each message is self-contained, including the source and
+     destination addresses.
+
+   * Delivery is _not_ sequenced; packets may arrive out of order,
+     and/or multiple times.
+
+   * Unlike the phone system, overhead is considerably lower.  It is not
+     necessary to set up the call first.
+
+   The price the user pays for the lower overhead of datagram
+communications is exactly the lower reliability; it is often necessary
+for user-level protocols that use datagram communications to add their
+own reliability features on top of the basic communications.
+
+
+File: gawkinet.info,  Node: The TCP/IP Protocols,  Next: Making Connections,  Prev: Datagram Communications,  Up: Introduction
+
+1.3 The Internet Protocols
+==========================
+
+The Internet Protocol Suite (usually referred to as just TCP/IP)(1)
+consists of a number of different protocols at different levels or
+"layers."  For our purposes, three protocols provide the fundamental
+communications mechanisms.  All other defined protocols are referred to
+as user-level protocols (e.g., HTTP, used later in this Info file).
+
+* Menu:
+
+* Basic Protocols::             The basic protocols.
+* Ports::                       The idea behind ports.
+
+   ---------- Footnotes ----------
+
+   (1) It should be noted that although the Internet seems to have
+conquered the world, there are other networking protocol suites in
+existence and in use.
+
+
+File: gawkinet.info,  Node: Basic Protocols,  Next: Ports,  Prev: The TCP/IP Protocols,  Up: The TCP/IP Protocols
+
+1.3.1 The Basic Internet Protocols
+----------------------------------
+
+IP
+     The Internet Protocol.  This protocol is almost never used directly
+     by applications.  It provides the basic packet delivery and routing
+     infrastructure of the Internet.  Much like the phone company's
+     switching centers or the Post Office's trucks, it is not of much
+     day-to-day interest to the regular user (or programmer).  It
+     happens to be a best effort datagram protocol.  In the early
+     twenty-first century, there are two versions of this protocol in
+     use:
+
+     IPv4
+          The original version of the Internet Protocol, with 32-bit
+          addresses, on which most of the current Internet is based.
+
+     IPv6
+          The "next generation" of the Internet Protocol, with 128-bit
+          addresses.  This protocol is in wide use in certain parts of
+          the world, but has not yet replaced IPv4.(1)
+
+     Versions of the other protocols that sit "atop" IP exist for both
+     IPv4 and IPv6.  However, as the IPv6 versions are fundamentally the
+     same as the original IPv4 versions, we will not distinguish further
+     between them.
+
+UDP
+     The User Datagram Protocol.  This is a best effort datagram
+     protocol.  It provides a small amount of extra reliability over IP,
+     and adds the notion of "ports", described in *note TCP and UDP
+     Ports: Ports.
+
+TCP
+     The Transmission Control Protocol.  This is a duplex, reliable,
+     sequenced byte-stream protocol, again layered on top of IP, and
+     also providing the notion of ports.  This is the protocol that you
+     will most likely use when using 'gawk' for network programming.
+
+   All other user-level protocols use either TCP or UDP to do their
+basic communications.  Examples are SMTP (Simple Mail Transfer
+Protocol), FTP (File Transfer Protocol), and HTTP (HyperText Transfer
+Protocol).
+
+   ---------- Footnotes ----------
+
+   (1) There isn't an IPv5.
+
+
+File: gawkinet.info,  Node: Ports,  Prev: Basic Protocols,  Up: The TCP/IP Protocols
+
+1.3.2 TCP and UDP Ports
+-----------------------
+
+In the postal system, the address on an envelope indicates a physical
+location, such as a residence or office building.  But there may be more
+than one person at the location; thus you have to further quantify the
+recipient by putting a person or company name on the envelope.
+
+   In the phone system, one phone number may represent an entire
+company, in which case you need a person's extension number in order to
+reach that individual directly.  Or, when you call a home, you have to
+say, "May I please speak to ..."  before talking to the person directly.
+
+   IP networking provides the concept of addressing.  An IP address
+represents a particular computer, but no more.  In order to reach the
+mail service on a system, or the FTP or WWW service on a system, you
+must have some way to further specify which service you want.  In the
+Internet Protocol suite, this is done with "port numbers", which
+represent the services, much like an extension number used with a phone
+number.
+
+   Port numbers are 16-bit integers.  Unix and Unix-like systems reserve
+ports below 1024 for "well known" services, such as SMTP, FTP, and HTTP.
+Numbers 1024 and above may be used by any application, although there is
+no promise made that a particular port number is always available.
+
+
+File: gawkinet.info,  Node: Making Connections,  Prev: The TCP/IP Protocols,  Up: Introduction
+
+1.4 Making TCP/IP Connections (And Some Terminology)
+====================================================
+
+Two terms come up repeatedly when discussing networking: "client" and
+"server".  For now, we'll discuss these terms at the "connection level",
+when first establishing connections between two processes on different
+systems over a network.  (Once the connection is established, the higher
+level, or "application level" protocols, such as HTTP or FTP, determine
+who is the client and who is the server.  Often, it turns out that the
+client and server are the same in both roles.)
+
+   The "server" is the system providing the service, such as the web
+server or email server.  It is the "host" (system) which is _connected
+to_ in a transaction.  For this to work though, the server must be
+expecting connections.  Much as there has to be someone at the office
+building to answer the phone(1), the server process (usually) has to be
+started first and be waiting for a connection.
+
+   The "client" is the system requesting the service.  It is the system
+_initiating the connection_ in a transaction.  (Just as when you pick up
+the phone to call an office or store.)
+
+   In the TCP/IP framework, each end of a connection is represented by a
+pair of (ADDRESS, PORT) pairs.  For the duration of the connection, the
+ports in use at each end are unique, and cannot be used simultaneously
+by other processes on the same system.  (Only after closing a connection
+can a new one be built up on the same port.  This is contrary to the
+usual behavior of fully developed web servers which have to avoid
+situations in which they are not reachable.  We have to pay this price
+in order to enjoy the benefits of a simple communication paradigm in
+'gawk'.)
+
+   Furthermore, once the connection is established, communications are
+"synchronous".(2)  I.e., each end waits on the other to finish
+transmitting, before replying.  This is much like two people in a phone
+conversation.  While both could talk simultaneously, doing so usually
+doesn't work too well.
+
+   In the case of TCP, the synchronicity is enforced by the protocol
+when sending data.  Data writes "block" until the data have been
+received on the other end.  For both TCP and UDP, data reads block until
+there is incoming data waiting to be read.  This is summarized in the
+following table, where an "X" indicates that the given action blocks.
+
+TCP        X       X
+UDP        X
+
+   ---------- Footnotes ----------
+
+   (1) In the days before voice mail systems!
+
+   (2) For the technically savvy, data reads block--if there's no
+incoming data, the program is made to wait until there is, instead of
+receiving a "there's no data" error return.
+
+
+File: gawkinet.info,  Node: Using Networking,  Next: Some Applications and Techniques,  Prev: Introduction,  Up: Top
+
+2 Networking With 'gawk'
+************************
+
+The 'awk' programming language was originally developed as a
+pattern-matching language for writing short programs to perform data
+manipulation tasks.  'awk''s strength is the manipulation of textual
+data that is stored in files.  It was never meant to be used for
+networking purposes.  To exploit its features in a networking context,
+it's necessary to use an access mode for network connections that
+resembles the access of files as closely as possible.
+
+   'awk' is also meant to be a prototyping language.  It is used to
+demonstrate feasibility and to play with features and user interfaces.
+This can be done with file-like handling of network connections.  'gawk'
+trades the lack of many of the advanced features of the TCP/IP family of
+protocols for the convenience of simple connection handling.  The
+advanced features are available when programming in C or Perl.  In fact,
+the network programming in this major node is very similar to what is
+described in books such as 'Internet Programming with Python', 'Advanced
+Perl Programming', or 'Web Client Programming with Perl'.
+
+   However, you can do the programming here without first having to
+learn object-oriented ideology; underlying languages such as Tcl/Tk,
+Perl, Python; or all of the libraries necessary to extend these
+languages before they are ready for the Internet.
+
+   This major node demonstrates how to use the TCP protocol.  The UDP
+protocol is much less important for most users.
+
+* Menu:
+
+* Gawk Special Files::          How to do 'gawk' networking.
+* TCP Connecting::              Making a TCP connection.
+* Troubleshooting::             Troubleshooting TCP/IP connections.
+* Interacting::                 Interacting with a service.
+* Setting Up::                  Setting up a service.
+* Email::                       Reading email.
+* Web page::                    Reading a Web page.
+* Primitive Service::           A primitive Web service.
+* Interacting Service::         A Web service with interaction.
+* Simple Server::               A simple Web server.
+* Caveats::                     Network programming caveats.
+* Challenges::                  Where to go from here.
+
+
+File: gawkinet.info,  Node: Gawk Special Files,  Next: TCP Connecting,  Prev: Using Networking,  Up: Using Networking
+
+2.1 'gawk''s Networking Mechanisms
+==================================
+
+The '|&' operator for use in communicating with a "coprocess" is
+described in *note Two-way Communications With Another Process:
+(gawk)Two-way I/O. It shows how to do two-way I/O to a separate process,
+sending it data with 'print' or 'printf' and reading data with
+'getline'.  If you haven't read it already, you should detour there to
+do so.
+
+   'gawk' transparently extends the two-way I/O mechanism to simple
+networking through the use of special file names.  When a "coprocess"
+that matches the special files we are about to describe is started,
+'gawk' creates the appropriate network connection, and then two-way I/O
+proceeds as usual.
+
+   At the C, C++, and Perl level, networking is accomplished via
+"sockets", an Application Programming Interface (API) originally
+developed at the University of California at Berkeley that is now used
+almost universally for TCP/IP networking.  Socket level programming,
+while fairly straightforward, requires paying attention to a number of
+details, as well as using binary data.  It is not well-suited for use
+from a high-level language like 'awk'.  The special files provided in
+'gawk' hide the details from the programmer, making things much simpler
+and easier to use.
+
+   The special file name for network access is made up of several
+fields, all of which are mandatory:
+
+     /NET-TYPE/PROTOCOL/LOCALPORT/HOSTNAME/REMOTEPORT
+
+   The NET-TYPE field lets you specify IPv4 versus IPv6, or lets you
+allow the system to choose.
+
+* Menu:
+
+* Special File Fields::         The fields in the special file name.
+* Comparing Protocols::         Differences between the protocols.
+
+
+File: gawkinet.info,  Node: Special File Fields,  Next: Comparing Protocols,  Prev: Gawk Special Files,  Up: Gawk Special Files
+
+2.1.1 The Fields of the Special File Name
+-----------------------------------------
+
+This node explains the meaning of all the other fields, as well as the
+range of values and the defaults.  All of the fields are mandatory.  To
+let the system pick a value, or if the field doesn't apply to the
+protocol, specify it as '0':
+
+NET-TYPE
+     This is one of 'inet4' for IPv4, 'inet6' for IPv6, or 'inet' to use
+     the system default (which is likely to be IPv4).  For the rest of
+     this document, we will use the generic '/inet' in our descriptions
+     of how 'gawk''s networking works.
+
+PROTOCOL
+     Determines which member of the TCP/IP family of protocols is
+     selected to transport the data across the network.  There are two
+     possible values (always written in lowercase): 'tcp' and 'udp'.
+     The exact meaning of each is explained later in this node.
+
+LOCALPORT
+     Determines which port on the local machine is used to communicate
+     across the network.  Application-level clients usually use '0' to
+     indicate they do not care which local port is used--instead they
+     specify a remote port to connect to.  It is vital for
+     application-level servers to use a number different from '0' here
+     because their service has to be available at a specific publicly
+     known port number.  It is possible to use a name from
+     '/etc/services' here.
+
+HOSTNAME
+     Determines which remote host is to be at the other end of the
+     connection.  Application-level servers must fill this field with a
+     '0' to indicate their being open for all other hosts to connect to
+     them and enforce connection level server behavior this way.  It is
+     not possible for an application-level server to restrict its
+     availability to one remote host by entering a host name here.
+     Application-level clients must enter a name different from '0'.
+     The name can be either symbolic (e.g., 'jpl-devvax.jpl.nasa.gov')
+     or numeric (e.g., '128.149.1.143').
+
+REMOTEPORT
+     Determines which port on the remote machine is used to communicate
+     across the network.  For '/inet/tcp' and '/inet/udp',
+     application-level clients _must_ use a number other than '0' to
+     indicate to which port on the remote machine they want to connect.
+     Application-level servers must not fill this field with a '0'.
+     Instead they specify a local port to which clients connect.  It is
+     possible to use a name from '/etc/services' here.
+
+   Experts in network programming will notice that the usual
+client/server asymmetry found at the level of the socket API is not
+visible here.  This is for the sake of simplicity of the high-level
+concept.  If this asymmetry is necessary for your application, use
+another language.  For 'gawk', it is more important to enable users to
+write a client program with a minimum of code.  What happens when first
+accessing a network connection is seen in the following pseudocode:
+
+     if ((name of remote host given) && (other side accepts connection)) {
+       rendez-vous successful; transmit with getline or print
+     } else {
+       if ((other side did not accept) && (localport == 0))
+         exit unsuccessful
+       if (TCP) {
+         set up a server accepting connections
+         this means waiting for the client on the other side to connect
+       } else
+         ready
+     }
+
+   The exact behavior of this algorithm depends on the values of the
+fields of the special file name.  When in doubt, *note Table 2.1:
+table-inet-components. gives you the combinations of values and their
+meaning.  If this table is too complicated, focus on the three lines
+printed in *bold*.  All the examples in *note Networking With 'gawk':
+Using Networking, use only the patterns printed in bold letters.
+
+PROTOCOL    LOCAL       HOST NAME   REMOTE      RESULTING CONNECTION-LEVEL
+            PORT                    PORT        BEHAVIOR
+------------------------------------------------------------------------------
+*tcp*       *0*         *x*         *x*         *Dedicated client, fails if
+                                                immediately connecting to a
+                                                server on the other side
+                                                fails*
+udp         0           x           x           Dedicated client
+*tcp,       *x*         *x*         *x*         *Client, switches to
+udp*                                            dedicated server if
+                                                necessary*
+*tcp,       *x*         *0*         *0*         *Dedicated server*
+udp*
+tcp, udp    x           x           0           Invalid
+tcp, udp    0           0           x           Invalid
+tcp, udp    x           0           x           Invalid
+tcp, udp    0           0           0           Invalid
+tcp, udp    0           x           0           Invalid
+
+Table 2.1: /inet Special File Components
+
+   In general, TCP is the preferred mechanism to use.  It is the
+simplest protocol to understand and to use.  Use UDP only if
+circumstances demand low-overhead.
+
+
+File: gawkinet.info,  Node: Comparing Protocols,  Prev: Special File Fields,  Up: Gawk Special Files
+
+2.1.2 Comparing Protocols
+-------------------------
+
+This node develops a pair of programs (sender and receiver) that do
+nothing but send a timestamp from one machine to another.  The sender
+and the receiver are implemented with each of the two protocols
+available and demonstrate the differences between them.
+
+* Menu:
+
+* File /inet/tcp::              The TCP special file.
+* File /inet/udp::              The UDP special file.
+
+
+File: gawkinet.info,  Node: File /inet/tcp,  Next: File /inet/udp,  Prev: Comparing Protocols,  Up: Comparing Protocols
+
+2.1.2.1 '/inet/tcp'
+...................
+
+Once again, always use TCP. (Use UDP when low overhead is a necessity,
+and use RAW for network experimentation.)  The first example is the
+sender program:
+
+     # Server
+     BEGIN {
+       print strftime() |& "/inet/tcp/8888/0/0"
+       close("/inet/tcp/8888/0/0")
+     }
+
+   The receiver is very simple:
+
+     # Client
+     BEGIN {
+       "/inet/tcp/0/localhost/8888" |& getline
+       print $0
+       close("/inet/tcp/0/localhost/8888")
+     }
+
+   TCP guarantees that the bytes arrive at the receiving end in exactly
+the same order that they were sent.  No byte is lost (except for broken
+connections), doubled, or out of order.  Some overhead is necessary to
+accomplish this, but this is the price to pay for a reliable service.
+It does matter which side starts first.  The sender/server has to be
+started first, and it waits for the receiver to read a line.
+
+
+File: gawkinet.info,  Node: File /inet/udp,  Prev: File /inet/tcp,  Up: Comparing Protocols
+
+2.1.2.2 '/inet/udp'
+...................
+
+The server and client programs that use UDP are almost identical to
+their TCP counterparts; only the PROTOCOL has changed.  As before, it
+does matter which side starts first.  The receiving side blocks and
+waits for the sender.  In this case, the receiver/client has to be
+started first:
+
+     # Server
+     BEGIN {
+       print strftime() |& "/inet/udp/8888/0/0"
+       close("/inet/udp/8888/0/0")
+     }
+
+   The receiver is almost identical to the TCP receiver:
+
+     # Client
+     BEGIN {
+       print "hi!" |& "/inet/udp/0/localhost/8888"
+       "/inet/udp/0/localhost/8888" |& getline
+       print $0
+       close("/inet/udp/0/localhost/8888")
+     }
+
+   In the case of UDP, the initial 'print' command is the one that
+actually sends data so that there is a connection.  UDP and "connection"
+sounds strange to anyone who has learned that UDP is a connectionless
+protocol.  Here, "connection" means that the 'connect()' system call has
+completed its work and completed the "association" between a certain
+socket and an IP address.  Thus there are subtle differences between
+'connect()' for TCP and UDP; see the man page for details.(1)
+
+   UDP cannot guarantee that the datagrams at the receiving end will
+arrive in exactly the same order they were sent.  Some datagrams could
+be lost, some doubled, and some out of order.  But no overhead is
+necessary to accomplish this.  This unreliable behavior is good enough
+for tasks such as data acquisition, logging, and even stateless services
+like the original versions of NFS.
+
+   ---------- Footnotes ----------
+
+   (1) This subtlety is just one of many details that are hidden in the
+socket API, invisible and intractable for the 'gawk' user.  The
+developers are currently considering how to rework the network
+facilities to make them easier to understand and use.
+
+
+File: gawkinet.info,  Node: TCP Connecting,  Next: Troubleshooting,  Prev: Gawk Special Files,  Up: Using Networking
+
+2.2 Establishing a TCP Connection
+=================================
+
+Let's observe a network connection at work.  Type in the following
+program and watch the output.  Within a second, it connects via TCP
+('/inet/tcp') to the machine it is running on ('localhost') and asks the
+service 'daytime' on the machine what time it is:
+
+     BEGIN {
+       "/inet/tcp/0/localhost/daytime" |& getline
+       print $0
+       close("/inet/tcp/0/localhost/daytime")
+     }
+
+   Even experienced 'awk' users will find the second line strange in two
+respects:
+
+   * A special file is used as a shell command that pipes its output
+     into 'getline'.  One would rather expect to see the special file
+     being read like any other file ('getline <
+     "/inet/tcp/0/localhost/daytime")'.
+
+   * The operator '|&' has not been part of any 'awk' implementation
+     (until now).  It is actually the only extension of the 'awk'
+     language needed (apart from the special files) to introduce network
+     access.
+
+   The '|&' operator was introduced in 'gawk' 3.1 in order to overcome
+the crucial restriction that access to files and pipes in 'awk' is
+always unidirectional.  It was formerly impossible to use both access
+modes on the same file or pipe.  Instead of changing the whole concept
+of file access, the '|&' operator behaves exactly like the usual pipe
+operator except for two additions:
+
+   * Normal shell commands connected to their 'gawk' program with a '|&'
+     pipe can be accessed bidirectionally.  The '|&' turns out to be a
+     quite general, useful, and natural extension of 'awk'.
+
+   * Pipes that consist of a special file name for network connections
+     are not executed as shell commands.  Instead, they can be read and
+     written to, just like a full-duplex network connection.
+
+   In the earlier example, the '|&' operator tells 'getline' to read a
+line from the special file '/inet/tcp/0/localhost/daytime'.  We could
+also have printed a line into the special file.  But instead we just
+read a line with the time, printed it, and closed the connection.
+(While we could just let 'gawk' close the connection by finishing the
+program, in this Info file we are pedantic and always explicitly close
+the connections.)
+
+
+File: gawkinet.info,  Node: Troubleshooting,  Next: Interacting,  Prev: TCP Connecting,  Up: Using Networking
+
+2.3 Troubleshooting Connection Problems
+=======================================
+
+It may well be that for some reason the program shown in the previous
+example does not run on your machine.  When looking at possible reasons
+for this, you will learn much about typical problems that arise in
+network programming.  First of all, your implementation of 'gawk' may
+not support network access because it is a pre-3.1 version or you do not
+have a network interface in your machine.  Perhaps your machine uses
+some other protocol, such as DECnet or Novell's IPX. For the rest of
+this major node, we will assume you work on a Unix machine that supports
+TCP/IP. If the previous example program does not run on your machine, it
+may help to replace the name 'localhost' with the name of your machine
+or its IP address.  If it does, you could replace 'localhost' with the
+name of another machine in your vicinity--this way, the program connects
+to another machine.  Now you should see the date and time being printed
+by the program, otherwise your machine may not support the 'daytime'
+service.  Try changing the service to 'chargen' or 'ftp'.  This way, the
+program connects to other services that should give you some response.
+If you are curious, you should have a look at your '/etc/services' file.
+It could look like this:
+
+     # /etc/services:
+     #
+     # Network services, Internet style
+     #
+     # Name     Number/Protocol  Alternate name # Comments
+
+     echo        7/tcp
+     echo        7/udp
+     discard     9/tcp         sink null
+     discard     9/udp         sink null
+     daytime     13/tcp
+     daytime     13/udp
+     chargen     19/tcp        ttytst source
+     chargen     19/udp        ttytst source
+     ftp         21/tcp
+     telnet      23/tcp
+     smtp        25/tcp        mail
+     finger      79/tcp
+     www         80/tcp        http      # WorldWideWeb HTTP
+     www         80/udp        # HyperText Transfer Protocol
+     pop-2       109/tcp       postoffice    # POP version 2
+     pop-2       109/udp
+     pop-3       110/tcp       # POP version 3
+     pop-3       110/udp
+     nntp        119/tcp       readnews untp  # USENET News
+     irc         194/tcp       # Internet Relay Chat
+     irc         194/udp
+     ...
+
+   Here, you find a list of services that traditional Unix machines
+usually support.  If your GNU/Linux machine does not do so, it may be
+that these services are switched off in some startup script.  Systems
+running some flavor of Microsoft Windows usually do _not_ support these
+services.  Nevertheless, it _is_ possible to do networking with 'gawk'
+on Microsoft Windows.(1)  The first column of the file gives the name of
+the service, and the second column gives a unique number and the
+protocol that one can use to connect to this service.  The rest of the
+line is treated as a comment.  You see that some services ('echo')
+support TCP as well as UDP.
+
+   ---------- Footnotes ----------
+
+   (1) Microsoft preferred to ignore the TCP/IP family of protocols
+until 1995.  Then came the rise of the Netscape browser as a landmark
+"killer application."  Microsoft added TCP/IP support and their own
+browser to Microsoft Windows 95 at the last minute.  They even
+back-ported their TCP/IP implementation to Microsoft Windows for
+Workgroups 3.11, but it was a rather rudimentary and half-hearted
+implementation.  Nevertheless, the equivalent of '/etc/services' resides
+under 'C:\WINNT\system32\drivers\etc\services' on Microsoft Windows 2000
+and Microsoft Windows XP.
+
+
+File: gawkinet.info,  Node: Interacting,  Next: Setting Up,  Prev: Troubleshooting,  Up: Using Networking
+
+2.4 Interacting with a Network Service
+======================================
+
+The next program makes use of the possibility to really interact with a
+network service by printing something into the special file.  It asks
+the so-called 'finger' service if a user of the machine is logged in.
+When testing this program, try to change 'localhost' to some other
+machine name in your local network:
+
+     BEGIN {
+       NetService = "/inet/tcp/0/localhost/finger"
+       print "NAME" |& NetService
+       while ((NetService |& getline) > 0)
+         print $0
+       close(NetService)
+     }
+
+   After telling the service on the machine which user to look for, the
+program repeatedly reads lines that come as a reply.  When no more lines
+are coming (because the service has closed the connection), the program
+also closes the connection.  Try replacing '"NAME"' with your login name
+(or the name of someone else logged in).  For a list of all users
+currently logged in, replace NAME with an empty string ('""').
+
+   The final 'close()' command could be safely deleted from the above
+script, because the operating system closes any open connection by
+default when a script reaches the end of execution.  In order to avoid
+portability problems, it is best to always close connections explicitly.
+With the Linux kernel, for example, proper closing results in flushing
+of buffers.  Letting the close happen by default may result in
+discarding buffers.
+
+   When looking at '/etc/services' you may have noticed that the
+'daytime' service is also available with 'udp'.  In the earlier example,
+change 'tcp' to 'udp', and change 'finger' to 'daytime'.  After starting
+the modified program, you see the expected day and time message.  The
+program then hangs, because it waits for more lines coming from the
+service.  However, they never come.  This behavior is a consequence of
+the differences between TCP and UDP. When using UDP, neither party is
+automatically informed about the other closing the connection.
+Continuing to experiment this way reveals many other subtle differences
+between TCP and UDP. To avoid such trouble, one should always remember
+the advice Douglas E. Comer and David Stevens give in Volume III of
+their series 'Internetworking With TCP' (page 14):
+
+     When designing client-server applications, beginners are strongly
+     advised to use TCP because it provides reliable,
+     connection-oriented communication.  Programs only use UDP if the
+     application protocol handles reliability, the application requires
+     hardware broadcast or multicast, or the application cannot tolerate
+     virtual circuit overhead.
+
+
+File: gawkinet.info,  Node: Setting Up,  Next: Email,  Prev: Interacting,  Up: Using Networking
+
+2.5 Setting Up a Service
+========================
+
+The preceding programs behaved as clients that connect to a server
+somewhere on the Internet and request a particular service.  Now we set
+up such a service to mimic the behavior of the 'daytime' service.  Such
+a server does not know in advance who is going to connect to it over the
+network.  Therefore, we cannot insert a name for the host to connect to
+in our special file name.
+
+   Start the following program in one window.  Notice that the service
+does not have the name 'daytime', but the number '8888'.  From looking
+at '/etc/services', you know that names like 'daytime' are just
+mnemonics for predetermined 16-bit integers.  Only the system
+administrator ('root') could enter our new service into '/etc/services'
+with an appropriate name.  Also notice that the service name has to be
+entered into a different field of the special file name because we are
+setting up a server, not a client:
+
+     BEGIN {
+       print strftime() |& "/inet/tcp/8888/0/0"
+       close("/inet/tcp/8888/0/0")
+     }
+
+   Now open another window on the same machine.  Copy the client program
+given as the first example (*note Establishing a TCP Connection: TCP
+Connecting.) to a new file and edit it, changing the name 'daytime' to
+'8888'.  Then start the modified client.  You should get a reply like
+this:
+
+     Sat Sep 27 19:08:16 CEST 1997
+
+Both programs explicitly close the connection.
+
+   Now we will intentionally make a mistake to see what happens when the
+name '8888' (the so-called port) is already used by another service.
+Start the server program in both windows.  The first one works, but the
+second one complains that it could not open the connection.  Each port
+on a single machine can only be used by one server program at a time.
+Now terminate the server program and change the name '8888' to 'echo'.
+After restarting it, the server program does not run any more, and you
+know why: there is already an 'echo' service running on your machine.
+But even if this isn't true, you would not get your own 'echo' server
+running on a Unix machine, because the ports with numbers smaller than
+1024 ('echo' is at port 7) are reserved for 'root'.  On machines running
+some flavor of Microsoft Windows, there is no restriction that reserves
+ports 1 to 1024 for a privileged user; hence, you can start an 'echo'
+server there.
+
+   Turning this short server program into something really useful is
+simple.  Imagine a server that first reads a file name from the client
+through the network connection, then does something with the file and
+sends a result back to the client.  The server-side processing could be:
+
+     BEGIN {
+       NetService = "/inet/tcp/8888/0/0"
+       NetService |& getline
+       CatPipe    = ("cat " $1)    # sets $0 and the fields
+       while ((CatPipe | getline) > 0)
+         print $0 |& NetService
+       close(NetService)
+     }
+
+and we would have a remote copying facility.  Such a server reads the
+name of a file from any client that connects to it and transmits the
+contents of the named file across the net.  The server-side processing
+could also be the execution of a command that is transmitted across the
+network.  From this example, you can see how simple it is to open up a
+security hole on your machine.  If you allow clients to connect to your
+machine and execute arbitrary commands, anyone would be free to do 'rm
+-rf *'.
+
+
+File: gawkinet.info,  Node: Email,  Next: Web page,  Prev: Setting Up,  Up: Using Networking
+
+2.6 Reading Email
+=================
+
+The distribution of email is usually done by dedicated email servers
+that communicate with your machine using special protocols.  To receive
+email, we will use the Post Office Protocol (POP). Sending can be done
+with the much older Simple Mail Transfer Protocol (SMTP).
+
+   When you type in the following program, replace the EMAILHOST by the
+name of your local email server.  Ask your administrator if the server
+has a POP service, and then use its name or number in the program below.
+Now the program is ready to connect to your email server, but it will
+not succeed in retrieving your mail because it does not yet know your
+login name or password.  Replace them in the program and it shows you
+the first email the server has in store:
+
+     BEGIN {
+       POPService  = "/inet/tcp/0/EMAILHOST/pop3"
+       RS = ORS = "\r\n"
+       print "user NAME"            |& POPService
+       POPService                    |& getline
+       print "pass PASSWORD"         |& POPService
+       POPService                    |& getline
+       print "retr 1"                |& POPService
+       POPService                    |& getline
+       if ($1 != "+OK") exit
+       print "quit"                  |& POPService
+       RS = "\r\n\\.\r\n"
+       POPService |& getline
+       print $0
+       close(POPService)
+     }
+
+   The record separators 'RS' and 'ORS' are redefined because the
+protocol (POP) requires CR-LF to separate lines.  After identifying
+yourself to the email service, the command 'retr 1' instructs the
+service to send the first of all your email messages in line.  If the
+service replies with something other than '+OK', the program exits;
+maybe there is no email.  Otherwise, the program first announces that it
+intends to finish reading email, and then redefines 'RS' in order to
+read the entire email as multiline input in one record.  From the POP
+RFC, we know that the body of the email always ends with a single line
+containing a single dot.  The program looks for this using 'RS =
+"\r\n\\.\r\n"'.  When it finds this sequence in the mail message, it
+quits.  You can invoke this program as often as you like; it does not
+delete the message it reads, but instead leaves it on the server.
+
+
+File: gawkinet.info,  Node: Web page,  Next: Primitive Service,  Prev: Email,  Up: Using Networking
+
+2.7 Reading a Web Page
+======================
+
+Retrieving a web page from a web server is as simple as retrieving email
+from an email server.  We only have to use a similar, but not identical,
+protocol and a different port.  The name of the protocol is HyperText
+Transfer Protocol (HTTP) and the port number is usually 80.  As in the
+preceding node, ask your administrator about the name of your local web
+server or proxy web server and its port number for HTTP requests.
+
+   The following program employs a rather crude approach toward
+retrieving a web page.  It uses the prehistoric syntax of HTTP 0.9,
+which almost all web servers still support.  The most noticeable thing
+about it is that the program directs the request to the local proxy
+server whose name you insert in the special file name (which in turn
+calls 'www.yahoo.com'):
+
+     BEGIN {
+       RS = ORS = "\r\n"
+       HttpService = "/inet/tcp/0/PROXY/80"
+       print "GET http://www.yahoo.com"     |& HttpService
+       while ((HttpService |& getline) > 0)
+          print $0
+       close(HttpService)
+     }
+
+   Again, lines are separated by a redefined 'RS' and 'ORS'.  The 'GET'
+request that we send to the server is the only kind of HTTP request that
+existed when the web was created in the early 1990s.  HTTP calls this
+'GET' request a "method," which tells the service to transmit a web page
+(here the home page of the Yahoo!  search engine).  Version 1.0 added
+the request methods 'HEAD' and 'POST'.  The current version of HTTP is
+1.1,(1) and knows the additional request methods 'OPTIONS', 'PUT',
+'DELETE', and 'TRACE'.  You can fill in any valid web address, and the
+program prints the HTML code of that page to your screen.
+
+   Notice the similarity between the responses of the POP and HTTP
+services.  First, you get a header that is terminated by an empty line,
+and then you get the body of the page in HTML. The lines of the headers
+also have the same form as in POP. There is the name of a parameter,
+then a colon, and finally the value of that parameter.
+
+   Images ('.png' or '.gif' files) can also be retrieved this way, but
+then you get binary data that should be redirected into a file.  Another
+application is calling a CGI (Common Gateway Interface) script on some
+server.  CGI scripts are used when the contents of a web page are not
+constant, but generated instantly at the moment you send a request for
+the page.  For example, to get a detailed report about the current
+quotes of Motorola stock shares, call a CGI script at Yahoo!  with the
+following:
+
+     get = "GET http://quote.yahoo.com/q?s=MOT&d=t"
+     print get |& HttpService
+
+   You can also request weather reports this way.
+
+   ---------- Footnotes ----------
+
+   (1) Version 1.0 of HTTP was defined in RFC 1945.  HTTP 1.1 was
+initially specified in RFC 2068.  In June 1999, RFC 2068 was made
+obsolete by RFC 2616, an update without any substantial changes.
+
+
+File: gawkinet.info,  Node: Primitive Service,  Next: Interacting Service,  Prev: Web page,  Up: Using Networking
+
+2.8 A Primitive Web Service
+===========================
+
+Now we know enough about HTTP to set up a primitive web service that
+just says '"Hello, world"' when someone connects to it with a browser.
+Compared to the situation in the preceding node, our program changes the
+role.  It tries to behave just like the server we have observed.  Since
+we are setting up a server here, we have to insert the port number in
+the 'localport' field of the special file name.  The other two fields
+(HOSTNAME and REMOTEPORT) have to contain a '0' because we do not know
+in advance which host will connect to our service.
+
+   In the early 1990s, all a server had to do was send an HTML document
+and close the connection.  Here, we adhere to the modern syntax of HTTP.
+The steps are as follows:
+
+  1. Send a status line telling the web browser that everything is okay.
+
+  2. Send a line to tell the browser how many bytes follow in the body
+     of the message.  This was not necessary earlier because both
+     parties knew that the document ended when the connection closed.
+     Nowadays it is possible to stay connected after the transmission of
+     one web page.  This is to avoid the network traffic necessary for
+     repeatedly establishing TCP connections for requesting several
+     images.  Thus, there is the need to tell the receiving party how
+     many bytes will be sent.  The header is terminated as usual with an
+     empty line.
+
+  3. Send the '"Hello, world"' body in HTML. The useless 'while' loop
+     swallows the request of the browser.  We could actually omit the
+     loop, and on most machines the program would still work.  First,
+     start the following program:
+
+     BEGIN {
+       RS = ORS = "\r\n"
+       HttpService = "/inet/tcp/8080/0/0"
+       Hello = "<HTML><HEAD>" \
+               "<TITLE>A Famous Greeting</TITLE></HEAD>" \
+               "<BODY><H1>Hello, world</H1></BODY></HTML>"
+       Len = length(Hello) + length(ORS)
+       print "HTTP/1.0 200 OK"          |& HttpService
+       print "Content-Length: " Len ORS |& HttpService
+       print Hello                      |& HttpService
+       while ((HttpService |& getline) > 0)
+          continue;
+       close(HttpService)
+     }
+
+   Now, on the same machine, start your favorite browser and let it
+point to <http://localhost:8080> (the browser needs to know on which
+port our server is listening for requests).  If this does not work, the
+browser probably tries to connect to a proxy server that does not know
+your machine.  If so, change the browser's configuration so that the
+browser does not try to use a proxy to connect to your machine.
+
+
+File: gawkinet.info,  Node: Interacting Service,  Next: Simple Server,  Prev: Primitive Service,  Up: Using Networking
+
+2.9 A Web Service with Interaction
+==================================
+
+This node shows how to set up a simple web server.  The subnode is a
+library file that we will use with all the examples in *note Some
+Applications and Techniques::.
+
+* Menu:
+
+* CGI Lib::                     A simple CGI library.
+
+   Setting up a web service that allows user interaction is more
+difficult and shows us the limits of network access in 'gawk'.  In this
+node, we develop a main program (a 'BEGIN' pattern and its action) that
+will become the core of event-driven execution controlled by a graphical
+user interface (GUI). Each HTTP event that the user triggers by some
+action within the browser is received in this central procedure.
+Parameters and menu choices are extracted from this request, and an
+appropriate measure is taken according to the user's choice.  For
+example:
+
+     BEGIN {
+       if (MyHost == "") {
+          "uname -n" | getline MyHost
+          close("uname -n")
+       }
+       if (MyPort ==  0) MyPort = 8080
+       HttpService = "/inet/tcp/" MyPort "/0/0"
+       MyPrefix    = "http://" MyHost ":" MyPort
+       SetUpServer()
+       while ("awk" != "complex") {
+         # header lines are terminated this way
+         RS = ORS = "\r\n"
+         Status   = 200          # this means OK
+         Reason   = "OK"
+         Header   = TopHeader
+         Document = TopDoc
+         Footer   = TopFooter
+         if        (GETARG["Method"] == "GET") {
+             HandleGET()
+         } else if (GETARG["Method"] == "HEAD") {
+             # not yet implemented
+         } else if (GETARG["Method"] != "") {
+             print "bad method", GETARG["Method"]
+         }
+         Prompt = Header Document Footer
+         print "HTTP/1.0", Status, Reason       |& HttpService
+         print "Connection: Close"              |& HttpService
+         print "Pragma: no-cache"               |& HttpService
+         len = length(Prompt) + length(ORS)
+         print "Content-length:", len           |& HttpService
+         print ORS Prompt                       |& HttpService
+         # ignore all the header lines
+         while ((HttpService |& getline) > 0)
+             ;
+         # stop talking to this client
+         close(HttpService)
+         # wait for new client request
+         HttpService |& getline
+         # do some logging
+         print systime(), strftime(), $0
+         # read request parameters
+         CGI_setup($1, $2, $3)
+       }
+     }
+
+   This web server presents menu choices in the form of HTML links.
+Therefore, it has to tell the browser the name of the host it is
+residing on.  When starting the server, the user may supply the name of
+the host from the command line with 'gawk -v MyHost="Rumpelstilzchen"'.
+If the user does not do this, the server looks up the name of the host
+it is running on for later use as a web address in HTML documents.  The
+same applies to the port number.  These values are inserted later into
+the HTML content of the web pages to refer to the home system.
+
+   Each server that is built around this core has to initialize some
+application-dependent variables (such as the default home page) in a
+procedure 'SetUpServer()', which is called immediately before entering
+the infinite loop of the server.  For now, we will write an instance
+that initiates a trivial interaction.  With this home page, the client
+user can click on two possible choices, and receive the current date
+either in human-readable format or in seconds since 1970:
+
+     function SetUpServer() {
+       TopHeader = "<HTML><HEAD>"
+       TopHeader = TopHeader \
+          "<title>My name is GAWK, GNU AWK</title></HEAD>"
+       TopDoc    = "<BODY><h2>\
+         Do you prefer your date <A HREF=" MyPrefix \
+         "/human>human</A> or \
+         <A HREF=" MyPrefix "/POSIX>POSIXed</A>?</h2>" ORS ORS
+       TopFooter = "</BODY></HTML>"
+     }
+
+   On the first run through the main loop, the default line terminators
+are set and the default home page is copied to the actual home page.
+Since this is the first run, 'GETARG["Method"]' is not initialized yet,
+hence the case selection over the method does nothing.  Now that the
+home page is initialized, the server can start communicating to a client
+browser.
+
+   It does so by printing the HTTP header into the network connection
+('print ... |& HttpService').  This command blocks execution of the
+server script until a client connects.  If this server script is
+compared with the primitive one we wrote before, you will notice two
+additional lines in the header.  The first instructs the browser to
+close the connection after each request.  The second tells the browser
+that it should never try to _remember_ earlier requests that had
+identical web addresses (no caching).  Otherwise, it could happen that
+the browser retrieves the time of day in the previous example just once,
+and later it takes the web page from the cache, always displaying the
+same time of day although time advances each second.
+
+   Having supplied the initial home page to the browser with a valid
+document stored in the parameter 'Prompt', it closes the connection and
+waits for the next request.  When the request comes, a log line is
+printed that allows us to see which request the server receives.  The
+final step in the loop is to call the function 'CGI_setup()', which
+reads all the lines of the request (coming from the browser), processes
+them, and stores the transmitted parameters in the array 'PARAM'.  The
+complete text of these application-independent functions can be found in
+*note A Simple CGI Library: CGI Lib.  For now, we use a simplified
+version of 'CGI_setup()':
+
+     function CGI_setup(   method, uri, version, i) {
+       delete GETARG;         delete MENU;        delete PARAM
+       GETARG["Method"] = $1
+       GETARG["URI"] = $2
+       GETARG["Version"] = $3
+       i = index($2, "?")
+       # is there a "?" indicating a CGI request?
+       if (i > 0) {
+         split(substr($2, 1, i-1), MENU, "[/:]")
+         split(substr($2, i+1), PARAM, "&")
+         for (i in PARAM) {
+           j = index(PARAM[i], "=")
+           GETARG[substr(PARAM[i], 1, j-1)] = \
+                                       substr(PARAM[i], j+1)
+         }
+       } else {    # there is no "?", no need for splitting PARAMs
+         split($2, MENU, "[/:]")
+       }
+     }
+
+   At first, the function clears all variables used for global storage
+of request parameters.  The rest of the function serves the purpose of
+filling the global parameters with the extracted new values.  To
+accomplish this, the name of the requested resource is split into parts
+and stored for later evaluation.  If the request contains a '?', then
+the request has CGI variables seamlessly appended to the web address.
+Everything in front of the '?' is split up into menu items, and
+everything behind the '?' is a list of 'VARIABLE=VALUE' pairs (separated
+by '&') that also need splitting.  This way, CGI variables are isolated
+and stored.  This procedure lacks recognition of special characters that
+are transmitted in coded form(1).  Here, any optional request header and
+body parts are ignored.  We do not need header parameters and the
+request body.  However, when refining our approach or working with the
+'POST' and 'PUT' methods, reading the header and body becomes
+inevitable.  Header parameters should then be stored in a global array
+as well as the body.
+
+   On each subsequent run through the main loop, one request from a
+browser is received, evaluated, and answered according to the user's
+choice.  This can be done by letting the value of the HTTP method guide
+the main loop into execution of the procedure 'HandleGET()', which
+evaluates the user's choice.  In this case, we have only one
+hierarchical level of menus, but in the general case, menus are nested.
+The menu choices at each level are separated by '/', just as in file
+names.  Notice how simple it is to construct menus of arbitrary depth:
+
+     function HandleGET() {
+       if (       MENU[2] == "human") {
+         Footer = strftime() TopFooter
+       } else if (MENU[2] == "POSIX") {
+         Footer = systime()  TopFooter
+       }
+     }
+
+   The disadvantage of this approach is that our server is slow and can
+handle only one request at a time.  Its main advantage, however, is that
+the server consists of just one 'gawk' program.  No need for installing
+an 'httpd', and no need for static separate HTML files, CGI scripts, or
+'root' privileges.  This is rapid prototyping.  This program can be
+started on the same host that runs your browser.  Then let your browser
+point to <http://localhost:8080>.
+
+   It is also possible to include images into the HTML pages.  Most
+browsers support the not very well-known '.xbm' format, which may
+contain only monochrome pictures but is an ASCII format.  Binary images
+are possible but not so easy to handle.  Another way of including images
+is to generate them with a tool such as GNUPlot, by calling the tool
+with the 'system()' function or through a pipe.
+
+   ---------- Footnotes ----------
+
+   (1) As defined in RFC 2068.
+
+
+File: gawkinet.info,  Node: CGI Lib,  Prev: Interacting Service,  Up: Interacting Service
+
+2.9.1 A Simple CGI Library
+--------------------------
+
+     HTTP is like being married: you have to be able to handle whatever
+     you're given, while being very careful what you send back.
+     Phil Smith III,
+     <http://www.netfunny.com/rhf/jokes/99/Mar/http.html>
+
+   In *note A Web Service with Interaction: Interacting Service, we saw
+the function 'CGI_setup()' as part of the web server "core logic"
+framework.  The code presented there handles almost everything necessary
+for CGI requests.  One thing it doesn't do is handle encoded characters
+in the requests.  For example, an '&' is encoded as a percent sign
+followed by the hexadecimal value: '%26'.  These encoded values should
+be decoded.  Following is a simple library to perform these tasks.  This
+code is used for all web server examples used throughout the rest of
+this Info file.  If you want to use it for your own web server, store
+the source code into a file named 'inetlib.awk'.  Then you can include
+these functions into your code by placing the following statement into
+your program (on the first line of your script):
+
+     @include inetlib.awk
+
+But beware, this mechanism is only possible if you invoke your web
+server script with 'igawk' instead of the usual 'awk' or 'gawk'.  Here
+is the code:
+
+     # CGI Library and core of a web server
+     # Global arrays
+     #   GETARG --- arguments to CGI GET command
+     #   MENU   --- menu items (path names)
+     #   PARAM  --- parameters of form x=y
+
+     # Optional variable MyHost contains host address
+     # Optional variable MyPort contains port number
+     # Needs TopHeader, TopDoc, TopFooter
+     # Sets MyPrefix, HttpService, Status, Reason
+
+     BEGIN {
+       if (MyHost == "") {
+          "uname -n" | getline MyHost
+          close("uname -n")
+       }
+       if (MyPort ==  0) MyPort = 8080
+       HttpService = "/inet/tcp/" MyPort "/0/0"
+       MyPrefix    = "http://" MyHost ":" MyPort
+       SetUpServer()
+       while ("awk" != "complex") {
+         # header lines are terminated this way
+         RS = ORS    = "\r\n"
+         Status      = 200             # this means OK
+         Reason      = "OK"
+         Header      = TopHeader
+         Document    = TopDoc
+         Footer      = TopFooter
+         if        (GETARG["Method"] == "GET") {
+             HandleGET()
+         } else if (GETARG["Method"] == "HEAD") {
+             # not yet implemented
+         } else if (GETARG["Method"] != "") {
+             print "bad method", GETARG["Method"]
+         }
+         Prompt = Header Document Footer
+         print "HTTP/1.0", Status, Reason     |& HttpService
+         print "Connection: Close"            |& HttpService
+         print "Pragma: no-cache"             |& HttpService
+         len = length(Prompt) + length(ORS)
+         print "Content-length:", len         |& HttpService
+         print ORS Prompt                     |& HttpService
+         # ignore all the header lines
+         while ((HttpService |& getline) > 0)
+             continue
+         # stop talking to this client
+         close(HttpService)
+         # wait for new client request
+         HttpService |& getline
+         # do some logging
+         print systime(), strftime(), $0
+         CGI_setup($1, $2, $3)
+       }
+     }
+
+     function CGI_setup(   method, uri, version, i)
+     {
+         delete GETARG
+         delete MENU
+         delete PARAM
+         GETARG["Method"] = method
+         GETARG["URI"] = uri
+         GETARG["Version"] = version
+
+         i = index(uri, "?")
+         if (i > 0) {  # is there a "?" indicating a CGI request?
+             split(substr(uri, 1, i-1), MENU, "[/:]")
+             split(substr(uri, i+1), PARAM, "&")
+             for (i in PARAM) {
+                 PARAM[i] = _CGI_decode(PARAM[i])
+                 j = index(PARAM[i], "=")
+                 GETARG[substr(PARAM[i], 1, j-1)] = \
+                                              substr(PARAM[i], j+1)
+             }
+         } else { # there is no "?", no need for splitting PARAMs
+             split(uri, MENU, "[/:]")
+         }
+         for (i in MENU)     # decode characters in path
+             if (i > 4)      # but not those in host name
+                 MENU[i] = _CGI_decode(MENU[i])
+     }
+
+   This isolates details in a single function, 'CGI_setup()'.  Decoding
+of encoded characters is pushed off to a helper function,
+'_CGI_decode()'.  The use of the leading underscore ('_') in the
+function name is intended to indicate that it is an "internal" function,
+although there is nothing to enforce this:
+
+     function _CGI_decode(str,   hexdigs, i, pre, code1, code2,
+                                 val, result)
+     {
+        hexdigs = "123456789abcdef"
+
+        i = index(str, "%")
+        if (i == 0) # no work to do
+           return str
+
+        do {
+           pre = substr(str, 1, i-1)   # part before %xx
+           code1 = substr(str, i+1, 1) # first hex digit
+           code2 = substr(str, i+2, 1) # second hex digit
+           str = substr(str, i+3)      # rest of string
+
+           code1 = tolower(code1)
+           code2 = tolower(code2)
+           val = index(hexdigs, code1) * 16 \
+                 + index(hexdigs, code2)
+
+           result = result pre sprintf("%c", val)
+           i = index(str, "%")
+        } while (i != 0)
+        if (length(str) > 0)
+           result = result str
+        return result
+     }
+
+   This works by splitting the string apart around an encoded character.
+The two digits are converted to lowercase characters and looked up in a
+string of hex digits.  Note that '0' is not in the string on purpose;
+'index()' returns zero when it's not found, automatically giving the
+correct value!  Once the hexadecimal value is converted from characters
+in a string into a numerical value, 'sprintf()' converts the value back
+into a real character.  The following is a simple test harness for the
+above functions:
+
+     BEGIN {
+       CGI_setup("GET",
+       "http://www.gnu.org/cgi-bin/foo?p1=stuff&p2=stuff%26junk" \
+            "&percent=a %25 sign",
+       "1.0")
+       for (i in MENU)
+           printf "MENU[\"%s\"] = %s\n", i, MENU[i]
+       for (i in PARAM)
+           printf "PARAM[\"%s\"] = %s\n", i, PARAM[i]
+       for (i in GETARG)
+           printf "GETARG[\"%s\"] = %s\n", i, GETARG[i]
+     }
+
+   And this is the result when we run it:
+
+     $ gawk -f testserv.awk
+     -| MENU["4"] = www.gnu.org
+     -| MENU["5"] = cgi-bin
+     -| MENU["6"] = foo
+     -| MENU["1"] = http
+     -| MENU["2"] =
+     -| MENU["3"] =
+     -| PARAM["1"] = p1=stuff
+     -| PARAM["2"] = p2=stuff&junk
+     -| PARAM["3"] = percent=a % sign
+     -| GETARG["p1"] = stuff
+     -| GETARG["percent"] = a % sign
+     -| GETARG["p2"] = stuff&junk
+     -| GETARG["Method"] = GET
+     -| GETARG["Version"] = 1.0
+     -| GETARG["URI"] = http://www.gnu.org/cgi-bin/foo?p1=stuff&
+     p2=stuff%26junk&percent=a %25 sign
+
+
+File: gawkinet.info,  Node: Simple Server,  Next: Caveats,  Prev: Interacting Service,  Up: Using Networking
+
+2.10 A Simple Web Server
+========================
+
+In the preceding node, we built the core logic for event-driven GUIs.
+In this node, we finally extend the core to a real application.  No one
+would actually write a commercial web server in 'gawk', but it is
+instructive to see that it is feasible in principle.
+
+   The application is ELIZA, the famous program by Joseph Weizenbaum
+that mimics the behavior of a professional psychotherapist when talking
+to you.  Weizenbaum would certainly object to this description, but this
+is part of the legend around ELIZA. Take the site-independent core logic
+and append the following code:
+
+     function SetUpServer() {
+       SetUpEliza()
+       TopHeader = \
+         "<HTML><title>An HTTP-based System with GAWK</title>\
+         <HEAD><META HTTP-EQUIV=\"Content-Type\"\
+         CONTENT=\"text/html; charset=iso-8859-1\"></HEAD>\
+         <BODY BGCOLOR=\"#ffffff\" TEXT=\"#000000\"\
+         LINK=\"#0000ff\" VLINK=\"#0000ff\"\
+         ALINK=\"#0000ff\"> <A NAME=\"top\">"
+       TopDoc    = "\
+        <h2>Please choose one of the following actions:</h2>\
+        <UL>\
+        <LI>\
+        <A HREF=" MyPrefix "/AboutServer>About this server</A>\
+        </LI><LI>\
+        <A HREF=" MyPrefix "/AboutELIZA>About Eliza</A></LI>\
+        <LI>\
+        <A HREF=" MyPrefix \
+           "/StartELIZA>Start talking to Eliza</A></LI></UL>"
+       TopFooter = "</BODY></HTML>"
+     }
+
+   'SetUpServer()' is similar to the previous example, except for
+calling another function, 'SetUpEliza()'.  This approach can be used to
+implement other kinds of servers.  The only changes needed to do so are
+hidden in the functions 'SetUpServer()' and 'HandleGET()'.  Perhaps it
+might be necessary to implement other HTTP methods.  The 'igawk' program
+that comes with 'gawk' may be useful for this process.
+
+   When extending this example to a complete application, the first
+thing to do is to implement the function 'SetUpServer()' to initialize
+the HTML pages and some variables.  These initializations determine the
+way your HTML pages look (colors, titles, menu items, etc.).
+
+   The function 'HandleGET()' is a nested case selection that decides
+which page the user wants to see next.  Each nesting level refers to a
+menu level of the GUI. Each case implements a certain action of the
+menu.  On the deepest level of case selection, the handler essentially
+knows what the user wants and stores the answer into the variable that
+holds the HTML page contents:
+
+     function HandleGET() {
+       # A real HTTP server would treat some parts of the URI as a file name.
+       # We take parts of the URI as menu choices and go on accordingly.
+       if(MENU[2] == "AboutServer") {
+         Document    = "This is not a CGI script.\
+           This is an httpd, an HTML file, and a CGI script all \
+           in one GAWK script. It needs no separate www-server, \
+           no installation, and no root privileges.\
+           <p>To run it, do this:</p><ul>\
+           <li> start this script with \"gawk -f httpserver.awk\",</li>\
+           <li> and on the same host let your www browser open location\
+                \"http://localhost:8080\"</li>\
+           </ul>\<p>\ Details of HTTP come from:</p><ul>\
+                 <li>Hethmon:  Illustrated Guide to HTTP</p>\
+                 <li>RFC 2068</li></ul><p>JK 14.9.1997</p>"
+       } else if (MENU[2] == "AboutELIZA") {
+         Document    = "This is an implementation of the famous ELIZA\
+             program by Joseph Weizenbaum. It is written in GAWK and\
+             uses an HTML GUI."
+       } else if (MENU[2] == "StartELIZA") {
+         gsub(/\+/, " ", GETARG["YouSay"])
+         # Here we also have to substitute coded special characters
+         Document    = "<form method=GET>" \
+           "<h3>" ElizaSays(GETARG["YouSay"]) "</h3>\
+           <p><input type=text name=YouSay value=\"\" size=60>\
+           <br><input type=submit value=\"Tell her about it\"></p></form>"
+       }
+     }
+
+   Now we are down to the heart of ELIZA, so you can see how it works.
+Initially the user does not say anything; then ELIZA resets its money
+counter and asks the user to tell what comes to mind open heartedly.
+The subsequent answers are converted to uppercase characters and stored
+for later comparison.  ELIZA presents the bill when being confronted
+with a sentence that contains the phrase "shut up."  Otherwise, it looks
+for keywords in the sentence, conjugates the rest of the sentence,
+remembers the keyword for later use, and finally selects an answer from
+the set of possible answers:
+
+     function ElizaSays(YouSay) {
+       if (YouSay == "") {
+         cost = 0
+         answer = "HI, IM ELIZA, TELL ME YOUR PROBLEM"
+       } else {
+         q = toupper(YouSay)
+         gsub("'", "", q)
+         if(q == qold) {
+           answer = "PLEASE DONT REPEAT YOURSELF !"
+         } else {
+           if (index(q, "SHUT UP") > 0) {
+             answer = "WELL, PLEASE PAY YOUR BILL. ITS EXACTLY ... $"\
+                      int(100*rand()+30+cost/100)
+           } else {
+             qold = q
+             w = "-"                 # no keyword recognized yet
+             for (i in k) {          # search for keywords
+               if (index(q, i) > 0) {
+                 w = i
+                 break
+               }
+             }
+             if (w == "-") {         # no keyword, take old subject
+               w    = wold
+               subj = subjold
+             } else {                # find subject
+               subj = substr(q, index(q, w) + length(w)+1)
+               wold = w
+               subjold = subj        #  remember keyword and subject
+             }
+             for (i in conj)
+                gsub(i, conj[i], q)   # conjugation
+             # from all answers to this keyword, select one randomly
+             answer = r[indices[int(split(k[w], indices) * rand()) + 1]]
+             # insert subject into answer
+             gsub("_", subj, answer)
+           }
+         }
+       }
+       cost += length(answer) # for later payment : 1 cent per character
+       return answer
+     }
+
+   In the long but simple function 'SetUpEliza()', you can see tables
+for conjugation, keywords, and answers.(1)  The associative array 'k'
+contains indices into the array of answers 'r'.  To choose an answer,
+ELIZA just picks an index randomly:
+
+     function SetUpEliza() {
+       srand()
+       wold = "-"
+       subjold = " "
+
+       # table for conjugation
+       conj[" ARE "     ] = " AM "
+       conj["WERE "     ] = "WAS "
+       conj[" YOU "     ] = " I "
+       conj["YOUR "     ] = "MY "
+       conj[" IVE "     ] =\
+       conj[" I HAVE "  ] = " YOU HAVE "
+       conj[" YOUVE "   ] =\
+       conj[" YOU HAVE "] = " I HAVE "
+       conj[" IM "      ] =\
+       conj[" I AM "    ] = " YOU ARE "
+       conj[" YOURE "   ] =\
+       conj[" YOU ARE " ] = " I AM "
+
+       # table of all answers
+       r[1]   = "DONT YOU BELIEVE THAT I CAN  _"
+       r[2]   = "PERHAPS YOU WOULD LIKE TO BE ABLE TO _ ?"
+       ...
+
+       # table for looking up answers that
+       # fit to a certain keyword
+       k["CAN YOU"]      = "1 2 3"
+       k["CAN I"]        = "4 5"
+       k["YOU ARE"]      =\
+       k["YOURE"]        = "6 7 8 9"
+       ...
+     }
+
+   Some interesting remarks and details (including the original source
+code of ELIZA) are found on Mark Humphrys' home page.  Yahoo!  also has
+a page with a collection of ELIZA-like programs.  Many of them are
+written in Java, some of them disclosing the Java source code, and a few
+even explain how to modify the Java source code.
+
+   ---------- Footnotes ----------
+
+   (1) The version shown here is abbreviated.  The full version comes
+with the 'gawk' distribution.
+
+
+File: gawkinet.info,  Node: Caveats,  Next: Challenges,  Prev: Simple Server,  Up: Using Networking
+
+2.11 Network Programming Caveats
+================================
+
+By now it should be clear that debugging a networked application is more
+complicated than debugging a single-process single-hosted application.
+The behavior of a networked application sometimes looks noncausal
+because it is not reproducible in a strong sense.  Whether a network
+application works or not sometimes depends on the following:
+
+   * How crowded the underlying network is
+
+   * If the party at the other end is running or not
+
+   * The state of the party at the other end
+
+   The most difficult problems for a beginner arise from the hidden
+states of the underlying network.  After closing a TCP connection, it's
+often necessary to wait a short while before reopening the connection.
+Even more difficult is the establishment of a connection that previously
+ended with a "broken pipe."  Those connections have to "time out" for a
+minute or so before they can reopen.  Check this with the command
+'netstat -a', which provides a list of still "active" connections.
+
+
+File: gawkinet.info,  Node: Challenges,  Prev: Caveats,  Up: Using Networking
+
+2.12 Where To Go From Here
+==========================
+
+Now, you have learned enough to build your own application.  You could,
+for example, take part in the Loebner Contest to win $100,000.  The
+Loebner Prize is the first formal instantiation of a Turing Test.  Hugh
+Loebner agreed with The Cambridge Center for Behavioral Studies to
+underwrite a contest designed to implement the Turing Test.  Dr. Loebner
+pledged a Grand Prize of $100,000 for the first computer whose responses
+were indistinguishable from a human's.  Each year an annual prize of
+$2000 and a bronze medal is awarded to the _most_ human computer.  The
+winner of the annual contest is the best entry relative to other entries
+that year, irrespective of how good it is in an absolute sense.  Here is
+an example of a conversation with the winning program of 1997:
+
+     (c)1997 Cambridge Center For Behavioral Studies all rights reserved [CONVERSE]
+     [Intelligent Research Ltd.] Start at: [1997/04/29 08:59:40]
+     PROGRAM[10:39:20]Hi, my name is Catherine, what's yours?
+     JUDGE02[10:39:36]My name is Nathan.
+     PROGRAM[10:39:37]Well, it's nice to meet you, Nathan.
+     PROGRAM[10:39:45]Did you see that story on CNN last night about the lesbian
+     PROGRAM[10:40:03]couple who came out at a White House party on Sunday?
+     JUDGE02[10:40:45]No.  I'm sorry I missed that.  How did the Prez take it?
+     PROGRAM[10:40:47]Ellen Degeneres was one of them - she was kissing her lover
+     PROGRAM[10:41:08]in front of Bill Clinton. Do you think he was very
+     PROGRAM[10:41:21]embarrassed?
+     JUDGE02[10:41:37]You tell me.  I didn't see it.
+     PROGRAM[10:41:38]Are you still a Friend of Bill Clinton?
+     JUDGE02[10:42:22]Yes.  I would like to talk about something else.  Do you know anything about sculpture.
+     PROGRAM[10:42:24]Well, I suppose he's doing his best like most of us, but he
+     PROGRAM[10:42:35]does seem to attract an UNUSUAL amount of bad publicity,
+     PROGRAM[10:42:51]wouldnt you say--can it all be bad luck?
+     JUDGE02[10:43:20]I asked you if you knew anything about sculpture.
+     PROGRAM[10:43:21]OK, but what happens now--does it all drift on, with
+     PROGRAM[10:43:36]everyone waiting to see how much his friends have said about
+     PROGRAM[10:43:47]him under threat of lots and lots of jail-or will something
+     PROGRAM[10:44:01]dramatic actually happen?
+     JUDGE02[10:44:19]Do you know what sculpture is?
+     PROGRAM[10:44:20]Well, youre more a fan of his than I am I guess---though I
+     PROGRAM[10:44:34]cant SWEAR what Id have done in that locked hotel room, I
+     PROGRAM[10:44:51]must say.
+     JUDGE02[10:45:26]Have you ever been to the Salmagundi Club?
+     PROGRAM[10:45:29]Never. to go back to the original discussion Time to get
+     PROGRAM[10:45:43]off this one I think and onto something more down to earth!
+     JUDGE02[10:46:20]What is more down to earth than art?
+     PROGRAM[10:46:21]I don't know what -- to be honest. -
+     PROGRAM[10:46:30]I suppose it depends on where you're coming from, but as the
+     PROGRAM[10:46:46]song goes in My Fair Lady, "Why can't a woman be more like a
+     PROGRAM[10:47:03]man? "
+     JUDGE02[10:47:52]I liked that song but I prefer "Why can't children be more like we are"
+     JUDGE02[10:48:10]from another play the name of which I cannot remember.
+     PROGRAM[10:48:12]What, if anything, do you enjoy about being a man?
+     JUDGE02[10:49:01]I would enjoy all aspects of life no matter what my gender.
+     PROGRAM[10:49:02]Do you think of yourself as being attractive?
+
+   This program insists on always speaking about the same story around
+Bill Clinton.  You see, even a program with a rather narrow mind can
+behave so much like a human being that it can win this prize.  It is
+quite common to let these programs talk to each other via network
+connections.  But during the competition itself, the program and its
+computer have to be present at the place the competition is held.  We
+all would love to see a 'gawk' program win in such an event.  Maybe it
+is up to you to accomplish this?
+
+   Some other ideas for useful networked applications:
+   * Read the file 'doc/awkforai.txt' in the 'gawk' distribution.  It
+     was written by Ronald P. Loui (at the time, Associate Professor of
+     Computer Science, at Washington University in St.  Louis,
+     <loui@ai.wustl.edu>) and summarizes why he taught 'gawk' to
+     students of Artificial Intelligence.  Here are some passages from
+     the text:
+
+          The GAWK manual can be consumed in a single lab session and
+          the language can be mastered by the next morning by the
+          average student.  GAWK's automatic initialization, implicit
+          coercion, I/O support and lack of pointers forgive many of the
+          mistakes that young programmers are likely to make.  Those who
+          have seen C but not mastered it are happy to see that GAWK
+          retains some of the same sensibilities while adding what must
+          be regarded as spoonsful of syntactic sugar.
+          ...
+          There are further simple answers.  Probably the best is the
+          fact that increasingly, undergraduate AI programming is
+          involving the Web.  Oren Etzioni (University of Washington,
+          Seattle) has for a while been arguing that the "softbot" is
+          replacing the mechanical engineers' robot as the most
+          glamorous AI testbed.  If the artifact whose behavior needs to
+          be controlled in an intelligent way is the software agent,
+          then a language that is well-suited to controlling the
+          software environment is the appropriate language.  That would
+          imply a scripting language.  If the robot is KAREL, then the
+          right language is "turn left; turn right."  If the robot is
+          Netscape, then the right language is something that can
+          generate 'netscape -remote
+          'openURL(http://cs.wustl.edu/~loui)'' with elan.
+          ...
+          AI programming requires high-level thinking.  There have
+          always been a few gifted programmers who can write high-level
+          programs in assembly language.  Most however need the ambient
+          abstraction to have a higher floor.
+          ...
+          Second, inference is merely the expansion of notation.  No
+          matter whether the logic that underlies an AI program is
+          fuzzy, probabilistic, deontic, defeasible, or deductive, the
+          logic merely defines how strings can be transformed into other
+          strings.  A language that provides the best support for string
+          processing in the end provides the best support for logic, for
+          the exploration of various logics, and for most forms of
+          symbolic processing that AI might choose to call "reasoning"
+          instead of "logic."  The implication is that PROLOG, which
+          saves the AI programmer from having to write a unifier, saves
+          perhaps two dozen lines of GAWK code at the expense of
+          strongly biasing the logic and representational expressiveness
+          of any approach.
+
+     Now that 'gawk' itself can connect to the Internet, it should be
+     obvious that it is suitable for writing intelligent web agents.
+
+   * 'awk' is strong at pattern recognition and string processing.  So,
+     it is well suited to the classic problem of language translation.
+     A first try could be a program that knows the 100 most frequent
+     English words and their counterparts in German or French.  The
+     service could be implemented by regularly reading email with the
+     program above, replacing each word by its translation and sending
+     the translation back via SMTP. Users would send English email to
+     their translation service and get back a translated email message
+     in return.  As soon as this works, more effort can be spent on a
+     real translation program.
+
+   * Another dialogue-oriented application (on the verge of ridicule) is
+     the email "support service."  Troubled customers write an email to
+     an automatic 'gawk' service that reads the email.  It looks for
+     keywords in the mail and assembles a reply email accordingly.  By
+     carefully investigating the email header, and repeating these
+     keywords through the reply email, it is rather simple to give the
+     customer a feeling that someone cares.  Ideally, such a service
+     would search a database of previous cases for solutions.  If none
+     exists, the database could, for example, consist of all the
+     newsgroups, mailing lists and FAQs on the Internet.
+
+
+File: gawkinet.info,  Node: Some Applications and Techniques,  Next: Links,  Prev: Using Networking,  Up: Top
+
+3 Some Applications and Techniques
+**********************************
+
+In this major node, we look at a number of self-contained scripts, with
+an emphasis on concise networking.  Along the way, we work towards
+creating building blocks that encapsulate often needed functions of the
+networking world, show new techniques that broaden the scope of problems
+that can be solved with 'gawk', and explore leading edge technology that
+may shape the future of networking.
+
+   We often refer to the site-independent core of the server that we
+built in *note A Simple Web Server: Simple Server.  When building new
+and nontrivial servers, we always copy this building block and append
+new instances of the two functions 'SetUpServer()' and 'HandleGET()'.
+
+   This makes a lot of sense, since this scheme of event-driven
+execution provides 'gawk' with an interface to the most widely accepted
+standard for GUIs: the web browser.  Now, 'gawk' can rival even Tcl/Tk.
+
+   Tcl and 'gawk' have much in common.  Both are simple scripting
+languages that allow us to quickly solve problems with short programs.
+But Tcl has Tk on top of it, and 'gawk' had nothing comparable up to
+now.  While Tcl needs a large and ever-changing library (Tk, which was
+bound to the X Window System until recently), 'gawk' needs just the
+networking interface and some kind of browser on the client's side.
+Besides better portability, the most important advantage of this
+approach (embracing well-established standards such HTTP and HTML) is
+that _we do not need to change the language_.  We let others do the work
+of fighting over protocols and standards.  We can use HTML, JavaScript,
+VRML, or whatever else comes along to do our work.
+
+* Menu:
+
+* PANIC::                       An Emergency Web Server.
+* GETURL::                      Retrieving Web Pages.
+* REMCONF::                     Remote Configuration Of Embedded Systems.
+* URLCHK::                      Look For Changed Web Pages.
+* WEBGRAB::                     Extract Links From A Page.
+* STATIST::                     Graphing A Statistical Distribution.
+* MAZE::                        Walking Through A Maze In Virtual Reality.
+* MOBAGWHO::                    A Simple Mobile Agent.
+* STOXPRED::                    Stock Market Prediction As A Service.
+* PROTBASE::                    Searching Through A Protein Database.
+
+
+File: gawkinet.info,  Node: PANIC,  Next: GETURL,  Prev: Some Applications and Techniques,  Up: Some Applications and Techniques
+
+3.1 PANIC: An Emergency Web Server
+==================================
+
+At first glance, the '"Hello, world"' example in *note A Primitive Web
+Service: Primitive Service, seems useless.  By adding just a few lines,
+we can turn it into something useful.
+
+   The PANIC program tells everyone who connects that the local site is
+not working.  When a web server breaks down, it makes a difference if
+customers get a strange "network unreachable" message, or a short
+message telling them that the server has a problem.  In such an
+emergency, the hard disk and everything on it (including the regular web
+service) may be unavailable.  Rebooting the web server off a diskette
+makes sense in this setting.
+
+   To use the PANIC program as an emergency web server, all you need are
+the 'gawk' executable and the program below on a diskette.  By default,
+it connects to port 8080.  A different value may be supplied on the
+command line:
+
+     BEGIN {
+       RS = ORS = "\r\n"
+       if (MyPort ==  0) MyPort = 8080
+       HttpService = "/inet/tcp/" MyPort "/0/0"
+       Hello = "<HTML><HEAD><TITLE>Out Of Service</TITLE>" \
+          "</HEAD><BODY><H1>" \
+          "This site is temporarily out of service." \
+          "</H1></BODY></HTML>"
+       Len = length(Hello) + length(ORS)
+       while ("awk" != "complex") {
+         print "HTTP/1.0 200 OK"          |& HttpService
+         print "Content-Length: " Len ORS |& HttpService
+         print Hello                      |& HttpService
+         while ((HttpService |& getline) > 0)
+            continue;
+         close(HttpService)
+       }
+     }
+
+
+File: gawkinet.info,  Node: GETURL,  Next: REMCONF,  Prev: PANIC,  Up: Some Applications and Techniques
+
+3.2 GETURL: Retrieving Web Pages
+================================
+
+GETURL is a versatile building block for shell scripts that need to
+retrieve files from the Internet.  It takes a web address as a
+command-line parameter and tries to retrieve the contents of this
+address.  The contents are printed to standard output, while the header
+is printed to '/dev/stderr'.  A surrounding shell script could analyze
+the contents and extract the text or the links.  An ASCII browser could
+be written around GETURL. But more interestingly, web robots are
+straightforward to write on top of GETURL. On the Internet, you can find
+several programs of the same name that do the same job.  They are
+usually much more complex internally and at least 10 times longer.
+
+   At first, GETURL checks if it was called with exactly one web
+address.  Then, it checks if the user chose to use a special proxy
+server whose name is handed over in a variable.  By default, it is
+assumed that the local machine serves as proxy.  GETURL uses the 'GET'
+method by default to access the web page.  By handing over the name of a
+different method (such as 'HEAD'), it is possible to choose a different
+behavior.  With the 'HEAD' method, the user does not receive the body of
+the page content, but does receive the header:
+
+     BEGIN {
+       if (ARGC != 2) {
+         print "GETURL - retrieve Web page via HTTP 1.0"
+         print "IN:\n    the URL as a command-line parameter"
+         print "PARAM(S):\n    -v Proxy=MyProxy"
+         print "OUT:\n    the page content on stdout"
+         print "    the page header on stderr"
+         print "JK 16.05.1997"
+         print "ADR 13.08.2000"
+         exit
+       }
+       URL = ARGV[1]; ARGV[1] = ""
+       if (Proxy     == "")  Proxy     = "127.0.0.1"
+       if (ProxyPort ==  0)  ProxyPort = 80
+       if (Method    == "")  Method    = "GET"
+       HttpService = "/inet/tcp/0/" Proxy "/" ProxyPort
+       ORS = RS = "\r\n\r\n"
+       print Method " " URL " HTTP/1.0" |& HttpService
+       HttpService                      |& getline Header
+       print Header > "/dev/stderr"
+       while ((HttpService |& getline) > 0)
+         printf "%s", $0
+       close(HttpService)
+     }
+
+   This program can be changed as needed, but be careful with the last
+lines.  Make sure transmission of binary data is not corrupted by
+additional line breaks.  Even as it is now, the byte sequence
+'"\r\n\r\n"' would disappear if it were contained in binary data.  Don't
+get caught in a trap when trying a quick fix on this one.
+
+
+File: gawkinet.info,  Node: REMCONF,  Next: URLCHK,  Prev: GETURL,  Up: Some Applications and Techniques
+
+3.3 REMCONF: Remote Configuration of Embedded Systems
+=====================================================
+
+Today, you often find powerful processors in embedded systems.
+Dedicated network routers and controllers for all kinds of machinery are
+examples of embedded systems.  Processors like the Intel 80x86 or the
+AMD Elan are able to run multitasking operating systems, such as XINU or
+GNU/Linux in embedded PCs.  These systems are small and usually do not
+have a keyboard or a display.  Therefore it is difficult to set up their
+configuration.  There are several widespread ways to set them up:
+
+   * DIP switches
+
+   * Read Only Memories such as EPROMs
+
+   * Serial lines or some kind of keyboard
+
+   * Network connections via 'telnet' or SNMP
+
+   * HTTP connections with HTML GUIs
+
+   In this node, we look at a solution that uses HTTP connections to
+control variables of an embedded system that are stored in a file.
+Since embedded systems have tight limits on resources like memory, it is
+difficult to employ advanced techniques such as SNMP and HTTP servers.
+'gawk' fits in quite nicely with its single executable which needs just
+a short script to start working.  The following program stores the
+variables in a file, and a concurrent process in the embedded system may
+read the file.  The program uses the site-independent part of the simple
+web server that we developed in *note A Web Service with Interaction:
+Interacting Service.  As mentioned there, all we have to do is to write
+two new procedures 'SetUpServer()' and 'HandleGET()':
+
+     function SetUpServer() {
+       TopHeader = "<HTML><title>Remote Configuration</title>"
+       TopDoc = "<BODY>\
+         <h2>Please choose one of the following actions:</h2>\
+         <UL>\
+           <LI><A HREF=" MyPrefix "/AboutServer>About this server</A></LI>\
+           <LI><A HREF=" MyPrefix "/ReadConfig>Read Configuration</A></LI>\
+           <LI><A HREF=" MyPrefix "/CheckConfig>Check Configuration</A></LI>\
+           <LI><A HREF=" MyPrefix "/ChangeConfig>Change Configuration</A></LI>\
+           <LI><A HREF=" MyPrefix "/SaveConfig>Save Configuration</A></LI>\
+         </UL>"
+       TopFooter  = "</BODY></HTML>"
+       if (ConfigFile == "") ConfigFile = "config.asc"
+     }
+
+   The function 'SetUpServer()' initializes the top level HTML texts as
+usual.  It also initializes the name of the file that contains the
+configuration parameters and their values.  In case the user supplies a
+name from the command line, that name is used.  The file is expected to
+contain one parameter per line, with the name of the parameter in column
+one and the value in column two.
+
+   The function 'HandleGET()' reflects the structure of the menu tree as
+usual.  The first menu choice tells the user what this is all about.
+The second choice reads the configuration file line by line and stores
+the parameters and their values.  Notice that the record separator for
+this file is '"\n"', in contrast to the record separator for HTTP. The
+third menu choice builds an HTML table to show the contents of the
+configuration file just read.  The fourth choice does the real work of
+changing parameters, and the last one just saves the configuration into
+a file:
+
+     function HandleGET() {
+       if(MENU[2] == "AboutServer") {
+         Document  = "This is a GUI for remote configuration of an\
+           embedded system. It is is implemented as one GAWK script."
+       } else if (MENU[2] == "ReadConfig") {
+         RS = "\n"
+         while ((getline < ConfigFile) > 0)
+            config[$1] = $2;
+         close(ConfigFile)
+         RS = "\r\n"
+         Document = "Configuration has been read."
+       } else if (MENU[2] == "CheckConfig") {
+         Document = "<TABLE BORDER=1 CELLPADDING=5>"
+         for (i in config)
+           Document = Document "<TR><TD>" i "</TD>" \
+             "<TD>" config[i] "</TD></TR>"
+         Document = Document "</TABLE>"
+       } else if (MENU[2] == "ChangeConfig") {
+         if ("Param" in GETARG) {            # any parameter to set?
+           if (GETARG["Param"] in config) {  # is  parameter valid?
+             config[GETARG["Param"]] = GETARG["Value"]
+             Document = (GETARG["Param"] " = " GETARG["Value"] ".")
+           } else {
+             Document = "Parameter <b>" GETARG["Param"] "</b> is invalid."
+           }
+         } else {
+           Document = "<FORM method=GET><h4>Change one parameter</h4>\
+             <TABLE BORDER CELLPADDING=5>\
+             <TR><TD>Parameter</TD><TD>Value</TD></TR>\
+             <TR><TD><input type=text name=Param value=\"\" size=20></TD>\
+                 <TD><input type=text name=Value value=\"\" size=40></TD>\
+             </TR></TABLE><input type=submit value=\"Set\"></FORM>"
+         }
+       } else if (MENU[2] == "SaveConfig") {
+         for (i in config)
+           printf("%s %s\n", i, config[i]) > ConfigFile
+         close(ConfigFile)
+         Document = "Configuration has been saved."
+       }
+     }
+
+   We could also view the configuration file as a database.  From this
+point of view, the previous program acts like a primitive database
+server.  Real SQL database systems also make a service available by
+providing a TCP port that clients can connect to.  But the application
+level protocols they use are usually proprietary and also change from
+time to time.  This is also true for the protocol that MiniSQL uses.
+
+
+File: gawkinet.info,  Node: URLCHK,  Next: WEBGRAB,  Prev: REMCONF,  Up: Some Applications and Techniques
+
+3.4 URLCHK: Look for Changed Web Pages
+======================================
+
+Most people who make heavy use of Internet resources have a large
+bookmark file with pointers to interesting web sites.  It is impossible
+to regularly check by hand if any of these sites have changed.  A
+program is needed to automatically look at the headers of web pages and
+tell which ones have changed.  URLCHK does the comparison after using
+GETURL with the 'HEAD' method to retrieve the header.
+
+   Like GETURL, this program first checks that it is called with exactly
+one command-line parameter.  URLCHK also takes the same command-line
+variables 'Proxy' and 'ProxyPort' as GETURL, because these variables are
+handed over to GETURL for each URL that gets checked.  The one and only
+parameter is the name of a file that contains one line for each URL. In
+the first column, we find the URL, and the second and third columns hold
+the length of the URL's body when checked for the two last times.  Now,
+we follow this plan:
+
+  1. Read the URLs from the file and remember their most recent lengths
+
+  2. Delete the contents of the file
+
+  3. For each URL, check its new length and write it into the file
+
+  4. If the most recent and the new length differ, tell the user
+
+   It may seem a bit peculiar to read the URLs from a file together with
+their two most recent lengths, but this approach has several advantages.
+You can call the program again and again with the same file.  After
+running the program, you can regenerate the changed URLs by extracting
+those lines that differ in their second and third columns:
+
+     BEGIN {
+       if (ARGC != 2) {
+         print "URLCHK - check if URLs have changed"
+         print "IN:\n    the file with URLs as a command-line parameter"
+         print "    file contains URL, old length, new length"
+         print "PARAMS:\n    -v Proxy=MyProxy -v ProxyPort=8080"
+         print "OUT:\n    same as file with URLs"
+         print "JK 02.03.1998"
+         exit
+       }
+       URLfile = ARGV[1]; ARGV[1] = ""
+       if (Proxy     != "") Proxy     = " -v Proxy="     Proxy
+       if (ProxyPort != "") ProxyPort = " -v ProxyPort=" ProxyPort
+       while ((getline < URLfile) > 0)
+          Length[$1] = $3 + 0
+       close(URLfile)      # now, URLfile is read in and can be updated
+       GetHeader = "gawk " Proxy ProxyPort " -v Method=\"HEAD\" -f geturl.awk "
+       for (i in Length) {
+         GetThisHeader = GetHeader i " 2>&1"
+         while ((GetThisHeader | getline) > 0)
+           if (toupper($0) ~ /CONTENT-LENGTH/) NewLength = $2 + 0
+         close(GetThisHeader)
+         print i, Length[i], NewLength > URLfile
+         if (Length[i] != NewLength)  # report only changed URLs
+           print i, Length[i], NewLength
+       }
+       close(URLfile)
+     }
+
+   Another thing that may look strange is the way GETURL is called.
+Before calling GETURL, we have to check if the proxy variables need to
+be passed on.  If so, we prepare strings that will become part of the
+command line later.  In 'GetHeader()', we store these strings together
+with the longest part of the command line.  Later, in the loop over the
+URLs, 'GetHeader()' is appended with the URL and a redirection operator
+to form the command that reads the URL's header over the Internet.
+GETURL always produces the headers over '/dev/stderr'.  That is the
+reason why we need the redirection operator to have the header piped in.
+
+   This program is not perfect because it assumes that changing URLs
+results in changed lengths, which is not necessarily true.  A more
+advanced approach is to look at some other header line that holds time
+information.  But, as always when things get a bit more complicated,
+this is left as an exercise to the reader.
+
+
+File: gawkinet.info,  Node: WEBGRAB,  Next: STATIST,  Prev: URLCHK,  Up: Some Applications and Techniques
+
+3.5 WEBGRAB: Extract Links from a Page
+======================================
+
+Sometimes it is necessary to extract links from web pages.  Browsers do
+it, web robots do it, and sometimes even humans do it.  Since we have a
+tool like GETURL at hand, we can solve this problem with some help from
+the Bourne shell:
+
+     BEGIN { RS = "http://[#%&\\+\\-\\./0-9\\:;\\?A-Z_a-z\\~]*" }
+     RT != "" {
+        command = ("gawk -v Proxy=MyProxy -f geturl.awk " RT \
+                    " > doc" NR ".html")
+        print command
+     }
+
+   Notice that the regular expression for URLs is rather crude.  A
+precise regular expression is much more complex.  But this one works
+rather well.  One problem is that it is unable to find internal links of
+an HTML document.  Another problem is that 'ftp', 'telnet', 'news',
+'mailto', and other kinds of links are missing in the regular
+expression.  However, it is straightforward to add them, if doing so is
+necessary for other tasks.
+
+   This program reads an HTML file and prints all the HTTP links that it
+finds.  It relies on 'gawk''s ability to use regular expressions as
+record separators.  With 'RS' set to a regular expression that matches
+links, the second action is executed each time a non-empty link is
+found.  We can find the matching link itself in 'RT'.
+
+   The action could use the 'system()' function to let another GETURL
+retrieve the page, but here we use a different approach.  This simple
+program prints shell commands that can be piped into 'sh' for execution.
+This way it is possible to first extract the links, wrap shell commands
+around them, and pipe all the shell commands into a file.  After editing
+the file, execution of the file retrieves exactly those files that we
+really need.  In case we do not want to edit, we can retrieve all the
+pages like this:
+
+     gawk -f geturl.awk http://www.suse.de | gawk -f webgrab.awk | sh
+
+   After this, you will find the contents of all referenced documents in
+files named 'doc*.html' even if they do not contain HTML code.  The most
+annoying thing is that we always have to pass the proxy to GETURL. If
+you do not like to see the headers of the web pages appear on the
+screen, you can redirect them to '/dev/null'.  Watching the headers
+appear can be quite interesting, because it reveals interesting details
+such as which web server the companies use.  Now, it is clear how the
+clever marketing people use web robots to determine the market shares of
+Microsoft and Netscape in the web server market.
+
+   Port 80 of any web server is like a small hole in a repellent
+firewall.  After attaching a browser to port 80, we usually catch a
+glimpse of the bright side of the server (its home page).  With a tool
+like GETURL at hand, we are able to discover some of the more concealed
+or even "indecent" services (i.e., lacking conformity to standards of
+quality).  It can be exciting to see the fancy CGI scripts that lie
+there, revealing the inner workings of the server, ready to be called:
+
+   * With a command such as:
+
+          gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/
+
+     some servers give you a directory listing of the CGI files.
+     Knowing the names, you can try to call some of them and watch for
+     useful results.  Sometimes there are executables in such
+     directories (such as Perl interpreters) that you may call remotely.
+     If there are subdirectories with configuration data of the web
+     server, this can also be quite interesting to read.
+
+   * The well-known Apache web server usually has its CGI files in the
+     directory '/cgi-bin'.  There you can often find the scripts
+     'test-cgi' and 'printenv'.  Both tell you some things about the
+     current connection and the installation of the web server.  Just
+     call:
+
+          gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/test-cgi
+          gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/printenv
+
+   * Sometimes it is even possible to retrieve system files like the web
+     server's log file--possibly containing customer data--or even the
+     file '/etc/passwd'.  (We don't recommend this!)
+
+   *Caution:* Although this may sound funny or simply irrelevant, we are
+talking about severe security holes.  Try to explore your own system
+this way and make sure that none of the above reveals too much
+information about your system.
+
+
+File: gawkinet.info,  Node: STATIST,  Next: MAZE,  Prev: WEBGRAB,  Up: Some Applications and Techniques
+
+3.6 STATIST: Graphing a Statistical Distribution
+================================================
+
+In the HTTP server examples we've shown thus far, we never present an
+image to the browser and its user.  Presenting images is one task.
+Generating images that reflect some user input and presenting these
+dynamically generated images is another.  In this node, we use GNUPlot
+for generating '.png', '.ps', or '.gif' files.(1)
+
+   The program we develop takes the statistical parameters of two
+samples and computes the t-test statistics.  As a result, we get the
+probabilities that the means and the variances of both samples are the
+same.  In order to let the user check plausibility, the program presents
+an image of the distributions.  The statistical computation follows
+'Numerical Recipes in C: The Art of Scientific Computing' by William H.
+Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery.
+Since 'gawk' does not have a built-in function for the computation of
+the beta function, we use the 'ibeta()' function of GNUPlot.  As a side
+effect, we learn how to use GNUPlot as a sophisticated calculator.  The
+comparison of means is done as in 'tutest', paragraph 14.2, page 613,
+and the comparison of variances is done as in 'ftest', page 611 in
+'Numerical Recipes'.
+
+   As usual, we take the site-independent code for servers and append
+our own functions 'SetUpServer()' and 'HandleGET()':
+
+     function SetUpServer() {
+       TopHeader = "<HTML><title>Statistics with GAWK</title>"
+       TopDoc = "<BODY>\
+        <h2>Please choose one of the following actions:</h2>\
+        <UL>\
+         <LI><A HREF=" MyPrefix "/AboutServer>About this server</A></LI>\
+         <LI><A HREF=" MyPrefix "/EnterParameters>Enter Parameters</A></LI>\
+        </UL>"
+       TopFooter  = "</BODY></HTML>"
+       GnuPlot    = "gnuplot 2>&1"
+       m1=m2=0;    v1=v2=1;    n1=n2=10
+     }
+
+   Here, you see the menu structure that the user sees.  Later, we will
+see how the program structure of the 'HandleGET()' function reflects the
+menu structure.  What is missing here is the link for the image we
+generate.  In an event-driven environment, request, generation, and
+delivery of images are separated.
+
+   Notice the way we initialize the 'GnuPlot' command string for the
+pipe.  By default, GNUPlot outputs the generated image via standard
+output, as well as the results of 'print'(ed) calculations via standard
+error.  The redirection causes standard error to be mixed into standard
+output, enabling us to read results of calculations with 'getline'.  By
+initializing the statistical parameters with some meaningful defaults,
+we make sure the user gets an image the first time he uses the program.
+
+   Following is the rather long function 'HandleGET()', which implements
+the contents of this service by reacting to the different kinds of
+requests from the browser.  Before you start playing with this script,
+make sure that your browser supports JavaScript and that it also has
+this option switched on.  The script uses a short snippet of JavaScript
+code for delayed opening of a window with an image.  A more detailed
+explanation follows:
+
+     function HandleGET() {
+       if(MENU[2] == "AboutServer") {
+         Document  = "This is a GUI for a statistical computation.\
+           It compares means and variances of two distributions.\
+           It is implemented as one GAWK script and uses GNUPLOT."
+       } else if (MENU[2] == "EnterParameters") {
+         Document = ""
+         if ("m1" in GETARG) {     # are there parameters to compare?
+           Document = Document "<SCRIPT LANGUAGE=\"JavaScript\">\
+             setTimeout(\"window.open(\\\"" MyPrefix "/Image" systime()\
+              "\\\",\\\"dist\\\", \\\"status=no\\\");\", 1000); </SCRIPT>"
+           m1 = GETARG["m1"]; v1 = GETARG["v1"]; n1 = GETARG["n1"]
+           m2 = GETARG["m2"]; v2 = GETARG["v2"]; n2 = GETARG["n2"]
+           t = (m1-m2)/sqrt(v1/n1+v2/n2)
+           df = (v1/n1+v2/n2)*(v1/n1+v2/n2)/((v1/n1)*(v1/n1)/(n1-1) \
+                + (v2/n2)*(v2/n2) /(n2-1))
+           if (v1>v2) {
+               f = v1/v2
+               df1 = n1 - 1
+               df2 = n2 - 1
+           } else {
+               f = v2/v1
+               df1 = n2 - 1
+               df2 = n1 - 1
+           }
+           print "pt=ibeta(" df/2 ",0.5," df/(df+t*t) ")"  |& GnuPlot
+           print "pF=2.0*ibeta(" df2/2 "," df1/2 "," \
+                 df2/(df2+df1*f) ")"                    |& GnuPlot
+           print "print pt, pF"                         |& GnuPlot
+           RS="\n"; GnuPlot |& getline; RS="\r\n"    # $1 is pt, $2 is pF
+           print "invsqrt2pi=1.0/sqrt(2.0*pi)"          |& GnuPlot
+           print "nd(x)=invsqrt2pi/sd*exp(-0.5*((x-mu)/sd)**2)" |& GnuPlot
+           print "set term png small color"             |& GnuPlot
+           #print "set term postscript color"           |& GnuPlot
+           #print "set term gif medium size 320,240"    |& GnuPlot
+           print "set yrange[-0.3:]"                    |& GnuPlot
+           print "set label 'p(m1=m2) =" $1 "' at 0,-0.1 left"  |& GnuPlot
+           print "set label 'p(v1=v2) =" $2 "' at 0,-0.2 left"  |& GnuPlot
+           print "plot mu=" m1 ",sd=" sqrt(v1) ", nd(x) title 'sample 1',\
+             mu=" m2 ",sd=" sqrt(v2) ", nd(x) title 'sample 2'" |& GnuPlot
+           print "quit"                                         |& GnuPlot
+           GnuPlot |& getline Image
+           while ((GnuPlot |& getline) > 0)
+               Image = Image RS $0
+           close(GnuPlot)
+         }
+         Document = Document "\
+         <h3>Do these samples have the same Gaussian distribution?</h3>\
+         <FORM METHOD=GET> <TABLE BORDER CELLPADDING=5>\
+         <TR>\
+         <TD>1. Mean    </TD>
+         <TD><input type=text name=m1 value=" m1 " size=8></TD>\
+         <TD>1. Variance</TD>
+         <TD><input type=text name=v1 value=" v1 " size=8></TD>\
+         <TD>1. Count   </TD>
+         <TD><input type=text name=n1 value=" n1 " size=8></TD>\
+         </TR><TR>\
+         <TD>2. Mean    </TD>
+         <TD><input type=text name=m2 value=" m2 " size=8></TD>\
+         <TD>2. Variance</TD>
+         <TD><input type=text name=v2 value=" v2 " size=8></TD>\
+         <TD>2. Count   </TD>
+         <TD><input type=text name=n2 value=" n2 " size=8></TD>\
+         </TR>                   <input type=submit value=\"Compute\">\
+         </TABLE></FORM><BR>"
+       } else if (MENU[2] ~ "Image") {
+         Reason = "OK" ORS "Content-type: image/png"
+         #Reason = "OK" ORS "Content-type: application/x-postscript"
+         #Reason = "OK" ORS "Content-type: image/gif"
+         Header = Footer = ""
+         Document = Image
+       }
+     }
+
+   As usual, we give a short description of the service in the first
+menu choice.  The third menu choice shows us that generation and
+presentation of an image are two separate actions.  While the latter
+takes place quite instantly in the third menu choice, the former takes
+place in the much longer second choice.  Image data passes from the
+generating action to the presenting action via the variable 'Image' that
+contains a complete '.png' image, which is otherwise stored in a file.
+If you prefer '.ps' or '.gif' images over the default '.png' images, you
+may select these options by uncommenting the appropriate lines.  But
+remember to do so in two places: when telling GNUPlot which kind of
+images to generate, and when transmitting the image at the end of the
+program.
+
+   Looking at the end of the program, the way we pass the 'Content-type'
+to the browser is a bit unusual.  It is appended to the 'OK' of the
+first header line to make sure the type information becomes part of the
+header.  The other variables that get transmitted across the network are
+made empty, because in this case we do not have an HTML document to
+transmit, but rather raw image data to contain in the body.
+
+   Most of the work is done in the second menu choice.  It starts with a
+strange JavaScript code snippet.  When first implementing this server,
+we used a short '"<IMG SRC=" MyPrefix "/Image>"' here.  But then
+browsers got smarter and tried to improve on speed by requesting the
+image and the HTML code at the same time.  When doing this, the browser
+tries to build up a connection for the image request while the request
+for the HTML text is not yet completed.  The browser tries to connect to
+the 'gawk' server on port 8080 while port 8080 is still in use for
+transmission of the HTML text.  The connection for the image cannot be
+built up, so the image appears as "broken" in the browser window.  We
+solved this problem by telling the browser to open a separate window for
+the image, but only after a delay of 1000 milliseconds.  By this time,
+the server should be ready for serving the next request.
+
+   But there is one more subtlety in the JavaScript code.  Each time the
+JavaScript code opens a window for the image, the name of the image is
+appended with a timestamp ('systime()').  Why this constant change of
+name for the image?  Initially, we always named the image 'Image', but
+then the Netscape browser noticed the name had _not_ changed since the
+previous request and displayed the previous image (caching behavior).
+The server core is implemented so that browsers are told _not_ to cache
+anything.  Obviously HTTP requests do not always work as expected.  One
+way to circumvent the cache of such overly smart browsers is to change
+the name of the image with each request.  These three lines of
+JavaScript caused us a lot of trouble.
+
+   The rest can be broken down into two phases.  At first, we check if
+there are statistical parameters.  When the program is first started,
+there usually are no parameters because it enters the page coming from
+the top menu.  Then, we only have to present the user a form that he can
+use to change statistical parameters and submit them.  Subsequently, the
+submission of the form causes the execution of the first phase because
+_now_ there _are_ parameters to handle.
+
+   Now that we have parameters, we know there will be an image
+available.  Therefore we insert the JavaScript code here to initiate the
+opening of the image in a separate window.  Then, we prepare some
+variables that will be passed to GNUPlot for calculation of the
+probabilities.  Prior to reading the results, we must temporarily change
+'RS' because GNUPlot separates lines with newlines.  After instructing
+GNUPlot to generate a '.png' (or '.ps' or '.gif') image, we initiate the
+insertion of some text, explaining the resulting probabilities.  The
+final 'plot' command actually generates the image data.  This raw binary
+has to be read in carefully without adding, changing, or deleting a
+single byte.  Hence the unusual initialization of 'Image' and completion
+with a 'while' loop.
+
+   When using this server, it soon becomes clear that it is far from
+being perfect.  It mixes source code of six scripting languages or
+protocols:
+
+   * GNU 'awk' implements a server for the protocol:
+   * HTTP which transmits:
+   * HTML text which contains a short piece of:
+   * JavaScript code opening a separate window.
+   * A Bourne shell script is used for piping commands into:
+   * GNUPlot to generate the image to be opened.
+
+   After all this work, the GNUPlot image opens in the JavaScript window
+where it can be viewed by the user.
+
+   It is probably better not to mix up so many different languages.  The
+result is not very readable.  Furthermore, the statistical part of the
+server does not take care of invalid input.  Among others, using
+negative variances will cause invalid results.
+
+   ---------- Footnotes ----------
+
+   (1) Due to licensing problems, the default installation of GNUPlot
+disables the generation of '.gif' files.  If your installed version does
+not accept 'set term gif', just download and install the most recent
+version of GNUPlot and the GD library (http://www.boutell.com/gd/) by
+Thomas Boutell.  Otherwise you still have the chance to generate some
+ASCII-art style images with GNUPlot by using 'set term dumb'.  (We tried
+it and it worked.)
+
+
+File: gawkinet.info,  Node: MAZE,  Next: MOBAGWHO,  Prev: STATIST,  Up: Some Applications and Techniques
+
+3.7 MAZE: Walking Through a Maze In Virtual Reality
+===================================================
+
+     In the long run, every program becomes rococo, and then rubble.
+     Alan Perlis
+
+   By now, we know how to present arbitrary 'Content-type's to a
+browser.  In this node, our server will present a 3D world to our
+browser.  The 3D world is described in a scene description language
+(VRML, Virtual Reality Modeling Language) that allows us to travel
+through a perspective view of a 2D maze with our browser.  Browsers with
+a VRML plugin enable exploration of this technology.  We could do one of
+those boring 'Hello world' examples here, that are usually presented
+when introducing novices to VRML. If you have never written any VRML
+code, have a look at the VRML FAQ. Presenting a static VRML scene is a
+bit trivial; in order to expose 'gawk''s new capabilities, we will
+present a dynamically generated VRML scene.  The function
+'SetUpServer()' is very simple because it only sets the default HTML
+page and initializes the random number generator.  As usual, the
+surrounding server lets you browse the maze.
+
+     function SetUpServer() {
+       TopHeader = "<HTML><title>Walk through a maze</title>"
+       TopDoc = "\
+         <h2>Please choose one of the following actions:</h2>\
+         <UL>\
+           <LI><A HREF=" MyPrefix "/AboutServer>About this server</A>\
+           <LI><A HREF=" MyPrefix "/VRMLtest>Watch a simple VRML scene</A>\
+         </UL>"
+       TopFooter  = "</HTML>"
+       srand()
+     }
+
+   The function 'HandleGET()' is a bit longer because it first computes
+the maze and afterwards generates the VRML code that is sent across the
+network.  As shown in the STATIST example (*note STATIST::), we set the
+type of the content to VRML and then store the VRML representation of
+the maze as the page content.  We assume that the maze is stored in a 2D
+array.  Initially, the maze consists of walls only.  Then, we add an
+entry and an exit to the maze and let the rest of the work be done by
+the function 'MakeMaze()'.  Now, only the wall fields are left in the
+maze.  By iterating over the these fields, we generate one line of VRML
+code for each wall field.
+
+     function HandleGET() {
+       if (MENU[2] == "AboutServer") {
+         Document  = "If your browser has a VRML 2 plugin,\
+           this server shows you a simple VRML scene."
+       } else if (MENU[2] == "VRMLtest") {
+         XSIZE = YSIZE = 11              # initially, everything is wall
+         for (y = 0; y < YSIZE; y++)
+            for (x = 0; x < XSIZE; x++)
+               Maze[x, y] = "#"
+         delete Maze[0, 1]              # entry is not wall
+         delete Maze[XSIZE-1, YSIZE-2]  # exit  is not wall
+         MakeMaze(1, 1)
+         Document = "\
+     #VRML V2.0 utf8\n\
+     Group {\n\
+       children [\n\
+         PointLight {\n\
+           ambientIntensity 0.2\n\
+           color 0.7 0.7 0.7\n\
+           location 0.0 8.0 10.0\n\
+         }\n\
+         DEF B1 Background {\n\
+           skyColor [0 0 0, 1.0 1.0 1.0 ]\n\
+           skyAngle 1.6\n\
+           groundColor [1 1 1, 0.8 0.8 0.8, 0.2 0.2 0.2 ]\n\
+           groundAngle [ 1.2 1.57 ]\n\
+         }\n\
+         DEF Wall Shape {\n\
+           geometry Box {size 1 1 1}\n\
+           appearance Appearance { material Material { diffuseColor 0 0 1 } }\n\
+         }\n\
+         DEF Entry Viewpoint {\n\
+           position 0.5 1.0 5.0\n\
+           orientation 0.0 0.0 -1.0 0.52\n\
+         }\n"
+         for (i in Maze) {
+           split(i, t, SUBSEP)
+           Document = Document "    Transform { translation "
+           Document = Document t[1] " 0 -" t[2] " children USE Wall }\n"
+         }
+         Document = Document "  ] # end of group for world\n}"
+         Reason = "OK" ORS "Content-type: model/vrml"
+         Header = Footer = ""
+       }
+     }
+
+   Finally, we have a look at 'MakeMaze()', the function that generates
+the 'Maze' array.  When entered, this function assumes that the array
+has been initialized so that each element represents a wall element and
+the maze is initially full of wall elements.  Only the entrance and the
+exit of the maze should have been left free.  The parameters of the
+function tell us which element must be marked as not being a wall.
+After this, we take a look at the four neighboring elements and remember
+which we have already treated.  Of all the neighboring elements, we take
+one at random and walk in that direction.  Therefore, the wall element
+in that direction has to be removed and then, we call the function
+recursively for that element.  The maze is only completed if we iterate
+the above procedure for _all_ neighboring elements (in random order) and
+for our present element by recursively calling the function for the
+present element.  This last iteration could have been done in a loop,
+but it is done much simpler recursively.
+
+   Notice that elements with coordinates that are both odd are assumed
+to be on our way through the maze and the generating process cannot
+terminate as long as there is such an element not being 'delete'd.  All
+other elements are potentially part of the wall.
+
+     function MakeMaze(x, y) {
+       delete Maze[x, y]     # here we are, we have no wall here
+       p = 0                 # count unvisited fields in all directions
+       if (x-2 SUBSEP y   in Maze) d[p++] = "-x"
+       if (x   SUBSEP y-2 in Maze) d[p++] = "-y"
+       if (x+2 SUBSEP y   in Maze) d[p++] = "+x"
+       if (x   SUBSEP y+2 in Maze) d[p++] = "+y"
+       if (p>0) {            # if there are unvisited fields, go there
+         p = int(p*rand())   # choose one unvisited field at random
+         if        (d[p] == "-x") { delete Maze[x - 1, y]; MakeMaze(x - 2, y)
+         } else if (d[p] == "-y") { delete Maze[x, y - 1]; MakeMaze(x, y - 2)
+         } else if (d[p] == "+x") { delete Maze[x + 1, y]; MakeMaze(x + 2, y)
+         } else if (d[p] == "+y") { delete Maze[x, y + 1]; MakeMaze(x, y + 2)
+         }                   # we are back from recursion
+         MakeMaze(x, y);     # try again while there are unvisited fields
+       }
+     }
+
+
+File: gawkinet.info,  Node: MOBAGWHO,  Next: STOXPRED,  Prev: MAZE,  Up: Some Applications and Techniques
+
+3.8 MOBAGWHO: a Simple Mobile Agent
+===================================
+
+     There are two ways of constructing a software design: One way is to
+     make it so simple that there are obviously no deficiencies, and the
+     other way is to make it so complicated that there are no obvious
+     deficiencies.
+     C. A. R. Hoare
+
+   A "mobile agent" is a program that can be dispatched from a computer
+and transported to a remote server for execution.  This is called
+"migration", which means that a process on another system is started
+that is independent from its originator.  Ideally, it wanders through a
+network while working for its creator or owner.  In places like the UMBC
+Agent Web, people are quite confident that (mobile) agents are a
+software engineering paradigm that enables us to significantly increase
+the efficiency of our work.  Mobile agents could become the mediators
+between users and the networking world.  For an unbiased view at this
+technology, see the remarkable paper 'Mobile Agents: Are they a good
+idea?'.(1)
+
+   When trying to migrate a process from one system to another, a server
+process is needed on the receiving side.  Depending on the kind of
+server process, several ways of implementation come to mind.  How the
+process is implemented depends upon the kind of server process:
+
+   * HTTP can be used as the protocol for delivery of the migrating
+     process.  In this case, we use a common web server as the receiving
+     server process.  A universal CGI script mediates between migrating
+     process and web server.  Each server willing to accept migrating
+     agents makes this universal service available.  HTTP supplies the
+     'POST' method to transfer some data to a file on the web server.
+     When a CGI script is called remotely with the 'POST' method instead
+     of the usual 'GET' method, data is transmitted from the client
+     process to the standard input of the server's CGI script.  So, to
+     implement a mobile agent, we must not only write the agent program
+     to start on the client side, but also the CGI script to receive the
+     agent on the server side.
+
+   * The 'PUT' method can also be used for migration.  HTTP does not
+     require a CGI script for migration via 'PUT'.  However, with common
+     web servers there is no advantage to this solution, because web
+     servers such as Apache require explicit activation of a special
+     'PUT' script.
+
+   * 'Agent Tcl' pursues a different course; it relies on a dedicated
+     server process with a dedicated protocol specialized for receiving
+     mobile agents.
+
+   Our agent example abuses a common web server as a migration tool.
+So, it needs a universal CGI script on the receiving side (the web
+server).  The receiving script is activated with a 'POST' request when
+placed into a location like '/httpd/cgi-bin/PostAgent.sh'.  Make sure
+that the server system uses a version of 'gawk' that supports network
+access (Version 3.1 or later; verify with 'gawk --version').
+
+     #!/bin/sh
+     MobAg=/tmp/MobileAgent.$$
+     # direct script to mobile agent file
+     cat > $MobAg
+     # execute agent concurrently
+     gawk -f $MobAg $MobAg > /dev/null &
+     # HTTP header, terminator and body
+     gawk 'BEGIN { print "\r\nAgent started" }'
+     rm $MobAg      # delete script file of agent
+
+   By making its process id ('$$') part of the unique file name, the
+script avoids conflicts between concurrent instances of the script.
+First, all lines from standard input (the mobile agent's source code)
+are copied into this unique file.  Then, the agent is started as a
+concurrent process and a short message reporting this fact is sent to
+the submitting client.  Finally, the script file of the mobile agent is
+removed because it is no longer needed.  Although it is a short script,
+there are several noteworthy points:
+
+Security
+     _There is none_.  In fact, the CGI script should never be made
+     available on a server that is part of the Internet because everyone
+     would be allowed to execute arbitrary commands with it.  This
+     behavior is acceptable only when performing rapid prototyping.
+
+Self-Reference
+     Each migrating instance of an agent is started in a way that
+     enables it to read its own source code from standard input and use
+     the code for subsequent migrations.  This is necessary because it
+     needs to treat the agent's code as data to transmit.  'gawk' is not
+     the ideal language for such a job.  Lisp and Tcl are more suitable
+     because they do not make a distinction between program code and
+     data.
+
+Independence
+     After migration, the agent is not linked to its former home in any
+     way.  By reporting 'Agent started', it waves "Goodbye" to its
+     origin.  The originator may choose to terminate or not.
+
+   The originating agent itself is started just like any other
+command-line script, and reports the results on standard output.  By
+letting the name of the original host migrate with the agent, the agent
+that migrates to a host far away from its origin can report the result
+back home.  Having arrived at the end of the journey, the agent
+establishes a connection and reports the results.  This is the reason
+for determining the name of the host with 'uname -n' and storing it in
+'MyOrigin' for later use.  We may also set variables with the '-v'
+option from the command line.  This interactivity is only of importance
+in the context of starting a mobile agent; therefore this 'BEGIN'
+pattern and its action do not take part in migration:
+
+     BEGIN {
+       if (ARGC != 2) {
+         print "MOBAG - a simple mobile agent"
+         print "CALL:\n    gawk -f mobag.awk mobag.awk"
+         print "IN:\n    the name of this script as a command-line parameter"
+         print "PARAM:\n    -v MyOrigin=myhost.com"
+         print "OUT:\n    the result on stdout"
+         print "JK 29.03.1998 01.04.1998"
+         exit
+       }
+       if (MyOrigin == "") {
+          "uname -n" | getline MyOrigin
+          close("uname -n")
+       }
+     }
+
+   Since 'gawk' cannot manipulate and transmit parts of the program
+directly, the source code is read and stored in strings.  Therefore, the
+program scans itself for the beginning and the ending of functions.
+Each line in between is appended to the code string until the end of the
+function has been reached.  A special case is this part of the program
+itself.  It is not a function.  Placing a similar framework around it
+causes it to be treated like a function.  Notice that this mechanism
+works for all the functions of the source code, but it cannot guarantee
+that the order of the functions is preserved during migration:
+
+     #ReadMySelf
+     /^function /                     { FUNC = $2 }
+     /^END/ || /^#ReadMySelf/         { FUNC = $1 }
+     FUNC != ""                       { MOBFUN[FUNC] = MOBFUN[FUNC] RS $0 }
+     (FUNC != "") && (/^}/ || /^#EndOfMySelf/) \
+                                      { FUNC = "" }
+     #EndOfMySelf
+
+   The web server code in *note A Web Service with Interaction:
+Interacting Service, was first developed as a site-independent core.
+Likewise, the 'gawk'-based mobile agent starts with an agent-independent
+core, to which can be appended application-dependent functions.  What
+follows is the only application-independent function needed for the
+mobile agent:
+
+     function migrate(Destination, MobCode, Label) {
+       MOBVAR["Label"] = Label
+       MOBVAR["Destination"] = Destination
+       RS = ORS = "\r\n"
+       HttpService = "/inet/tcp/0/" Destination
+       for (i in MOBFUN)
+          MobCode = (MobCode "\n" MOBFUN[i])
+       MobCode = MobCode  "\n\nBEGIN {"
+       for (i in MOBVAR)
+          MobCode = (MobCode "\n  MOBVAR[\"" i "\"] = \"" MOBVAR[i] "\"")
+       MobCode = MobCode "\n}\n"
+       print "POST /cgi-bin/PostAgent.sh HTTP/1.0"  |& HttpService
+       print "Content-length:", length(MobCode) ORS |& HttpService
+       printf "%s", MobCode                         |& HttpService
+       while ((HttpService |& getline) > 0)
+          print $0
+       close(HttpService)
+     }
+
+   The 'migrate()' function prepares the aforementioned strings
+containing the program code and transmits them to a server.  A
+consequence of this modular approach is that the 'migrate()' function
+takes some parameters that aren't needed in this application, but that
+will be in future ones.  Its mandatory parameter 'Destination' holds the
+name (or IP address) of the server that the agent wants as a host for
+its code.  The optional parameter 'MobCode' may contain some 'gawk' code
+that is inserted during migration in front of all other code.  The
+optional parameter 'Label' may contain a string that tells the agent
+what to do in program execution after arrival at its new home site.  One
+of the serious obstacles in implementing a framework for mobile agents
+is that it does not suffice to migrate the code.  It is also necessary
+to migrate the state of execution of the agent.  In contrast to 'Agent
+Tcl', this program does not try to migrate the complete set of
+variables.  The following conventions are used:
+
+   * Each variable in an agent program is local to the current host and
+     does _not_ migrate.
+
+   * The array 'MOBFUN' shown above is an exception.  It is handled by
+     the function 'migrate()' and does migrate with the application.
+
+   * The other exception is the array 'MOBVAR'.  Each variable that
+     takes part in migration has to be an element of this array.
+     'migrate()' also takes care of this.
+
+   Now it's clear what happens to the 'Label' parameter of the function
+'migrate()'.  It is copied into 'MOBVAR["Label"]' and travels alongside
+the other data.  Since travelling takes place via HTTP, records must be
+separated with '"\r\n"' in 'RS' and 'ORS' as usual.  The code assembly
+for migration takes place in three steps:
+
+   * Iterate over 'MOBFUN' to collect all functions verbatim.
+
+   * Prepare a 'BEGIN' pattern and put assignments to mobile variables
+     into the action part.
+
+   * Transmission itself resembles GETURL: the header with the request
+     and the 'Content-length' is followed by the body.  In case there is
+     any reply over the network, it is read completely and echoed to
+     standard output to avoid irritating the server.
+
+   The application-independent framework is now almost complete.  What
+follows is the 'END' pattern that is executed when the mobile agent has
+finished reading its own code.  First, it checks whether it is already
+running on a remote host or not.  In case initialization has not yet
+taken place, it starts 'MyInit()'.  Otherwise (later, on a remote host),
+it starts 'MyJob()':
+
+     END {
+       if (ARGC != 2) exit    # stop when called with wrong parameters
+       if (MyOrigin != "")    # is this the originating host?
+         MyInit()             # if so, initialize the application
+       else                   # we are on a host with migrated data
+         MyJob()              # so we do our job
+     }
+
+   All that's left to extend the framework into a complete application
+is to write two application-specific functions: 'MyInit()' and
+'MyJob()'.  Keep in mind that the former is executed once on the
+originating host, while the latter is executed after each migration:
+
+     function MyInit() {
+       MOBVAR["MyOrigin"] = MyOrigin
+       MOBVAR["Machines"] = "localhost/80 max/80 moritz/80 castor/80"
+       split(MOBVAR["Machines"], Machines)           # which host is the first?
+       migrate(Machines[1], "", "")                  # go to the first host
+       while (("/inet/tcp/8080/0/0" |& getline) > 0) # wait for result
+         print $0                                    # print result
+       close("/inet/tcp/8080/0/0")
+     }
+
+   As mentioned earlier, this agent takes the name of its origin
+('MyOrigin') with it.  Then, it takes the name of its first destination
+and goes there for further work.  Notice that this name has the port
+number of the web server appended to the name of the server, because the
+function 'migrate()' needs it this way to create the 'HttpService'
+variable.  Finally, it waits for the result to arrive.  The 'MyJob()'
+function runs on the remote host:
+
+     function MyJob() {
+       # forget this host
+       sub(MOBVAR["Destination"], "", MOBVAR["Machines"])
+       MOBVAR["Result"]=MOBVAR["Result"] SUBSEP SUBSEP MOBVAR["Destination"] ":"
+       while (("who" | getline) > 0)               # who is logged in?
+         MOBVAR["Result"] = MOBVAR["Result"] SUBSEP $0
+       close("who")
+       if (index(MOBVAR["Machines"], "/") > 0) {   # any more machines to visit?
+         split(MOBVAR["Machines"], Machines)       # which host is next?
+         migrate(Machines[1], "", "")              # go there
+       } else {                                    # no more machines
+         gsub(SUBSEP, "\n", MOBVAR["Result"])      # send result to origin
+         print MOBVAR["Result"] |& "/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080"
+         close("/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080")
+       }
+     }
+
+   After migrating, the first thing to do in 'MyJob()' is to delete the
+name of the current host from the list of hosts to visit.  Now, it is
+time to start the real work by appending the host's name to the result
+string, and reading line by line who is logged in on this host.  A very
+annoying circumstance is the fact that the elements of 'MOBVAR' cannot
+hold the newline character ('"\n"').  If they did, migration of this
+string did not work because the string didn't obey the syntax rule for a
+string in 'gawk'.  'SUBSEP' is used as a temporary replacement.  If the
+list of hosts to visit holds at least one more entry, the agent migrates
+to that place to go on working there.  Otherwise, we replace the
+'SUBSEP's with a newline character in the resulting string, and report
+it to the originating host, whose name is stored in
+'MOBVAR["MyOrigin"]'.
+
+   ---------- Footnotes ----------
+
+   (1) <http://www.research.ibm.com/massive/mobag.ps>
+
+
+File: gawkinet.info,  Node: STOXPRED,  Next: PROTBASE,  Prev: MOBAGWHO,  Up: Some Applications and Techniques
+
+3.9 STOXPRED: Stock Market Prediction As A Service
+==================================================
+
+     Far out in the uncharted backwaters of the unfashionable end of the
+     Western Spiral arm of the Galaxy lies a small unregarded yellow
+     sun.
+
+     Orbiting this at a distance of roughly ninety-two million miles is
+     an utterly insignificant little blue-green planet whose
+     ape-descendent life forms are so amazingly primitive that they
+     still think digital watches are a pretty neat idea.
+
+     This planet has -- or rather had -- a problem, which was this: most
+     of the people living on it were unhappy for pretty much of the
+     time.  Many solutions were suggested for this problem, but most of
+     these were largely concerned with the movements of small green
+     pieces of paper, which is odd because it wasn't the small green
+     pieces of paper that were unhappy.
+     Douglas Adams, 'The Hitch Hiker's Guide to the Galaxy'
+
+   Valuable services on the Internet are usually _not_ implemented as
+mobile agents.  There are much simpler ways of implementing services.
+All Unix systems provide, for example, the 'cron' service.  Unix system
+users can write a list of tasks to be done each day, each week, twice a
+day, or just once.  The list is entered into a file named 'crontab'.
+For example, to distribute a newsletter on a daily basis this way, use
+'cron' for calling a script each day early in the morning.
+
+     # run at 8 am on weekdays, distribute the newsletter
+     0 8 * * 1-5   $HOME/bin/daily.job >> $HOME/log/newsletter 2>&1
+
+   The script first looks for interesting information on the Internet,
+assembles it in a nice form and sends the results via email to the
+customers.
+
+   The following is an example of a primitive newsletter on stock market
+prediction.  It is a report which first tries to predict the change of
+each share in the Dow Jones Industrial Index for the particular day.
+Then it mentions some especially promising shares as well as some shares
+which look remarkably bad on that day.  The report ends with the usual
+disclaimer which tells every child _not_ to try this at home and hurt
+anybody.
+
+     Good morning Uncle Scrooge,
+
+     This is your daily stock market report for Monday, October 16, 2000.
+     Here are the predictions for today:
+
+             AA      neutral
+             GE      up
+             JNJ     down
+             MSFT    neutral
+             ...
+             UTX     up
+             DD      down
+             IBM     up
+             MO      down
+             WMT     up
+             DIS     up
+             INTC    up
+             MRK     down
+             XOM     down
+             EK      down
+             IP      down
+
+     The most promising shares for today are these:
+
+             INTC            http://biz.yahoo.com/n/i/intc.html
+
+     The stock shares to avoid today are these:
+
+             EK              http://biz.yahoo.com/n/e/ek.html
+             IP              http://biz.yahoo.com/n/i/ip.html
+             DD              http://biz.yahoo.com/n/d/dd.html
+             ...
+
+   The script as a whole is rather long.  In order to ease the pain of
+studying other people's source code, we have broken the script up into
+meaningful parts which are invoked one after the other.  The basic
+structure of the script is as follows:
+
+     BEGIN {
+       Init()
+       ReadQuotes()
+       CleanUp()
+       Prediction()
+       Report()
+       SendMail()
+     }
+
+   The earlier parts store data into variables and arrays which are
+subsequently used by later parts of the script.  The 'Init()' function
+first checks if the script is invoked correctly (without any
+parameters).  If not, it informs the user of the correct usage.  What
+follows are preparations for the retrieval of the historical quote data.
+The names of the 30 stock shares are stored in an array 'name' along
+with the current date in 'day', 'month', and 'year'.
+
+   All users who are separated from the Internet by a firewall and have
+to direct their Internet accesses to a proxy must supply the name of the
+proxy to this script with the '-v Proxy=NAME' option.  For most users,
+the default proxy and port number should suffice.
+
+     function Init() {
+       if (ARGC != 1) {
+         print "STOXPRED - daily stock share prediction"
+         print "IN:\n    no parameters, nothing on stdin"
+         print "PARAM:\n    -v Proxy=MyProxy -v ProxyPort=80"
+         print "OUT:\n    commented predictions as email"
+         print "JK 09.10.2000"
+         exit
+       }
+       # Remember ticker symbols from Dow Jones Industrial Index
+       StockCount = split("AA GE JNJ MSFT AXP GM JPM PG BA HD KO \
+         SBC C HON MCD T CAT HWP MMM UTX DD IBM MO WMT DIS INTC \
+         MRK XOM EK IP", name);
+       # Remember the current date as the end of the time series
+       day   = strftime("%d")
+       month = strftime("%m")
+       year  = strftime("%Y")
+       if (Proxy     == "")  Proxy     = "chart.yahoo.com"
+       if (ProxyPort ==  0)  ProxyPort = 80
+       YahooData = "/inet/tcp/0/" Proxy "/" ProxyPort
+     }
+
+   There are two really interesting parts in the script.  One is the
+function which reads the historical stock quotes from an Internet
+server.  The other is the one that does the actual prediction.  In the
+following function we see how the quotes are read from the Yahoo server.
+The data which comes from the server is in CSV format (comma-separated
+values):
+
+     Date,Open,High,Low,Close,Volume
+     9-Oct-00,22.75,22.75,21.375,22.375,7888500
+     6-Oct-00,23.8125,24.9375,21.5625,22,10701100
+     5-Oct-00,24.4375,24.625,23.125,23.50,5810300
+
+   Lines contain values of the same time instant, whereas columns are
+separated by commas and contain the kind of data that is described in
+the header (first) line.  At first, 'gawk' is instructed to separate
+columns by commas ('FS = ","').  In the loop that follows, a connection
+to the Yahoo server is first opened, then a download takes place, and
+finally the connection is closed.  All this happens once for each ticker
+symbol.  In the body of this loop, an Internet address is built up as a
+string according to the rules of the Yahoo server.  The starting and
+ending date are chosen to be exactly the same, but one year apart in the
+past.  All the action is initiated within the 'printf' command which
+transmits the request for data to the Yahoo server.
+
+   In the inner loop, the server's data is first read and then scanned
+line by line.  Only lines which have six columns and the name of a month
+in the first column contain relevant data.  This data is stored in the
+two-dimensional array 'quote'; one dimension being time, the other being
+the ticker symbol.  During retrieval of the first stock's data, the
+calendar names of the time instances are stored in the array 'day'
+because we need them later.
+
+     function ReadQuotes() {
+       # Retrieve historical data for each ticker symbol
+       FS = ","
+       for (stock = 1; stock <= StockCount; stock++) {
+         URL = "http://chart.yahoo.com/table.csv?s=" name[stock] \
+               "&a=" month "&b=" day   "&c=" year-1 \
+               "&d=" month "&e=" day   "&f=" year \
+               "g=d&q=q&y=0&z=" name[stock] "&x=.csv"
+         printf("GET " URL " HTTP/1.0\r\n\r\n") |& YahooData
+         while ((YahooData |& getline) > 0) {
+           if (NF == 6 && $1 ~ /Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec/) {
+             if (stock == 1)
+               days[++daycount] = $1;
+             quote[$1, stock] = $5
+           }
+         }
+         close(YahooData)
+       }
+       FS = " "
+     }
+
+   Now that we _have_ the data, it can be checked once again to make
+sure that no individual stock is missing or invalid, and that all the
+stock quotes are aligned correctly.  Furthermore, we renumber the time
+instances.  The most recent day gets day number 1 and all other days get
+consecutive numbers.  All quotes are rounded toward the nearest whole
+number in US Dollars.
+
+     function CleanUp() {
+       # clean up time series; eliminate incomplete data sets
+       for (d = 1; d <= daycount; d++) {
+         for (stock = 1; stock <= StockCount; stock++)
+           if (! ((days[d], stock) in quote))
+               stock = StockCount + 10
+         if (stock > StockCount + 1)
+             continue
+         datacount++
+         for (stock = 1; stock <= StockCount; stock++)
+           data[datacount, stock] = int(0.5 + quote[days[d], stock])
+       }
+       delete quote
+       delete days
+     }
+
+   Now we have arrived at the second really interesting part of the
+whole affair.  What we present here is a very primitive prediction
+algorithm: _If a stock fell yesterday, assume it will also fall today;
+if it rose yesterday, assume it will rise today_.  (Feel free to replace
+this algorithm with a smarter one.)  If a stock changed in the same
+direction on two consecutive days, this is an indication which should be
+highlighted.  Two-day advances are stored in 'hot' and two-day declines
+in 'avoid'.
+
+   The rest of the function is a sanity check.  It counts the number of
+correct predictions in relation to the total number of predictions one
+could have made in the year before.
+
+     function Prediction() {
+       # Predict each ticker symbol by prolonging yesterday's trend
+       for (stock = 1; stock <= StockCount; stock++) {
+         if         (data[1, stock] > data[2, stock]) {
+           predict[stock] = "up"
+         } else if  (data[1, stock] < data[2, stock]) {
+           predict[stock] = "down"
+         } else {
+           predict[stock] = "neutral"
+         }
+         if ((data[1, stock] > data[2, stock]) && (data[2, stock] > data[3, stock]))
+           hot[stock] = 1
+         if ((data[1, stock] < data[2, stock]) && (data[2, stock] < data[3, stock]))
+           avoid[stock] = 1
+       }
+       # Do a plausibility check: how many predictions proved correct?
+       for (s = 1; s <= StockCount; s++) {
+         for (d = 1; d <= datacount-2; d++) {
+           if         (data[d+1, s] > data[d+2, s]) {
+             UpCount++
+           } else if  (data[d+1, s] < data[d+2, s]) {
+             DownCount++
+           } else {
+             NeutralCount++
+           }
+           if (((data[d, s]  > data[d+1, s]) && (data[d+1, s]  > data[d+2, s])) ||
+               ((data[d, s]  < data[d+1, s]) && (data[d+1, s]  < data[d+2, s])) ||
+               ((data[d, s] == data[d+1, s]) && (data[d+1, s] == data[d+2, s])))
+             CorrectCount++
+         }
+       }
+     }
+
+   At this point the hard work has been done: the array 'predict'
+contains the predictions for all the ticker symbols.  It is up to the
+function 'Report()' to find some nice words to introduce the desired
+information.
+
+     function Report() {
+       # Generate report
+       report =        "\nThis is your daily "
+       report = report "stock market report for "strftime("%A, %B %d, %Y")".\n"
+       report = report "Here are the predictions for today:\n\n"
+       for (stock = 1; stock <= StockCount; stock++)
+         report = report "\t" name[stock] "\t" predict[stock] "\n"
+       for (stock in hot) {
+         if (HotCount++ == 0)
+           report = report "\nThe most promising shares for today are these:\n\n"
+         report = report "\t" name[stock] "\t\thttp://biz.yahoo.com/n/" \
+           tolower(substr(name[stock], 1, 1)) "/" tolower(name[stock]) ".html\n"
+       }
+       for (stock in avoid) {
+         if (AvoidCount++ == 0)
+           report = report "\nThe stock shares to avoid today are these:\n\n"
+         report = report "\t" name[stock] "\t\thttp://biz.yahoo.com/n/" \
+           tolower(substr(name[stock], 1, 1)) "/" tolower(name[stock]) ".html\n"
+       }
+       report = report "\nThis sums up to " HotCount+0 " winners and " AvoidCount+0
+       report = report " losers. When using this kind\nof prediction scheme for"
+       report = report " the 12 months which lie behind us,\nwe get " UpCount
+       report = report " 'ups' and " DownCount " 'downs' and " NeutralCount
+       report = report " 'neutrals'. Of all\nthese " UpCount+DownCount+NeutralCount
+       report = report " predictions " CorrectCount " proved correct next day.\n"
+       report = report "A success rate of "\
+                  int(100*CorrectCount/(UpCount+DownCount+NeutralCount)) "%.\n"
+       report = report "Random choice would have produced a 33% success rate.\n"
+       report = report "Disclaimer: Like every other prediction of the stock\n"
+       report = report "market, this report is, of course, complete nonsense.\n"
+       report = report "If you are stupid enough to believe these predictions\n"
+       report = report "you should visit a doctor who can treat your ailment."
+     }
+
+   The function 'SendMail()' goes through the list of customers and
+opens a pipe to the 'mail' command for each of them.  Each one receives
+an email message with a proper subject heading and is addressed with his
+full name.
+
+     function SendMail() {
+       # send report to customers
+       customer["uncle.scrooge@ducktown.gov"] = "Uncle Scrooge"
+       customer["more@utopia.org"           ] = "Sir Thomas More"
+       customer["spinoza@denhaag.nl"        ] = "Baruch de Spinoza"
+       customer["marx@highgate.uk"          ] = "Karl Marx"
+       customer["keynes@the.long.run"       ] = "John Maynard Keynes"
+       customer["bierce@devil.hell.org"     ] = "Ambrose Bierce"
+       customer["laplace@paris.fr"          ] = "Pierre Simon de Laplace"
+       for (c in customer) {
+         MailPipe = "mail -s 'Daily Stock Prediction Newsletter'" c
+         print "Good morning " customer[c] "," | MailPipe
+         print report "\n.\n" | MailPipe
+         close(MailPipe)
+       }
+     }
+
+   Be patient when running the script by hand.  Retrieving the data for
+all the ticker symbols and sending the emails may take several minutes
+to complete, depending upon network traffic and the speed of the
+available Internet link.  The quality of the prediction algorithm is
+likely to be disappointing.  Try to find a better one.  Should you find
+one with a success rate of more than 50%, please tell us about it!  It
+is only for the sake of curiosity, of course.  ':-)'
+
+
+File: gawkinet.info,  Node: PROTBASE,  Prev: STOXPRED,  Up: Some Applications and Techniques
+
+3.10 PROTBASE: Searching Through A Protein Database
+===================================================
+
+     Hoare's Law of Large Problems: Inside every large problem is a
+     small problem struggling to get out.
+
+   Yahoo's database of stock market data is just one among the many
+large databases on the Internet.  Another one is located at NCBI
+(National Center for Biotechnology Information).  Established in 1988 as
+a national resource for molecular biology information, NCBI creates
+public databases, conducts research in computational biology, develops
+software tools for analyzing genome data, and disseminates biomedical
+information.  In this section, we look at one of NCBI's public services,
+which is called BLAST (Basic Local Alignment Search Tool).
+
+   You probably know that the information necessary for reproducing
+living cells is encoded in the genetic material of the cells.  The
+genetic material is a very long chain of four base nucleotides.  It is
+the order of appearance (the sequence) of nucleotides which contains the
+information about the substance to be produced.  Scientists in
+biotechnology often find a specific fragment, determine the nucleotide
+sequence, and need to know where the sequence at hand comes from.  This
+is where the large databases enter the game.  At NCBI, databases store
+the knowledge about which sequences have ever been found and where they
+have been found.  When the scientist sends his sequence to the BLAST
+service, the server looks for regions of genetic material in its
+database which look the most similar to the delivered nucleotide
+sequence.  After a search time of some seconds or minutes the server
+sends an answer to the scientist.  In order to make access simple, NCBI
+chose to offer their database service through popular Internet
+protocols.  There are four basic ways to use the so-called BLAST
+services:
+
+   * The easiest way to use BLAST is through the web.  Users may simply
+     point their browsers at the NCBI home page and link to the BLAST
+     pages.  NCBI provides a stable URL that may be used to perform
+     BLAST searches without interactive use of a web browser.  This is
+     what we will do later in this section.  A demonstration client and
+     a 'README' file demonstrate how to access this URL.
+
+   * Currently, 'blastcl3' is the standard network BLAST client.  You
+     can download 'blastcl3' from the anonymous FTP location.
+
+   * BLAST 2.0 can be run locally as a full executable and can be used
+     to run BLAST searches against private local databases, or
+     downloaded copies of the NCBI databases.  BLAST 2.0 executables may
+     be found on the NCBI anonymous FTP server.
+
+   * The NCBI BLAST Email server is the best option for people without
+     convenient access to the web.  A similarity search can be performed
+     by sending a properly formatted mail message containing the
+     nucleotide or protein query sequence to <blast@ncbi.nlm.nih.gov>.
+     The query sequence is compared against the specified database using
+     the BLAST algorithm and the results are returned in an email
+     message.  For more information on formulating email BLAST searches,
+     you can send a message consisting of the word "HELP" to the same
+     address, <blast@ncbi.nlm.nih.gov>.
+
+   Our starting point is the demonstration client mentioned in the first
+option.  The 'README' file that comes along with the client explains the
+whole process in a nutshell.  In the rest of this section, we first show
+what such requests look like.  Then we show how to use 'gawk' to
+implement a client in about 10 lines of code.  Finally, we show how to
+interpret the result returned from the service.
+
+   Sequences are expected to be represented in the standard IUB/IUPAC
+amino acid and nucleic acid codes, with these exceptions: lower-case
+letters are accepted and are mapped into upper-case; a single hyphen or
+dash can be used to represent a gap of indeterminate length; and in
+amino acid sequences, 'U' and '*' are acceptable letters (see below).
+Before submitting a request, any numerical digits in the query sequence
+should either be removed or replaced by appropriate letter codes (e.g.,
+'N' for unknown nucleic acid residue or 'X' for unknown amino acid
+residue).  The nucleic acid codes supported are:
+
+     A --> adenosine               M --> A C (amino)
+     C --> cytidine                S --> G C (strong)
+     G --> guanine                 W --> A T (weak)
+     T --> thymidine               B --> G T C
+     U --> uridine                 D --> G A T
+     R --> G A (purine)            H --> A C T
+     Y --> T C (pyrimidine)        V --> G C A
+     K --> G T (keto)              N --> A G C T (any)
+                                   -  gap of indeterminate length
+
+   Now you know the alphabet of nucleotide sequences.  The last two
+lines of the following example query show you such a sequence, which is
+obviously made up only of elements of the alphabet just described.
+Store this example query into a file named 'protbase.request'.  You are
+now ready to send it to the server with the demonstration client.
+
+     PROGRAM blastn
+     DATALIB month
+     EXPECT 0.75
+     BEGIN
+     >GAWK310 the gawking gene GNU AWK
+     tgcttggctgaggagccataggacgagagcttcctggtgaagtgtgtttcttgaaatcat
+     caccaccatggacagcaaa
+
+   The actual search request begins with the mandatory parameter
+'PROGRAM' in the first column followed by the value 'blastn' (the name
+of the program) for searching nucleic acids.  The next line contains the
+mandatory search parameter 'DATALIB' with the value 'month' for the
+newest nucleic acid sequences.  The third line contains an optional
+'EXPECT' parameter and the value desired for it.  The fourth line
+contains the mandatory 'BEGIN' directive, followed by the query sequence
+in FASTA/Pearson format.  Each line of information must be less than 80
+characters in length.
+
+   The "month" database contains all new or revised sequences released
+in the last 30 days and is useful for searching against new sequences.
+There are five different blast programs, 'blastn' being the one that
+compares a nucleotide query sequence against a nucleotide sequence
+database.
+
+   The last server directive that must appear in every request is the
+'BEGIN' directive.  The query sequence should immediately follow the
+'BEGIN' directive and must appear in FASTA/Pearson format.  A sequence
+in FASTA/Pearson format begins with a single-line description.  The
+description line, which is required, is distinguished from the lines of
+sequence data that follow it by having a greater-than ('>') symbol in
+the first column.  For the purposes of the BLAST server, the text of the
+description is arbitrary.
+
+   If you prefer to use a client written in 'gawk', just store the
+following 10 lines of code into a file named 'protbase.awk' and use this
+client instead.  Invoke it with 'gawk -f protbase.awk protbase.request'.
+Then wait a minute and watch the result coming in.  In order to
+replicate the demonstration client's behavior as closely as possible,
+this client does not use a proxy server.  We could also have extended
+the client program in *note Retrieving Web Pages: GETURL, to implement
+the client request from 'protbase.awk' as a special case.
+
+     { request = request "\n" $0 }
+
+     END {
+       BLASTService     = "/inet/tcp/0/www.ncbi.nlm.nih.gov/80"
+       printf "POST /cgi-bin/BLAST/nph-blast_report HTTP/1.0\n" |& BLASTService
+       printf "Content-Length: " length(request) "\n\n"         |& BLASTService
+       printf request                                           |& BLASTService
+       while ((BLASTService |& getline) > 0)
+           print $0
+       close(BLASTService)
+     }
+
+   The demonstration client from NCBI is 214 lines long (written in C)
+and it is not immediately obvious what it does.  Our client is so short
+that it _is_ obvious what it does.  First it loops over all lines of the
+query and stores the whole query into a variable.  Then the script
+establishes an Internet connection to the NCBI server and transmits the
+query by framing it with a proper HTTP request.  Finally it receives and
+prints the complete result coming from the server.
+
+   Now, let us look at the result.  It begins with an HTTP header, which
+you can ignore.  Then there are some comments about the query having
+been filtered to avoid spuriously high scores.  After this, there is a
+reference to the paper that describes the software being used for
+searching the data base.  After a repetition of the original query's
+description we find the list of significant alignments:
+
+     Sequences producing significant alignments:                        (bits)  Value
+
+     gb|AC021182.14|AC021182 Homo sapiens chromosome 7 clone RP11-733...    38  0.20
+     gb|AC021056.12|AC021056 Homo sapiens chromosome 3 clone RP11-115...    38  0.20
+     emb|AL160278.10|AL160278 Homo sapiens chromosome 9 clone RP11-57...    38  0.20
+     emb|AL391139.11|AL391139 Homo sapiens chromosome X clone RP11-35...    38  0.20
+     emb|AL365192.6|AL365192 Homo sapiens chromosome 6 clone RP3-421H...    38  0.20
+     emb|AL138812.9|AL138812 Homo sapiens chromosome 11 clone RP1-276...    38  0.20
+     gb|AC073881.3|AC073881 Homo sapiens chromosome 15 clone CTD-2169...    38  0.20
+
+   This means that the query sequence was found in seven human
+chromosomes.  But the value 0.20 (20%) means that the probability of an
+accidental match is rather high (20%) in all cases and should be taken
+into account.  You may wonder what the first column means.  It is a key
+to the specific database in which this occurrence was found.  The unique
+sequence identifiers reported in the search results can be used as
+sequence retrieval keys via the NCBI server.  The syntax of sequence
+header lines used by the NCBI BLAST server depends on the database from
+which each sequence was obtained.  The table below lists the identifiers
+for the databases from which the sequences were derived.
+
+     Database Name                     Identifier Syntax
+     ============================      ========================
+     GenBank                           gb|accession|locus
+     EMBL Data Library                 emb|accession|locus
+     DDBJ, DNA Database of Japan       dbj|accession|locus
+     NBRF PIR                          pir||entry
+     Protein Research Foundation       prf||name
+     SWISS-PROT                        sp|accession|entry name
+     Brookhaven Protein Data Bank      pdb|entry|chain
+     Kabat's Sequences of Immuno...    gnl|kabat|identifier
+     Patents                           pat|country|number
+     GenInfo Backbone Id               bbs|number
+
+   For example, an identifier might be 'gb|AC021182.14|AC021182', where
+the 'gb' tag indicates that the identifier refers to a GenBank sequence,
+'AC021182.14' is its GenBank ACCESSION, and 'AC021182' is the GenBank
+LOCUS. The identifier contains no spaces, so that a space indicates the
+end of the identifier.
+
+   Let us continue in the result listing.  Each of the seven alignments
+mentioned above is subsequently described in detail.  We will have a
+closer look at the first of them.
+
+     >gb|AC021182.14|AC021182 Homo sapiens chromosome 7 clone RP11-733N23, WORKING DRAFT SEQUENCE, 4
+                  unordered pieces
+               Length = 176383
+
+      Score = 38.2 bits (19), Expect = 0.20
+      Identities = 19/19 (100%)
+      Strand = Plus / Plus
+
+     Query: 35    tggtgaagtgtgtttcttg 53
+                  |||||||||||||||||||
+     Sbjct: 69786 tggtgaagtgtgtttcttg 69804
+
+   This alignment was located on the human chromosome 7.  The fragment
+on which part of the query was found had a total length of 176383.  Only
+19 of the nucleotides matched and the matching sequence ran from
+character 35 to 53 in the query sequence and from 69786 to 69804 in the
+fragment on chromosome 7.  If you are still reading at this point, you
+are probably interested in finding out more about Computational Biology
+and you might appreciate the following hints.
+
+  1. There is a book called 'Introduction to Computational Biology' by
+     Michael S. Waterman, which is worth reading if you are seriously
+     interested.  You can find a good book review on the Internet.
+
+  2. While Waterman's book can explain to you the algorithms employed
+     internally in the database search engines, most practitioners
+     prefer to approach the subject differently.  The applied side of
+     Computational Biology is called Bioinformatics, and emphasizes the
+     tools available for day-to-day work as well as how to actually
+     _use_ them.  One of the very few affordable books on Bioinformatics
+     is 'Developing Bioinformatics Computer Skills'.
+
+  3. The sequences _gawk_ and _gnuawk_ are in widespread use in the
+     genetic material of virtually every earthly living being.  Let us
+     take this as a clear indication that the divine creator has
+     intended 'gawk' to prevail over other scripting languages such as
+     'perl', 'tcl', or 'python' which are not even proper sequences.
+     (:-)
+
+
+File: gawkinet.info,  Node: Links,  Next: GNU Free Documentation License,  Prev: Some Applications and Techniques,  Up: Top
+
+4 Related Links
+***************
+
+This section lists the URLs for various items discussed in this major
+node.  They are presented in the order in which they appear.
+
+'Internet Programming with Python'
+     <http://www.fsbassociates.com/books/python.htm>
+
+'Advanced Perl Programming'
+     <http://www.oreilly.com/catalog/advperl>
+
+'Web Client Programming with Perl'
+     <http://www.oreilly.com/catalog/webclient>
+
+Richard Stevens's home page and book
+     <http://www.kohala.com/~rstevens>
+
+The SPAK home page
+     <http://www.userfriendly.net/linux/RPM/contrib/libc6/i386/spak-0.6b-1.i386.html>
+
+Volume III of 'Internetworking with TCP/IP', by Comer and Stevens
+     <http://www.cs.purdue.edu/homes/dec/tcpip3s.cont.html>
+
+XBM Graphics File Format
+     <http://www.wotsit.org/download.asp?f=xbm>
+
+GNUPlot
+     <http://www.cs.dartmouth.edu/gnuplot_info.html>
+
+Mark Humphrys' Eliza page
+     <http://www.compapp.dcu.ie/~humphrys/eliza.html>
+
+Yahoo! Eliza Information
+     <http://dir.yahoo.com/Recreation/Games/Computer_Games/Internet_Games/Web_Games/Artificial_Intelligence>
+
+Java versions of Eliza
+     <http://www.tjhsst.edu/Psych/ch1/eliza.html>
+
+Java versions of Eliza with source code
+     <http://home.adelphia.net/~lifeisgood/eliza/eliza.htm>
+
+Eliza Programs with Explanations
+     <http://chayden.net/chayden/eliza/Eliza.shtml>
+
+Loebner Contest
+     <http://acm.org/~loebner/loebner-prize.htmlx>
+
+Tck/Tk Information
+     <http://www.scriptics.com/>
+
+Intel 80x86 Processors
+     <http://developer.intel.com/design/platform/embedpc/what_is.htm>
+
+AMD Elan Processors
+     <http://www.amd.com/products/epd/processors/4.32bitcont/32bitcont/index.html>
+
+XINU
+     <http://willow.canberra.edu.au/~chrisc/xinu.html>
+
+GNU/Linux
+     <http://uclinux.lineo.com/>
+
+Embedded PCs
+     <http://dir.yahoo.com/Business_and_Economy/Business_to_Business/Computers/Hardware/Embedded_Control/>
+
+MiniSQL
+     <http://www.hughes.com.au/library/>
+
+Market Share Surveys
+     <http://www.netcraft.com/survey>
+
+'Numerical Recipes in C: The Art of Scientific Computing'
+     <http://www.nr.com>
+
+VRML
+     <http://www.vrml.org>
+
+The VRML FAQ
+     <http://www.vrml.org/technicalinfo/specifications/specifications.htm#FAQ>
+
+The UMBC Agent Web
+     <http://www.cs.umbc.edu/agents>
+
+Apache Web Server
+     <http://www.apache.org>
+
+National Center for Biotechnology Information (NCBI)
+     <http://www.ncbi.nlm.nih.gov>
+
+Basic Local Alignment Search Tool (BLAST)
+     <http://www.ncbi.nlm.nih.gov/BLAST/blast_overview.html>
+
+NCBI Home Page
+     <http://www.ncbi.nlm.nih.gov>
+
+BLAST Pages
+     <http://www.ncbi.nlm.nih.gov/BLAST>
+
+BLAST Demonstration Client
+     <ftp://ncbi.nlm.nih.gov/blast/blasturl/>
+
+BLAST anonymous FTP location
+     <ftp://ncbi.nlm.nih.gov/blast/network/netblast/>
+
+BLAST 2.0 Executables
+     <ftp://ncbi.nlm.nih.gov/blast/executables/>
+
+IUB/IUPAC Amino Acid and Nucleic Acid Codes
+     <http://www.uthscsa.edu/geninfo/blastmail.html#item6>
+
+FASTA/Pearson Format
+     <http://www.ncbi.nlm.nih.gov/BLAST/fasta.html>
+
+Fasta/Pearson Sequence in Java
+     <http://www.kazusa.or.jp/java/codon_table_java/>
+
+Book Review of 'Introduction to Computational Biology'
+     <http://www.acm.org/crossroads/xrds5-1/introcb.html>
+
+'Developing Bioinformatics Computer Skills'
+     <http://www.oreilly.com/catalog/bioskills/>
+
+
+File: gawkinet.info,  Node: GNU Free Documentation License,  Next: Index,  Prev: Links,  Up: Top
+
+GNU Free Documentation License
+******************************
+
+                     Version 1.3, 3 November 2008
+
+     Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
+     <http://fsf.org/>
+
+     Everyone is permitted to copy and distribute verbatim copies
+     of this license document, but changing it is not allowed.
+
+  0. PREAMBLE
+
+     The purpose of this License is to make a manual, textbook, or other
+     functional and useful document "free" in the sense of freedom: to
+     assure everyone the effective freedom to copy and redistribute it,
+     with or without modifying it, either commercially or
+     noncommercially.  Secondarily, this License preserves for the
+     author and publisher a way to get credit for their work, while not
+     being considered responsible for modifications made by others.
+
+     This License is a kind of "copyleft", which means that derivative
+     works of the document must themselves be free in the same sense.
+     It complements the GNU General Public License, which is a copyleft
+     license designed for free software.
+
+     We have designed this License in order to use it for manuals for
+     free software, because free software needs free documentation: a
+     free program should come with manuals providing the same freedoms
+     that the software does.  But this License is not limited to
+     software manuals; it can be used for any textual work, regardless
+     of subject matter or whether it is published as a printed book.  We
+     recommend this License principally for works whose purpose is
+     instruction or reference.
+
+  1. APPLICABILITY AND DEFINITIONS
+
+     This License applies to any manual or other work, in any medium,
+     that contains a notice placed by the copyright holder saying it can
+     be distributed under the terms of this License.  Such a notice
+     grants a world-wide, royalty-free license, unlimited in duration,
+     to use that work under the conditions stated herein.  The
+     "Document", below, refers to any such manual or work.  Any member
+     of the public is a licensee, and is addressed as "you".  You accept
+     the license if you copy, modify or distribute the work in a way
+     requiring permission under copyright law.
+
+     A "Modified Version" of the Document means any work containing the
+     Document or a portion of it, either copied verbatim, or with
+     modifications and/or translated into another language.
+
+     A "Secondary Section" is a named appendix or a front-matter section
+     of the Document that deals exclusively with the relationship of the
+     publishers or authors of the Document to the Document's overall
+     subject (or to related matters) and contains nothing that could
+     fall directly within that overall subject.  (Thus, if the Document
+     is in part a textbook of mathematics, a Secondary Section may not
+     explain any mathematics.)  The relationship could be a matter of
+     historical connection with the subject or with related matters, or
+     of legal, commercial, philosophical, ethical or political position
+     regarding them.
+
+     The "Invariant Sections" are certain Secondary Sections whose
+     titles are designated, as being those of Invariant Sections, in the
+     notice that says that the Document is released under this License.
+     If a section does not fit the above definition of Secondary then it
+     is not allowed to be designated as Invariant.  The Document may
+     contain zero Invariant Sections.  If the Document does not identify
+     any Invariant Sections then there are none.
+
+     The "Cover Texts" are certain short passages of text that are
+     listed, as Front-Cover Texts or Back-Cover Texts, in the notice
+     that says that the Document is released under this License.  A
+     Front-Cover Text may be at most 5 words, and a Back-Cover Text may
+     be at most 25 words.
+
+     A "Transparent" copy of the Document means a machine-readable copy,
+     represented in a format whose specification is available to the
+     general public, that is suitable for revising the document
+     straightforwardly with generic text editors or (for images composed
+     of pixels) generic paint programs or (for drawings) some widely
+     available drawing editor, and that is suitable for input to text
+     formatters or for automatic translation to a variety of formats
+     suitable for input to text formatters.  A copy made in an otherwise
+     Transparent file format whose markup, or absence of markup, has
+     been arranged to thwart or discourage subsequent modification by
+     readers is not Transparent.  An image format is not Transparent if
+     used for any substantial amount of text.  A copy that is not
+     "Transparent" is called "Opaque".
+
+     Examples of suitable formats for Transparent copies include plain
+     ASCII without markup, Texinfo input format, LaTeX input format,
+     SGML or XML using a publicly available DTD, and standard-conforming
+     simple HTML, PostScript or PDF designed for human modification.
+     Examples of transparent image formats include PNG, XCF and JPG.
+     Opaque formats include proprietary formats that can be read and
+     edited only by proprietary word processors, SGML or XML for which
+     the DTD and/or processing tools are not generally available, and
+     the machine-generated HTML, PostScript or PDF produced by some word
+     processors for output purposes only.
+
+     The "Title Page" means, for a printed book, the title page itself,
+     plus such following pages as are needed to hold, legibly, the
+     material this License requires to appear in the title page.  For
+     works in formats which do not have any title page as such, "Title
+     Page" means the text near the most prominent appearance of the
+     work's title, preceding the beginning of the body of the text.
+
+     The "publisher" means any person or entity that distributes copies
+     of the Document to the public.
+
+     A section "Entitled XYZ" means a named subunit of the Document
+     whose title either is precisely XYZ or contains XYZ in parentheses
+     following text that translates XYZ in another language.  (Here XYZ
+     stands for a specific section name mentioned below, such as
+     "Acknowledgements", "Dedications", "Endorsements", or "History".)
+     To "Preserve the Title" of such a section when you modify the
+     Document means that it remains a section "Entitled XYZ" according
+     to this definition.
+
+     The Document may include Warranty Disclaimers next to the notice
+     which states that this License applies to the Document.  These
+     Warranty Disclaimers are considered to be included by reference in
+     this License, but only as regards disclaiming warranties: any other
+     implication that these Warranty Disclaimers may have is void and
+     has no effect on the meaning of this License.
+
+  2. VERBATIM COPYING
+
+     You may copy and distribute the Document in any medium, either
+     commercially or noncommercially, provided that this License, the
+     copyright notices, and the license notice saying this License
+     applies to the Document are reproduced in all copies, and that you
+     add no other conditions whatsoever to those of this License.  You
+     may not use technical measures to obstruct or control the reading
+     or further copying of the copies you make or distribute.  However,
+     you may accept compensation in exchange for copies.  If you
+     distribute a large enough number of copies you must also follow the
+     conditions in section 3.
+
+     You may also lend copies, under the same conditions stated above,
+     and you may publicly display copies.
+
+  3. COPYING IN QUANTITY
+
+     If you publish printed copies (or copies in media that commonly
+     have printed covers) of the Document, numbering more than 100, and
+     the Document's license notice requires Cover Texts, you must
+     enclose the copies in covers that carry, clearly and legibly, all
+     these Cover Texts: Front-Cover Texts on the front cover, and
+     Back-Cover Texts on the back cover.  Both covers must also clearly
+     and legibly identify you as the publisher of these copies.  The
+     front cover must present the full title with all words of the title
+     equally prominent and visible.  You may add other material on the
+     covers in addition.  Copying with changes limited to the covers, as
+     long as they preserve the title of the Document and satisfy these
+     conditions, can be treated as verbatim copying in other respects.
+
+     If the required texts for either cover are too voluminous to fit
+     legibly, you should put the first ones listed (as many as fit
+     reasonably) on the actual cover, and continue the rest onto
+     adjacent pages.
+
+     If you publish or distribute Opaque copies of the Document
+     numbering more than 100, you must either include a machine-readable
+     Transparent copy along with each Opaque copy, or state in or with
+     each Opaque copy a computer-network location from which the general
+     network-using public has access to download using public-standard
+     network protocols a complete Transparent copy of the Document, free
+     of added material.  If you use the latter option, you must take
+     reasonably prudent steps, when you begin distribution of Opaque
+     copies in quantity, to ensure that this Transparent copy will
+     remain thus accessible at the stated location until at least one
+     year after the last time you distribute an Opaque copy (directly or
+     through your agents or retailers) of that edition to the public.
+
+     It is requested, but not required, that you contact the authors of
+     the Document well before redistributing any large number of copies,
+     to give them a chance to provide you with an updated version of the
+     Document.
+
+  4. MODIFICATIONS
+
+     You may copy and distribute a Modified Version of the Document
+     under the conditions of sections 2 and 3 above, provided that you
+     release the Modified Version under precisely this License, with the
+     Modified Version filling the role of the Document, thus licensing
+     distribution and modification of the Modified Version to whoever
+     possesses a copy of it.  In addition, you must do these things in
+     the Modified Version:
+
+       A. Use in the Title Page (and on the covers, if any) a title
+          distinct from that of the Document, and from those of previous
+          versions (which should, if there were any, be listed in the
+          History section of the Document).  You may use the same title
+          as a previous version if the original publisher of that
+          version gives permission.
+
+       B. List on the Title Page, as authors, one or more persons or
+          entities responsible for authorship of the modifications in
+          the Modified Version, together with at least five of the
+          principal authors of the Document (all of its principal
+          authors, if it has fewer than five), unless they release you
+          from this requirement.
+
+       C. State on the Title page the name of the publisher of the
+          Modified Version, as the publisher.
+
+       D. Preserve all the copyright notices of the Document.
+
+       E. Add an appropriate copyright notice for your modifications
+          adjacent to the other copyright notices.
+
+       F. Include, immediately after the copyright notices, a license
+          notice giving the public permission to use the Modified
+          Version under the terms of this License, in the form shown in
+          the Addendum below.
+
+       G. Preserve in that license notice the full lists of Invariant
+          Sections and required Cover Texts given in the Document's
+          license notice.
+
+       H. Include an unaltered copy of this License.
+
+       I. Preserve the section Entitled "History", Preserve its Title,
+          and add to it an item stating at least the title, year, new
+          authors, and publisher of the Modified Version as given on the
+          Title Page.  If there is no section Entitled "History" in the
+          Document, create one stating the title, year, authors, and
+          publisher of the Document as given on its Title Page, then add
+          an item describing the Modified Version as stated in the
+          previous sentence.
+
+       J. Preserve the network location, if any, given in the Document
+          for public access to a Transparent copy of the Document, and
+          likewise the network locations given in the Document for
+          previous versions it was based on.  These may be placed in the
+          "History" section.  You may omit a network location for a work
+          that was published at least four years before the Document
+          itself, or if the original publisher of the version it refers
+          to gives permission.
+
+       K. For any section Entitled "Acknowledgements" or "Dedications",
+          Preserve the Title of the section, and preserve in the section
+          all the substance and tone of each of the contributor
+          acknowledgements and/or dedications given therein.
+
+       L. Preserve all the Invariant Sections of the Document, unaltered
+          in their text and in their titles.  Section numbers or the
+          equivalent are not considered part of the section titles.
+
+       M. Delete any section Entitled "Endorsements".  Such a section
+          may not be included in the Modified Version.
+
+       N. Do not retitle any existing section to be Entitled
+          "Endorsements" or to conflict in title with any Invariant
+          Section.
+
+       O. Preserve any Warranty Disclaimers.
+
+     If the Modified Version includes new front-matter sections or
+     appendices that qualify as Secondary Sections and contain no
+     material copied from the Document, you may at your option designate
+     some or all of these sections as invariant.  To do this, add their
+     titles to the list of Invariant Sections in the Modified Version's
+     license notice.  These titles must be distinct from any other
+     section titles.
+
+     You may add a section Entitled "Endorsements", provided it contains
+     nothing but endorsements of your Modified Version by various
+     parties--for example, statements of peer review or that the text
+     has been approved by an organization as the authoritative
+     definition of a standard.
+
+     You may add a passage of up to five words as a Front-Cover Text,
+     and a passage of up to 25 words as a Back-Cover Text, to the end of
+     the list of Cover Texts in the Modified Version.  Only one passage
+     of Front-Cover Text and one of Back-Cover Text may be added by (or
+     through arrangements made by) any one entity.  If the Document
+     already includes a cover text for the same cover, previously added
+     by you or by arrangement made by the same entity you are acting on
+     behalf of, you may not add another; but you may replace the old
+     one, on explicit permission from the previous publisher that added
+     the old one.
+
+     The author(s) and publisher(s) of the Document do not by this
+     License give permission to use their names for publicity for or to
+     assert or imply endorsement of any Modified Version.
+
+  5. COMBINING DOCUMENTS
+
+     You may combine the Document with other documents released under
+     this License, under the terms defined in section 4 above for
+     modified versions, provided that you include in the combination all
+     of the Invariant Sections of all of the original documents,
+     unmodified, and list them all as Invariant Sections of your
+     combined work in its license notice, and that you preserve all
+     their Warranty Disclaimers.
+
+     The combined work need only contain one copy of this License, and
+     multiple identical Invariant Sections may be replaced with a single
+     copy.  If there are multiple Invariant Sections with the same name
+     but different contents, make the title of each such section unique
+     by adding at the end of it, in parentheses, the name of the
+     original author or publisher of that section if known, or else a
+     unique number.  Make the same adjustment to the section titles in
+     the list of Invariant Sections in the license notice of the
+     combined work.
+
+     In the combination, you must combine any sections Entitled
+     "History" in the various original documents, forming one section
+     Entitled "History"; likewise combine any sections Entitled
+     "Acknowledgements", and any sections Entitled "Dedications".  You
+     must delete all sections Entitled "Endorsements."
+
+  6. COLLECTIONS OF DOCUMENTS
+
+     You may make a collection consisting of the Document and other
+     documents released under this License, and replace the individual
+     copies of this License in the various documents with a single copy
+     that is included in the collection, provided that you follow the
+     rules of this License for verbatim copying of each of the documents
+     in all other respects.
+
+     You may extract a single document from such a collection, and
+     distribute it individually under this License, provided you insert
+     a copy of this License into the extracted document, and follow this
+     License in all other respects regarding verbatim copying of that
+     document.
+
+  7. AGGREGATION WITH INDEPENDENT WORKS
+
+     A compilation of the Document or its derivatives with other
+     separate and independent documents or works, in or on a volume of a
+     storage or distribution medium, is called an "aggregate" if the
+     copyright resulting from the compilation is not used to limit the
+     legal rights of the compilation's users beyond what the individual
+     works permit.  When the Document is included in an aggregate, this
+     License does not apply to the other works in the aggregate which
+     are not themselves derivative works of the Document.
+
+     If the Cover Text requirement of section 3 is applicable to these
+     copies of the Document, then if the Document is less than one half
+     of the entire aggregate, the Document's Cover Texts may be placed
+     on covers that bracket the Document within the aggregate, or the
+     electronic equivalent of covers if the Document is in electronic
+     form.  Otherwise they must appear on printed covers that bracket
+     the whole aggregate.
+
+  8. TRANSLATION
+
+     Translation is considered a kind of modification, so you may
+     distribute translations of the Document under the terms of section
+     4.  Replacing Invariant Sections with translations requires special
+     permission from their copyright holders, but you may include
+     translations of some or all Invariant Sections in addition to the
+     original versions of these Invariant Sections.  You may include a
+     translation of this License, and all the license notices in the
+     Document, and any Warranty Disclaimers, provided that you also
+     include the original English version of this License and the
+     original versions of those notices and disclaimers.  In case of a
+     disagreement between the translation and the original version of
+     this License or a notice or disclaimer, the original version will
+     prevail.
+
+     If a section in the Document is Entitled "Acknowledgements",
+     "Dedications", or "History", the requirement (section 4) to
+     Preserve its Title (section 1) will typically require changing the
+     actual title.
+
+  9. TERMINATION
+
+     You may not copy, modify, sublicense, or distribute the Document
+     except as expressly provided under this License.  Any attempt
+     otherwise to copy, modify, sublicense, or distribute it is void,
+     and will automatically terminate your rights under this License.
+
+     However, if you cease all violation of this License, then your
+     license from a particular copyright holder is reinstated (a)
+     provisionally, unless and until the copyright holder explicitly and
+     finally terminates your license, and (b) permanently, if the
+     copyright holder fails to notify you of the violation by some
+     reasonable means prior to 60 days after the cessation.
+
+     Moreover, your license from a particular copyright holder is
+     reinstated permanently if the copyright holder notifies you of the
+     violation by some reasonable means, this is the first time you have
+     received notice of violation of this License (for any work) from
+     that copyright holder, and you cure the violation prior to 30 days
+     after your receipt of the notice.
+
+     Termination of your rights under this section does not terminate
+     the licenses of parties who have received copies or rights from you
+     under this License.  If your rights have been terminated and not
+     permanently reinstated, receipt of a copy of some or all of the
+     same material does not give you any rights to use it.
+
+  10. FUTURE REVISIONS OF THIS LICENSE
+
+     The Free Software Foundation may publish new, revised versions of
+     the GNU Free Documentation License from time to time.  Such new
+     versions will be similar in spirit to the present version, but may
+     differ in detail to address new problems or concerns.  See
+     <http://www.gnu.org/copyleft/>.
+
+     Each version of the License is given a distinguishing version
+     number.  If the Document specifies that a particular numbered
+     version of this License "or any later version" applies to it, you
+     have the option of following the terms and conditions either of
+     that specified version or of any later version that has been
+     published (not as a draft) by the Free Software Foundation.  If the
+     Document does not specify a version number of this License, you may
+     choose any version ever published (not as a draft) by the Free
+     Software Foundation.  If the Document specifies that a proxy can
+     decide which future versions of this License can be used, that
+     proxy's public statement of acceptance of a version permanently
+     authorizes you to choose that version for the Document.
+
+  11. RELICENSING
+
+     "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
+     World Wide Web server that publishes copyrightable works and also
+     provides prominent facilities for anybody to edit those works.  A
+     public wiki that anybody can edit is an example of such a server.
+     A "Massive Multiauthor Collaboration" (or "MMC") contained in the
+     site means any set of copyrightable works thus published on the MMC
+     site.
+
+     "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
+     license published by Creative Commons Corporation, a not-for-profit
+     corporation with a principal place of business in San Francisco,
+     California, as well as future copyleft versions of that license
+     published by that same organization.
+
+     "Incorporate" means to publish or republish a Document, in whole or
+     in part, as part of another Document.
+
+     An MMC is "eligible for relicensing" if it is licensed under this
+     License, and if all works that were first published under this
+     License somewhere other than this MMC, and subsequently
+     incorporated in whole or in part into the MMC, (1) had no cover
+     texts or invariant sections, and (2) were thus incorporated prior
+     to November 1, 2008.
+
+     The operator of an MMC Site may republish an MMC contained in the
+     site under CC-BY-SA on the same site at any time before August 1,
+     2009, provided the MMC is eligible for relicensing.
+
+ADDENDUM: How to use this License for your documents
+====================================================
+
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and license
+notices just after the title page:
+
+       Copyright (C)  YEAR  YOUR NAME.
+       Permission is granted to copy, distribute and/or modify this document
+       under the terms of the GNU Free Documentation License, Version 1.3
+       or any later version published by the Free Software Foundation;
+       with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+       Texts.  A copy of the license is included in the section entitled ``GNU
+       Free Documentation License''.
+
+   If you have Invariant Sections, Front-Cover Texts and Back-Cover
+Texts, replace the "with...Texts."  line with this:
+
+         with the Invariant Sections being LIST THEIR TITLES, with
+         the Front-Cover Texts being LIST, and with the Back-Cover Texts
+         being LIST.
+
+   If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+   If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of free
+software license, such as the GNU General Public License, to permit
+their use in free software.
+
+
+File: gawkinet.info,  Node: Index,  Prev: GNU Free Documentation License,  Up: Top
+
+Index
+*****
+
+
+* Menu:
+
+* /inet/ files (gawk):                   Gawk Special Files.  (line  34)
+* /inet/tcp special files (gawk):        File /inet/tcp.      (line   6)
+* /inet/udp special files (gawk):        File /inet/udp.      (line   6)
+* | (vertical bar), |& operator (I/O):   TCP Connecting.      (line  25)
+* advanced features, network connections: Troubleshooting.    (line   6)
+* agent:                                 Challenges.          (line  75)
+* agent <1>:                             MOBAGWHO.            (line   6)
+* AI:                                    Challenges.          (line  75)
+* apache:                                WEBGRAB.             (line  72)
+* apache <1>:                            MOBAGWHO.            (line  42)
+* Bioinformatics:                        PROTBASE.            (line 227)
+* BLAST, Basic Local Alignment Search Tool: PROTBASE.         (line   6)
+* blocking:                              Making Connections.  (line  35)
+* Boutell, Thomas:                       STATIST.             (line   6)
+* CGI (Common Gateway Interface):        MOBAGWHO.            (line  42)
+* CGI (Common Gateway Interface), dynamic web pages and: Web page.
+                                                              (line  45)
+* CGI (Common Gateway Interface), library: CGI Lib.           (line  11)
+* clients:                               Making Connections.  (line  21)
+* Clinton, Bill:                         Challenges.          (line  58)
+* Common Gateway Interface, See CGI:     Web page.            (line  45)
+* Computational Biology:                 PROTBASE.            (line 227)
+* contest:                               Challenges.          (line   6)
+* cron utility:                          STOXPRED.            (line  23)
+* CSV format:                            STOXPRED.            (line 128)
+* Dow Jones Industrial Index:            STOXPRED.            (line  44)
+* ELIZA program:                         Simple Server.       (line  11)
+* ELIZA program <1>:                     Simple Server.       (line 178)
+* email:                                 Email.               (line  11)
+* FASTA/Pearson format:                  PROTBASE.            (line 102)
+* FDL (Free Documentation License):      GNU Free Documentation License.
+                                                              (line   6)
+* filenames, for network access:         Gawk Special Files.  (line  29)
+* files, /inet/ (gawk):                  Gawk Special Files.  (line  34)
+* files, /inet/tcp (gawk):               File /inet/tcp.      (line   6)
+* files, /inet/udp (gawk):               File /inet/udp.      (line   6)
+* finger utility:                        Setting Up.          (line  22)
+* Free Documentation License (FDL):      GNU Free Documentation License.
+                                                              (line   6)
+* FTP (File Transfer Protocol):          Basic Protocols.     (line  45)
+* gawk, networking:                      Using Networking.    (line   6)
+* gawk, networking, connections:         Special File Fields. (line  53)
+* gawk, networking, connections <1>:     TCP Connecting.      (line   6)
+* gawk, networking, filenames:           Gawk Special Files.  (line  29)
+* gawk, networking, See Also email:      Email.               (line   6)
+* gawk, networking, service, establishing: Setting Up.        (line   6)
+* gawk, networking, troubleshooting:     Caveats.             (line   6)
+* gawk, web and, See web service:        Interacting Service. (line   6)
+* getline command:                       TCP Connecting.      (line  11)
+* GETURL program:                        GETURL.              (line   6)
+* GIF image format:                      Web page.            (line  45)
+* GIF image format <1>:                  STATIST.             (line   6)
+* GNU Free Documentation License:        GNU Free Documentation License.
+                                                              (line   6)
+* GNU/Linux:                             Troubleshooting.     (line  54)
+* GNU/Linux <1>:                         Interacting.         (line  27)
+* GNU/Linux <2>:                         REMCONF.             (line   6)
+* GNUPlot utility:                       Interacting Service. (line 189)
+* GNUPlot utility <1>:                   STATIST.             (line   6)
+* Hoare, C.A.R.:                         MOBAGWHO.            (line   6)
+* Hoare, C.A.R. <1>:                     PROTBASE.            (line   6)
+* hostname field:                        Special File Fields. (line  34)
+* HTML (Hypertext Markup Language):      Web page.            (line  29)
+* HTTP (Hypertext Transfer Protocol):    Basic Protocols.     (line  45)
+* HTTP (Hypertext Transfer Protocol) <1>: Web page.           (line   6)
+* HTTP (Hypertext Transfer Protocol), record separators and: Web page.
+                                                              (line  29)
+* HTTP server, core logic:               Interacting Service. (line   6)
+* HTTP server, core logic <1>:           Interacting Service. (line  24)
+* Humphrys, Mark:                        Simple Server.       (line 178)
+* Hypertext Markup Language (HTML):      Web page.            (line  29)
+* Hypertext Transfer Protocol, See HTTP: Web page.            (line   6)
+* image format:                          STATIST.             (line   6)
+* images, in web pages:                  Interacting Service. (line 189)
+* images, retrieving over networks:      Web page.            (line  45)
+* input/output, two-way, See Also gawk, networking: Gawk Special Files.
+                                                              (line  19)
+* Internet, See networks:                Interacting.         (line  48)
+* JavaScript:                            STATIST.             (line  56)
+* Linux:                                 Troubleshooting.     (line  54)
+* Linux <1>:                             Interacting.         (line  27)
+* Linux <2>:                             REMCONF.             (line   6)
+* Lisp:                                  MOBAGWHO.            (line  98)
+* localport field:                       Gawk Special Files.  (line  34)
+* Loebner, Hugh:                         Challenges.          (line   6)
+* Loui, Ronald:                          Challenges.          (line  75)
+* MAZE:                                  MAZE.                (line   6)
+* Microsoft Windows:                     WEBGRAB.             (line  43)
+* Microsoft Windows, networking:         Troubleshooting.     (line  54)
+* Microsoft Windows, networking, ports:  Setting Up.          (line  37)
+* MiniSQL:                               REMCONF.             (line 109)
+* MOBAGWHO program:                      MOBAGWHO.            (line   6)
+* NCBI, National Center for Biotechnology Information: PROTBASE.
+                                                              (line   6)
+* network type field:                    Special File Fields. (line  11)
+* networks, gawk and:                    Using Networking.    (line   6)
+* networks, gawk and, connections:       Special File Fields. (line  53)
+* networks, gawk and, connections <1>:   TCP Connecting.      (line   6)
+* networks, gawk and, filenames:         Gawk Special Files.  (line  29)
+* networks, gawk and, See Also email:    Email.               (line   6)
+* networks, gawk and, service, establishing: Setting Up.      (line   6)
+* networks, gawk and, troubleshooting:   Caveats.             (line   6)
+* networks, ports, reserved:             Setting Up.          (line  37)
+* networks, ports, specifying:           Special File Fields. (line  24)
+* networks, See Also web pages:          PANIC.               (line   6)
+* Numerical Recipes:                     STATIST.             (line  24)
+* ORS variable, HTTP and:                Web page.            (line  29)
+* ORS variable, POP and:                 Email.               (line  36)
+* PANIC program:                         PANIC.               (line   6)
+* Perl:                                  Using Networking.    (line  14)
+* Perl, gawk networking and:             Using Networking.    (line  24)
+* Perlis, Alan:                          MAZE.                (line   6)
+* pipes, networking and:                 TCP Connecting.      (line  30)
+* PNG image format:                      Web page.            (line  45)
+* PNG image format <1>:                  STATIST.             (line   6)
+* POP (Post Office Protocol):            Email.               (line   6)
+* POP (Post Office Protocol) <1>:        Email.               (line  36)
+* Post Office Protocol (POP):            Email.               (line   6)
+* PostScript:                            STATIST.             (line 138)
+* PROLOG:                                Challenges.          (line  75)
+* PROTBASE:                              PROTBASE.            (line   6)
+* protocol field:                        Special File Fields. (line  17)
+* PS image format:                       STATIST.             (line   6)
+* Python:                                Using Networking.    (line  14)
+* Python, gawk networking and:           Using Networking.    (line  24)
+* record separators, HTTP and:           Web page.            (line  29)
+* record separators, POP and:            Email.               (line  36)
+* REMCONF program:                       REMCONF.             (line   6)
+* remoteport field:                      Gawk Special Files.  (line  34)
+* RFC 1939:                              Email.               (line   6)
+* RFC 1939 <1>:                          Email.               (line  36)
+* RFC 1945:                              Web page.            (line  29)
+* RFC 2068:                              Web page.            (line   6)
+* RFC 2068 <1>:                          Interacting Service. (line 104)
+* RFC 2616:                              Web page.            (line   6)
+* RFC 821:                               Email.               (line   6)
+* robot:                                 Challenges.          (line  84)
+* robot <1>:                             WEBGRAB.             (line   6)
+* RS variable, HTTP and:                 Web page.            (line  29)
+* RS variable, POP and:                  Email.               (line  36)
+* servers:                               Making Connections.  (line  14)
+* servers <1>:                           Setting Up.          (line  22)
+* servers, as hosts:                     Special File Fields. (line  34)
+* servers, HTTP:                         Interacting Service. (line   6)
+* servers, web:                          Simple Server.       (line   6)
+* Simple Mail Transfer Protocol (SMTP):  Email.               (line   6)
+* SMTP (Simple Mail Transfer Protocol):  Basic Protocols.     (line  45)
+* SMTP (Simple Mail Transfer Protocol) <1>: Email.            (line   6)
+* STATIST program:                       STATIST.             (line   6)
+* STOXPRED program:                      STOXPRED.            (line   6)
+* synchronous communications:            Making Connections.  (line  35)
+* Tcl/Tk:                                Using Networking.    (line  14)
+* Tcl/Tk, gawk and:                      Using Networking.    (line  24)
+* Tcl/Tk, gawk and <1>:                  Some Applications and Techniques.
+                                                              (line  22)
+* TCP (Transmission Control Protocol):   Using Networking.    (line  29)
+* TCP (Transmission Control Protocol) <1>: File /inet/tcp.    (line   6)
+* TCP (Transmission Control Protocol), connection, establishing: TCP Connecting.
+                                                              (line   6)
+* TCP (Transmission Control Protocol), UDP and: Interacting.  (line  48)
+* TCP/IP, network type, selecting:       Special File Fields. (line  11)
+* TCP/IP, protocols, selecting:          Special File Fields. (line  17)
+* TCP/IP, sockets and:                   Gawk Special Files.  (line  19)
+* Transmission Control Protocol, See TCP: Using Networking.   (line  29)
+* troubleshooting, gawk, networks:       Caveats.             (line   6)
+* troubleshooting, networks, connections: Troubleshooting.    (line   6)
+* troubleshooting, networks, timeouts:   Caveats.             (line  18)
+* UDP (User Datagram Protocol):          File /inet/udp.      (line   6)
+* UDP (User Datagram Protocol), TCP and: Interacting.         (line  48)
+* Unix, network ports and:               Setting Up.          (line  37)
+* URLCHK program:                        URLCHK.              (line   6)
+* User Datagram Protocol, See UDP:       File /inet/udp.      (line   6)
+* vertical bar (|), |& operator (I/O):   TCP Connecting.      (line  25)
+* VRML:                                  MAZE.                (line   6)
+* web browsers, See web service:         Interacting Service. (line   6)
+* web pages:                             Web page.            (line   6)
+* web pages, images in:                  Interacting Service. (line 189)
+* web pages, retrieving:                 GETURL.              (line   6)
+* web servers:                           Simple Server.       (line   6)
+* web service:                           Primitive Service.   (line   6)
+* web service <1>:                       PANIC.               (line   6)
+* WEBGRAB program:                       WEBGRAB.             (line   6)
+* Weizenbaum, Joseph:                    Simple Server.       (line  11)
+* XBM image format:                      Interacting Service. (line 189)
+* Yahoo!:                                REMCONF.             (line   6)
+* Yahoo! <1>:                            STOXPRED.            (line   6)
+
+
+
+Tag Table:
+Node: Top2022
+Node: Preface5665
+Node: Introduction7040
+Node: Stream Communications8066
+Node: Datagram Communications9240
+Node: The TCP/IP Protocols10870
+Ref: The TCP/IP Protocols-Footnote-111554
+Node: Basic Protocols11711
+Ref: Basic Protocols-Footnote-113756
+Node: Ports13785
+Node: Making Connections15192
+Ref: Making Connections-Footnote-117750
+Ref: Making Connections-Footnote-217797
+Node: Using Networking17978
+Node: Gawk Special Files20301
+Node: Special File Fields22110
+Ref: table-inet-components26003
+Node: Comparing Protocols27312
+Node: File /inet/tcp27846
+Node: File /inet/udp28874
+Ref: File /inet/udp-Footnote-130573
+Node: TCP Connecting30827
+Node: Troubleshooting33173
+Ref: Troubleshooting-Footnote-136232
+Node: Interacting36805
+Node: Setting Up39545
+Node: Email43048
+Node: Web page45380
+Ref: Web page-Footnote-148197
+Node: Primitive Service48395
+Node: Interacting Service51136
+Ref: Interacting Service-Footnote-160303
+Node: CGI Lib60335
+Node: Simple Server67310
+Ref: Simple Server-Footnote-175053
+Node: Caveats75154
+Node: Challenges76299
+Node: Some Applications and Techniques84997
+Node: PANIC87462
+Node: GETURL89186
+Node: REMCONF91819
+Node: URLCHK97314
+Node: WEBGRAB101166
+Node: STATIST105628
+Ref: STATIST-Footnote-1117377
+Node: MAZE117822
+Node: MOBAGWHO124029
+Ref: MOBAGWHO-Footnote-1138047
+Node: STOXPRED138102
+Node: PROTBASE152390
+Node: Links165506
+Node: GNU Free Documentation License168939
+Node: Index194059
+
+End Tag Table
author	Arnold D. Robbins <arnold@skeeve.com>	2016-10-26 21:49:00 +0300
committer	Arnold D. Robbins <arnold@skeeve.com>	2016-10-26 21:49:00 +0300
commit	e5abd6a16d42fc0f42277919a2d0a2c28476788c (patch)
tree	c620f8fd78ec318c7614efeb485c324be23689b7 /doc/gawkinet.info
parent	ff7329df359352a8f62a21bbb9fea47cba29b589 (diff)
download	egawk-e5abd6a16d42fc0f42277919a2d0a2c28476788c.tar.gz egawk-e5abd6a16d42fc0f42277919a2d0a2c28476788c.tar.bz2 egawk-e5abd6a16d42fc0f42277919a2d0a2c28476788c.zip