path: root/doc/gawkinet.info
author    Arnold D. Robbins <arnold@skeeve.com>    2016-10-25 21:38:59 +0300
committer Arnold D. Robbins <arnold@skeeve.com>    2016-10-25 21:38:59 +0300
commit    8231da563c810ce210ce309ee1a022bad22a1e13 (patch)
tree      857e7756692c2db6ec510a3fc67433b18e0b234f /doc/gawkinet.info
parent    a90f46df6a98818c99abfe4c4e0b738cb845294e (diff)
Remove info files from repo. No need to keep updating them.
Diffstat (limited to 'doc/gawkinet.info')
-rw-r--r--  doc/gawkinet.info  4406
1 file changed, 0 insertions, 4406 deletions
diff --git a/doc/gawkinet.info b/doc/gawkinet.info
deleted file mode 100644
index d5a7abf8..00000000
--- a/doc/gawkinet.info
+++ /dev/null
@@ -1,4406 +0,0 @@
-This is gawkinet.info, produced by makeinfo version 6.1 from
-gawkinet.texi.
-
-This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
-4.1.4 (or later) version of the GNU implementation of AWK.
-
-
- Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
-Foundation, Inc.
-
-
- Permission is granted to copy, distribute and/or modify this document
-under the terms of the GNU Free Documentation License, Version 1.3 or
-any later version published by the Free Software Foundation; with the
-Invariant Sections being "GNU General Public License", the Front-Cover
-texts being (a) (see below), and with the Back-Cover Texts being (b)
-(see below). A copy of the license is included in the section entitled
-"GNU Free Documentation License".
-
- a. "A GNU Manual"
-
- b. "You have the freedom to copy and modify this GNU manual. Buying
- copies from the FSF supports it in developing GNU and promoting
- software freedom."
-INFO-DIR-SECTION Network applications
-START-INFO-DIR-ENTRY
-* Gawkinet: (gawkinet). TCP/IP Internetworking With 'gawk'.
-END-INFO-DIR-ENTRY
-
- This file documents the networking features in GNU 'awk'.
-
- This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
-4.1.4 (or later) version of the GNU implementation of AWK.
-
-
- Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
-Foundation, Inc.
-
-
- Permission is granted to copy, distribute and/or modify this document
-under the terms of the GNU Free Documentation License, Version 1.3 or
-any later version published by the Free Software Foundation; with the
-Invariant Sections being "GNU General Public License", the Front-Cover
-texts being (a) (see below), and with the Back-Cover Texts being (b)
-(see below). A copy of the license is included in the section entitled
-"GNU Free Documentation License".
-
- a. "A GNU Manual"
-
- b. "You have the freedom to copy and modify this GNU manual. Buying
- copies from the FSF supports it in developing GNU and promoting
- software freedom."
-
-
-File: gawkinet.info, Node: Top, Next: Preface, Prev: (dir), Up: (dir)
-
-General Introduction
-********************
-
-This file documents the networking features in GNU Awk ('gawk') version
-4.0 and later.
-
- This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
-4.1.4 (or later) version of the GNU implementation of AWK.
-
-
- Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
-Foundation, Inc.
-
-
- Permission is granted to copy, distribute and/or modify this document
-under the terms of the GNU Free Documentation License, Version 1.3 or
-any later version published by the Free Software Foundation; with the
-Invariant Sections being "GNU General Public License", the Front-Cover
-texts being (a) (see below), and with the Back-Cover Texts being (b)
-(see below). A copy of the license is included in the section entitled
-"GNU Free Documentation License".
-
- a. "A GNU Manual"
-
- b. "You have the freedom to copy and modify this GNU manual. Buying
- copies from the FSF supports it in developing GNU and promoting
- software freedom."
-
-* Menu:
-
-* Preface:: About this document.
-* Introduction:: About networking.
-* Using Networking:: Some examples.
-* Some Applications and Techniques:: More extended examples.
-* Links:: Where to find the stuff mentioned in this
- document.
-* GNU Free Documentation License:: The license for this document.
-* Index:: The index.
-
-* Stream Communications:: Sending data streams.
-* Datagram Communications:: Sending self-contained messages.
-* The TCP/IP Protocols:: How these models work in the Internet.
-* Basic Protocols:: The basic protocols.
-* Ports:: The idea behind ports.
-* Making Connections:: Making TCP/IP connections.
-* Gawk Special Files:: How to do 'gawk' networking.
-* Special File Fields:: The fields in the special file name.
-* Comparing Protocols:: Differences between the protocols.
-* File /inet/tcp:: The TCP special file.
-* File /inet/udp:: The UDP special file.
-* TCP Connecting:: Making a TCP connection.
-* Troubleshooting:: Troubleshooting TCP/IP connections.
-* Interacting:: Interacting with a service.
-* Setting Up:: Setting up a service.
-* Email:: Reading email.
-* Web page:: Reading a Web page.
-* Primitive Service:: A primitive Web service.
-* Interacting Service:: A Web service with interaction.
-* CGI Lib:: A simple CGI library.
-* Simple Server:: A simple Web server.
-* Caveats:: Network programming caveats.
-* Challenges:: Where to go from here.
-* PANIC:: An Emergency Web Server.
-* GETURL:: Retrieving Web Pages.
-* REMCONF:: Remote Configuration Of Embedded Systems.
-* URLCHK:: Look For Changed Web Pages.
-* WEBGRAB:: Extract Links From A Page.
-* STATIST:: Graphing A Statistical Distribution.
-* MAZE:: Walking Through A Maze In Virtual Reality.
-* MOBAGWHO:: A Simple Mobile Agent.
-* STOXPRED:: Stock Market Prediction As A Service.
-* PROTBASE:: Searching Through A Protein Database.
-
-
-File: gawkinet.info, Node: Preface, Next: Introduction, Prev: Top, Up: Top
-
-Preface
-*******
-
-In May of 1997, Jürgen Kahrs felt the need for network access from
-'awk', and, with a little help from me, set about adding features to do
-this for 'gawk'. At that time, he wrote the bulk of this Info file.
-
- The code and documentation were added to the 'gawk' 3.1 development
-tree, and languished somewhat until I could finally get down to some
-serious work on that version of 'gawk'. This finally happened in the
-middle of 2000.
-
-   Meantime, Jürgen wrote an article about the Internet special files
-and '|&' operator for 'Linux Journal', and made a networking patch for
-the production versions of 'gawk' available from his home page. In
-August of 2000 (for 'gawk' 3.0.6), this patch also made it to the main
-GNU 'ftp' distribution site.
-
-   For release with 'gawk', I edited Jürgen's prose for English grammar
-and style, as he is not a native English speaker. I also rearranged the
-material somewhat for what I felt was a better order of presentation,
-and (re)wrote some of the introductory material.
-
- The majority of this document and the code are his work, and the high
-quality and interesting ideas speak for themselves. It is my hope that
-these features will be of significant value to the 'awk' community.
-
-
-Arnold Robbins
-Nof Ayalon, ISRAEL
-March, 2001
-
-
-File: gawkinet.info, Node: Introduction, Next: Using Networking, Prev: Preface, Up: Top
-
-1 Networking Concepts
-*********************
-
-This major node provides a (necessarily) brief introduction to computer
-networking concepts. For many applications of 'gawk' to TCP/IP
-networking, we hope that this is enough. For more advanced tasks, you
-will need deeper background, and it may be necessary to switch to
-lower-level programming in C or C++.
-
- There are two real-life models for the way computers send messages to
-each other over a network. While the analogies are not perfect, they
-are close enough to convey the major concepts. These two models are the
-phone system (reliable byte-stream communications), and the postal
-system (best-effort datagrams).
-
-* Menu:
-
-* Stream Communications:: Sending data streams.
-* Datagram Communications:: Sending self-contained messages.
-* The TCP/IP Protocols:: How these models work in the Internet.
-* Making Connections:: Making TCP/IP connections.
-
-
-File: gawkinet.info, Node: Stream Communications, Next: Datagram Communications, Prev: Introduction, Up: Introduction
-
-1.1 Reliable Byte-streams (Phone Calls)
-=======================================
-
-When you make a phone call, the following steps occur:
-
- 1. You dial a number.
-
- 2. The phone system connects to the called party, telling them there
- is an incoming call. (Their phone rings.)
-
- 3. The other party answers the call, or, in the case of a computer
- network, refuses to answer the call.
-
- 4. Assuming the other party answers, the connection between you is now
- a "duplex" (two-way), "reliable" (no data lost), sequenced (data
- comes out in the order sent) data stream.
-
- 5. You and your friend may now talk freely, with the phone system
- moving the data (your voices) from one end to the other. From your
- point of view, you have a direct end-to-end connection with the
- person on the other end.
-
- The same steps occur in a duplex reliable computer networking
-connection. There is considerably more overhead in setting up the
-communications, but once it's done, data moves in both directions,
-reliably, in sequence.
-
-
-File: gawkinet.info, Node: Datagram Communications, Next: The TCP/IP Protocols, Prev: Stream Communications, Up: Introduction
-
-1.2 Best-effort Datagrams (Mailed Letters)
-==========================================
-
-Suppose you mail three different documents to your office on the other
-side of the country on two different days. Doing so entails the
-following.
-
- 1. Each document travels in its own envelope.
-
- 2. Each envelope contains both the sender and the recipient address.
-
- 3. Each envelope may travel a different route to its destination.
-
- 4. The envelopes may arrive in a different order from the one in which
- they were sent.
-
- 5. One or more may get lost in the mail. (Although, fortunately, this
- does not occur very often.)
-
- 6. In a computer network, one or more "packets" may also arrive
- multiple times. (This doesn't happen with the postal system!)
-
- The important characteristics of datagram communications, like those
-of the postal system, are thus:
-
- * Delivery is "best effort;" the data may never get there.
-
- * Each message is self-contained, including the source and
- destination addresses.
-
- * Delivery is _not_ sequenced; packets may arrive out of order,
- and/or multiple times.
-
- * Unlike the phone system, overhead is considerably lower. It is not
- necessary to set up the call first.
-
- The price the user pays for the lower overhead of datagram
-communications is exactly the lower reliability; it is often necessary
-for user-level protocols that use datagram communications to add their
-own reliability features on top of the basic communications.
-
-
-File: gawkinet.info, Node: The TCP/IP Protocols, Next: Making Connections, Prev: Datagram Communications, Up: Introduction
-
-1.3 The Internet Protocols
-==========================
-
-The Internet Protocol Suite (usually referred to as just TCP/IP)(1)
-consists of a number of different protocols at different levels or
-"layers." For our purposes, three protocols provide the fundamental
-communications mechanisms. All other defined protocols are referred to
-as user-level protocols (e.g., HTTP, used later in this Info file).
-
-* Menu:
-
-* Basic Protocols:: The basic protocols.
-* Ports:: The idea behind ports.
-
- ---------- Footnotes ----------
-
- (1) It should be noted that although the Internet seems to have
-conquered the world, there are other networking protocol suites in
-existence and in use.
-
-
-File: gawkinet.info, Node: Basic Protocols, Next: Ports, Prev: The TCP/IP Protocols, Up: The TCP/IP Protocols
-
-1.3.1 The Basic Internet Protocols
-----------------------------------
-
-IP
- The Internet Protocol. This protocol is almost never used directly
- by applications. It provides the basic packet delivery and routing
- infrastructure of the Internet. Much like the phone company's
- switching centers or the Post Office's trucks, it is not of much
- day-to-day interest to the regular user (or programmer). It
- happens to be a best effort datagram protocol. In the early
- twenty-first century, there are two versions of this protocol in
- use:
-
- IPv4
- The original version of the Internet Protocol, with 32-bit
- addresses, on which most of the current Internet is based.
-
- IPv6
- The "next generation" of the Internet Protocol, with 128-bit
- addresses. This protocol is in wide use in certain parts of
- the world, but has not yet replaced IPv4.(1)
-
- Versions of the other protocols that sit "atop" IP exist for both
- IPv4 and IPv6. However, as the IPv6 versions are fundamentally the
- same as the original IPv4 versions, we will not distinguish further
- between them.
-
-UDP
- The User Datagram Protocol. This is a best effort datagram
- protocol. It provides a small amount of extra reliability over IP,
- and adds the notion of "ports", described in *note TCP and UDP
- Ports: Ports.
-
-TCP
- The Transmission Control Protocol. This is a duplex, reliable,
- sequenced byte-stream protocol, again layered on top of IP, and
- also providing the notion of ports. This is the protocol that you
- will most likely use when using 'gawk' for network programming.
-
- All other user-level protocols use either TCP or UDP to do their
-basic communications. Examples are SMTP (Simple Mail Transfer
-Protocol), FTP (File Transfer Protocol), and HTTP (HyperText Transfer
-Protocol).
-
- ---------- Footnotes ----------
-
- (1) There isn't an IPv5.
-
-
-File: gawkinet.info, Node: Ports, Prev: Basic Protocols, Up: The TCP/IP Protocols
-
-1.3.2 TCP and UDP Ports
------------------------
-
-In the postal system, the address on an envelope indicates a physical
-location, such as a residence or office building. But there may be more
-than one person at the location; thus you have to further qualify the
-recipient by putting a person or company name on the envelope.
-
- In the phone system, one phone number may represent an entire
-company, in which case you need a person's extension number in order to
-reach that individual directly. Or, when you call a home, you have to
-say, "May I please speak to ..." before talking to the person directly.
-
- IP networking provides the concept of addressing. An IP address
-represents a particular computer, but no more. In order to reach the
-mail service on a system, or the FTP or WWW service on a system, you
-must have some way to further specify which service you want. In the
-Internet Protocol suite, this is done with "port numbers", which
-represent the services, much like an extension number used with a phone
-number.
-
- Port numbers are 16-bit integers. Unix and Unix-like systems reserve
-ports below 1024 for "well known" services, such as SMTP, FTP, and HTTP.
-Numbers 1024 and above may be used by any application, although there is
-no promise made that a particular port number is always available.
-
-
-File: gawkinet.info, Node: Making Connections, Prev: The TCP/IP Protocols, Up: Introduction
-
-1.4 Making TCP/IP Connections (And Some Terminology)
-====================================================
-
-Two terms come up repeatedly when discussing networking: "client" and
-"server". For now, we'll discuss these terms at the "connection level",
-when first establishing connections between two processes on different
-systems over a network. (Once the connection is established, the higher
-level, or "application level" protocols, such as HTTP or FTP, determine
-who is the client and who is the server. Often, it turns out that the
-client and server are the same in both roles.)
-
- The "server" is the system providing the service, such as the web
-server or email server. It is the "host" (system) which is _connected
-to_ in a transaction. For this to work though, the server must be
-expecting connections. Much as there has to be someone at the office
-building to answer the phone(1), the server process (usually) has to be
-started first and be waiting for a connection.
-
- The "client" is the system requesting the service. It is the system
-_initiating the connection_ in a transaction. (Just as when you pick up
-the phone to call an office or store.)
-
-   In the TCP/IP framework, each end of a connection is represented by
-an (ADDRESS, PORT) pair. For the duration of the connection, the
-ports in use at each end are unique, and cannot be used simultaneously
-by other processes on the same system. (Only after closing a connection
-can a new one be built up on the same port. This is contrary to the
-usual behavior of fully developed web servers which have to avoid
-situations in which they are not reachable. We have to pay this price
-in order to enjoy the benefits of a simple communication paradigm in
-'gawk'.)
-
- Furthermore, once the connection is established, communications are
-"synchronous".(2) That is, each end waits for the other to finish
-transmitting before replying. This is much like two people in a phone
-conversation. While both could talk simultaneously, doing so usually
-doesn't work too well.
-
- In the case of TCP, the synchronicity is enforced by the protocol
-when sending data. Data writes "block" until the data have been
-received on the other end. For both TCP and UDP, data reads block until
-there is incoming data waiting to be read. This is summarized in the
-following table, where an "X" indicates that the given action blocks.
-
-                          Reads     Writes
-TCP                       X         X
-UDP                       X
-
- ---------- Footnotes ----------
-
- (1) In the days before voice mail systems!
-
- (2) For the technically savvy, data reads block--if there's no
-incoming data, the program is made to wait until there is, instead of
-receiving a "there's no data" error return.
-
-
-File: gawkinet.info, Node: Using Networking, Next: Some Applications and Techniques, Prev: Introduction, Up: Top
-
-2 Networking With 'gawk'
-************************
-
-The 'awk' programming language was originally developed as a
-pattern-matching language for writing short programs to perform data
-manipulation tasks. 'awk''s strength is the manipulation of textual
-data that is stored in files. It was never meant to be used for
-networking purposes. To exploit its features in a networking context,
-it's necessary to use an access mode for network connections that
-resembles the access of files as closely as possible.
-
- 'awk' is also meant to be a prototyping language. It is used to
-demonstrate feasibility and to play with features and user interfaces.
-This can be done with file-like handling of network connections. 'gawk'
-trades the lack of many of the advanced features of the TCP/IP family of
-protocols for the convenience of simple connection handling. The
-advanced features are available when programming in C or Perl. In fact,
-the network programming in this major node is very similar to what is
-described in books such as 'Internet Programming with Python', 'Advanced
-Perl Programming', or 'Web Client Programming with Perl'.
-
- However, you can do the programming here without first having to
-learn object-oriented ideology; underlying languages such as Tcl/Tk,
-Perl, Python; or all of the libraries necessary to extend these
-languages before they are ready for the Internet.
-
- This major node demonstrates how to use the TCP protocol. The UDP
-protocol is much less important for most users.
-
-* Menu:
-
-* Gawk Special Files:: How to do 'gawk' networking.
-* TCP Connecting:: Making a TCP connection.
-* Troubleshooting:: Troubleshooting TCP/IP connections.
-* Interacting:: Interacting with a service.
-* Setting Up:: Setting up a service.
-* Email:: Reading email.
-* Web page:: Reading a Web page.
-* Primitive Service:: A primitive Web service.
-* Interacting Service:: A Web service with interaction.
-* Simple Server:: A simple Web server.
-* Caveats:: Network programming caveats.
-* Challenges:: Where to go from here.
-
-
-File: gawkinet.info, Node: Gawk Special Files, Next: TCP Connecting, Prev: Using Networking, Up: Using Networking
-
-2.1 'gawk''s Networking Mechanisms
-==================================
-
-The '|&' operator for use in communicating with a "coprocess" is
-described in *note Two-way Communications With Another Process:
-(gawk)Two-way I/O. It shows how to do two-way I/O to a separate process,
-sending it data with 'print' or 'printf' and reading data with
-'getline'. If you haven't read it already, you should detour there to
-do so.
-
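-   As a brief reminder, two-way communication with an ordinary
-coprocess looks roughly like the following minimal sketch (the
-line-oriented filter 'sort' is chosen purely for illustration; any
-filter command would do):
-
-     BEGIN {
-         Coprocess = "sort"
-         print "pear"  |& Coprocess
-         print "apple" |& Coprocess
-         close(Coprocess, "to")     # send EOF so 'sort' can emit its results
-         while ((Coprocess |& getline line) > 0)
-             print line
-         close(Coprocess)
-     }
-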
- 'gawk' transparently extends the two-way I/O mechanism to simple
-networking through the use of special file names. When a "coprocess"
-that matches the special files we are about to describe is started,
-'gawk' creates the appropriate network connection, and then two-way I/O
-proceeds as usual.
-
- At the C, C++, and Perl level, networking is accomplished via
-"sockets", an Application Programming Interface (API) originally
-developed at the University of California at Berkeley that is now used
-almost universally for TCP/IP networking. Socket level programming,
-while fairly straightforward, requires paying attention to a number of
-details, as well as using binary data. It is not well-suited for use
-from a high-level language like 'awk'. The special files provided in
-'gawk' hide the details from the programmer, making things much simpler
-and easier to use.
-
- The special file name for network access is made up of several
-fields, all of which are mandatory:
-
- /NET-TYPE/PROTOCOL/LOCALPORT/HOSTNAME/REMOTEPORT
-
- The NET-TYPE field lets you specify IPv4 versus IPv6, or lets you
-allow the system to choose.
-
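-   As a concrete illustration (the host name is only a placeholder), a
-client connection to a web server might use one of the following file
-names; the individual fields are explained in the next node:
-
-     "/inet4/tcp/0/www.example.com/80"    # force IPv4
-     "/inet6/tcp/0/www.example.com/80"    # force IPv6
-     "/inet/tcp/0/www.example.com/80"     # let the system decide
-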
-* Menu:
-
-* Special File Fields:: The fields in the special file name.
-* Comparing Protocols:: Differences between the protocols.
-
-
-File: gawkinet.info, Node: Special File Fields, Next: Comparing Protocols, Prev: Gawk Special Files, Up: Gawk Special Files
-
-2.1.1 The Fields of the Special File Name
------------------------------------------
-
-This node explains the meaning of all the other fields, as well as the
-range of values and the defaults. All of the fields are mandatory. To
-let the system pick a value, or if the field doesn't apply to the
-protocol, specify it as '0':
-
-NET-TYPE
- This is one of 'inet4' for IPv4, 'inet6' for IPv6, or 'inet' to use
- the system default (which is likely to be IPv4). For the rest of
- this document, we will use the generic '/inet' in our descriptions
- of how 'gawk''s networking works.
-
-PROTOCOL
- Determines which member of the TCP/IP family of protocols is
- selected to transport the data across the network. There are two
- possible values (always written in lowercase): 'tcp' and 'udp'.
- The exact meaning of each is explained later in this node.
-
-LOCALPORT
- Determines which port on the local machine is used to communicate
- across the network. Application-level clients usually use '0' to
- indicate they do not care which local port is used--instead they
- specify a remote port to connect to. It is vital for
- application-level servers to use a number different from '0' here
- because their service has to be available at a specific publicly
- known port number. It is possible to use a name from
- '/etc/services' here.
-
-HOSTNAME
- Determines which remote host is to be at the other end of the
- connection. Application-level servers must fill this field with a
- '0' to indicate their being open for all other hosts to connect to
- them and enforce connection level server behavior this way. It is
- not possible for an application-level server to restrict its
- availability to one remote host by entering a host name here.
- Application-level clients must enter a name different from '0'.
- The name can be either symbolic (e.g., 'jpl-devvax.jpl.nasa.gov')
- or numeric (e.g., '128.149.1.143').
-
-REMOTEPORT
- Determines which port on the remote machine is used to communicate
- across the network. For '/inet/tcp' and '/inet/udp',
- application-level clients _must_ use a number other than '0' to
- indicate to which port on the remote machine they want to connect.
- Application-level servers must not fill this field with a '0'.
- Instead they specify a local port to which clients connect. It is
- possible to use a name from '/etc/services' here.
-
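-   Putting these rules together, the two special file names that appear
-in the examples later in this major node look like this:
-
-     # Client: local port 0 (system-chosen), remote host and port given
-     "/inet/tcp/0/localhost/daytime"
-
-     # Server: local port given, remote host and remote port set to 0
-     "/inet/tcp/8888/0/0"
-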
- Experts in network programming will notice that the usual
-client/server asymmetry found at the level of the socket API is not
-visible here. This is for the sake of simplicity of the high-level
-concept. If this asymmetry is necessary for your application, use
-another language. For 'gawk', it is more important to enable users to
-write a client program with a minimum of code. What happens when first
-accessing a network connection is seen in the following pseudocode:
-
- if ((name of remote host given) && (other side accepts connection)) {
- rendez-vous successful; transmit with getline or print
- } else {
- if ((other side did not accept) && (localport == 0))
- exit unsuccessful
- if (TCP) {
- set up a server accepting connections
- this means waiting for the client on the other side to connect
- } else
- ready
- }
-
- The exact behavior of this algorithm depends on the values of the
-fields of the special file name. When in doubt, *note Table 2.1:
-table-inet-components. gives you the combinations of values and their
-meaning. If this table is too complicated, focus on the three lines
-printed in *bold*. All the examples in *note Networking With 'gawk':
-Using Networking, use only the patterns printed in bold letters.
-
-PROTOCOL LOCAL HOST NAME REMOTE RESULTING CONNECTION-LEVEL
- PORT PORT BEHAVIOR
-------------------------------------------------------------------------------
-*tcp* *0* *x* *x* *Dedicated client, fails if
- immediately connecting to a
- server on the other side
- fails*
-udp 0 x x Dedicated client
-*tcp, *x* *x* *x* *Client, switches to
-udp* dedicated server if
- necessary*
-*tcp, *x* *0* *0* *Dedicated server*
-udp*
-tcp, udp x x 0 Invalid
-tcp, udp 0 0 x Invalid
-tcp, udp x 0 x Invalid
-tcp, udp 0 0 0 Invalid
-tcp, udp 0 x 0 Invalid
-
-Table 2.1: /inet Special File Components
-
- In general, TCP is the preferred mechanism to use. It is the
-simplest protocol to understand and to use. Use UDP only if
-circumstances demand low overhead.
-
-
-File: gawkinet.info, Node: Comparing Protocols, Prev: Special File Fields, Up: Gawk Special Files
-
-2.1.2 Comparing Protocols
--------------------------
-
-This node develops a pair of programs (sender and receiver) that do
-nothing but send a timestamp from one machine to another. The sender
-and the receiver are implemented with each of the two protocols
-available and demonstrate the differences between them.
-
-* Menu:
-
-* File /inet/tcp:: The TCP special file.
-* File /inet/udp:: The UDP special file.
-
-
-File: gawkinet.info, Node: File /inet/tcp, Next: File /inet/udp, Prev: Comparing Protocols, Up: Comparing Protocols
-
-2.1.2.1 '/inet/tcp'
-...................
-
-Once again, always use TCP. (Use UDP only when low overhead is a
-necessity.) The first example is the
-sender program:
-
- # Server
- BEGIN {
- print strftime() |& "/inet/tcp/8888/0/0"
- close("/inet/tcp/8888/0/0")
- }
-
- The receiver is very simple:
-
- # Client
- BEGIN {
- "/inet/tcp/0/localhost/8888" |& getline
- print $0
- close("/inet/tcp/0/localhost/8888")
- }
-
- TCP guarantees that the bytes arrive at the receiving end in exactly
-the same order that they were sent. No byte is lost (except for broken
-connections), doubled, or out of order. Some overhead is necessary to
-accomplish this, but this is the price to pay for a reliable service.
-It does matter which side starts first. The sender/server has to be
-started first, and it waits for the receiver to read a line.
-
-
-File: gawkinet.info, Node: File /inet/udp, Prev: File /inet/tcp, Up: Comparing Protocols
-
-2.1.2.2 '/inet/udp'
-...................
-
-The server and client programs that use UDP are almost identical to
-their TCP counterparts; only the PROTOCOL has changed. As before, it
-does matter which side starts first. The receiving side blocks and
-waits for the sender. In this case, the receiver/client has to be
-started first:
-
- # Server
- BEGIN {
- print strftime() |& "/inet/udp/8888/0/0"
- close("/inet/udp/8888/0/0")
- }
-
- The receiver is almost identical to the TCP receiver:
-
- # Client
- BEGIN {
- print "hi!" |& "/inet/udp/0/localhost/8888"
- "/inet/udp/0/localhost/8888" |& getline
- print $0
- close("/inet/udp/0/localhost/8888")
- }
-
- In the case of UDP, the initial 'print' command is the one that
-actually sends data so that there is a connection. UDP and "connection"
-sounds strange to anyone who has learned that UDP is a connectionless
-protocol. Here, "connection" means that the 'connect()' system call has
-completed its work and completed the "association" between a certain
-socket and an IP address. Thus there are subtle differences between
-'connect()' for TCP and UDP; see the man page for details.(1)
-
- UDP cannot guarantee that the datagrams at the receiving end will
-arrive in exactly the same order they were sent. Some datagrams could
-be lost, some doubled, and some out of order. But no overhead is
-necessary to accomplish this. This unreliable behavior is good enough
-for tasks such as data acquisition, logging, and even stateless services
-like the original versions of NFS.
-
- ---------- Footnotes ----------
-
- (1) This subtlety is just one of many details that are hidden in the
-socket API, invisible and intractable for the 'gawk' user. The
-developers are currently considering how to rework the network
-facilities to make them easier to understand and use.
-
-
-File: gawkinet.info, Node: TCP Connecting, Next: Troubleshooting, Prev: Gawk Special Files, Up: Using Networking
-
-2.2 Establishing a TCP Connection
-=================================
-
-Let's observe a network connection at work. Type in the following
-program and watch the output. Within a second, it connects via TCP
-('/inet/tcp') to the machine it is running on ('localhost') and asks the
-service 'daytime' on the machine what time it is:
-
- BEGIN {
- "/inet/tcp/0/localhost/daytime" |& getline
- print $0
- close("/inet/tcp/0/localhost/daytime")
- }
-
- Even experienced 'awk' users will find the second line strange in two
-respects:
-
- * A special file is used as a shell command that pipes its output
- into 'getline'. One would rather expect to see the special file
-     being read like any other file ('getline <
-     "/inet/tcp/0/localhost/daytime"').
-
- * The operator '|&' has not been part of any 'awk' implementation
- (until now). It is actually the only extension of the 'awk'
- language needed (apart from the special files) to introduce network
- access.
-
- The '|&' operator was introduced in 'gawk' 3.1 in order to overcome
-the crucial restriction that access to files and pipes in 'awk' is
-always unidirectional. It was formerly impossible to use both access
-modes on the same file or pipe. Instead of changing the whole concept
-of file access, the '|&' operator behaves exactly like the usual pipe
-operator except for two additions:
-
- * Normal shell commands connected to their 'gawk' program with a '|&'
- pipe can be accessed bidirectionally. The '|&' turns out to be a
- quite general, useful, and natural extension of 'awk'.
-
- * Pipes that consist of a special file name for network connections
- are not executed as shell commands. Instead, they can be read and
- written to, just like a full-duplex network connection.
-
- In the earlier example, the '|&' operator tells 'getline' to read a
-line from the special file '/inet/tcp/0/localhost/daytime'. We could
-also have printed a line into the special file. But instead we just
-read a line with the time, printed it, and closed the connection.
-(While we could just let 'gawk' close the connection by finishing the
-program, in this Info file we are pedantic and always explicitly close
-the connections.)
-
-
-File: gawkinet.info, Node: Troubleshooting, Next: Interacting, Prev: TCP Connecting, Up: Using Networking
-
-2.3 Troubleshooting Connection Problems
-=======================================
-
-It may well be that for some reason the program shown in the previous
-example does not run on your machine. When looking at possible reasons
-for this, you will learn much about typical problems that arise in
-network programming. First of all, your implementation of 'gawk' may
-not support network access because it is a pre-3.1 version or you do not
-have a network interface in your machine. Perhaps your machine uses
-some other protocol, such as DECnet or Novell's IPX. For the rest of
-this major node, we will assume you work on a Unix machine that supports
-TCP/IP. If the previous example program does not run on your machine, it
-may help to replace the name 'localhost' with the name of your machine
-or its IP address. If it does, you could replace 'localhost' with the
-name of another machine in your vicinity--this way, the program connects
-to another machine. Now you should see the date and time being printed
-by the program, otherwise your machine may not support the 'daytime'
-service. Try changing the service to 'chargen' or 'ftp'. This way, the
-program connects to other services that should give you some response.
-If you are curious, you should have a look at your '/etc/services' file.
-It could look like this:
-
- # /etc/services:
- #
- # Network services, Internet style
- #
- # Name Number/Protocol Alternate name # Comments
-
- echo 7/tcp
- echo 7/udp
- discard 9/tcp sink null
- discard 9/udp sink null
- daytime 13/tcp
- daytime 13/udp
- chargen 19/tcp ttytst source
- chargen 19/udp ttytst source
- ftp 21/tcp
- telnet 23/tcp
- smtp 25/tcp mail
- finger 79/tcp
- www 80/tcp http # WorldWideWeb HTTP
- www 80/udp # HyperText Transfer Protocol
- pop-2 109/tcp postoffice # POP version 2
- pop-2 109/udp
- pop-3 110/tcp # POP version 3
- pop-3 110/udp
- nntp 119/tcp readnews untp # USENET News
- irc 194/tcp # Internet Relay Chat
- irc 194/udp
- ...
-
- Here, you find a list of services that traditional Unix machines
-usually support. If your GNU/Linux machine does not do so, it may be
-that these services are switched off in some startup script. Systems
-running some flavor of Microsoft Windows usually do _not_ support these
-services. Nevertheless, it _is_ possible to do networking with 'gawk'
-on Microsoft Windows.(1) The first column of the file gives the name of
-the service, and the second column gives a unique number and the
-protocol that one can use to connect to this service. The rest of the
-line is treated as a comment. You see that some services ('echo')
-support TCP as well as UDP.
-
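-   For instance, a quick test of the 'chargen' service could read and
-print a handful of its lines. The following is only a sketch; it
-assumes that 'chargen' is actually enabled on 'localhost':
-
-     BEGIN {
-         Service = "/inet/tcp/0/localhost/chargen"
-         for (i = 1; i <= 5; i++) {    # chargen never stops by itself
-             if ((Service |& getline) <= 0)
-                 break
-             print $0
-         }
-         close(Service)
-     }
-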
- ---------- Footnotes ----------
-
- (1) Microsoft preferred to ignore the TCP/IP family of protocols
-until 1995. Then came the rise of the Netscape browser as a landmark
-"killer application." Microsoft added TCP/IP support and their own
-browser to Microsoft Windows 95 at the last minute. They even
-back-ported their TCP/IP implementation to Microsoft Windows for
-Workgroups 3.11, but it was a rather rudimentary and half-hearted
-implementation. Nevertheless, the equivalent of '/etc/services' resides
-under 'C:\WINNT\system32\drivers\etc\services' on Microsoft Windows 2000
-and Microsoft Windows XP.
-
-
-File: gawkinet.info, Node: Interacting, Next: Setting Up, Prev: Troubleshooting, Up: Using Networking
-
-2.4 Interacting with a Network Service
-======================================
-
-The next program makes use of the possibility to really interact with a
-network service by printing something into the special file. It asks
-the so-called 'finger' service if a user of the machine is logged in.
-When testing this program, try to change 'localhost' to some other
-machine name in your local network:
-
- BEGIN {
- NetService = "/inet/tcp/0/localhost/finger"
- print "NAME" |& NetService
- while ((NetService |& getline) > 0)
- print $0
- close(NetService)
- }
-
- After telling the service on the machine which user to look for, the
-program repeatedly reads lines that come as a reply. When no more lines
-are coming (because the service has closed the connection), the program
-also closes the connection. Try replacing '"NAME"' with your login name
-(or the name of someone else logged in). For a list of all users
-currently logged in, replace NAME with an empty string ('""').
-
- The final 'close()' command could be safely deleted from the above
-script, because the operating system closes any open connection by
-default when a script reaches the end of execution. In order to avoid
-portability problems, it is best to always close connections explicitly.
-With the Linux kernel, for example, proper closing results in flushing
-of buffers. Letting the close happen by default may result in
-discarding buffers.
-
- When looking at '/etc/services' you may have noticed that the
-'daytime' service is also available with 'udp'. In the earlier example,
-change 'tcp' to 'udp', and change 'finger' to 'daytime'. After starting
-the modified program, you see the expected day and time message. The
-program then hangs, because it waits for more lines coming from the
-service. However, they never come. This behavior is a consequence of
-the differences between TCP and UDP. When using UDP, neither party is
-automatically informed about the other closing the connection.
-Continuing to experiment this way reveals many other subtle differences
-between TCP and UDP. To avoid such trouble, one should always remember
-the advice Douglas E. Comer and David Stevens give in Volume III of
-their series 'Internetworking With TCP/IP' (page 14):
-
- When designing client-server applications, beginners are strongly
- advised to use TCP because it provides reliable,
- connection-oriented communication. Programs only use UDP if the
- application protocol handles reliability, the application requires
- hardware broadcast or multicast, or the application cannot tolerate
- virtual circuit overhead.
-
-
-File: gawkinet.info, Node: Setting Up, Next: Email, Prev: Interacting, Up: Using Networking
-
-2.5 Setting Up a Service
-========================
-
-The preceding programs behaved as clients that connect to a server
-somewhere on the Internet and request a particular service. Now we set
-up such a service to mimic the behavior of the 'daytime' service. Such
-a server does not know in advance who is going to connect to it over the
-network. Therefore, we cannot insert a name for the host to connect to
-in our special file name.
-
- Start the following program in one window. Notice that the service
-does not have the name 'daytime', but the number '8888'. From looking
-at '/etc/services', you know that names like 'daytime' are just
-mnemonics for predetermined 16-bit integers. Only the system
-administrator ('root') could enter our new service into '/etc/services'
-with an appropriate name. Also notice that the service name has to be
-entered into a different field of the special file name because we are
-setting up a server, not a client:
-
- BEGIN {
- print strftime() |& "/inet/tcp/8888/0/0"
- close("/inet/tcp/8888/0/0")
- }
-
- Now open another window on the same machine. Copy the client program
-given as the first example (*note Establishing a TCP Connection: TCP
-Connecting.) to a new file and edit it, changing the name 'daytime' to
-'8888'. Then start the modified client. You should get a reply like
-this:
-
- Sat Sep 27 19:08:16 CEST 1997
-
-Both programs explicitly close the connection.
-
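-   For reference, the modified client amounts to nothing more than
-replacing the service name; it is the same program as before, with
-'8888' in place of 'daytime':
-
-     # Client
-     BEGIN {
-         "/inet/tcp/0/localhost/8888" |& getline
-         print $0
-         close("/inet/tcp/0/localhost/8888")
-     }
-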
- Now we will intentionally make a mistake to see what happens when the
-name '8888' (the so-called port) is already used by another service.
-Start the server program in both windows. The first one works, but the
-second one complains that it could not open the connection. Each port
-on a single machine can only be used by one server program at a time.
-Now terminate the server program and change the name '8888' to 'echo'.
-After restarting it, the server program does not run any more, and you
-know why: there is already an 'echo' service running on your machine.
-But even if this isn't true, you would not get your own 'echo' server
-running on a Unix machine, because the ports with numbers smaller than
-1024 ('echo' is at port 7) are reserved for 'root'. On machines running
-some flavor of Microsoft Windows, there is no restriction that reserves
-ports 1 to 1024 for a privileged user; hence, you can start an 'echo'
-server there.
-
- Turning this short server program into something really useful is
-simple. Imagine a server that first reads a file name from the client
-through the network connection, then does something with the file and
-sends a result back to the client. The server-side processing could be:
-
- BEGIN {
- NetService = "/inet/tcp/8888/0/0"
-       NetService |& getline           # 'getline' sets $0 and the fields
-       CatPipe = ("cat " $1)           # $1 holds the requested file name
- while ((CatPipe | getline) > 0)
- print $0 |& NetService
- close(NetService)
- }
-
-and we would have a remote copying facility. Such a server reads the
-name of a file from any client that connects to it and transmits the
-contents of the named file across the net. The server-side processing
-could also be the execution of a command that is transmitted across the
-network. From this example, you can see how simple it is to open up a
-security hole on your machine. If you allow clients to connect to your
-machine and execute arbitrary commands, anyone would be free to do 'rm
--rf *'.
-
-
-File: gawkinet.info, Node: Email, Next: Web page, Prev: Setting Up, Up: Using Networking
-
-2.6 Reading Email
-=================
-
-The distribution of email is usually done by dedicated email servers
-that communicate with your machine using special protocols. To receive
-email, we will use the Post Office Protocol (POP). Sending can be done
-with the much older Simple Mail Transfer Protocol (SMTP).
-
-   When you type in the following program, replace EMAILHOST with the
-name of your local email server. Ask your administrator if the server
-has a POP service, and then use its name or number in the program below.
-Now the program is ready to connect to your email server, but it will
-not succeed in retrieving your mail because it does not yet know your
-login name or password. Replace them in the program and it shows you
-the first email the server has in store:
-
- BEGIN {
- POPService = "/inet/tcp/0/EMAILHOST/pop3"
- RS = ORS = "\r\n"
- print "user NAME" |& POPService
- POPService |& getline
- print "pass PASSWORD" |& POPService
- POPService |& getline
- print "retr 1" |& POPService
- POPService |& getline
- if ($1 != "+OK") exit
- print "quit" |& POPService
- RS = "\r\n\\.\r\n"
- POPService |& getline
- print $0
- close(POPService)
- }
-
- The record separators 'RS' and 'ORS' are redefined because the
-protocol (POP) requires CR-LF to separate lines. After identifying
-yourself to the email service, the command 'retr 1' instructs the
-service to send the first of all your email messages in line. If the
-service replies with something other than '+OK', the program exits;
-maybe there is no email. Otherwise, the program first announces that it
-intends to finish reading email, and then redefines 'RS' in order to
-read the entire email as multiline input in one record. From the POP
-RFC, we know that the body of the email always ends with a single line
-containing a single dot. The program looks for this using 'RS =
-"\r\n\\.\r\n"'. When it finds this sequence in the mail message, it
-quits. You can invoke this program as often as you like; it does not
-delete the message it reads, but instead leaves it on the server.
-
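-   If you do want the message removed after reading it, the POP command
-'dele 1' marks message 1 for deletion; the deletion actually takes
-place when the session ends with 'quit'. In the program above, one
-additional line sent just before the 'quit' command is enough (be
-careful: this really removes the mail from the server):
-
-     print "dele 1" |& POPService
-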
-
-File: gawkinet.info, Node: Web page, Next: Primitive Service, Prev: Email, Up: Using Networking
-
-2.7 Reading a Web Page
-======================
-
-Retrieving a web page from a web server is as simple as retrieving email
-from an email server. We only have to use a similar, but not identical,
-protocol and a different port. The name of the protocol is HyperText
-Transfer Protocol (HTTP) and the port number is usually 80. As in the
-preceding node, ask your administrator about the name of your local web
-server or proxy web server and its port number for HTTP requests.
-
- The following program employs a rather crude approach toward
-retrieving a web page. It uses the prehistoric syntax of HTTP 0.9,
-which almost all web servers still support. The most noticeable thing
-about it is that the program directs the request to the local proxy
-server whose name you insert in the special file name (which in turn
-calls 'www.yahoo.com'):
-
- BEGIN {
- RS = ORS = "\r\n"
- HttpService = "/inet/tcp/0/PROXY/80"
- print "GET http://www.yahoo.com" |& HttpService
- while ((HttpService |& getline) > 0)
- print $0
- close(HttpService)
- }
-
- Again, lines are separated by a redefined 'RS' and 'ORS'. The 'GET'
-request that we send to the server is the only kind of HTTP request that
-existed when the web was created in the early 1990s. HTTP calls this
-'GET' request a "method," which tells the service to transmit a web page
-(here the home page of the Yahoo! search engine). Version 1.0 added
-the request methods 'HEAD' and 'POST'. The current version of HTTP is
-1.1,(1) which adds the request methods 'OPTIONS', 'PUT',
-'DELETE', and 'TRACE'. You can fill in any valid web address, and the
-program prints the HTML code of that page to your screen.
-
- Notice the similarity between the responses of the POP and HTTP
-services. First, you get a header that is terminated by an empty line,
-and then you get the body of the page in HTML. The lines of the headers
-also have the same form as in POP. There is the name of a parameter,
-then a colon, and finally the value of that parameter.
-
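-   A sketch of how these header lines could be collected into an array
-follows. It assumes that the server, reached through the placeholder
-PROXY as above, answers an HTTP/1.0 request with a header section:
-
-     BEGIN {
-         RS = ORS = "\r\n"
-         HttpService = "/inet/tcp/0/PROXY/80"
-         print "GET http://www.yahoo.com/ HTTP/1.0" |& HttpService
-         print "" |& HttpService                 # empty line ends the request
-         while ((HttpService |& getline) > 0 && $0 != "") {
-             n = index($0, ":")                  # header lines are "Name: value"
-             if (n > 0) {
-                 value = substr($0, n + 1)
-                 sub(/^[ \t]+/, "", value)
-                 Header[substr($0, 1, n - 1)] = value
-             }
-         }
-         while ((HttpService |& getline) > 0)    # the rest is the HTML body
-             BodyLines++
-         close(HttpService)
-         ORS = "\n"
-         for (name in Header)
-             print name ": " Header[name]
-         print "body lines received: " BodyLines
-     }
-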
- Images ('.png' or '.gif' files) can also be retrieved this way, but
-then you get binary data that should be redirected into a file. Another
-application is calling a CGI (Common Gateway Interface) script on some
-server. CGI scripts are used when the contents of a web page are not
-constant, but generated instantly at the moment you send a request for
-the page. For example, to get a detailed report about the current
-quotes of Motorola stock shares, call a CGI script at Yahoo! with the
-following:
-
- get = "GET http://quote.yahoo.com/q?s=MOT&d=t"
- print get |& HttpService
-
- You can also request weather reports this way.
-
- ---------- Footnotes ----------
-
- (1) Version 1.0 of HTTP was defined in RFC 1945. HTTP 1.1 was
-initially specified in RFC 2068. In June 1999, RFC 2068 was made
-obsolete by RFC 2616, an update without any substantial changes.
-
-
-File: gawkinet.info, Node: Primitive Service, Next: Interacting Service, Prev: Web page, Up: Using Networking
-
-2.8 A Primitive Web Service
-===========================
-
-Now we know enough about HTTP to set up a primitive web service that
-just says '"Hello, world"' when someone connects to it with a browser.
-Compared to the situation in the preceding node, our program changes the
-role. It tries to behave just like the server we have observed. Since
-we are setting up a server here, we have to insert the port number in
-the 'localport' field of the special file name. The other two fields
-(HOSTNAME and REMOTEPORT) have to contain a '0' because we do not know
-in advance which host will connect to our service.
-
- In the early 1990s, all a server had to do was send an HTML document
-and close the connection. Here, we adhere to the modern syntax of HTTP.
-The steps are as follows:
-
- 1. Send a status line telling the web browser that everything is okay.
-
- 2. Send a line to tell the browser how many bytes follow in the body
- of the message. This was not necessary earlier because both
- parties knew that the document ended when the connection closed.
- Nowadays it is possible to stay connected after the transmission of
- one web page. This is to avoid the network traffic necessary for
- repeatedly establishing TCP connections for requesting several
- images. Thus, there is the need to tell the receiving party how
- many bytes will be sent. The header is terminated as usual with an
- empty line.
-
- 3. Send the '"Hello, world"' body in HTML. The useless 'while' loop
- swallows the request of the browser. We could actually omit the
- loop, and on most machines the program would still work. First,
- start the following program:
-
- BEGIN {
- RS = ORS = "\r\n"
- HttpService = "/inet/tcp/8080/0/0"
- Hello = "<HTML><HEAD>" \
- "<TITLE>A Famous Greeting</TITLE></HEAD>" \
- "<BODY><H1>Hello, world</H1></BODY></HTML>"
- Len = length(Hello) + length(ORS)
- print "HTTP/1.0 200 OK" |& HttpService
- print "Content-Length: " Len ORS |& HttpService
- print Hello |& HttpService
- while ((HttpService |& getline) > 0)
- continue;
- close(HttpService)
- }
-
- Now, on the same machine, start your favorite browser and let it
-point to <http://localhost:8080> (the browser needs to know on which
-port our server is listening for requests). If this does not work, the
-browser probably tries to connect to a proxy server that does not know
-your machine. If so, change the browser's configuration so that the
-browser does not try to use a proxy to connect to your machine.
-
-
-File: gawkinet.info, Node: Interacting Service, Next: Simple Server, Prev: Primitive Service, Up: Using Networking
-
-2.9 A Web Service with Interaction
-==================================
-
-This node shows how to set up a simple web server. The subnode is a
-library file that we will use with all the examples in *note Some
-Applications and Techniques::.
-
-* Menu:
-
-* CGI Lib:: A simple CGI library.
-
- Setting up a web service that allows user interaction is more
-difficult and shows us the limits of network access in 'gawk'. In this
-node, we develop a main program (a 'BEGIN' pattern and its action) that
-will become the core of event-driven execution controlled by a graphical
-user interface (GUI). Each HTTP event that the user triggers by some
-action within the browser is received in this central procedure.
-Parameters and menu choices are extracted from this request, and an
-appropriate measure is taken according to the user's choice. For
-example:
-
- BEGIN {
- if (MyHost == "") {
- "uname -n" | getline MyHost
- close("uname -n")
- }
- if (MyPort == 0) MyPort = 8080
- HttpService = "/inet/tcp/" MyPort "/0/0"
- MyPrefix = "http://" MyHost ":" MyPort
- SetUpServer()
- while ("awk" != "complex") {
- # header lines are terminated this way
- RS = ORS = "\r\n"
- Status = 200 # this means OK
- Reason = "OK"
- Header = TopHeader
- Document = TopDoc
- Footer = TopFooter
- if (GETARG["Method"] == "GET") {
- HandleGET()
- } else if (GETARG["Method"] == "HEAD") {
- # not yet implemented
- } else if (GETARG["Method"] != "") {
- print "bad method", GETARG["Method"]
- }
- Prompt = Header Document Footer
- print "HTTP/1.0", Status, Reason |& HttpService
- print "Connection: Close" |& HttpService
- print "Pragma: no-cache" |& HttpService
- len = length(Prompt) + length(ORS)
- print "Content-length:", len |& HttpService
- print ORS Prompt |& HttpService
- # ignore all the header lines
- while ((HttpService |& getline) > 0)
- ;
- # stop talking to this client
- close(HttpService)
- # wait for new client request
- HttpService |& getline
- # do some logging
- print systime(), strftime(), $0
- # read request parameters
- CGI_setup($1, $2, $3)
- }
- }
-
- This web server presents menu choices in the form of HTML links.
-Therefore, it has to tell the browser the name of the host it is
-residing on. When starting the server, the user may supply the name of
-the host from the command line with 'gawk -v MyHost="Rumpelstilzchen"'.
-If the user does not do this, the server looks up the name of the host
-it is running on for later use as a web address in HTML documents. The
-same applies to the port number. These values are inserted later into
-the HTML content of the web pages to refer to the home system.
-
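-   An invocation of such a server might therefore look like this (the
-file name 'core.awk' is just a placeholder for a file containing the
-'BEGIN' block above together with 'SetUpServer()' and 'HandleGET()'):
-
-     gawk -v MyHost="Rumpelstilzchen" -v MyPort=8080 -f core.awk
-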
- Each server that is built around this core has to initialize some
-application-dependent variables (such as the default home page) in a
-procedure 'SetUpServer()', which is called immediately before entering
-the infinite loop of the server. For now, we will write an instance
-that initiates a trivial interaction. With this home page, the client
-user can click on two possible choices, and receive the current date
-either in human-readable format or in seconds since 1970:
-
- function SetUpServer() {
- TopHeader = "<HTML><HEAD>"
- TopHeader = TopHeader \
- "<title>My name is GAWK, GNU AWK</title></HEAD>"
- TopDoc = "<BODY><h2>\
- Do you prefer your date <A HREF=" MyPrefix \
- "/human>human</A> or \
- <A HREF=" MyPrefix "/POSIX>POSIXed</A>?</h2>" ORS ORS
- TopFooter = "</BODY></HTML>"
- }
-
- On the first run through the main loop, the default line terminators
-are set and the default home page is copied to the actual home page.
-Since this is the first run, 'GETARG["Method"]' is not initialized yet,
-hence the case selection over the method does nothing. Now that the
-home page is initialized, the server can start communicating to a client
-browser.
-
- It does so by printing the HTTP header into the network connection
-('print ... |& HttpService'). This command blocks execution of the
-server script until a client connects. If this server script is
-compared with the primitive one we wrote before, you will notice two
-additional lines in the header. The first instructs the browser to
-close the connection after each request. The second tells the browser
-that it should never try to _remember_ earlier requests that had
-identical web addresses (no caching). Otherwise, it could happen that
-the browser retrieves the time of day in the previous example just once,
-and later it takes the web page from the cache, always displaying the
-same time of day although time advances each second.
-
- Having supplied the initial home page to the browser with a valid
-document stored in the parameter 'Prompt', it closes the connection and
-waits for the next request. When the request comes, a log line is
-printed that allows us to see which request the server receives. The
-final step in the loop is to call the function 'CGI_setup()', which
-reads all the lines of the request (coming from the browser), processes
-them, and stores the transmitted parameters in the array 'PARAM'. The
-complete text of these application-independent functions can be found in
-*note A Simple CGI Library: CGI Lib. For now, we use a simplified
-version of 'CGI_setup()':
-
-     function CGI_setup( method, uri, version, i, j) {
- delete GETARG; delete MENU; delete PARAM
- GETARG["Method"] = $1
- GETARG["URI"] = $2
- GETARG["Version"] = $3
- i = index($2, "?")
- # is there a "?" indicating a CGI request?
- if (i > 0) {
- split(substr($2, 1, i-1), MENU, "[/:]")
- split(substr($2, i+1), PARAM, "&")
- for (i in PARAM) {
- j = index(PARAM[i], "=")
- GETARG[substr(PARAM[i], 1, j-1)] = \
- substr(PARAM[i], j+1)
- }
- } else { # there is no "?", no need for splitting PARAMs
- split($2, MENU, "[/:]")
- }
- }
-
- At first, the function clears all variables used for global storage
-of request parameters. The rest of the function serves the purpose of
-filling the global parameters with the extracted new values. To
-accomplish this, the name of the requested resource is split into parts
-and stored for later evaluation. If the request contains a '?', then
-the request has CGI variables seamlessly appended to the web address.
-Everything in front of the '?' is split up into menu items, and
-everything behind the '?' is a list of 'VARIABLE=VALUE' pairs (separated
-by '&') that also need splitting. This way, CGI variables are isolated
-and stored. This procedure lacks recognition of special characters that
-are transmitted in coded form(1). Here, any optional request header and
-body parts are ignored. We do not need header parameters and the
-request body. However, when refining our approach or working with the
-'POST' and 'PUT' methods, reading the header and body becomes
-inevitable. Header parameters should then be stored in a global array
-as well as the body.
-
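-   The decoding itself is not difficult. The CGI library presented
-later in this Info file takes care of it; the following function is
-only a rough sketch of the idea (the name 'DecodeSketch' is made up
-for this example):
-
-     function DecodeSketch(str,    hex, out, i, c) {
-         hex = "0123456789ABCDEF"
-         out = ""
-         for (i = 1; i <= length(str); i++) {
-             c = substr(str, i, 1)
-             if (c == "+")
-                 out = out " "                  # '+' encodes a space
-             else if (c == "%" && i + 2 <= length(str)) {
-                 out = out sprintf("%c",
-                     (index(hex, toupper(substr(str, i + 1, 1))) - 1) * 16 \
-                     + index(hex, toupper(substr(str, i + 2, 1))) - 1)
-                 i += 2
-             } else
-                 out = out c
-         }
-         return out
-     }
-
-For example, 'DecodeSketch("copy%26paste")' returns 'copy&paste'.
-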
- On each subsequent run through the main loop, one request from a
-browser is received, evaluated, and answered according to the user's
-choice. This can be done by letting the value of the HTTP method guide
-the main loop into execution of the procedure 'HandleGET()', which
-evaluates the user's choice. In this case, we have only one
-hierarchical level of menus, but in the general case, menus are nested.
-The menu choices at each level are separated by '/', just as in file
-names. Notice how simple it is to construct menus of arbitrary depth:
-
- function HandleGET() {
- if ( MENU[2] == "human") {
- Footer = strftime() TopFooter
- } else if (MENU[2] == "POSIX") {
- Footer = systime() TopFooter
- }
- }
-
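-   For example, a hypothetical second menu level (not part of the
-example above) would simply inspect 'MENU[3]' as well; a request for
-'/date/human' would then be handled like this:
-
-     function HandleGET() {
-         if (MENU[2] == "date") {        # first menu level
-             if (MENU[3] == "human")     # second menu level
-                 Footer = strftime() TopFooter
-             else if (MENU[3] == "POSIX")
-                 Footer = systime() TopFooter
-         }
-     }
-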
- The disadvantage of this approach is that our server is slow and can
-handle only one request at a time. Its main advantage, however, is that
-the server consists of just one 'gawk' program. No need for installing
-an 'httpd', and no need for static separate HTML files, CGI scripts, or
-'root' privileges. This is rapid prototyping. This program can be
-started on the same host that runs your browser. Then let your browser
-point to <http://localhost:8080>.
-
- It is also possible to include images into the HTML pages. Most
-browsers support the little-known '.xbm' format, which can contain
-only monochrome pictures but is an ASCII format.  Binary images
-are possible but not so easy to handle. Another way of including images
-is to generate them with a tool such as GNUPlot, by calling the tool
-with the 'system()' function or through a pipe.
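-
-   As a minimal sketch of the pipe approach (the GNUPlot commands and
-the file name are our own choice, not taken from the original text),
-an image file could be produced like this and then referenced from the
-generated HTML page:
-
-     BEGIN {
-         GnuPlot = "gnuplot"            # a one-way pipe is enough here
-         print "set term png"         | GnuPlot
-         print "set output 'tod.png'" | GnuPlot
-         print "plot sin(x)"          | GnuPlot
-         close(GnuPlot)                 # 'tod.png' is complete now
-     }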
-
- ---------- Footnotes ----------
-
- (1) As defined in RFC 2068.
-
-
-File: gawkinet.info, Node: CGI Lib, Prev: Interacting Service, Up: Interacting Service
-
-2.9.1 A Simple CGI Library
---------------------------
-
- HTTP is like being married: you have to be able to handle whatever
- you're given, while being very careful what you send back.
- Phil Smith III,
- <http://www.netfunny.com/rhf/jokes/99/Mar/http.html>
-
- In *note A Web Service with Interaction: Interacting Service, we saw
-the function 'CGI_setup()' as part of the web server "core logic"
-framework. The code presented there handles almost everything necessary
-for CGI requests. One thing it doesn't do is handle encoded characters
-in the requests. For example, an '&' is encoded as a percent sign
-followed by the hexadecimal value: '%26'. These encoded values should
-be decoded. Following is a simple library to perform these tasks. This
-code is used for all web server examples throughout the rest of
-this Info file. If you want to use it for your own web server, store
-the source code into a file named 'inetlib.awk'. Then you can include
-these functions into your code by placing the following statement into
-your program (on the first line of your script):
-
- @include inetlib.awk
-
-But beware, this mechanism is only possible if you invoke your web
-server script with 'igawk' instead of the usual 'awk' or 'gawk'. Here
-is the code:
-
- # CGI Library and core of a web server
- # Global arrays
- # GETARG --- arguments to CGI GET command
- # MENU --- menu items (path names)
- # PARAM --- parameters of form x=y
-
- # Optional variable MyHost contains host address
- # Optional variable MyPort contains port number
- # Needs TopHeader, TopDoc, TopFooter
- # Sets MyPrefix, HttpService, Status, Reason
-
- BEGIN {
- if (MyHost == "") {
- "uname -n" | getline MyHost
- close("uname -n")
- }
- if (MyPort == 0) MyPort = 8080
- HttpService = "/inet/tcp/" MyPort "/0/0"
- MyPrefix = "http://" MyHost ":" MyPort
- SetUpServer()
- while ("awk" != "complex") {
- # header lines are terminated this way
- RS = ORS = "\r\n"
- Status = 200 # this means OK
- Reason = "OK"
- Header = TopHeader
- Document = TopDoc
- Footer = TopFooter
- if (GETARG["Method"] == "GET") {
- HandleGET()
- } else if (GETARG["Method"] == "HEAD") {
- # not yet implemented
- } else if (GETARG["Method"] != "") {
- print "bad method", GETARG["Method"]
- }
- Prompt = Header Document Footer
- print "HTTP/1.0", Status, Reason |& HttpService
- print "Connection: Close" |& HttpService
- print "Pragma: no-cache" |& HttpService
- len = length(Prompt) + length(ORS)
- print "Content-length:", len |& HttpService
- print ORS Prompt |& HttpService
- # ignore all the header lines
- while ((HttpService |& getline) > 0)
- continue
- # stop talking to this client
- close(HttpService)
- # wait for new client request
- HttpService |& getline
- # do some logging
- print systime(), strftime(), $0
- CGI_setup($1, $2, $3)
- }
- }
-
- function CGI_setup( method, uri, version, i)
- {
- delete GETARG
- delete MENU
- delete PARAM
- GETARG["Method"] = method
- GETARG["URI"] = uri
- GETARG["Version"] = version
-
- i = index(uri, "?")
- if (i > 0) { # is there a "?" indicating a CGI request?
- split(substr(uri, 1, i-1), MENU, "[/:]")
- split(substr(uri, i+1), PARAM, "&")
- for (i in PARAM) {
- PARAM[i] = _CGI_decode(PARAM[i])
- j = index(PARAM[i], "=")
- GETARG[substr(PARAM[i], 1, j-1)] = \
- substr(PARAM[i], j+1)
- }
- } else { # there is no "?", no need for splitting PARAMs
- split(uri, MENU, "[/:]")
- }
- for (i in MENU) # decode characters in path
- if (i > 4) # but not those in host name
- MENU[i] = _CGI_decode(MENU[i])
- }
-
- This isolates details in a single function, 'CGI_setup()'. Decoding
-of encoded characters is pushed off to a helper function,
-'_CGI_decode()'. The use of the leading underscore ('_') in the
-function name is intended to indicate that it is an "internal" function,
-although there is nothing to enforce this:
-
- function _CGI_decode(str, hexdigs, i, pre, code1, code2,
- val, result)
- {
- hexdigs = "123456789abcdef"
-
- i = index(str, "%")
- if (i == 0) # no work to do
- return str
-
- do {
- pre = substr(str, 1, i-1) # part before %xx
- code1 = substr(str, i+1, 1) # first hex digit
- code2 = substr(str, i+2, 1) # second hex digit
- str = substr(str, i+3) # rest of string
-
- code1 = tolower(code1)
- code2 = tolower(code2)
- val = index(hexdigs, code1) * 16 \
- + index(hexdigs, code2)
-
- result = result pre sprintf("%c", val)
- i = index(str, "%")
- } while (i != 0)
- if (length(str) > 0)
- result = result str
- return result
- }
-
- This works by splitting the string apart around an encoded character.
-The two digits are converted to lowercase characters and looked up in a
-string of hex digits. Note that '0' is not in the string on purpose;
-'index()' returns zero when it's not found, automatically giving the
-correct value! Once the hexadecimal value is converted from characters
-in a string into a numerical value, 'sprintf()' converts the value back
-into a real character. The following is a simple test harness for the
-above functions:
-
- BEGIN {
- CGI_setup("GET",
- "http://www.gnu.org/cgi-bin/foo?p1=stuff&p2=stuff%26junk" \
- "&percent=a %25 sign",
- "1.0")
- for (i in MENU)
- printf "MENU[\"%s\"] = %s\n", i, MENU[i]
- for (i in PARAM)
- printf "PARAM[\"%s\"] = %s\n", i, PARAM[i]
- for (i in GETARG)
- printf "GETARG[\"%s\"] = %s\n", i, GETARG[i]
- }
-
- And this is the result when we run it:
-
- $ gawk -f testserv.awk
- -| MENU["4"] = www.gnu.org
- -| MENU["5"] = cgi-bin
- -| MENU["6"] = foo
- -| MENU["1"] = http
- -| MENU["2"] =
- -| MENU["3"] =
- -| PARAM["1"] = p1=stuff
- -| PARAM["2"] = p2=stuff&junk
- -| PARAM["3"] = percent=a % sign
- -| GETARG["p1"] = stuff
- -| GETARG["percent"] = a % sign
- -| GETARG["p2"] = stuff&junk
- -| GETARG["Method"] = GET
- -| GETARG["Version"] = 1.0
- -| GETARG["URI"] = http://www.gnu.org/cgi-bin/foo?p1=stuff&
- p2=stuff%26junk&percent=a %25 sign
-
-
-File: gawkinet.info, Node: Simple Server, Next: Caveats, Prev: Interacting Service, Up: Using Networking
-
-2.10 A Simple Web Server
-========================
-
-In the preceding node, we built the core logic for event-driven GUIs.
-In this node, we finally extend the core to a real application. No one
-would actually write a commercial web server in 'gawk', but it is
-instructive to see that it is feasible in principle.
-
- The application is ELIZA, the famous program by Joseph Weizenbaum
-that mimics the behavior of a professional psychotherapist when talking
-to you. Weizenbaum would certainly object to this description, but this
-is part of the legend around ELIZA. Take the site-independent core logic
-and append the following code:
-
- function SetUpServer() {
- SetUpEliza()
- TopHeader = \
- "<HTML><title>An HTTP-based System with GAWK</title>\
- <HEAD><META HTTP-EQUIV=\"Content-Type\"\
- CONTENT=\"text/html; charset=iso-8859-1\"></HEAD>\
- <BODY BGCOLOR=\"#ffffff\" TEXT=\"#000000\"\
- LINK=\"#0000ff\" VLINK=\"#0000ff\"\
- ALINK=\"#0000ff\"> <A NAME=\"top\">"
- TopDoc = "\
- <h2>Please choose one of the following actions:</h2>\
- <UL>\
- <LI>\
- <A HREF=" MyPrefix "/AboutServer>About this server</A>\
- </LI><LI>\
- <A HREF=" MyPrefix "/AboutELIZA>About Eliza</A></LI>\
- <LI>\
- <A HREF=" MyPrefix \
- "/StartELIZA>Start talking to Eliza</A></LI></UL>"
- TopFooter = "</BODY></HTML>"
- }
-
- 'SetUpServer()' is similar to the previous example, except for
-calling another function, 'SetUpEliza()'. This approach can be used to
-implement other kinds of servers. The only changes needed to do so are
-hidden in the functions 'SetUpServer()' and 'HandleGET()'.  It might
-also be necessary to implement other HTTP methods.  The 'igawk' program
-that comes with 'gawk' may be useful for this process.
-
- When extending this example to a complete application, the first
-thing to do is to implement the function 'SetUpServer()' to initialize
-the HTML pages and some variables. These initializations determine the
-way your HTML pages look (colors, titles, menu items, etc.).
-
- The function 'HandleGET()' is a nested case selection that decides
-which page the user wants to see next. Each nesting level refers to a
-menu level of the GUI. Each case implements a certain action of the
-menu. On the deepest level of case selection, the handler essentially
-knows what the user wants and stores the answer into the variable that
-holds the HTML page contents:
-
- function HandleGET() {
- # A real HTTP server would treat some parts of the URI as a file name.
- # We take parts of the URI as menu choices and go on accordingly.
- if(MENU[2] == "AboutServer") {
- Document = "This is not a CGI script.\
- This is an httpd, an HTML file, and a CGI script all \
- in one GAWK script. It needs no separate www-server, \
- no installation, and no root privileges.\
- <p>To run it, do this:</p><ul>\
- <li> start this script with \"gawk -f httpserver.awk\",</li>\
- <li> and on the same host let your www browser open location\
- \"http://localhost:8080\"</li>\
- </ul>\<p>\ Details of HTTP come from:</p><ul>\
- <li>Hethmon: Illustrated Guide to HTTP</p>\
- <li>RFC 2068</li></ul><p>JK 14.9.1997</p>"
- } else if (MENU[2] == "AboutELIZA") {
- Document = "This is an implementation of the famous ELIZA\
- program by Joseph Weizenbaum. It is written in GAWK and\
- uses an HTML GUI."
- } else if (MENU[2] == "StartELIZA") {
- gsub(/\+/, " ", GETARG["YouSay"])
- # Here we also have to substitute coded special characters
- Document = "<form method=GET>" \
- "<h3>" ElizaSays(GETARG["YouSay"]) "</h3>\
- <p><input type=text name=YouSay value=\"\" size=60>\
- <br><input type=submit value=\"Tell her about it\"></p></form>"
- }
- }
-
- Now we are down to the heart of ELIZA, so you can see how it works.
-Initially, the user does not say anything; ELIZA then resets its money
-counter and asks the user to say openly whatever comes to mind.
-The subsequent answers are converted to uppercase characters and stored
-for later comparison. ELIZA presents the bill when being confronted
-with a sentence that contains the phrase "shut up." Otherwise, it looks
-for keywords in the sentence, conjugates the rest of the sentence,
-remembers the keyword for later use, and finally selects an answer from
-the set of possible answers:
-
- function ElizaSays(YouSay) {
- if (YouSay == "") {
- cost = 0
- answer = "HI, IM ELIZA, TELL ME YOUR PROBLEM"
- } else {
- q = toupper(YouSay)
- gsub("'", "", q)
- if(q == qold) {
- answer = "PLEASE DONT REPEAT YOURSELF !"
- } else {
- if (index(q, "SHUT UP") > 0) {
- answer = "WELL, PLEASE PAY YOUR BILL. ITS EXACTLY ... $"\
- int(100*rand()+30+cost/100)
- } else {
- qold = q
- w = "-" # no keyword recognized yet
- for (i in k) { # search for keywords
- if (index(q, i) > 0) {
- w = i
- break
- }
- }
- if (w == "-") { # no keyword, take old subject
- w = wold
- subj = subjold
- } else { # find subject
- subj = substr(q, index(q, w) + length(w)+1)
- wold = w
- subjold = subj # remember keyword and subject
- }
- for (i in conj)
- gsub(i, conj[i], q) # conjugation
- # from all answers to this keyword, select one randomly
- answer = r[indices[int(split(k[w], indices) * rand()) + 1]]
- # insert subject into answer
- gsub("_", subj, answer)
- }
- }
- }
- cost += length(answer) # for later payment : 1 cent per character
- return answer
- }
-
- In the long but simple function 'SetUpEliza()', you can see tables
-for conjugation, keywords, and answers.(1) The associative array 'k'
-contains indices into the array of answers 'r'. To choose an answer,
-ELIZA just picks an index randomly:
-
- function SetUpEliza() {
- srand()
- wold = "-"
- subjold = " "
-
- # table for conjugation
- conj[" ARE " ] = " AM "
- conj["WERE " ] = "WAS "
- conj[" YOU " ] = " I "
- conj["YOUR " ] = "MY "
- conj[" IVE " ] =\
- conj[" I HAVE " ] = " YOU HAVE "
- conj[" YOUVE " ] =\
- conj[" YOU HAVE "] = " I HAVE "
- conj[" IM " ] =\
- conj[" I AM " ] = " YOU ARE "
- conj[" YOURE " ] =\
- conj[" YOU ARE " ] = " I AM "
-
- # table of all answers
- r[1] = "DONT YOU BELIEVE THAT I CAN _"
- r[2] = "PERHAPS YOU WOULD LIKE TO BE ABLE TO _ ?"
- ...
-
- # table for looking up answers that
- # fit to a certain keyword
- k["CAN YOU"] = "1 2 3"
- k["CAN I"] = "4 5"
- k["YOU ARE"] =\
- k["YOURE"] = "6 7 8 9"
- ...
- }
-
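-   A hypothetical way to try 'ElizaSays()' without the web server part
-is a tiny test harness like the following, assuming the complete
-'SetUpEliza()' and 'ElizaSays()' are supplied in a file 'eliza.awk'
-(the file names are made up):
-
-     # run with:  gawk -f eliza.awk -f test-eliza.awk
-     BEGIN {
-         SetUpEliza()
-         print ElizaSays("")                  # the opening line
-         print ElizaSays("CAN YOU HELP ME")   # keyword "CAN YOU"
-         print ElizaSays("CAN YOU HELP ME")   # repeated input
-         print ElizaSays("SHUT UP")           # ask for the bill
-     }
-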
- Some interesting remarks and details (including the original source
-code of ELIZA) are found on Mark Humphrys' home page. Yahoo! also has
-a page with a collection of ELIZA-like programs. Many of them are
-written in Java, some of them disclosing the Java source code, and a few
-even explain how to modify the Java source code.
-
- ---------- Footnotes ----------
-
- (1) The version shown here is abbreviated. The full version comes
-with the 'gawk' distribution.
-
-
-File: gawkinet.info, Node: Caveats, Next: Challenges, Prev: Simple Server, Up: Using Networking
-
-2.11 Network Programming Caveats
-================================
-
-By now it should be clear that debugging a networked application is more
-complicated than debugging a single-process single-hosted application.
-The behavior of a networked application sometimes looks noncausal
-because it is not reproducible in a strong sense. Whether a network
-application works or not sometimes depends on the following:
-
- * How crowded the underlying network is
-
- * If the party at the other end is running or not
-
- * The state of the party at the other end
-
- The most difficult problems for a beginner arise from the hidden
-states of the underlying network. After closing a TCP connection, it's
-often necessary to wait a short while before reopening the connection.
-Even more difficult is the establishment of a connection that previously
-ended with a "broken pipe." Those connections have to "time out" for a
-minute or so before they can reopen. Check this with the command
-'netstat -a', which provides a list of still "active" connections.
-
-
-File: gawkinet.info, Node: Challenges, Prev: Caveats, Up: Using Networking
-
-2.12 Where To Go From Here
-==========================
-
-Now, you have learned enough to build your own application. You could,
-for example, take part in the Loebner Contest to win $100,000. The
-Loebner Prize is the first formal instantiation of a Turing Test. Hugh
-Loebner agreed with The Cambridge Center for Behavioral Studies to
-underwrite a contest designed to implement the Turing Test. Dr. Loebner
-pledged a Grand Prize of $100,000 for the first computer whose responses
-were indistinguishable from a human's. Each year an annual prize of
-$2000 and a bronze medal is awarded to the _most_ human computer. The
-winner of the annual contest is the best entry relative to other entries
-that year, irrespective of how good it is in an absolute sense. Here is
-an example of a conversation with the winning program of 1997:
-
- (c)1997 Cambridge Center For Behavioral Studies all rights reserved [CONVERSE]
- [Intelligent Research Ltd.] Start at: [1997/04/29 08:59:40]
- PROGRAM[10:39:20]Hi, my name is Catherine, what's yours?
- JUDGE02[10:39:36]My name is Nathan.
- PROGRAM[10:39:37]Well, it's nice to meet you, Nathan.
- PROGRAM[10:39:45]Did you see that story on CNN last night about the lesbian
- PROGRAM[10:40:03]couple who came out at a White House party on Sunday?
- JUDGE02[10:40:45]No. I'm sorry I missed that. How did the Prez take it?
- PROGRAM[10:40:47]Ellen Degeneres was one of them - she was kissing her lover
- PROGRAM[10:41:08]in front of Bill Clinton. Do you think he was very
- PROGRAM[10:41:21]embarrassed?
- JUDGE02[10:41:37]You tell me. I didn't see it.
- PROGRAM[10:41:38]Are you still a Friend of Bill Clinton?
- JUDGE02[10:42:22]Yes. I would like to talk about something else. Do you know anything about sculpture.
- PROGRAM[10:42:24]Well, I suppose he's doing his best like most of us, but he
- PROGRAM[10:42:35]does seem to attract an UNUSUAL amount of bad publicity,
- PROGRAM[10:42:51]wouldnt you say--can it all be bad luck?
- JUDGE02[10:43:20]I asked you if you knew anything about sculpture.
- PROGRAM[10:43:21]OK, but what happens now--does it all drift on, with
- PROGRAM[10:43:36]everyone waiting to see how much his friends have said about
- PROGRAM[10:43:47]him under threat of lots and lots of jail-or will something
- PROGRAM[10:44:01]dramatic actually happen?
- JUDGE02[10:44:19]Do you know what sculpture is?
- PROGRAM[10:44:20]Well, youre more a fan of his than I am I guess---though I
- PROGRAM[10:44:34]cant SWEAR what Id have done in that locked hotel room, I
- PROGRAM[10:44:51]must say.
- JUDGE02[10:45:26]Have you ever been to the Salmagundi Club?
- PROGRAM[10:45:29]Never. to go back to the original discussion Time to get
- PROGRAM[10:45:43]off this one I think and onto something more down to earth!
- JUDGE02[10:46:20]What is more down to earth than art?
- PROGRAM[10:46:21]I don't know what -- to be honest. -
- PROGRAM[10:46:30]I suppose it depends on where you're coming from, but as the
- PROGRAM[10:46:46]song goes in My Fair Lady, "Why can't a woman be more like a
- PROGRAM[10:47:03]man? "
- JUDGE02[10:47:52]I liked that song but I prefer "Why can't children be more like we are"
- JUDGE02[10:48:10]from another play the name of which I cannot remember.
- PROGRAM[10:48:12]What, if anything, do you enjoy about being a man?
- JUDGE02[10:49:01]I would enjoy all aspects of life no matter what my gender.
- PROGRAM[10:49:02]Do you think of yourself as being attractive?
-
- This program insists on always speaking about the same story around
-Bill Clinton. You see, even a program with a rather narrow mind can
-behave so much like a human being that it can win this prize. It is
-quite common to let these programs talk to each other via network
-connections. But during the competition itself, the program and its
-computer have to be present at the place the competition is held. We
-all would love to see a 'gawk' program win in such an event. Maybe it
-is up to you to accomplish this?
-
- Some other ideas for useful networked applications:
- * Read the file 'doc/awkforai.txt' in the 'gawk' distribution. It
- was written by Ronald P. Loui (at the time, Associate Professor of
- Computer Science, at Washington University in St. Louis,
- <loui@ai.wustl.edu>) and summarizes why he taught 'gawk' to
- students of Artificial Intelligence. Here are some passages from
- the text:
-
- The GAWK manual can be consumed in a single lab session and
- the language can be mastered by the next morning by the
- average student. GAWK's automatic initialization, implicit
- coercion, I/O support and lack of pointers forgive many of the
- mistakes that young programmers are likely to make. Those who
- have seen C but not mastered it are happy to see that GAWK
- retains some of the same sensibilities while adding what must
- be regarded as spoonsful of syntactic sugar.
- ...
- There are further simple answers. Probably the best is the
- fact that increasingly, undergraduate AI programming is
- involving the Web. Oren Etzioni (University of Washington,
- Seattle) has for a while been arguing that the "softbot" is
- replacing the mechanical engineers' robot as the most
- glamorous AI testbed. If the artifact whose behavior needs to
- be controlled in an intelligent way is the software agent,
- then a language that is well-suited to controlling the
- software environment is the appropriate language. That would
- imply a scripting language. If the robot is KAREL, then the
- right language is "turn left; turn right." If the robot is
- Netscape, then the right language is something that can
- generate 'netscape -remote
- 'openURL(http://cs.wustl.edu/~loui)'' with elan.
- ...
- AI programming requires high-level thinking. There have
- always been a few gifted programmers who can write high-level
- programs in assembly language. Most however need the ambient
- abstraction to have a higher floor.
- ...
- Second, inference is merely the expansion of notation. No
- matter whether the logic that underlies an AI program is
- fuzzy, probabilistic, deontic, defeasible, or deductive, the
- logic merely defines how strings can be transformed into other
- strings. A language that provides the best support for string
- processing in the end provides the best support for logic, for
- the exploration of various logics, and for most forms of
- symbolic processing that AI might choose to call "reasoning"
- instead of "logic." The implication is that PROLOG, which
- saves the AI programmer from having to write a unifier, saves
- perhaps two dozen lines of GAWK code at the expense of
- strongly biasing the logic and representational expressiveness
- of any approach.
-
- Now that 'gawk' itself can connect to the Internet, it should be
- obvious that it is suitable for writing intelligent web agents.
-
- * 'awk' is strong at pattern recognition and string processing. So,
- it is well suited to the classic problem of language translation.
- A first try could be a program that knows the 100 most frequent
- English words and their counterparts in German or French. The
- service could be implemented by regularly reading email with the
- program above, replacing each word by its translation and sending
- the translation back via SMTP. Users would send English email to
- their translation service and get back a translated email message
- in return. As soon as this works, more effort can be spent on a
- real translation program.
-
- * Another dialogue-oriented application (on the verge of ridicule) is
- the email "support service." Troubled customers write an email to
- an automatic 'gawk' service that reads the email. It looks for
- keywords in the mail and assembles a reply email accordingly. By
- carefully investigating the email header, and repeating these
- keywords through the reply email, it is rather simple to give the
- customer a feeling that someone cares. Ideally, such a service
- would search a database of previous cases for solutions. If none
- exists, the database could, for example, consist of all the
- newsgroups, mailing lists and FAQs on the Internet.
-
-
-File: gawkinet.info, Node: Some Applications and Techniques, Next: Links, Prev: Using Networking, Up: Top
-
-3 Some Applications and Techniques
-**********************************
-
-In this major node, we look at a number of self-contained scripts, with
-an emphasis on concise networking. Along the way, we work towards
-creating building blocks that encapsulate often needed functions of the
-networking world, show new techniques that broaden the scope of problems
-that can be solved with 'gawk', and explore leading edge technology that
-may shape the future of networking.
-
- We often refer to the site-independent core of the server that we
-built in *note A Simple Web Server: Simple Server. When building new
-and nontrivial servers, we always copy this building block and append
-new instances of the two functions 'SetUpServer()' and 'HandleGET()'.
-
- This makes a lot of sense, since this scheme of event-driven
-execution provides 'gawk' with an interface to the most widely accepted
-standard for GUIs: the web browser. Now, 'gawk' can rival even Tcl/Tk.
-
- Tcl and 'gawk' have much in common. Both are simple scripting
-languages that allow us to quickly solve problems with short programs.
-But Tcl has Tk on top of it, and 'gawk' had nothing comparable up to
-now. While Tcl needs a large and ever-changing library (Tk, which was
-bound to the X Window System until recently), 'gawk' needs just the
-networking interface and some kind of browser on the client's side.
-Besides better portability, the most important advantage of this
-approach (embracing well-established standards such as HTTP and HTML) is
-that _we do not need to change the language_. We let others do the work
-of fighting over protocols and standards. We can use HTML, JavaScript,
-VRML, or whatever else comes along to do our work.
-
-* Menu:
-
-* PANIC:: An Emergency Web Server.
-* GETURL:: Retrieving Web Pages.
-* REMCONF:: Remote Configuration Of Embedded Systems.
-* URLCHK:: Look For Changed Web Pages.
-* WEBGRAB:: Extract Links From A Page.
-* STATIST:: Graphing A Statistical Distribution.
-* MAZE:: Walking Through A Maze In Virtual Reality.
-* MOBAGWHO:: A Simple Mobile Agent.
-* STOXPRED:: Stock Market Prediction As A Service.
-* PROTBASE:: Searching Through A Protein Database.
-
-
-File: gawkinet.info, Node: PANIC, Next: GETURL, Prev: Some Applications and Techniques, Up: Some Applications and Techniques
-
-3.1 PANIC: An Emergency Web Server
-==================================
-
-At first glance, the '"Hello, world"' example in *note A Primitive Web
-Service: Primitive Service, seems useless. By adding just a few lines,
-we can turn it into something useful.
-
- The PANIC program tells everyone who connects that the local site is
-not working. When a web server breaks down, it makes a difference if
-customers get a strange "network unreachable" message, or a short
-message telling them that the server has a problem. In such an
-emergency, the hard disk and everything on it (including the regular web
-service) may be unavailable. Rebooting the web server off a diskette
-makes sense in this setting.
-
- To use the PANIC program as an emergency web server, all you need are
-the 'gawk' executable and the program below on a diskette. By default,
-it connects to port 8080. A different value may be supplied on the
-command line:
-
- BEGIN {
- RS = ORS = "\r\n"
- if (MyPort == 0) MyPort = 8080
- HttpService = "/inet/tcp/" MyPort "/0/0"
- Hello = "<HTML><HEAD><TITLE>Out Of Service</TITLE>" \
- "</HEAD><BODY><H1>" \
- "This site is temporarily out of service." \
- "</H1></BODY></HTML>"
- Len = length(Hello) + length(ORS)
- while ("awk" != "complex") {
- print "HTTP/1.0 200 OK" |& HttpService
- print "Content-Length: " Len ORS |& HttpService
- print Hello |& HttpService
- while ((HttpService |& getline) > 0)
- continue;
- close(HttpService)
- }
- }
-
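-   For instance, assuming the program above is saved as 'panic.awk'
-(the file name is our choice), it can be started on another port like
-this; then point a browser to <http://localhost:8888>:
-
-     gawk -v MyPort=8888 -f panic.awk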
-
-File: gawkinet.info, Node: GETURL, Next: REMCONF, Prev: PANIC, Up: Some Applications and Techniques
-
-3.2 GETURL: Retrieving Web Pages
-================================
-
-GETURL is a versatile building block for shell scripts that need to
-retrieve files from the Internet. It takes a web address as a
-command-line parameter and tries to retrieve the contents of this
-address. The contents are printed to standard output, while the header
-is printed to '/dev/stderr'. A surrounding shell script could analyze
-the contents and extract the text or the links. An ASCII browser could
-be written around GETURL. But more interestingly, web robots are
-straightforward to write on top of GETURL. On the Internet, you can find
-several programs of the same name that do the same job. They are
-usually much more complex internally and at least 10 times longer.
-
- At first, GETURL checks if it was called with exactly one web
-address. Then, it checks if the user chose to use a special proxy
-server whose name is handed over in a variable. By default, it is
-assumed that the local machine serves as proxy. GETURL uses the 'GET'
-method by default to access the web page. By handing over the name of a
-different method (such as 'HEAD'), it is possible to choose a different
-behavior. With the 'HEAD' method, the user does not receive the body of
-the page content, but does receive the header:
-
- BEGIN {
- if (ARGC != 2) {
- print "GETURL - retrieve Web page via HTTP 1.0"
- print "IN:\n the URL as a command-line parameter"
- print "PARAM(S):\n -v Proxy=MyProxy"
- print "OUT:\n the page content on stdout"
- print " the page header on stderr"
- print "JK 16.05.1997"
- print "ADR 13.08.2000"
- exit
- }
- URL = ARGV[1]; ARGV[1] = ""
- if (Proxy == "") Proxy = "127.0.0.1"
- if (ProxyPort == 0) ProxyPort = 80
- if (Method == "") Method = "GET"
- HttpService = "/inet/tcp/0/" Proxy "/" ProxyPort
- ORS = RS = "\r\n\r\n"
- print Method " " URL " HTTP/1.0" |& HttpService
- HttpService |& getline Header
- print Header > "/dev/stderr"
- while ((HttpService |& getline) > 0)
- printf "%s", $0
- close(HttpService)
- }
-
- This program can be changed as needed, but be careful with the last
-lines. Make sure transmission of binary data is not corrupted by
-additional line breaks. Even as it is now, the byte sequence
-'"\r\n\r\n"' would disappear if it were contained in binary data. Don't
-get caught in a trap when trying a quick fix on this one.
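-
-   A typical invocation, with the page content going into a file while
-the header appears on the screen, might look like this
-('proxy.example.com', the port, and the output file name are
-placeholders for whatever you actually use):
-
-     gawk -v Proxy=proxy.example.com -v ProxyPort=3128 \
-          -f geturl.awk http://www.gnu.org/ > gnu.html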
-
-
-File: gawkinet.info, Node: REMCONF, Next: URLCHK, Prev: GETURL, Up: Some Applications and Techniques
-
-3.3 REMCONF: Remote Configuration of Embedded Systems
-=====================================================
-
-Today, you often find powerful processors in embedded systems.
-Dedicated network routers and controllers for all kinds of machinery are
-examples of embedded systems. Processors like the Intel 80x86 or the
-AMD Elan are able to run multitasking operating systems, such as XINU or
-GNU/Linux in embedded PCs. These systems are small and usually do not
-have a keyboard or a display. Therefore it is difficult to set up their
-configuration. There are several widespread ways to set them up:
-
- * DIP switches
-
- * Read Only Memories such as EPROMs
-
- * Serial lines or some kind of keyboard
-
- * Network connections via 'telnet' or SNMP
-
- * HTTP connections with HTML GUIs
-
- In this node, we look at a solution that uses HTTP connections to
-control variables of an embedded system that are stored in a file.
-Since embedded systems have tight limits on resources like memory, it is
-difficult to employ advanced techniques such as SNMP and HTTP servers.
-'gawk' fits in quite nicely with its single executable which needs just
-a short script to start working. The following program stores the
-variables in a file, and a concurrent process in the embedded system may
-read the file. The program uses the site-independent part of the simple
-web server that we developed in *note A Web Service with Interaction:
-Interacting Service. As mentioned there, all we have to do is to write
-two new procedures 'SetUpServer()' and 'HandleGET()':
-
- function SetUpServer() {
- TopHeader = "<HTML><title>Remote Configuration</title>"
- TopDoc = "<BODY>\
- <h2>Please choose one of the following actions:</h2>\
- <UL>\
- <LI><A HREF=" MyPrefix "/AboutServer>About this server</A></LI>\
- <LI><A HREF=" MyPrefix "/ReadConfig>Read Configuration</A></LI>\
- <LI><A HREF=" MyPrefix "/CheckConfig>Check Configuration</A></LI>\
- <LI><A HREF=" MyPrefix "/ChangeConfig>Change Configuration</A></LI>\
- <LI><A HREF=" MyPrefix "/SaveConfig>Save Configuration</A></LI>\
- </UL>"
- TopFooter = "</BODY></HTML>"
- if (ConfigFile == "") ConfigFile = "config.asc"
- }
-
- The function 'SetUpServer()' initializes the top level HTML texts as
-usual. It also initializes the name of the file that contains the
-configuration parameters and their values. In case the user supplies a
-name from the command line, that name is used. The file is expected to
-contain one parameter per line, with the name of the parameter in column
-one and the value in column two.
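-
-   For instance, a hypothetical 'config.asc' (the parameter names are
-made up) could look like this:
-
-     BaudRate  38400
-     Device    /dev/ttyS0
-     LogLevel  2
-
-Assuming the server core and the two functions from this node are
-combined in a file called 'remconf.awk' (the name is our choice), a
-different configuration file can be selected on the command line:
-
-     gawk -v ConfigFile=router.asc -f remconf.awk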
-
- The function 'HandleGET()' reflects the structure of the menu tree as
-usual. The first menu choice tells the user what this is all about.
-The second choice reads the configuration file line by line and stores
-the parameters and their values. Notice that the record separator for
-this file is '"\n"', in contrast to the record separator for HTTP. The
-third menu choice builds an HTML table to show the contents of the
-configuration file just read. The fourth choice does the real work of
-changing parameters, and the last one just saves the configuration into
-a file:
-
- function HandleGET() {
- if(MENU[2] == "AboutServer") {
- Document = "This is a GUI for remote configuration of an\
-          embedded system. It is implemented as one GAWK script."
- } else if (MENU[2] == "ReadConfig") {
- RS = "\n"
- while ((getline < ConfigFile) > 0)
- config[$1] = $2;
- close(ConfigFile)
- RS = "\r\n"
- Document = "Configuration has been read."
- } else if (MENU[2] == "CheckConfig") {
- Document = "<TABLE BORDER=1 CELLPADDING=5>"
- for (i in config)
- Document = Document "<TR><TD>" i "</TD>" \
- "<TD>" config[i] "</TD></TR>"
- Document = Document "</TABLE>"
- } else if (MENU[2] == "ChangeConfig") {
- if ("Param" in GETARG) { # any parameter to set?
- if (GETARG["Param"] in config) { # is parameter valid?
- config[GETARG["Param"]] = GETARG["Value"]
- Document = (GETARG["Param"] " = " GETARG["Value"] ".")
- } else {
- Document = "Parameter <b>" GETARG["Param"] "</b> is invalid."
- }
- } else {
- Document = "<FORM method=GET><h4>Change one parameter</h4>\
- <TABLE BORDER CELLPADDING=5>\
- <TR><TD>Parameter</TD><TD>Value</TD></TR>\
- <TR><TD><input type=text name=Param value=\"\" size=20></TD>\
- <TD><input type=text name=Value value=\"\" size=40></TD>\
- </TR></TABLE><input type=submit value=\"Set\"></FORM>"
- }
- } else if (MENU[2] == "SaveConfig") {
- for (i in config)
- printf("%s %s\n", i, config[i]) > ConfigFile
- close(ConfigFile)
- Document = "Configuration has been saved."
- }
- }
-
- We could also view the configuration file as a database. From this
-point of view, the previous program acts like a primitive database
-server. Real SQL database systems also make a service available by
-providing a TCP port that clients can connect to. But the application
-level protocols they use are usually proprietary and also change from
-time to time. This is also true for the protocol that MiniSQL uses.
-
-
-File: gawkinet.info, Node: URLCHK, Next: WEBGRAB, Prev: REMCONF, Up: Some Applications and Techniques
-
-3.4 URLCHK: Look for Changed Web Pages
-======================================
-
-Most people who make heavy use of Internet resources have a large
-bookmark file with pointers to interesting web sites. It is impossible
-to regularly check by hand if any of these sites have changed. A
-program is needed to automatically look at the headers of web pages and
-tell which ones have changed. URLCHK does the comparison after using
-GETURL with the 'HEAD' method to retrieve the header.
-
- Like GETURL, this program first checks that it is called with exactly
-one command-line parameter. URLCHK also takes the same command-line
-variables 'Proxy' and 'ProxyPort' as GETURL, because these variables are
-handed over to GETURL for each URL that gets checked. The one and only
-parameter is the name of a file that contains one line for each URL. In
-the first column, we find the URL, and the second and third columns hold
-the length of the URL's body when checked for the two last times. Now,
-we follow this plan:
-
- 1. Read the URLs from the file and remember their most recent lengths
-
- 2. Delete the contents of the file
-
- 3. For each URL, check its new length and write it into the file
-
- 4. If the most recent and the new length differ, tell the user
-
- It may seem a bit peculiar to read the URLs from a file together with
-their two most recent lengths, but this approach has several advantages.
-You can call the program again and again with the same file. After
-running the program, you can regenerate the changed URLs by extracting
-those lines that differ in their second and third columns:
-
- BEGIN {
- if (ARGC != 2) {
- print "URLCHK - check if URLs have changed"
- print "IN:\n the file with URLs as a command-line parameter"
- print " file contains URL, old length, new length"
- print "PARAMS:\n -v Proxy=MyProxy -v ProxyPort=8080"
- print "OUT:\n same as file with URLs"
- print "JK 02.03.1998"
- exit
- }
- URLfile = ARGV[1]; ARGV[1] = ""
- if (Proxy != "") Proxy = " -v Proxy=" Proxy
- if (ProxyPort != "") ProxyPort = " -v ProxyPort=" ProxyPort
- while ((getline < URLfile) > 0)
- Length[$1] = $3 + 0
- close(URLfile) # now, URLfile is read in and can be updated
- GetHeader = "gawk " Proxy ProxyPort " -v Method=\"HEAD\" -f geturl.awk "
- for (i in Length) {
- GetThisHeader = GetHeader i " 2>&1"
- while ((GetThisHeader | getline) > 0)
- if (toupper($0) ~ /CONTENT-LENGTH/) NewLength = $2 + 0
- close(GetThisHeader)
- print i, Length[i], NewLength > URLfile
- if (Length[i] != NewLength) # report only changed URLs
- print i, Length[i], NewLength
- }
- close(URLfile)
- }
-
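-   The URL file itself is plain text with one URL per line, followed by
-the two most recently seen content lengths.  A hypothetical example
-(the numbers are made up) might look like this:
-
-     http://www.gnu.org/   23853 23853
-     http://www.suse.de/   10183 10512
-
-Afterwards, assuming the file is called 'urls.txt', the entries whose
-lengths differ can be picked out with a one-liner such as:
-
-     gawk '$2 != $3' urls.txt
-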
- Another thing that may look strange is the way GETURL is called.
-Before calling GETURL, we have to check if the proxy variables need to
-be passed on. If so, we prepare strings that will become part of the
-command line later.  In the variable 'GetHeader', we store these strings
-together with the longest part of the command line.  Later, in the loop
-over the URLs, the URL and a redirection operator are appended to
-'GetHeader' to form the command that reads the URL's header over the
-Internet.
-GETURL always produces the headers over '/dev/stderr'. That is the
-reason why we need the redirection operator to have the header piped in.
-
- This program is not perfect because it assumes that changing URLs
-results in changed lengths, which is not necessarily true. A more
-advanced approach is to look at some other header line that holds time
-information. But, as always when things get a bit more complicated,
-this is left as an exercise to the reader.
-
-
-File: gawkinet.info, Node: WEBGRAB, Next: STATIST, Prev: URLCHK, Up: Some Applications and Techniques
-
-3.5 WEBGRAB: Extract Links from a Page
-======================================
-
-Sometimes it is necessary to extract links from web pages. Browsers do
-it, web robots do it, and sometimes even humans do it. Since we have a
-tool like GETURL at hand, we can solve this problem with some help from
-the Bourne shell:
-
- BEGIN { RS = "http://[#%&\\+\\-\\./0-9\\:;\\?A-Z_a-z\\~]*" }
- RT != "" {
- command = ("gawk -v Proxy=MyProxy -f geturl.awk " RT \
- " > doc" NR ".html")
- print command
- }
-
- Notice that the regular expression for URLs is rather crude. A
-precise regular expression is much more complex. But this one works
-rather well. One problem is that it is unable to find internal links of
-an HTML document. Another problem is that 'ftp', 'telnet', 'news',
-'mailto', and other kinds of links are missing in the regular
-expression. However, it is straightforward to add them, if doing so is
-necessary for other tasks.
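-
-   For instance, a hypothetical variant that also catches 'ftp' links
-only needs a slightly longer record separator:
-
-     BEGIN { RS = "(http|ftp)://[#%&\\+\\-\\./0-9\\:;\\?A-Z_a-z\\~]*" }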
-
- This program reads an HTML file and prints all the HTTP links that it
-finds. It relies on 'gawk''s ability to use regular expressions as
-record separators. With 'RS' set to a regular expression that matches
-links, the second action is executed each time a non-empty link is
-found. We can find the matching link itself in 'RT'.
-
- The action could use the 'system()' function to let another GETURL
-retrieve the page, but here we use a different approach. This simple
-program prints shell commands that can be piped into 'sh' for execution.
-This way it is possible to first extract the links, wrap shell commands
-around them, and pipe all the shell commands into a file. After editing
-the file, execution of the file retrieves exactly those files that we
-really need. In case we do not want to edit, we can retrieve all the
-pages like this:
-
- gawk -f geturl.awk http://www.suse.de | gawk -f webgrab.awk | sh
-
- After this, you will find the contents of all referenced documents in
-files named 'doc*.html' even if they do not contain HTML code. The most
-annoying thing is that we always have to pass the proxy to GETURL. If
-you do not like to see the headers of the web pages appear on the
-screen, you can redirect them to '/dev/null'. Watching the headers
-appear can be quite interesting, because it reveals details such as
-which web server the companies use.  Now, it is clear how the
-clever marketing people use web robots to determine the market shares of
-Microsoft and Netscape in the web server market.
-
- Port 80 of any web server is like a small hole in a repellent
-firewall. After attaching a browser to port 80, we usually catch a
-glimpse of the bright side of the server (its home page). With a tool
-like GETURL at hand, we are able to discover some of the more concealed
-or even "indecent" services (i.e., lacking conformity to standards of
-quality). It can be exciting to see the fancy CGI scripts that lie
-there, revealing the inner workings of the server, ready to be called:
-
- * With a command such as:
-
- gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/
-
- some servers give you a directory listing of the CGI files.
- Knowing the names, you can try to call some of them and watch for
- useful results. Sometimes there are executables in such
- directories (such as Perl interpreters) that you may call remotely.
- If there are subdirectories with configuration data of the web
- server, this can also be quite interesting to read.
-
- * The well-known Apache web server usually has its CGI files in the
- directory '/cgi-bin'. There you can often find the scripts
- 'test-cgi' and 'printenv'. Both tell you some things about the
- current connection and the installation of the web server. Just
- call:
-
- gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/test-cgi
- gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/printenv
-
- * Sometimes it is even possible to retrieve system files like the web
- server's log file--possibly containing customer data--or even the
- file '/etc/passwd'. (We don't recommend this!)
-
- *Caution:* Although this may sound funny or simply irrelevant, we are
-talking about severe security holes. Try to explore your own system
-this way and make sure that none of the above reveals too much
-information about your system.
-
-
-File: gawkinet.info, Node: STATIST, Next: MAZE, Prev: WEBGRAB, Up: Some Applications and Techniques
-
-3.6 STATIST: Graphing a Statistical Distribution
-================================================
-
-In the HTTP server examples we've shown thus far, we never present an
-image to the browser and its user. Presenting images is one task.
-Generating images that reflect some user input and presenting these
-dynamically generated images is another. In this node, we use GNUPlot
-for generating '.png', '.ps', or '.gif' files.(1)
-
- The program we develop takes the statistical parameters of two
-samples and computes the t-test statistics. As a result, we get the
-probabilities that the means and the variances of both samples are the
-same. In order to let the user check plausibility, the program presents
-an image of the distributions. The statistical computation follows
-'Numerical Recipes in C: The Art of Scientific Computing' by William H.
-Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery.
-Since 'gawk' does not have a built-in function for the computation of
-the beta function, we use the 'ibeta()' function of GNUPlot. As a side
-effect, we learn how to use GNUPlot as a sophisticated calculator. The
-comparison of means is done as in 'tutest', paragraph 14.2, page 613,
-and the comparison of variances is done as in 'ftest', page 611 in
-'Numerical Recipes'.
-
- As usual, we take the site-independent code for servers and append
-our own functions 'SetUpServer()' and 'HandleGET()':
-
- function SetUpServer() {
- TopHeader = "<HTML><title>Statistics with GAWK</title>"
- TopDoc = "<BODY>\
- <h2>Please choose one of the following actions:</h2>\
- <UL>\
- <LI><A HREF=" MyPrefix "/AboutServer>About this server</A></LI>\
- <LI><A HREF=" MyPrefix "/EnterParameters>Enter Parameters</A></LI>\
- </UL>"
- TopFooter = "</BODY></HTML>"
- GnuPlot = "gnuplot 2>&1"
- m1=m2=0; v1=v2=1; n1=n2=10
- }
-
- Here, you see the menu structure that the user sees. Later, we will
-see how the program structure of the 'HandleGET()' function reflects the
-menu structure. What is missing here is the link for the image we
-generate. In an event-driven environment, request, generation, and
-delivery of images are separated.
-
- Notice the way we initialize the 'GnuPlot' command string for the
-pipe. By default, GNUPlot outputs the generated image via standard
-output, as well as the results of 'print'(ed) calculations via standard
-error. The redirection causes standard error to be mixed into standard
-output, enabling us to read results of calculations with 'getline'. By
-initializing the statistical parameters with some meaningful defaults,
-we make sure the user gets an image the first time he uses the program.
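-
-   The same trick can be tried in isolation.  Here is a minimal sketch
-(our own, independent of the web server) that uses GNUPlot as a
-calculator:
-
-     BEGIN {
-         GnuPlot = "gnuplot 2>&1"     # results come back via stdout
-         print "print sqrt(2.0)" |& GnuPlot
-         GnuPlot |& getline result    # read what GNUPlot printed
-         print "quit"            |& GnuPlot
-         close(GnuPlot)
-         print "GNUPlot says sqrt(2) =", result
-     }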
-
- Following is the rather long function 'HandleGET()', which implements
-the contents of this service by reacting to the different kinds of
-requests from the browser. Before you start playing with this script,
-make sure that your browser supports JavaScript and that it also has
-this option switched on. The script uses a short snippet of JavaScript
-code for delayed opening of a window with an image. A more detailed
-explanation follows:
-
- function HandleGET() {
- if(MENU[2] == "AboutServer") {
- Document = "This is a GUI for a statistical computation.\
- It compares means and variances of two distributions.\
- It is implemented as one GAWK script and uses GNUPLOT."
- } else if (MENU[2] == "EnterParameters") {
- Document = ""
- if ("m1" in GETARG) { # are there parameters to compare?
- Document = Document "<SCRIPT LANGUAGE=\"JavaScript\">\
- setTimeout(\"window.open(\\\"" MyPrefix "/Image" systime()\
- "\\\",\\\"dist\\\", \\\"status=no\\\");\", 1000); </SCRIPT>"
- m1 = GETARG["m1"]; v1 = GETARG["v1"]; n1 = GETARG["n1"]
- m2 = GETARG["m2"]; v2 = GETARG["v2"]; n2 = GETARG["n2"]
- t = (m1-m2)/sqrt(v1/n1+v2/n2)
- df = (v1/n1+v2/n2)*(v1/n1+v2/n2)/((v1/n1)*(v1/n1)/(n1-1) \
- + (v2/n2)*(v2/n2) /(n2-1))
- if (v1>v2) {
- f = v1/v2
- df1 = n1 - 1
- df2 = n2 - 1
- } else {
- f = v2/v1
- df1 = n2 - 1
- df2 = n1 - 1
- }
- print "pt=ibeta(" df/2 ",0.5," df/(df+t*t) ")" |& GnuPlot
- print "pF=2.0*ibeta(" df2/2 "," df1/2 "," \
- df2/(df2+df1*f) ")" |& GnuPlot
- print "print pt, pF" |& GnuPlot
- RS="\n"; GnuPlot |& getline; RS="\r\n" # $1 is pt, $2 is pF
- print "invsqrt2pi=1.0/sqrt(2.0*pi)" |& GnuPlot
- print "nd(x)=invsqrt2pi/sd*exp(-0.5*((x-mu)/sd)**2)" |& GnuPlot
- print "set term png small color" |& GnuPlot
- #print "set term postscript color" |& GnuPlot
- #print "set term gif medium size 320,240" |& GnuPlot
- print "set yrange[-0.3:]" |& GnuPlot
- print "set label 'p(m1=m2) =" $1 "' at 0,-0.1 left" |& GnuPlot
- print "set label 'p(v1=v2) =" $2 "' at 0,-0.2 left" |& GnuPlot
- print "plot mu=" m1 ",sd=" sqrt(v1) ", nd(x) title 'sample 1',\
- mu=" m2 ",sd=" sqrt(v2) ", nd(x) title 'sample 2'" |& GnuPlot
- print "quit" |& GnuPlot
- GnuPlot |& getline Image
- while ((GnuPlot |& getline) > 0)
- Image = Image RS $0
- close(GnuPlot)
- }
- Document = Document "\
- <h3>Do these samples have the same Gaussian distribution?</h3>\
- <FORM METHOD=GET> <TABLE BORDER CELLPADDING=5>\
- <TR>\
-        <TD>1. Mean </TD>\
-        <TD><input type=text name=m1 value=" m1 " size=8></TD>\
-        <TD>1. Variance</TD>\
-        <TD><input type=text name=v1 value=" v1 " size=8></TD>\
-        <TD>1. Count </TD>\
-        <TD><input type=text name=n1 value=" n1 " size=8></TD>\
-        </TR><TR>\
-        <TD>2. Mean </TD>\
-        <TD><input type=text name=m2 value=" m2 " size=8></TD>\
-        <TD>2. Variance</TD>\
-        <TD><input type=text name=v2 value=" v2 " size=8></TD>\
-        <TD>2. Count </TD>\
-        <TD><input type=text name=n2 value=" n2 " size=8></TD>\
- </TR> <input type=submit value=\"Compute\">\
- </TABLE></FORM><BR>"
- } else if (MENU[2] ~ "Image") {
- Reason = "OK" ORS "Content-type: image/png"
- #Reason = "OK" ORS "Content-type: application/x-postscript"
- #Reason = "OK" ORS "Content-type: image/gif"
- Header = Footer = ""
- Document = Image
- }
- }
-
- As usual, we give a short description of the service in the first
-menu choice. The third menu choice shows us that generation and
-presentation of an image are two separate actions. While the latter
-takes place quite instantly in the third menu choice, the former takes
-place in the much longer second choice. Image data passes from the
-generating action to the presenting action via the variable 'Image',
-which holds a complete '.png' image of the kind that would otherwise
-be stored in a file.
-If you prefer '.ps' or '.gif' images over the default '.png' images, you
-may select these options by uncommenting the appropriate lines. But
-remember to do so in two places: when telling GNUPlot which kind of
-images to generate, and when transmitting the image at the end of the
-program.
-
- Looking at the end of the program, the way we pass the 'Content-type'
-to the browser is a bit unusual. It is appended to the 'OK' of the
-first header line to make sure the type information becomes part of the
-header. The other variables that get transmitted across the network are
-made empty, because in this case we do not have an HTML document to
-transmit, but rather raw image data to contain in the body.
-
- Most of the work is done in the second menu choice. It starts with a
-strange JavaScript code snippet. When first implementing this server,
-we used a short '"<IMG SRC=" MyPrefix "/Image>"' here. But then
-browsers got smarter and tried to improve on speed by requesting the
-image and the HTML code at the same time. When doing this, the browser
-tries to build up a connection for the image request while the request
-for the HTML text is not yet completed. The browser tries to connect to
-the 'gawk' server on port 8080 while port 8080 is still in use for
-transmission of the HTML text. The connection for the image cannot be
-built up, so the image appears as "broken" in the browser window. We
-solved this problem by telling the browser to open a separate window for
-the image, but only after a delay of 1000 milliseconds. By this time,
-the server should be ready for serving the next request.
-
- But there is one more subtlety in the JavaScript code. Each time the
-JavaScript code opens a window for the image, the name of the image is
-appended with a timestamp ('systime()'). Why this constant change of
-name for the image? Initially, we always named the image 'Image', but
-then the Netscape browser noticed the name had _not_ changed since the
-previous request and displayed the previous image (caching behavior).
-The server core is implemented so that browsers are told _not_ to cache
-anything. Obviously HTTP requests do not always work as expected. One
-way to circumvent the cache of such overly smart browsers is to change
-the name of the image with each request. These three lines of
-JavaScript caused us a lot of trouble.
-
- The rest can be broken down into two phases. At first, we check if
-there are statistical parameters. When the program is first started,
-there usually are no parameters because it enters the page coming from
-the top menu. Then, we only have to present the user a form that he can
-use to change statistical parameters and submit them. Subsequently, the
-submission of the form causes the execution of the first phase because
-_now_ there _are_ parameters to handle.
-
- Now that we have parameters, we know there will be an image
-available. Therefore we insert the JavaScript code here to initiate the
-opening of the image in a separate window. Then, we prepare some
-variables that will be passed to GNUPlot for calculation of the
-probabilities. Prior to reading the results, we must temporarily change
-'RS' because GNUPlot separates lines with newlines. After instructing
-GNUPlot to generate a '.png' (or '.ps' or '.gif') image, we initiate the
-insertion of some text, explaining the resulting probabilities. The
-final 'plot' command actually generates the image data. This raw binary
-has to be read in carefully without adding, changing, or deleting a
-single byte. Hence the unusual initialization of 'Image' and completion
-with a 'while' loop.
-
- When using this server, it soon becomes clear that it is far from
-being perfect. It mixes source code of six scripting languages or
-protocols:
-
- * GNU 'awk' implements a server for the protocol:
- * HTTP which transmits:
- * HTML text which contains a short piece of:
- * JavaScript code opening a separate window.
- * A Bourne shell script is used for piping commands into:
- * GNUPlot to generate the image to be opened.
-
- After all this work, the GNUPlot image opens in the JavaScript window
-where it can be viewed by the user.
-
- It is probably better not to mix up so many different languages. The
-result is not very readable. Furthermore, the statistical part of the
-server does not take care of invalid input.  Among other things,
-using negative variances will cause invalid results.
-
- ---------- Footnotes ----------
-
- (1) Due to licensing problems, the default installation of GNUPlot
-disables the generation of '.gif' files. If your installed version does
-not accept 'set term gif', just download and install the most recent
-version of GNUPlot and the GD library (http://www.boutell.com/gd/) by
-Thomas Boutell. Otherwise you still have the chance to generate some
-ASCII-art style images with GNUPlot by using 'set term dumb'. (We tried
-it and it worked.)
-
-
-File: gawkinet.info, Node: MAZE, Next: MOBAGWHO, Prev: STATIST, Up: Some Applications and Techniques
-
-3.7 MAZE: Walking Through a Maze In Virtual Reality
-===================================================
-
- In the long run, every program becomes rococo, and then rubble.
- Alan Perlis
-
- By now, we know how to present arbitrary 'Content-type's to a
-browser. In this node, our server will present a 3D world to our
-browser. The 3D world is described in a scene description language
-(VRML, Virtual Reality Modeling Language) that allows us to travel
-through a perspective view of a 2D maze with our browser. Browsers with
-a VRML plugin enable exploration of this technology. We could do one of
-those boring 'Hello world' examples here, that are usually presented
-when introducing novices to VRML. If you have never written any VRML
-code, have a look at the VRML FAQ. Presenting a static VRML scene is a
-bit trivial; in order to expose 'gawk''s new capabilities, we will
-present a dynamically generated VRML scene. The function
-'SetUpServer()' is very simple because it only sets the default HTML
-page and initializes the random number generator. As usual, the
-surrounding server lets you browse the maze.
-
- function SetUpServer() {
- TopHeader = "<HTML><title>Walk through a maze</title>"
- TopDoc = "\
- <h2>Please choose one of the following actions:</h2>\
- <UL>\
- <LI><A HREF=" MyPrefix "/AboutServer>About this server</A>\
- <LI><A HREF=" MyPrefix "/VRMLtest>Watch a simple VRML scene</A>\
- </UL>"
- TopFooter = "</HTML>"
- srand()
- }
-
- The function 'HandleGET()' is a bit longer because it first computes
-the maze and afterwards generates the VRML code that is sent across the
-network. As shown in the STATIST example (*note STATIST::), we set the
-type of the content to VRML and then store the VRML representation of
-the maze as the page content. We assume that the maze is stored in a 2D
-array. Initially, the maze consists of walls only. Then, we add an
-entry and an exit to the maze and let the rest of the work be done by
-the function 'MakeMaze()'. Now, only the wall fields are left in the
-maze.  By iterating over these fields, we generate one line of VRML
-code for each wall field.
-
- function HandleGET() {
- if (MENU[2] == "AboutServer") {
- Document = "If your browser has a VRML 2 plugin,\
- this server shows you a simple VRML scene."
- } else if (MENU[2] == "VRMLtest") {
- XSIZE = YSIZE = 11 # initially, everything is wall
- for (y = 0; y < YSIZE; y++)
- for (x = 0; x < XSIZE; x++)
- Maze[x, y] = "#"
- delete Maze[0, 1] # entry is not wall
- delete Maze[XSIZE-1, YSIZE-2] # exit is not wall
- MakeMaze(1, 1)
- Document = "\
- #VRML V2.0 utf8\n\
- Group {\n\
- children [\n\
- PointLight {\n\
- ambientIntensity 0.2\n\
- color 0.7 0.7 0.7\n\
- location 0.0 8.0 10.0\n\
- }\n\
- DEF B1 Background {\n\
- skyColor [0 0 0, 1.0 1.0 1.0 ]\n\
- skyAngle 1.6\n\
- groundColor [1 1 1, 0.8 0.8 0.8, 0.2 0.2 0.2 ]\n\
- groundAngle [ 1.2 1.57 ]\n\
- }\n\
- DEF Wall Shape {\n\
- geometry Box {size 1 1 1}\n\
- appearance Appearance { material Material { diffuseColor 0 0 1 } }\n\
- }\n\
- DEF Entry Viewpoint {\n\
- position 0.5 1.0 5.0\n\
- orientation 0.0 0.0 -1.0 0.52\n\
- }\n"
- for (i in Maze) {
- split(i, t, SUBSEP)
- Document = Document " Transform { translation "
- Document = Document t[1] " 0 -" t[2] " children USE Wall }\n"
- }
- Document = Document " ] # end of group for world\n}"
- Reason = "OK" ORS "Content-type: model/vrml"
- Header = Footer = ""
- }
- }
-
- Finally, we have a look at 'MakeMaze()', the function that generates
-the 'Maze' array. When entered, this function assumes that the array
-has been initialized so that each element represents a wall element and
-the maze is initially full of wall elements. Only the entrance and the
-exit of the maze should have been left free. The parameters of the
-function tell us which element must be marked as not being a wall.
-After this, we take a look at the four neighboring elements and remember
-which we have already treated. Of all the neighboring elements, we take
-one at random and walk in that direction.  To do so, the wall element
-in that direction has to be removed, and then we call the function
-recursively for that element.  The maze is only completed if we iterate
-the above procedure for _all_ neighboring elements (in random order) and
-for our present element by recursively calling the function for the
-present element.  This last iteration could have been done in a loop,
-but it is much simpler to do it recursively.
-
- Notice that elements whose coordinates are both odd are assumed to be
-on our way through the maze, and the generating process cannot
-terminate as long as any such element has not been 'delete'd.  All
-other elements are potentially part of the wall.
-
- function MakeMaze(x, y) {
- delete Maze[x, y] # here we are, we have no wall here
- p = 0 # count unvisited fields in all directions
- if (x-2 SUBSEP y in Maze) d[p++] = "-x"
- if (x SUBSEP y-2 in Maze) d[p++] = "-y"
- if (x+2 SUBSEP y in Maze) d[p++] = "+x"
- if (x SUBSEP y+2 in Maze) d[p++] = "+y"
- if (p>0) { # if there are unvisited fields, go there
- p = int(p*rand()) # choose one unvisited field at random
- if (d[p] == "-x") { delete Maze[x - 1, y]; MakeMaze(x - 2, y)
- } else if (d[p] == "-y") { delete Maze[x, y - 1]; MakeMaze(x, y - 2)
- } else if (d[p] == "+x") { delete Maze[x + 1, y]; MakeMaze(x + 2, y)
- } else if (d[p] == "+y") { delete Maze[x, y + 1]; MakeMaze(x, y + 2)
- } # we are back from recursion
- MakeMaze(x, y); # try again while there are unvisited fields
- }
- }
-
-
-File: gawkinet.info, Node: MOBAGWHO, Next: STOXPRED, Prev: MAZE, Up: Some Applications and Techniques
-
-3.8 MOBAGWHO: a Simple Mobile Agent
-===================================
-
- There are two ways of constructing a software design: One way is to
- make it so simple that there are obviously no deficiencies, and the
- other way is to make it so complicated that there are no obvious
- deficiencies.
- C. A. R. Hoare
-
- A "mobile agent" is a program that can be dispatched from a computer
-and transported to a remote server for execution. This is called
-"migration", which means that a process on another system is started
-that is independent of its originator.  Ideally, it wanders through a
-network while working for its creator or owner. In places like the UMBC
-Agent Web, people are quite confident that (mobile) agents are a
-software engineering paradigm that enables us to significantly increase
-the efficiency of our work. Mobile agents could become the mediators
-between users and the networking world.  For an unbiased view of this
-technology, see the remarkable paper 'Mobile Agents: Are they a good
-idea?'.(1)
-
- When trying to migrate a process from one system to another, a server
-process is needed on the receiving side.  Several ways of implementing
-this come to mind, and how the migration is implemented depends upon
-the kind of server process that is used:
-
- * HTTP can be used as the protocol for delivery of the migrating
- process. In this case, we use a common web server as the receiving
- server process. A universal CGI script mediates between migrating
- process and web server. Each server willing to accept migrating
- agents makes this universal service available. HTTP supplies the
- 'POST' method to transfer some data to a file on the web server.
- When a CGI script is called remotely with the 'POST' method instead
- of the usual 'GET' method, data is transmitted from the client
- process to the standard input of the server's CGI script. So, to
- implement a mobile agent, we must not only write the agent program
- to start on the client side, but also the CGI script to receive the
- agent on the server side.
-
- * The 'PUT' method can also be used for migration. HTTP does not
- require a CGI script for migration via 'PUT'. However, with common
- web servers there is no advantage to this solution, because web
- servers such as Apache require explicit activation of a special
- 'PUT' script.
-
- * 'Agent Tcl' pursues a different course; it relies on a dedicated
- server process with a dedicated protocol specialized for receiving
- mobile agents.
-
- Our agent example abuses a common web server as a migration tool.
-So, it needs a universal CGI script on the receiving side (the web
-server). The receiving script is activated with a 'POST' request when
-placed into a location like '/httpd/cgi-bin/PostAgent.sh'. Make sure
-that the server system uses a version of 'gawk' that supports network
-access (Version 3.1 or later; verify with 'gawk --version').
-
- #!/bin/sh
- MobAg=/tmp/MobileAgent.$$
- # direct script to mobile agent file
- cat > $MobAg
- # execute agent concurrently
- gawk -f $MobAg $MobAg > /dev/null &
- # HTTP header, terminator and body
- gawk 'BEGIN { print "\r\nAgent started" }'
- rm $MobAg # delete script file of agent
-
- By making its process id ('$$') part of the unique file name, the
-script avoids conflicts between concurrent instances of the script.
-First, all lines from standard input (the mobile agent's source code)
-are copied into this unique file. Then, the agent is started as a
-concurrent process and a short message reporting this fact is sent to
-the submitting client. Finally, the script file of the mobile agent is
-removed because it is no longer needed. Although it is a short script,
-there are several noteworthy points:
-
-Security
- _There is none_. In fact, the CGI script should never be made
- available on a server that is part of the Internet because everyone
- would be allowed to execute arbitrary commands with it. This
- behavior is acceptable only when performing rapid prototyping.
-
-Self-Reference
- Each migrating instance of an agent is started in a way that
- enables it to read its own source code from standard input and use
- the code for subsequent migrations. This is necessary because it
- needs to treat the agent's code as data to transmit. 'gawk' is not
- the ideal language for such a job. Lisp and Tcl are more suitable
- because they do not make a distinction between program code and
- data.
-
-Independence
- After migration, the agent is not linked to its former home in any
- way. By reporting 'Agent started', it waves "Goodbye" to its
- origin. The originator may choose to terminate or not.
-
- The originating agent itself is started just like any other
-command-line script, and reports the results on standard output. By
-letting the name of the original host migrate with the agent, the agent
-that migrates to a host far away from its origin can report the result
-back home. Having arrived at the end of the journey, the agent
-establishes a connection and reports the results. This is the reason
-for determining the name of the host with 'uname -n' and storing it in
-'MyOrigin' for later use. We may also set variables with the '-v'
-option from the command line. This interactivity is only of importance
-in the context of starting a mobile agent; therefore this 'BEGIN'
-pattern and its action do not take part in migration:
-
- BEGIN {
- if (ARGC != 2) {
- print "MOBAG - a simple mobile agent"
- print "CALL:\n gawk -f mobag.awk mobag.awk"
- print "IN:\n the name of this script as a command-line parameter"
- print "PARAM:\n -v MyOrigin=myhost.com"
- print "OUT:\n the result on stdout"
- print "JK 29.03.1998 01.04.1998"
- exit
- }
- if (MyOrigin == "") {
- "uname -n" | getline MyOrigin
- close("uname -n")
- }
- }
-
- Since 'gawk' cannot manipulate and transmit parts of the program
-directly, the source code is read and stored in strings. Therefore, the
-program scans itself for the beginning and the ending of functions.
-Each line in between is appended to the code string until the end of the
-function has been reached.  A special case is this part of the program
-itself, which is not a function; placing a similar framework around it
-causes it to be treated like a function.  Notice that this mechanism
-works for all the functions of the source code, but it cannot guarantee
-that the order of the functions is preserved during migration:
-
- #ReadMySelf
- /^function / { FUNC = $2 }
- /^END/ || /^#ReadMySelf/ { FUNC = $1 }
- FUNC != "" { MOBFUN[FUNC] = MOBFUN[FUNC] RS $0 }
- (FUNC != "") && (/^}/ || /^#EndOfMySelf/) \
- { FUNC = "" }
- #EndOfMySelf
-
- The web server code in *note A Web Service with Interaction:
-Interacting Service, was first developed as a site-independent core.
-Likewise, the 'gawk'-based mobile agent starts with an agent-independent
-core, to which application-dependent functions can be appended.  What
-follows is the only application-independent function needed for the
-mobile agent:
-
- function migrate(Destination, MobCode, Label) {
- MOBVAR["Label"] = Label
- MOBVAR["Destination"] = Destination
- RS = ORS = "\r\n"
- HttpService = "/inet/tcp/0/" Destination
- for (i in MOBFUN)
- MobCode = (MobCode "\n" MOBFUN[i])
- MobCode = MobCode "\n\nBEGIN {"
- for (i in MOBVAR)
- MobCode = (MobCode "\n MOBVAR[\"" i "\"] = \"" MOBVAR[i] "\"")
- MobCode = MobCode "\n}\n"
- print "POST /cgi-bin/PostAgent.sh HTTP/1.0" |& HttpService
- print "Content-length:", length(MobCode) ORS |& HttpService
- printf "%s", MobCode |& HttpService
- while ((HttpService |& getline) > 0)
- print $0
- close(HttpService)
- }
-
- The 'migrate()' function prepares the aforementioned strings
-containing the program code and transmits them to a server. A
-consequence of this modular approach is that the 'migrate()' function
-takes some parameters that aren't needed in this application, but that
-will be in future ones. Its mandatory parameter 'Destination' holds the
-name (or IP address) of the server that the agent wants as a host for
-its code. The optional parameter 'MobCode' may contain some 'gawk' code
-that is inserted during migration in front of all other code. The
-optional parameter 'Label' may contain a string that tells the agent
-what to do in program execution after arrival at its new home site. One
-of the serious obstacles in implementing a framework for mobile agents
-is that it does not suffice to migrate the code. It is also necessary
-to migrate the state of execution of the agent. In contrast to 'Agent
-Tcl', this program does not try to migrate the complete set of
-variables. The following conventions are used:
-
- * Each variable in an agent program is local to the current host and
- does _not_ migrate.
-
- * The array 'MOBFUN' shown above is an exception. It is handled by
- the function 'migrate()' and does migrate with the application.
-
- * The other exception is the array 'MOBVAR'. Each variable that
- takes part in migration has to be an element of this array.
- 'migrate()' also takes care of this.
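-
- As an illustration of these conventions, consider the following sketch
-of an application function (the function and variable names are invented
-for this example; they are not part of the framework).  Data that must
-survive migration is collected in 'MOBVAR', while scratch variables are
-kept local by listing them as extra function parameters:
-
-    function CountVisit(   tmp) {   # tmp is local and does not migrate
-        tmp = "visited " MOBVAR["Destination"]
-        # anything stored in MOBVAR travels with the next migration
-        MOBVAR["Trace"] = MOBVAR["Trace"] SUBSEP tmp
-    }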
-
- Now it's clear what happens to the 'Label' parameter of the function
-'migrate()'. It is copied into 'MOBVAR["Label"]' and travels alongside
-the other data. Since travelling takes place via HTTP, records must be
-separated with '"\r\n"' in 'RS' and 'ORS' as usual. The code assembly
-for migration takes place in three steps:
-
- * Iterate over 'MOBFUN' to collect all functions verbatim.
-
- * Prepare a 'BEGIN' pattern and put assignments to mobile variables
-    into the action part (a concrete sketch follows this list).
-
- * Transmission itself resembles GETURL: the header with the request
- and the 'Content-length' is followed by the body. In case there is
- any reply over the network, it is read completely and echoed to
- standard output to avoid irritating the server.
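-
- To make the second step concrete: for a few of the migrating variables,
-the 'BEGIN' pattern that 'migrate()' appends after the collected
-functions would look roughly like the following sketch (the values are
-invented purely for illustration):
-
-    BEGIN {
-        MOBVAR["MyOrigin"] = "myhost.com"
-        MOBVAR["Label"] = ""
-        MOBVAR["Destination"] = "localhost/80"
-    }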
-
- The application-independent framework is now almost complete. What
-follows is the 'END' pattern that is executed when the mobile agent has
-finished reading its own code. First, it checks whether it is already
-running on a remote host or not. In case initialization has not yet
-taken place, it starts 'MyInit()'. Otherwise (later, on a remote host),
-it starts 'MyJob()':
-
- END {
- if (ARGC != 2) exit # stop when called with wrong parameters
- if (MyOrigin != "") # is this the originating host?
- MyInit() # if so, initialize the application
- else # we are on a host with migrated data
- MyJob() # so we do our job
- }
-
- All that's left to extend the framework into a complete application
-is to write two application-specific functions: 'MyInit()' and
-'MyJob()'. Keep in mind that the former is executed once on the
-originating host, while the latter is executed after each migration:
-
- function MyInit() {
- MOBVAR["MyOrigin"] = MyOrigin
- MOBVAR["Machines"] = "localhost/80 max/80 moritz/80 castor/80"
- split(MOBVAR["Machines"], Machines) # which host is the first?
- migrate(Machines[1], "", "") # go to the first host
- while (("/inet/tcp/8080/0/0" |& getline) > 0) # wait for result
- print $0 # print result
- close("/inet/tcp/8080/0/0")
- }
-
- As mentioned earlier, this agent takes the name of its origin
-('MyOrigin') with it. Then, it takes the name of its first destination
-and goes there for further work. Notice that this name has the port
-number of the web server appended to the name of the server, because the
-function 'migrate()' needs it this way to create the 'HttpService'
-variable. Finally, it waits for the result to arrive. The 'MyJob()'
-function runs on the remote host:
-
- function MyJob() {
- # forget this host
- sub(MOBVAR["Destination"], "", MOBVAR["Machines"])
- MOBVAR["Result"]=MOBVAR["Result"] SUBSEP SUBSEP MOBVAR["Destination"] ":"
- while (("who" | getline) > 0) # who is logged in?
- MOBVAR["Result"] = MOBVAR["Result"] SUBSEP $0
- close("who")
- if (index(MOBVAR["Machines"], "/") > 0) { # any more machines to visit?
- split(MOBVAR["Machines"], Machines) # which host is next?
- migrate(Machines[1], "", "") # go there
- } else { # no more machines
- gsub(SUBSEP, "\n", MOBVAR["Result"]) # send result to origin
- print MOBVAR["Result"] |& "/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080"
- close("/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080")
- }
- }
-
- After migrating, the first thing to do in 'MyJob()' is to delete the
-name of the current host from the list of hosts to visit. Now, it is
-time to start the real work by appending the host's name to the result
-string, and reading line by line who is logged in on this host. A very
-annoying circumstance is the fact that the elements of 'MOBVAR' cannot
-hold the newline character ('"\n"').  If they did, migration of this
-string would not work, because the string would no longer obey the
-syntax rules for a string constant in 'gawk'.  'SUBSEP' is used as a
-temporary replacement.  If the list of hosts to visit holds at least one
-more entry, the agent migrates to that place to go on working there.
-Otherwise, we replace the 'SUBSEP's with a newline character in the
-resulting string and report it to the originating host, whose name is
-stored in 'MOBVAR["MyOrigin"]'.
-
- ---------- Footnotes ----------
-
- (1) <http://www.research.ibm.com/massive/mobag.ps>
-
-
-File: gawkinet.info, Node: STOXPRED, Next: PROTBASE, Prev: MOBAGWHO, Up: Some Applications and Techniques
-
-3.9 STOXPRED: Stock Market Prediction As A Service
-==================================================
-
- Far out in the uncharted backwaters of the unfashionable end of the
- Western Spiral arm of the Galaxy lies a small unregarded yellow
- sun.
-
- Orbiting this at a distance of roughly ninety-two million miles is
- an utterly insignificant little blue-green planet whose
- ape-descendent life forms are so amazingly primitive that they
- still think digital watches are a pretty neat idea.
-
- This planet has -- or rather had -- a problem, which was this: most
- of the people living on it were unhappy for pretty much of the
- time. Many solutions were suggested for this problem, but most of
- these were largely concerned with the movements of small green
- pieces of paper, which is odd because it wasn't the small green
- pieces of paper that were unhappy.
- Douglas Adams, 'The Hitch Hiker's Guide to the Galaxy'
-
- Valuable services on the Internet are usually _not_ implemented as
-mobile agents. There are much simpler ways of implementing services.
-All Unix systems provide, for example, the 'cron' service. Unix system
-users can write a list of tasks to be done each day, each week, twice a
-day, or just once. The list is entered into a file named 'crontab'.
-For example, to distribute a newsletter on a daily basis this way, use
-'cron' to call a script each day early in the morning.
-
- # run at 8 am on weekdays, distribute the newsletter
- 0 8 * * 1-5 $HOME/bin/daily.job >> $HOME/log/newsletter 2>&1
-
- The script first looks for interesting information on the Internet,
-assembles it in a nice form and sends the results via email to the
-customers.
-
- The following is an example of a primitive newsletter on stock market
-prediction. It is a report which first tries to predict the change of
-each share in the Dow Jones Industrial Index for the particular day.
-Then it mentions some especially promising shares as well as some shares
-which look remarkably bad on that day.  The report ends with the usual
-disclaimer, which tells every child _not_ to try this at home, lest
-anybody get hurt.
-
- Good morning Uncle Scrooge,
-
- This is your daily stock market report for Monday, October 16, 2000.
- Here are the predictions for today:
-
- AA neutral
- GE up
- JNJ down
- MSFT neutral
- ...
- UTX up
- DD down
- IBM up
- MO down
- WMT up
- DIS up
- INTC up
- MRK down
- XOM down
- EK down
- IP down
-
- The most promising shares for today are these:
-
- INTC http://biz.yahoo.com/n/i/intc.html
-
- The stock shares to avoid today are these:
-
- EK http://biz.yahoo.com/n/e/ek.html
- IP http://biz.yahoo.com/n/i/ip.html
- DD http://biz.yahoo.com/n/d/dd.html
- ...
-
- The script as a whole is rather long. In order to ease the pain of
-studying other people's source code, we have broken the script up into
-meaningful parts which are invoked one after the other. The basic
-structure of the script is as follows:
-
- BEGIN {
- Init()
- ReadQuotes()
- CleanUp()
- Prediction()
- Report()
- SendMail()
- }
-
- The earlier parts store data into variables and arrays which are
-subsequently used by later parts of the script. The 'Init()' function
-first checks if the script is invoked correctly (without any
-parameters). If not, it informs the user of the correct usage. What
-follows are preparations for the retrieval of the historical quote data.
-The names of the 30 stock shares are stored in an array 'name' along
-with the current date in 'day', 'month', and 'year'.
-
- All users who are separated from the Internet by a firewall and have
-to direct their Internet access to a proxy must supply the name of the
-proxy to this script with the '-v Proxy=NAME' option. For most users,
-the default proxy and port number should suffice.
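-
- For example, assuming the script is stored in a file named
-'stoxpred.awk' and that 'proxy.example.com', listening on port 8080, is
-your local proxy (both names are placeholders), the invocation would
-look like this:
-
-    gawk -v Proxy=proxy.example.com -v ProxyPort=8080 -f stoxpred.awk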
-
- function Init() {
- if (ARGC != 1) {
- print "STOXPRED - daily stock share prediction"
- print "IN:\n no parameters, nothing on stdin"
- print "PARAM:\n -v Proxy=MyProxy -v ProxyPort=80"
- print "OUT:\n commented predictions as email"
- print "JK 09.10.2000"
- exit
- }
- # Remember ticker symbols from Dow Jones Industrial Index
- StockCount = split("AA GE JNJ MSFT AXP GM JPM PG BA HD KO \
- SBC C HON MCD T CAT HWP MMM UTX DD IBM MO WMT DIS INTC \
- MRK XOM EK IP", name);
- # Remember the current date as the end of the time series
- day = strftime("%d")
- month = strftime("%m")
- year = strftime("%Y")
- if (Proxy == "") Proxy = "chart.yahoo.com"
- if (ProxyPort == 0) ProxyPort = 80
- YahooData = "/inet/tcp/0/" Proxy "/" ProxyPort
- }
-
- There are two really interesting parts in the script. One is the
-function which reads the historical stock quotes from an Internet
-server. The other is the one that does the actual prediction. In the
-following function we see how the quotes are read from the Yahoo server.
-The data which comes from the server is in CSV format (comma-separated
-values):
-
- Date,Open,High,Low,Close,Volume
- 9-Oct-00,22.75,22.75,21.375,22.375,7888500
- 6-Oct-00,23.8125,24.9375,21.5625,22,10701100
- 5-Oct-00,24.4375,24.625,23.125,23.50,5810300
-
- Each line contains the values of one time instant, while the columns
-are separated by commas and contain the kind of data that is described
-in the header (first) line.  At first, 'gawk' is instructed to separate
-columns by commas ('FS = ","'). In the loop that follows, a connection
-to the Yahoo server is first opened, then a download takes place, and
-finally the connection is closed. All this happens once for each ticker
-symbol. In the body of this loop, an Internet address is built up as a
-string according to the rules of the Yahoo server. The starting and
-ending date are chosen to be exactly the same, but one year apart in the
-past. All the action is initiated within the 'printf' command which
-transmits the request for data to the Yahoo server.
-
- In the inner loop, the server's data is first read and then scanned
-line by line. Only lines which have six columns and the name of a month
-in the first column contain relevant data. This data is stored in the
-two-dimensional array 'quote'; one dimension being time, the other being
-the ticker symbol. During retrieval of the first stock's data, the
-calendar names of the time instants are stored in the array 'days'
-because we need them later.
-
- function ReadQuotes() {
- # Retrieve historical data for each ticker symbol
- FS = ","
- for (stock = 1; stock <= StockCount; stock++) {
- URL = "http://chart.yahoo.com/table.csv?s=" name[stock] \
- "&a=" month "&b=" day "&c=" year-1 \
- "&d=" month "&e=" day "&f=" year \
-                "&g=d&q=q&y=0&z=" name[stock] "&x=.csv"
- printf("GET " URL " HTTP/1.0\r\n\r\n") |& YahooData
- while ((YahooData |& getline) > 0) {
- if (NF == 6 && $1 ~ /Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec/) {
- if (stock == 1)
- days[++daycount] = $1;
- quote[$1, stock] = $5
- }
- }
- close(YahooData)
- }
- FS = " "
- }
-
- Now that we _have_ the data, it can be checked once again to make
-sure that no individual stock is missing or invalid, and that all the
-stock quotes are aligned correctly. Furthermore, we renumber the time
-instants.  The most recent day gets day number 1 and all other days get
-consecutive numbers.  All quotes are rounded to the nearest whole
-number in US Dollars.
-
- function CleanUp() {
- # clean up time series; eliminate incomplete data sets
- for (d = 1; d <= daycount; d++) {
- for (stock = 1; stock <= StockCount; stock++)
- if (! ((days[d], stock) in quote))
- stock = StockCount + 10
- if (stock > StockCount + 1)
- continue
- datacount++
- for (stock = 1; stock <= StockCount; stock++)
- data[datacount, stock] = int(0.5 + quote[days[d], stock])
- }
- delete quote
- delete days
- }
-
- Now we have arrived at the second really interesting part of the
-whole affair. What we present here is a very primitive prediction
-algorithm: _If a stock fell yesterday, assume it will also fall today;
-if it rose yesterday, assume it will rise today_. (Feel free to replace
-this algorithm with a smarter one.) If a stock changed in the same
-direction on two consecutive days, this is an indication which should be
-highlighted. Two-day advances are stored in 'hot' and two-day declines
-in 'avoid'.
-
- The rest of the function is a sanity check. It counts the number of
-correct predictions in relation to the total number of predictions one
-could have made in the year before.
-
- function Prediction() {
- # Predict each ticker symbol by prolonging yesterday's trend
- for (stock = 1; stock <= StockCount; stock++) {
- if (data[1, stock] > data[2, stock]) {
- predict[stock] = "up"
- } else if (data[1, stock] < data[2, stock]) {
- predict[stock] = "down"
- } else {
- predict[stock] = "neutral"
- }
- if ((data[1, stock] > data[2, stock]) && (data[2, stock] > data[3, stock]))
- hot[stock] = 1
- if ((data[1, stock] < data[2, stock]) && (data[2, stock] < data[3, stock]))
- avoid[stock] = 1
- }
- # Do a plausibility check: how many predictions proved correct?
- for (s = 1; s <= StockCount; s++) {
- for (d = 1; d <= datacount-2; d++) {
- if (data[d+1, s] > data[d+2, s]) {
- UpCount++
- } else if (data[d+1, s] < data[d+2, s]) {
- DownCount++
- } else {
- NeutralCount++
- }
- if (((data[d, s] > data[d+1, s]) && (data[d+1, s] > data[d+2, s])) ||
- ((data[d, s] < data[d+1, s]) && (data[d+1, s] < data[d+2, s])) ||
- ((data[d, s] == data[d+1, s]) && (data[d+1, s] == data[d+2, s])))
- CorrectCount++
- }
- }
- }
-
- At this point the hard work has been done: the array 'predict'
-contains the predictions for all the ticker symbols. It is up to the
-function 'Report()' to find some nice words to introduce the desired
-information.
-
- function Report() {
- # Generate report
- report = "\nThis is your daily "
- report = report "stock market report for "strftime("%A, %B %d, %Y")".\n"
- report = report "Here are the predictions for today:\n\n"
- for (stock = 1; stock <= StockCount; stock++)
- report = report "\t" name[stock] "\t" predict[stock] "\n"
- for (stock in hot) {
- if (HotCount++ == 0)
- report = report "\nThe most promising shares for today are these:\n\n"
- report = report "\t" name[stock] "\t\thttp://biz.yahoo.com/n/" \
- tolower(substr(name[stock], 1, 1)) "/" tolower(name[stock]) ".html\n"
- }
- for (stock in avoid) {
- if (AvoidCount++ == 0)
- report = report "\nThe stock shares to avoid today are these:\n\n"
- report = report "\t" name[stock] "\t\thttp://biz.yahoo.com/n/" \
- tolower(substr(name[stock], 1, 1)) "/" tolower(name[stock]) ".html\n"
- }
- report = report "\nThis sums up to " HotCount+0 " winners and " AvoidCount+0
- report = report " losers. When using this kind\nof prediction scheme for"
- report = report " the 12 months which lie behind us,\nwe get " UpCount
- report = report " 'ups' and " DownCount " 'downs' and " NeutralCount
- report = report " 'neutrals'. Of all\nthese " UpCount+DownCount+NeutralCount
- report = report " predictions " CorrectCount " proved correct next day.\n"
- report = report "A success rate of "\
- int(100*CorrectCount/(UpCount+DownCount+NeutralCount)) "%.\n"
- report = report "Random choice would have produced a 33% success rate.\n"
- report = report "Disclaimer: Like every other prediction of the stock\n"
- report = report "market, this report is, of course, complete nonsense.\n"
- report = report "If you are stupid enough to believe these predictions\n"
- report = report "you should visit a doctor who can treat your ailment."
- }
-
- The function 'SendMail()' goes through the list of customers and
-opens a pipe to the 'mail' command for each of them. Each one receives
-an email message with a proper subject heading and is addressed with his
-full name.
-
- function SendMail() {
- # send report to customers
- customer["uncle.scrooge@ducktown.gov"] = "Uncle Scrooge"
- customer["more@utopia.org" ] = "Sir Thomas More"
- customer["spinoza@denhaag.nl" ] = "Baruch de Spinoza"
- customer["marx@highgate.uk" ] = "Karl Marx"
- customer["keynes@the.long.run" ] = "John Maynard Keynes"
- customer["bierce@devil.hell.org" ] = "Ambrose Bierce"
- customer["laplace@paris.fr" ] = "Pierre Simon de Laplace"
- for (c in customer) {
-        MailPipe = "mail -s 'Daily Stock Prediction Newsletter' " c
- print "Good morning " customer[c] "," | MailPipe
- print report "\n.\n" | MailPipe
- close(MailPipe)
- }
- }
-
- Be patient when running the script by hand. Retrieving the data for
-all the ticker symbols and sending the emails may take several minutes
-to complete, depending upon network traffic and the speed of the
-available Internet link. The quality of the prediction algorithm is
-likely to be disappointing. Try to find a better one. Should you find
-one with a success rate of more than 50%, please tell us about it! It
-is only for the sake of curiosity, of course. ':-)'
-
-
-File: gawkinet.info, Node: PROTBASE, Prev: STOXPRED, Up: Some Applications and Techniques
-
-3.10 PROTBASE: Searching Through A Protein Database
-===================================================
-
- Hoare's Law of Large Problems: Inside every large problem is a
- small problem struggling to get out.
-
- Yahoo's database of stock market data is just one among the many
-large databases on the Internet. Another one is located at NCBI
-(National Center for Biotechnology Information). Established in 1988 as
-a national resource for molecular biology information, NCBI creates
-public databases, conducts research in computational biology, develops
-software tools for analyzing genome data, and disseminates biomedical
-information. In this section, we look at one of NCBI's public services,
-which is called BLAST (Basic Local Alignment Search Tool).
-
- You probably know that the information necessary for reproducing
-living cells is encoded in the genetic material of the cells. The
-genetic material is a very long chain of four base nucleotides. It is
-the order of appearance (the sequence) of nucleotides which contains the
-information about the substance to be produced. Scientists in
-biotechnology often find a specific fragment, determine the nucleotide
-sequence, and need to know where the sequence at hand comes from. This
-is where the large databases enter the game. At NCBI, databases store
-the knowledge about which sequences have ever been found and where they
-have been found. When the scientist sends his sequence to the BLAST
-service, the server looks for regions of genetic material in its
-database which look the most similar to the delivered nucleotide
-sequence. After a search time of some seconds or minutes the server
-sends an answer to the scientist. In order to make access simple, NCBI
-chose to offer their database service through popular Internet
-protocols. There are four basic ways to use the so-called BLAST
-services:
-
- * The easiest way to use BLAST is through the web. Users may simply
- point their browsers at the NCBI home page and link to the BLAST
- pages. NCBI provides a stable URL that may be used to perform
- BLAST searches without interactive use of a web browser. This is
- what we will do later in this section. A demonstration client and
- a 'README' file demonstrate how to access this URL.
-
- * Currently, 'blastcl3' is the standard network BLAST client. You
- can download 'blastcl3' from the anonymous FTP location.
-
- * BLAST 2.0 can be run locally as a full executable and can be used
- to run BLAST searches against private local databases, or
- downloaded copies of the NCBI databases. BLAST 2.0 executables may
- be found on the NCBI anonymous FTP server.
-
- * The NCBI BLAST Email server is the best option for people without
- convenient access to the web. A similarity search can be performed
- by sending a properly formatted mail message containing the
- nucleotide or protein query sequence to <blast@ncbi.nlm.nih.gov>.
- The query sequence is compared against the specified database using
- the BLAST algorithm and the results are returned in an email
- message. For more information on formulating email BLAST searches,
- you can send a message consisting of the word "HELP" to the same
- address, <blast@ncbi.nlm.nih.gov>.
-
- Our starting point is the demonstration client mentioned in the first
-option. The 'README' file that comes along with the client explains the
-whole process in a nutshell. In the rest of this section, we first show
-what such requests look like. Then we show how to use 'gawk' to
-implement a client in about 10 lines of code. Finally, we show how to
-interpret the result returned from the service.
-
- Sequences are expected to be represented in the standard IUB/IUPAC
-amino acid and nucleic acid codes, with these exceptions: lower-case
-letters are accepted and are mapped into upper-case; a single hyphen or
-dash can be used to represent a gap of indeterminate length; and in
-amino acid sequences, 'U' and '*' are acceptable letters (see below).
-Before submitting a request, any numerical digits in the query sequence
-should either be removed or replaced by appropriate letter codes (e.g.,
-'N' for unknown nucleic acid residue or 'X' for unknown amino acid
-residue). The nucleic acid codes supported are:
-
- A --> adenosine M --> A C (amino)
- C --> cytidine S --> G C (strong)
- G --> guanine W --> A T (weak)
- T --> thymidine B --> G T C
- U --> uridine D --> G A T
- R --> G A (purine) H --> A C T
- Y --> T C (pyrimidine) V --> G C A
- K --> G T (keto) N --> A G C T (any)
- - gap of indeterminate length
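-
- Before submitting a request, you may want to clean up a raw sequence
-along the lines just described.  The following 'gawk' fragment is not
-part of the BLAST documentation; it is merely one way of folding letters
-to upper case and stripping digits and blanks from a sequence read from
-standard input:
-
-    # clean up a raw sequence: upper case, no digits, no blanks
-    {
-        line = toupper($0)
-        gsub(/[0-9]/, "", line)      # digits must not be submitted
-        gsub(/[ \t]/, "", line)      # remove stray whitespace
-        print line
-    }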
-
- Now you know the alphabet of nucleotide sequences. The last two
-lines of the following example query show you such a sequence, which is
-obviously made up only of elements of the alphabet just described.
-Store this example query into a file named 'protbase.request'. You are
-now ready to send it to the server with the demonstration client.
-
- PROGRAM blastn
- DATALIB month
- EXPECT 0.75
- BEGIN
- >GAWK310 the gawking gene GNU AWK
- tgcttggctgaggagccataggacgagagcttcctggtgaagtgtgtttcttgaaatcat
- caccaccatggacagcaaa
-
- The actual search request begins with the mandatory parameter
-'PROGRAM' in the first column followed by the value 'blastn' (the name
-of the program) for searching nucleic acids. The next line contains the
-mandatory search parameter 'DATALIB' with the value 'month' for the
-newest nucleic acid sequences. The third line contains an optional
-'EXPECT' parameter and the value desired for it. The fourth line
-contains the mandatory 'BEGIN' directive, followed by the query sequence
-in FASTA/Pearson format. Each line of information must be less than 80
-characters in length.
-
- The "month" database contains all new or revised sequences released
-in the last 30 days and is useful for searching against new sequences.
-There are five different blast programs, 'blastn' being the one that
-compares a nucleotide query sequence against a nucleotide sequence
-database.
-
- The last server directive that must appear in every request is the
-'BEGIN' directive. The query sequence should immediately follow the
-'BEGIN' directive and must appear in FASTA/Pearson format. A sequence
-in FASTA/Pearson format begins with a single-line description. The
-description line, which is required, is distinguished from the lines of
-sequence data that follow it by having a greater-than ('>') symbol in
-the first column. For the purposes of the BLAST server, the text of the
-description is arbitrary.
-
- If you prefer to use a client written in 'gawk', just store the
-following 10 lines of code into a file named 'protbase.awk' and use this
-client instead. Invoke it with 'gawk -f protbase.awk protbase.request'.
-Then wait a minute and watch the result coming in. In order to
-replicate the demonstration client's behavior as closely as possible,
-this client does not use a proxy server. We could also have extended
-the client program in *note Retrieving Web Pages: GETURL, to implement
-the client request from 'protbase.awk' as a special case.
-
- { request = request "\n" $0 }
-
- END {
- BLASTService = "/inet/tcp/0/www.ncbi.nlm.nih.gov/80"
- printf "POST /cgi-bin/BLAST/nph-blast_report HTTP/1.0\n" |& BLASTService
- printf "Content-Length: " length(request) "\n\n" |& BLASTService
- printf request |& BLASTService
- while ((BLASTService |& getline) > 0)
- print $0
- close(BLASTService)
- }
-
- The demonstration client from NCBI is 214 lines long (written in C)
-and it is not immediately obvious what it does. Our client is so short
-that it _is_ obvious what it does. First it loops over all lines of the
-query and stores the whole query into a variable. Then the script
-establishes an Internet connection to the NCBI server and transmits the
-query by framing it with a proper HTTP request. Finally it receives and
-prints the complete result coming from the server.
-
- Now, let us look at the result. It begins with an HTTP header, which
-you can ignore. Then there are some comments about the query having
-been filtered to avoid spuriously high scores. After this, there is a
-reference to the paper that describes the software being used for
-searching the data base. After a repetition of the original query's
-description we find the list of significant alignments:
-
- Sequences producing significant alignments: (bits) Value
-
- gb|AC021182.14|AC021182 Homo sapiens chromosome 7 clone RP11-733... 38 0.20
- gb|AC021056.12|AC021056 Homo sapiens chromosome 3 clone RP11-115... 38 0.20
- emb|AL160278.10|AL160278 Homo sapiens chromosome 9 clone RP11-57... 38 0.20
- emb|AL391139.11|AL391139 Homo sapiens chromosome X clone RP11-35... 38 0.20
- emb|AL365192.6|AL365192 Homo sapiens chromosome 6 clone RP3-421H... 38 0.20
- emb|AL138812.9|AL138812 Homo sapiens chromosome 11 clone RP1-276... 38 0.20
- gb|AC073881.3|AC073881 Homo sapiens chromosome 15 clone CTD-2169... 38 0.20
-
- This means that the query sequence was found in seven human
-chromosomes. But the value 0.20 (20%) means that the probability of an
-accidental match is rather high (20%) in all cases and should be taken
-into account. You may wonder what the first column means. It is a key
-to the specific database in which this occurrence was found. The unique
-sequence identifiers reported in the search results can be used as
-sequence retrieval keys via the NCBI server. The syntax of sequence
-header lines used by the NCBI BLAST server depends on the database from
-which each sequence was obtained. The table below lists the identifiers
-for the databases from which the sequences were derived.
-
- Database Name Identifier Syntax
- ============================ ========================
- GenBank gb|accession|locus
- EMBL Data Library emb|accession|locus
- DDBJ, DNA Database of Japan dbj|accession|locus
- NBRF PIR pir||entry
- Protein Research Foundation prf||name
- SWISS-PROT sp|accession|entry name
- Brookhaven Protein Data Bank pdb|entry|chain
- Kabat's Sequences of Immuno... gnl|kabat|identifier
- Patents pat|country|number
- GenInfo Backbone Id bbs|number
-
- For example, an identifier might be 'gb|AC021182.14|AC021182', where
-the 'gb' tag indicates that the identifier refers to a GenBank sequence,
-'AC021182.14' is its GenBank ACCESSION, and 'AC021182' is the GenBank
-LOCUS. The identifier contains no spaces, so that a space indicates the
-end of the identifier.
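-
- Such an identifier is easy to take apart with 'gawk'.  The following
-fragment is merely an illustration (the variable names are our own
-choice); it splits the example identifier at the '|' characters:
-
-    BEGIN {
-        id = "gb|AC021182.14|AC021182"
-        split(id, part, "|")     # tag, ACCESSION, LOCUS
-        print part[1], part[2], part[3]
-    }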
-
- Let us continue in the result listing. Each of the seven alignments
-mentioned above is subsequently described in detail. We will have a
-closer look at the first of them.
-
- >gb|AC021182.14|AC021182 Homo sapiens chromosome 7 clone RP11-733N23, WORKING DRAFT SEQUENCE, 4
- unordered pieces
- Length = 176383
-
- Score = 38.2 bits (19), Expect = 0.20
- Identities = 19/19 (100%)
- Strand = Plus / Plus
-
- Query: 35 tggtgaagtgtgtttcttg 53
- |||||||||||||||||||
- Sbjct: 69786 tggtgaagtgtgtttcttg 69804
-
- This alignment was located on the human chromosome 7. The fragment
-on which part of the query was found had a total length of 176383. Only
-19 of the nucleotides matched and the matching sequence ran from
-character 35 to 53 in the query sequence and from 69786 to 69804 in the
-fragment on chromosome 7. If you are still reading at this point, you
-are probably interested in finding out more about Computational Biology
-and you might appreciate the following hints.
-
- 1. There is a book called 'Introduction to Computational Biology' by
- Michael S. Waterman, which is worth reading if you are seriously
- interested. You can find a good book review on the Internet.
-
- 2. While Waterman's book can explain to you the algorithms employed
- internally in the database search engines, most practitioners
- prefer to approach the subject differently. The applied side of
- Computational Biology is called Bioinformatics, and emphasizes the
- tools available for day-to-day work as well as how to actually
- _use_ them. One of the very few affordable books on Bioinformatics
- is 'Developing Bioinformatics Computer Skills'.
-
- 3. The sequences _gawk_ and _gnuawk_ are in widespread use in the
- genetic material of virtually every earthly living being. Let us
- take this as a clear indication that the divine creator has
- intended 'gawk' to prevail over other scripting languages such as
- 'perl', 'tcl', or 'python' which are not even proper sequences.
- (:-)
-
-
-File: gawkinet.info, Node: Links, Next: GNU Free Documentation License, Prev: Some Applications and Techniques, Up: Top
-
-4 Related Links
-***************
-
-This section lists the URLs for various items discussed in this major
-node. They are presented in the order in which they appear.
-
-'Internet Programming with Python'
- <http://www.fsbassociates.com/books/python.htm>
-
-'Advanced Perl Programming'
- <http://www.oreilly.com/catalog/advperl>
-
-'Web Client Programming with Perl'
- <http://www.oreilly.com/catalog/webclient>
-
-Richard Stevens's home page and book
- <http://www.kohala.com/~rstevens>
-
-The SPAK home page
- <http://www.userfriendly.net/linux/RPM/contrib/libc6/i386/spak-0.6b-1.i386.html>
-
-Volume III of 'Internetworking with TCP/IP', by Comer and Stevens
- <http://www.cs.purdue.edu/homes/dec/tcpip3s.cont.html>
-
-XBM Graphics File Format
- <http://www.wotsit.org/download.asp?f=xbm>
-
-GNUPlot
- <http://www.cs.dartmouth.edu/gnuplot_info.html>
-
-Mark Humphrys' Eliza page
- <http://www.compapp.dcu.ie/~humphrys/eliza.html>
-
-Yahoo! Eliza Information
- <http://dir.yahoo.com/Recreation/Games/Computer_Games/Internet_Games/Web_Games/Artificial_Intelligence>
-
-Java versions of Eliza
- <http://www.tjhsst.edu/Psych/ch1/eliza.html>
-
-Java versions of Eliza with source code
- <http://home.adelphia.net/~lifeisgood/eliza/eliza.htm>
-
-Eliza Programs with Explanations
- <http://chayden.net/chayden/eliza/Eliza.shtml>
-
-Loebner Contest
- <http://acm.org/~loebner/loebner-prize.htmlx>
-
-Tcl/Tk Information
- <http://www.scriptics.com/>
-
-Intel 80x86 Processors
- <http://developer.intel.com/design/platform/embedpc/what_is.htm>
-
-AMD Elan Processors
- <http://www.amd.com/products/epd/processors/4.32bitcont/32bitcont/index.html>
-
-XINU
- <http://willow.canberra.edu.au/~chrisc/xinu.html>
-
-GNU/Linux
- <http://uclinux.lineo.com/>
-
-Embedded PCs
- <http://dir.yahoo.com/Business_and_Economy/Business_to_Business/Computers/Hardware/Embedded_Control/>
-
-MiniSQL
- <http://www.hughes.com.au/library/>
-
-Market Share Surveys
- <http://www.netcraft.com/survey>
-
-'Numerical Recipes in C: The Art of Scientific Computing'
- <http://www.nr.com>
-
-VRML
- <http://www.vrml.org>
-
-The VRML FAQ
- <http://www.vrml.org/technicalinfo/specifications/specifications.htm#FAQ>
-
-The UMBC Agent Web
- <http://www.cs.umbc.edu/agents>
-
-Apache Web Server
- <http://www.apache.org>
-
-National Center for Biotechnology Information (NCBI)
- <http://www.ncbi.nlm.nih.gov>
-
-Basic Local Alignment Search Tool (BLAST)
- <http://www.ncbi.nlm.nih.gov/BLAST/blast_overview.html>
-
-NCBI Home Page
- <http://www.ncbi.nlm.nih.gov>
-
-BLAST Pages
- <http://www.ncbi.nlm.nih.gov/BLAST>
-
-BLAST Demonstration Client
- <ftp://ncbi.nlm.nih.gov/blast/blasturl/>
-
-BLAST anonymous FTP location
- <ftp://ncbi.nlm.nih.gov/blast/network/netblast/>
-
-BLAST 2.0 Executables
- <ftp://ncbi.nlm.nih.gov/blast/executables/>
-
-IUB/IUPAC Amino Acid and Nucleic Acid Codes
- <http://www.uthscsa.edu/geninfo/blastmail.html#item6>
-
-FASTA/Pearson Format
- <http://www.ncbi.nlm.nih.gov/BLAST/fasta.html>
-
-Fasta/Pearson Sequence in Java
- <http://www.kazusa.or.jp/java/codon_table_java/>
-
-Book Review of 'Introduction to Computational Biology'
- <http://www.acm.org/crossroads/xrds5-1/introcb.html>
-
-'Developing Bioinformatics Computer Skills'
- <http://www.oreilly.com/catalog/bioskills/>
-
-
-File: gawkinet.info, Node: GNU Free Documentation License, Next: Index, Prev: Links, Up: Top
-
-GNU Free Documentation License
-******************************
-
- Version 1.3, 3 November 2008
-
- Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
- <http://fsf.org/>
-
- Everyone is permitted to copy and distribute verbatim copies
- of this license document, but changing it is not allowed.
-
- 0. PREAMBLE
-
- The purpose of this License is to make a manual, textbook, or other
- functional and useful document "free" in the sense of freedom: to
- assure everyone the effective freedom to copy and redistribute it,
- with or without modifying it, either commercially or
- noncommercially. Secondarily, this License preserves for the
- author and publisher a way to get credit for their work, while not
- being considered responsible for modifications made by others.
-
- This License is a kind of "copyleft", which means that derivative
- works of the document must themselves be free in the same sense.
- It complements the GNU General Public License, which is a copyleft
- license designed for free software.
-
- We have designed this License in order to use it for manuals for
- free software, because free software needs free documentation: a
- free program should come with manuals providing the same freedoms
- that the software does. But this License is not limited to
- software manuals; it can be used for any textual work, regardless
- of subject matter or whether it is published as a printed book. We
- recommend this License principally for works whose purpose is
- instruction or reference.
-
- 1. APPLICABILITY AND DEFINITIONS
-
- This License applies to any manual or other work, in any medium,
- that contains a notice placed by the copyright holder saying it can
- be distributed under the terms of this License. Such a notice
- grants a world-wide, royalty-free license, unlimited in duration,
- to use that work under the conditions stated herein. The
- "Document", below, refers to any such manual or work. Any member
- of the public is a licensee, and is addressed as "you". You accept
- the license if you copy, modify or distribute the work in a way
- requiring permission under copyright law.
-
- A "Modified Version" of the Document means any work containing the
- Document or a portion of it, either copied verbatim, or with
- modifications and/or translated into another language.
-
- A "Secondary Section" is a named appendix or a front-matter section
- of the Document that deals exclusively with the relationship of the
- publishers or authors of the Document to the Document's overall
- subject (or to related matters) and contains nothing that could
- fall directly within that overall subject. (Thus, if the Document
- is in part a textbook of mathematics, a Secondary Section may not
- explain any mathematics.) The relationship could be a matter of
- historical connection with the subject or with related matters, or
- of legal, commercial, philosophical, ethical or political position
- regarding them.
-
- The "Invariant Sections" are certain Secondary Sections whose
- titles are designated, as being those of Invariant Sections, in the
- notice that says that the Document is released under this License.
- If a section does not fit the above definition of Secondary then it
- is not allowed to be designated as Invariant. The Document may
- contain zero Invariant Sections. If the Document does not identify
- any Invariant Sections then there are none.
-
- The "Cover Texts" are certain short passages of text that are
- listed, as Front-Cover Texts or Back-Cover Texts, in the notice
- that says that the Document is released under this License. A
- Front-Cover Text may be at most 5 words, and a Back-Cover Text may
- be at most 25 words.
-
- A "Transparent" copy of the Document means a machine-readable copy,
- represented in a format whose specification is available to the
- general public, that is suitable for revising the document
- straightforwardly with generic text editors or (for images composed
- of pixels) generic paint programs or (for drawings) some widely
- available drawing editor, and that is suitable for input to text
- formatters or for automatic translation to a variety of formats
- suitable for input to text formatters. A copy made in an otherwise
- Transparent file format whose markup, or absence of markup, has
- been arranged to thwart or discourage subsequent modification by
- readers is not Transparent. An image format is not Transparent if
- used for any substantial amount of text. A copy that is not
- "Transparent" is called "Opaque".
-
- Examples of suitable formats for Transparent copies include plain
- ASCII without markup, Texinfo input format, LaTeX input format,
- SGML or XML using a publicly available DTD, and standard-conforming
- simple HTML, PostScript or PDF designed for human modification.
- Examples of transparent image formats include PNG, XCF and JPG.
- Opaque formats include proprietary formats that can be read and
- edited only by proprietary word processors, SGML or XML for which
- the DTD and/or processing tools are not generally available, and
- the machine-generated HTML, PostScript or PDF produced by some word
- processors for output purposes only.
-
- The "Title Page" means, for a printed book, the title page itself,
- plus such following pages as are needed to hold, legibly, the
- material this License requires to appear in the title page. For
- works in formats which do not have any title page as such, "Title
- Page" means the text near the most prominent appearance of the
- work's title, preceding the beginning of the body of the text.
-
- The "publisher" means any person or entity that distributes copies
- of the Document to the public.
-
- A section "Entitled XYZ" means a named subunit of the Document
- whose title either is precisely XYZ or contains XYZ in parentheses
- following text that translates XYZ in another language. (Here XYZ
- stands for a specific section name mentioned below, such as
- "Acknowledgements", "Dedications", "Endorsements", or "History".)
- To "Preserve the Title" of such a section when you modify the
- Document means that it remains a section "Entitled XYZ" according
- to this definition.
-
- The Document may include Warranty Disclaimers next to the notice
- which states that this License applies to the Document. These
- Warranty Disclaimers are considered to be included by reference in
- this License, but only as regards disclaiming warranties: any other
- implication that these Warranty Disclaimers may have is void and
- has no effect on the meaning of this License.
-
- 2. VERBATIM COPYING
-
- You may copy and distribute the Document in any medium, either
- commercially or noncommercially, provided that this License, the
- copyright notices, and the license notice saying this License
- applies to the Document are reproduced in all copies, and that you
- add no other conditions whatsoever to those of this License. You
- may not use technical measures to obstruct or control the reading
- or further copying of the copies you make or distribute. However,
- you may accept compensation in exchange for copies. If you
- distribute a large enough number of copies you must also follow the
- conditions in section 3.
-
- You may also lend copies, under the same conditions stated above,
- and you may publicly display copies.
-
- 3. COPYING IN QUANTITY
-
- If you publish printed copies (or copies in media that commonly
- have printed covers) of the Document, numbering more than 100, and
- the Document's license notice requires Cover Texts, you must
- enclose the copies in covers that carry, clearly and legibly, all
- these Cover Texts: Front-Cover Texts on the front cover, and
- Back-Cover Texts on the back cover. Both covers must also clearly
- and legibly identify you as the publisher of these copies. The
- front cover must present the full title with all words of the title
- equally prominent and visible. You may add other material on the
- covers in addition. Copying with changes limited to the covers, as
- long as they preserve the title of the Document and satisfy these
- conditions, can be treated as verbatim copying in other respects.
-
- If the required texts for either cover are too voluminous to fit
- legibly, you should put the first ones listed (as many as fit
- reasonably) on the actual cover, and continue the rest onto
- adjacent pages.
-
- If you publish or distribute Opaque copies of the Document
- numbering more than 100, you must either include a machine-readable
- Transparent copy along with each Opaque copy, or state in or with
- each Opaque copy a computer-network location from which the general
- network-using public has access to download using public-standard
- network protocols a complete Transparent copy of the Document, free
- of added material. If you use the latter option, you must take
- reasonably prudent steps, when you begin distribution of Opaque
- copies in quantity, to ensure that this Transparent copy will
- remain thus accessible at the stated location until at least one
- year after the last time you distribute an Opaque copy (directly or
- through your agents or retailers) of that edition to the public.
-
- It is requested, but not required, that you contact the authors of
- the Document well before redistributing any large number of copies,
- to give them a chance to provide you with an updated version of the
- Document.
-
- 4. MODIFICATIONS
-
- You may copy and distribute a Modified Version of the Document
- under the conditions of sections 2 and 3 above, provided that you
- release the Modified Version under precisely this License, with the
- Modified Version filling the role of the Document, thus licensing
- distribution and modification of the Modified Version to whoever
- possesses a copy of it. In addition, you must do these things in
- the Modified Version:
-
- A. Use in the Title Page (and on the covers, if any) a title
- distinct from that of the Document, and from those of previous
- versions (which should, if there were any, be listed in the
- History section of the Document). You may use the same title
- as a previous version if the original publisher of that
- version gives permission.
-
- B. List on the Title Page, as authors, one or more persons or
- entities responsible for authorship of the modifications in
- the Modified Version, together with at least five of the
- principal authors of the Document (all of its principal
- authors, if it has fewer than five), unless they release you
- from this requirement.
-
- C. State on the Title page the name of the publisher of the
- Modified Version, as the publisher.
-
- D. Preserve all the copyright notices of the Document.
-
- E. Add an appropriate copyright notice for your modifications
- adjacent to the other copyright notices.
-
- F. Include, immediately after the copyright notices, a license
- notice giving the public permission to use the Modified
- Version under the terms of this License, in the form shown in
- the Addendum below.
-
- G. Preserve in that license notice the full lists of Invariant
- Sections and required Cover Texts given in the Document's
- license notice.
-
- H. Include an unaltered copy of this License.
-
- I. Preserve the section Entitled "History", Preserve its Title,
- and add to it an item stating at least the title, year, new
- authors, and publisher of the Modified Version as given on the
- Title Page. If there is no section Entitled "History" in the
- Document, create one stating the title, year, authors, and
- publisher of the Document as given on its Title Page, then add
- an item describing the Modified Version as stated in the
- previous sentence.
-
- J. Preserve the network location, if any, given in the Document
- for public access to a Transparent copy of the Document, and
- likewise the network locations given in the Document for
- previous versions it was based on. These may be placed in the
- "History" section. You may omit a network location for a work
- that was published at least four years before the Document
- itself, or if the original publisher of the version it refers
- to gives permission.
-
- K. For any section Entitled "Acknowledgements" or "Dedications",
- Preserve the Title of the section, and preserve in the section
- all the substance and tone of each of the contributor
- acknowledgements and/or dedications given therein.
-
- L. Preserve all the Invariant Sections of the Document, unaltered
- in their text and in their titles. Section numbers or the
- equivalent are not considered part of the section titles.
-
- M. Delete any section Entitled "Endorsements". Such a section
- may not be included in the Modified Version.
-
- N. Do not retitle any existing section to be Entitled
- "Endorsements" or to conflict in title with any Invariant
- Section.
-
- O. Preserve any Warranty Disclaimers.
-
- If the Modified Version includes new front-matter sections or
- appendices that qualify as Secondary Sections and contain no
- material copied from the Document, you may at your option designate
- some or all of these sections as invariant. To do this, add their
- titles to the list of Invariant Sections in the Modified Version's
- license notice. These titles must be distinct from any other
- section titles.
-
- You may add a section Entitled "Endorsements", provided it contains
- nothing but endorsements of your Modified Version by various
- parties--for example, statements of peer review or that the text
- has been approved by an organization as the authoritative
- definition of a standard.
-
- You may add a passage of up to five words as a Front-Cover Text,
- and a passage of up to 25 words as a Back-Cover Text, to the end of
- the list of Cover Texts in the Modified Version. Only one passage
- of Front-Cover Text and one of Back-Cover Text may be added by (or
- through arrangements made by) any one entity. If the Document
- already includes a cover text for the same cover, previously added
- by you or by arrangement made by the same entity you are acting on
- behalf of, you may not add another; but you may replace the old
- one, on explicit permission from the previous publisher that added
- the old one.
-
- The author(s) and publisher(s) of the Document do not by this
- License give permission to use their names for publicity for or to
- assert or imply endorsement of any Modified Version.
-
- 5. COMBINING DOCUMENTS
-
- You may combine the Document with other documents released under
- this License, under the terms defined in section 4 above for
- modified versions, provided that you include in the combination all
- of the Invariant Sections of all of the original documents,
- unmodified, and list them all as Invariant Sections of your
- combined work in its license notice, and that you preserve all
- their Warranty Disclaimers.
-
- The combined work need only contain one copy of this License, and
- multiple identical Invariant Sections may be replaced with a single
- copy. If there are multiple Invariant Sections with the same name
- but different contents, make the title of each such section unique
- by adding at the end of it, in parentheses, the name of the
- original author or publisher of that section if known, or else a
- unique number. Make the same adjustment to the section titles in
- the list of Invariant Sections in the license notice of the
- combined work.
-
- In the combination, you must combine any sections Entitled
- "History" in the various original documents, forming one section
- Entitled "History"; likewise combine any sections Entitled
- "Acknowledgements", and any sections Entitled "Dedications". You
- must delete all sections Entitled "Endorsements."
-
- 6. COLLECTIONS OF DOCUMENTS
-
- You may make a collection consisting of the Document and other
- documents released under this License, and replace the individual
- copies of this License in the various documents with a single copy
- that is included in the collection, provided that you follow the
- rules of this License for verbatim copying of each of the documents
- in all other respects.
-
- You may extract a single document from such a collection, and
- distribute it individually under this License, provided you insert
- a copy of this License into the extracted document, and follow this
- License in all other respects regarding verbatim copying of that
- document.
-
- 7. AGGREGATION WITH INDEPENDENT WORKS
-
- A compilation of the Document or its derivatives with other
- separate and independent documents or works, in or on a volume of a
- storage or distribution medium, is called an "aggregate" if the
- copyright resulting from the compilation is not used to limit the
- legal rights of the compilation's users beyond what the individual
- works permit. When the Document is included in an aggregate, this
- License does not apply to the other works in the aggregate which
- are not themselves derivative works of the Document.
-
- If the Cover Text requirement of section 3 is applicable to these
- copies of the Document, then if the Document is less than one half
- of the entire aggregate, the Document's Cover Texts may be placed
- on covers that bracket the Document within the aggregate, or the
- electronic equivalent of covers if the Document is in electronic
- form. Otherwise they must appear on printed covers that bracket
- the whole aggregate.
-
- 8. TRANSLATION
-
- Translation is considered a kind of modification, so you may
- distribute translations of the Document under the terms of section
- 4. Replacing Invariant Sections with translations requires special
- permission from their copyright holders, but you may include
- translations of some or all Invariant Sections in addition to the
- original versions of these Invariant Sections. You may include a
- translation of this License, and all the license notices in the
- Document, and any Warranty Disclaimers, provided that you also
- include the original English version of this License and the
- original versions of those notices and disclaimers. In case of a
- disagreement between the translation and the original version of
- this License or a notice or disclaimer, the original version will
- prevail.
-
- If a section in the Document is Entitled "Acknowledgements",
- "Dedications", or "History", the requirement (section 4) to
- Preserve its Title (section 1) will typically require changing the
- actual title.
-
- 9. TERMINATION
-
- You may not copy, modify, sublicense, or distribute the Document
- except as expressly provided under this License. Any attempt
- otherwise to copy, modify, sublicense, or distribute it is void,
- and will automatically terminate your rights under this License.
-
- However, if you cease all violation of this License, then your
- license from a particular copyright holder is reinstated (a)
- provisionally, unless and until the copyright holder explicitly and
- finally terminates your license, and (b) permanently, if the
- copyright holder fails to notify you of the violation by some
- reasonable means prior to 60 days after the cessation.
-
- Moreover, your license from a particular copyright holder is
- reinstated permanently if the copyright holder notifies you of the
- violation by some reasonable means, this is the first time you have
- received notice of violation of this License (for any work) from
- that copyright holder, and you cure the violation prior to 30 days
- after your receipt of the notice.
-
- Termination of your rights under this section does not terminate
- the licenses of parties who have received copies or rights from you
- under this License. If your rights have been terminated and not
- permanently reinstated, receipt of a copy of some or all of the
- same material does not give you any rights to use it.
-
- 10. FUTURE REVISIONS OF THIS LICENSE
-
- The Free Software Foundation may publish new, revised versions of
- the GNU Free Documentation License from time to time. Such new
- versions will be similar in spirit to the present version, but may
- differ in detail to address new problems or concerns. See
- <http://www.gnu.org/copyleft/>.
-
- Each version of the License is given a distinguishing version
- number. If the Document specifies that a particular numbered
- version of this License "or any later version" applies to it, you
- have the option of following the terms and conditions either of
- that specified version or of any later version that has been
- published (not as a draft) by the Free Software Foundation. If the
- Document does not specify a version number of this License, you may
- choose any version ever published (not as a draft) by the Free
- Software Foundation. If the Document specifies that a proxy can
- decide which future versions of this License can be used, that
- proxy's public statement of acceptance of a version permanently
- authorizes you to choose that version for the Document.
-
- 11. RELICENSING
-
- "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
- World Wide Web server that publishes copyrightable works and also
- provides prominent facilities for anybody to edit those works. A
- public wiki that anybody can edit is an example of such a server.
- A "Massive Multiauthor Collaboration" (or "MMC") contained in the
- site means any set of copyrightable works thus published on the MMC
- site.
-
- "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
- license published by Creative Commons Corporation, a not-for-profit
- corporation with a principal place of business in San Francisco,
- California, as well as future copyleft versions of that license
- published by that same organization.
-
- "Incorporate" means to publish or republish a Document, in whole or
- in part, as part of another Document.
-
- An MMC is "eligible for relicensing" if it is licensed under this
- License, and if all works that were first published under this
- License somewhere other than this MMC, and subsequently
- incorporated in whole or in part into the MMC, (1) had no cover
- texts or invariant sections, and (2) were thus incorporated prior
- to November 1, 2008.
-
- The operator of an MMC Site may republish an MMC contained in the
- site under CC-BY-SA on the same site at any time before August 1,
- 2009, provided the MMC is eligible for relicensing.
-
-ADDENDUM: How to use this License for your documents
-====================================================
-
-To use this License in a document you have written, include a copy of
-the License in the document and put the following copyright and license
-notices just after the title page:
-
- Copyright (C) YEAR YOUR NAME.
- Permission is granted to copy, distribute and/or modify this document
- under the terms of the GNU Free Documentation License, Version 1.3
- or any later version published by the Free Software Foundation;
- with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
- Texts. A copy of the license is included in the section entitled ``GNU
- Free Documentation License''.
-
- If you have Invariant Sections, Front-Cover Texts and Back-Cover
-Texts, replace the "with...Texts." line with this:
-
- with the Invariant Sections being LIST THEIR TITLES, with
- the Front-Cover Texts being LIST, and with the Back-Cover Texts
- being LIST.
-
- If you have Invariant Sections without Cover Texts, or some other
-combination of the three, merge those two alternatives to suit the
-situation.
-
- If your document contains nontrivial examples of program code, we
-recommend releasing these examples in parallel under your choice of free
-software license, such as the GNU General Public License, to permit
-their use in free software.
-
-
-File: gawkinet.info, Node: Index, Prev: GNU Free Documentation License, Up: Top
-
-Index
-*****
-
-
-* Menu:
-
-* /inet/ files (gawk): Gawk Special Files. (line 34)
-* /inet/tcp special files (gawk): File /inet/tcp. (line 6)
-* /inet/udp special files (gawk): File /inet/udp. (line 6)
-* | (vertical bar), |& operator (I/O): TCP Connecting. (line 25)
-* advanced features, network connections: Troubleshooting. (line 6)
-* agent: Challenges. (line 75)
-* agent <1>: MOBAGWHO. (line 6)
-* AI: Challenges. (line 75)
-* apache: WEBGRAB. (line 72)
-* apache <1>: MOBAGWHO. (line 42)
-* Bioinformatics: PROTBASE. (line 227)
-* BLAST, Basic Local Alignment Search Tool: PROTBASE. (line 6)
-* blocking: Making Connections. (line 35)
-* Boutell, Thomas: STATIST. (line 6)
-* CGI (Common Gateway Interface): MOBAGWHO. (line 42)
-* CGI (Common Gateway Interface), dynamic web pages and: Web page.
- (line 45)
-* CGI (Common Gateway Interface), library: CGI Lib. (line 11)
-* clients: Making Connections. (line 21)
-* Clinton, Bill: Challenges. (line 58)
-* Common Gateway Interface, See CGI: Web page. (line 45)
-* Computational Biology: PROTBASE. (line 227)
-* contest: Challenges. (line 6)
-* cron utility: STOXPRED. (line 23)
-* CSV format: STOXPRED. (line 128)
-* Dow Jones Industrial Index: STOXPRED. (line 44)
-* ELIZA program: Simple Server. (line 11)
-* ELIZA program <1>: Simple Server. (line 178)
-* email: Email. (line 11)
-* FASTA/Pearson format: PROTBASE. (line 102)
-* FDL (Free Documentation License): GNU Free Documentation License.
- (line 6)
-* filenames, for network access: Gawk Special Files. (line 29)
-* files, /inet/ (gawk): Gawk Special Files. (line 34)
-* files, /inet/tcp (gawk): File /inet/tcp. (line 6)
-* files, /inet/udp (gawk): File /inet/udp. (line 6)
-* finger utility: Setting Up. (line 22)
-* Free Documentation License (FDL): GNU Free Documentation License.
- (line 6)
-* FTP (File Transfer Protocol): Basic Protocols. (line 45)
-* gawk, networking: Using Networking. (line 6)
-* gawk, networking, connections: Special File Fields. (line 53)
-* gawk, networking, connections <1>: TCP Connecting. (line 6)
-* gawk, networking, filenames: Gawk Special Files. (line 29)
-* gawk, networking, See Also email: Email. (line 6)
-* gawk, networking, service, establishing: Setting Up. (line 6)
-* gawk, networking, troubleshooting: Caveats. (line 6)
-* gawk, web and, See web service: Interacting Service. (line 6)
-* getline command: TCP Connecting. (line 11)
-* GETURL program: GETURL. (line 6)
-* GIF image format: Web page. (line 45)
-* GIF image format <1>: STATIST. (line 6)
-* GNU Free Documentation License: GNU Free Documentation License.
- (line 6)
-* GNU/Linux: Troubleshooting. (line 54)
-* GNU/Linux <1>: Interacting. (line 27)
-* GNU/Linux <2>: REMCONF. (line 6)
-* GNUPlot utility: Interacting Service. (line 189)
-* GNUPlot utility <1>: STATIST. (line 6)
-* Hoare, C.A.R.: MOBAGWHO. (line 6)
-* Hoare, C.A.R. <1>: PROTBASE. (line 6)
-* hostname field: Special File Fields. (line 34)
-* HTML (Hypertext Markup Language): Web page. (line 29)
-* HTTP (Hypertext Transfer Protocol): Basic Protocols. (line 45)
-* HTTP (Hypertext Transfer Protocol) <1>: Web page. (line 6)
-* HTTP (Hypertext Transfer Protocol), record separators and: Web page.
- (line 29)
-* HTTP server, core logic: Interacting Service. (line 6)
-* HTTP server, core logic <1>: Interacting Service. (line 24)
-* Humphrys, Mark: Simple Server. (line 178)
-* Hypertext Markup Language (HTML): Web page. (line 29)
-* Hypertext Transfer Protocol, See HTTP: Web page. (line 6)
-* image format: STATIST. (line 6)
-* images, in web pages: Interacting Service. (line 189)
-* images, retrieving over networks: Web page. (line 45)
-* input/output, two-way, See Also gawk, networking: Gawk Special Files.
- (line 19)
-* Internet, See networks: Interacting. (line 48)
-* JavaScript: STATIST. (line 56)
-* Linux: Troubleshooting. (line 54)
-* Linux <1>: Interacting. (line 27)
-* Linux <2>: REMCONF. (line 6)
-* Lisp: MOBAGWHO. (line 98)
-* localport field: Gawk Special Files. (line 34)
-* Loebner, Hugh: Challenges. (line 6)
-* Loui, Ronald: Challenges. (line 75)
-* MAZE: MAZE. (line 6)
-* Microsoft Windows: WEBGRAB. (line 43)
-* Microsoft Windows, networking: Troubleshooting. (line 54)
-* Microsoft Windows, networking, ports: Setting Up. (line 37)
-* MiniSQL: REMCONF. (line 109)
-* MOBAGWHO program: MOBAGWHO. (line 6)
-* NCBI, National Center for Biotechnology Information: PROTBASE.
- (line 6)
-* network type field: Special File Fields. (line 11)
-* networks, gawk and: Using Networking. (line 6)
-* networks, gawk and, connections: Special File Fields. (line 53)
-* networks, gawk and, connections <1>: TCP Connecting. (line 6)
-* networks, gawk and, filenames: Gawk Special Files. (line 29)
-* networks, gawk and, See Also email: Email. (line 6)
-* networks, gawk and, service, establishing: Setting Up. (line 6)
-* networks, gawk and, troubleshooting: Caveats. (line 6)
-* networks, ports, reserved: Setting Up. (line 37)
-* networks, ports, specifying: Special File Fields. (line 24)
-* networks, See Also web pages: PANIC. (line 6)
-* Numerical Recipes: STATIST. (line 24)
-* ORS variable, HTTP and: Web page. (line 29)
-* ORS variable, POP and: Email. (line 36)
-* PANIC program: PANIC. (line 6)
-* Perl: Using Networking. (line 14)
-* Perl, gawk networking and: Using Networking. (line 24)
-* Perlis, Alan: MAZE. (line 6)
-* pipes, networking and: TCP Connecting. (line 30)
-* PNG image format: Web page. (line 45)
-* PNG image format <1>: STATIST. (line 6)
-* POP (Post Office Protocol): Email. (line 6)
-* POP (Post Office Protocol) <1>: Email. (line 36)
-* Post Office Protocol (POP): Email. (line 6)
-* PostScript: STATIST. (line 138)
-* PROLOG: Challenges. (line 75)
-* PROTBASE: PROTBASE. (line 6)
-* protocol field: Special File Fields. (line 17)
-* PS image format: STATIST. (line 6)
-* Python: Using Networking. (line 14)
-* Python, gawk networking and: Using Networking. (line 24)
-* record separators, HTTP and: Web page. (line 29)
-* record separators, POP and: Email. (line 36)
-* REMCONF program: REMCONF. (line 6)
-* remoteport field: Gawk Special Files. (line 34)
-* RFC 1939: Email. (line 6)
-* RFC 1939 <1>: Email. (line 36)
-* RFC 1945: Web page. (line 29)
-* RFC 2068: Web page. (line 6)
-* RFC 2068 <1>: Interacting Service. (line 104)
-* RFC 2616: Web page. (line 6)
-* RFC 821: Email. (line 6)
-* robot: Challenges. (line 84)
-* robot <1>: WEBGRAB. (line 6)
-* RS variable, HTTP and: Web page. (line 29)
-* RS variable, POP and: Email. (line 36)
-* servers: Making Connections. (line 14)
-* servers <1>: Setting Up. (line 22)
-* servers, as hosts: Special File Fields. (line 34)
-* servers, HTTP: Interacting Service. (line 6)
-* servers, web: Simple Server. (line 6)
-* Simple Mail Transfer Protocol (SMTP): Email. (line 6)
-* SMTP (Simple Mail Transfer Protocol): Basic Protocols. (line 45)
-* SMTP (Simple Mail Transfer Protocol) <1>: Email. (line 6)
-* STATIST program: STATIST. (line 6)
-* STOXPRED program: STOXPRED. (line 6)
-* synchronous communications: Making Connections. (line 35)
-* Tcl/Tk: Using Networking. (line 14)
-* Tcl/Tk, gawk and: Using Networking. (line 24)
-* Tcl/Tk, gawk and <1>: Some Applications and Techniques.
- (line 22)
-* TCP (Transmission Control Protocol): Using Networking. (line 29)
-* TCP (Transmission Control Protocol) <1>: File /inet/tcp. (line 6)
-* TCP (Transmission Control Protocol), connection, establishing: TCP Connecting.
- (line 6)
-* TCP (Transmission Control Protocol), UDP and: Interacting. (line 48)
-* TCP/IP, network type, selecting: Special File Fields. (line 11)
-* TCP/IP, protocols, selecting: Special File Fields. (line 17)
-* TCP/IP, sockets and: Gawk Special Files. (line 19)
-* Transmission Control Protocol, See TCP: Using Networking. (line 29)
-* troubleshooting, gawk, networks: Caveats. (line 6)
-* troubleshooting, networks, connections: Troubleshooting. (line 6)
-* troubleshooting, networks, timeouts: Caveats. (line 18)
-* UDP (User Datagram Protocol): File /inet/udp. (line 6)
-* UDP (User Datagram Protocol), TCP and: Interacting. (line 48)
-* Unix, network ports and: Setting Up. (line 37)
-* URLCHK program: URLCHK. (line 6)
-* User Datagram Protocol, See UDP: File /inet/udp. (line 6)
-* vertical bar (|), |& operator (I/O): TCP Connecting. (line 25)
-* VRML: MAZE. (line 6)
-* web browsers, See web service: Interacting Service. (line 6)
-* web pages: Web page. (line 6)
-* web pages, images in: Interacting Service. (line 189)
-* web pages, retrieving: GETURL. (line 6)
-* web servers: Simple Server. (line 6)
-* web service: Primitive Service. (line 6)
-* web service <1>: PANIC. (line 6)
-* WEBGRAB program: WEBGRAB. (line 6)
-* Weizenbaum, Joseph: Simple Server. (line 11)
-* XBM image format: Interacting Service. (line 189)
-* Yahoo!: REMCONF. (line 6)
-* Yahoo! <1>: STOXPRED. (line 6)
-
-
-
-Tag Table:
-Node: Top2022
-Node: Preface5665
-Node: Introduction7040
-Node: Stream Communications8066
-Node: Datagram Communications9240
-Node: The TCP/IP Protocols10870
-Ref: The TCP/IP Protocols-Footnote-111554
-Node: Basic Protocols11711
-Ref: Basic Protocols-Footnote-113756
-Node: Ports13785
-Node: Making Connections15192
-Ref: Making Connections-Footnote-117750
-Ref: Making Connections-Footnote-217797
-Node: Using Networking17978
-Node: Gawk Special Files20301
-Node: Special File Fields22110
-Ref: table-inet-components26003
-Node: Comparing Protocols27312
-Node: File /inet/tcp27846
-Node: File /inet/udp28874
-Ref: File /inet/udp-Footnote-130573
-Node: TCP Connecting30827
-Node: Troubleshooting33173
-Ref: Troubleshooting-Footnote-136232
-Node: Interacting36805
-Node: Setting Up39545
-Node: Email43048
-Node: Web page45380
-Ref: Web page-Footnote-148197
-Node: Primitive Service48395
-Node: Interacting Service51136
-Ref: Interacting Service-Footnote-160303
-Node: CGI Lib60335
-Node: Simple Server67310
-Ref: Simple Server-Footnote-175053
-Node: Caveats75154
-Node: Challenges76299
-Node: Some Applications and Techniques84997
-Node: PANIC87462
-Node: GETURL89186
-Node: REMCONF91819
-Node: URLCHK97314
-Node: WEBGRAB101166
-Node: STATIST105628
-Ref: STATIST-Footnote-1117377
-Node: MAZE117822
-Node: MOBAGWHO124029
-Ref: MOBAGWHO-Footnote-1138047
-Node: STOXPRED138102
-Node: PROTBASE152390
-Node: Links165506
-Node: GNU Free Documentation License168939
-Node: Index194059
-
-End Tag Table