\input texinfo
@comment %**start of header (This is for running Texinfo on a region.)
@setfilename mkid.info
@settitle The ID Database
@setchapternewpage odd
@comment %**end of header (This is for running Texinfo on a region.)

@include version.texi

@ifinfo
@format
START-INFO-DIR-ENTRY
* mkid: (mkid).			Identifier database utilities
END-INFO-DIR-ENTRY
@end format
@end ifinfo

@ifinfo
This file documents the @code{mkid} identifier database utilities.

Copyright (C) 1991 Tom Horsley

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).

@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation.
@end ifinfo

@titlepage
@title The MKID Identifier Database, version @value{VERSION}
@subtitle A Simple, Fast, High-Capacity Cross-Referencer
@subtitle lid, gid, aid, eid, pid, iid
@author by Tom Horsley

@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1991 Tom Horsley

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation.
@end titlepage

@ifinfo
@node Top, Overview, (dir), (dir)
@top GNU @code{mkid}

@menu
* Overview::                    What is an ID database and what tools manipulate it?
* Mkid::                        Mkid
* Database Query Tools::        Database Query Tools
* Iid::                         Iid
* Other Tools::                 Other Tools
* Command Index::               Command Index
@end menu

@end ifinfo

@node Overview, Mkid, Top, Top
@chapter Overview
@cindex Reference to First Chapter
An ID database is simply a file containing a list of file names, a list of
identifiers, and a binary relation (stored as a bit matrix) indicating which
of the identifiers appear in each file.  With this database and some tools
to manipulate the data, a host of tasks become simpler and faster. You can
@code{grep} through hundreds of files for a name, skipping the files that
don't contain the name.  You can search for all the memos containing
references to a project.  You can edit every file that calls some function,
adding a new required argument. Anyone with a large software project to
maintain, or a large set of text files to organize, can benefit from the ID
database and the tools that manipulate it.
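
As a rough illustration of that structure, the following Python sketch
models a database as a list of file names, a list of identifiers, and one
row of bits per identifier.  The file names, identifiers, and the
@code{files_containing} helper are invented for this example; it does not
reproduce the file format that @code{mkid} actually writes.

@example
# Toy model of an ID database: file names, identifiers, bit matrix.
# All names and data here are hypothetical.
files = ["main.c", "util.c", "util.h"]
identifiers = ["main", "printf", "xmalloc"]

# One integer per identifier; bit i is set if files[i] contains it.
matrix = [
    0b001,  # main:    main.c
    0b011,  # printf:  main.c, util.c
    0b110,  # xmalloc: util.c, util.h
]


def files_containing(name):
    """Return the files whose bit is set in the row for name."""
    row = matrix[identifiers.index(name)]
    return [f for i, f in enumerate(files) if (row >> i) & 1]


print(files_containing("xmalloc"))
@end example

A query only has to read one row of the matrix, which is why lookups
stay fast however many files were scanned.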

There are several programs in the ID family.  The @code{mkid} program
scans the files, finds the identifiers and builds the ID database.  The
@code{lid} and @code{aid} tools are used to generate lists of file names
containing an identifier (perhaps to recompile every file that
references a macro which just changed). The @code{eid} program will
invoke an editor on each of the files containing an identifier and the
@code{gid} program will @code{grep} for an identifier in the subset of
files known to contain it.  The @code{pid} tool is used to query the
path names of the files in the database (rather than the contents).
Finally, the @code{iid} tool is an interactive program supporting
complex queries to intersect and join sets of file names.

@menu
* History::                     History
@end menu

@node History,  , Overview, Overview
@section History
Greg McGary conceived of the ideas behind mkid when he began hacking
the UNIX kernel in 1984.  He needed a navigation tool to help him find
his way around the expansive, unfamiliar landscape.  The first mkid-like tools
were built with shell scripts, and produced an ASCII database that looks
much like the output of @code{lid} with no arguments.  It took over an hour
on a VAX 11/750 to build a database for a 4.1BSDish kernel.  Lookups were
done with the UNIX command @code{look}, modified to handle very long lines.

In 1986, Greg rewrote mkid, lid, fid and idx in C to improve
performance.  Database-build times were shortened by an order of
magnitude.  The mkid tools were first posted to @file{comp.sources.unix}
in September of 1987.

Over the next few years, several versions diverged from the original
source.  Tom Horsley at Harris Computer Systems Division stepped forward
to take over maintenance and integrated some of the fixes from divergent
versions.  He also wrote the @code{iid} program.  A pre-release of
@code{mkid2} was posted to @file{alt.sources} near the end of 1990.  At
that time Tom wrote this texinfo manual with the encouragement of the net
community.  (Tom thanks Doug Scofield and Bill Leonard whom I dragooned
into helping me poorf raed and edit --- they found several problems in
the initial version.)

In January, 1995, Greg McGary reemerged as the primary maintainer and is
hereby launching @code{mkid-3} whose primary new feature is an efficient
algorithm for building databases that is linear over the size of the
input text for both time and space.  (The old algorithm was quadratic
for space and choked on very large source trees.)  The code is now under
GPL and might become a part of the GNU system.  @code{Mkid-3} is an
interim release, since several significant enhancements are in the works.
These include an optional coupling with GNU grep, so that grep can use
an ID database for hints; a cscope work-alike query interface;
incremental update of the ID database; and an automatic file-tree walker
so you need not explicitly supply every file name argument to
the @code{mkid} program.

@node Mkid, Database Query Tools, Overview, Top
@chapter Mkid
The @code{mkid} program builds the ID database.  To do this it must scan
each of the files included in the database.  This takes some time, but
once the work is done the query programs run very rapidly.

The @code{mkid} program knows how to scan a variety of files. For
example, it knows how to skip over comments and strings in a C program,
only picking out the identifiers used in the code.

Identifiers are not the only thing included in the database.
Numbers are also scanned and included in the database indexed by
their binary value. Since the same number can be written many
different ways (47, 0x2f, 057 in a C program for instance), this
feature allows you to find hard coded uses of constants without
regard to the radix used to specify them.
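
The idea of indexing numbers by value can be sketched in Python as
follows.  The @code{c_number_value} function is a hypothetical helper
that handles only plain decimal, octal, and hexadecimal integer
literals (no suffixes, no floating point); it is not the code
@code{mkid} itself uses.

@example
def c_number_value(text):
    """Return the value of a simple C-style integer literal."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)   # hexadecimal
    if lowered.startswith("0") and len(lowered) > 1:
        return int(lowered[1:], 8)    # octal
    return int(lowered, 10)           # decimal


# The three spellings from the text all map to the same key.
for literal in ("47", "0x2f", "057"):
    print(literal, c_number_value(literal))
@end example

Because all three spellings produce the same key, one query can find
every one of them.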

All the places in this document where identifiers are written about
should really mention identifiers and numbers, but that gets fairly
clumsy after a while, so you should always keep in mind that numbers are
included in the database as well as identifiers.

@menu
* Mkid Command Line Options::   Mkid Command Line Options
* Builtin Scanners::            Builtin Scanners
* Adding Your Own Scanner::     Adding Your Own Scanner
* Mkid Examples::               Mkid Examples
@end menu

@node Mkid Command Line Options, Builtin Scanners, Mkid, Mkid
@section Mkid Command Line Options
@deffn Command mkid [@code{-v}] [@code{-S@var{scanarg}}] [@code{-a@var{arg-file}}] [@code{-}] [@code{-f@var{out-file}}] [@code{-u}] [@code{files}@dots{}]
@table @code
@item -v
Verbose. Mkid tells you as it scans each file and indicates which scanner
it is using. It also summarizes some statistics about the database at
the end.
@item -S@var{scanarg}
The @code{-S} option is used to specify arguments to the various language
scanners. @xref{Scanner Arguments}, for details.
@item -a@var{arg-file}
Name a file containing additional command line arguments (one per line). This
may be used to specify lists of file names longer than will fit on a command
line.
@item -
A simple @code{-} by itself means read arguments from stdin.
@item -f@var{out-file}
Specify the name of the database file to create. The default name is @code{ID}
(in the current directory), but you may specify any name. The file names
stored in the database will be stored relative to the directory containing
the database, so if you move the database after creating it, you may have
trouble finding files unless they remain in the same relative position.
@item -u
The @code{-u} option updates an existing database by rescanning any files
that have changed since the database was written. Unfortunately you cannot
incrementally add new files to a database.
@item files
Remaining arguments are names of files to be scanned and included in the
database.
@end table
@end deffn

@menu
* Scanner Arguments::           Scanner Arguments
@end menu

@node Scanner Arguments,  , Mkid Command Line Options, Mkid Command Line Options
@subsection Scanner Arguments
Scanner arguments all start with @code{-S}. Scanner arguments are used to tell
@code{mkid} which language scanner to use for which files, to pass language
specific options to the individual scanners, and to get some limited
online help about scanner options.

@code{Mkid} usually determines which language scanner to use on a file
by looking at the suffix of the file name. The suffix starts at the last
@samp{.} in a file name and includes the @samp{.} and all remaining
characters (for example the suffix of @file{fred.c} is @file{.c}). Not
all files have a suffix, and not all suffixes are bound to a specific
language by mkid. If @code{mkid} cannot determine what language a file
is, it will use the language bound to the @file{.default} suffix. The
plain text scanner is normally bound to @file{.default}, but the
@code{-S} option can be used to change any language bindings.
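
The lookup just described might be sketched in Python like this.  The
binding table is a made-up subset (run @code{mkid -S?=?} to see the real
one), the function names are hypothetical, and taking the suffix from
the last component of the path is an assumption on my part.

@example
import os

# Hypothetical subset of the suffix-to-language bindings.
bindings = dict([
    (".c", "c"),
    (".h", "c"),
    (".y", "c"),
    (".default", "text"),
])


def suffix_of(file_name):
    """Return the suffix starting at the last dot, or an empty string."""
    base = os.path.basename(file_name)
    dot = base.rfind(".")
    return base[dot:] if dot >= 0 else ""


def language_for(file_name):
    """Look up the scanner language, falling back to .default."""
    return bindings.get(suffix_of(file_name), bindings[".default"])


print(suffix_of("fred.c"), language_for("fred.c"))
print(language_for("README"))
@end example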

There are several different forms for scanner options:
@table @code
@item -S.@var{<suffix>}=@var{<language>}
@code{Mkid} determines which language scanner to use on a file by examining the
file name suffix. The @samp{.} is part of the suffix and must be specified
in this form of the @code{-S} option. For example @samp{-S.y=c} tells
@code{mkid} to use the @samp{c} language scanner for all files ending in
the @samp{.y} suffix.
@item -S.@var{<suffix>}=?
@code{Mkid} has several built in suffixes it already recognizes. Passing
a @samp{?} will cause it to print the language it will use to scan files
with that suffix.
@item -S?=@var{<language>}
This form will print which suffixes are scanned with the given language.
@item -S?=?
This prints all the suffix@expansion{}language bindings recognized by
@code{mkid}.
@item -S@var{<language>}-@var{<arg>}
Each language scanner accepts scanner dependent arguments. This form of the
@code{-S} option is used to pass arbitrary arguments to the language scanners.
@item -S@var{<language>}?
Passing a @samp{?} instead of a language option will print a brief summary
of the options recognized by the specified language scanner.
@item -S@var{<new language>}/@var{<builtin language>}/@var{<filter command>}
This form specifies a new language defined in terms of a builtin language
and a shell command that will be used to filter the file prior to passing
on to the builtin language scanner.
@end table

@node Builtin Scanners, Adding Your Own Scanner, Mkid Command Line Options, Mkid
@section Builtin Scanners
If you run @code{mkid -S?=?} you will find bindings for a number of
languages; unfortunately pascal, though mentioned in the list, is not
actually supported.  The supported languages are documented below.@footnote{This
is not strictly true: vhil is a supported language, but it is an obsolete
and arcane dialect of C and should be ignored.}

@menu
* C::                           C
* Plain Text::                  Plain Text
* Assembler::                   Assembler
@end menu

@node C, Plain Text, Builtin Scanners, Builtin Scanners
@subsection C

The C scanner is probably the most popular. It scans identifiers out of
C programs, skipping over comments and strings in the process.  The
normal @file{.c} and @file{.h} suffixes are automatically recognized as
C language, as well as the more obscure @file{.y} (yacc) and @file{.l}
(lex) suffixes.

The @code{-S} options recognized by the C scanner are:

@table @code
@item -Sc-s@var{<character>}
Allow the specified @var{<character>} in identifiers (some dialects of
C allow @code{$} in identifiers, so you could say @code{-Sc-s$} to
accept that dialect).
@item -Sc-u
Don't strip leading underscores from identifier names (this is the default
mode of operation).
@item -Sc+u
Do strip leading underscores from identifier names (I don't know why you
would want to do this in C programs, but the option is available).
@end table

@node Plain Text, Assembler, C, Builtin Scanners
@subsection Plain Text
The plain text scanner is designed for scanning documents. This is
typically the scanner used when adding custom scanners, and several
custom scanners are built in to @code{mkid} and defined in terms of filters
and the text scanner. A troff scanner runs @code{deroff} over the file
then feeds the result to the text scanner. A compressed man page scanner
runs @code{pcat} piped into @code{col -b}, and a @TeX{} scanner runs
@code{detex}.

Options:

@table @code
@item -Stext+a@var{<character>}
Include the specified character in identifiers. By default, standard
C identifiers are recognized.
@item -Stext-a@var{<character>}
Exclude the specified character from identifiers.
@item -Stext+s@var{<character>}
Squeeze the specified character out of identifiers. By default, the
characters @samp{'}, @samp{-}, and @samp{.} are squeezed out of identifiers.
This generates transformations like @var{fred's}@expansion{}@var{freds} or
@var{a.s.p.c.a.}@expansion{}@var{aspca}.
@item -Stext-s@var{<character>}
Do not squeeze out the specified character.
@end table
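
The squeezing behavior can be pictured with a few lines of Python.  This
is only a sketch of the transformation shown above; the @code{squeeze}
function is a hypothetical name, and its default argument simply lists
the three characters the text scanner squeezes by default.

@example
def squeeze(word, squeezed="'-."):
    """Remove every squeezed character from word."""
    return "".join(ch for ch in word if ch not in squeezed)


print(squeeze("fred's"))
print(squeeze("a.s.p.c.a."))
@end example

Passing a shorter @code{squeezed} string models the effect of the
@code{-Stext-s} form, which stops a character from being squeezed out.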

@node Assembler,  , Plain Text, Builtin Scanners
@subsection Assembler
Assemblers come in several flavors, so there are several options to
control scanning of assembly code:

@table @code
@item -Sasm-c@var{<character>}
The specified character starts a comment that extends to end of line
(in many assemblers this is a semicolon or number sign --- there is
no default value for this).
@item -Sasm+u
Strip the leading underscores off identifiers (the default behavior).
@item -Sasm-u
Do not strip the leading underscores.
@item -Sasm+a@var{<character>}
The specified character is allowed in identifiers.
@item -Sasm-a@var{<character>}
The specified character is allowed in identifiers, but any identifier
containing that character is ignored (often a @samp{.} or @samp{@@}
marks an internal temp label, and you may want to
ignore these).
@item -Sasm+p
Recognize C preprocessor directives in assembler source (default).
@item -Sasm-p
Do not recognize C preprocessor directives in assembler source.
@item -Sasm+C
Skip over C style comments in assembler source (default).
@item -Sasm-C
Do not skip over C style comments in assembler source.
@end table

@node Adding Your Own Scanner, Mkid Examples, Builtin Scanners, Mkid
@section Adding Your Own Scanner

There are two ways to add new scanners to @code{mkid}. The first is to
modify the code in @file{getscan.c} and add a new @file{scan-*.c} file
with the code for your scanner. This is not too hard, but it requires
relinking and installing a new version of @code{mkid}, which might be
inconvenient, and would lead to the proliferation of @code{mkid}
versions.

The second technique uses the @code{-S<lang>/<lang>/<filter>} form
of the @code{-S} option to specify a new language scanner. In this form
the first language is the name of the new language to be defined,
the second language is the name of an existing language scanner to
be invoked on the output of the filter command specified as the
third component of the @code{-S} option.

The filter is an arbitrary shell command. Somewhere in the filter string,
a @code{%s} should occur. This @code{%s} is replaced by the name of the
source file being scanned, the shell command is invoked, and whatever
comes out on @var{stdout} is scanned using the builtin scanner.
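
The substitution step can be sketched in Python as shown below.  This is
an illustration under stated assumptions, not the code in @code{mkid}:
the @code{run_filter} helper is hypothetical, quoting the file name is
my own precaution, and the @code{tr} filter and temporary file exist
only to make the example runnable on a system with a POSIX shell.

@example
import os
import shlex
import subprocess
import tempfile


def run_filter(filter_template, file_name):
    """Substitute file_name for %s, run the command, return stdout."""
    command = filter_template.replace("%s", shlex.quote(file_name))
    result = subprocess.run(command, shell=True, check=True,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout


# Hypothetical input: a small file passed through tr before scanning.
with tempfile.NamedTemporaryFile("w", suffix=".txt",
                                 delete=False) as tmp:
    tmp.write("hello world\n")
filtered = run_filter("tr a-z A-Z < %s", tmp.name)
os.unlink(tmp.name)
print(filtered.split())
@end example

Whatever the filter prints is what the builtin scanner sees, so the
original file is never scanned directly.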

For example, no scanner is provided for texinfo files (like this one).
If I wished to index the contents of this file, but avoid indexing the
texinfo directives, I would need a filter that stripped out the texinfo
directives, but left the remainder of the file intact. I could then use
the plain text scanner on the remainder. A quick way to specify this
might be:

@example
'-Stexinfo/text/sed s,@@[a-z]*,,g < %s'
@end example

This defines a new language scanner (@var{texinfo}) defined in terms of
a @code{sed} command to strip out texinfo directives (at signs followed
by letters). Once the directives are stripped, the remaining text is run
through the plain text scanner.

This is just an example; to do a better job I would actually need to
delete some lines (such as those beginning with @code{@@end}) as well
as deleting the @code{@@} directives embedded in the text.

@node Mkid Examples,  , Adding Your Own Scanner, Mkid
@section Mkid Examples

The simplest example of @code{mkid} is something like:

@example
mkid *.[chy]
@end example

This will build an ID database indexing all the
identifiers and numbers in the @file{.c}, @file{.h}, and @file{.y} files
in the current directory. Because those suffixes are already known to
@code{mkid} as C language files, no other special arguments are required.

From a simple example, let's go to a more complex one. Suppose you want
to build a database indexing the contents of all the @var{man} pages.
Since @code{mkid} already knows how to deal with @file{.z} files, let's
assume your system is using the @code{compress} program to store
compressed cattable versions of the @var{man} pages.  The
@code{compress} program creates files with a @file{.Z} suffix, so
@code{mkid} will have to be told how to scan @file{.Z} files. The
following code shows how to combine the @code{find} command with the
special scanner arguments to @code{mkid} to generate the required ID
database:

@example
cd /usr/catman
find . -name '*.Z' -print | mkid '-Sman/text/uncompress -c < %s' -S.Z=man -
@end example

This example first switches to the @file{/usr/catman} directory where
the compressed @var{man} pages are stored. The @code{find} command then
finds all the @file{.Z} files under that directory and prints their
names.  This list is piped into the @code{mkid} program. The @code{-}
argument by itself (at the end of the line) tells @code{mkid} to read
arguments (in this case the list of file names) from @var{stdin}. The
first @code{-S} argument defines a new language (@var{man}) in terms of
the @code{uncompress} utility and the existing text scanner. The second
@code{-S} argument tells @code{mkid} to treat all @file{.Z} files as
language @var{man}. In practice, you might find the @code{mkid}
arguments need to be even more complex, something like:

@example
mkid '-Sman/text/uncompress -c < %s | col -b' -S.Z=man -
@end example

This will take the additional step of getting rid of any underlining and
backspacing which might be present in the compressed @var{man} pages.

@node Database Query Tools, Iid, Mkid, Top
@chapter Database Query Tools

The ID database is useless without database query tools. The remainder
of this document describes those tools.

The @code{lid}, @code{gid},
@code{aid}, @code{eid}, and @code{pid} programs are all the same program
installed with links to different names. The name used to invoke the
program determines how it will act.

The @code{iid} program is an interactive query shell that sits on top
of the other query tools.

@menu
* Common Options::              Common command line options
* Patterns::                    Identifier pattern matching
* Lid::                         Look up identifiers
* Aid::                         Case insensitive lid
* Gid::                         Grep for identifiers
* Eid::                         Edit files with matching identifiers
* Pid::                         Look up path names in database
@end menu

@node Common Options, Patterns, Database Query Tools, Database Query Tools
@section Common Options

Since many of the programs are really links to one common program, it
is only reasonable to expect that most of the query tools would share
common command line options. Not all options make sense for all programs,
but they are all described here. The description of each program
gives the options that program uses.

@table @code
@item -f@var{<file>}
Read the database specified by @var{<file>}. Normally the tools look
for a file named @file{ID} in either the current directory or in any
of the directories above the current directory. This means you can keep
a global @file{ID} database in the root of a large source tree and use
the query tools from anywhere within that tree.
@item -r@var{<directory>}
The query tools usually assume the file names in the database are relative
to the directory holding the database. The @code{-r} option tells the
tools to look for the files relative to @var{<directory>} regardless
of the location of the database.
@item -c
This is shorthand for @code{-r`pwd`}. It tells the query tools to assume
the file names are stored relative to the current working directory.
@item -e
Force the pattern arguments to be treated as regular expressions.
Normally the query tools attempt to guess if the patterns are regular
expressions or simple identifiers by looking for special characters
in the pattern.
@item -w
Force the pattern arguments to be treated as simple words even if
they contain special regular expression characters.
@item -k
Normally the query tools that generate lists of file names attempt to
compress the lists using the @code{csh} brace notation. This option
suppresses the file name compression and outputs each name in full.
(This is particularly useful if you are a @code{ksh} user and want to
feed the list of names to another command --- the @code{-k} option
comes from the @code{k} in @code{ksh}).
@item -g
It is possible to build the query tools so the @code{-k} option is the
default behavior. If this is the case for your system, the @code{-g}
option turns on the globbing of file names using the @code{csh} brace
notation.
@item -n
Normally the query tools that generate lists of file names also list
the matching identifier at the head of the list of names. This is
irritating if you want just a list of names to feed to another command,
so the @code{-n} option suppresses the identifier and lists only
file names.
@item -b
This option is only used by the @code{pid} tool. It restricts @code{pid}
to pattern match only the basename part of a file name. Normally the
absolute file name is matched against the pattern.
@item -d -o -x -a
These options may be used in any combination to limit the radix of
numeric matches. The @code{-d} option will allow matches on decimal
numbers, @code{-o} on octal, and @code{-x} on hexadecimal numbers.
The @code{-a} option is shorthand for specifying all three.
@item -m
Merge multiple lines of output into a single line. (If your query
matches more than one identifier the default action is to generate
a separate line of output for each matching identifier).
@item -s
Search for identifiers that appear only once in the database. This
helps to locate identifiers that are defined but never used.
@item -u@var{<number>}
List identifiers that conflict in the first @var{<number>} characters.
This could be useful porting programs to brain-dead computers that
refuse to support long identifiers, but your best long term option
is to set such computers on fire.
@end table
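
The upward search for a file named @file{ID}, described under the
@code{-f} option above, can be sketched in Python as follows.  The
@code{find_database} function is a hypothetical illustration of the
search order, not the code the query tools use.

@example
import os


def find_database(start_dir, name="ID"):
    """Search start_dir and each directory above it for name."""
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, name)
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:   # reached the root directory
            return None
        current = parent


# Prints the path of the nearest ID file, or None if there is none.
print(find_database(os.getcwd()))
@end example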

@node Patterns, Lid, Common Options, Database Query Tools
@section Patterns

You can attempt to match either simple identifiers or numbers in a
query, or you can specify a regular expression pattern which may
match many different identifiers in the database. The query
programs use either @var{regex} and @var{regcmp} or @var{re_comp}
and @var{re_exec}, depending on which one is available in the library
on your system. These might not always support the exact same
regular expression syntax, so consult your local @var{man} pages
to find out. Any regular expression routines should support the following
syntax:

@table @code
@item .
A dot matches any character.
@item [ ]
Brackets match any of the characters specified within the brackets.  You
can match any characters @emph{except} the ones in brackets by typing
@code{^} as the first character. A range of characters can be specified
using @code{-}.
@item *
An asterisk means repeat the previous pattern zero or more times.
@item ^
An @code{^} at the beginning of a pattern means the pattern must match
starting at the first character of the identifier.
@item $
A @code{$} at the end of the pattern means the pattern must match ending
at the last character in the identifier.
@end table
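
To see how these constructs combine, here is a small Python sketch.
Python's @code{re} module is only a stand-in for the regular expression
routines on your system, the identifier list is invented, and treating
an unanchored pattern as matching anywhere in the identifier is an
assumption.

@example
import re

# Hypothetical identifiers standing in for the database contents.
identifiers = ["getchar", "fgets", "getline", "target", "get"]


def matching(pattern):
    """Return the identifiers in which pattern matches somewhere."""
    return [name for name in identifiers if re.search(pattern, name)]


print(matching("^get"))          # must match at the first character
print(matching("get$"))          # must match at the last character
print(matching("^get.*[er]$"))   # dot, asterisk and brackets together
@end example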

@node Lid, Aid, Patterns, Database Query Tools
@section Lid

@deffn Command lid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-ewdoxamskgnc}] patterns@dots{}
@end deffn

The @code{lid} program stands for @var{lookup identifier}.
It searches the database for any identifiers matching the patterns
and prints the names of the files that match each pattern. The exact
format of the output depends on the options.

@node Aid, Gid, Lid, Database Query Tools
@section Aid

@deffn Command aid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-doxamskgnc}] patterns@dots{}
@end deffn

The @code{aid} command is an abbreviation for @var{apropos identifier}.
The patterns cannot be regular expressions, but it looks for them using
a case insensitive match, and any pattern that is a substring of an
identifier in the database will match that identifier.

For example @samp{aid get} might match the identifiers @code{fgets},
@code{GETLINE}, and @code{getchar}.
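
The matching rule amounts to a case insensitive substring test, which
the following Python sketch spells out.  The @code{aid_match} function
and the identifier list are hypothetical.

@example
def aid_match(pattern, identifiers):
    """Return identifiers containing pattern, ignoring case."""
    lowered = pattern.lower()
    return [name for name in identifiers if lowered in name.lower()]


print(aid_match("get", ["fgets", "GETLINE", "getchar", "putchar"]))
@end example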

@node Gid, Eid, Aid, Database Query Tools
@section Gid

@deffn Command gid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-doxasc}] patterns@dots{}
@end deffn

The @code{gid} command stands for @var{grep for identifiers}. It finds
identifiers in the database that match the specified patterns, then
@code{greps} for those identifiers in just the set of files containing
matches. In a large source tree, this saves a fantastic amount of time.

There is an @var{emacs} interface to this program (@pxref{GNU Emacs Interface}).
If you are an @var{emacs} user, you will probably prefer the @var{emacs}
interface over the @code{eid} tool.

@node Eid, Pid, Gid, Database Query Tools
@section Eid

@deffn Command eid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-doxasc}] patterns@dots{}
@end deffn

The @code{eid} command allows you to invoke an editor on each file containing
a matching pattern. The @code{EDITOR} environment variable is the name of the
program to be invoked. If the specified editor can accept an initial search
argument on the command line, you can use the @code{EIDARG}, @code{EIDLDEL},
and @code{EIDRDEL} environment variables to specify the form of that argument.

@table @code
@item EDITOR
The name of the editor program to invoke.
@item EIDARG
A printf string giving the form of the argument to pass containing the
initial search string (the matching identifier). For @code{vi}
it should be set to @samp{+/%s/}.
@item EIDLDEL
A string giving the regular expression pattern that forces a match at
the beginning (left end) of a word. This string is inserted in front
of the matching identifier when composing the search argument. For @code{vi},
this should be @samp{\<}.
@item EIDRDEL
The matching right end word delimiter. For @code{vi}, use @samp{\>}.
@end table

@node Pid,  , Eid, Database Query Tools
@section Pid

@deffn Command pid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-ebkgnc}] patterns@dots{}
@end deffn

The @code{pid} tool is unlike all the other tools. It matches the
patterns against the file names in the database rather than the
identifiers in the database.  Patterns are treated as shell wild card
patterns unless the @code{-e} option is given, in which case full
regular expression matching is done.

The wild card pattern is matched against the absolute path name of the
file. Most shells treat slashes @samp{/} and file names that start with
dot @samp{.} specially; @code{pid} does not do this. It simply attempts
to match the absolute path name string against the wild card pattern.

The @code{-b} option restricts the pattern matching to the base name of
the file (all the leading directory names are stripped prior to pattern
matching).
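
Python's @code{fnmatch} module happens to behave the same way (slashes
and leading dots get no special treatment), so it makes a convenient
model of the matching described above.  The path list and the
@code{pid_match} function are hypothetical, and @code{fnmatch} is only a
stand-in for whatever matcher @code{pid} uses.

@example
import fnmatch
import posixpath

# Hypothetical path names standing in for the database contents.
paths = [
    "/usr/src/lib/.hidden.c",
    "/usr/src/lib/io/read.c",
    "/usr/src/README",
]


def pid_match(pattern, base_only=False):
    """Match pattern against each path, or only against its base name."""
    result = []
    for path in paths:
        subject = posixpath.basename(path) if base_only else path
        if fnmatch.fnmatchcase(subject, pattern):
            result.append(path)
    return result


print(pid_match("*.c"))                    # star crosses slashes
print(pid_match("read*", base_only=True))  # like the -b option
@end example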

@node Iid, Other Tools, Database Query Tools, Top
@chapter Iid

@deffn Command iid [@code{-a}] [@code{-c@var{<command>}}] [@code{-H}]
@table @code
@item -a
Normally @code{iid} uses the @code{lid} command to search for names.
If you give the @code{-a} option on the command line, then it will
use @code{aid} as the default search engine.
@item -c@var{<command>}
In normal operation, @code{iid} starts up and prompts you for commands
used to build sets of files. The @code{-c} option is used to pass a
single query command to @code{iid}, which executes it and then exits.
@item -H
The @code{-H} option prints a short help message and exits. To get more
help use the @code{help} command from inside @code{iid}.
@end table
@end deffn

The @code{iid} program is an interactive ID query tool. It operates by
running the other query programs (such as @code{lid} and @code{aid})
and creating sets of file names returned by these queries. It also
provides operators for @code{anding} and @code{oring} these sets to
create new sets.

The @code{PAGER} environment variable names the program @code{iid} uses
to display files. If you use @code{emacs}, you might want to set
@code{PAGER} so it invokes the @code{emacsclient} program. Check the
file @file{lisp/server.el} in the emacs source tree for documentation on
this. It is useful not only with X windows, but also when running
@code{iid} from an emacs shell buffer. There is also a somewhat spiffier
version called gnuserv by Andy Norman
(@code{ange%anorman@@hplabs.hp.com}) which appeared in @file{comp.emacs}
sometime in 1989.

@menu
* Ss and Files commands::       Ss and Files commands
* Sets::                        Sets
* Show::                        Show
* Begin::                       Begin
* Help::                        Help
* Off::                         Off
* Shell Commands as Queries::   Shell Commands as Queries
* Shell Escape::                Shell Escape
@end menu

@node Ss and Files commands, Sets, Iid, Iid
@section Ss and Files commands

The primary query commands are @code{ss} (for select sets) and @code{files}
(for show file names). These commands both take a query expression as an
argument.

@deffn Subcommand ss query
The @code{ss} command runs a query and builds a set (or sets) of file names. The
result is printed as a summary of the sets constructed showing how many file
names are in each set.
@end deffn

@deffn Subcommand files query
The @code{files} command is like the @code{ss} command, but rather than printing
a summary, it displays the full list of matching file names.
@end deffn

@deffn Subcommand f query
The @code{f} command is merely a shorthand notation for @code{files}.
@end deffn

Database queries are simple expressions with operators like @code{and}
and @code{or}. Parentheses can be used to group operations. The complete
set of operators is summarized below:

@table @code
@item @var{pattern}
Any pattern not recognized as one of the keywords in this table is treated
as an identifier to be searched for in the database. It is passed as an
argument to the default search program (normally @code{lid}, but @code{aid}
is used if the @code{-a} option was given when @code{iid} was started).
The result of this operation is a set of file names, and it is assigned a
unique set number.
@item lid
@code{lid} is a keyword. It is used to invoke @code{lid} with the list of
identifiers following it as arguments. This forces the use of @code{lid}
regardless of the state of the @code{-a} option (@pxref{Lid}).
@item aid
The @code{aid} keyword is like the @code{lid} keyword, but it forces the
use of the @code{aid} program (@pxref{Aid}).
@item match
The @code{match} operator invokes the @code{pid} program to do pattern
matching on file names rather than identifiers. The set generated contains
the file names that match the specified patterns (@pxref{Pid}).
@item or
The @code{or} operator takes two sets of file names as arguments and generates
a new set containing every file that appears in either set.
@item and
The @code{and} operator takes two sets of file names and generates a new
set containing only the files that appear in both sets.
@item not
The @code{not} operator inverts a set of file names, producing the set of
all files not in the input set.
@item set number
A set number consists of the letter @code{s} followed immediately by a number.
This refers to one of the sets created by a previous query operation. During
one @code{iid} session, each query generates a unique set number, so any
previously generated set may be used as part of any new query by referring
to the set number.
@end table

The @code{not} operator has the highest precedence, @code{and} comes
next, and @code{or} has the lowest precedence.  The
operator names are recognized using case-insensitive matching, so
@code{AND}, @code{and}, and @code{aNd} are all the same as far as
@code{iid} is concerned. If you wish to use a keyword as an operand to
one of the query programs, you must enclose it in quotes.  Any patterns
containing shell special characters must also be properly quoted or
escaped, since the query commands are run by invoking them with the
shell.

Summary of query expression syntax:

@example
A <query> is:
   <set number>
   <identifier>
   lid <identifier list>
   aid <identifier list>
   match <wild card list>
   <query> or <query>
   <query> and <query>
   not <query>
   ( <query> )
@end example
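
The following queries sketch how these forms combine. The identifiers
(@code{main}, @code{errno}, @code{open}), the wild card, and the set
numbers are hypothetical; the set numbers assume that earlier queries in
the same session created sets @code{s1} and @code{s2}. The wild card is
quoted because it contains a shell special character.

@example
ss main
ss lid open errno
files match '*.h'
ss s1 and not s2
files main or errno and open
files (main or errno) and open
@end example

Because @code{and} binds more tightly than @code{or}, the fifth query
is read as @code{main or (errno and open)}; the parentheses in the last
query force the other grouping.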

@node Sets, Show, Ss and Files commands, Iid
@section Sets

@deffn Subcommand sets
@end deffn

The @code{sets} command displays all the sets created so far. Each one
is described by the query command that generated it.

@node Show, Begin, Sets, Iid
@section Show

@deffn Subcommand show set
@end deffn

@deffn Subcommand p set
@end deffn

The @code{show} and @code{p} commands are equivalent. They both accept
a set number as an argument and run the program given in the @code{PAGER}
environment variable with the file names in that set as arguments.
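
A short session might combine these commands as follows. The identifier
and the set number are hypothetical; use whatever number @code{ss} or
@code{sets} reports for the set you want to view.

@example
ss main
sets
show s1
@end example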

@node Begin, Help, Show, Iid
@section Begin

@deffn Subcommand begin directory
@end deffn

@deffn Subcommand b directory
@end deffn

The @code{begin} command (and its abbreviated version @code{b}) is used
to begin a new @code{iid} session in a different directory (which presumably
contains a different database). It flushes all the sets created so far
and switches to the specified directory. It is equivalent to exiting @code{iid},
changing directories in the shell, and running @code{iid} again.
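
For instance, assuming a second source tree with its own database
lives in the hypothetical directory @file{../kernel}:

@example
begin ../kernel
ss main
@end example

Any sets built before the @code{begin} command are no longer available
afterward.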

@node Help, Off, Begin, Iid
@section Help

@deffn Subcommand help
@end deffn

@deffn Subcommand h
@end deffn

@deffn Subcommand ?
@end deffn

The @code{help}, @code{h}, and @code{?} commands are three different ways to
ask for help. They all invoke the @code{PAGER} program to display a short
help file.

@node Off, Shell Commands as Queries, Help, Iid
@section Off

@deffn Subcommand off
@end deffn

@deffn Subcommand quit
@end deffn

@deffn Subcommand q
@end deffn

These three commands (or just an end of file) all cause @code{iid} to exit.

@node Shell Commands as Queries, Shell Escape, Off, Iid
@section Shell Commands as Queries

When the first word of an @code{iid} command line is not recognized as a
built-in @code{iid} command, @code{iid} assumes the command is a shell
command that writes a list of file names to @code{stdout}. This list
of file names is used to generate a new set of files.

Any set numbers that appear as arguments to this command are expanded
into lists of file names prior to running the command.
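
As a sketch, assuming @code{s1} is a set from an earlier query and
@code{FIXME} is a string of interest, the following command line is
handed to the shell because @code{grep} is not an @code{iid} command:

@example
grep -l FIXME s1
@end example

@code{iid} replaces @code{s1} with the file names in that set, and
because @code{grep -l} writes the names of the matching files to its
standard output, the result becomes a new set containing only the files
in @code{s1} that mention @code{FIXME}.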

@node Shell Escape,  , Shell Commands as Queries, Iid
@section Shell Escape

If a command starts with a bang (@code{!}) character, the remainder of
the line is run as a shell command. Any set numbers that appear as
arguments to this command are expanded into lists of file names prior to
running the command.
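
For example, assuming @code{s1} is an existing set, the following line
runs the standard @code{wc} utility to count the lines in each of its
files:

@example
!wc -l s1
@end example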

@node Other Tools, Command Index, Iid, Top
@chapter Other Tools

This chapter describes some support tools that work with the other ID
programs.

@menu
* GNU Emacs Interface::         Using gid.el
* Fid::                         List identifiers in a file.
* Idx::                         Extract identifiers from source file.
@end menu

@node GNU Emacs Interface, Fid, Other Tools, Other Tools
@section GNU Emacs Interface

The source distribution comes with a file named @file{gid.el}.  This is
a GNU emacs interface to the @code{gid} tool.  If you put the file where
emacs can find it (somewhere in your @code{EMACSLOADPATH}) and put
@code{(autoload 'gid "gid" nil t)} in your @file{.emacs} file, you will
be able to invoke the @code{gid} function using @kbd{M-x gid}.

This function prompts you with the word the cursor is on. If you want
to search for a different pattern, simply delete the line and type the
pattern of interest.

It runs @code{gid} in a @code{*compilation*} buffer, so the normal
@code{next-error} function can be used to visit all the places the
identifier is found (@pxref{Compilation,,,emacs,The GNU Emacs Manual}).

@node Fid, Idx, GNU Emacs Interface, Other Tools
@section Fid

@deffn Command fid [@code{-f@var{<file>}}] file1 [file2]
@table @code
@item -f@var{<file>}
Look in the named database.
@item @var{file1}
List the identifiers contained in file1 according to the database.
@item @var{file2}
If a second file is given, list only the identifiers both files have
in common.
@end table
@end deffn

The @code{fid} program provides an inverse query. Instead of listing
files containing some identifier, it lists the identifiers found in
a file.
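
The invocations below are illustrative only; @file{main.c},
@file{util.c}, and the database path @file{../ID} are hypothetical
names.

@example
fid main.c
fid main.c util.c
fid -f../ID main.c
@end example

The first lists the identifiers in @file{main.c}, the second lists only
those that also occur in @file{util.c}, and the third looks in the
named database.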

@node Idx,  , Fid, Other Tools
@section Idx

@deffn Command idx [@code{-s@var{<directory>}}] [@code{-r@var{<directory>}}] [@code{-S@var{<scanarg>}}] files@dots{}
The @code{-s}, @code{-r}, and @code{-S} arguments to @code{idx}
have the same meaning as the corresponding arguments to @code{mkid}
(@pxref{Mkid Command Line Options}).
@end deffn

The @code{idx} command is more of a test frame for scanners than a tool
designed to be independently useful. It takes the same scanner arguments
as @code{mkid}, but rather than building a database, it prints the
identifiers found to @code{stdout}, one per line. You can use it to try
out a scanner on a sample file to make sure it is extracting the
identifiers you believe it should extract.
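
For example, to check what is extracted from a hypothetical file
@file{sample.c} with no scanner options given:

@example
idx sample.c
idx sample.c | sort -u
@end example

The second form relies on the one-identifier-per-line output to produce
a sorted list without duplicates, using the standard @code{sort}
utility.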

@node Command Index,  , Other Tools, Top
@unnumbered Command Index

@printindex fn

@contents
@bye