Diffstat (limited to 'doc/id-utils.texi')
-rw-r--r-- doc/id-utils.texi 1378
1 files changed, 1378 insertions, 0 deletions
diff --git a/doc/id-utils.texi b/doc/id-utils.texi
new file mode 100644
index 0000000..9cc7dd4
--- /dev/null
+++ b/doc/id-utils.texi
@@ -0,0 +1,1378 @@
+\input texinfo
+@comment %**start of header
+@setfilename id-utils.info
+@settitle ID database utilities
+@comment %**end of header
+
+@include version.texi
+
+@c Define new indices for filenames, commands and options.
+@defcodeindex fl
+@defcodeindex cm
+@defcodeindex op
+
+@c Put everything in one index (arbitrarily chosen to be the concept index).
+@syncodeindex fl cp
+@syncodeindex fn cp
+@syncodeindex ky cp
+@syncodeindex op cp
+@syncodeindex pg cp
+@syncodeindex vr cp
+
+@ifinfo
+@format
+START-INFO-DIR-ENTRY
+* ID database: (id). Identifier database utilities.
+* aid: (id)aid invocation. Matching strings.
+* eid: (id)eid invocation. Invoking an editor on matches.
+* fid: (id)fid invocation. Listing a file's identifiers.
+* gid: (id)gid invocation. Listing all matching lines.
+* idx: (id)idx invocation. Testing mkid scanners.
+* lid: (id)lid invocation. Matching patterns.
+* mkid: (id)mkid invocation. Creating an ID database.
+* pid: (id)pid invocation. Looking up filenames.
+END-INFO-DIR-ENTRY
+@end format
+@end ifinfo
+
+@ifinfo
+This file documents the @code{mkid} identifier database utilities.
+
+Copyright (C) 1991, 1995 Tom Horsley.
+
+Permission is granted to make and distribute verbatim copies of
+this manual provided the copyright notice and this permission notice
+are preserved on all copies.
+
+@ignore
+Permission is granted to process this file through TeX and print the
+results, provided the printed document carries copying permission
+notice identical to this one except for the removal of this paragraph
+(this paragraph not being relevant to the printed manual).
+
+@end ignore
+Permission is granted to copy and distribute modified versions of this
+manual under the conditions for verbatim copying, provided that the entire
+resulting derived work is distributed under the terms of a permission
+notice identical to this one.
+
+Permission is granted to copy and distribute translations of this manual
+into another language, under the above conditions for modified versions,
+except that this permission notice may be stated in a translation.
+@end ifinfo
+
+@titlepage
+@title ID database utilities
+@subtitle Programs for simple, fast, high-capacity cross-referencing
+@subtitle for version @value{VERSION}
+@author Tom Horsley
+@author Greg McGary
+
+@page
+@vskip 0pt plus 1filll
+Copyright @copyright{} 1991, 1995 Tom Horsley.
+
+Permission is granted to make and distribute verbatim copies of
+this manual provided the copyright notice and this permission notice
+are preserved on all copies.
+
+Permission is granted to copy and distribute modified versions of this
+manual under the conditions for verbatim copying, provided that the entire
+resulting derived work is distributed under the terms of a permission
+notice identical to this one.
+
+Permission is granted to copy and distribute translations of this manual
+into another language, under the above conditions for modified versions,
+except that this permission notice may be stated in a translation.
+@end titlepage
+
+
+@ifinfo
+@node Top
+@top ID database utilities
+
+This manual documents version @value{VERSION} of the ID database
+utilities.
+
+@menu
+* Introduction:: Overview of the tools, and authors.
+* mkid invocation:: Creating an ID database.
+* Common query arguments:: Common lookup options and search patterns.
+* gid invocation:: Listing all matching lines.
+* Looking up identifiers:: lid, aid, eid, and fid.
+* pid invocation:: Looking up filenames.
+* Index:: General index.
+@end menu
+@end ifinfo
+
+
+@node Introduction
+@chapter Introduction
+
+@cindex overview
+@cindex introduction
+
+@cindex ID database, definition of
+An @dfn{ID database} is a binary file containing a list of filenames, a
+list of identifiers, and a matrix indicating which identifiers appear in
+which files. With this database and some tools to manipulate it
+(described in this manual), a host of tasks become simpler and faster.
+For example, you can list all files containing a particular
+@code{#include} throughout a huge source hierarchy, search for all the
+memos containing references to a project, or automatically invoke an
+editor on all files containing references to some function. Anyone with
+a large software project to maintain, or a large set of text files to
+organize, can benefit from an ID database.
+
+Although the ID utilities are most commonly used with identifiers,
+numeric constants are also stored in the database, and can be searched
+for in the same way (independent of radix, if desired).
+
+There are a number of programs in the ID family:
+
+@table @code
+
+@item mkid
+scans files for identifiers and numeric constants and builds the ID
+database file.
+
+@item gid
+lists all lines that match given patterns.
+
+@item lid
+lists the filenames containing identifiers that match given patterns.
+
+@item aid
+lists the filenames containing identifiers that contain given strings,
+independent of case.
+
+@item eid
+invokes an editor on each file containing identifiers that match given
+patterns.
+
+@item fid
+lists all identifiers recorded in the database for given files, or
+identifiers common to two files.
+
+@item pid
+matches the filenames in the database, rather than the identifiers.
+
+@item idx
+helps with testing of new @code{mkid} scanners.
+
+@end table
+
+@cindex bugs, reporting
+Please report bugs to @samp{gkm@@magilla.cichlid.com}. Remember to
+include the version number, machine architecture, input files, and any
+other information needed to reproduce the bug: your input, what you
+expected, what you got, and why it is wrong. Diffs are welcome, but
+please include a description of the problem as well, since this is
+sometimes difficult to infer. @xref{Bugs, , , gcc, GNU CC}.
+
+@menu
+* Past and future:: How the ID tools came about, and where they're going.
+@end menu
+
+
+@node Past and future
+@section Past and future
+
+@cindex history
+
+@pindex look @r{and @code{mkid} 1}
+@cindex McGary, Greg
+Greg McGary conceived of the ideas behind @code{mkid} when he began hacking the
+Unix kernel in 1984. He needed a navigation tool to help him find his
+way around the expansive, unfamiliar landscape. The first @code{mkid}-like
+tools were shell scripts, and produced an ASCII database that looks much
+like the output of @code{lid} with no arguments. It took over an hour
+on a VAX 11/750 to build a database for a 4.1BSD-ish kernel. Lookups
+were done with the system utility @code{look}, modified to handle very
+long lines.
+
+In 1986, Greg rewrote @code{mkid}, @code{lid}, @code{fid} and @code{idx}
+in C to improve performance. Database-build times were shortened by an
+order of magnitude. The @code{mkid} tools were first posted to
+@samp{comp.sources.unix} in September 1987.
+
+@cindex Horsley, Tom
+@cindex Scofield, Doug
+@cindex Leonard, Bill
+@cindex Berry, Karl
+Over the next few years, several versions diverged from the original
+source. Tom Horsley at Harris Computer Systems Division stepped forward
+to take over maintenance and integrated some of the fixes from divergent
+versions. A first release of
+@code{mkid} @w{version 2} was posted to @samp{alt.sources} near the end
+of 1990. At that time, Tom wrote this Texinfo manual with the
+encouragement of the net community. (Tom especially thanks Doug Scofield
+and Bill Leonard whom he dragooned into helping poorfraed and
+edit---they found several problems in the initial version.) Karl Berry
+revamped the manual for Texinfo style, indexing, and organization in
+1995.
+
+@pindex cscope
+@pindex grep
+@cindex future
+In January 1995, Greg McGary reemerged as the primary maintainer and
+launched development of @code{mkid} version 3, whose primary new feature
+is an efficient algorithm for building databases that is linear in both
+time and space over the size of the input text. (The old algorithm was
+quadratic in space and therefore choked on very large source trees.)
+The code is released under the GNU General Public License, and might become a
+part of the GNU system. @code{mkid} 3 is an interim release, since
+several significant enhancements are still in the works: an optional
+coupling with GNU @code{grep}, so that @code{grep} can use an ID
+database for hints; a @code{cscope} work-alike query interface;
+incremental update of the ID database; and an automatic file-tree walker
+so you need not explicitly supply every filename argument to the
+@code{mkid} program.
+
+
+@node mkid invocation
+@chapter @code{mkid}: Creating ID databases
+
+@pindex mkid
+@cindex creating databases
+@cindex databases, creating
+
+@pindex cron
+The @code{mkid} program builds an ID database. To do this, it must scan
+each file you tell it to include in the database. This takes some time,
+but once the work is done the query programs run very rapidly. (You can
+run @code{mkid} as a @code{cron} job to regularly update your
+databases.)
+
+The @code{mkid} program knows how to extract identifiers from various
+types of files. For example, it can recognize and skip over comments
+and string constants in a C program.
+
+@cindex numbers, in databases
+Identifiers are not the only thing included in the database. Numbers
+are also recognized and included in the database indexed by their binary
+value. This feature allows you to find uses of constants without regard
+to the radix used to specify them, since the same number can frequently
+be written in many different ways (for instance, @samp{47}, @samp{0x2f},
+@samp{057} in C).
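+
+As a quick way to convince yourself that these spellings name the same
+value, the shell's own @code{printf} can convert them. This only
+illustrates the arithmetic and does not involve @code{mkid}; it assumes
+a POSIX-style @code{printf} that reads its numeric arguments as C
+integer constants:
+
+@example
+printf '%d %d %d\n' 47 0x2f 057
+@end example
+
+@noindent
+All three arguments print as @samp{47}.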
+
+All the places in this document which mention identifiers should really
+mention both identifiers and numbers, but that gets fairly clumsy after
+a while, so you just need to keep in mind that numbers are included in
+the database as well as identifiers.
+
+@cindex ID file format
+@cindex architecture-independence
+@cindex sharing ID files
+The ID files that @code{mkid} creates are architecture- and
+byte-order-independent; you can share them at will across systems.
+
+@menu
+* mkid options:: Command-line options to mkid.
+* Scanners:: Built-in and defining your own.
+* mkid examples:: Examples of mkid usage.
+@end menu
+
+
+@node mkid options
+@section @code{mkid} options
+
+@cindex options for @code{mkid}
+@pindex mkid @r{options}
+
+By default, @code{mkid} scans the files you specify and writes the
+database to a file named @file{ID} in the current directory.
+
+@example
+mkid [-v] [-S@var{scanarg}] [-a@var{argfile}] [-] [-f@var{idfile}] @c
+@var{files}@dots{}
+@end example
+
+The program accepts the following options.
+
+@table @samp
+
+@item -v
+@opindex -v
+@cindex statistics
+Verbose. @code{mkid} tells you as it scans each file and indicates
+which scanner it is using. It also summarizes some statistics about the
+database at the end.
+
+@item -S@var{scanarg}
+@opindex -S@var{scanarg}
+Specify options regarding @code{mkid}'s scanners. @xref{Scanner option
+formats}.
+
+@item -a@var{argfile}
+@opindex -a@var{argfile}
+Read additional command line arguments from @var{argfile}. This is
+typically used to specify lists of filenames longer than will fit on a
+command line; some systems have severe limitations on the total length
+of a command line.
+
+@item -
+@opindex -
+Read additional command line arguments from standard input.
+
+@item -f@var{idfile}
+Write the database to the file @var{idfile}, instead of @file{ID}. The
+database stores filenames relative to the directory containing the
+database, so if you move the database to a different directory after
+creating it, you may have trouble finding files.
+
+@c @item -u
+@c @opindex -u
+@c The @code{-u} option updates an existing database by rescanning any
+@c files that have changed since the database was written. Unfortunately
+@c you cannot incrementally add new files to a database.
+@c Greg is reimplementing this ...
+
+@end table
+
+The remaining arguments @var{files} are the files to be scanned and
+included in the database. If no files are given at all (either on
+command line or via @samp{-a} or @samp{-}), @code{mkid} does nothing.
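+
+As a sketch of how @samp{-a} and @samp{-f} might be combined on a large
+source tree, the commands below first collect the filenames and then
+hand them to @code{mkid}. The names @file{mkid.args} and
+@file{ID.kernel} are arbitrary choices made for this illustration:
+
+@example
+find . -name '*.[ch]' -print > mkid.args
+mkid -v -amkid.args -fID.kernel
+@end example
+
+@noindent
+Running both commands from the directory that will hold the database
+keeps the stored relative filenames meaningful.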
+
+
+@node Scanners
+@section Scanners
+
+@cindex scanners
+
+To determine which identifiers to extract from a file and store in the
+database, @code{mkid} calls a @dfn{scanner}; we say a scanner
+@dfn{recognizes} a particular language. Scanners for several languages
+are built into @code{mkid}; you can add your own scanners as well, as
+explained in the sections below.
+
+@cindex suffixes of filenames
+@code{mkid} determines which scanner to use for a particular file by
+looking at the suffix of the filename. This @dfn{suffix} is everything
+after and including the last @samp{.} in a filename; for example, the
+suffix of @file{foo.c} is @file{.c}. @code{mkid} has a built-in list of
+bindings from some suffixes to corresponding scanners; for example,
+@file{.c} files are (not surprisingly) scanned by the predefined C
+language scanner.
+
+@findex .default @r{scanner}
+If @code{mkid} cannot determine what scanner to use for a particular
+file, either because the file has no suffix (e.g., @file{foo}) or
+because @code{mkid} has no binding for the file's suffix (e.g.,
+@file{foo.bar}), it uses the scanner bound to the @samp{.default}
+suffix. By default, this is the plain text scanner (@pxref{Plain text
+scanner}), but you can change this with the @samp{-S} option, as
+explained below.
+
+@menu
+* Scanner option formats:: Overview of the -S option.
+* Predefined scanners:: The C, plain text, and assembler scanners.
+* Defining new scanners:: Either in source code or at runtime with -S.
+* idx invocation:: Testing mkid scanners.
+@end menu
+
+
+@node Scanner option formats
+@subsection Scanner option formats
+
+@cindex scanner options
+@opindex -S @r{scanner option}
+
+With the @samp{-S} option, you can change which language scanner to use
+for which files, give language-specific options, and get some limited
+online help about scanner options.
+
+Here are the different forms of the @samp{-S} option:
+
+@table @samp
+
+@item -S.@var{suffix}=@var{scanner}
+@opindex -S.
+Use @var{scanner} for a file with the given @samp{.@var{suffix}}. For
+example, @samp{-S.yacc=c} tells @code{mkid} to use the @samp{c} language
+scanner for all files ending in @samp{.yacc}.
+
+@item -S.@var{suffix}=?
+Display which scanner is used for the given @samp{.@var{suffix}}.
+
+@item -S?=@var{scanner}
+@opindex -S?
+Display which suffixes @var{scanner} is used for.
+
+@item -S?=?
+Display the scanner binding for every known suffix.
+
+@item -S@var{scanner}+@var{arg}
+@itemx -S@var{scanner}-@var{arg}
+Each scanner accepts certain scanner-dependent arguments. These options
+all have one of these forms. @xref{Predefined scanners}.
+
+@item -S@var{scanner}?
+Display the scanner-specific options accepted by @var{scanner}.
+
+@item -S@var{new-scanner}/@var{old-scanner}/@var{filter-command}
+Define @var{new-scanner} in terms of @var{old-scanner} and
+@var{filter-command}. @xref{Defining scanners with options}.
+
+@end table
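+
+For instance, the query forms above might be typed as follows. The
+quotes keep the shell from treating @samp{?} as a filename wildcard.
+No output is shown, since it depends on how your copy of @code{mkid}
+was built:
+
+@example
+mkid '-S.y=?'
+mkid '-S?=c'
+mkid '-Sc?'
+@end example
+
+@noindent
+The first asks which scanner handles @file{.y} files, the second which
+suffixes are bound to the @samp{c} scanner, and the third which options
+the @samp{c} scanner accepts.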
+
+
+@node Predefined scanners
+@subsection Predefined scanners
+
+@cindex predefined scanners
+@cindex scanners, predefined
+
+@code{mkid} has built-in scanners for several types of languages; you
+can get the list by running @code{mkid -S?=?}.
+The supported languages are documented
+below@footnote{This is not strictly true: @samp{vhil} is a supported
+language, but it is an obsolete and arcane dialect of C and should be
+ignored.}.
+
+@menu
+* C scanner:: For the C programming language.
+* Plain text scanner:: For documents or other non-source code.
+* Assembler scanner:: For assembly language.
+@end menu
+
+
+@node C scanner
+@subsubsection C scanner
+
+@cindex C scanner, predefined
+@flindex .[chly] @r{files, scanning}
+
+The C scanner is the most commonly used. Files with the usual @file{.c}
+and @file{.h} suffixes, and the @file{.y} (yacc) and @file{.l} (lex)
+suffixes, are processed with this scanner (by default).
+
+Scanner-specific options:
+
+@table @samp
+
+@item -Sc-s@var{character}
+@kindex $ @r{in identifiers}
+@opindex -Sc-s
+Allow the specified @var{character} in identifiers. For example, if you
+use @samp{$} in identifiers, you'll want to use @samp{-Sc-s$}.
+
+@item -Sc+u
+@opindex -Sc+u
+Strip leading underscores from identifiers. You might want to do this in
+peculiar circumstances, such as trying to parse the output from
+@code{nm} or some other system utility.
+
+@item -Sc-u
+@opindex -Sc-u
+Don't strip leading underscores from identifiers; this is the default.
+
+@end table
+
+
+@node Plain text scanner
+@subsubsection Plain text scanner
+
+@cindex plain text scanner
+
+The plain text scanner is intended for scanning most non-source-code
+files. This is typically the scanner used when adding custom scanners
+via @samp{-S} (@pxref{Defining scanners with options}).
+
+@c @code{mkid} predefines a troff scanner in terms of the plain text
+@c scanner and
+@c the @code{deroff} utility.
+@c A compressed man page
+@c scanner runs @code{pcat} piped into @code{col -b}, and a @TeX{} scanner
+@c runs @code{detex}.
+
+Scanner-specific options:
+
+@table @samp
+
+@item -Stext+a@var{character}
+@opindex -Stext+a
+Include @var{character} in identifiers. By default, letters (a--z and
+A--Z) and underscore are included.
+
+@item -Stext-a@var{character}
+@opindex -Stext-a
+Exclude @var{character} from identifiers.
+
+@item -Stext+s@var{character}
+@opindex -Stext+s
+@cindex squeezing characters from identifiers
+Squeeze @var{character} from identifiers, i.e., do not terminate an
+identifier when @var{character} is seen. By default, the characters
+@samp{'}, @samp{-}, and @samp{.} are squeezed out of identifiers. For
+example, the input @samp{fred's} leads to the identifier @samp{freds}.
+
+@item -Stext-s@var{character}
+Do not squeeze @var{character}.
+
+@end table
+
+
+@node Assembler scanner
+@subsubsection Assembler scanner
+
+@cindex assembler scanner
+
+Since assembly languages come in several flavors, this scanner has a
+number of options:
+
+@table @samp
+
+@item -Sasm-c@var{character}
+@opindex -Sasm-c
+@cindex comments in assembler
+Define @var{character} as starting a comment that extends to the end of
+the input line; no default. In many assemblers this is @samp{;} or
+@samp{#}.
+
+@item -Sasm+u
+@itemx -Sasm-u
+@opindex -Sasm+u
+Strip (@samp{+u}) or do not strip (@samp{-u}) leading underscores from
+identifiers. The default is to strip them.
+
+@item -Sasm+a@var{character}
+@opindex -Sasm+a
+Allow @var{character} in identifiers.
+
+@item -Sasm-a@var{character}
+Allow @var{character} in identifiers, but ignore any identifier that
+contains @var{character}. This is useful to ignore temporary labels,
+which can be generated in great profusion; these often contain @samp{.}
+or @samp{@@}.
+
+@item -Sasm+p
+@itemx -Sasm-p
+@opindex -Sasm+p
+Recognize (@samp{+p}) or do not recognize (@samp{-p}) C preprocessor
+directives in assembler source. The default is to recognize them.
+
+@item -Sasm+C
+@itemx -Sasm-C
+@opindex -Sasm+C
+Skip over (@samp{+C}) or do not skip over (@samp{-C}) C style comments
+in assembler source. The default is to skip them.
+
+@end table
+
+
+@node Defining new scanners
+@subsection Defining new scanners
+
+@cindex scanners, adding new
+
+You can add new scanners to @code{mkid} in two ways: modify the source
+code and recompile, or at runtime via the @samp{-S} option. Each has
+its advantages and disadvantages, as explained below.
+
+If you create a new scanner that would be of use to others, please
+consider sending it back to the maintainer,
+@samp{gkm@@magilla.cichlid.com}, for inclusion in future releases of
+@code{mkid}.
+
+@menu
+* Defining scanners in source code::
+* Defining scanners with options::
+@end menu
+
+
+@node Defining scanners in source code
+@subsubsection Defining scanners in source code
+
+@flindex scanners.c
+@cindex scanners, defining in source code
+
+@vindex languages_0
+@vindex suffixes_0
+To add a new scanner in source code, you should add a new section to the
+file @file{scanners.c}. Copy one of the existing scanners (most likely
+either C or plain text), and modify as necessary. Also add the new
+scanner to the @code{languages_0} and @code{suffixes_0} tables near the
+beginning of the file.
+
+This is not a terribly difficult programming task, but it requires
+recompiling and installing the new version of @code{mkid}, which may be
+inconvenient.
+
+This method leads to scanners which operate much more quickly than ones
+that depend on external programs. It is also likely the easiest way
+to define scanners for new programming languages.
+
+
+@node Defining scanners with options
+@subsubsection Defining scanners with options
+
+@cindex scanners, defining with options
+
+You can use the @samp{-S} option on the command line to define a new
+language scanner:
+
+@example
+-S@var{new-scanner}/@var{existing-scanner}/@var{filter}
+@end example
+
+@noindent
+Here, @var{new-scanner} is the name of the new scanner being defined,
+@var{existing-scanner} is the name of an existing scanner, and
+@var{filter} is a shell command or pipeline.
+
+The new scanner works by passing the input file to @var{filter}, and
+then arranging for the result to be passed through
+@var{existing-scanner}. Typically, @var{existing-scanner} is @samp{text}.
+
+Somewhere within @var{filter}, the string @samp{%s} should occur. This
+@samp{%s} is replaced by the name of the source file being scanned.
+
+@cindex Texinfo, scanning example of
+For example, @code{mkid} has no built-in scanner for Texinfo files (like
+this one). In indexing a Texinfo file, you most likely would want
+to ignore the Texinfo @@-commands. Here's one way to specify a new
+scanner to do this:
+
+@example
+-Stexinfo/text/sed s,@@[a-z]*,,g %s
+@end example
+
+This defines a new language scanner (@samp{texinfo}) defined in terms of
+a @code{sed} command to strip out Texinfo directives (an @samp{@@}
+character followed by letters). Once the directives are stripped, the
+remaining text is run through the plain text scanner.
+
+This is a minimal example; to do a complete job, you would need to
+completely delete some lines, such as those beginning with @code{@@end}
+or @code{@@node}.
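+
+A somewhat fuller filter, offered only as a sketch, drops such lines
+with @code{grep} before stripping the remaining @@-commands. It
+assumes that the filter is handed to a shell (as the pipeline form
+implies), so that the inner double quotes are honored. It deliberately
+avoids @samp{/} inside the filter, since @samp{/} also separates the
+parts of the @samp{-S} option. The file name @file{manual.texi} is a
+placeholder. Using @code{idx} instead of @code{mkid} lets you inspect
+the extracted identifiers before building a database:
+
+@example
+idx '-Stexinfo/text/grep -v -e "^@@end" -e "^@@node" %s | sed "s,@@[a-z]*,,g"' \
+    -S.texi=texinfo manual.texi
+@end example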
+
+
+@node idx invocation
+@subsection @code{idx}: Testing @code{mkid} scanners
+
+@code{idx} prints the identifiers found in the files you specify to
+standard output. This is useful in debugging new @code{mkid} scanners
+(@pxref{Scanners}). Synopsis:
+
+@example
+idx [-S@var{scanarg}] @var{files}@dots{}
+@end example
+
+@code{idx} accepts the same @samp{-S} options as @code{mkid}.
+@xref{Scanner option formats}.
+
+The name ``idx'' stands for ``ID eXtract''. The name may change in
+future releases, since this is such an infrequently used program.
+
+
+@node mkid examples
+@section @code{mkid} examples
+
+@cindex examples of @code{mkid}
+
+The simplest example of @code{mkid} is something like:
+
+@example
+mkid *.[chy]
+@end example
+
+This will build an ID database indexing identifiers and numbers in
+all the @file{.c}, @file{.h}, and @file{.y} files in the current
+directory. Because @code{mkid} already knows how to scan files with
+those suffixes, no additional options are needed.
+
+@cindex man pages, compressed
+@cindex compressed files, building ID from
+Here's a more complex example. Suppose you want to build a database
+indexing the contents of all the @code{man} pages, and further suppose
+that your system is using @code{gzip} (@pxref{Top, , , gzip, Gzip}) to
+store compressed @code{cat} versions of the @code{man} pages in the
+directory @file{/usr/catman}. The @code{gzip} program creates files
+with a @file{.gz} suffix, so you must tell @code{mkid} how to scan
+@file{.gz} files. Here are the commands to do the job:
+
+@example
+cd /usr/catman
+find . -name \*.gz -print | mkid '-Sman/text/gzip -dc <%s' -S.gz=man -
+@end example
+
+@noindent Explanation:
+
+@enumerate
+
+@item
+We first @code{cd} to @file{/usr/catman} so the ID database
+will store the correct relative filenames.
+
+@item
+The @code{find} command prints the names of all @file{.gz} files under
+the current directory. @xref{find invocation, , , sh-utils, GNU shell
+utilities}.
+
+@item
+This list is piped to @code{mkid}; the @code{-} option (at the end of
+the line) tells @code{mkid} to read arguments (in this case, as is
+typical, the list of filenames) from standard input. @xref{mkid options}.
+
+@item
+The @samp{-Sman/text/gzip @dots{}} defines a new language @samp{man} in
+terms of the @code{gzip} program and @code{mkid}'s existing text
+scanner. @xref{Defining scanners with options}.
+
+@item
+The @samp{-S.gz=man} tells @code{mkid} to treat all @file{.gz} files as
+this new language @code{man}. @xref{Scanner option formats}.
+
+@end enumerate
+
+As a further complication, @code{cat} pages typically contain
+underlining and backspace sequences, which will confuse @code{mkid}. To
+handle this, the @code{gzip} command becomes a pipeline, like this:
+
+@example
+mkid '-Sman/text/gzip -dc <%s | col -b' -S.gz=man -
+@end example
+
+
+@node Common query arguments
+@chapter Common query arguments
+
+@cindex common query arguments
+
+Certain options, and regular expression syntax, are shared by the ID
+query tools. We describe them once in the sections below, instead
+of repeating the description for each tool.
+
+@menu
+* Query options:: -f -r -c -ew -kg -n -doxa -m -F -u.
+* Patterns:: Regular expression syntax for searches.
+* Examples: Query examples. Some common uses.
+@end menu
+
+
+@node Query options
+@section Query options
+
+@cindex query options, common
+@cindex common query options
+
+The ID query tools (@emph{not} @code{mkid}) share certain command line
+options. Not all of these options are recognized by all programs, but
+if an option is used by more than one program, it is described below.
+The description of each program gives the options that program uses.
+
+@table @samp
+
+@item -f@var{idfile}
+@opindex -f@var{idfile}
+@cindex database name, specifying
+@cindex parent directories, searched for ID
+Read the database from @var{idfile}, in the current directory or in any
+directory above the current directory. The default database name is
+@file{ID}. Searching parent directories lets you have a single ID
+database at the root of a large source tree and then use the query tools
+from anywhere within that tree.
+
+@item -r@var{directory}
+@opindex -r@var{directory}
+Find files relative to @var{directory}, instead of the directory in
+which the ID database was found. This is useful if the ID database was
+moved after its creation.
+
+@item -c
+@opindex -c
+Equivalent to @code{-r`pwd`}, i.e., find files relative to the current
+directory, instead of the directory in which the ID database was found.
+
+@item -e
+@itemx -w
+@opindex -e
+@opindex -w
+@cindex regular expressions, forcing evaluation as
+@cindex strings, forcing evaluation as
+@cindex constant strings, forcing evaluation as
+@samp{-e} forces pattern arguments to be treated as regular expressions,
+and @samp{-w} forces pattern arguments to be treated as constant
+strings. By default, the query tools guess whether each pattern is a
+regular expression or a constant string by looking for special characters.
+@xref{Patterns}.
+
+@item -k
+@itemx -g
+@opindex -k
+@opindex -g
+@cindex brace notation in filename lists
+@cindex shell brace notation in filename lists
+@samp{-k} suppresses use of shell brace notation in the output. By
+default, the query tools that generate lists of filenames attempt to
+compress the lists using the usual shell brace notation, e.g.,
+@file{@{foo,bar@}.c} to mean @file{foo.c} and @file{bar.c}. (Suppressing
+it is useful if you use @code{ksh} or the original (not GNU) @code{sh}
+and want to feed the list of names to another command, since those
+shells do not support brace notation; the name of the @samp{-k} option
+comes from the @code{k} in @code{ksh}.)
+
+@samp{-g} turns on use of brace notation; this is only needed if the
+query tools were compiled with @samp{-k} as the default behavior.
+
+@item -n
+@opindex -n
+@cindex suppressing matching identifier
+Suppress the matching identifier before each list of filenames that the
+query tools output by default. This is useful if you want a list of just
+the names to feed to another command.
+
+@item -d
+@itemx -o
+@itemx -x
+@itemx -a
+@opindex -d
+@opindex -o
+@opindex -x
+@opindex -a
+@cindex radix of numeric matches, specifying
+@cindex numeric matches, specifying radix of
+These options may be used in any combination to specify the radix of
+numeric matches. @samp{-d} allows matching on decimal numbers,
+@samp{-o} on octal numbers, and @samp{-x} on hexadecimal numbers. The
+@samp{-a} option is equivalent to specifying all three; this is the
+default.
+
+@item -m
+@opindex -m
+@cindex multiple lines, merging
+Merge multiple lines of output into a single line. If your query
+matches more than one identifier, the default is to generate a separate
+line of output for each matching identifier.
+
+@item -F-
+@itemx -F@var{n}
+@itemx -F-@var{m}
+@itemx -F@var{n}-@var{m}
+@opindex -F
+@cindex single matches, showing
+Show identifiers matching at least @var{n} and at most @var{m} times.
+@samp{-F-} is equivalent to @samp{-F1}, i.e., find identifiers that
+appear only once in the database. (This is useful to locate identifiers
+that are defined but never used, or used once and never defined.)
+
+@item -u@var{number}
+@opindex -u
+@cindex conflicting identifiers, finding
+List identifiers that conflict in the first @var{number} characters.
+This could be useful in porting programs to brain-dead computers that
+refuse to support long identifiers, but your best long term option is to
+set such computers on fire.
+
+@end table
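+
+As an illustration of combining these options, the following sketch
+lists just the filenames, without brace notation and without the
+leading identifier, so that they can be handed to another command. It
+assumes that @code{lid} accepts both @samp{-k} and @samp{-n}. The
+identifier @samp{bdevsw} is only a placeholder, and @code{wc} stands in
+for whatever command you like:
+
+@example
+wc -l `lid -k -n bdevsw`
+@end example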
+
+
+@node Patterns
+@section Patterns
+
+@cindex patterns
+@cindex regular expression syntax
+
+@dfn{Patterns}, also called @dfn{regular expressions}, allow you to
+match many different identifiers in a single query.
+
+The same regular expression syntax is recognized by all the query tools
+that handle regular expressions. The exact syntax depends on how the ID
+tools were compiled, but the following constructs should always be
+supported:
+
+@table @samp
+
+@item .
+Match any single character.
+
+@item [@var{chars}]
+Match any of the characters specified within the brackets. You can
+match any characters @emph{except} the ones in brackets by typing
+@samp{^} as the first character. A range of characters can be specified
+using @samp{-}. For example, @samp{[abc]} and @samp{[a-c]} both match
+@samp{a}, @samp{b}, or @samp{c}, and @samp{[^abc]} matches anything
+@emph{except} @samp{a}, @samp{b}, or @samp{c}.
+
+@item *
+Match the previous construct zero or more times.
+
+@item ^
+@itemx $
+@samp{^} (@samp{$}) at the beginning (end) of a pattern anchors the
+match to the first (last) character of the identifier.
+
+@end table
+
+The query programs use either the @code{regex}/@code{regcmp} or
+@code{re_comp}/@code{re_exec} functions, depending on which are
+available in the library on your system. These do not always support
+the exact same regular expression syntax, so consult your local
+@code{man} pages to find out.
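+
+A few queries that use only the constructs listed above, as a sketch.
+The identifiers are invented, and the quotes protect the special
+characters from the shell. The first matches identifiers beginning
+with @samp{str}, the second identifiers ending in @samp{buf}, and the
+third identifiers beginning with @samp{get} or @samp{set}:
+
+@example
+lid '^str'
+lid 'buf$'
+lid '^[gs]et.*'
+@end example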
+
+
+@node Query examples
+@section Query examples
+
+@cindex examples, queries
+@cindex query examples
+Here are some examples of the options described in the previous
+sections.
+
+To restrict searches to exact matches, use @samp{^@dots{}$}. For example:
+
+@example
+prompt$ gid '^FILE$'
+ansi2knr.c:144: @{ FILE *in, *out;
+ansi2knr.c:315: FILE *out;
+fid.c:38: FILE *id_FILE;
+filenames.c:576: FILE *
+@dots{}
+@end example
+
+To show identifiers not unique in the first 16 characters:
+
+@example
+prompt$ lid -u16
+RE_CONTEXT_INDEP_ANCHORS regex.c
+RE_CONTEXT_INDEP_OPS regex.c
+RE_SYNTAX_POSIX_BASIC regex.c
+RE_SYNTAX_POSIX_EXTENDED regex.c
+@dots{}
+@end example
+
+@cindex numeric searches
+Numbers are searched for numerically rather than textually. For example:
+
+@example
+prompt$ lid 0xff
+0377 @{lid,regex@}.c
+0xff @{bitops,fid,lid,mkid@}.c
+255 regex.c
+@end example
+
+On the other hand, you can restrict a numeric search to a particular
+radix if you want:
+
+@example
+prompt$ lid -x 0xff
+0xff @{bitops,fid,lid,mkid@}.c
+@end example
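+
+The spellings @samp{0377}, @samp{0xff}, and @samp{255} found by the
+unrestricted search all denote the same value. As a quick check outside
+the ID tools, the shell's @code{printf} utility (assumed here to accept
+C-style octal and hexadecimal constants, as POSIX specifies) converts
+each of them to decimal:
+
+@example
+# Each argument is printed in decimal on its own line: 255 three times.
+printf '%d\n' 0377 0xff 255
+@end example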
+
+Filenames in the output are always adjusted to be correct for the
+current working directory. For example:
+
+@example
+prompt$ lid bdevsw
+bdevsw sys/conf.h cf/conf.c io/bio.c os/@{fio,main,prf,sys3@}.c
+prompt$ cd io
+prompt$ lid bdevsw
+bdevsw ../sys/conf.h ../cf/conf.c bio.c ../os/@{fio,main,prf,sys3@}.c
+@end example
+
+
+@node gid invocation
+@chapter @code{gid}: Listing matching lines
+
+Synopsis:
+
+@example
+gid [-f@var{file}] [-u@var{n}] [-r@var{dir}] [-doxasc] [@var{pattern}@dots{}]
+@end example
+
+@code{gid} finds the identifiers in the database that match the
+specified @var{pattern}s, then searches for all occurrences of those
+identifiers, in only the files containing matches. In a large source
+tree, this saves an enormous amount of time (compared to searching every
+source file).
+
+With no @var{pattern} arguments, @code{gid} prints every line of every
+source file.
+
+The name ``gid'' stands for ``grep for identifiers'', @code{grep} being
+the standard utility to search regular files.
+
+@xref{Common query arguments}, for a description of the command-line
+options and @var{pattern} arguments.
+
+@code{gid} uses the standard GNU output format for identifying source lines:
+
+@example
+@var{filename}:@var{linenum}: @var{text}
+@end example
+
+Here is an example:
+
+@example
+prompt$ gid FILE
+ansi2knr.c:144: @{ FILE *in, *out;
+ansi2knr.c:315: FILE *out;
+fid.c:38: FILE *id_FILE;
+@dots{}
+@end example
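+
+Because each line starts with the file name and line number separated by
+colons, the output is easy to post-process with standard tools. The
+sketch below is only an illustration: it feeds sample lines adapted from
+the example above through @code{cut} and @code{sort} to list each
+matching file once. With a real database you would pipe the output of
+@code{gid} instead, and file names containing colons would need more
+care.
+
+@example
+# Sample gid-style output; prints ansi2knr.c and fid.c, once each.
+printf '%s\n' \
+  'ansi2knr.c:144: FILE *in, *out;' \
+  'ansi2knr.c:315: FILE *out;' \
+  'fid.c:38: FILE *id_FILE;' |
+  cut -d: -f1 | sort -u
+@end example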
+
+@menu
+* GNU Emacs gid interface:: Using next-error with gid.
+@end menu
+
+
+@node GNU Emacs gid interface
+@section GNU Emacs @code{gid} interface
+
+@cindex Emacs interface to @code{gid}
+@flindex gid.el @r{interface to Emacs}
+
+@vindex load-path
+The @code{mkid} source distribution comes with a file @file{gid.el},
+which defines a GNU Emacs interface to @code{gid}. To install it, put
+@file{gid.el} somewhere that Emacs will find it (i.e., in your
+@code{load-path}) and put
+
+@example
+(autoload 'gid "gid" nil t)
+@end example
+
+@noindent in one of Emacs' initialization files, e.g., @file{~/.emacs}.
+You will then be able to use @kbd{M-x gid} to run the command.
+
+@findex gid @r{Emacs function}
+The @code{gid} function prompts you with the word around point. If you
+want to search for something else, simply delete the line and type the
+pattern of interest.
+
+@flindex *compilation* @r{Emacs buffer}
+The function then runs the @code{gid} program in a @samp{*compilation*}
+buffer, so the normal @code{next-error} function can be used to visit
+all the places the identifier is found (@pxref{Compilation,,, emacs, The
+GNU Emacs Manual}).
+
+
+@node Looking up identifiers
+@chapter Looking up identifiers
+
+These commands look up identifiers in the ID database and operate on the
+files containing matches.
+
+@menu
+* lid invocation:: Matching patterns.
+* aid invocation:: Matching strings.
+* eid invocation:: Invoking an editor on matches.
+* fid invocation:: Listing a file's identifiers.
+@end menu
+
+
+@node lid invocation
+@section @code{lid}: Matching patterns
+
+@pindex lid
+
+Synopsis:
+
+@example
+lid [-f@var{file}] [-u@var{n}] [-r@var{dir}] [-mewdoxaskgnc] @c
+@var{pattern}@dots{}
+@end example
+
+@code{lid} searches the database for identifiers matching the given
+@var{pattern} arguments and prints the names of the files that match
+each @var{pattern}. With no @var{pattern}s, @code{lid} lists every
+entry in the database.
+
+The name ``lid'' stands for ``lookup identifier''.
+
+@xref{Common query arguments}, for a description of the command-line
+options and @var{pattern} arguments.
+
+By default, each line of output consists of an identifier and all the
+files containing that identifier.
+
+Here is an example showing a search for a single identifier (omitting
+some output to keep lines short):
+
+@example
+prompt$ lid FILE
+FILE extern.h @{fid,gets0,getsFF,idx,init,lid,mkid,@dots{}@}.c
+@end example
+
+This example shows a regular expression search:
+
+@example
+prompt$ lid 'FILE$'
+AF_FILE mkid.c
+AF_IDFILE mkid.c
+FILE extern.h @{fid,gets0,getsFF,idx,init,lid,mkid,@dots{}@}.c
+IDFILE id.h @{fid,lid,mkid@}.c
+IdFILE @{fid,lid@}.c
+@dots{}
+@end example
+
+@noindent As you can see, when a regular expression is used, it is
+possible to get more than one line of output. To merge multiple lines
+into one, use @samp{-m}:
+
+@example
+prompt$ lid -m ^get
+^get extern.h @{bitsvec,fid,gets0,getsFF,getscan,idx,lid,@dots{}@}.c
+@end example
+
+
+@node aid invocation
+@section @code{aid}: Matching strings
+
+@pindex aid
+
+Synopsis:
+
+@example
+aid [-f@var{file}] [-u@var{n}] [-r@var{dir}] [-mewdoxaskgnc] @c
+@var{string}@dots{}
+@end example
+
+@cindex case-insensitive searching
+@cindex string searching
+@code{aid} searches the database for identifiers containing the given
+@var{string} arguments. The search is case-insensitive.
+
+@flindex whatis
+The name ``aid'' stands for ``apropos identifier'', @code{apropos}
+being a command that does a similar search of the @code{whatis} database
+of @code{man} descriptions.
+
+For example, @samp{aid get} matches the identifiers @code{fgets},
+@code{GETLINE}, and @code{getchar}.
+
+The default output format is the same as @code{lid}; see the previous
+section.
+
+@xref{Common query arguments}, for a description of the command-line
+options and @var{pattern} arguments.
+
+
+@node eid invocation
+@section @code{eid}: Invoking an editor on matches
+
+@pindex eid
+
+Synopsis:
+
+@example
+eid [-f@var{file}] [-u@var{n}] [-r@var{dir}] [-doxasc] [@var{pattern}]@dots{}
+@end example
+
+@code{eid} runs the usual search (@pxref{lid invocation}) on the given
+arguments, shows you the output, and then asks:
+
+@example
+Edit? [y1-9^S/nq]
+@end example
+
+@noindent
+You can respond with:
+
+@table @samp
+@item y
+Edit all files listed.
+
+@item 1@dots{}9
+Skip the first @var{n} files, where @var{n} is the digit typed, and edit
+the rest (that is, start at file @math{@var{n} + 1}).
+
+@item /@var{string} @r{or} @kbd{CTRL-S}@var{string}
+Edit all files whose name contains @var{string}.
+
+@item n
+Go on to the next @var{pattern}, i.e., edit no files for this one.
+
+@item q
+Quit @code{eid}.
+
+@end table
+
+@code{eid} invokes an editor once per @var{pattern}; all the specified
+files are given to the editor for you to edit simultaneously.
+
+@code{eid} invokes the editor defined by the @samp{EDITOR} environment
+variable. If the editor can accept an initial search argument on the
+command line, @code{eid} moves automatically to the location of the
+match, via the environment variables below.
+
+@xref{Common query arguments}, for a description of the command-line
+options and @var{pattern} arguments.
+
+Here are the environment variables relevant to @code{eid}:
+
+@table @samp
+
+@item EDITOR
+@vindex EDITOR
+The name of the editor program to invoke.
+
+@item EIDARG
+@vindex EIDARG
+@cindex search for identifier, initial
+The argument to pass to the editor to search for the matching
+identifier. For @code{vi}, this should be @samp{+/%s/}.
+
+@item EIDLDEL
+@vindex EIDLDEL
+@cindex left delimiter editor argument
+@cindex beginning-of-word editor argument
+A regular expression to force a match at the beginning of a word (``left
+delimiter''). @code{eid} inserts this in front of the matching identifier
+when composing the search argument. For @code{vi}, this should be
+@samp{\<}.
+
+@item EIDRDEL
+@vindex EIDRDEL
+@cindex right delimiter editor argument
+@cindex end-of-word editor argument
+The end-of-word regular expression. For @code{vi}, this should be
+@samp{\>}.
+
+@end table
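+
+As a sketch of how these variables fit together for @code{vi}, the
+following Bourne shell fragment sets them and then uses @code{printf} to
+mimic the substitution described above. The @code{printf} line is only
+an approximation for illustration, not the code that @code{eid} runs,
+and it assumes the search form @samp{+/%s/} for @samp{EIDARG}.
+
+@example
+EDITOR=vi
+EIDARG='+/%s/'
+EIDLDEL='\<'
+EIDRDEL='\>'
+export EDITOR EIDARG EIDLDEL EIDRDEL
+
+# Approximate the editor argument built for the identifier FILE.
+printf "$EIDARG\n" "$EIDLDEL"FILE"$EIDRDEL"
+@end example
+
+@noindent With these settings the last command prints
+@samp{+/\<FILE\>/}.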
+
+For Emacs users, the interface in @file{gid.el} is probably preferable
+to @code{eid}. @xref{GNU Emacs gid interface}.
+
+
+Here is an example:
+
+@example
+prompt$ eid FILE \^print
+FILE @{ansi2knr,fid,filenames,idfile,idx,lid,misc,@dots{}@}.c
+Edit? [y1-9^S/nq] n
+^print @{ansi2knr,fid,getopt,getopt1,lid,mkid,regex,scanners@}.c
+Edit? [y1-9^S/nq] 2
+@end example
+
+@noindent This will start editing at @file{getopt.c}.
+
+
+@node fid invocation
+@section @code{fid}: Listing a file's identifiers
+
+@pindex fid
+@cindex identifiers in a file
+
+@code{fid} lists the identifiers found in a given file. Synopsis:
+
+@example
+fid [-f@var{dbfile}] @var{file1} [@var{file2}]
+@end example
+
+@table @samp
+
+@item -f@var{dbfile}
+Read the database from @var{dbfile} instead of @file{ID}.
+
+@item @var{file1}
+List all the identifiers contained in @var{file1}.
+
+@item @var{file2}
+With a second file argument, list only the identifiers both files have
+in common.
+
+@end table
+
+The output is simply one identifier (or number) per line.
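+
+The two-file form amounts to a set intersection. The same idea can be
+sketched with the standard @code{comm} utility on two sorted lists; the
+identifiers and the temporary files here are made up for illustration
+and say nothing about how @code{fid} itself reads the database.
+
+@example
+dir=$(mktemp -d)
+printf '%s\n' FILE getopt main   | sort > "$dir/a.ids"
+printf '%s\n' FILE main scanners | sort > "$dir/b.ids"
+
+# Lines common to both lists: prints FILE and main.
+comm -12 "$dir/a.ids" "$dir/b.ids"
+rm -r "$dir"
+@end example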
+
+
+@node pid invocation
+@chapter @code{pid}: Looking up filenames
+
+@pindex pid
+@cindex filenames, matching
+@cindex matching filenames
+
+@code{pid} matches the filenames stored in the ID database, rather than
+the identifiers. Synopsis:
+
+@example
+pid [-f@var{dbfile}] [-r@var{dir}] [-ebkgnc] @var{wildcard}@dots{}
+@end example
+
+By default, the @var{wildcard} patterns are treated as shell globbing
+patterns, rather than the regular expressions the other utilities
+accept. @xref{Wildcard patterns}, for details.
+
+Besides the standard options given in the synopsis (@pxref{Query
+options}), @code{pid} accepts the following:
+
+@table @samp
+
+@item -e
+@opindex -e
+Do the usual regular expression matching (@pxref{Patterns}), instead
+of shell wildcard matching.
+
+@item -b
+@opindex -b
+@cindex basename match
+Match the basenames of the files in the database. For example,
+@samp{pid -b foo} will match the stored filename @file{dir/foo}, but not
+@file{foo/file}.
+
+@end table
+
+For example, the command:
+
+@example
+pid \*.c
+@end example
+
+@noindent lists all the @file{.c} files in the database. (The @samp{\}
+here protects the @samp{*} from being expanded by the shell.)
+
+@menu
+* Wildcard patterns:: Shell-style globbing patterns.
+@end menu
+
+
+@node Wildcard patterns
+@section Wildcard patterns
+
+@cindex globbing patterns
+@cindex shell wildcard patterns
+@cindex wildcard patterns
+
+@code{pid} does simplified shell wildcard matching (unless the @samp{-e}
+option is specified), rather than the regular expression matching done
+by the other utilities. Here is a description of wildcard matching,
+also called @dfn{globbing}:
+
+@itemize @bullet
+
+@item
+@kindex * @r{in globbing}
+@samp{*} matches zero or more characters.
+
+@item
+@kindex ? @r{in globbing}
+@samp{?} matches any single character.
+
+@item
+@kindex \ @r{in globbing}
+@samp{\} forces the next character to be taken literally.
+
+@item
+@kindex [@dots{}] @r{in globbing}
+@samp{[@var{chars}]} matches any single character listed in @var{chars}.
+
+@item
+@kindex [!@dots{}] @r{in globbing}
+@samp{[!@var{chars}]} matches any single character @emph{not} listed in
+@var{chars}.
+
+@end itemize
+
+Most shells treat @samp{/} and leading @samp{.} characters
+specially. @code{pid} does not do this. It simply matches the filename
+in the database against the wildcard pattern.
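+
+The @code{case} statement of the Bourne shell also matches a string
+against a pattern without treating @samp{/} or a leading @samp{.}
+specially, so it can serve as a rough stand-in for experimenting with
+the rules above. This is only an analogy: the file names are invented,
+and @code{pid} may differ from the shell in details.
+
+@example
+# The * crosses directory separators here: prints match.
+case os/main.c in *.c) echo match ;; *) echo no match ;; esac
+
+# The ? stands for exactly one character: prints match.
+case io/bio.c in io/b?o.c) echo match ;; *) echo no match ;; esac
+
+# [!h] excludes the listed character: prints no match.
+case sys/conf.h in *.[!h]) echo match ;; *) echo no match ;; esac
+@end example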
+
+
+@node Index
+@unnumbered Index
+
+@printindex cp
+
+@contents
+@bye