\input texinfo
@comment %**start of header (This is for running Texinfo on a region.)
@setfilename mkid.info
@settitle The ID Database
@setchapternewpage odd
@comment %**end of header (This is for running Texinfo on a region.)
@include version.texi
@ifinfo
@format
START-INFO-DIR-ENTRY
* mkid: (mkid). Identifier database utilities
END-INFO-DIR-ENTRY
@end format
@end ifinfo
@ifinfo
This file documents the @code{mkid} identifier database utilities.
Copyright (C) 1991 Tom Horsley
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.
@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).
@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation.
@end ifinfo
@titlepage
@title The MKID Identifier Database, version @value{VERSION}
@subtitle A Simple, Fast, High-Capacity Cross-Referencer
@subtitle lid, gid, aid, eid, pid, iid
@author by Tom Horsley
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1991 Tom Horsley
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation.
@end titlepage
@ifinfo
@node Top, Overview, (dir), (dir)
@top GNU @code{mkid}
@menu
* Overview:: What is an ID database and what tools manipulate it?
* Mkid:: Mkid
* Database Query Tools:: Database Query Tools
* Iid:: Iid
* Other Tools:: Other Tools
* Command Index:: Command Index
@end menu
@end ifinfo
@node Overview, Mkid, Top, Top
@chapter Overview
@cindex Reference to First Chapter
An ID database is simply a file containing a list of file names, a list of
identifiers, and a binary relation (stored as a bit matrix) indicating which
of the identifiers appear in each file. With this database and some tools
to manipulate the data, a host of tasks become simpler and faster. You can
@code{grep} through hundreds of files for a name, skipping the files that
don't contain the name. You can search for all the memos containing
references to a project. You can edit every file that calls some function,
adding a new required argument. Anyone with a large software project to
maintain, or a large set of text files to organize, can benefit from the ID
database and the tools that manipulate it.
There are several programs in the ID family. The @code{mkid} program
scans the files, finds the identifiers and builds the ID database. The
@code{lid} and @code{aid} tools are used to generate lists of file names
containing an identifier (perhaps to recompile every file that
references a macro which just changed). The @code{eid} program will
invoke an editor on each of the files containing an identifier and the
@code{gid} program will @code{grep} for an identifier in the subset of
files known to contain it. The @code{pid} tool is used to query the
path names of the files in the database (rather than the contents).
Finally, the @code{iid} tool is an interactive program supporting
complex queries to intersect and join sets of file names.
@menu
* History:: History
@end menu
@node History, , Overview, Overview
@section History
Greg McGary conceived of the ideas behind mkid when he began hacking
the UNIX kernel in 1984. He needed a navigation tool to help him find
his way around the expansive, unfamiliar landscape. The first mkid-like tools
were built with shell scripts, and produced an ASCII database that looks
much like the output of `lid' with no arguments. It took over an hour
on a VAX 11/750 to build a database for a 4.1BSDish kernel. Lookups were
done with the UNIX command @code{look}, modified to handle very long lines.
In 1986, Greg rewrote mkid, lid, fid and idx in C to improve
performance. Database-build times were shortened by an order of
magnitude. The mkid tools were first posted to @file{comp.sources.unix}
in September of 1987.
Over the next few years, several versions diverged from the original
source. Tom Horsley at Harris Computer Systems Division stepped forward
to take over maintenance and integrated some of the fixes from divergent
versions. He also wrote the @code{iid} program. A pre-release of
@code{mkid2} was posted to @file{alt.sources} near the end of 1990. At
that time Tom wrote this texinfo manual with the encouragement of the net
community. (Tom thanks Doug Scofield and Bill Leonard, whom he dragooned
into helping him poorf raed and edit --- they found several problems in
the initial version.)
In January, 1995, Greg McGary reemerged as the primary maintainer and is
hereby launching @code{mkid-3} whose primary new feature is an efficient
algorithm for building databases that is linear over the size of the
input text for both time and space. (The old algorithm was quadratic
for space and choked on very large source trees.) The code is now under
GPL and might become a part of the GNU system. @code{Mkid-3} is an
interim release, since several significant enhancements are in the works.
These include an optional coupling with GNU grep, so that grep can use
an ID database for hints; a cscope work-alike query interface;
incremental update of the ID database; and an automatic file-tree walker
so you need not explicitly supply every file name argument to
the @code{mkid} program.
@node Mkid, Database Query Tools, Overview, Top
@chapter Mkid
The @code{mkid} program builds the ID database. To do this it must scan
each of the files included in the database. This takes some time, but
once the work is done the query programs run very rapidly.
The @code{mkid} program knows how to scan a variety of files. For
example, it knows how to skip over comments and strings in a C program,
only picking out the identifiers used in the code.
Identifiers are not the only thing included in the database.
Numbers are also scanned and included in the database indexed by
their binary value. Since the same number can be written many
different ways (47, 0x2f, 057 in a C program for instance), this
feature allows you to find hard coded uses of constants without
regard to the radix used to specify them.
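As a quick check of that equivalence, the shell's own @code{printf}
(not part of the ID tools, and used here only for illustration) converts
all three spellings to the same decimal value:
@example
for spelling in 47 0x2f 057; do
    # %d reads its argument as a C constant, so radix prefixes are honored.
    printf '%s = %d\n' "$spelling" "$spelling"
done
@end example
Each line reports the value 47, which is why a single database entry can
stand for every spelling of the number.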
All the places in this document where identifiers are written about
should really mention identifiers and numbers, but that gets fairly
clumsy after a while, so you should always keep in mind that numbers are
included in the database as well as identifiers.
@menu
* Mkid Command Line Options:: Mkid Command Line Options
* Builtin Scanners:: Builtin Scanners
* Adding Your Own Scanner:: Adding Your Own Scanner
* Mkid Examples:: Mkid Examples
@end menu
@node Mkid Command Line Options, Builtin Scanners, Mkid, Mkid
@section Mkid Command Line Options
@deffn Command mkid [@code{-v}] [@code{-S@var{scanarg}}] [@code{-a@var{arg-file}}] [@code{-}] [@code{-f@var{out-file}}] [@code{-u}] [@code{files}@dots{}]
@table @code
@item -v
Verbose. Mkid tells you as it scans each file and indicates which scanner
it is using. It also summarizes some statistics about the database at
the end.
@item -S@var{scanarg}
The @code{-S} option is used to specify arguments to the various language
scanners. @xref{Scanner Arguments}, for details.
@item -a@var{arg-file}
Name a file containing additional command line arguments (one per line). This
may be used to specify lists of file names longer than will fit on a command
line.
@item -
A simple @code{-} by itself means read arguments from stdin.
@item -f@var{out-file}
Specify the name of the database file to create. The default name is @code{ID}
(in the current directory), but you may specify any name. The file names
stored in the database will be stored relative to the directory containing
the database, so if you move the database after creating it, you may have
trouble finding files unless they remain in the same relative position.
@item -u
The @code{-u} option updates an existing database by rescanning any files
that have changed since the database was written. Unfortunately you cannot
incrementally add new files to a database.
@item files
Remaining arguments are names of files to be scanned and included in the
database.
@end table
@end deffn
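A few invocations combining these options are sketched below. The names
@file{project.id} and @file{files.list} are hypothetical, chosen only
for illustration:
@example
# Build a database named project.id, reporting each file as it is scanned.
mkid -v -fproject.id *.[ch]
# Keep a long list of file names in an argument file, one name per line,
# then hand that file to mkid instead of a long command line.
find . -name '*.[ch]' -print > files.list
mkid -afiles.list
@end example
The second form is mainly of interest when the list of files is too long
for the shell to pass directly.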
@menu
* Scanner Arguments:: Scanner Arguments
@end menu
@node Scanner Arguments, , Mkid Command Line Options, Mkid Command Line Options
@subsection Scanner Arguments
Scanner arguments all start with @code{-S}. Scanner arguments are used to tell
@code{mkid} which language scanner to use for which files, to pass language
specific options to the individual scanners, and to get some limited
online help about scanner options.
@code{Mkid} usually determines which language scanner to use on a file
by looking at the suffix of the file name. The suffix starts at the last
@samp{.} in a file name and includes the @samp{.} and all remaining
characters (for example the suffix of @file{fred.c} is @file{.c}). Not
all files have a suffix, and not all suffixes are bound to a specific
language by mkid. If @code{mkid} cannot determine what language a file
is, it will use the language bound to the @file{.default} suffix. The
plain text scanner is normally bound to @file{.default}, but the
@code{-S} option can be used to change any language bindings.
There are several different forms for scanner options:
@table @code
@item -S.@var{<suffix>}=@var{<language>}
@code{Mkid} determines which language scanner to use on a file by examining the
file name suffix. The @samp{.} is part of the suffix and must be specified
in this form of the @code{-S} option. For example @samp{-S.y=c} tells
@code{mkid} to use the @samp{c} language scanner for all files ending in
the @samp{.y} suffix.
@item -S.@var{<suffix>}=?
@code{Mkid} has several built in suffixes it already recognizes. Passing
a @samp{?} will cause it to print the language it will use to scan files
with that suffix.
@item -S?=@var{<language>}
This form will print which suffixes are scanned with the given language.
@item -S?=?
This prints all the suffix@expansion{}language bindings recognized by
@code{mkid}.
@item -S@var{<language>}-@var{<arg>}
Each language scanner accepts scanner dependent arguments. This form of the
@code{-S} option is used to pass arbitrary arguments to the language scanners.
@item -S@var{<language>}?
Passing a @samp{?} instead of a language option will print a brief summary
of the options recognized by the specified language scanner.
@item -S@var{<new language>}/@var{<builtin language>}/@var{<filter command>}
This form specifies a new language defined in terms of a builtin language
and a shell command that will be used to filter the file prior to passing
on to the builtin language scanner.
@end table
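The forms above might be used as follows. This is only a sketch: the
@file{.inc} suffix is a hypothetical example, and the quotes are there
to keep the shell from treating @samp{?} as a wild card:
@example
# Which language is used for .y files?
mkid '-S.y=?'
# Which suffixes are scanned as C?
mkid '-S?=c'
# Show every suffix binding, then the options of the C scanner.
mkid '-S?=?'
mkid '-Sc?'
# Scan files with a hypothetical .inc suffix as C.
mkid -S.inc=c *.inc
@end example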
@node Builtin Scanners, Adding Your Own Scanner, Mkid Command Line Options, Mkid
@section Builtin Scanners
If you run @code{mkid -S?=?} you will find bindings for a number of
languages; unfortunately pascal, though mentioned in the list, is not
actually supported. The supported languages are documented below
@footnote{This is not strictly true --- vhil is a supported language, but
it is an obsolete and arcane dialect of C and should be ignored}.
@menu
* C:: C
* Plain Text:: Plain Text
* Assembler:: Assembler
@end menu
@node C, Plain Text, Builtin Scanners, Builtin Scanners
@subsection C
The C scanner is probably the most popular. It scans identifiers out of
C programs, skipping over comments and strings in the process. The
normal @file{.c} and @file{.h} suffixes are automatically recognized as
C language, as well as the more obscure @file{.y} (yacc) and @file{.l}
(lex) suffixes.
The @code{-S} options recognized by the C scanner are:
@table @code
@item -Sc-s@var{<character>}
Allow the specified @var{<character>} in identifiers (some dialects of
C allow @code{$} in identifiers, so you could say @code{-Sc-s$} to
accept that dialect).
@item -Sc-u
Don't strip leading underscores from identifier names (this is the default
mode of operation).
@item -Sc+u
Do strip leading underscores from identifier names (I don't know why you
would want to do this in C programs, but the option is available).
@end table
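For instance, a source tree written in a dialect that allows @code{$}
in identifiers could be indexed as sketched here. The single quotes are
an assumption about the invoking shell, which would otherwise try to
interpret the @code{$}:
@example
mkid '-Sc-s$' *.[ch]
@end example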
@node Plain Text, Assembler, C, Builtin Scanners
@subsection Plain Text
The plain text scanner is designed for scanning documents. This is
typically the scanner used when adding custom scanners, and several
custom scanners are built in to @code{mkid} and defined in terms of filters
and the text scanner. A troff scanner runs @code{deroff} over the file
then feeds the result to the text scanner. A compressed man page scanner
runs @code{pcat} piped into @code{col -b}, and a @TeX{} scanner runs
@code{detex}.
Options:
@table @code
@item -Stext+a@var{<character>}
Include the specified character in identifiers. By default, standard
C identifiers are recognized.
@item -Stext-a@var{<character>}
Exclude the specified character from identifiers.
@item -Stext+s@var{<character>}
Squeeze the specified character out of identifiers. By default, the
characters @samp{'}, @samp{-}, and @samp{.} are squeezed out of identifiers.
This generates transformations like @var{fred's}@expansion{}@var{freds} or
@var{a.s.p.c.a.}@expansion{}@var{aspca}.
@item -Stext-s@var{<character>}
Do not squeeze out the specified character.
@end table
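The default squeezing can be imitated with the standard @code{tr}
utility. This is only a model of the transformation described above,
not the scanner's own code:
@example
# Delete the three characters that are squeezed out by default.
printf '%s\n' "fred's" "a.s.p.c.a." | tr -d "'.-"
@end example
This prints @samp{freds} and @samp{aspca}, matching the two
transformations shown in the table.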
@node Assembler, , Plain Text, Builtin Scanners
@subsection Assembler
Assemblers come in several flavors, so there are several options to
control scanning of assembly code:
@table @code
@item -Sasm-c@var{<character>}
The specified character starts a comment that extends to end of line
(in many assemblers this is a semicolon or number sign --- there is
no default value for this).
@item -Sasm+u
Strip the leading underscores off identifiers (the default behavior).
@item -Sasm-u
Do not strip the leading underscores.
@item -Sasm+a@var{<character>}
The specified character is allowed in identifiers.
@item -Sasm-a@var{<character>}
The specified character is allowed in identifiers, but any identifier
containing that character is ignored (often a @samp{.} or @samp{@@}
will be used to indicate an internal temp label; you may want to
ignore these).
@item -Sasm+p
Recognize C preprocessor directives in assembler source (default).
@item -Sasm-p
Do not recognize C preprocessor directives in assembler source.
@item -Sasm+C
Skip over C style comments in assembler source (default).
@item -Sasm-C
Do not skip over C style comments in assembler source.
@end table
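As a sketch, an assembler dialect that uses @samp{;} for comments and
@samp{.} in temporary labels might be scanned like this. The binding of
@file{.s} to the assembler scanner is given explicitly because the
default binding is not stated here, and the language name @samp{asm} is
inferred from the option names above:
@example
mkid -S.s=asm '-Sasm-c;' -Sasm-a. *.s
@end example
The semicolon is quoted so the shell does not treat it as a command
separator.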
@node Adding Your Own Scanner, Mkid Examples, Builtin Scanners, Mkid
@section Adding Your Own Scanner
There are two ways to add new scanners to @code{mkid}. The first is to
modify the code in @file{getscan.c} and add a new @file{scan-*.c} file
with the code for your scanner. This is not too hard, but it requires
relinking and installing a new version of @code{mkid}, which might be
inconvenient, and would lead to the proliferation of @code{mkid}
versions.
The second technique uses the @code{-S<lang>/<lang>/<filter>} form
of the @code{-S} option to specify a new language scanner. In this form
the first language is the name of the new language to be defined,
the second language is the name of an existing language scanner to
be invoked on the output of the filter command specified as the
third component of the @code{-S} option.
The filter is an arbitrary shell command. Somewhere in the filter string,
a @code{%s} should occur. This @code{%s} is replaced by the name of the
source file being scanned, the shell command is invoked, and whatever
comes out on @var{stdout} is scanned using the builtin scanner.
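The substitution step can be pictured with the shell's @code{printf}.
This is a model of the behavior just described, not @code{mkid}'s own
code, and the file name is hypothetical:
@example
filter='uncompress -c < %s | col -b'
file=./man1/ls.1.Z
# Substitute the file name for %s to form the command that would be run.
command=$(printf "$filter" "$file")
echo "$command"
@end example
This prints @samp{uncompress -c < ./man1/ls.1.Z | col -b}, the command
whose output would then be handed to the builtin scanner.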
For example, no scanner is provided for texinfo files (like this one).
If I wished to index the contents of this file, but avoid indexing the
texinfo directives, I would need a filter that stripped out the texinfo
directives, but left the remainder of the file intact. I could then use
the plain text scanner on the remainder. A quick way to specify this
might be:
@example
'-Stexinfo/text/sed s,@@[a-z]*,,g < %s'
@end example
This defines a new language scanner (@var{texinfo}) defined in terms of
a @code{sed} command to strip out texinfo directives (at signs followed
by letters). Once the directives are stripped, the remaining text is run
through the plain text scanner.
This is just an example; to do a better job I would actually need to
delete some lines (such as those beginning with @code{@@end}) as well
as deleting the @code{@@} directives embedded in the text.
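To see what this filter leaves behind, the same @code{sed} command can
be run on a single sample line (the sample text is made up for
illustration):
@example
echo '@@code@{mkid@} builds the @@emph@{ID@} database.' | sed 's,@@[a-z]*,,g'
@end example
This prints @samp{@{mkid@} builds the @{ID@} database.} The directive
names are gone but their braces remain, which is harmless to the plain
text scanner but shows how rough the filter is.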
@node Mkid Examples, , Adding Your Own Scanner, Mkid
@section Mkid Examples
The simplest example of @code{mkid} is something like:
@example
mkid *.[chy]
@end example
This will build an ID database indexing all the
identifiers and numbers in the @file{.c}, @file{.h}, and @file{.y} files
in the current directory. Because those suffixes are already known to
@code{mkid} as C language files, no other special arguments are required.
From a simple example, let's go to a more complex one. Suppose you want
to build a database indexing the contents of all the @var{man} pages.
Since @code{mkid} already knows how to deal with @file{.z} files, let's
assume your system is using the @code{compress} program to store
compressed cattable versions of the @var{man} pages. The
@code{compress} program creates files with a @code{.Z} suffix, so
@code{mkid} will have to be told how to scan @file{.Z} files. The
following code shows how to combine the @code{find} command with the
special scanner arguments to @code{mkid} to generate the required ID
database:
@example
cd /usr/catman
find . -name '*.Z' -print | mkid '-Sman/text/uncompress -c < %s' -S.Z=man -
@end example
This example first switches to the @file{/usr/catman} directory where
the compressed @var{man} pages are stored. The @code{find} command then
finds all the @file{.Z} files under that directory and prints their
names. This list is piped into the @code{mkid} program. The @code{-}
argument by itself (at the end of the line) tells @code{mkid} to read
arguments (in this case the list of file names) from @var{stdin}. The
first @code{-S} argument defines a new language (@var{man}) in terms of
the @code{uncompress} utility and the existing text scanner. The second
@code{-S} argument tells @code{mkid} to treat all @file{.Z} files as
language @var{man}. In practice, you might find the @code{mkid}
arguments need to be even more complex, something like:
@example
mkid '-Sman/text/uncompress -c < %s | col -b' -S.Z=man -
@end example
This will take the additional step of getting rid of any underlining and
backspacing which might be present in the compressed @var{man} pages.
@node Database Query Tools, Iid, Mkid, Top
@chapter Database Query Tools
The ID database is useless without database query tools. The remainder
of this document describes those tools.
The @code{lid}, @code{gid},
@code{aid}, @code{eid}, and @code{pid} programs are all the same program
installed with links to different names. The name used to invoke the
program determines how it will act.
The @code{iid} program is an interactive query shell that sits on top
of the other query tools.
@menu
* Common Options:: Common command line options
* Patterns:: Identifier pattern matching
* Lid:: Look up identifiers
* Aid:: Case insensitive lid
* Gid:: Grep for identifiers
* Eid:: Edit files with matching identifiers
* Pid:: Look up path names in database
@end menu
@node Common Options, Patterns, Database Query Tools, Database Query Tools
@section Common Options
Since many of the programs are really links to one common program, it
is only reasonable to expect that most of the query tools would share
common command line options. Not all options make sense for all programs,
but they are all described here. The description of each program
gives the options that program uses.
@table @code
@item -f@var{<file>}
Read the database specified by @var{<file>}. Normally the tools look
for a file named @file{ID} in either the current directory or in any
of the directories above the current directory. This means you can keep
a global @file{ID} database in the root of a large source tree and use
the query tools from anywhere within that tree.
@item -r@var{<directory>}
The query tools usually assume the file names in the database are relative
to the directory holding the database. The @code{-r} option tells the
tools to look for the files relative to @var{<directory>} regardless
of the location of the database.
@item -c
This is shorthand for @code{-r`pwd`}. It tells the query tools to assume
the file names are stored relative to the current working directory.
@item -e
Force the pattern arguments to be treated as regular expressions.
Normally the query tools attempt to guess if the patterns are regular
expressions or simple identifiers by looking for special characters
in the pattern.
@item -w
Force the pattern arguments to be treated as simple words even if
they contain special regular expression characters.
@item -k
Normally the query tools that generate lists of file names attempt to
compress the lists using the @code{csh} brace notation. This option
suppresses the file name compression and outputs each name in full.
(This is particularly useful if you are a @code{ksh} user and want to
feed the list of names to another command --- the @code{-k} option
comes from the @code{k} in @code{ksh}).
@item -g
It is possible to build the query tools so the @code{-k} option is the
default behavior. If this is the case for your system, the @code{-g}
option turns on the globbing of file names using the @code{csh} brace
notation.
@item -n
Normally the query tools that generate lists of file names also list
the matching identifier at the head of the list of names. This is
irritating if you want just a list of names to feed to another command,
so the @code{-n} option suppresses the identifier and lists only
file names.
@item -b
This option is only used by the @code{pid} tool. It restricts @code{pid}
to pattern match only the basename part of a file name. Normally the
absolute file name is matched against the pattern.
@item -d -o -x -a
These options may be used in any combination to limit the radix of
numeric matches. The @code{-d} option will allow matches on decimal
numbers, @code{-o} on octal, and @code{-x} on hexadecimal numbers.
The @code{-a} option is shorthand for specifying all three.
@item -m
Merge multiple lines of output into a single line. (If your query
matches more than one identifier the default action is to generate
a separate line of output for each matching identifier).
@item -s
Search for identifiers that appear only once in the database. This
helps to locate identifiers that are defined but never used.
@item -u@var{<number>}
List identifiers that conflict in the first @var{<number>} characters.
This could be useful when porting programs to brain-dead computers that
refuse to support long identifiers, but your best long term option
is to set such computers on fire.
@end table
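A few of these options are sketched below using @code{lid}. The
identifier @code{BUFSIZ} and the database path are hypothetical
examples:
@example
# One line per matching identifier, file names possibly in brace notation.
lid BUFSIZ
# File names only, written out in full, ready to feed to another command.
lid -n -k BUFSIZ
# The same query against a database kept elsewhere.
lid -f/usr/src/ID -n -k BUFSIZ
@end example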
@node Patterns, Lid, Common Options, Database Query Tools
@section Patterns
You can attempt to match either simple identifiers or numbers in a
query, or you can specify a regular expression pattern which may
match many different identifiers in the database. The query
programs use either @var{regex} and @var{regcmp} or @var{re_comp}
and @var{re_exec}, depending on which one is available in the library
on your system. These might not always support the exact same
regular expression syntax, so consult your local @var{man} pages
to find out. Any regular expression routines should support the following
syntax:
@table @code
@item .
A dot matches any character.
@item [ ]
Brackets match any of the characters specified within the brackets. You
can match any characters @emph{except} the ones in brackets by typing
@code{^} as the first character. A range of characters can be specified
using @code{-}.
@item *
An asterisk means repeat the previous pattern zero or more times.
@item ^
An @code{^} at the beginning of a pattern means the pattern must match
starting at the first character of the identifier.
@item $
A @code{$} at the end of the pattern means the pattern must match ending
at the last character in the identifier.
@end table
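The anchors and brackets can be tried out with the standard @code{grep}
utility, which stands in here for the query tools' regular expression
routines (an assumption, since the library on your system may differ).
The identifier list is made up for illustration:
@example
ids='fgets getchar getline strcpy strncpy'
# Identifiers that begin with "get".
printf '%s\n' $ids | grep '^get'
# Identifiers that contain "str", some letters, and end with "cpy".
printf '%s\n' $ids | grep 'str[a-z]*cpy$'
@end example
The first command prints @samp{getchar} and @samp{getline}; the second
prints @samp{strcpy} and @samp{strncpy}. The corresponding database
queries would be @samp{lid '^get'} and @samp{lid 'str[a-z]*cpy$'}.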
@node Lid, Aid, Patterns, Database Query Tools
@section Lid
@deffn Command lid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-ewdoxamskgnc}] patterns@dots{}
@end deffn
The @code{lid} program stands for @var{lookup identifier}.
It searches the database for any identifiers matching the patterns
and prints the names of the files that match each pattern. The exact
format of the output depends on the options.
@node Aid, Gid, Lid, Database Query Tools
@section Aid
@deffn Command aid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-doxamskgnc}] patterns@dots{}
@end deffn
The @code{aid} command is an abbreviation for @var{apropos identifier}.
The patterns cannot be regular expressions, but it looks for them using
a case insensitive match, and any pattern that is a substring of an
identifier in the database will match that identifier.
For example @samp{aid get} might match the identifiers @code{fgets},
@code{GETLINE}, and @code{getchar}.
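The matching rule can be modeled with a case insensitive @code{grep}
over a made-up list of identifiers. This only illustrates the rule; it
is not how @code{aid} is implemented:
@example
printf '%s\n' fgets GETLINE getchar strcpy | grep -i get
@end example
This prints @samp{fgets}, @samp{GETLINE}, and @samp{getchar}, but not
@samp{strcpy}.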
@node Gid, Eid, Aid, Database Query Tools
@section Gid
@deffn Command gid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-doxasc}] patterns@dots{}
@end deffn
The @code{gid} command stands for @var{grep for identifiers}. It finds
identifiers in the database that match the specified patterns, then
@code{greps} for those identifiers in just the set of files containing
matches. In a large source tree, this saves a fantastic amount of time.
There is an @var{emacs} interface to this program (@pxref{GNU Emacs Interface}).
If you are an @var{emacs} user, you will probably prefer the @var{emacs}
interface over the @code{eid} tool.
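The two steps @code{gid} performs can be approximated by hand, which may
help to picture what it does. This is a rough sketch rather than an
exact equivalent, and it assumes none of the file names contain spaces;
@code{BUFSIZ} is a hypothetical identifier:
@example
# lid lists the files known to contain BUFSIZ; grep then searches only those.
grep -n BUFSIZ $(lid -n -k BUFSIZ)
@end example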
@node Eid, Pid, Gid, Database Query Tools
@section Eid
@deffn Command eid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-doxasc}] patterns@dots{}
@end deffn
The @code{eid} command allows you to invoke an editor on each file containing
a matching pattern. The @code{EDITOR} environment variable is the name of the
program to be invoked. If the specified editor can accept an initial search
argument on the command line, you can use the @code{EIDARG}, @code{EIDLDEL},
and @code{EIDRDEL} environment variables to specify the form of that argument.
@table @code
@item EDITOR
The name of the editor program to invoke.
@item EIDARG
A printf string giving the form of the argument to pass containing the
initial search string (the matching identifier). For @code{vi}
it should be set to @samp{+/%s/}.
@item EIDLDEL
A string giving the regular expression pattern that forces a match at
the beginning (left end) of a word. This string is inserted in front
of the matching identifier when composing the search argument. For @code{vi},
this should be @samp{\<}.
@item EIDRDEL
The matching right end word delimiter. For @code{vi}, use @samp{\>}.
@end table
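For a Bourne-style shell, the @code{vi} settings might be written as in
this sketch. The final @code{printf} is only a preview of how the pieces
are assumed to combine for a hypothetical identifier named @code{main};
it is not something @code{eid} requires you to run:
@example
EDITOR=vi
EIDARG='+/%s/'
EIDLDEL='\<'
EIDRDEL='\>'
export EDITOR EIDARG EIDLDEL EIDRDEL
# Preview the search argument: left delimiter, identifier, right delimiter.
printf "$EIDARG\n" "$EIDLDEL"main"$EIDRDEL"
@end example
The preview prints @samp{+/\<main\>/}, which asks @code{vi} to start at
the first occurrence of @code{main} as a whole word.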
@node Pid, , Eid, Database Query Tools
@section Pid
@deffn Command pid [@code{-f@var{<file>}}] [@code{-u@var{<n>}}] [@code{-r@var{<dir>}}] [@code{-ebkgnc}] patterns@dots{}
@end deffn
The @code{pid} tool is unlike all the other tools. It matches the
patterns against the file names in the database rather than the
identifiers in the database. Patterns are treated as shell wild card
patterns unless the @code{-e} option is given, in which case full
regular expression matching is done.
The wild card pattern is matched against the absolute path name of the
file. Most shells treat slashes (@samp{/}) and file names that start with
a dot (@samp{.}) specially; @code{pid} does not. It simply attempts
to match the absolute path name string against the wild card pattern.
The @code{-b} option restricts the pattern matching to the base name of
the file (all the leading directory names are stripped prior to pattern
matching).
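The difference between matching the whole path name and matching the
base name can be sketched in the shell itself. Shell @code{case}
patterns, unlike file name expansion, do not treat @samp{/} or a
leading @samp{.} specially, so they serve here as a stand-in for the
matching described above. This is an analogy only, not the code
@code{pid} uses, and the path name is hypothetical.

@example
# Sketch only: `case' patterns stand in for pid's wild card matching.
path=/usr/src/.rcs/lex.c            # hypothetical absolute path name

case $path in
  */src/*.c) spans=match ;;         # `*' crosses `/' and a leading `.'
  *)         spans=no-match ;;
esac

case $path in
  lex*) whole=match ;;              # fails: the string starts with /usr
  *)    whole=no-match ;;
esac

base=$(basename "$path")            # roughly what -b leaves: lex.c
case $base in
  lex*) stripped=match ;;
  *)    stripped=no-match ;;
esac

echo "spans: $spans, whole: $whole, stripped: $stripped"
@end example

By the same reasoning, a pattern such as @samp{lex*} would be expected
to need the @code{-b} option, while @samp{*/src/*.c} would be tried
against the full path name.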
@node Iid, Other Tools, Database Query Tools, Top
@chapter Iid
@deffn Command iid [@code{-a}] [@code{-c@var{<command>}}] [@code{-H}]
@table @code
@item -a
Normally @code{iid} uses the @code{lid} command to search for names.
If you give the @code{-a} option on the command line, then it will
use @code{aid} as the default search engine.
@item -c@var{<command>}
In normal operation, @code{iid} starts up and prompts you for commands
used to build sets of files. The @code{-c} option passes a single query
command to @code{iid}, which executes it and then exits.
@item -H
The @code{-H} option prints a short help message and exits. To get more
help use the @code{help} command from inside @code{iid}.
@end table
@end deffn
The @code{iid} program is an interactive ID query tool. It operates by
running the other query programs (such as @code{lid} and @code{aid})
and creating sets of file names returned by these queries. It also
provides operators for @code{anding} and @code{oring} these sets to
create new sets.
The @code{PAGER} environment variable names the program @code{iid} uses
to display files. If you use @code{emacs}, you might want to set
@code{PAGER} so it invokes the @code{emacsclient} program. Check the
file @file{lisp/server.el} in the emacs source tree for documentation on
this. It is useful not only with X windows, but also when running
@code{iid} from an emacs shell buffer. There is also a somewhat spiffier
version called gnuserv by Andy Norman
(@code{ange%anorman@@hplabs.hp.com}) which appeared in @file{comp.emacs}
sometime in 1989.
@menu
* Ss and Files commands:: Ss and Files commands
* Sets:: Sets
* Show:: Show
* Begin:: Begin
* Help:: Help
* Off:: Off
* Shell Commands as Queries:: Shell Commands as Queries
* Shell Escape:: Shell Escape
@end menu
@node Ss and Files commands, Sets, Iid, Iid
@section Ss and Files commands
The primary query commands are @code{ss} (for select sets) and @code{files}
(for show file names). These commands both take a query expression as an
argument.
@deffn Subcommand ss query
The @code{ss} command runs a query and builds a set (or sets) of file names. The
result is printed as a summary of the sets constructed showing how many file
names are in each set.
@end deffn
@deffn Subcommand files query
The @code{files} command is like the @code{ss} command, but rather than printing
a summary, it displays the full list of matching file names.
@end deffn
@deffn Subcommand f query
The @code{f} command is merely a shorthand notation for @code{files}.
@end deffn
Database queries are simple expressions with operators like @code{and}
and @code{or}. Parentheses can be used to group operations. The complete
set of operators is summarized below:
@table @code
@item @var{pattern}
Any pattern not recognized as one of the keywords in this table is treated
as an identifier to be searched for in the database. It is passed as an
argument to the default search program (normally @code{lid}, but @code{aid}
is used if the @code{-a} option was given when @code{iid} was started).
The result of this operation is a set of file names, and it is assigned a
unique set number.
@item lid
@code{lid} is a keyword. It is used to invoke @code{lid} with the list of
identifiers following it as arguments. This forces the use of @code{lid}
regardless of the state of the @code{-a} option (@pxref{Lid}).
@item aid
The @code{aid} keyword is like the @code{lid} keyword, but it forces the
use of the @code{aid} program (@pxref{Aid}).
@item match
The @code{match} operator invokes the @code{pid} program to do pattern
matching on file names rather than identifiers. The set generated contains
the file names that match the specified patterns (@pxref{Pid}).
@item or
The @code{or} operator takes two sets of file names as arguments and generates
a new set containing every file that appears in either set.
@item and
The @code{and} operator takes two sets of file names and generates a new
set containing only the files that appear in both sets.
@item not
The @code{not} operator inverts a set of file names, producing the set of
all files not in the input set.
@item set number
A set number consists of the letter @code{s} followed immediately by a number.
This refers to one of the sets created by a previous query operation. During
one @code{iid} session, each query generates a unique set number, so any
previously generated set may be used as part of any new query by referring
to the set number.
@end table
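The meaning of the three set operators can be modelled with ordinary
shell tools. The sketch below treats a set as a sorted list of file
names and uses @code{sort} and @code{comm} to form the union,
intersection, and complement. It is a model of the semantics only, not
the way @code{iid} stores or combines sets, and the file names are
made up.

@example
# Sketch only: sorted lists of file names stand in for iid's sets.
tmp=$(mktemp -d)
printf '%s\n' a.c b.c c.h > "$tmp/all"   # every file in the database
printf '%s\n' a.c b.c     > "$tmp/s1"
printf '%s\n' b.c c.h     > "$tmp/s2"

s_or=$(sort -u "$tmp/s1" "$tmp/s2" | xargs)      # s1 or s2
s_and=$(comm -12 "$tmp/s1" "$tmp/s2" | xargs)    # s1 and s2
s_not=$(comm -23 "$tmp/all" "$tmp/s1" | xargs)   # not s1

echo "or: $s_or"
echo "and: $s_and"
echo "not: $s_not"
rm -r "$tmp"
@end example

Note that the complement is taken relative to all the files in the
database, which is why the sketch needs the @file{all} list.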
The @code{not} operator has the highest precedence with @code{and}
coming in the middle and @code{or} having the lowest precedence. The
operator names are recognized using case insensitive matching, so
@code{AND}, @code{and}, and @code{aNd} are all the same as far as
@code{iid} is concerned. If you wish to use a keyword as an operand to
one of the query programs, you must enclose it in quotes. Any patterns
containing shell special characters must also be properly quoted or
escaped, since the query commands are run by invoking them with the
shell.
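To illustrate the precedence rules, the two queries below should build
the same final set, since @code{not} binds most tightly and @code{or}
least tightly. The identifiers @code{foo}, @code{bar}, and @code{baz}
are hypothetical, and the lines are typed at the @code{iid} prompt
rather than in the shell.

@example
ss foo or bar and not baz
ss foo or (bar and (not baz))
@end example

To apply @code{or} first instead, the grouping has to be written out
explicitly, as in @code{ss (foo or bar) and not baz}.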
Summary of query expression syntax:
@example
A <query> is:
<set number>
<identifier>
lid <identifier list>
aid <identifier list>
match <wild card list>
<query> or <query>
<query> and <query>
not <query>
( <query> )
@end example
@node Sets, Show, Ss and Files commands, Iid
@section Sets
@deffn Subcommand sets
@end deffn
The @code{sets} command displays all the sets created so far. Each one
is described by the query command that generated it.
@node Show, Begin, Sets, Iid
@section Show
@deffn Subcommand show set
@end deffn
@deffn Subcommand p set
@end deffn
The @code{show} and @code{p} commands are equivalent. They both accept
a set number as an argument and run the program given in the @code{PAGER}
environment variable with the file names in that set as arguments.
@node Begin, Help, Show, Iid
@section Begin
@deffn Subcommand begin directory
@end deffn
@deffn Subcommand b directory
@end deffn
The @code{begin} command (and its abbreviated version @code{b}) is used
to begin a new @code{iid} session in a different directory (which presumably
contains a different database). It flushes all the sets created so far
and switches to the specified directory. It is equivalent to exiting @code{iid},
changing directories in the shell, and running @code{iid} again.
@node Help, Off, Begin, Iid
@section Help
@deffn Subcommand help
@end deffn
@deffn Subcommand h
@end deffn
@deffn Subcommand ?
@end deffn
The @code{help}, @code{h}, and @code{?} commands are three different ways to
ask for help. They all invoke the @code{PAGER} program to display a short
help file.
@node Off, Shell Commands as Queries, Help, Iid
@section Off
@deffn Subcommand off
@end deffn
@deffn Subcommand quit
@end deffn
@deffn Subcommand q
@end deffn
These three commands (or just an end of file) all cause @code{iid} to exit.
@node Shell Commands as Queries, Shell Escape, Off, Iid
@section Shell Commands as Queries
When the first word of an @code{iid} command is not recognized as a
built-in @code{iid} command, @code{iid} assumes the command is a shell
command that writes a list of file names to @var{stdout}. This list
of file names is used to generate a new set of files.
Any set numbers that appear as arguments to this command are expanded
into lists of file names prior to running the command.
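For instance, typing a line such as @code{grep -l FIXME s1} at the
@code{iid} prompt would, by the rules above, run @code{grep} on the
files in set @code{s1} and turn the names it prints into a new set. The
shell sketch below imitates only the expanded command, assuming a
hypothetical set that holds @file{a.c} and @file{b.c}; the set number,
file names, and search string are all invented for the illustration.

@example
# Sketch only: the command as it might look after set expansion.
tmp=$(mktemp -d)
printf 'int x; /* FIXME */\n' > "$tmp/a.c"
printf 'int y;\n'             > "$tmp/b.c"

# grep -l prints the names of the files that contain the string,
# which is the kind of output iid expects from a query command.
new_set=$(cd "$tmp" && grep -l FIXME a.c b.c)
printf '%s\n' "$new_set"
rm -r "$tmp"
@end example

Here only @file{a.c} is printed, so the resulting set would contain
that one file.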
@node Shell Escape, , Shell Commands as Queries, Iid
@section Shell Escape
If a command starts with a bang (@code{!}) character, the remainder of
the line is run as a shell command. Any set numbers that appear as
arguments to this command are expanded into lists of file names prior to
running the command.
@node Other Tools, Command Index, Iid, Top
@chapter Other Tools
This chapter describes some support tools that work with the other ID
programs.
@menu
* GNU Emacs Interface:: Using gid.el
* Fid:: List identifiers in a file.
* Idx:: Extract identifiers from source file.
@end menu
@node GNU Emacs Interface, Fid, Other Tools, Other Tools
@section GNU Emacs Interface
The source distribution comes with a file named @file{gid.el}. This is
a GNU emacs interface to the @code{gid} tool. If you put the file where
emacs can find it (somewhere in your @code{EMACSLOADPATH}) and put
@code{(autoload 'gid "gid" nil t)} in your @file{.emacs} file, you will
be able to invoke the @code{gid} function using @kbd{M-x gid}.
This function prompts you with the word the cursor is on. If you want
to search for a different pattern, simply delete the line and type the
pattern of interest.
It runs @code{gid} in a @code{*compilation*} buffer, so the normal
@code{next-error} function can be used to visit all the places the
identifier is found (@pxref{Compilation,,,emacs,The GNU Emacs Manual}).
@node Fid, Idx, GNU Emacs Interface, Other Tools
@section Fid
@deffn Command fid [@code{-f@var{<file>}}] file1 [file2]
@table @code
@item -f@var{<file>}
Look in the named database.
@item @var{file1}
List the identifiers contained in @var{file1} according to the database.
@item @var{file2}
If a second file is given, list only the identifiers both files have
in common.
@end table
@end deffn
The @code{fid} program provides an inverse query. Instead of listing
files containing some identifier, it lists the identifiers found in
a file.
@node Idx, , Fid, Other Tools
@section Idx
@deffn Command idx [@code{-s@var{<directory>}}] [@code{-r@var{<directory>}}] [@code{-S@var{<scanarg>}}] files@dots{}
The @code{-s}, @code{-r}, and @code{-S} arguments to @code{idx}
have the same meaning as the corresponding arguments to @code{mkid}
(@pxref{Mkid Command Line Options}).
@end deffn
The @code{idx} command is more of a test frame for scanners than a tool
designed to be independently useful. It takes the same scanner arguments
as @code{mkid}, but rather than building a database, it prints the
identifiers found to @var{stdout}, one per line. You can use it to try
out a scanner on a sample file to make sure it is extracting the
identifiers you believe it should extract.
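To give a feel for the kind of output to expect, the sketch below uses
@code{grep -o} as a crude stand-in for a scanner and prints each
identifier found in a one-line sample file. It is not what @code{idx}
runs: a real scanner understands comments, strings, and the like. The
sorting and removal of duplicates are assumptions made to keep the
output readable, and the sample text is invented.

@example
# Sketch only: a crude identifier extractor, not a real idx scanner.
tmp=$(mktemp)
printf 'total = count + step;\n' > "$tmp"

# Print every run of identifier characters, one per line.
ids=$(grep -o '[A-Za-z_][A-Za-z0-9_]*' "$tmp" | sort -u)
printf '%s\n' "$ids"
rm "$tmp"
@end example

Comparing a list like this with the output of @code{idx} on the same
file is one way to see which names a given scanner keeps or discards.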
@node Command Index, , Other Tools, Top
@unnumbered Command Index
@printindex fn
@contents
@bye