This is Info file mkid.info, produced by Makeinfo-1.55 from the input
file mkid.texinfo.

START-INFO-DIR-ENTRY
* mkid: (mkid).			Identifier database utilities
END-INFO-DIR-ENTRY

   This file documents the `mkid' identifier database utilities.

   Copyright (C) 1991 Tom Horsley

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation.


File: mkid.info,  Node: Top,  Next: Overview,  Prev: (dir),  Up: (dir)

* Menu:

* Overview::                    What is an ID database and what tools manipulate it?
* Mkid::                        Mkid
* Database Query Tools::        Database Query Tools
* Iid::                         Iid
* Other Tools::                 Other Tools
* Command Index::               Command Index


File: mkid.info,  Node: Overview,  Next: Mkid,  Prev: Top,  Up: Top

Overview
********

   An ID database is simply a file containing a list of file names, a
list of identifiers, and a binary relation (stored as a bit matrix)
indicating which of the identifiers appear in each file.  With this
database and some tools to manipulate the data, a host of tasks become
simpler and faster. You can `grep' through hundreds of files for a
name, skipping the files that don't contain the name.  You can search
for all the memos containing references to a project.  You can edit
every file that calls some function, adding a new required argument.
Anyone with a large software project to maintain, or a large set of
text files to organize can benefit from the ID database and the tools
that manipulate it.
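
   As a rough illustration of the data model just described, the
following Python sketch builds an in-memory file list, identifier list,
and bit matrix.  The names `build_index' and `files_containing' are
hypothetical, the tokenizer is a plain regular expression, and nothing
here reflects the real on-disk format written by `mkid'.

```python
import re

# Assumption: identifiers look like C identifiers.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_index(sources):
    """sources maps file name -> file text.

    Returns (files, idents, rows).  rows[i] is an int used as a bit
    vector: bit j is set when idents[i] appears in files[j].
    """
    files = sorted(sources)
    bits = {}
    for j, name in enumerate(files):
        for ident in set(IDENT.findall(sources[name])):
            bits[ident] = bits.get(ident, 0) | (1 << j)
    idents = sorted(bits)
    rows = [bits[ident] for ident in idents]
    return files, idents, rows

def files_containing(index, ident):
    """List the files whose bit is set in the row for ident."""
    files, idents, rows = index
    if ident not in idents:
        return []
    row = rows[idents.index(ident)]
    return [name for j, name in enumerate(files) if (row >> j) & 1]
```

   A query only has to read one row of the matrix to learn which files
are worth opening, which is why the query tools can skip files that do
not contain the name.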

   There are several programs in the ID family.  The `mkid' program
scans the files, finds the identifiers and builds the ID database.  The
`lid' and `aid' tools are used to generate lists of file names
containing an identifier (perhaps to recompile every file that
references a macro which just changed). The `eid' program will invoke
an editor on each of the files containing an identifier and the `gid'
program will `grep' for an identifier in the subset of files known to
contain it.  The `pid' tool is used to query the path names of the
files in the database (rather than the contents).  Finally, the `iid'
tool is an interactive program supporting complex queries to intersect
and join sets of file names.

* Menu:

* History::                     History


File: mkid.info,  Node: History,  Prev: Overview,  Up: Overview

History
=======

   Greg McGary conceived of the ideas behind mkid when he began hacking
the UNIX kernel in 1984.  He needed a navigation tool to help him find
his way around the expansive, unfamiliar landscape.  The first mkid-like
tools were built with shell scripts, and produced an ASCII database that
looked much like the output of `lid' with no arguments.  It took over an
hour on a VAX 11/750 to build a database for a 4.1BSDish kernel.
Lookups were done with the UNIX command `look', modified to handle very
long lines.

   In 1986, Greg rewrote mkid, lid, fid and idx in C to improve
performance.  Database-build times were shortened by an order of
magnitude.  The mkid tools were first posted to `comp.sources.unix' in
September of 1987.

   Over the next few years, several versions diverged from the original
source.  Tom Horsley at Harris Computer Systems Division stepped forward
to take over maintenance and integrated some of the fixes from divergent
versions.  He also wrote the `iid' program.  A pre-release of `mkid2'
was posted to `alt.sources' near the end of 1990.  At that time Tom
wrote this texinfo manual with the encouragement of the net community.
(Tom thanks Doug Scofield and Bill Leonard, whom he dragooned into
helping him poorf raed and edit -- they found several problems in the
initial version.)

   In January, 1995, Greg McGary reemerged as the primary maintainer and
is hereby launching `mkid-3' whose primary new feature is an efficient
algorithm for building databases that is linear over the size of the
input text for both time and space.  (The old algorithm was quadratic
for space and choked on very large source trees.)  The code is now under
GPL and might become a part of the GNU system.  `Mkid-3' is an interim
release, since several significant enhancements are in the works.  These
include an optional coupling with GNU grep, so that grep can use an ID
database for hints; a cscope work-alike query interface; incremental
update of the ID database; and an automatic file-tree walker so you
need not explicitly supply every file name argument to the `mkid'
program.


File: mkid.info,  Node: Mkid,  Next: Database Query Tools,  Prev: Overview,  Up: Top

Mkid
****

   The `mkid' program builds the ID database.  To do this it must scan
each of the files included in the database.  This takes some time, but
once the work is done the query programs run very rapidly.

   The `mkid' program knows how to scan a variety of files. For
example, it knows how to skip over comments and strings in a C program,
only picking out the identifiers used in the code.

   Identifiers are not the only thing included in the database.
Numbers are also scanned and included in the database indexed by their
binary value. Since the same number can be written many different ways
(47, 0x2f, 057 in a C program for instance), this feature allows you to
find hard coded uses of constants without regard to the radix used to
specify them.
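
   A minimal Python sketch of the idea: each spelling is reduced to the
integer it denotes, and that integer is what gets indexed.  The function
name `c_literal_value' is hypothetical, and the sketch ignores C
suffixes such as `L' or `U'; it is not the code `mkid' uses.

```python
def c_literal_value(text):
    """Return the integer value of a C-style integer literal."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        # Hexadecimal: 0x2f
        return int(lowered[2:], 16)
    if lowered.startswith("0") and len(lowered) > 1:
        # Octal: a leading zero followed by more digits, as in 057
        return int(lowered[1:], 8)
    # Decimal: 47
    return int(lowered, 10)

# All three spellings from the text share one key.
keys = {c_literal_value(s) for s in ("47", "0x2f", "057")}
```

   Because the three spellings collapse to the single key 47, a query
for any one of them can find the other two.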

   All the places in this document where identifiers are written about
should really mention identifiers and numbers, but that gets fairly
clumsy after a while, so you should always keep in mind that numbers are
included in the database as well as identifiers.

* Menu:

* Mkid Command Line Options::   Mkid Command Line Options
* Builtin Scanners::            Builtin Scanners
* Adding Your Own Scanner::     Adding Your Own Scanner
* Mkid Examples::               Mkid Examples


File: mkid.info,  Node: Mkid Command Line Options,  Next: Builtin Scanners,  Prev: Mkid,  Up: Mkid

Mkid Command Line Options
=========================

 - Command: mkid [`-v'] [`-SSCANARG'] [`-aARG-FILE'] [`-']
          [`-fOUT-FILE'] [`-u'] [`files'...]
    `-v'
          Verbose. Mkid tells you as it scans each file and indicates
          which scanner it is using. It also summarizes some statistics
          about the database at the end.

    `-SSCANARG'
          The `-S' option is used to specify arguments to the various
          language scanners. *Note Scanner Arguments::, for details.

    `-aARG-FILE'
          Name a file containing additional command line arguments (one
          per line). This may be used to specify lists of file names
          longer than will fit on a command line.

    `-'
          A simple `-' by itself means read arguments from stdin.

    `-fOUT-FILE'
          Specify the name of the database file to create. The default
          name is `ID' (in the current directory), but you may specify
          any name. The file names stored in the database will be
          stored relative to the directory containing the database, so
          if you move the database after creating it, you may have
          trouble finding files unless they remain in the same relative
          position.

    `-u'
          The `-u' option updates an existing database by rescanning
          any files that have changed since the database was written.
          Unfortunately you cannot incrementally add new files to a
          database.

    `files'
          Remaining arguments are names of files to be scanned and
          included in the database.

* Menu:

* Scanner Arguments::           Scanner Arguments


File: mkid.info,  Node: Scanner Arguments,  Prev: Mkid Command Line Options,  Up: Mkid Command Line Options

Scanner Arguments
-----------------

   Scanner arguments all start with `-S'. Scanner arguments are used to
tell `mkid' which language scanner to use for which files, to pass
language specific options to the individual scanners, and to get some
limited online help about scanner options.

   `Mkid' usually determines which language scanner to use on a file by
looking at the suffix of the file name. The suffix starts at the last
`.' in a file name and includes the `.' and all remaining characters
(for example the suffix of `fred.c' is `.c'). Not all files have a
suffix, and not all suffixes are bound to a specific language by mkid.
If `mkid' cannot determine what language a file is, it will use the
language bound to the `.default' suffix. The plain text scanner is
normally bound to `.default', but the `-S' option can be used to change
any language bindings.
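
   The lookup rule above can be sketched in Python as follows.  The
table holds only the bindings this manual mentions (the C suffixes and
`.default'); the names `BINDINGS', `suffix_of', and `language_for' are
hypothetical, and the real table inside `mkid' is larger.

```python
# Assumed subset of the built-in suffix table.
BINDINGS = {".c": "c", ".h": "c", ".y": "c", ".l": "c", ".default": "text"}

def suffix_of(file_name):
    """Return the suffix: the last '.' of the base name and all that
    follows it, or an empty string when there is no '.'."""
    base = file_name.rsplit("/", 1)[-1]
    dot = base.rfind(".")
    return base[dot:] if dot != -1 else ""

def language_for(file_name, bindings=BINDINGS):
    """Pick a scanner language, falling back to the .default binding."""
    return bindings.get(suffix_of(file_name), bindings[".default"])

# The effect of an option such as -S.Z=man is one more table entry.
with_man = dict(BINDINGS)
with_man[".Z"] = "man"
```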

   There are several different forms for scanner options:
`-S.<SUFFIX>=<LANGUAGE>'
     `Mkid' determines which language scanner to use on a file by
     examining the file name suffix. The `.' is part of the suffix and
     must be specified in this form of the `-S' option. For example
     `-S.y=c' tells `mkid' to use the `c' language scanner for all
     files ending in the `.y' suffix.

`-S.<SUFFIX>=?'
     `Mkid' has several built in suffixes it already recognizes. Passing
     a `?' will cause it to print the language it will use to scan files
     with that suffix.

`-S?=<LANGUAGE>'
     This form will print which suffixes are scanned with the given
     language.

`-S?=?'
     This prints all the suffix==>language bindings recognized by
     `mkid'.

`-S<LANGUAGE>-<ARG>'
     Each language scanner accepts scanner dependent arguments. This
     form of the `-S' option is used to pass arbitrary arguments to the
     language scanners.

`-S<LANGUAGE>?'
     Passing a `?' instead of a language option will print a brief
     summary of the options recognized by the specified language
     scanner.

`-S<NEW LANGUAGE>/<BUILTIN LANGUAGE>/<FILTER COMMAND>'
     This form specifies a new language defined in terms of a builtin
     language and a shell command that will be used to filter the file
     prior to passing on to the builtin language scanner.


File: mkid.info,  Node: Builtin Scanners,  Next: Adding Your Own Scanner,  Prev: Mkid Command Line Options,  Up: Mkid

Builtin Scanners
================

   If you run `mkid -S?=?' you will find bindings for a number of
languages; unfortunately pascal, though mentioned in the list, is not
actually supported.  The supported languages are documented below (1).

* Menu:

* C::                           C
* Plain Text::                  Plain Text
* Assembler::                   Assembler

   ---------- Footnotes ----------

   (1)  This is not strictly true -- vhil is a supported language, but
it is an obsolete and arcane dialect of C and should be ignored.


File: mkid.info,  Node: C,  Next: Plain Text,  Prev: Builtin Scanners,  Up: Builtin Scanners

C
-

   The C scanner is probably the most popular. It scans identifiers out
of C programs, skipping over comments and strings in the process.  The
normal `.c' and `.h' suffixes are automatically recognized as C
language, as well as the more obscure `.y' (yacc) and `.l' (lex)
suffixes.

   The `-S' options recognized by the C scanner are:

`-Sc-s<CHARACTER>'
     Allow the specified <CHARACTER> in identifiers (some dialects of C
     allow `$' in identifiers, so you could say `-Sc-s$' to accept that
     dialect).

`-Sc-u'
     Don't strip leading underscores from identifier names (this is the
     default mode of operation).

`-Sc+u'
     Do strip leading underscores from identifier names (I don't know
     why you would want to do this in C programs, but the option is
     available).


File: mkid.info,  Node: Plain Text,  Next: Assembler,  Prev: C,  Up: Builtin Scanners

Plain Text
----------

   The plain text scanner is designed for scanning documents. This is
typically the scanner used when adding custom scanners, and several
custom scanners are built in to `mkid' and defined in terms of filters
and the text scanner. A troff scanner runs `deroff' over the file then
feeds the result to the text scanner. A compressed man page scanner
runs `pcat' piped into `col -b', and a TeX scanner runs `detex'.

   Options:

`-Stext+a<CHARACTER>'
     Include the specified character in identifiers. By default,
     standard C identifiers are recognized.

`-Stext-a<CHARACTER>'
     Exclude the specified character from identifiers.

`-Stext+s<CHARACTER>'
     Squeeze the specified character out of identifiers. By default, the
     characters `'', `-', and `.' are squeezed out of identifiers.
     This generates transformations like FRED'S==>FREDS or
     A.S.P.C.A.==>ASPCA.

`-Stext-s<CHARACTER>'
     Do not squeeze out the specified character.
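
   The squeezing transformation can be sketched in Python like this.
The function name `text_identifiers' is hypothetical and the tokenizing
rule (letters, digits, underscore, plus the squeezed characters) is an
assumption made for the sketch, not a description of the real scanner.

```python
import re

def text_identifiers(text, squeeze="'-."):
    """Split text into tokens, then drop the squeezed characters.

    A squeezed character may appear inside a token but is removed
    before the identifier is recorded.
    """
    token = re.compile("[A-Za-z0-9_" + re.escape(squeeze) + "]+")
    result = []
    for raw in token.findall(text):
        ident = "".join(ch for ch in raw if ch not in squeeze)
        if ident:  # skip tokens made only of squeezed characters
            result.append(ident)
    return result
```

   With the default set the input FRED'S yields FREDS; passing a
`squeeze' value without the apostrophe, much as `-Stext-s'' would,
splits it into FRED and S instead.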


File: mkid.info,  Node: Assembler,  Prev: Plain Text,  Up: Builtin Scanners

Assembler
---------

   Assemblers come in several flavors, so there are several options to
control scanning of assembly code:

`-Sasm-c<CHARACTER>'
     The specified character starts a comment that extends to end of
     line (in many assemblers this is a semicolon or number sign --
     there is no default value for this).

`-Sasm+u'
     Strip the leading underscores off identifiers (the default
     behavior).

`-Sasm-u'
     Do not strip the leading underscores.

`-Sasm+a<CHARACTER>'
     The specified character is allowed in identifiers.

`-Sasm-a<CHARACTER>'
     The specified character is allowed in identifiers, but any
     identifier containing that character is ignored (a `.' or `@' is
     often used to mark an internal temp label, and you may want to
     ignore these).

`-Sasm+p'
     Recognize C preprocessor directives in assembler source (default).

`-Sasm-p'
     Do not recognize C preprocessor directives in assembler source.

`-Sasm+C'
     Skip over C style comments in assembler source (default).

`-Sasm-C'
     Do not skip over C style comments in assembler source.


File: mkid.info,  Node: Adding Your Own Scanner,  Next: Mkid Examples,  Prev: Builtin Scanners,  Up: Mkid

Adding Your Own Scanner
=======================

   There are two ways to add new scanners to `mkid'. The first is to
modify the code in `getscan.c' and add a new `scan-*.c' file with the
code for your scanner. This is not too hard, but it requires relinking
and installing a new version of `mkid', which might be inconvenient,
and would lead to the proliferation of `mkid' versions.

   The second technique uses the  `-S<lang>/<lang>/<filter>' form of
the `-S' option to specify a new language scanner. In this form the
first language is the name of the new language to be defined, the
second language is the name of an existing language scanner to be
invoked on the output of the filter command specified as the third
component of the `-S' option.

   The filter is an arbitrary shell command. Somewhere in the filter
string, a `%s' should occur. This `%s' is replaced by the name of the
source file being scanned, the shell command is invoked, and whatever
comes out on STDOUT is scanned using the builtin scanner.

   For example, no scanner is provided for texinfo files (like this
one).  If I wished to index the contents of this file, but avoid
indexing the texinfo directives, I would need a filter that stripped
out the texinfo directives, but left the remainder of the file intact.
I could then use the plain text scanner on the remainder. A quick way
to specify this might be:

     '-Stexinfo/text/sed s,@[a-z]*,,g < %s'

   This defines a new language scanner (TEXINFO) defined in terms of a
`sed' command to strip out texinfo directives (at signs followed by
letters). Once the directives are stripped, the remaining text is run
through the plain text scanner.

   This is just an example; to do a better job I would actually need to
delete some lines (such as those beginning with `@end') as well as
deleting the `@' directives embedded in the text.
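
   To make the filter mechanism concrete, here is a Python sketch of
the substitute-and-run step for the `sed' filter discussed in this
node.  The helper names `run_filter' and `demo' are hypothetical.
Quoting the file name with `shlex.quote' and quoting the `sed' script
are precautions added for the sketch; this manual does not say how
`mkid' itself handles quoting.

```python
import os
import shlex
import subprocess
import tempfile

def run_filter(filter_command, source_path):
    """Replace the first %s with the shell-quoted file name, run the
    command through the shell, and return what it wrote to stdout."""
    command = filter_command.replace("%s", shlex.quote(source_path), 1)
    done = subprocess.run(command, shell=True, check=True,
                          stdout=subprocess.PIPE, universal_newlines=True)
    return done.stdout

def demo():
    """Filter a two-line texinfo fragment and return the result."""
    with tempfile.NamedTemporaryFile("w", suffix=".texinfo",
                                     delete=False) as tmp:
        tmp.write("@section Intro\nsome @code{text}\n")
    try:
        return run_filter("sed 's,@[a-z]*,,g' < %s", tmp.name)
    finally:
        os.unlink(tmp.name)
```

   The text returned by `run_filter' is what would then be handed to
the builtin scanner named in the second component of the `-S' option.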


File: mkid.info,  Node: Mkid Examples,  Prev: Adding Your Own Scanner,  Up: Mkid

Mkid Examples
=============

   The simplest example of `mkid' is something like:

     mkid *.[chy]

   This will build an ID database indexing all the identifiers and
numbers in the `.c', `.h', and `.y' files in the current directory.
Because those suffixes are already known to `mkid' as C language files,
no other special arguments are required.

   From a simple example, let's go to a more complex one. Suppose you
want to build a database indexing the contents of all the MAN pages.
Since `mkid' already knows how to deal with `.z' files, let's assume
your system is using the `compress' program to store compressed
cattable versions of the MAN pages.  The `compress' program creates
files with a `.Z' suffix, so `mkid' will have to be told how to scan
`.Z' files. The following code shows how to combine the `find' command
with the special scanner arguments to `mkid' to generate the required ID
database:

     cd /usr/catman
     find . -name '*.Z' -print | mkid '-Sman/text/uncompress -c < %s' -S.Z=man -

   This example first switches to the `/usr/catman' directory where the
compressed MAN pages are stored. The `find' command then finds all the
`.Z' files under that directory and prints their names.  This list is
piped into the `mkid' program. The `-' argument by itself (at the end
of the line) tells `mkid' to read arguments (in this case the list of
file names) from STDIN. The first `-S' argument defines a new language
(MAN) in terms of the `uncompress' utility and the existing text
scanner. The second `-S' argument tells `mkid' to treat all `.Z' files
as language MAN. In practice, you might find the `mkid' arguments need
to be even more complex, something like:

     mkid '-Sman/text/uncompress -c < %s | col -b' -S.Z=man -

   This will take the additional step of getting rid of any underlining
and backspacing which might be present in the compressed MAN pages.


File: mkid.info,  Node: Database Query Tools,  Next: Iid,  Prev: Mkid,  Up: Top

Database Query Tools
********************

   The ID database is useless without database query tools. The
remainder of this document describes those tools.

   The `lid', `gid', `aid', `eid', and `pid' programs are all the same
program installed with links to different names. The name used to
invoke the program determines how it will act.

   The `iid' program is an interactive query shell that sits on top of
the other query tools.

* Menu:

* Common Options::              Common command line options
* Patterns::                    Identifier pattern matching
* Lid::                         Look up identifiers
* Aid::                         Case insensitive lid
* Gid::                         Grep for identifiers
* Eid::                         Edit files with matching identifiers
* Pid::                         Look up path names in database


File: mkid.info,  Node: Common Options,  Next: Patterns,  Prev: Database Query Tools,  Up: Database Query Tools

Common Options
==============

   Since many of the programs are really links to one common program, it
is only reasonable to expect that most of the query tools would share
common command line options. Not all options make sense for all
programs, but they are all described here. The description of each
program gives the options that program uses.

`-f<FILE>'
     Read the database specified by <FILE>. Normally the tools look for
     a file named `ID' in either the current directory or in any of the
     directories above the current directory. This means you can keep a
     global `ID' database in the root of a large source tree and use
     the query tools from anywhere within that tree.

`-r<DIRECTORY>'
     The query tools usually assume the file names in the database are
     relative to the directory holding the database. The `-r' option
     tells the tools to look for the files relative to <DIRECTORY>
     regardless of the location of the database.

`-c'
     This is shorthand for `-r`pwd`'. It tells the query tools to assume
     the file names are stored relative to the current working
     directory.

`-e'
     Force the pattern arguments to be treated as regular expressions.
     Normally the query tools attempt to guess if the patterns are
     regular expressions or simple identifiers by looking for special
     characters in the pattern.

`-w'
     Force the pattern arguments to be treated as simple words even if
     they contain special regular expression characters.

`-k'
     Normally the query tools that generate lists of file names attempt
     to compress the lists using the `csh' brace notation. This option
     suppresses the file name compression and outputs each name in full.
     (This is particularly useful if you are a `ksh' user and want to
     feed the list of names to another command -- the `-k' option comes
     from the `k' in `ksh').

`-g'
     It is possible to build the query tools so the `-k' option is the
     default behavior. If this is the case for your system, the `-g'
     option turns on the globbing of file names using the `csh' brace
     notation.

`-n'
     Normally the query tools that generate lists of file names also
     list the matching identifier at the head of the list of names.
     This is irritating if you want just a list of names to feed to
     another command, so the `-n' option suppresses the identifier and
     lists only file names.

`-b'
     This option is only used by the `pid' tool. It restricts `pid' to
     pattern match only the basename part of a file name. Normally the
     absolute file name is matched against the pattern.

`-d -o -x -a'
     These options may be used in any combination to limit the radix of
     numeric matches. The `-d' option will allow matches on decimal
     numbers, `-o' on octal, and `-x' on hexadecimal numbers.  The `-a'
     option is shorthand for specifying all three. Any combination of
     these options may be used.

`-m'
     Merge multiple lines of output into a single line. (If your query
     matches more than one identifier the default action is to generate
     a separate line of output for each matching identifier).

`-s'
     Search for identifiers that appear only once in the database. This
     helps to locate identifiers that are defined but never used.

`-u<NUMBER>'
     List identifiers that conflict in the first <NUMBER> characters.
     This could be useful porting programs to brain-dead computers that
     refuse to support long identifiers, but your best long term option
     is to set such computers on fire.
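
   As an illustration of what the `-u<NUMBER>' option reports, the
following Python sketch groups identifiers by their first N characters
and keeps the groups with more than one member.  The function name
`prefix_conflicts' is hypothetical; the query tools do this work
against the ID database rather than a Python list.

```python
from collections import defaultdict

def prefix_conflicts(identifiers, n):
    """Map each n-character prefix shared by two or more distinct
    identifiers to the sorted list of identifiers that share it."""
    groups = defaultdict(set)
    for ident in identifiers:
        groups[ident[:n]].add(ident)
    return {prefix: sorted(group)
            for prefix, group in groups.items() if len(group) > 1}
```

   For instance, with N set to 8 the names `buffer_read' and
`buffer_reset' collide on the prefix `buffer_r', while `buffer' and
`count' do not conflict with anything.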


File: mkid.info,  Node: Patterns,  Next: Lid,  Prev: Common Options,  Up: Database Query Tools

Patterns
========

   You can attempt to match either simple identifiers or numbers in a
query, or you can specify a regular expression pattern which may match
many different identifiers in the database. The query programs use
either REGEX and REGCMP or RE_COMP and RE_EXEC, depending on which one
is available in the library on your system. These might not always
support the exact same regular expression syntax, so consult your local
MAN pages to find out. Any regular expression routines should support
the following syntax:

`.'
     A dot matches any character.

`[ ]'
     Brackets match any of the characters specified within the
     brackets.  You can match any characters *except* the ones in
     brackets by typing `^' as the first character. A range of
     characters can be specified using `-'.

`*'
     An asterisk means repeat the previous pattern zero or more times.

`^'
     An `^' at the beginning of a pattern means the pattern must match
     starting at the first character of the identifier.

`$'
     A `$' at the end of the pattern means the pattern must match ending
     at the last character in the identifier.


File: mkid.info,  Node: Lid,  Next: Aid,  Prev: Patterns,  Up: Database Query Tools

Lid
===

 - Command: lid [`-f<FILE>'] [`-u<N>'] [`-r<DIR>'] [`-ewdoxamskgnc']
          PATTERNS...

   The `lid' program stands for LOOKUP IDENTIFIER.  It searches the
database for any identifiers matching the patterns and prints the names
of the files that match each pattern. The exact format of the output
depends on the options.


File: mkid.info,  Node: Aid,  Next: Gid,  Prev: Lid,  Up: Database Query Tools

Aid
===

 - Command: aid [`-f<FILE>'] [`-u<N>'] [`-r<DIR>'] [`-doxamskgnc']
          PATTERNS...

   The `aid' command is an abbreviation for APROPOS IDENTIFIER.  The
patterns cannot be regular expressions, but it looks for them using a
case insensitive match, and any pattern that is a substring of an
identifier in the database will match that identifier.

   For example `aid get' might match the identifiers `fgets',
`GETLINE', and `getchar'.
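
   The matching rule amounts to a case insensitive substring test, as
in this small Python sketch.  The function name `apropos' and the
identifier list are made up for the illustration.

```python
def apropos(pattern, identifiers):
    """Return the identifiers that contain pattern, ignoring case."""
    needle = pattern.lower()
    return [ident for ident in identifiers if needle in ident.lower()]

matches = apropos("get", ["fgets", "GETLINE", "getchar", "puts"])
```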


File: mkid.info,  Node: Gid,  Next: Eid,  Prev: Aid,  Up: Database Query Tools

Gid
===

 - Command: gid [`-f<FILE>'] [`-u<N>'] [`-r<DIR>'] [`-doxasc']
          PATTERNS...

   The `gid' command stands for GREP FOR IDENTIFIERS. It finds
identifiers in the database that match the specified patterns, then
`greps' for those identifiers in just the set of files containing
matches. In a large source tree, this saves a fantastic amount of time.

   There is an EMACS interface to this program (*note GNU Emacs
Interface::.).  If you are an EMACS user, you will probably prefer the
EMACS interface over the `eid' tool.


File: mkid.info,  Node: Eid,  Next: Pid,  Prev: Gid,  Up: Database Query Tools

Eid
===

 - Command: eid [`-f<FILE>'] [`-u<N>'] [`-r<DIR>'] [`-doxasc']
          PATTERNS...

   The `eid' command allows you to invoke an editor on each file
containing a matching pattern. The `EDITOR' environment variable is the
name of the program to be invoked. If the specified editor can accept
an initial search argument on the command line, you can use the
`EIDARG', `EIDLDEL', and `EIDRDEL' environment variables to specify the
form of that argument.

`EDITOR'
     The name of the editor program to invoke.

`EIDARG'
     A printf string giving the form of the argument to pass containing
     the initial search string (the matching identifier). For `vi' it
     should be set to `+/%s/''.

`EIDLDEL'
     A string giving the regular expression pattern that forces a match
     at the beginning (left end) of a word. This string is inserted in
     front of the matching identifier when composing the search
     argument. For `vi', this should be `\<'.

`EIDRDEL'
     The matching right end word delimiter. For `vi', use `\>'.


File: mkid.info,  Node: Pid,  Prev: Eid,  Up: Database Query Tools

Pid
===

 - Command: pid [`-f<FILE>'] [`-u<N>'] [`-r<DIR>'] [`-ebkgnc']
          PATTERNS...

   The `pid' tool is unlike all the other tools. It matches the
patterns against the file names in the database rather than the
identifiers in the database.  Patterns are treated as shell wild card
patterns unless the `-e' option is given, in which case full regular
expression matching is done.

   The wild card pattern is matched against the absolute path name of
the file. Most shells treat slashes `/' and file names that start with
dot `.' specially; `pid' does not. It simply attempts to match the
absolute path name string against the wild card pattern.

   The `-b' option restricts the pattern matching to the base name of
the file (all the leading directory names are stripped prior to pattern
matching).
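
   The matching rule described above behaves much like Python's
`fnmatch' module, which also gives no special treatment to slashes or
leading dots. The following sketch uses that module as a stand-in. It
is not how `pid' is implemented, the name `pid_like_match' is
hypothetical, and it assumes the pattern must match the whole string.

```python
import fnmatch
import os.path


def pid_like_match(pattern, abs_path, base_only=False):
    """Match a wild card pattern against a path, pid-style.

    base_only mimics the -b option: leading directory names are
    stripped before matching.
    """
    target = os.path.basename(abs_path) if base_only else abs_path
    # fnmatchcase lets '*' match '/' and a leading '.', unlike shell globbing.
    return fnmatch.fnmatchcase(target, pattern)


path = "/usr/src/lib/.hidden/foo.c"
print(pid_like_match("*.c", path))                   # '*' spans slashes and dots
print(pid_like_match("foo*", path))                  # fails on the absolute path
print(pid_like_match("foo*", path, base_only=True))  # succeeds on the base name
```

   The second and third calls show why `-b' is convenient: without it,
a pattern meant for the file name has to begin with `*' to get past the
directory part.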


File: mkid.info,  Node: Iid,  Next: Other Tools,  Prev: Database Query Tools,  Up: Top

Iid
***

 - Command: iid [`-a'] [`-c<COMMAND>'] [`-H']
    `-a'
          Normally `iid' uses the `lid' command to search for names.
          If you give the `-a' option on the command line, then it will
          use `aid' as the default search engine.

    `-c<COMMAND>'
          In normal operation, `iid' starts up and prompts you for
          commands used to build sets of files. The `-c' option
          passes a single query command to `iid', which executes it
          and then exits.

    `-H'
          The `-H' option prints a short help message and exits. To get
          more help use the `help' command from inside `iid'.

   The `iid' program is an interactive ID query tool. It operates by
running the other query programs (such as `lid' and `aid') and creating
sets of file names returned by these queries. It also provides
operators for `anding' and `oring' these sets to create new sets.

   The `PAGER' environment variable names the program `iid' uses to
display files. If you use `emacs', you might want to set `PAGER' so it
invokes the `emacsclient' program. Check the file `lisp/server.el' in
the emacs source tree for documentation on this. It is useful not only
with X windows, but also when running `iid' from an emacs shell buffer.
There is also a somewhat spiffier version called gnuserv by Andy Norman
(`ange%anorman@hplabs.hp.com') which appeared in `comp.emacs' sometime
in 1989.

* Menu:

* Ss and Files commands::       Ss and Files commands
* Sets::                        Sets
* Show::                        Show
* Begin::                       Begin
* Help::                        Help
* Off::                         Off
* Shell Commands as Queries::   Shell Commands as Queries
* Shell Escape::                Shell Escape


File: mkid.info,  Node: Ss and Files commands,  Next: Sets,  Prev: Iid,  Up: Iid

Ss and Files commands
=====================

   The primary query commands are `ss' (for select sets) and `files'
(for show file names). These commands both take a query expression as an
argument.

 - Subcommand: ss QUERY
     The `ss' command runs a query and builds a set (or sets) of file
     names. The result is printed as a summary of the sets constructed
     showing how many file names are in each set.

 - Subcommand: files QUERY
     The `files' command is like the `ss' command, but rather than
     printing a summary, it displays the full list of matching file
     names.

 - Subcommand: f QUERY
     The `f' command is merely a shorthand notation for `files'.

   Database queries are simple expressions with operators like `and'
and `or'. Parentheses can be used to group operations. The complete set
of operators is summarized below:

`PATTERN'
     Any pattern not recognized as one of the keywords in this table is
     treated as an identifier to be searched for in the database. It is
     passed as an argument to the default search program (normally
     `lid', but `aid' is used if the `-a' option was given when `iid'
     was started).  The result of this operation is a set of file
     names, and it is assigned a unique set number.

`lid'
     `lid' is a keyword. It is used to invoke `lid' with the list of
     identifiers following it as arguments. This forces the use of `lid'
     regardless of the state of the `-a' option (*note Lid::.).

`aid'
     The `aid' keyword is like the `lid' keyword, but it forces the use
     of the `aid' program (*note Aid::.).

`match'
     The `match' operator invokes the `pid' program to do pattern
     matching on file names rather than identifiers. The set generated
     contains the file names that match the specified patterns (*note
     Pid::.).

`or'
     The `or' operator takes two sets of file names as arguments and
     generates a new set containing all the files from both sets.

`and'
     The `and' operator takes two sets of file names and generates a new
     set containing only the files that appear in both sets.

`not'
     The `not' operator inverts a set of file names, producing the set
     of all files not in the input set.

`set number'
     A set number consists of the letter `s' followed immediately by a
     number.  This refers to one of the sets created by a previous
     query operation. During one `iid' session, each query generates a
     unique set number, so any previously generated set may be used as
     part of any new query by referring to the set number.

   The `not' operator has the highest precedence with `and' coming in
the middle and `or' having the lowest precedence.  The operator names
are recognized using case insensitive matching, so `AND', `and', and
`aNd' are all the same as far as `iid' is concerned. If you wish to use
a keyword as an operand to one of the query programs, you must enclose
it in quotes.  Any patterns containing shell special characters must
also be properly quoted or escaped, since the query commands are run by
invoking them with the shell.

   Summary of query expression syntax:

     A <query> is:
        <set number>
        <identifier>
        lid <identifier list>
        aid <identifier list>
        match <wild card list>
        <query> or <query>
        <query> and <query>
        not <query>
        ( <query> )
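
   The operators and their precedence can be modelled with ordinary
set arithmetic. The sketch below is only an illustration of the
semantics described in this section, not of `iid' internals. The file
names and the helper names `q_and', `q_or', and `q_not' are made up,
and it assumes `not' is taken relative to all files in the database.

```python
# Hypothetical database contents and three previously built sets.
all_files = {"a.c", "b.c", "c.c", "d.h"}
s1 = {"a.c", "b.c"}
s2 = {"b.c", "c.c"}
s3 = {"d.h"}


def q_or(x, y):
    """All files from both sets."""
    return x | y


def q_and(x, y):
    """Only the files that appear in both sets."""
    return x & y


def q_not(x):
    """All files not in the input set (assumed: relative to the database)."""
    return all_files - x


# The query `not s1 and s2 or s3' groups as ((not s1) and s2) or s3,
# because `not' binds tightest and `or' binds loosest.
result = q_or(q_and(q_not(s1), s2), s3)
print(sorted(result))

# Parentheses change the grouping: `not (s1 and s2) or s3'.
grouped = q_or(q_not(q_and(s1, s2)), s3)
print(sorted(grouped))
```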


File: mkid.info,  Node: Sets,  Next: Show,  Prev: Ss and Files commands,  Up: Iid

Sets
====

 - Subcommand: sets

   The `sets' command displays all the sets created so far. Each one is
described by the query command that generated it.


File: mkid.info,  Node: Show,  Next: Begin,  Prev: Sets,  Up: Iid

Show
====

 - Subcommand: show SET

 - Subcommand: p SET

   The `show' and `p' commands are equivalent. They both accept a set
number as an argument and run the program given in the `PAGER'
environment variable with the file names in that set as arguments.


File: mkid.info,  Node: Begin,  Next: Help,  Prev: Show,  Up: Iid

Begin
=====

 - Subcommand: begin DIRECTORY

 - Subcommand: b DIRECTORY

   The `begin' command (and its abbreviated version `b') is used to
begin a new `iid' session in a different directory (which presumably
contains a different database). It flushes all the sets created so far
and switches to the specified directory. It is equivalent to exiting
`iid', changing directories in the shell, and running `iid' again.


File: mkid.info,  Node: Help,  Next: Off,  Prev: Begin,  Up: Iid

Help
====

 - Subcommand: help

 - Subcommand: h

 - Subcommand: ?

   The `help', `h', and `?' commands are three different ways to ask
for help. They all invoke the `PAGER' program to display a short help
file.


File: mkid.info,  Node: Off,  Next: Shell Commands as Queries,  Prev: Help,  Up: Iid

Off
===

 - Subcommand: off

 - Subcommand: quit

 - Subcommand: q

   These three commands (or just an end of file) all cause `iid' to
exit.


File: mkid.info,  Node: Shell Commands as Queries,  Next: Shell Escape,  Prev: Off,  Up: Iid

Shell Commands as Queries
=========================

   When the first word of an `iid' command is not recognized as a
builtin `iid' command, `iid' assumes the command is a shell command
that writes a list of file names to STDOUT. This list of file names is
used to generate a new set of files.

   Any set numbers that appear as arguments to this command are expanded
into lists of file names prior to running the command.
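
   The expansion step can be pictured as a simple word-by-word
substitution. The sketch below is a simplified model, not the actual
`iid' code. The function name `expand_sets' is hypothetical. It assumes
words are separated by white space, that only a lowercase `s' followed
by digits counts as a set number, and that unknown set numbers are left
untouched.

```python
import re


def expand_sets(command, sets):
    """Replace each set number (s1, s2, ...) with that set's file names.

    sets maps an integer set number to a list of file names.
    """
    words = []
    for word in command.split():
        m = re.fullmatch(r"s(\d+)", word)
        if m and int(m.group(1)) in sets:
            # A known set number expands to all of its file names.
            words.extend(sets[int(m.group(1))])
        else:
            words.append(word)
    return " ".join(words)


sets = {1: ["a.c", "b.c"], 2: ["c.c"]}
print(expand_sets("grep -l TODO s1 s2", sets))
```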


File: mkid.info,  Node: Shell Escape,  Prev: Shell Commands as Queries,  Up: Iid

Shell Escape
============

   If a command starts with a bang (`!') character, the remainder of
the line is run as a shell command. Any set numbers that appear as
arguments to this command are expanded into lists of file names prior to
running the command.


File: mkid.info,  Node: Other Tools,  Next: Command Index,  Prev: Iid,  Up: Top

Other Tools
***********

   This chapter describes some support tools that work with the other ID
programs.

* Menu:

* GNU Emacs Interface::         Using gid.el
* Fid::                         List identifiers in a file.
* Idx::                         Extract identifiers from source file.


File: mkid.info,  Node: GNU Emacs Interface,  Next: Fid,  Prev: Other Tools,  Up: Other Tools

GNU Emacs Interface
===================

   The source distribution comes with a file named `gid.el'.  This is a
GNU emacs interface to the `gid' tool.  If you put the file where emacs
can find it (somewhere in your `EMACSLOADPATH') and put `(autoload 'gid
"gid" nil t)' in your `.emacs' file, you will be able to invoke the
`gid' function using `M-x gid'.

   This function prompts you with the word the cursor is on. If you want
to search for a different pattern, simply delete the line and type the
pattern of interest.

   It runs `gid' in a `*compilation*' buffer, so the normal
`next-error' function can be used to visit all the places the
identifier is found (*note Compilation: (emacs)Compilation.).


File: mkid.info,  Node: Fid,  Next: Idx,  Prev: GNU Emacs Interface,  Up: Other Tools

Fid
===

 - Command: fid [`-f<FILE>'] FILE1 [FILE2]
    `-f<FILE>'
          Look in the named database.

    `FILE1'
          List the identifiers contained in file1 according to the
          database.

    `FILE2'
          If a second file is given, list only the identifiers both
          files have in common.

   The `fid' program provides an inverse query. Instead of listing
files containing some identifier, it lists the identifiers found in a
file.
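
   The inverse query amounts to looking up a file and, when two files
are given, intersecting their identifier lists. The sketch below models
this with an in-memory dictionary standing in for the database. The
name `fid_like', the sample data, and the sorted output order are all
assumptions made for illustration.

```python
def fid_like(index, file1, file2=None):
    """List identifiers in file1, or those common to file1 and file2.

    index maps a file name to the identifiers found in that file.
    """
    ids = set(index[file1])
    if file2 is not None:
        # With a second file, keep only the identifiers both files share.
        ids &= set(index[file2])
    return sorted(ids)


index = {
    "main.c": ["main", "argc", "argv", "usage"],
    "util.c": ["usage", "xmalloc", "argv"],
}
print(fid_like(index, "main.c"))
print(fid_like(index, "main.c", "util.c"))
```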


File: mkid.info,  Node: Idx,  Prev: Fid,  Up: Other Tools

Idx
===

 - Command: idx [`-s<DIRECTORY>'] [`-r<DIRECTORY>'] [`-S<SCANARG>']
          FILES...
     The `-s', `-r', and `-S' arguments to `idx' are identical to the
     corresponding `mkid' arguments (*note Mkid Command Line
     Options::.).

   The `idx' command is more of a test frame for scanners than a tool
designed to be independently useful. It takes the same scanner arguments
as `mkid', but rather than building a database, it prints the
identifiers found to STDOUT, one per line. You can use it to try out a
scanner on a sample file to make sure it is extracting the identifiers
you believe it should extract.


File: mkid.info,  Node: Command Index,  Prev: Other Tools,  Up: Top

Command Index
*************

* Menu:

* ?:                                    Help.
* aid:                                  Aid.
* b:                                    Begin.
* begin:                                Begin.
* eid:                                  Eid.
* f:                                    Ss and Files commands.
* fid:                                  Fid.
* files:                                Ss and Files commands.
* gid:                                  Gid.
* h:                                    Help.
* help:                                 Help.
* idx:                                  Idx.
* iid:                                  Iid.
* lid:                                  Lid.
* mkid:                                 Mkid Command Line Options.
* off:                                  Off.
* p:                                    Show.
* pid:                                  Pid.
* q:                                    Off.
* quit:                                 Off.
* sets:                                 Sets.
* show:                                 Show.
* ss:                                   Ss and Files commands.



Tag Table:
Node: Top913
Node: Overview1298
Node: History2862
Node: Mkid5027
Node: Mkid Command Line Options6363
Node: Scanner Arguments8124
Node: Builtin Scanners10479
Node: C11144
Node: Plain Text12039
Node: Assembler13107
Node: Adding Your Own Scanner14295
Node: Mkid Examples16272
Node: Database Query Tools18249
Node: Common Options19190
Node: Patterns22906
Node: Lid24148
Node: Aid24570
Node: Gid25101
Node: Eid25721
Node: Pid26845
Node: Iid27735
Node: Ss and Files commands29605
Node: Sets33068
Node: Show33308
Node: Begin33636
Node: Help34123
Node: Off34404
Node: Shell Commands as Queries34634
Node: Shell Escape35160
Node: Other Tools35502
Node: GNU Emacs Interface35879
Node: Fid36685
Node: Idx37237
Node: Command Index37912

End Tag Table