CPPAWK(1) Awk With C Preprocessing CPPAWK(1)
NAME
cppawk - wrapper for awk, with C preprocessing
SYNOPSIS
cppawk [cpp, awk and cppawk options] [awk arguments]
cppawk --prepro-only [cpp, awk and cppawk options]
DESCRIPTION
cppawk is a shell script which passes awk code through the standalone C preprocessor, and
then invokes awk on the preprocessed code. This allows Awk code to be written which uses C
preprocessor #define macros, #include, C comments, trigraphs (though perish the thought)
and backslash continuation.
cppawk deliberately has an invocation syntax similar to Awk, and understands certain Awk
options such as -f and also understands cpp options, such as -Dfoo=bar for pre-defining a
macro.
Just like with awk, code is specified either directly as the first non-option argument, or
via the -f option which indicates a file. In either situation, cppawk preprocesses the
code and places the result in a temporary file which is then executed as awk code.
OPTIONS
Any option not described here is assumed to be an Awk option which takes no argument, and
is consequently passed through to the awk program.
-- End of options: any subsequent argument is the first non-option argument, even if
it looks like an option.
--prepro-only
Do not run the preprocessed Awk program; dump the preprocessed code to standard
output.
--awk=path
Specify alternative Awk implementation. If it contains no slashes, then PATH is
searched to find the program. If the base name of the program is gawk or mawk,
then, respectively, one of the preprocessor symbols __gawk__ or __mawk__ is prede-
fined, with a value of 1. This happens immediately when this option is processed,
so can be counter-acted by a subsequent -U option.
--prepro=path
Specify alternative preprocessor. If it contains no slashes, then PATH is searched
to find the program.
-f filename
Read the awk program from filename rather than processing awk code from the first
non-option command-line argument. The program is preprocessed to a temporary file,
and awk is then invoked on this file. The file is deleted when awk terminates.
--nobash
Pretend that the shell which executes cppawk isn't GNU Bash, even if it is. This
has the effect of disabling the use of process substitution in favor of the use of
a temporary file.
--dump-macros
Instruct the preprocessor to dump all of the #define directives instead of the pre-
processed output. Since this is only useful with --prepro-only that option is im-
plied.
-M, --bignum
These two equivalent GNU Awk options are passed through to awk, which will understand
them if it is GNU Awk. Using either of them causes the preprocessor symbol
__bignum__ to be defined with the value 1.
-P, --posix
These two equivalent GNU Awk options are passed through to awk, which will understand
them if it is GNU Awk. Using either of them causes the preprocessor symbol
__posix__ to be defined with the value 1.
-M... Any optional argument beginning with -M and followed by one or more characters re-
sults in a diagnostic message and failed termination. The intent is that the -M
family of options that are supported by GNU cpp are not supported by cppawk.
-F, -v, -E, -i, -l, -L
These standard and GNU Awk options are recognized by cppawk as requiring an argu-
ment. They are validated for the presence of the required argument, and passed to
awk.
-U..., -D..., -I..., -iquote...
Options which match these patterns are passed to the cpp program instead of awk.
PREDEFINED SYMBOLS
__gawk__
When cppawk installation is configured to use GNU Awk, which is the default, the
preprocessor symbol __gawk__ is predefined with a value of 1. See the --awk option.
__cppawk_ver
This preprocessor symbol gives the version of cppawk. Its value is an eight-digit
decimal integer of the form YYYYMMDD, such as 20220321.
CONFIGURATION SYMBOLS
__gawk_ver
Certain cppawk header files may have functionality that depends on GNU Awk.
The __gawk_ver variable may be set by the application to indicate which version of
GNU Awk should be assumed by those library headers. The headers will avoid generating
code that requires a later version than this.
This variable should be set either in the source, before including any header files,
or by using the -D option on the command line.
The variable should be a decimal integer, whose last four digits encode the minor
and build numbers. For instance 4.1.3 is encoded as 40103:
#define __gawk_ver 40103 // Inform library GNU Awk 4.1.3 is used
#include <...> // inclusion of headers follows
If the variable is not set, then the library headers which make use of it will de-
fine it themselves to a default value of 40000, to assume GNU Awk 4.0 or later.
Lower values than 40000 are not supported; code that requires GNU Awk assumes at
least version 4.0.
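As a rough illustration of the encoding described above, the following shell sketch converts a dotted GNU Awk version into the integer form. The function name gawk_ver_encode is hypothetical and not part of cppawk. The sketch assumes two decimal digits each for the minor and build numbers, which is what the 4.1.3 example suggests.

```shell
# Hypothetical helper, not part of cppawk: encode a dotted GNU Awk
# version as major * 10000 + minor * 100 + build (assumed layout).
gawk_ver_encode() {
    printf '%s\n' "$1" |
        awk -F. '{ printf "%d\n", $1 * 10000 + $2 * 100 + $3 }'
}

gawk_ver_encode 4.1.3    # prints 40103
gawk_ver_encode 5.3.0    # prints 50300
```

The resulting number could then be supplied on the command line with the -D option, for instance as -D__gawk_ver=40103.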
STANDARD HEADERS
cppawk points the preprocessor to look for #include <...> files in its own directory,
which contains a library of header files that accompany cppawk.
<narg.h>
This header provides macros which make it easy to write variable-argument macros
with complex expansions. This is documented in the cppawk-narg manual page.
<case.h>
This header provides macros for writing a case statement. The case statement syntax
is designed so that a GNU Awk switch statement is easily converted to it. The pre-
processor translates it back to a clean GNU Awk switch statement, or to portable
Awk code that runs on other Awks. The contents of this header are documented by the
cppawk-case manual page.
EXAMPLES
Print the larger of field 1 or 2:
cppawk '// C comment
#define max(a, b) ((a) > (b) ? (a) : (b))
{ print max($1, $2) /* C comment */ } #awk comment'
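To see roughly what awk ends up running in the example above, the macro call can be expanded by hand. The sketch below is plain awk wrapped in a hypothetical shell function named larger, so no preprocessing is involved; the exact whitespace that cpp emits may differ. The macro's outer parentheses matter in Awk: they keep the > operator from being read as an output redirection of print.

```shell
# Hand-expanded form of max($1, $2); "larger" is a hypothetical name.
# The outer parentheses stop awk from treating ">" as a redirection.
larger() {
    awk '{ print (($1) > ($2) ? ($1) : ($2)) }'
}

printf '3 7\n10 2\n' | larger    # prints 7, then 10
```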
Implement awk-like processing loop within function, to process /proc/mounts:
#include "awkloop.h"
function main()
{
awkloop ("/proc/mounts") {
rule ($3 != "ext4") { nextrec }
rule ($2 == "/") { print $1 }
}
}
BEGIN {
main()
}
Where awkloop.h contains:
#define awkloop(file) for (; getline < file || (close(file) && 0); )
#define nextrec continue
#define rule(cond) if (cond)
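For readers who want to check what the three macros amount to, here is a sketch of the same program with awkloop, rule and nextrec expanded by hand into plain awk. It assumes a purely textual expansion. The shell function root_device and the sample data are hypothetical stand-ins, with a temporary file used in place of /proc/mounts so that the sketch runs anywhere.

```shell
# Hypothetical wrapper: runs the hand-expanded awkloop example on the
# file named by the first argument instead of /proc/mounts.
root_device() {
    awk -v file="$1" '
    function main()
    {
        # awkloop (file) expands to this for loop header
        for (; getline < file || (close(file) && 0); ) {
            if ($3 != "ext4") { continue }    # rule (...) { nextrec }
            if ($2 == "/") { print $1 }       # rule (...) { print $1 }
        }
    }
    BEGIN {
        main()
    }'
}

# Sample data standing in for /proc/mounts (made-up contents).
mounts=$(mktemp)
printf '%s\n' \
    '/dev/sda1 / ext4 rw 0 0' \
    'proc /proc proc rw 0 0' \
    '/dev/sda2 /home ext4 rw 0 0' > "$mounts"

root_device "$mounts"    # prints /dev/sda1
rm -f "$mounts"
```

When getline reaches the end of the file it returns 0, so the right-hand side of || runs: close(file) is called and the && 0 forces the loop condition to be false, ending the loop.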
SEE ALSO
awk(1), cpp(1), cppawk-narg(1), cppawk-case(1), cppawk-cons(1)
BUGS
The -f option can be given only once, whereas awk accepts multiple -f options, and exe-
cutes each of the indicated files.
Awk error messages are reported against the preprocessed text.
Awk # comments cannot be used at the start of a line because # begins a preprocessing di-
rective. They also cannot be used inside a preprocessing directive, such as a macro defi-
nition, because # is an operator in the preprocessor language. It may be a good idea to
avoid # comments entirely in cppawk source, and use only C comments.
The cpp program tokenizes text using C preprocessor rules. Because Awk is "C-like", there
is a lot of compatibility between that and Awk syntax, which is why cppawk works at all;
however, there may be corner cases where some issue arises because of this. One example is
that double quote characters may be used in Awk regular expressions such as
/abc"/
but the preprocessor rejects this as a literal with a missing closing quote. The work-
around for that situation is to use an escape sequence to encode the quote:
/abc\042/
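The escaped form can be checked with plain awk, independently of the preprocessor. In this small sketch, the shell function match_quote is a hypothetical name; \042 is the octal escape for the double quote character.

```shell
# Plain awk, no preprocessing: \042 in the regex stands for ".
match_quote() {
    awk '/abc\042/ { print "matched: " $0 }'
}

printf 'abc"\nabcd\n' | match_quote    # prints: matched: abc"
```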
Another area of incompatibility is that newlines are significant in the Awk grammar,
and some Awk programs use backslash-newline escape sequences in order to turn significant
newlines into insignificant newlines. Though the C preprocessor recognizes and consumes
backslash-newline sequences, it may, unfortunately, replace them with unescaped newlines.
So the backslash line continuation technique is not reliably available to cppawk
programs. A clumsy workaround which works with GNU cpp is this:
#define BS \\
/pattern/ BS
{ action }
Such continuations matter because Awk has significant newlines in numerous places in
its grammar: unless they are escaped with a backslash, they can change the meaning of
the code or introduce a syntax error.
Awk implementations report errors against lines of an anonymous file name associated with
the preprocessed stream, rather than the original lines in the original file. Although the
preprocessed output indicates source file and line number information, Awks do not
understand this.
The default choices of gawk and cpp are fixed in the source code; users must edit cppawk
to select alternative implementations or locations of these tools, if they don't wish to
use the --awk and --prepro command line options.
The C preprocessor doesn't permit macro recursion, which limits the ability to compose
invocations of cppawk macros, thus curtailing their power. If, in the expansion of some
macro M, a call of macro M appears, that call is not expanded. This is relied upon by C
programs which use macros to inline same-named functions. For instance, if it were
acceptable for the argument of strlen to be evaluated twice, then this macro version
would be permissible:
#define strlen(x) (*(x) == 0 ? 0 : strlen(x))
Here, the strlen call in the macro expansion is relied upon not to be expanded as a
macro; otherwise, runaway expansion would occur.
AUTHOR
Kaz Kylheku <kaz@kylheku.com>
COPYRIGHT
Copyright 2022, BSD2 License.
Utility Commands 19 April 2022 CPPAWK(1)