CPPAWK(1)                            Awk With C Preprocessing                           CPPAWK(1)

NAME
       cppawk - wrapper for awk, with C preprocessing

SYNOPSIS
       cppawk [cpp, awk and cppawk options] [awk arguments]

       cppawk --prepro-only [cpp, awk and cppawk options]

DESCRIPTION
       cppawk  is a shell script which passes awk code through the standalone C preprocessor, and
       then invokes awk on the preprocessed code. This allows Awk code to be written which uses C
       preprocessor #define macros, #include, C comments, trigraphs (though perish the thought)
       and backslash continuation.

       cppawk deliberately has an invocation syntax similar to Awk, and understands  certain  Awk
       options  such as -f and also understands cpp options, such as -Dfoo=bar for pre-defining a
       macro.

       Just like with awk, code is specified either directly as the first non-option argument, or
       via  the  -f  option  which indicates a file. In either situation, cppawk preprocesses the
       code and places the result in a temporary file which is then executed as awk code.

OPTIONS
       Any option not described here is assumed to be an Awk option which takes no argument,  and
       is consequently passed through to the awk program.

       --     End  of  options: any subsequent argument is the first non-option argument, even if
              it looks like an option.

       --prepro-only
              Do not run the preprocessed Awk program; dump the  preprocessed  code  to  standard
              output.

       --awk=path
              Specify  alternative  Awk  implementation.  If it contains no slashes, then PATH is
              searched to find the program. If the base name of the  program  is  gawk  or  mawk,
              then, respectively, one of the preprocessor symbols __gawk__ or __mawk__ is
              predefined, with a value of 1. This happens immediately when this option is
              processed, so it can be counteracted by a subsequent -U option.
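
              As a rough illustration of the base-name rule (a sketch, not cppawk's actual
              source), the shell function below maps an awk path to the -D option it would
              imply. The name awk_symbol is hypothetical.

```shell
# Hypothetical helper: print the cpp option implied by an awk program path.
awk_symbol() {
  case "${1##*/}" in                      # strip leading directories: base name
    gawk) printf '%s\n' "-D__gawk__=1" ;;
    mawk) printf '%s\n' "-D__mawk__=1" ;;
    *)    : ;;                            # any other Awk: nothing predefined
  esac
}

sym_gawk=$(awk_symbol /usr/bin/gawk)
sym_mawk=$(awk_symbol mawk)
sym_other=$(awk_symbol /opt/bin/other-awk)
echo "$sym_gawk $sym_mawk [$sym_other]"
```

              Because the symbol is defined at the point where --awk is processed, a -U
              option given later on the command line can remove it again.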

       --prepro=path
              Specify  alternative preprocessor. If it contains no slashes, then PATH is searched
              to find the program.

       -f filename
              Read the awk program from filename rather than processing awk code from  the  first
              non-option  command-line argument. The program is preprocessed to a temporary file,
              and awk is then invoked on this file. The file is deleted when awk terminates.

       -E filename
              The -E option is inspired by that of GNU Awk; cppawk implements a form of this  op-
              tion  itself, for all Awk back-ends, and does not pass it through to GNU Awk.  This
              option combines the semantics of the -f and -- options. Arrangements are  made  for
              the  awk  program  to be read from a file exactly as described above for the -f op-
              tion. Then, no more options are processed.  Any remaining option-like arguments are
              ordinary arguments.

              Note that unlike GNU Awk's -E option, cppawk's -E option doesn't suppress the
              processing of arguments which look like variable assignments.

              Instead, the program may specify the following preprocessing directive, outside  of
              any Awk block or function:

                #include <safearg.h>

              This directive produces a BEGIN clause which prepares an associative array named
              argv that contains the same key/value pairs as the standard ARGV. The ARGV array is
              then deleted. Consequently, Awk will not perform the command line variable
              assignments, which it normally processes after the BEGIN clauses have run. The
              effects of <safearg.h> are not visible to BEGIN clauses which are placed earlier
              than the inclusion of <safearg.h>. Those earlier clauses have access to the
              original ARGV array.

              However,  the  combination  of -E option and <safearg.h> is still not equivalent to
              GNU Awk's -E option, because no filename arguments are available for  implicit  use
              in the Awk pattern processing loop.
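
              The technique described for <safearg.h> can be approximated in plain Awk,
              without cppawk. The sketch below is an assumption about the general approach,
              not the header's actual contents: a BEGIN clause copies ARGV into argv and
              deletes the ARGV elements, so the argument x=5 is never treated as an
              assignment.

```shell
# Plain awk approximation of the technique; /dev/null supplies empty input.
out=$(awk '
  BEGIN {
    for (i = 0; i < ARGC; i++) {   # copy every argument, then remove it from ARGV
      argv[i] = ARGV[i]
      delete ARGV[i]
    }
  }
  END { printf "x=[%s] argv[1]=[%s]\n", x, argv[1] }
' x=5 < /dev/null)
echo "$out"    # x=[] argv[1]=[x=5]
```

              The variable x stays unset in the END clause, while the original argument text
              remains available to the program through argv.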

       --nobash
              Pretend  that  the  shell which executes cppawk isn't GNU Bash, even if it is. This
              has the effect of disabling the use of process substitution in favor of the use  of
              a temporary file.

       --dump-macros
              Instruct the preprocessor to dump all of the #define directives instead of the
              preprocessed output. Since this is only useful with --prepro-only, that option is
              implied.

       -M, --bignum
              These two equivalent GNU Awk options are passed through to awk, which will
              understand them if it is GNU Awk. Using either of them causes the preprocessor
              symbol __bignum__ to be defined with the value 1.

       -P, --posix
              These two equivalent GNU Awk options are passed through to awk, which will
              understand them if it is GNU Awk. Using either of them causes the preprocessor
              symbol __posix__ to be defined with the value 1.

       -M...  Any option beginning with -M and followed by one or more characters results in a
              diagnostic message and unsuccessful termination. The -M family of options
              supported by GNU cpp is deliberately not supported by cppawk.

       -F, -v, -i, -l, -L
              These  standard  and GNU Awk options are recognized by cppawk as requiring an argu-
              ment. They are validated for the presence of the required argument, and  passed  to
              awk.

       -U..., -D..., -I..., -iquote...
              Options which match these patterns are passed to the cpp program instead of awk.

PREDEFINED SYMBOLS
       __gawk__
              When the cppawk installation is configured to use GNU Awk, which is the default,
              the preprocessor symbol __gawk__ is predefined with a value of 1. See the --awk
              option.

       __cppawk_ver
              This preprocessor symbol gives the version of cppawk. Its value is an eight-digit
              decimal integer of the form YYYYMMDD, such as 20220321.

CONFIGURATION SYMBOLS
       __gawk_ver
              Certain cppawk header files may have functionality that depends on GNU Awk.

              The __gawk_ver variable may be set by the application to indicate which version of
              GNU Awk should be assumed by those library headers. The headers will avoid
              generating code that requires a later version than this.

              This variable should be set either before including any header files, or by using
              the -D option on the command line.

              The variable should be a decimal integer whose last four digits encode the minor
              and build numbers, two digits each. For instance, 4.1.3 is encoded as 40103:

                #define __gawk_ver 40103 // Inform library GNU Awk 4.1.3 is used
                #include <...>           // inclusion of headers follows
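
              The encoding can be worked out arithmetically. Judging from the 4.1.3 example,
              the value appears to be major * 10000 + minor * 100 + build. The shell function
              below is a hypothetical helper illustrating that reading; it is not part of
              cppawk.

```shell
# Hypothetical helper: encode MAJOR MINOR BUILD as a __gawk_ver value.
gawk_ver_encode() {
  echo $(( $1 * 10000 + $2 * 100 + $3 ))
}

ver=$(gawk_ver_encode 4 1 3)
echo "$ver"    # 40103, matching the example above
```

              The result could then be supplied on the command line, for example as
              -D__gawk_ver=40103.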

              If  the variable is not set, then the library headers which make use of it will de-
              fine it themselves to a default value of 40000, to assume GNU Awk 4.0 or later.

              Lower values than 40000 are not supported; code that requires GNU  Awk  assumes  at
              least version 4.0.

STANDARD HEADERS
       cppawk  points  the  preprocessor  to  look for #include <...> files in its own directory,
       which contains a library of header files that accompany cppawk.

       <narg.h>
              This header provides macros which make it easy to  write  variable-argument  macros
              with complex expansions. This is documented in the cppawk-narg manual page.

       <case.h>
              This header provides macros for writing a case statement. The case statement syntax
              is designed so that a GNU Awk switch statement is easily converted to it.  The pre-
              processor  translates  it  back to a clean GNU Awk switch statement, or to portable
              Awk code that runs on other Awks. The contents of this header are documented by the
              cppawk-case manual page.

EXAMPLES
       Print the larger of field 1 or 2:

         cppawk '// C comment
                #define max(a, b) ((a) > (b) ? (a) : (b))
                { print max($1, $2) /* C comment */ } #awk comment'

       Implement an Awk-like processing loop within a function, to process /proc/mounts:

         #include "awkloop.h"

         function main()
         {
           awkloop ("/proc/mounts") {
             rule ($3 != "ext4") { nextrec }
             rule ($2 == "/") { print $1 }
           }
         }

         BEGIN {
           main()
         }

       Where awkloop.h contains:

         #define awkloop(file)  for (; (getline < file) > 0 || (close(file) && 0); )
         #define nextrec        continue
         #define rule(cond)     if (cond)
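
       To see what these macros amount to, the program can be expanded by hand into plain Awk.
       The sketch below does this in a shell session, reading a small made-up sample in the
       format of /proc/mounts instead of the real file. The getline result is compared with
       0 explicitly, so that a read error (-1) also ends the loop; the temporary file and
       variable names are assumptions for illustration only.

```shell
# Sample input in the format of /proc/mounts (made-up data for illustration).
mounts=$(mktemp)
printf '%s\n' \
  '/dev/sda1 / ext4 rw 0 0' \
  'proc /proc proc rw 0 0' \
  '/dev/sdb1 /home ext4 rw 0 0' > "$mounts"

# Hand-expanded equivalent of the awkloop/rule/nextrec program, in plain awk.
roots=$(awk -v file="$mounts" '
  function main() {
    for (; (getline < file) > 0 || (close(file) && 0); ) {
      if ($3 != "ext4") { continue }
      if ($2 == "/") { print $1 }
    }
  }
  BEGIN { main() }
')
rm -f "$mounts"
echo "$roots"    # /dev/sda1
```

       When getline reaches the end of the file, the right-hand side of || closes the file
       and yields 0, which terminates the loop.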

       Produce  an informative banner in generated output, as an Awk comment block.  This is very
       useful when output is being generated and retained instead of being immediately  executed,
       for instance for installation on a target system which has no preprocessor:

         #define HASH #
         HASH###################################################
         HASH DO NOT EDIT!
         HASH This file was generated from __FILE__ on __DATE__
         HASH###################################################

       Note: this was tested to work with the GNU preprocessor. A spurious blank line may appear.
       The material in the Awk comments isn't a comment to the C preprocessor; it must consist of
       valid C preprocessor tokens, so the text must be chosen accordingly.

SEE ALSO
       awk(1), cpp(1), cppawk-narg(1), cppawk-case(1), cppawk-cons(1)

BUGS
       The  -f  option  can be given only once, whereas awk accepts multiple -f options, and exe-
       cutes each of the indicated files.

       Awk error messages are reported against the preprocessed text.

       Awk # comments cannot be used at the start of a line because # begins a preprocessing  di-
       rective.  They also cannot be used inside a preprocessing directive, such as a macro defi-
       nition, because # is an operator in the preprocessor language. It may be a  good  idea  to
       avoid # comments entirely in cppawk source, and use only C comments.

       The  cpp program tokenizes text using C preprocessor rules. Because Awk is "C-like", there
       is a lot of compatibility between that and Awk syntax, which is why cppawk works  at  all;
       however, there may be corner cases where some issue arises because of this. One example is
       that double quote characters may be used in Awk regular expressions such as

         /abc"/

       but the preprocessor rejects this as a literal with a missing  closing  quote.  The  work-
       around for that situation is to use an escape sequence to encode the quote:

         /abc\042/
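
       The escaped form can be checked with plain awk, with no preprocessor involved. This
       is only a small demonstration that the octal escape matches a double quote; the input
       lines are made up.

```shell
# The octal escape \042 stands for the double quote character inside the regex.
hits=$(printf '%s\n' 'abc"' 'abcd' | awk '/abc\042/ { print NR ": " $0 }')
echo "$hits"    # 1: abc"
```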

       Another area of incompatibility is that newlines are significant in the Awk grammar, and
       some Awk programs use backslash-newline escape sequences in order to turn significant
       newlines into insignificant newlines. Though the C preprocessor recognizes and consumes
       backslash-newline sequences, it may, unfortunately, replace them with unescaped
       newlines. So the backslash line continuation technique is not reliably available to
       cppawk programs. A clumsy workaround which works with GNU cpp is this:

         #define BS \\
         /pattern/ BS
         { action }

       Awk implementations report errors against line numbers of an anonymous file name
       associated with the preprocessed stream, rather than against the original lines in the
       original file. Although the preprocessed output indicates source file and line number
       information, Awks do not understand this.

       The  default  choices of gawk and cpp are fixed in the source code; users must edit cppawk
       to select alternative implementations or locations of these tools, if they don't  wish  to
       use the --awk and --prepro command line options.

       The C preprocessor doesn't permit macro recursion, which introduces limitations to the
       ability to compose invocations of cppawk macros, thus curtailing their power. If in the
       expansion of some macro M a call of macro M appears, that call is not expanded. This is
       relied upon by C programs which use macros to inline same-named functions. For instance,
       if it were acceptable for the argument of strlen to be evaluated twice, then this macro
       version would be permissible:

         #define strlen(x) (*(x) == 0 ? 0 : strlen(x))

       Here, the strlen call in the macro expansion is relied upon not to be expanded as a
       macro; if it were expanded, runaway expansion would occur.

AUTHOR
       Kaz Kylheku <kaz@kylheku.com>

COPYRIGHT
       Copyright 2022, BSD2 License.

Utility Commands                          19 April 2022                                 CPPAWK(1)