.TH CPPAWK 1 "25 March 2022" "Utility Commands" "Awk With C Preprocessing"
.SH NAME
cppawk \- wrapper for awk, with C preprocessing
.SH SYNOPSIS
  cppawk [cpp, awk and cppawk options] [awk arguments]

  cppawk --prepro-only [cpp, awk and cppawk options]
.SH DESCRIPTION
.B cppawk
is a shell script which passes awk code through the standalone C preprocessor,
and then invokes awk on the preprocessed code. This lets Awk code use
C preprocessor
.BI #define
macros,
.BR #include ,
C comments, trigraphs (though perish the thought) and backslash continuation.

.B cppawk
deliberately has an invocation syntax similar to Awk. It understands certain
Awk options, such as
.BR -f ,
as well as
.B cpp
options, such as
.BI -Dfoo=bar
for predefining a macro.

Just like with
.BR awk ,
code is specified either directly as the first non-option argument, or via the
.BI -f
option which indicates a file. In either situation,
.B cppawk
preprocesses the code and places the result in a temporary file which is then
executed as
.B awk
code.
.SH OPTIONS
Any option not described here is assumed to be an Awk option which takes no
argument, and is consequently passed through to the
.B awk
program.
.IP "\fB\-\-\fR"
End of options: any subsequent argument is the first non-option argument,
even if it looks like an option.
.IP "\fB\-\-prepro\-only\fR"
Do not run the preprocessed Awk program; dump the preprocessed code to
standard output.
.IP "\fB\-\-awk=\fR\fIpath\fR"
Specify alternative Awk implementation. If
.I path
contains no slashes, then
.BI PATH
is searched to find the program. If the base name of the program is
.B gawk
or
.BR mawk ,
then, respectively, one of the preprocessor symbols
.BI __gawk__
or
.BI __mawk__
is predefined, with a value of 1. This happens immediately when this option
is processed, so it can be counteracted by a subsequent
.BI -U
option.
.IP "\fB\-\-prepro=\fR\fIpath\fR"
Specify alternative preprocessor.
If
.I path
contains no slashes, then
.BI PATH
is searched to find the program.
.IP "\fB\-f\fR \fIfilename\fR"
Read the awk program from
.I filename
rather than processing awk code from the first non-option command-line
argument. The program is preprocessed to a temporary file, and
.B awk
is then invoked on this file. The file is deleted when
.B awk
terminates.
.IP "\fB\-\-nobash\fR"
Pretend that the shell which executes
.B cppawk
isn't GNU Bash, even if it is. This has the effect of disabling the use of
process substitution in favor of the use of a temporary file.
.IP "\fB\-\-dump\-macros\fR"
Instruct the preprocessor to dump all of the
.BI #define
directives instead of the preprocessed output. Since this is only useful with
.BR --prepro-only ,
that option is implied.
.IP "\fB\-M\fR, \fB\-\-bignum\fR"
These two equivalent GNU Awk options are passed through to
.BR awk ,
which will understand them if it is GNU Awk. Using either of them causes the
preprocessor symbol
.BI __bignum__
to be defined with the value 1.
.IP "\fB\-P\fR, \fB\-\-posix\fR"
These two equivalent GNU Awk options are passed through to
.BR awk ,
which will understand them if it is GNU Awk. Using either of them causes the
preprocessor symbol
.BI __posix__
to be defined with the value 1.
.IP "\fB\-M...\fR"
Any option beginning with
.BI -M
and followed by one or more characters results in a diagnostic message and
failed termination. The
.BI -M
family of options supported by GNU cpp is deliberately not supported by
.BR cppawk .
.IP "\fB-F\fR, \fB-v\fR, \fB-E\fR, \fB-i\fR, \fB-l\fR, \fB-L\fR"
These standard and GNU Awk options are recognized by
.B cppawk
as requiring an argument. They are validated for the presence of the required
argument, and passed to
.BR awk .
.IP "\fB-U...\fR, \fB-D...\fR, \fB-I...\fR, \fB-iquote...\fR"
Options which match these patterns are passed to the
.B cpp
program instead of
.BR awk .
.SH PREDEFINED SYMBOLS
.IP \fB__gawk__\fR
When the
.B cppawk
installation is configured to use GNU Awk, which is the default, the
preprocessor symbol
.BI __gawk__
is predefined with a value of 1. See the
.BI --awk
option.
.IP \fB__cppawk_ver\fR
This preprocessor symbol gives the version of
.BR cppawk .
Its value is an eight-digit decimal integer of the form
.IR YYYYMMDD ,
such as 20220321.
.SH NOT PREDEFINED, SIGNIFICANT SYMBOLS
.IP \fB__gawk_ver\fR
Certain
.B cppawk
header files may have functionality that depends on GNU Awk. By default, those
.B cppawk
headers which require GNU Awk may assume the latest version released by the
GNU Awk project, with all of its features and bugfixes. Consequently, the
generated code may not work on older versions of GNU Awk. The user
application, prior to including any
.B cppawk
header, may define the macro
.BI __gawk_ver
to indicate which version of GNU Awk the generated code is required to work
with. In reaction to this macro, those header files may be able to generate
alternative code suitable for the indicated version of GNU Awk. The value
should be a decimal integer, whose last four digits encode the minor and build
numbers (two digits each, judging by the example). For instance, 4.1.3 is
encoded as 40103:

  #define __gawk_ver 40103 // Please support GNU Awk 4.1.3
  #include <...>           // inclusion of headers follows

Naturally,
.BI __gawk_ver
may be specified on the command line with the
.BI -D
option.
.SH STANDARD HEADERS
.B cppawk
points the preprocessor to look for
.BI "#include <...>"
files in its own directory. There are currently no files in this directory.
.SH EXAMPLES
Print the larger of field 1 or 2:

  cppawk '// C comment
  #define max(a, b) ((a) > (b) ? (a) : (b))
  { print max($1, $2) /* C comment */ } #awk comment'

Implement an Awk-like processing loop within a function, to process
/proc/mounts:

  #include "awkloop.h"

  function main()
  {
    awkloop ("/proc/mounts") {
      rule ($3 != "ext4") { nextrec }
      rule ($2 == "/") { print $1 }
    }
  }

  BEGIN { main() }

Where
.BI awkloop.h
contains:

  #define awkloop(file) for (; getline < file || (close(file) && 0); )
  #define nextrec continue
  #define rule(cond) if (cond)
.SH "SEE ALSO"
awk(1), cpp(1)
.SH BUGS
The
.BI -f
option can be given only once, whereas
.B awk
accepts multiple
.BI -f
options, and executes each of the indicated files.

Awk error messages are reported against the preprocessed text.

Awk
.BI #
comments cannot be used at the start of a line because
.BI #
begins a preprocessing directive. They also cannot be used inside a
preprocessing directive, such as a macro definition, because
.BI #
is an operator in the preprocessor language. It may be a good idea to avoid
.BI #
comments entirely in
.B cppawk
source, and use only C comments.

The
.B cpp
program tokenizes text using C preprocessor rules. Because Awk is "C-like",
those rules are largely compatible with Awk syntax, which is why
.B cppawk
works at all; however, there may be corner cases where some issue arises
because of this.

The default choices of
.B gawk
and
.B cpp
are fixed in the source code; users must edit
.B cppawk
to select alternative implementations or locations of these tools, if they
don't wish to use the
.BI --awk
and
.BI --prepro
command line options.

The C preprocessor's
.BR "#include \(dq...\(dq"
directive is expected to search in the same directory as the file in which it
is located, which is a critically important feature. However,
.B cppawk
feeds the Awk code to
.B cpp
via a pipe, even in the case when source is specified via the
.BI -f
option. The reason is that the Awk source is filtered to remove the
.BI #!
("hash bang") line, which
.B cpp
doesn't like.
To make sure
.BR #include
works as expected,
.B cppawk
inserts a preprocessor option to add the original directory into the include
file search path: the current working directory in the case of Awk code
specified in the command line, or else the directory of the file specified via
the
.BI -f
option. In the default configuration, which assumes the GNU C Preprocessor,
the
.BI -iquote
option is used for this; but in a configuration using some preprocessor which
does not have that option, it may have to be done via the more heavy-handed
.BI -I
option. These options are inserted before any preprocessor options that come
from the
.B cppawk
command line.
.SH AUTHOR
Kaz Kylheku
.SH COPYRIGHT
Copyright 2022, BSD2 License.