## What is `cppawk`?

`cppawk` is a tiny shell script that is used like `awk`. It invokes the C preprocessor (GNU `cpp`) on the Awk code and calls Awk on the result.

`cppawk` understands the basic Awk options like `-F` and `-v`, and also understands common `cpp` options like `-I` and `-Dmacro=value`.

The [`cppawk` man page](../tree/cppawk.1) describes all the invocation and usage details.

For instance, if we define a file called `awkloop.h` which has these contents:

    ::c
    #define awkloop(file) for (; getline < file > 0 || (close(file) && 0); )
    #define nextrec continue
    #define rule(cond) if (cond)

Then this sort of code is possible:

    ::c
    #include "awkloop.h"

    function main()
    {
      awkloop ("/proc/mounts") {
        rule ($3 != "ext4") { nextrec }
        rule ($2 == "/") { print $1 }
      }
    }

    BEGIN {
      main()
    }

We have implemented a facsimile of an Awk input-scanning loop inside a function with a bit of syntactic sugar. However, these few preprocessing directives are just a toy example, compared to what is provided in the `cppawk` standard headers.

`cppawk` has few dependencies. It's written in shell, and makes use of the `sed` and `printf` utilities.

Preprocessed programs can be captured and transferred for execution to systems that have Awk but do not have a preprocessor.

## The `cppawk` Library

`cppawk` is sprouting a small library of useful macros and functions. One of them is a powerful `loop` facility that allows iteration (both parallel and nested) to be expressed by combining abstract clauses.

Here is a program designed to demonstrate the `cppawk` `loop` macro, with its multiple clauses. It solves the following problem: a projectile is fired vertically with an initial speed of 5. Every step of the simulation, the speed drops by 1 due to gravity, eventually becoming negative. What is the maximum height achieved?


    ::c
    #include <iter.h>

    BEGIN {
      loop (from_step (vel, 5, -1),
            from_step (pos, 0, vel),
            while (pos >= 0),
            maximizing (maxpos, pos))
      {
        print pos
      }
      print "maxpos =", maxpos
    }

The output is

    ::txt
    0
    4
    7
    9
    10
    10
    9
    7
    4
    0
    maxpos = 10

This example is taken from the [`testcases-iter`](../tree/testcases-iter) file.

By the way, how is it possible that we can implement an iteration language with expressive, not to mention **user-definable** clauses using a preprocessor that is famous for lacking power?

It turns out that a significant part of the problem with C preprocessing is the backend languages being targeted. The C and C++ languages rob their preprocessing frontend of its full power. C macros in the context of the "home language" have to contend with syntactic roadblocks, such as identifiers having to be declared before use.

Because Awk is a flexibly typed language in which variables don't have to be declared, it creates opportunities for significantly more freedom in how the C preprocessor can be applied. The resulting macros can be much more clutter-free and ergonomic than if similar techniques were attempted on the "home turf".

It's as if the C preprocessor were tailor-made for Awk.

## Roadmap

`cppawk` has been carefully developed, and has a regression test suite. Nearly every feature and fix was developed by first writing one or more failing tests and getting them to pass.

The script is stable and nearly feature-complete, since it is out of the project scope to modify Awk or the C preprocessor. The remaining work is likely solving portability issues, like using different implementations of the C preprocessor.

Among future directions for `cppawk` is the development of a small library of useful standard headers. The foundation has been laid for this because when `#include <...>` is used (angle bracket include), `cppawk` looks in a subdirectory called `cppawk-include` which is in the same directory as itself.
For instance, if `cppawk` is `/usr/bin/cppawk`, it looks in `/usr/bin/cppawk-include`.

There are currently these headers:

* [`<case.h>`](../tree/cppawk-case.1): provides a portable `case` statement macro which efficiently translates to a GNU Awk `switch` statement or else to less efficient but portable code. Additionally, the `case` statement requires clauses to be explicit about whether they fall through or break, which makes it safer to use.

* [`<narg.h>`](../tree/cppawk-narg.1): provides useful primitives for easily writing variadic macros.

* [`<iter.h>`](../tree/cppawk-iter.1): provides powerful iteration constructs, including a `loop` macro that features the ability for the application to define new iteration clauses, in addition to the numerous useful ones that come with `loop`.

* [`<cons.h>`](../tree/cppawk-cons.1): provides Lisp-like functional, heterogeneous list manipulation, higher order functions, some useful control operators, and functions combining Lisp lists and Awk arrays such as `group_by`.

* [`<fun.h>`](../tree/cppawk-fun.1): three macros for indirect functions, with a simple partial application mechanism for binding the leftmost argument. This requires GNU Awk 4.0 or higher, which features indirect function calls. Note: there are bugs in GNU Awk's indirect function calls feature that are present right through 5.1.1.

* [`<varg.h>`](../tree/cppawk-varg.1): utilities for working with variadic functions in Awk, as well as with optional arguments.

* [`<field.h>`](../tree/cppawk-field.1): utilities for manipulating the Awk positional parameters ("fields").

* [`<quote.h>`](../tree/cppawk-quote.1): provides the `q` function for quoting text for safe insertion into shell scripts.

Several unreleased headers are in the development queue:

* A header with some associative array utilities.

* A header for Lisp-like assoc lists: an addendum to `<cons.h>`.

* Certain utilities in the private header should be made public.

## License

`cppawk` is offered under the two-clause BSD license. See the copyright header in the source files and the LICENSE file in the source tree.

## Why?

* Why not?

* You know Awk. You know C preprocessing inside out. Now use two things that you know, together, in obvious ways.

* You can organize an Awk program into a tree of files that the preprocessor "compiles" into a single "executable".

* You can use macros for C-style metaprogramming, and for conditional selection of code.

* Powerful library: list manipulation, iteration, variadic functions.

* Other minor benefits: Awk has no comments other than from a `#` character to the end of the line. You get `/* ... */` comments with `cppawk`, and also `#if 0` ... `#endif` for temporarily disabling code.

* Some techniques from the `cppawk` header files would be useful in C and C++. Everything is BSD-licensed; you are welcome to use it as you please, whole or just bits and pieces.

## But GNU Awk has `@include`?

* GNU Awk's `@include` isn't a full preprocessor. There are no conditional expressions, and no macros.

* It is only implemented in GNU Awk.

* It provides no way to capture all the included output.

* The way `@include` searches for files is inferior to `cpp`; it doesn't look in the same directory as the parent file which contains the `@include` syntax. It reacts to an `AWKPATH` environment variable which has no provision for referencing relative to the location of the parent file.

* `@include` requires, syntactically, a string-literal–like specification of the path name to be included. An expression is not allowed.

For instance, a GNU Awk program cannot do this:

    ::awk
    self = calculate_own_path_somehow();
    @include self "lib/util"  # error

By contrast, a `cppawk` program just does this:

    ::c
    #include "lib/util" // no problem

The C preprocessor allows macro-replacement to take place in `#include`:

    ::c
    #include FOO_LIB // conditionally-defined macro to select lib
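
Because the macro-replaced `#include` is a feature of the preprocessor itself, the idea can be tried out in plain C without any Awk involved. The following is only a minimal sketch: `USE_WIDE` is a hypothetical build-time switch, `name_length` is a hypothetical helper, and the two standard C headers stand in for whatever library files a real `cppawk` program would choose between.

```c
#include <stddef.h>

/* Hypothetical switch: compile with -DUSE_WIDE to pick the other library. */
#ifdef USE_WIDE
#define FOO_LIB <wchar.h>
#else
#define FOO_LIB <string.h>
#endif

/* FOO_LIB is macro-expanded first; the result is processed as a normal #include. */
#include FOO_LIB

/* Hypothetical helper: length of the name "cppawk", via the selected library. */
size_t name_length(void)
{
#ifdef USE_WIDE
    return wcslen(L"cppawk");
#else
    return strlen("cppawk");
#endif
}
```

The calling code never names the header directly. Only the definition of `FOO_LIB` changes between configurations, which is the property that `@include` cannot offer.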