## What is `cppawk`?

`cppawk` is a tiny shell script that is used like `awk`. It invokes
the C preprocessor (GNU `cpp`) on the Awk code and calls `gawk`.

`cppawk` understands the basic Awk options like `-F` and `-v`, and also
understands common `cpp` options like `-I` and `-Dmacro=value`.

There is a `man` page with all the details.

For instance, suppose we define a file called `awkloop.h` with these contents:

    :::c
    #define awkloop(file)  for (; getline < file || (close(file) && 0); )
    #define nextrec        continue
    #define rule(cond)     if (cond)

Then this sort of code is possible:

    :::c
    #include "awkloop.h"

    function main()
    {
      awkloop ("/proc/mounts") {
        rule ($3 != "ext4") { nextrec }
        rule ($2 == "/") { print $1 }
      }
    }

    BEGIN {
      main()
    }

We have implemented a facsimile of an Awk input scanning loop inside a function
with a bit of syntactic sugar.
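
To make the macro expansion concrete, here is a hand-expanded sketch of the same loop in plain Awk, wrapped in a small shell function so it can be tried without `cppawk`. The function name `expanded_root_device` and the `mounts` variable are made up for this illustration; `mounts` stands in for the literal `"/proc/mounts"` so the loop can be run against sample data.

```shell
# Hand-expanded equivalent of the cppawk program above, runnable with plain awk.
# Assumption: "mounts" is a hypothetical variable replacing "/proc/mounts".
expanded_root_device() {
  awk -v mounts="$1" '
    function main()
    {
      # awkloop (mounts) expands to this for loop
      for (; getline < mounts || (close(mounts) && 0); ) {
        if ($3 != "ext4") { continue }   # rule (...) { nextrec }
        if ($2 == "/") { print $1 }      # rule (...) { print $1 }
      }
    }

    BEGIN {
      main()
    }
  '
}

# Try it on a small file laid out like /proc/mounts.
sample=$(mktemp)
printf '%s\n' \
  'proc /proc proc rw 0 0' \
  '/dev/sda1 / ext4 rw 0 0' \
  '/dev/sda2 /home ext4 rw 0 0' > "$sample"
expanded_root_device "$sample"   # prints /dev/sda1
rm -f "$sample"
```

Each macro maps onto one ordinary Awk construct, which is why the preprocessed program needs nothing beyond a standard Awk to run.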

If you know C and the C preprocessor, and if you know Awk, the utility
and applications for this should be obvious; you may skip the next section
and go get it!

## Why or who might use `cppawk`?

0.  Why not? It's just a tiny shell script.

1.  You know how to use the C preprocessor, and are working on a system or
    in a situation where you can count on it being installed, such as build
    scripts or continuous integration. Make use of what you know.

2.  It makes some things easy, like building a program out of multiple
    files which find each other in the same directory or by relative
    path, or using macros for syntactic sugar.

3.  `cppawk` calls `gawk`, but can easily be tweaked to target any Awk; any Awk
    can have file inclusion.  There is the possibility of using `#if` or
    `#ifdef` to make code work with different Awks, using their nonportable
    constructs:

        #if __gawk__
        GNU Awk specific code
        #else
        ...
        #endif

    The default `cppawk` installation uses `gawk`, and also defines `__gawk__`
    with a value of `1`. (If an installation of `cppawk` is prepared which uses
    a different Awk, it is strongly recommended to define a different symbol as
    `1`, based on the name: `__mawk__`, and so forth.)

4.  Comments. Awk's only comments run to the end of the line;
    `cppawk` gives you `/*...*/`.

5.  Temporarily disabling code with `#if 0` ... `#endif` rather than
    fiddling with hash marks.

6.  Exploration: Awk is syntactically C-like, but not C: what
    implications does that have for writing macros? You can discover some
    new-ish techniques, though it won't be earth-shattering.

7.  Weird access to some host attributes intended for C:

        $ cppawk '#include <limits.h>
        BEGIN { print PATH_MAX, ULONG_MAX }'
        4096 214748364701

    This sort of thing can be useful in devops land.

8.  `cppawk` has a `--prepro-only` option to generate the preprocessed
    Awk program on standard output rather than execute it. This is useful
    in several ways:

    *   debugging the inclusion process;
    *   preprocessing code on a machine with `cpp` to execute it on a system
        which has no `cpp`;
    *   conditionally generating multiple programs from a single source
        based on preprocessor symbols passed in via `-D` arguments.
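
As a rough illustration of generating several programs from one source, the sketch below feeds the same program text to `cppawk --prepro-only` twice with different `-D` values. It assumes `cppawk` is on `PATH` and accepts the program as an inline argument together with these options; the macro name `VERBOSE`, the function `generate_variants`, and the output file names are invented for the example.

```shell
# Sketch only: assumes cppawk is installed and on PATH.
# VERBOSE is a hypothetical macro name chosen for this example.
generate_variants() {
  outdir=$1
  prog='BEGIN {
#if VERBOSE
  print "verbose build"
#else
  print "quiet build"
#endif
}'
  # Same source, two preprocessed programs, selected by -D.
  cppawk --prepro-only -DVERBOSE=1 "$prog" > "$outdir/verbose.awk" &&
  cppawk --prepro-only -DVERBOSE=0 "$prog" > "$outdir/quiet.awk"
}

# Only attempt the run where cppawk is actually available.
if command -v cppawk >/dev/null 2>&1; then
  out=$(mktemp -d)
  generate_variants "$out" && ls "$out"
fi
```

Each generated file should contain only the branch selected by its `-D` value, so either one can be inspected or run on its own.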

## But GNU Awk has `@include`?

* GNU Awk's `@include` isn't a full preprocessor. There is no conditional
  inclusion of code, and there are no macros.

* It is only implemented in GNU Awk.

* It provides no way to capture all the included output.

* The way `@include` searches for files is inferior to `cpp`. GNU Awk's include
  search is driven by the `AWKPATH` variable, which brings in all the
  disadvantages shared by `PATH`-like variables.
  In contrast, `cpp` implements the familiar behavior that an `#include "..."`
  directive is resolved relative to the directory of the file which contains
  that `#include` directive. No configuration is required for a program
  to find all of its included pieces.
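
The relative lookup described in the last point can be tried with a small throwaway tree. This is a sketch under assumptions: `cppawk` is on `PATH`, and the file names `lib/util.h` and `lib/consts.h`, the macro `GREETING`, and the function names are all invented here. The header `util.h` includes its sibling `consts.h` by bare name, and only the top-level directory is passed with `-I`.

```shell
# Sketch only: assumes cppawk is installed and on PATH.
# All file, macro and function names below are hypothetical.
demo_relative_include() {
  dir=$(mktemp -d)
  mkdir -p "$dir/lib"

  # consts.h sits next to util.h; util.h refers to it by bare name.
  printf '%s\n' '#define GREETING "hello"' > "$dir/lib/consts.h"
  printf '%s\n' \
    '#include "consts.h"' \
    'function greet(name) { return GREETING ", " name }' > "$dir/lib/util.h"

  # Only $dir is on the include path; lib/ is found relative to util.h.
  cppawk -I"$dir" '#include "lib/util.h"
BEGIN { print greet("world") }'

  rm -rf "$dir"
}

# Only attempt the run where cppawk is actually available.
if command -v cppawk >/dev/null 2>&1; then
  demo_relative_include
fi
```

No `AWKPATH`-style variable is involved: moving the whole `lib` directory elsewhere only requires changing the single `-I` argument.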

## What about systems that have `awk` but no `cpp`?

`cppawk` is used directly on systems that have `cpp`, as if
it were an Awk implementation.

`cppawk --prepro-only` will generate the preprocessed Awk code,
which can be captured and transferred to a system that has no
preprocessor installed, such as an embedded board that has
BusyBox Awk.
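
A minimal sketch of that workflow follows. It assumes `cppawk` is available on the build host and accepts the program as an inline argument; the function `preprocess_for_target`, the macro `GREETING`, and the file name `prog.awk` are placeholders for this example.

```shell
# Sketch only: assumes cppawk is installed on the build host.
# prog.awk is a hypothetical output file name.
preprocess_for_target() {
  # Expand macros on the build host and capture the resulting plain Awk.
  cppawk --prepro-only '#define GREETING "hello from the build host"
BEGIN { print GREETING }' > "$1"
}

# Only attempt the run where cppawk is actually available.
if command -v cppawk >/dev/null 2>&1; then
  preprocess_for_target prog.awk
  # Copy prog.awk to the target system, then run it there with the
  # local Awk, for example:
  #   awk -f prog.awk
fi
```

The target only ever sees the already-expanded program, so it needs neither `cpp` nor `cppawk`.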