aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--COPYING128
-rw-r--r--Makefile36
-rw-r--r--README25
-rw-r--r--awk.h255
-rw-r--r--awk.tab.c1696
-rw-r--r--awk.y898
-rw-r--r--awk1.c723
-rw-r--r--awk2.c1129
-rw-r--r--awk3.c900
-rw-r--r--debug.c485
-rw-r--r--obstack.c157
-rw-r--r--obstack.h204
-rw-r--r--regex.c1689
-rw-r--r--regex.h213
14 files changed, 8538 insertions, 0 deletions
diff --git a/COPYING b/COPYING
new file mode 100644
index 00000000..d0f89166
--- /dev/null
+++ b/COPYING
@@ -0,0 +1,128 @@
+
+ GAWK GENERAL PUBLIC LICENSE
+ (Clarified 20 March 1987)
+
+ Copyright (C) 1986 Richard M. Stallman
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license, but changing it is not allowed.
+
+ The license agreements of most software companies keep you at the
+mercy of those companies. By contrast, our general public license is
+intended to give everyone the right to share GAWK. To make sure that
+you get the rights we want you to have, we need to make restrictions
+that forbid anyone to deny you these rights or to ask you to surrender
+the rights. Hence this license agreement.
+
+ Specifically, we want to make sure that you have the right to give
+away copies of GAWK, that you receive source code or else can get it
+if you want it, that you can change GAWK or use pieces of it in new
+free programs, and that you know you can do these things.
+
+ To make sure that everyone has such rights, we have to forbid you to
+deprive anyone else of these rights. For example, if you distribute
+copies of GAWK, you must give the recipients all the rights that you
+have. You must make sure that they, too, receive or can get the
+source code. And you must tell them their rights.
+
+ Also, for our own protection, we must make certain that everyone
+finds out that there is no warranty for GAWK. If GAWK is modified by
+someone else and passed on, we want its recipients to know that what
+they have is not what we distributed, so that any problems introduced
+by others will not reflect on our reputation.
+
+ Therefore we (Richard Stallman and the Free Software Foundation,
+Inc.) make the following terms which say what you must do to be
+allowed to distribute or change GAWK.
+
+
+ COPYING POLICIES
+
+ 1. You may copy and distribute verbatim copies of GAWK source code as
+you receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy a valid copyright notice "Copyright
+(C) 1986 Free Software Foundation, Inc." (or with the year updated if
+that is appropriate); keep intact the notices on all files that refer
+to this License Agreement and to the absence of any warranty; and give
+any other recipients of the GAWK program a copy of this License
+Agreement along with the program. You may charge a distribution fee
+for the physical act of transferring a copy.
+
+ 2. You may modify your copy or copies of GAWK or any portion of it,
+and copy and distribute such modifications under the terms of
+Paragraph 1 above, provided that you also do the following:
+
+ a) cause the modified files to carry prominent notices stating
+ that you changed the files and the date of any change; and
+
+ b) cause the whole of any work that you distribute or publish,
+ that in whole or in part contains or is a derivative of GAWK or any
+ part thereof, to be licensed at no charge to all third parties on
+ terms identical to those contained in this License Agreement
+ (except that you may choose to grant more extensive warranty
+ protection to third parties, at your option).
+
+ c) You may charge a distribution fee for the physical act of
+ transferring a copy, and you may at your option offer warranty
+ protection in exchange for a fee.
+
+ 3. You may copy and distribute GAWK or any portion of it in
+compiled, executable or object code form under the terms of Paragraphs
+1 and 2 above provided that you do the following:
+
+ a) cause each such copy to be accompanied by the
+ corresponding machine-readable source code, which must
+ be distributed under the terms of Paragraphs 1 and 2 above; or,
+
+ b) cause each such copy to be accompanied by a
+ written offer, with no time limit, to give any third party
+ free (except for a nominal shipping charge) a machine readable
+ copy of the corresponding source code, to be distributed
+ under the terms of Paragraphs 1 and 2 above; or,
+
+ c) in the case of a recipient of GAWK in compiled, executable
+ or object code form (without the corresponding source code) you
+ shall cause copies you distribute to be accompanied by a copy
+ of the written offer of source code which you received along
+ with the copy you received.
+
+ 4. You may not copy, sublicense, distribute or transfer GAWK
+except as expressly provided under this License Agreement. Any attempt
+otherwise to copy, sublicense, distribute or transfer GAWK is void and
+your rights to use the program under this License agreement shall be
+automatically terminated. However, parties who have received computer
+software programs from you with this License Agreement will not have
+their licenses terminated so long as such parties remain in full compliance.
+
+ 5. If you wish to incorporate parts of GAWK into other free
+programs whose distribution conditions are different, write to the Free
+Software Foundation at 1000 Mass Ave, Cambridge, MA 02138. We have not yet
+worked out a simple rule that can be stated here, but we will often permit
+this. We will be guided by the two goals of preserving the free status of
+all derivatives of our free software and of promoting the sharing and reuse
+of software.
+
+In other words, go ahead and share GAWK, but don't try to stop
+anyone else from sharing it farther. Help stamp out software hoarding!
+
+ NO WARRANTY
+
+ BECAUSE GAWK IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY NO
+WARRANTY, TO THE EXTENT PERMITTED BY APPLICABLE STATE LAW. EXCEPT
+WHEN OTHERWISE STATED IN WRITING, FREE SOFTWARE FOUNDATION, INC,
+RICHARD M. STALLMAN AND/OR OTHER PARTIES PROVIDE GAWK "AS IS" WITHOUT
+WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND
+PERFORMANCE OF GAWK IS WITH YOU. SHOULD GAWK PROVE DEFECTIVE, YOU
+ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW WILL RICHARD M.
+STALLMAN, THE FREE SOFTWARE FOUNDATION, INC., AND/OR ANY OTHER PARTY
+WHO MAY MODIFY AND REDISTRIBUTE GAWK AS PERMITTED ABOVE, BE LIABLE TO
+YOU FOR DAMAGES, INCLUDING ANY LOST PROFITS, LOST MONIES, OR OTHER
+SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR
+INABILITY TO USE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA
+BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY THIRD PARTIES OR A
+FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS) GAWK, EVEN
+IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES, OR FOR
+ANY CLAIM BY ANY OTHER PARTY.
diff --git a/Makefile b/Makefile
new file mode 100644
index 00000000..e0f050cb
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,36 @@
+LIB=.
+#CFLAGS=-O -pg -I$(LIB) -DFAST
+CFLAGS=-g -I$(LIB)
+LIBOBJS= $(LIB)/obstack.o $(LIB)/regex.o
+OBJS = awk1.o awk2.o awk3.o awk.tab.o debug.o
+gawk: $(OBJS) $(LIBOBJS)
+ $(CC) -o gawk $(CFLAGS) $(OBJS) $(LIBOBJS) -lm
+$(OBJS): awk.h
+awk.tab.c: awk.y
+ bison -v awk.y
+# yacc -v awk.y
+# mv y.tab.c awk.tab.c
+lint:
+ lint -I$(LIB) awk.tab.c awk1.c awk2.c awk3.c
+clean:
+ rm -f gawk *.o core awk.output awk.tab.c
+
+# We don't distribute shar files, but they're useful for mailing.
+SHARS = COPYING Makefile README awk.h awk.tab.c awk.y awk1.c awk2.c awk3.c\
+ debug.c obstack.h obstack.c regex.h regex.c
+
+awk.shar: $(SHARS)
+ shar -f awk -c $(SHARS)
+
+awk.tar: $(SHARS)
+ tar cvf awk.tar $(SHARS)
+
+awk.tar.Z: awk.tar
+ -@mv awk.tar.Z awk.tar.old.Z
+ compress < awk.tar > awk.tar.Z
+
+# This command probably won't be useful to the rest of the world, but makes
+# life much easier for me.
+dist: awk.tar awk.tar.Z
+ rcp awk.tar prep:/u2/emacs/awk.tar
+ rcp awk.tar.Z prep:/u2/emacs/awk.tar.Z
diff --git a/README b/README
new file mode 100644
index 00000000..e61441bb
--- /dev/null
+++ b/README
@@ -0,0 +1,25 @@
+This is the Beta-test distribution of gawk. (Probably around version 1.01
+or so.)
+
+Please send all
+bug-reports, comments, cries for help, etc, to hack@prep.ai.mit.edu
+AKA mit-eddie!prep!hack During odd hours I can sometimes be reached at
+(617) 253-8975, which is an MIT phone in the middle of the corridor, so don't
+be suprised if someone wierd answers, or if the person on the other end has
+never heard of me. (Direct them to the microvax about 10feet to their left.)
+
+Gawk requires some berkeleyisms, like alloca(), bcopy(), index(), etc. I
+believe we have a portable version of alloca() (part of GNUemacs), and
+probably the other stuff as well. Send me mail if you need anything.
+
+For real speed, you should change the Makefile to compile -O -DFAST and
+disable the debugger. (-DFAST replaces some function calls with macros, and
+disables a lot of debugging stuff.)
+
+If you don't have bison, modify the makefile to call yacc instead
+(The proper commands should be already in the makefile; just un-comment them.)
+If you have neither bison nor yacc, use the awk.tab.c file here. It was
+generated with bison, and should have no AT+T code in it. (Note that
+modifying awk.y without bison or yacc will be difficult, at best. You might
+want to get a copy of bison from us too.)
+
diff --git a/awk.h b/awk.h
new file mode 100644
index 00000000..be792e07
--- /dev/null
+++ b/awk.h
@@ -0,0 +1,255 @@
+/*
+ * awk.h -- Definitions for gawk.
+ *
+ * Copyright (C) 1986 Free Software Foundation
+ * Written by Paul Rubin, August 1986
+ *
+ */
+
+/*
+GAWK is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY. No author or distributor accepts responsibility to anyone
+for the consequences of using it or for whether it serves any
+particular purpose or works at all, unless he says so in writing.
+Refer to the GAWK General Public License for full details.
+
+Everyone is granted permission to copy, modify and redistribute GAWK,
+but only under the conditions described in the GAWK General Public
+License. A copy of this license is supposed to have been given to you
+along with GAWK so you can know your rights and responsibilities. It
+should be in a file named COPYING. Among other things, the copyright
+notice and this notice must be preserved on all copies.
+
+In other words, go ahead and share GAWK, but don't try to stop
+anyone else from sharing it farther. Help stamp out software hoarding!
+*/
+
+#define AWKNUM float
+
+#include <ctype.h>
+#define is_identchar(c) (isalnum(c) || (c) == '_')
+
+#include "obstack.h"
+#define obstack_chunk_alloc malloc
+#define obstack_chunk_free free
+char *malloc(),*realloc();
+void free();
+
+typedef enum {
+ /* illegal entry == 0 */
+ Node_illegal, /* 0 */
+
+ /* binary operators lnode and rnode are the expressions to work on */
+ Node_times, /* 1 */
+ Node_quotient, /* 2 */
+ Node_mod, /* 3 */
+ Node_plus, /* 4 */
+ Node_minus, /* 5 */
+ Node_cond_pair, /* 6: conditional pair (see Node_line_range) jfw */
+ Node_subscript, /* 7 */
+ Node_concat, /* 8 */
+
+ /* unary operators subnode is the expression to work on */
+ Node_preincrement, /* 9 */
+ Node_predecrement, /* 10 */
+ Node_postincrement, /* 11 */
+ Node_postdecrement, /* 12 */
+ Node_unary_minus, /* 13 */
+ Node_field_spec, /* 14 */
+
+ /* assignments lnode is the var to assign to, rnode is the exp */
+ Node_assign, /* 15 */
+ Node_assign_times, /* 16 */
+ Node_assign_quotient, /* 17 */
+ Node_assign_mod, /* 18 */
+ Node_assign_plus, /* 19 */
+ Node_assign_minus, /* 20 */
+
+ /* boolean binaries lnode and rnode are expressions */
+ Node_and, /* 21 */
+ Node_or, /* 22 */
+
+ /* binary relationals compares lnode and rnode */
+ Node_equal, /* 23 */
+ Node_notequal, /* 24 */
+ Node_less, /* 25 */
+ Node_greater, /* 26 */
+ Node_leq, /* 27 */
+ Node_geq, /* 28 */
+
+ /* unary relationals works on subnode */
+ Node_not, /* 29 */
+
+ /* match ops (binary) work on lnode and rnode ??? */
+ Node_match, /* 30 */
+ Node_nomatch, /* 31 */
+
+ /* data items */
+ Node_string, /* 32 has stlen, stptr, and stref */
+ Node_temp_string, /* 33 has stlen, stptr, and stref */
+ Node_number, /* 34 has numbr */
+
+ /* program structures */
+ Node_rule_list, /* 35 lnode is a rule, rnode is rest of list */
+ Node_rule_node, /* 36 lnode is an conditional, rnode is statement */
+ Node_statement_list, /* 37 lnode is a statement, rnode is more list */
+ Node_if_branches, /* 38 lnode is to run on true, rnode on false */
+ Node_expression_list, /* 39 lnode is an exp, rnode is more list */
+
+ /* keywords */
+ Node_K_BEGIN, /* 40 no stuff */
+ Node_K_END, /* 41 ditto */
+ Node_K_if, /* 42 lnode is conditonal, rnode is if_branches */
+ Node_K_while, /* 43 lnode is condtional, rnode is stuff to run */
+ Node_K_for, /* 44 lnode is for_struct, rnode is stuff to run */
+ Node_K_arrayfor, /* 45 lnode is for_struct, rnode is stuff to run */
+ Node_K_break, /* 46 no subs */
+ Node_K_continue, /* 47 no stuff */
+ Node_K_print, /* 48 lnode is exp_list, rnode is redirect */
+ Node_K_printf, /* 49 lnode is exp_list, rnode is redirect */
+ Node_K_next, /* 59 no subs */
+ Node_K_exit, /* 51 subnode is return value, or NULL */
+
+ /* I/O redirection for print statements */
+ Node_redirect_output, /* 52 subnode is where to redirect */
+ Node_redirect_append, /* 53 subnode is where to redirect */
+ Node_redirect_pipe, /* 54 subnode is where to redirect */
+
+ /* Variables */
+ Node_var, /* 55 rnode is value, lnode is array stuff */
+ Node_var_array, /* 56 array is ptr to elements, asize num of eles */
+
+ /* Builtins subnode is explist to work on, proc is func to call */
+ Node_builtin, /* 57 */
+
+ /* pattern: conditional ',' conditional ; lnode of Node_line_range is
+ * the two conditionals (Node_cond_pair), other word (rnode place) is
+ * a flag indicating whether or not this range has been entered.
+ * (jfw@eddie.mit.edu)
+ */
+ Node_line_range, /* 58 */
+} NODETYPE;
+
+typedef struct exp_node {
+ NODETYPE type;
+ union {
+ struct {
+ struct exp_node *lptr;
+ union {
+ struct exp_node *rptr;
+ struct exp_node *(* pptr)();
+ struct re_pattern_buffer *preg;
+ struct for_loop_header *hd;
+ struct ahash **av;
+ int r_ent; /* range entered (jfw) */
+ } r;
+ } nodep;
+ struct {
+ struct exp_node **ap;
+ int as;
+ } ar;
+ struct {
+ char *sp;
+ short slen,sref;
+ } str;
+ AWKNUM fltnum;
+ } sub;
+} NODE;
+
+#define lnode sub.nodep.lptr
+#define rnode sub.nodep.r.rptr
+
+#define subnode lnode
+#define proc sub.nodep.r.pptr
+
+#define reexp lnode
+#define rereg sub.nodep.r.preg
+
+#define forsub lnode
+#define forloop sub.nodep.r.hd
+
+#define array sub.ar.ap
+#define arrsiz sub.ar.as
+
+#define stptr sub.str.sp
+#define stlen sub.str.slen
+#define stref sub.str.sref
+
+#define numbr sub.fltnum
+
+#define var_value lnode
+#define var_array sub.nodep.r.av
+
+#define condpair lnode
+#define triggered sub.nodep.r.r_ent
+
+NODE *newnode(), *dupnode();
+NODE *node(), *snode(), *make_number(), *make_string();
+NODE *mkrangenode(); /* to remove the temptation to use sub.nodep.r.rptr
+ * as a boolean flag, or to call node() with a 0 and
+ * hope that it will store correctly as an int. (jfw)
+ */
+NODE *tmp_string(),*tmp_number();
+NODE *variable(), *append_right();
+
+NODE *tree_eval();
+
+struct re_pattern_buffer *make_regexp();
+
+extern NODE *Nnull_string;
+
+#ifdef FAST
+double atof();
+NODE *strforce();
+#define force_number(x) ((x)->type==Node_number ? (x)->numbr : atof((x)->stptr))
+#define force_string(x) ((x)->type==Node_number ? (strforce(x)) : (x))
+#define tmp_node(ty) (global_tmp=(NODE *)obstack_alloc(&temp_strings,sizeof(NODE)),global_tmp->type=ty)
+#define tmp_number(n) (tmp_node(Node_number),global_tmp->numbr=(n),global_tmp)
+/* #define tmp_string(s,len) (tmp_node(Node_temp_string),global_tmp->stref=1,global_tmp->stlen=len,global_tmp->stptr=(char *)obstack_alloc(&temp_strings,len+1),bcopy(s,global_tmp->stptr,len),global_tmp->stptr[len]='\0',global_tmp) */
+NODE *global_tmp;
+#else
+AWKNUM force_number();
+NODE *force_string();
+#endif
+
+NODE *expression_value;
+
+#define HASHSIZE 101
+
+typedef struct hashnode HASHNODE;
+struct hashnode {
+ HASHNODE *next;
+ char *name;
+ int length;
+ NODE *value;
+} *variables[HASHSIZE];
+
+
+typedef struct ahash AHASH;
+struct ahash {
+ AHASH *next;
+ NODE *name,
+ *symbol,
+ *value;
+};
+
+
+typedef struct for_loop_header {
+ NODE *init;
+ NODE *cond;
+ NODE *incr;
+} FOR_LOOP_HEADER;
+
+FOR_LOOP_HEADER *make_for_loop();
+
+#define ADD_ONE_REFERENCE(s) ++(s)->stref
+#define FREE_ONE_REFERENCE(s) {\
+ if(s==Nnull_string) {\
+ fprintf(stderr,"Free_Nnull_string %d",(s)->stref);\
+ }\
+ if (--(s)->stref == 0) {\
+ free((char *)((s)->stptr));\
+ free((char *)s);\
+ }\
+}
+/* #define FREE_ONE_REFERENCE(s) {if (--(s)->stref == 0) {printf("FREE %x\n",s);free((s)->stptr);free(s);}} */
diff --git a/awk.tab.c b/awk.tab.c
new file mode 100644
index 00000000..f0f43df4
--- /dev/null
+++ b/awk.tab.c
@@ -0,0 +1,1696 @@
+
+/* A Bison parser, made from awk.y */
+
+#define NAME 258
+#define REGEXP 259
+#define YSTRING 260
+#define ERROR 261
+#define INCDEC 262
+#define NUMBER 263
+#define ASSIGNOP 264
+#define RELOP 265
+#define MATCHOP 266
+#define NEWLINE 267
+#define REDIRECT_OP 268
+#define CONCAT_OP 269
+#define LEX_BEGIN 270
+#define LEX_END 271
+#define LEX_IF 272
+#define LEX_ELSE 273
+#define LEX_WHILE 274
+#define LEX_FOR 275
+#define LEX_BREAK 276
+#define LEX_CONTINUE 277
+#define LEX_PRINT 278
+#define LEX_PRINTF 279
+#define LEX_NEXT 280
+#define LEX_EXIT 281
+#define LEX_IN 282
+#define LEX_AND 283
+#define LEX_OR 284
+#define INCREMENT 285
+#define DECREMENT 286
+#define LEX_BUILTIN 287
+#define UNARY 288
+
+#line 27 "awk.y"
+
+#define YYDEBUG 12
+
+#include <stdio.h>
+#include "awk.h"
+
+ static int yylex ();
+
+
+ /*
+ * The following variable is used for a very sickening thing.
+ * The awk language uses white space as the string concatenation
+ * operator, but having a white space token that would have to appear
+ * everywhere in all the grammar rules would be unbearable.
+ * It turns out we can return CONCAT_OP exactly when there really
+ * is one, just from knowing what kinds of other tokens it can appear
+ * between (namely, constants, variables, or close parentheses).
+ * This is because concatenation has the lowest priority of all
+ * operators. want_concat_token is used to remember that something
+ * that could be the left side of a concat has just been returned.
+ *
+ * If anyone knows a cleaner way to do this (don't look at the Un*x
+ * code to find one, though), please suggest it.
+ */
+ static int want_concat_token;
+
+ /* Two more horrible kludges. The same comment applies to these two too */
+ static int want_regexp; /* lexical scanning kludge */
+ static int want_redirect; /* similarly */
+ int lineno = 1; /* JF for error msgs */
+
+/* During parsing of a gawk program, the pointer to the next character
+ is in this variable. */
+ char *lexptr; /* JF moved it up here */
+ char *lexptr_begin; /* JF for error msgs */
+
+#line 64 "awk.y"
+typedef union {
+ long lval;
+ AWKNUM fval;
+ NODE *nodeval;
+ NODETYPE nodetypeval;
+ char *sval;
+ NODE *(*ptrval)();
+} YYSTYPE;
+
+#ifndef YYLTYPE
+typedef
+ struct yyltype
+ {
+ int timestamp;
+ int first_line;
+ int first_column;
+ int last_line;
+ int last_column;
+ char *text;
+ }
+ yyltype;
+
+#define YYLTYPE yyltype
+#endif
+
+#define YYACCEPT return(0)
+#define YYABORT return(1)
+#define YYERROR return(1)
+#include <stdio.h>
+
+#ifndef __STDC__
+#define const
+#endif
+
+
+
+#define YYFINAL 200
+#define YYFLAG -32768
+#define YYNTBASE 49
+
+#define YYTRANSLATE(x) (yytranslate[x])
+
+static const char yytranslate[] = { 0,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 40, 2, 2, 48, 37, 2, 2, 41,
+ 42, 35, 33, 39, 34, 2, 36, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 45, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 46, 2, 47, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 43, 2, 44, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 1, 2, 3, 4, 5,
+ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
+ 26, 27, 28, 29, 30, 31, 32, 38
+};
+
+static const short yyrline[] = { 0,
+ 105, 110, 112, 117, 122, 124, 126, 131, 134, 136,
+ 138, 140, 142, 151, 153, 158, 160, 164, 166, 171,
+ 173, 178, 180, 182, 186, 189, 193, 196, 197, 198,
+ 199, 201, 204, 206, 208, 210, 212, 214, 217, 220,
+ 222, 228, 230, 236, 239, 244, 246, 248, 250, 255,
+ 259, 265, 267, 271, 276, 282, 284, 288, 291, 293,
+ 299, 301, 303, 305, 307, 309, 311, 313, 315, 317,
+ 319, 323, 325, 327, 329, 331, 334, 336, 340, 342,
+ 344, 346, 348, 350, 352, 354, 356, 358, 360, 364,
+ 366, 368, 370, 372, 375, 379, 382, 384
+};
+
+static const char * yytname[] = { 0,
+"error","$illegal.","NAME","REGEXP","YSTRING","ERROR","INCDEC","NUMBER","ASSIGNOP","RELOP",
+"MATCHOP","NEWLINE","REDIRECT_OP","CONCAT_OP","LEX_BEGIN","LEX_END","LEX_IF","LEX_ELSE","LEX_WHILE","LEX_FOR",
+"LEX_BREAK","LEX_CONTINUE","LEX_PRINT","LEX_PRINTF","LEX_NEXT","LEX_EXIT","LEX_IN","LEX_AND","LEX_OR","INCREMENT",
+"DECREMENT","LEX_BUILTIN","'+'","'-'","'*'","'/'","'%'","UNARY","','","'!'",
+"'('","')'","'{'","'}'","';'","'['","']'","'$'","start"
+};
+
+static const short yyr1[] = { 0,
+ 49, 50, 50, 51, 52, 52, 52, 53, 53, 53,
+ 53, 53, 53, 54, 53, 55, 53, 53, 53, 56,
+ 56, 57, 57, 57, 58, 58, 59, 59, 59, 59,
+ 59, 60, 60, 60, 60, 60, 60, 60, 60, 61,
+ 60, 62, 60, 63, 60, 60, 60, 60, 60, 64,
+ 64, 65, 65, 66, 66, 67, 67, 68, 68, 68,
+ 69, 69, 69, 69, 69, 69, 69, 69, 69, 69,
+ 69, 69, 69, 69, 69, 69, 69, 69, 70, 70,
+ 70, 70, 70, 70, 70, 70, 70, 70, 70, 70,
+ 70, 70, 70, 70, 70, 71, 71, 71
+};
+
+static const short yyr2[] = { 0,
+ 2, 1, 2, 4, 0, 1, 3, 1, 1, 2,
+ 3, 3, 3, 0, 4, 0, 6, 3, 1, 0,
+ 4, 0, 1, 2, 2, 2, 0, 1, 1, 2,
+ 2, 5, 1, 6, 10, 9, 9, 2, 2, 0,
+ 5, 0, 5, 0, 7, 2, 2, 5, 2, 6,
+ 9, 0, 2, 0, 2, 0, 1, 0, 1, 3,
+ 4, 1, 3, 2, 2, 2, 2, 2, 1, 1,
+ 1, 3, 3, 3, 3, 3, 3, 3, 4, 1,
+ 3, 2, 2, 2, 2, 2, 1, 1, 1, 3,
+ 3, 3, 3, 3, 3, 1, 4, 2
+};
+
+static const short yydefact[] = { 52,
+ 5, 96, 71, 70, 53, 8, 9, 0, 0, 62,
+ 0, 14, 0, 0, 0, 5, 2, 20, 6, 19,
+ 69, 0, 65, 66, 58, 0, 64, 0, 10, 0,
+ 19, 89, 88, 0, 0, 80, 0, 0, 98, 87,
+ 3, 27, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 67, 68, 0, 0, 59,
+ 0, 0, 13, 63, 83, 84, 58, 82, 0, 0,
+ 0, 0, 0, 0, 0, 85, 86, 29, 28, 22,
+ 52, 11, 12, 7, 18, 16, 77, 75, 76, 72,
+ 73, 74, 78, 97, 0, 61, 15, 0, 81, 95,
+ 93, 94, 90, 91, 92, 31, 30, 0, 0, 0,
+ 0, 0, 40, 42, 0, 0, 27, 0, 23, 33,
+ 0, 4, 0, 60, 79, 0, 0, 56, 52, 52,
+ 38, 39, 58, 58, 58, 46, 0, 47, 22, 21,
+ 24, 49, 0, 0, 0, 96, 0, 57, 25, 26,
+ 54, 0, 54, 0, 0, 17, 27, 27, 0, 0,
+ 0, 0, 44, 0, 0, 27, 0, 0, 0, 56,
+ 0, 55, 41, 54, 43, 48, 32, 50, 34, 0,
+ 0, 56, 0, 27, 27, 27, 0, 45, 0, 0,
+ 0, 27, 51, 37, 36, 0, 35, 0, 0, 0
+};
+
+static const short yydefgoto[] = { 198,
+ 16, 17, 18, 19, 28, 123, 43, 118, 131, 80,
+ 119, 133, 135, 174, 120, 1, 162, 147, 59, 121,
+ 39, 21
+};
+
+static const short yypact[] = {-32768,
+ 262, -29,-32768,-32768,-32768,-32768,-32768, 11, 11, 6,
+ 313,-32768, 326, 326, 345, 142,-32768, 17, 243, 399,
+ 31, 313,-32768,-32768, 313, 313,-32768, 61,-32768, 35,
+ 129,-32768,-32768, 11, 11, 39, 313, 313,-32768, 157,
+-32768, 99, 81, 326, 326, 326, 313, 64, 313, 313,
+ 313, 313, 313, 313, 313,-32768,-32768, 72, -20, 254,
+ 87, 105,-32768,-32768,-32768,-32768, 313,-32768, 369, 313,
+ 313, 313, 313, 313, 313,-32768,-32768,-32768,-32768, 172,
+-32768,-32768, 107, 89, 254,-32768, 336, 164, 164,-32768,
+-32768,-32768, 254,-32768, 313,-32768,-32768, 46,-32768, 336,
+ 164, 164,-32768,-32768,-32768,-32768,-32768, 111, 114, 115,
+ -6, -6,-32768, 120, -6, 71, 99, 202,-32768,-32768,
+ -7, 156, 166, 254,-32768, 326, 326, 360,-32768,-32768,
+-32768,-32768, 313, 313, 313,-32768, 313,-32768, 172,-32768,
+-32768,-32768, 143, 109, 125, -5, 140, 254, 156, 156,
+ 3, 53, 3, 381, 232,-32768, 99, 99, 181, 292,
+ 313, -6,-32768, -6, -6, 99, 172, 172, 206, 313,
+ -25, 254,-32768, 198,-32768,-32768, 132, 194,-32768, 174,
+ 175, 313, -6, 99, 99, 99, 176,-32768, 172, 172,
+ 172, 99,-32768,-32768,-32768, 172,-32768, 214, 229,-32768
+};
+
+static const short yypgoto[] = {-32768,
+-32768, 215,-32768, -12,-32768,-32768,-32768, 91, -34, -82,
+ -100,-32768,-32768,-32768,-32768, -73, -95, -159, -36, -1,
+-32768, 304
+};
+
+
+#define YYLAST 436
+
+
+static const short yytable[] = { 20,
+ 29, 30, 44, 45, 129, 129, 49, 122, 159, 27,
+ 181, 20, 31, 2, 20, 161, 22, 141, 95, 182,
+ 58, 96, 187, 60, 61, 50, 51, 52, 53, 54,
+ 98, 82, 83, 84, 139, 68, 69, 130, 130, 55,
+ 22, 95, 20, 20, 20, 85, 25, 87, 88, 89,
+ 90, 91, 92, 93, 141, 149, 150, 164, 15, 42,
+ 56, 57, 44, 45, 62, 60, 178, 179, 100, 101,
+ 102, 103, 104, 105, 167, 168, 63, 132, 183, 67,
+ 136, 138, 129, 177, 95, 49, 142, 125, 193, 194,
+ 195, 95, 81, 124, 163, 197, 151, 152, 153, 86,
+ 49, 189, 190, 191, 50, 51, 52, 53, 54, 196,
+ 78, 137, 79, 144, 145, 130, 44, 45, 94, 50,
+ 51, 52, 53, 54, 20, 20, 148, 173, 64, 175,
+ 176, 60, 60, 60, 44, 154, 44, 45, 47, 48,
+ 97, -1, 49, 106, 2, 107, 3, 171, 188, 4,
+ 157, 126, 44, 45, 127, 128, 6, 7, 20, 172,
+ 134, 50, 51, 52, 53, 54, 158, 5, 148, 143,
+ 64, 8, 9, 10, 2, 11, 3, 12, 156, 4,
+ 148, 13, 14, 106, 160, 107, 76, 77, 108, 15,
+ 109, 110, 111, 112, 113, 114, 115, 116, 52, 53,
+ 54, 8, 9, 10, 2, 11, 3, 169, 180, 4,
+ 161, 184, 26, 199, 117, 185, 186, 192, 108, 15,
+ 109, 110, 111, 112, 113, 114, 115, 116, 200, 155,
+ 41, 8, 9, 10, 2, 11, 3, 0, 0, 4,
+ 0, 0, 26, 0, 117, 140, 0, 0, 108, 15,
+ 109, 110, 111, 112, 113, 114, 115, 116, 0, 0,
+ 0, 8, 9, 10, 2, 11, 3, 49, 0, 4,
+ 44, 45, 26, 5, 117, 166, 6, 7, 0, 15,
+ 0, 46, 0, 0, 0, 0, 50, 51, 52, 53,
+ 54, 8, 9, 10, 2, 11, 3, 12, 0, 4,
+ 0, 13, 14, 0, 0, 0, 6, 7, 0, 15,
+ 0, 23, 24, 0, 0, 2, 0, 3, 40, 0,
+ 4, 8, 9, 10, 0, 11, 0, 12, 2, 0,
+ 3, 13, 14, 4, 0, 0, 170, 65, 66, 15,
+ 6, 7, 8, 9, 10, 0, 11, 2, 0, 32,
+ 0, 0, 33, 26, 0, 8, 9, 10, 0, 11,
+ 15, 12, 146, 0, 3, 13, 14, 4, 50, 51,
+ 52, 53, 54, 15, 34, 35, 36, 0, 37, 0,
+ 0, 0, 49, 0, 0, 38, 0, 0, 0, 8,
+ 9, 10, 15, 11, 49, 0, 0, 0, 0, 0,
+ 26, 50, 51, 52, 53, 54, 0, 15, 47, 48,
+ 99, 0, 49, 50, 51, 52, 53, 54, 0, 0,
+ 0, 0, 165, 0, 0, 0, 0, 0, 0, 0,
+ 0, 50, 51, 52, 53, 54
+};
+
+static const short yycheck[] = { 1,
+ 13, 14, 28, 29, 12, 12, 14, 81, 14, 11,
+ 170, 13, 14, 3, 16, 13, 46, 118, 39, 45,
+ 22, 42, 182, 25, 26, 33, 34, 35, 36, 37,
+ 67, 44, 45, 46, 117, 37, 38, 45, 45, 9,
+ 46, 39, 44, 45, 46, 47, 41, 49, 50, 51,
+ 52, 53, 54, 55, 155, 129, 130, 153, 48, 43,
+ 30, 31, 28, 29, 4, 67, 167, 168, 70, 71,
+ 72, 73, 74, 75, 157, 158, 42, 112, 174, 41,
+ 115, 116, 12, 166, 39, 14, 121, 42, 189, 190,
+ 191, 39, 12, 95, 42, 196, 133, 134, 135, 36,
+ 14, 184, 185, 186, 33, 34, 35, 36, 37, 192,
+ 12, 41, 14, 126, 127, 45, 28, 29, 47, 33,
+ 34, 35, 36, 37, 126, 127, 128, 162, 42, 164,
+ 165, 133, 134, 135, 28, 137, 28, 29, 10, 11,
+ 36, 0, 14, 12, 3, 14, 5, 160, 183, 8,
+ 42, 41, 28, 29, 41, 41, 15, 16, 160, 161,
+ 41, 33, 34, 35, 36, 37, 42, 12, 170, 4,
+ 42, 30, 31, 32, 3, 34, 5, 36, 36, 8,
+ 182, 40, 41, 12, 45, 14, 30, 31, 17, 48,
+ 19, 20, 21, 22, 23, 24, 25, 26, 35, 36,
+ 37, 30, 31, 32, 3, 34, 5, 27, 3, 8,
+ 13, 18, 41, 0, 43, 42, 42, 42, 17, 48,
+ 19, 20, 21, 22, 23, 24, 25, 26, 0, 139,
+ 16, 30, 31, 32, 3, 34, 5, -1, -1, 8,
+ -1, -1, 41, -1, 43, 44, -1, -1, 17, 48,
+ 19, 20, 21, 22, 23, 24, 25, 26, -1, -1,
+ -1, 30, 31, 32, 3, 34, 5, 14, -1, 8,
+ 28, 29, 41, 12, 43, 44, 15, 16, -1, 48,
+ -1, 39, -1, -1, -1, -1, 33, 34, 35, 36,
+ 37, 30, 31, 32, 3, 34, 5, 36, -1, 8,
+ -1, 40, 41, -1, -1, -1, 15, 16, -1, 48,
+ -1, 8, 9, -1, -1, 3, -1, 5, 15, -1,
+ 8, 30, 31, 32, -1, 34, -1, 36, 3, -1,
+ 5, 40, 41, 8, -1, -1, 45, 34, 35, 48,
+ 15, 16, 30, 31, 32, -1, 34, 3, -1, 5,
+ -1, -1, 8, 41, -1, 30, 31, 32, -1, 34,
+ 48, 36, 3, -1, 5, 40, 41, 8, 33, 34,
+ 35, 36, 37, 48, 30, 31, 32, -1, 34, -1,
+ -1, -1, 14, -1, -1, 41, -1, -1, -1, 30,
+ 31, 32, 48, 34, 14, -1, -1, -1, -1, -1,
+ 41, 33, 34, 35, 36, 37, -1, 48, 10, 11,
+ 42, -1, 14, 33, 34, 35, 36, 37, -1, -1,
+ -1, -1, 42, -1, -1, -1, -1, -1, -1, -1,
+ -1, 33, 34, 35, 36, 37
+};
+#define YYPURE 1
+
+#line 2 "bison.simple"
+
+/* Skeleton output parser for bison,
+ copyright (C) 1984 Bob Corbett and Richard Stallman
+
+ Permission is granted to anyone to make or distribute verbatim copies of this program
+ provided that the copyright notice and this permission notice are preserved;
+ and provided that the recipient is not asked to waive or limit his right to
+ redistribute copies as permitted by this permission notice;
+ and provided that anyone possessing an executable copy
+ is granted access to copy the source code, in machine-readable form,
+ in some reasonable manner.
+
+ Permission is granted to distribute derived works or enhanced versions of
+ this program under the above conditions with the additional condition
+ that the entire derivative or enhanced work
+ must be covered by a permission notice identical to this one.
+
+ Anything distributed as part of a package containing portions derived
+ from this program, which cannot in current practice perform its function usefully
+ in the absense of what was derived directly from this program,
+ is to be considered as forming, together with the latter,
+ a single work derived from this program,
+ which must be entirely covered by a permission notice identical to this one
+ in order for distribution of the package to be permitted.
+
+ In other words, you are welcome to use, share and improve this program.
+ You are forbidden to forbid anyone else to use, share and improve
+ what you give them. Help stamp out software-hoarding! */
+
+/* This is the parser code that is written into each bison parser
+ when the %semantic_parser declaration is not specified in the grammar.
+ It was written by Richard Stallman by simplifying the hairy parser
+ used when %semantic_parser is specified. */
+
+/* Note: there must be only one dollar sign in this file.
+ It is replaced by the list of actions, each action
+ as one case of the switch. */
+
+#define yyerrok (yyerrstatus = 0)
+#define yyclearin (yychar = YYEMPTY)
+#define YYEMPTY -2
+#define YYEOF 0
+#define YYFAIL goto yyerrlab;
+
+#define YYTERROR 1
+
+#ifndef YYIMPURE
+#define YYLEX yylex()
+#endif
+
+#ifndef YYPURE
+#define YYLEX yylex(&yylval, &yylloc)
+#endif
+
+/* If nonreentrant, generate the variables here */
+
+#ifndef YYIMPURE
+
+int yychar; /* the lookahead symbol */
+YYSTYPE yylval; /* the semantic value of the */
+ /* lookahead symbol */
+
+YYLTYPE yylloc; /* location data for the lookahead */
+ /* symbol */
+
+int yydebug = 0; /* nonzero means print parse trace */
+
+#endif /* YYIMPURE */
+
+
+/* YYMAXDEPTH indicates the initial size of the parser's stacks */
+
+#ifndef YYMAXDEPTH
+#define YYMAXDEPTH 200
+#endif
+
+/* YYMAXLIMIT is the maximum size the stacks can grow to
+ (effective only if the built-in stack extension method is used). */
+
+#ifndef YYMAXLIMIT
+#define YYMAXLIMIT 10000
+#endif
+
+
+#line 87 "bison.simple"
+int
+yyparse()
+{
+ register int yystate;
+ register int yyn;
+ register short *yyssp;
+ register YYSTYPE *yyvsp;
+ YYLTYPE *yylsp;
+ int yyerrstatus; /* number of tokens to shift before error messages enabled */
+ int yychar1; /* lookahead token as an internal (translated) token number */
+
+ short yyssa[YYMAXDEPTH]; /* the state stack */
+ YYSTYPE yyvsa[YYMAXDEPTH]; /* the semantic value stack */
+ YYLTYPE yylsa[YYMAXDEPTH]; /* the location stack */
+
+ short *yyss = yyssa; /* refer to the stacks thru separate pointers */
+ YYSTYPE *yyvs = yyvsa; /* to allow yyoverflow to reallocate them elsewhere */
+ YYLTYPE *yyls = yylsa;
+
+ int yymaxdepth = YYMAXDEPTH;
+
+#ifndef YYPURE
+
+ int yychar;
+ YYSTYPE yylval;
+ YYLTYPE yylloc;
+
+ extern int yydebug;
+
+#endif
+
+
+ YYSTYPE yyval; /* the variable used to return */
+ /* semantic values from the action */
+ /* routines */
+
+ int yylen;
+
+ if (yydebug)
+ fprintf(stderr, "Starting parse\n");
+
+ yystate = 0;
+ yyerrstatus = 0;
+ yychar = YYEMPTY; /* Cause a token to be read. */
+
+ /* Initialize stack pointers.
+ Waste one element of value and location stack
+ so that they stay on the same level as the state stack. */
+
+ yyssp = yyss - 1;
+ yyvsp = yyvs;
+ yylsp = yyls;
+
+/* Push a new state, which is found in yystate . */
+/* In all cases, when you get here, the value and location stacks
+ have just been pushed. so pushing a state here evens the stacks. */
+yynewstate:
+
+ *++yyssp = yystate;
+
+ if (yyssp >= yyss + yymaxdepth - 1)
+ {
+ /* Give user a chance to reallocate the stack */
+ /* Use copies of these so that the &'s don't force the real ones into memory. */
+ YYSTYPE *yyvs1 = yyvs;
+ YYLTYPE *yyls1 = yyls;
+ short *yyss1 = yyss;
+
+ /* Get the current used size of the three stacks, in elements. */
+ int size = yyssp - yyss + 1;
+
+#ifdef yyoverflow
+ /* Each stack pointer address is followed by the size of
+ the data in use in that stack, in bytes. */
+ yyoverflow("parser stack overflow",
+ &yyss1, size * sizeof (*yyssp),
+ &yyvs1, size * sizeof (*yyvsp),
+ &yyls1, size * sizeof (*yylsp),
+ &yymaxdepth);
+
+ yyss = yyss1; yyvs = yyvs1; yyls = yyls1;
+#else /* no yyoverflow */
+ /* Extend the stack our own way. */
+ if (yymaxdepth >= YYMAXLIMIT)
+ yyerror("parser stack overflow");
+ yymaxdepth *= 2;
+ if (yymaxdepth > YYMAXLIMIT)
+ yymaxdepth = YYMAXLIMIT;
+ yyss = (short *) alloca (yymaxdepth * sizeof (*yyssp));
+ bcopy ((char *)yyss1, (char *)yyss, size * sizeof (*yyssp));
+ yyls = (YYLTYPE *) alloca (yymaxdepth * sizeof (*yylsp));
+ bcopy ((char *)yyls1, (char *)yyls, size * sizeof (*yylsp));
+ yyvs = (YYSTYPE *) alloca (yymaxdepth * sizeof (*yyvsp));
+ bcopy ((char *)yyvs1, (char *)yyvs, size * sizeof (*yyvsp));
+#endif /* no yyoverflow */
+
+ yyssp = yyss + size - 1;
+ yylsp = yyls + size - 1;
+ yyvsp = yyvs + size - 1;
+
+ if (yydebug)
+ fprintf(stderr, "Stack size increased to %d\n", yymaxdepth);
+
+ if (yyssp >= yyss + yymaxdepth - 1)
+ YYERROR;
+ }
+
+ if (yydebug)
+ fprintf(stderr, "Entering state %d\n", yystate);
+
+/* Do appropriate processing given the current state. */
+/* Read a lookahead token if we need one and don't already have one. */
+yyresume:
+
+ /* First try to decide what to do without reference to lookahead token. */
+
+ yyn = yypact[yystate];
+ if (yyn == YYFLAG)
+ goto yydefault;
+
+ /* Not known => get a lookahead token if don't already have one. */
+
+ /* yychar is either YYEMPTY or YYEOF
+ or a valid token in external form. */
+
+ if (yychar == YYEMPTY)
+ {
+ yychar = YYLEX;
+ }
+
+ /* Convert token to internal form (in yychar1) for indexing tables with */
+
+ if (yychar <= 0) /* This means end of input. */
+ {
+ yychar1 = 0;
+ yychar = YYEOF; /* Don't call YYLEX any more */
+
+ if (yydebug)
+ fprintf(stderr, "Now at end of input.\n");
+ }
+ else
+ {
+ yychar1 = YYTRANSLATE(yychar);
+
+ if (yydebug)
+ fprintf(stderr, "Parsing next token; it is %d (%s)\n", yychar, yytname[yychar1]);
+ }
+
+ yyn += yychar1;
+ if (yyn < 0 || yyn > YYLAST || yycheck[yyn] != yychar1)
+ goto yydefault;
+
+ yyn = yytable[yyn];
+
+ /* yyn is what to do for this token type in this state.
+ Negative => reduce, -yyn is rule number.
+ Positive => shift, yyn is new state.
+ New state is final state => don't bother to shift,
+ just return success.
+ 0, or most negative number => error. */
+
+ if (yyn < 0)
+ {
+ if (yyn == YYFLAG)
+ goto yyerrlab;
+ yyn = -yyn;
+ goto yyreduce;
+ }
+ else if (yyn == 0)
+ goto yyerrlab;
+
+ if (yyn == YYFINAL)
+ YYACCEPT;
+
+ /* Shift the lookahead token. */
+
+ if (yydebug)
+ fprintf(stderr, "Shifting token %d (%s), ", yychar, yytname[yychar1]);
+
+ /* Discard the token being shifted unless it is eof. */
+ if (yychar != YYEOF)
+ yychar = YYEMPTY;
+
+ *++yyvsp = yylval;
+ *++yylsp = yylloc;
+
+ /* count tokens shifted since error; after three, turn off error status. */
+ if (yyerrstatus) yyerrstatus--;
+
+ yystate = yyn;
+ goto yynewstate;
+
+/* Do the default action for the current state. */
+yydefault:
+
+ yyn = yydefact[yystate];
+ if (yyn == 0)
+ goto yyerrlab;
+
+/* Do a reduction. yyn is the number of a rule to reduce with. */
+yyreduce:
+ yylen = yyr2[yyn];
+ yyval = yyvsp[1-yylen]; /* implement default value of the action */
+
+ if (yydebug)
+ {
+ if (yylen == 1)
+ fprintf (stderr, "Reducing 1 value via line %d, ",
+ yyrline[yyn]);
+ else
+ fprintf (stderr, "Reducing %d values via line %d, ",
+ yylen, yyrline[yyn]);
+ }
+
+
+ switch (yyn) {
+
+case 1:
+#line 106 "awk.y"
+{ expression_value = yyvsp[0].nodeval; ;
+ break;}
+case 2:
+#line 111 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_rule_list,(NODE *) NULL); ;
+ break;}
+case 3:
+#line 114 "awk.y"
+{ yyval.nodeval = append_right (yyvsp[-1].nodeval, node(yyvsp[0].nodeval, Node_rule_list,(NODE *) NULL)); ;
+ break;}
+case 4:
+#line 118 "awk.y"
+{ yyval.nodeval = node (yyvsp[-3].nodeval, Node_rule_node, yyvsp[-2].nodeval); ;
+ break;}
+case 5:
+#line 123 "awk.y"
+{ yyval.nodeval = NULL; ;
+ break;}
+case 6:
+#line 125 "awk.y"
+{ yyval.nodeval = yyvsp[0].nodeval; ;
+ break;}
+case 7:
+#line 127 "awk.y"
+{ yyval.nodeval = mkrangenode ( node(yyvsp[-2].nodeval, Node_cond_pair, yyvsp[0].nodeval) ); ;
+ break;}
+case 8:
+#line 133 "awk.y"
+{ yyval.nodeval = node ((NODE *)NULL, Node_K_BEGIN,(NODE *) NULL); ;
+ break;}
+case 9:
+#line 135 "awk.y"
+{ yyval.nodeval = node ((NODE *)NULL, Node_K_END,(NODE *) NULL); ;
+ break;}
+case 10:
+#line 137 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_not,(NODE *) NULL); ;
+ break;}
+case 11:
+#line 139 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_and, yyvsp[0].nodeval); ;
+ break;}
+case 12:
+#line 141 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_or, yyvsp[0].nodeval); ;
+ break;}
+case 13:
+#line 143 "awk.y"
+{
+ yyval.nodeval = yyvsp[-1].nodeval;
+ want_concat_token = 0;
+ ;
+ break;}
+case 14:
+#line 152 "awk.y"
+{ ++want_regexp; ;
+ break;}
+case 15:
+#line 154 "awk.y"
+{ want_regexp = 0;
+ yyval.nodeval = node (node (make_number ((AWKNUM)0), Node_field_spec, (NODE *)NULL),
+ Node_match, (NODE *)make_regexp (yyvsp[-1].sval));
+ ;
+ break;}
+case 16:
+#line 159 "awk.y"
+{ ++want_regexp; ;
+ break;}
+case 17:
+#line 161 "awk.y"
+{ want_regexp = 0;
+ yyval.nodeval = node (yyvsp[-5].nodeval, yyvsp[-4].nodetypeval, (NODE *)make_regexp(yyvsp[-1].sval));
+ ;
+ break;}
+case 18:
+#line 165 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, yyvsp[-1].nodetypeval, yyvsp[0].nodeval); ;
+ break;}
+case 19:
+#line 167 "awk.y"
+{ yyval.nodeval = yyvsp[0].nodeval; ;
+ break;}
+case 20:
+#line 172 "awk.y"
+{ yyval.nodeval = NULL; ;
+ break;}
+case 21:
+#line 174 "awk.y"
+{ yyval.nodeval = yyvsp[-1].nodeval; ;
+ break;}
+case 22:
+#line 179 "awk.y"
+{ yyval.nodeval = NULL; ;
+ break;}
+case 23:
+#line 181 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_statement_list, (NODE *)NULL); ;
+ break;}
+case 24:
+#line 183 "awk.y"
+{ yyval.nodeval = append_right(yyvsp[-1].nodeval, node( yyvsp[0].nodeval, Node_statement_list, (NODE *)NULL)); ;
+ break;}
+case 25:
+#line 188 "awk.y"
+{ yyval.nodetypeval = Node_illegal; ;
+ break;}
+case 26:
+#line 190 "awk.y"
+{ yyval.nodetypeval = Node_illegal; ;
+ break;}
+case 27:
+#line 195 "awk.y"
+{ yyval.nodetypeval = Node_illegal; ;
+ break;}
+case 32:
+#line 203 "awk.y"
+{ yyval.nodeval = yyvsp[-2].nodeval; ;
+ break;}
+case 33:
+#line 205 "awk.y"
+{ yyval.nodeval = yyvsp[0].nodeval; ;
+ break;}
+case 34:
+#line 207 "awk.y"
+{ yyval.nodeval = node (yyvsp[-3].nodeval, Node_K_while, yyvsp[0].nodeval); ;
+ break;}
+case 35:
+#line 209 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_K_for, (NODE *)make_for_loop (yyvsp[-7].nodeval, yyvsp[-5].nodeval, yyvsp[-3].nodeval)); ;
+ break;}
+case 36:
+#line 211 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_K_for, (NODE *)make_for_loop (yyvsp[-6].nodeval, (NODE *)NULL, yyvsp[-3].nodeval)); ;
+ break;}
+case 37:
+#line 213 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_K_arrayfor, (NODE *)make_for_loop(variable(yyvsp[-6].sval), (NODE *)NULL, variable(yyvsp[-3].sval))); ;
+ break;}
+case 38:
+#line 216 "awk.y"
+{ yyval.nodeval = node ((NODE *)NULL, Node_K_break, (NODE *)NULL); ;
+ break;}
+case 39:
+#line 219 "awk.y"
+{ yyval.nodeval = node ((NODE *)NULL, Node_K_continue, (NODE *)NULL); ;
+ break;}
+case 40:
+#line 221 "awk.y"
+{ ++want_redirect; ;
+ break;}
+case 41:
+#line 223 "awk.y"
+{
+ want_redirect = 0;
+ /* $4->lnode = NULL; */
+ yyval.nodeval = node (yyvsp[-2].nodeval, Node_K_print, yyvsp[-1].nodeval);
+ ;
+ break;}
+case 42:
+#line 229 "awk.y"
+{ ++want_redirect; ;
+ break;}
+case 43:
+#line 231 "awk.y"
+{
+ want_redirect = 0;
+ /* $4->lnode = NULL; */
+ yyval.nodeval = node (yyvsp[-2].nodeval, Node_K_printf, yyvsp[-1].nodeval);
+ ;
+ break;}
+case 44:
+#line 237 "awk.y"
+{ ++want_redirect;
+ want_concat_token = 0; ;
+ break;}
+case 45:
+#line 240 "awk.y"
+{
+ want_redirect = 0;
+ yyval.nodeval = node (yyvsp[-4].nodeval, Node_K_printf, yyvsp[-1].nodeval);
+ ;
+ break;}
+case 46:
+#line 245 "awk.y"
+{ yyval.nodeval = node ((NODE *)NULL, Node_K_next, (NODE *)NULL); ;
+ break;}
+case 47:
+#line 247 "awk.y"
+{ yyval.nodeval = node ((NODE *)NULL, Node_K_exit, (NODE *)NULL); ;
+ break;}
+case 48:
+#line 249 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_K_exit, (NODE *)NULL); ;
+ break;}
+case 49:
+#line 251 "awk.y"
+{ yyval.nodeval = yyvsp[-1].nodeval; ;
+ break;}
+case 50:
+#line 257 "awk.y"
+{ yyval.nodeval = node (yyvsp[-3].nodeval, Node_K_if,
+ node (yyvsp[0].nodeval, Node_if_branches, (NODE *)NULL)); ;
+ break;}
+case 51:
+#line 261 "awk.y"
+{ yyval.nodeval = node (yyvsp[-6].nodeval, Node_K_if,
+ node (yyvsp[-3].nodeval, Node_if_branches, yyvsp[0].nodeval)); ;
+ break;}
+case 53:
+#line 268 "awk.y"
+{ yyval.nodetypeval = Node_illegal; ;
+ break;}
+case 54:
+#line 273 "awk.y"
+{ yyval.nodeval = NULL; /* node (NULL, Node_redirect_nil, NULL); */ ;
+ break;}
+case 55:
+#line 277 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, yyvsp[-1].nodetypeval, (NODE *)NULL); ;
+ break;}
+case 56:
+#line 283 "awk.y"
+{ yyval.nodeval = NULL; /* node(NULL, Node_builtin, NULL); */ ;
+ break;}
+case 57:
+#line 285 "awk.y"
+{ yyval.nodeval = yyvsp[0].nodeval; ;
+ break;}
+case 58:
+#line 290 "awk.y"
+{ yyval.nodeval = NULL; ;
+ break;}
+case 59:
+#line 292 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_expression_list, (NODE *)NULL); ;
+ break;}
+case 60:
+#line 294 "awk.y"
+{ yyval.nodeval = append_right(yyvsp[-2].nodeval, node( yyvsp[0].nodeval, Node_expression_list, (NODE *)NULL)); ;
+ break;}
+case 61:
+#line 300 "awk.y"
+{ yyval.nodeval = snode (yyvsp[-1].nodeval, Node_builtin, yyvsp[-3].ptrval); ;
+ break;}
+case 62:
+#line 302 "awk.y"
+{ yyval.nodeval = snode ((NODE *)NULL, Node_builtin, yyvsp[0].ptrval); ;
+ break;}
+case 63:
+#line 304 "awk.y"
+{ yyval.nodeval = yyvsp[-1].nodeval; ;
+ break;}
+case 64:
+#line 306 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_unary_minus, (NODE *)NULL); ;
+ break;}
+case 65:
+#line 308 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_preincrement, (NODE *)NULL); ;
+ break;}
+case 66:
+#line 310 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_predecrement, (NODE *)NULL); ;
+ break;}
+case 67:
+#line 312 "awk.y"
+{ yyval.nodeval = node (yyvsp[-1].nodeval, Node_postincrement, (NODE *)NULL); ;
+ break;}
+case 68:
+#line 314 "awk.y"
+{ yyval.nodeval = node (yyvsp[-1].nodeval, Node_postdecrement, (NODE *)NULL); ;
+ break;}
+case 69:
+#line 316 "awk.y"
+{ yyval.nodeval = yyvsp[0].nodeval; ;
+ break;}
+case 70:
+#line 318 "awk.y"
+{ yyval.nodeval = make_number (yyvsp[0].fval); ;
+ break;}
+case 71:
+#line 320 "awk.y"
+{ yyval.nodeval = make_string (yyvsp[0].sval, -1); ;
+ break;}
+case 72:
+#line 324 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_times, yyvsp[0].nodeval); ;
+ break;}
+case 73:
+#line 326 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_quotient, yyvsp[0].nodeval); ;
+ break;}
+case 74:
+#line 328 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_mod, yyvsp[0].nodeval); ;
+ break;}
+case 75:
+#line 330 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_plus, yyvsp[0].nodeval); ;
+ break;}
+case 76:
+#line 332 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_minus, yyvsp[0].nodeval); ;
+ break;}
+case 77:
+#line 335 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_concat, yyvsp[0].nodeval); ;
+ break;}
+case 78:
+#line 337 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, yyvsp[-1].nodetypeval, yyvsp[0].nodeval); ;
+ break;}
+case 79:
+#line 341 "awk.y"
+{ yyval.nodeval = snode (yyvsp[-1].nodeval, Node_builtin, yyvsp[-3].ptrval); ;
+ break;}
+case 80:
+#line 343 "awk.y"
+{ yyval.nodeval = snode ((NODE *)NULL, Node_builtin, yyvsp[0].ptrval); ;
+ break;}
+case 81:
+#line 345 "awk.y"
+{ yyval.nodeval = yyvsp[-1].nodeval; ;
+ break;}
+case 82:
+#line 347 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_unary_minus, (NODE *)NULL); ;
+ break;}
+case 83:
+#line 349 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_preincrement, (NODE *)NULL); ;
+ break;}
+case 84:
+#line 351 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_predecrement, (NODE *)NULL); ;
+ break;}
+case 85:
+#line 353 "awk.y"
+{ yyval.nodeval = node (yyvsp[-1].nodeval, Node_postincrement, (NODE *)NULL); ;
+ break;}
+case 86:
+#line 355 "awk.y"
+{ yyval.nodeval = node (yyvsp[-1].nodeval, Node_postdecrement, (NODE *)NULL); ;
+ break;}
+case 87:
+#line 357 "awk.y"
+{ yyval.nodeval = yyvsp[0].nodeval; ;
+ break;}
+case 88:
+#line 359 "awk.y"
+{ yyval.nodeval = make_number (yyvsp[0].fval); ;
+ break;}
+case 89:
+#line 361 "awk.y"
+{ yyval.nodeval = make_string (yyvsp[0].sval, -1); ;
+ break;}
+case 90:
+#line 365 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_times, yyvsp[0].nodeval); ;
+ break;}
+case 91:
+#line 367 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_quotient, yyvsp[0].nodeval); ;
+ break;}
+case 92:
+#line 369 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_mod, yyvsp[0].nodeval); ;
+ break;}
+case 93:
+#line 371 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_plus, yyvsp[0].nodeval); ;
+ break;}
+case 94:
+#line 373 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_minus, yyvsp[0].nodeval); ;
+ break;}
+case 95:
+#line 376 "awk.y"
+{ yyval.nodeval = node (yyvsp[-2].nodeval, Node_concat, yyvsp[0].nodeval); ;
+ break;}
+case 96:
+#line 381 "awk.y"
+{ yyval.nodeval = variable (yyvsp[0].sval); ;
+ break;}
+case 97:
+#line 383 "awk.y"
+{ yyval.nodeval = node (variable(yyvsp[-3].sval), Node_subscript, yyvsp[-1].nodeval); ;
+ break;}
+case 98:
+#line 385 "awk.y"
+{ yyval.nodeval = node (yyvsp[0].nodeval, Node_field_spec, (NODE *)NULL); ;
+ break;}
+}
+ /* the action file gets copied in in place of this dollarsign */
+#line 303 "bison.simple"
+
+ yyvsp -= yylen;
+ yylsp -= yylen;
+ yyssp -= yylen;
+
+ if (yydebug)
+ {
+ short *ssp1 = yyss - 1;
+ fprintf (stderr, "state stack now", yyssp-yyss);
+ while (ssp1 != yyssp)
+ fprintf (stderr, " %d", *++ssp1);
+ fprintf (stderr, "\n");
+ }
+
+ *++yyvsp = yyval;
+
+ yylsp++;
+ if (yylen == 0)
+ {
+ yylsp->first_line = yylloc.first_line;
+ yylsp->first_column = yylloc.first_column;
+ yylsp->last_line = (yylsp-1)->last_line;
+ yylsp->last_column = (yylsp-1)->last_column;
+ yylsp->text = 0;
+ }
+ else
+ {
+ yylsp->last_line = (yylsp+yylen-1)->last_line;
+ yylsp->last_column = (yylsp+yylen-1)->last_column;
+ }
+
+ /* Now "shift" the result of the reduction.
+ Determine what state that goes to,
+ based on the state we popped back to
+ and the rule number reduced by. */
+
+ yyn = yyr1[yyn];
+
+ yystate = yypgoto[yyn - YYNTBASE] + *yyssp;
+ if (yystate >= 0 && yystate <= YYLAST && yycheck[yystate] == *yyssp)
+ yystate = yytable[yystate];
+ else
+ yystate = yydefgoto[yyn - YYNTBASE];
+
+ goto yynewstate;
+
+yyerrlab: /* here on detecting error */
+
+ if (! yyerrstatus)
+ /* If not already recovering from an error, report this error. */
+ {
+ yyerror("parse error");
+ }
+
+ if (yyerrstatus == 3)
+ {
+ /* if just tried and failed to reuse lookahead token after an error, discard it. */
+
+ /* return failure if at end of input */
+ if (yychar == YYEOF)
+ YYERROR;
+
+ if (yydebug)
+ fprintf(stderr, "Discarding token %d (%s).\n", yychar, yytname[yychar1]);
+
+ yychar = YYEMPTY;
+ }
+
+ /* Else will try to reuse lookahead token
+ after shifting the error token. */
+
+ yyerrstatus = 3; /* Each real token shifted decrements this */
+
+ goto yyerrhandle;
+
+yyerrdefault: /* current state does not do anything special for the error token. */
+
+#if 0
+ /* This is wrong; only states that explicitly want error tokens
+ should shift them. */
+ yyn = yydefact[yystate]; /* If its default is to accept any token, ok. Otherwise pop it.*/
+ if (yyn) goto yydefault;
+#endif
+
+yyerrpop: /* pop the current state because it cannot handle the error token */
+
+ if (yyssp == yyss) YYERROR;
+ yyvsp--;
+ yylsp--;
+ yystate = *--yyssp;
+
+ if (yydebug)
+ {
+ short *ssp1 = yyss - 1;
+ fprintf (stderr, "Error: state stack now", yyssp-yyss);
+ while (ssp1 != yyssp)
+ fprintf (stderr, " %d", *++ssp1);
+ fprintf (stderr, "\n");
+ }
+
+yyerrhandle:
+
+ yyn = yypact[yystate];
+ if (yyn == YYFLAG)
+ goto yyerrdefault;
+
+ yyn += YYTERROR;
+ if (yyn < 0 || yyn > YYLAST || yycheck[yyn] != YYTERROR)
+ goto yyerrdefault;
+
+ yyn = yytable[yyn];
+ if (yyn < 0)
+ {
+ if (yyn == YYFLAG)
+ goto yyerrpop;
+ yyn = -yyn;
+ goto yyreduce;
+ }
+ else if (yyn == 0)
+ goto yyerrpop;
+
+ if (yyn == YYFINAL)
+ YYACCEPT;
+
+ if (yydebug)
+ fprintf(stderr, "Shifting error token, ");
+
+ *++yyvsp = yylval;
+ *++yylsp = yylloc;
+
+ yystate = yyn;
+ goto yynewstate;
+}
+#line 388 "awk.y"
+
+
+
+struct token {
+ char *operator;
+ NODETYPE value;
+ int class;
+ NODE *(*ptr)();
+};
+
+#define NULL 0
+
+NODE *do_exp(), *do_getline(), *do_index(), *do_length(),
+ *do_sqrt(), *do_log(), *do_sprintf(), *do_substr(),
+ *do_split(), *do_int();
+
+ /* Special functions for debugging */
+#ifndef FAST
+NODE *do_prvars(), *do_bp();
+#endif
+
+/* Tokentab is sorted ascii ascending order, so it can be binary searched. */
+/* (later. Right now its just sort of linear search (SLOW!!) */
+
+static struct token tokentab[] = {
+ {"BEGIN", Node_illegal, LEX_BEGIN, 0},
+ {"END", Node_illegal, LEX_END, 0},
+#ifndef FAST
+ {"bp", Node_builtin, LEX_BUILTIN, do_bp},
+#endif
+ {"break", Node_K_break, LEX_BREAK, 0},
+ {"continue", Node_K_continue, LEX_CONTINUE, 0},
+ {"else", Node_illegal, LEX_ELSE, 0},
+ {"exit", Node_K_exit, LEX_EXIT, 0},
+ {"exp", Node_builtin, LEX_BUILTIN, do_exp},
+ {"for", Node_K_for, LEX_FOR, 0},
+ {"getline", Node_builtin, LEX_BUILTIN, do_getline},
+ {"if", Node_K_if, LEX_IF, 0},
+ {"in", Node_illegal, LEX_IN, 0},
+ {"index", Node_builtin, LEX_BUILTIN, do_index},
+ {"int", Node_builtin, LEX_BUILTIN, do_int},
+ {"length", Node_builtin, LEX_BUILTIN, do_length},
+ {"log", Node_builtin, LEX_BUILTIN, do_log},
+ {"next", Node_K_next, LEX_NEXT, 0},
+ {"print", Node_K_print, LEX_PRINT, 0},
+ {"printf", Node_K_printf, LEX_PRINTF, 0},
+#ifndef FAST
+ {"prvars", Node_builtin, LEX_BUILTIN, do_prvars},
+#endif
+ {"split", Node_builtin, LEX_BUILTIN, do_split},
+ {"sprintf", Node_builtin, LEX_BUILTIN, do_sprintf},
+ {"sqrt", Node_builtin, LEX_BUILTIN, do_sqrt},
+ {"substr", Node_builtin, LEX_BUILTIN, do_substr},
+ {"while", Node_K_while, LEX_WHILE, 0},
+ {NULL, Node_illegal, ERROR, 0}
+};
+
+/* Read one token, getting characters through lexptr. */
+
+static int
+yylex ()
+{
+ register int c;
+ register int namelen;
+ register char *tokstart;
+ register struct token *toktab;
+ double atof(); /* JF know what happens if you forget this? */
+
+
+ static did_newline = 0; /* JF the grammar insists that actions end
+ with newlines. This was easier than hacking
+ the grammar. */
+ int do_concat;
+
+ int seen_e = 0; /* These are for numbers */
+ int seen_point = 0;
+
+ retry:
+
+ if(!lexptr)
+ return 0;
+
+ if (want_regexp) {
+ want_regexp = 0;
+ /* there is a potential bug if a regexp is followed by an equal sign:
+ "/foo/=bar" would result in assign_quotient being returned as the
+ next token. Nothing is done about it since it is not valid awk,
+ but maybe something should be done anyway. */
+
+ tokstart = lexptr;
+ while (c = *lexptr++) {
+ switch (c) {
+ case '\\':
+ if (*lexptr++ == '\0') {
+ yyerror ("unterminated regexp ends with \\");
+ return ERROR;
+ }
+ break;
+ case '/': /* end of the regexp */
+ lexptr--;
+ yylval.sval = tokstart;
+ return REGEXP;
+ case '\n':
+ case '\0':
+ yyerror ("unterminated regexp");
+ return ERROR;
+ }
+ }
+ }
+ do_concat=want_concat_token;
+ want_concat_token=0;
+
+ if(*lexptr=='\0') {
+ lexptr=0;
+ return NEWLINE;
+ }
+
+ /* if lexptr is at white space between two terminal tokens or parens,
+ it is a concatenation operator. */
+ if(do_concat && (*lexptr==' ' || *lexptr=='\t')) {
+ while (*lexptr == ' ' || *lexptr == '\t')
+ lexptr++;
+ if (isalnum(*lexptr) || *lexptr == '\"' || *lexptr == '('
+ || *lexptr == '.' || *lexptr == '$') /* the '.' is for decimal pt */
+ return CONCAT_OP;
+ }
+
+ while (*lexptr == ' ' || *lexptr == '\t')
+ lexptr++;
+
+ tokstart = lexptr; /* JF */
+
+ switch (c = *lexptr++) {
+ case 0:
+ return 0;
+
+ case '\n':
+ lineno++;
+ return NEWLINE;
+
+ case '#': /* it's a comment */
+ while (*lexptr != '\n' && *lexptr != '\0')
+ lexptr++;
+ goto retry;
+
+ case '\\':
+ if(*lexptr=='\n') {
+ lexptr++;
+ goto retry;
+ } else break;
+ case ')':
+ case ']':
+ ++want_concat_token;
+ /* fall through */
+ case '(': /* JF these were above, but I don't see why they should turn on concat. . . &*/
+ case '[':
+
+ case '{':
+ case ',': /* JF */
+ case '$':
+ case ';':
+ /* set node type to ILLEGAL because the action should set it to
+ the right thing */
+ yylval.nodetypeval = Node_illegal;
+ return c;
+
+ case '*':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_times;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '/':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_quotient;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '%':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_mod;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '+':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_plus;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ if(*lexptr=='+') {
+ yylval.nodetypeval=Node_illegal;
+ lexptr++;
+ return INCREMENT;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '!':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_notequal;
+ lexptr++;
+ return RELOP;
+ }
+ if(*lexptr=='~') {
+ yylval.nodetypeval=Node_nomatch;
+ lexptr++;
+ return MATCHOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '<':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_leq;
+ lexptr++;
+ return RELOP;
+ }
+ yylval.nodetypeval=Node_less;
+ return RELOP;
+
+ case '=':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_equal;
+ lexptr++;
+ return RELOP;
+ }
+ yylval.nodetypeval=Node_assign;
+ return ASSIGNOP;
+
+ case '>':
+ if(want_redirect) {
+ if (*lexptr == '>') {
+ yylval.nodetypeval = Node_redirect_append;
+ lexptr++;
+ } else
+ yylval.nodetypeval = Node_redirect_output;
+ return REDIRECT_OP;
+ }
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_geq;
+ lexptr++;
+ return RELOP;
+ }
+ yylval.nodetypeval=Node_greater;
+ return RELOP;
+
+ case '~':
+ yylval.nodetypeval=Node_match;
+ return MATCHOP;
+
+ case '}': /* JF added did newline stuff. Easier than hacking the grammar */
+ if(did_newline) {
+ did_newline=0;
+ return c;
+ }
+ did_newline++;
+ --lexptr;
+ return NEWLINE;
+
+ case '"':
+ while (*lexptr != '\0') {
+ switch (*lexptr++) {
+ case '\\':
+ if (*lexptr++ != '\0')
+ break;
+ /* fall through */
+ case '\n':
+ yyerror ("unterminated string");
+ return ERROR;
+ case '\"':
+ yylval.sval = tokstart + 1; /* JF Skip the doublequote */
+ ++want_concat_token;
+ return YSTRING;
+ }
+ }
+ return ERROR; /* JF this was one level up, wrong? */
+
+ case '-':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_minus;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ if(*lexptr=='-') {
+ yylval.nodetypeval=Node_illegal;
+ lexptr++;
+ return DECREMENT;
+ }
+ /* JF I think space tab comma and newline are the legal places for
+ a UMINUS. Have I missed any? */
+ if((!isdigit(*lexptr) && *lexptr!='.') || (lexptr>lexptr_begin+1 &&
+ !index(" \t,\n",lexptr[-2]))) {
+ /* set node type to ILLEGAL because the action should set it to
+ the right thing */
+ yylval.nodetypeval = Node_illegal;
+ return c;
+ }
+ /* FALL through into number code */
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ case '.':
+ /* It's a number */
+ if(c=='-') namelen=1;
+ else namelen=0;
+ for (; (c = tokstart[namelen]) != '\0'; namelen++) {
+ switch (c) {
+ case '.':
+ if (seen_point)
+ goto got_number;
+ ++seen_point;
+ break;
+ case 'e':
+ case 'E':
+ if (seen_e)
+ goto got_number;
+ ++seen_e;
+ if (tokstart[namelen+1] == '-' || tokstart[namelen+1] == '+')
+ namelen++;
+ break;
+ case '0': case '1': case '2': case '3': case '4':
+ case '5': case '6': case '7': case '8': case '9':
+ break;
+ default:
+ goto got_number;
+ }
+ }
+
+got_number:
+ lexptr = tokstart + namelen;
+ yylval.fval = atof(tokstart);
+ ++want_concat_token;
+ return NUMBER;
+
+ case '&':
+ if(*lexptr=='&') {
+ yylval.nodetypeval=Node_and;
+ lexptr++;
+ return LEX_AND;
+ }
+ return ERROR;
+
+ case '|':
+ if(want_redirect) {
+ lexptr++;
+ yylval.nodetypeval = Node_redirect_pipe;
+ return REDIRECT_OP;
+ }
+ if(*lexptr=='|') {
+ yylval.nodetypeval=Node_or;
+ lexptr++;
+ return LEX_OR;
+ }
+ return ERROR;
+ }
+
+ if (!isalpha(c)) {
+ yyerror ("Invalid char '%c' in expression\n", c);
+ return ERROR;
+ }
+
+ /* its some type of name-type-thing. Find its length */
+ for (namelen = 0; is_identchar(tokstart[namelen]); namelen++)
+ ;
+
+
+ /* See if it is a special token. */
+ for (toktab = tokentab; toktab->operator != NULL; toktab++) {
+ if(*tokstart==toktab->operator[0] &&
+ !strncmp(tokstart,toktab->operator,namelen) &&
+ toktab->operator[namelen]=='\0') {
+ lexptr=tokstart+namelen;
+ if(toktab->class == LEX_BUILTIN)
+ yylval.ptrval = toktab->ptr;
+ else
+ yylval.nodetypeval = toktab->value;
+ return toktab->class;
+ }
+ }
+
+ /* It's a name. See how long it is. */
+ yylval.sval = tokstart;
+ lexptr = tokstart+namelen;
+ ++want_concat_token;
+ return NAME;
+}
+
+/*VARARGS1*/
+yyerror (mesg,a1,a2,a3,a4,a5,a6,a7,a8)
+ char *mesg;
+{
+ register char *ptr,*beg;
+
+ /* Find the current line in the input file */
+ if(!lexptr) {
+ beg="(END OF FILE)";
+ ptr=beg+13;
+ } else {
+ if (*lexptr == '\n' && lexptr!=lexptr_begin)
+ --lexptr;
+ for (beg = lexptr;beg!=lexptr_begin && *beg != '\n';--beg)
+ ;
+ for (ptr = lexptr;*ptr && *ptr != '\n';ptr++) /*jfw: NL isn't guaranteed*/
+ ;
+ if(beg!=lexptr_begin)
+ beg++;
+ }
+ fprintf (stderr, "Error near line %d, '%.*s'\n",lineno, ptr-beg, beg);
+ /* figure out line number, etc. later */
+ fprintf (stderr, mesg, a1, a2, a3, a4, a5, a6, a7, a8);
+ fprintf (stderr,"\n");
+ exit (1);
+}
+
+/* Parse a C escape sequence. STRING_PTR points to a variable
+ containing a pointer to the string to parse. That pointer
+ is updated past the characters we use. The value of the
+ escape sequence is returned.
+
+ A negative value means the sequence \ newline was seen,
+ which is supposed to be equivalent to nothing at all.
+
+ If \ is followed by a null character, we return a negative
+ value and leave the string pointer pointing at the null character.
+
+ If \ is followed by 000, we return 0 and leave the string pointer
+ after the zeros. A value of 0 does not mean end of string. */
+
+static int
+parse_escape (string_ptr)
+ char **string_ptr;
+{
+ register int c = *(*string_ptr)++;
+ switch (c)
+ {
+ case 'a':
+ return '\a';
+ case 'b':
+ return '\b';
+ case 'e':
+ return 033;
+ case 'f':
+ return '\f';
+ case 'n':
+ return '\n';
+ case 'r':
+ return '\r';
+ case 't':
+ return '\t';
+ case 'v':
+ return '\v';
+ case '\n':
+ return -2;
+ case 0:
+ (*string_ptr)--;
+ return 0;
+ case '^':
+ c = *(*string_ptr)++;
+ if (c == '\\')
+ c = parse_escape (string_ptr);
+ if (c == '?')
+ return 0177;
+ return (c & 0200) | (c & 037);
+
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ {
+ register int i = c - '0';
+ register int count = 0;
+ while (++count < 3)
+ {
+ if ((c = *(*string_ptr)++) >= '0' && c <= '7')
+ {
+ i *= 8;
+ i += c - '0';
+ }
+ else
+ {
+ (*string_ptr)--;
+ break;
+ }
+ }
+ return i;
+ }
+ default:
+ return c;
+ }
+}
diff --git a/awk.y b/awk.y
new file mode 100644
index 00000000..180505f6
--- /dev/null
+++ b/awk.y
@@ -0,0 +1,898 @@
+
+/*
+ * gawk -- GNU version of awk
+ * Copyright (C) 1986 Free Software Foundation
+ * Written by Paul Rubin, August 1986
+ *
+ */
+
+/*
+GAWK is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY. No author or distributor accepts responsibility to anyone
+for the consequences of using it or for whether it serves any
+particular purpose or works at all, unless he says so in writing.
+Refer to the GAWK General Public License for full details.
+
+Everyone is granted permission to copy, modify and redistribute GAWK,
+but only under the conditions described in the GAWK General Public
+License. A copy of this license is supposed to have been given to you
+along with GAWK so you can know your rights and responsibilities. It
+should be in a file named COPYING. Among other things, the copyright
+notice and this notice must be preserved on all copies.
+
+In other words, go ahead and share GAWK, but don't try to stop
+anyone else from sharing it farther. Help stamp out software hoarding!
+*/
+
+%{
+#define YYDEBUG 12
+
+#include <stdio.h>
+#include "awk.h"
+
+ static int yylex ();
+
+
+ /*
+ * The following variable is used for a very sickening thing.
+ * The awk language uses white space as the string concatenation
+ * operator, but having a white space token that would have to appear
+ * everywhere in all the grammar rules would be unbearable.
+ * It turns out we can return CONCAT_OP exactly when there really
+ * is one, just from knowing what kinds of other tokens it can appear
+ * between (namely, constants, variables, or close parentheses).
+ * This is because concatenation has the lowest priority of all
+ * operators. want_concat_token is used to remember that something
+ * that could be the left side of a concat has just been returned.
+ *
+ * If anyone knows a cleaner way to do this (don't look at the Un*x
+ * code to find one, though), please suggest it.
+ */
+ static int want_concat_token;
+
+ /* Two more horrible kludges. The same comment applies to these two too */
+ static int want_regexp; /* lexical scanning kludge */
+ static int want_redirect; /* similarly */
+ int lineno = 1; /* JF for error msgs */
+
+/* During parsing of a gawk program, the pointer to the next character
+ is in this variable. */
+ char *lexptr; /* JF moved it up here */
+ char *lexptr_begin; /* JF for error msgs */
+%}
+
+%union {
+ long lval;
+ AWKNUM fval;
+ NODE *nodeval;
+ NODETYPE nodetypeval;
+ char *sval;
+ NODE *(*ptrval)();
+}
+
+%type <nodeval> exp start program rule pattern conditional
+%type <nodeval> action variable redirection expression_list
+%type <nodeval> statements statement if_statement
+%type <nodeval> opt_exp v_exp
+%type <nodetypeval> whitespace
+
+%token <sval> NAME REGEXP YSTRING
+%token <lval> ERROR INCDEC
+%token <fval> NUMBER
+%token <nodetypeval> ASSIGNOP RELOP MATCHOP NEWLINE REDIRECT_OP CONCAT_OP
+%token <nodetypeval> LEX_BEGIN LEX_END LEX_IF LEX_ELSE
+%token <nodetypeval> LEX_WHILE LEX_FOR LEX_BREAK LEX_CONTINUE
+%token <nodetypeval> LEX_PRINT LEX_PRINTF LEX_NEXT LEX_EXIT
+%token LEX_IN
+%token <lval> LEX_AND LEX_OR INCREMENT DECREMENT
+%token <ptrval> LEX_BUILTIN
+
+/* these are just yylval numbers */
+/* %token <lval> CHAR JF this isn't used anymore */
+
+/* Lowest to highest */
+%left LEX_OR
+%left LEX_AND
+%right ASSIGNOP
+%left CONCAT_OP
+%left '+' '-'
+%left '*' '/' '%'
+%right UNARY
+%nonassoc MATCHOP RELOP
+
+%%
+
+start : optional_newlines program
+ { expression_value = $2; }
+ ;
+
+
+program : rule
+ { $$ = node ($1, Node_rule_list,(NODE *) NULL); }
+ | program rule
+ /* cons the rule onto the tail of list */
+ { $$ = append_right ($1, node($2, Node_rule_list,(NODE *) NULL)); }
+ ;
+
+rule : pattern action NEWLINE optional_newlines
+ { $$ = node ($1, Node_rule_node, $2); }
+ ;
+
+
+pattern : /* empty */
+ { $$ = NULL; }
+ | conditional
+ { $$ = $1; }
+ | conditional ',' conditional
+ { $$ = mkrangenode ( node($1, Node_cond_pair, $3) ); } /*jfw*/
+ ;
+
+
+conditional :
+ LEX_BEGIN
+ { $$ = node ((NODE *)NULL, Node_K_BEGIN,(NODE *) NULL); }
+ | LEX_END
+ { $$ = node ((NODE *)NULL, Node_K_END,(NODE *) NULL); }
+ | '!' conditional %prec UNARY
+ { $$ = node ($2, Node_not,(NODE *) NULL); }
+ | conditional LEX_AND conditional
+ { $$ = node ($1, Node_and, $3); }
+ | conditional LEX_OR conditional
+ { $$ = node ($1, Node_or, $3); }
+ | '(' conditional ')'
+ {
+ $$ = $2;
+ want_concat_token = 0;
+ }
+
+ /* In these rules, want_regexp tells yylex that the next thing
+ is a regexp so it should read up to the closing slash. */
+
+ | '/'
+ { ++want_regexp; }
+ REGEXP '/'
+ { want_regexp = 0;
+ $$ = node (node (make_number ((AWKNUM)0), Node_field_spec, (NODE *)NULL),
+ Node_match, (NODE *)make_regexp ($3));
+ }
+ | exp MATCHOP '/'
+ { ++want_regexp; }
+ REGEXP '/'
+ { want_regexp = 0;
+ $$ = node ($1, $2, (NODE *)make_regexp($5));
+ }
+ | exp RELOP exp
+ { $$ = node ($1, $2, $3); }
+ | exp /* JF */
+ { $$ = $1; }
+ ;
+
+
+action : /* empty */
+ { $$ = NULL; }
+ | '{' whitespace statements '}'
+ { $$ = $3; }
+ ;
+
+
+statements : /* EMPTY */
+ { $$ = NULL; }
+ | statement
+ { $$ = node ($1, Node_statement_list, (NODE *)NULL); }
+ | statements statement
+ { $$ = append_right($1, node( $2, Node_statement_list, (NODE *)NULL)); }
+ ;
+
+statement_term :
+ NEWLINE optional_newlines
+ { $<nodetypeval>$ = Node_illegal; }
+ | ';' optional_newlines
+ { $<nodetypeval>$ = Node_illegal; }
+ ;
+
+whitespace :
+ /* blank */
+ { $$ = Node_illegal; }
+ | CONCAT_OP
+ | NEWLINE
+ | whitespace CONCAT_OP
+ | whitespace NEWLINE
+ ;
+statement :
+ '{' whitespace statements '}' whitespace
+ { $$ = $3; }
+ | if_statement
+ { $$ = $1; }
+ | LEX_WHILE '(' conditional ')' whitespace statement
+ { $$ = node ($3, Node_K_while, $6); }
+ | LEX_FOR '(' opt_exp ';' conditional ';' opt_exp ')' whitespace statement
+ { $$ = node ($10, Node_K_for, (NODE *)make_for_loop ($3, $5, $7)); }
+ | LEX_FOR '(' opt_exp ';' ';' opt_exp ')' whitespace statement
+ { $$ = node ($9, Node_K_for, (NODE *)make_for_loop ($3, (NODE *)NULL, $6)); }
+ | LEX_FOR '(' NAME CONCAT_OP LEX_IN NAME ')' whitespace statement
+ { $$ = node ($9, Node_K_arrayfor, (NODE *)make_for_loop(variable($3), (NODE *)NULL, variable($6))); }
+ | LEX_BREAK statement_term
+ /* for break, maybe we'll have to remember where to break to */
+ { $$ = node ((NODE *)NULL, Node_K_break, (NODE *)NULL); }
+ | LEX_CONTINUE statement_term
+ /* similarly */
+ { $$ = node ((NODE *)NULL, Node_K_continue, (NODE *)NULL); }
+ | LEX_PRINT
+ { ++want_redirect; }
+ expression_list redirection statement_term
+ {
+ want_redirect = 0;
+ /* $4->lnode = NULL; */
+ $$ = node ($3, Node_K_print, $4);
+ }
+ | LEX_PRINTF
+ { ++want_redirect; }
+ expression_list redirection statement_term
+ {
+ want_redirect = 0;
+ /* $4->lnode = NULL; */
+ $$ = node ($3, Node_K_printf, $4);
+ }
+ | LEX_PRINTF '(' expression_list ')'
+ { ++want_redirect;
+ want_concat_token = 0; }
+ redirection statement_term
+ {
+ want_redirect = 0;
+ $$ = node ($3, Node_K_printf, $6);
+ }
+ | LEX_NEXT statement_term
+ { $$ = node ((NODE *)NULL, Node_K_next, (NODE *)NULL); }
+ | LEX_EXIT statement_term
+ { $$ = node ((NODE *)NULL, Node_K_exit, (NODE *)NULL); }
+ | LEX_EXIT '(' exp ')' statement_term
+ { $$ = node ($3, Node_K_exit, (NODE *)NULL); }
+ | exp statement_term
+ { $$ = $1; }
+ ;
+
+
+if_statement:
+ LEX_IF '(' conditional ')' whitespace statement
+ { $$ = node ($3, Node_K_if,
+ node ($6, Node_if_branches, (NODE *)NULL)); }
+ | LEX_IF '(' conditional ')' whitespace statement
+ LEX_ELSE whitespace statement
+ { $$ = node ($3, Node_K_if,
+ node ($6, Node_if_branches, $9)); }
+ ;
+
+optional_newlines :
+ /* empty */
+ | optional_newlines NEWLINE
+ { $<nodetypeval>$ = Node_illegal; }
+ ;
+
+redirection :
+ /* empty */
+ { $$ = NULL; /* node (NULL, Node_redirect_nil, NULL); */ }
+ /* | REDIRECT_OP NAME
+ { $$ = node ($2, $1, NULL); } */
+ | REDIRECT_OP exp
+ { $$ = node ($2, $1, (NODE *)NULL); }
+ ;
+
+
+/* optional expression, as in for loop */
+opt_exp :
+ { $$ = NULL; /* node(NULL, Node_builtin, NULL); */ }
+ | exp
+ { $$ = $1; }
+ ;
+
+expression_list :
+ /* empty */
+ { $$ = NULL; }
+ | exp
+ { $$ = node ($1, Node_expression_list, (NODE *)NULL); }
+ | expression_list ',' exp
+ { $$ = append_right($1, node( $3, Node_expression_list, (NODE *)NULL)); }
+ ;
+
+
+/* Expressions, not including the comma operator. */
+exp : LEX_BUILTIN '(' expression_list ')'
+ { $$ = snode ($3, Node_builtin, $1); }
+ | LEX_BUILTIN
+ { $$ = snode ((NODE *)NULL, Node_builtin, $1); }
+ | '(' exp ')'
+ { $$ = $2; }
+ | '-' exp %prec UNARY
+ { $$ = node ($2, Node_unary_minus, (NODE *)NULL); }
+ | INCREMENT variable %prec UNARY
+ { $$ = node ($2, Node_preincrement, (NODE *)NULL); }
+ | DECREMENT variable %prec UNARY
+ { $$ = node ($2, Node_predecrement, (NODE *)NULL); }
+ | variable INCREMENT %prec UNARY
+ { $$ = node ($1, Node_postincrement, (NODE *)NULL); }
+ | variable DECREMENT %prec UNARY
+ { $$ = node ($1, Node_postdecrement, (NODE *)NULL); }
+ | variable
+ { $$ = $1; } /* JF was variable($1) */
+ | NUMBER
+ { $$ = make_number ($1); }
+ | YSTRING
+ { $$ = make_string ($1, -1); }
+
+/* Binary operators in order of decreasing precedence. */
+ | exp '*' exp
+ { $$ = node ($1, Node_times, $3); }
+ | exp '/' exp
+ { $$ = node ($1, Node_quotient, $3); }
+ | exp '%' exp
+ { $$ = node ($1, Node_mod, $3); }
+ | exp '+' exp
+ { $$ = node ($1, Node_plus, $3); }
+ | exp '-' exp
+ { $$ = node ($1, Node_minus, $3); }
+ /* Empty operator. See yylex for disgusting details. */
+ | exp CONCAT_OP exp
+ { $$ = node ($1, Node_concat, $3); }
+ | variable ASSIGNOP exp
+ { $$ = node ($1, $2, $3); }
+ ;
+
+v_exp : LEX_BUILTIN '(' expression_list ')'
+ { $$ = snode ($3, Node_builtin, $1); }
+ | LEX_BUILTIN
+ { $$ = snode ((NODE *)NULL, Node_builtin, $1); }
+ | '(' exp ')'
+ { $$ = $2; }
+ | '-' exp %prec UNARY
+ { $$ = node ($2, Node_unary_minus, (NODE *)NULL); }
+ | INCREMENT variable %prec UNARY
+ { $$ = node ($2, Node_preincrement, (NODE *)NULL); }
+ | DECREMENT variable %prec UNARY
+ { $$ = node ($2, Node_predecrement, (NODE *)NULL); }
+ | variable INCREMENT %prec UNARY
+ { $$ = node ($1, Node_postincrement, (NODE *)NULL); }
+ | variable DECREMENT %prec UNARY
+ { $$ = node ($1, Node_postdecrement, (NODE *)NULL); }
+ | variable
+ { $$ = $1; } /* JF was variable($1) */
+ | NUMBER
+ { $$ = make_number ($1); }
+ | YSTRING
+ { $$ = make_string ($1, -1); }
+
+/* Binary operators in order of decreasing precedence. */
+ | v_exp '*' exp
+ { $$ = node ($1, Node_times, $3); }
+ | v_exp '/' exp
+ { $$ = node ($1, Node_quotient, $3); }
+ | v_exp '%' exp
+ { $$ = node ($1, Node_mod, $3); }
+ | v_exp '+' exp
+ { $$ = node ($1, Node_plus, $3); }
+ | v_exp '-' exp
+ { $$ = node ($1, Node_minus, $3); }
+ /* Empty operator. See yylex for disgusting details. */
+ | v_exp CONCAT_OP exp
+ { $$ = node ($1, Node_concat, $3); }
+ ;
+
+variable :
+ NAME
+ { $$ = variable ($1); }
+ | NAME '[' exp ']'
+ { $$ = node (variable($1), Node_subscript, $3); }
+ | '$' v_exp %prec UNARY
+ { $$ = node ($2, Node_field_spec, (NODE *)NULL); }
+ ;
+
+%%
+
+
+struct token {
+ char *operator;
+ NODETYPE value;
+ int class;
+ NODE *(*ptr)();
+};
+
+#define NULL 0
+
+NODE *do_exp(), *do_getline(), *do_index(), *do_length(),
+ *do_sqrt(), *do_log(), *do_sprintf(), *do_substr(),
+ *do_split(), *do_int();
+
+ /* Special functions for debugging */
+#ifndef FAST
+NODE *do_prvars(), *do_bp();
+#endif
+
+/* Tokentab is sorted ascii ascending order, so it can be binary searched. */
+/* (later. Right now its just sort of linear search (SLOW!!) */
+
+static struct token tokentab[] = {
+ {"BEGIN", Node_illegal, LEX_BEGIN, 0},
+ {"END", Node_illegal, LEX_END, 0},
+#ifndef FAST
+ {"bp", Node_builtin, LEX_BUILTIN, do_bp},
+#endif
+ {"break", Node_K_break, LEX_BREAK, 0},
+ {"continue", Node_K_continue, LEX_CONTINUE, 0},
+ {"else", Node_illegal, LEX_ELSE, 0},
+ {"exit", Node_K_exit, LEX_EXIT, 0},
+ {"exp", Node_builtin, LEX_BUILTIN, do_exp},
+ {"for", Node_K_for, LEX_FOR, 0},
+ {"getline", Node_builtin, LEX_BUILTIN, do_getline},
+ {"if", Node_K_if, LEX_IF, 0},
+ {"in", Node_illegal, LEX_IN, 0},
+ {"index", Node_builtin, LEX_BUILTIN, do_index},
+ {"int", Node_builtin, LEX_BUILTIN, do_int},
+ {"length", Node_builtin, LEX_BUILTIN, do_length},
+ {"log", Node_builtin, LEX_BUILTIN, do_log},
+ {"next", Node_K_next, LEX_NEXT, 0},
+ {"print", Node_K_print, LEX_PRINT, 0},
+ {"printf", Node_K_printf, LEX_PRINTF, 0},
+#ifndef FAST
+ {"prvars", Node_builtin, LEX_BUILTIN, do_prvars},
+#endif
+ {"split", Node_builtin, LEX_BUILTIN, do_split},
+ {"sprintf", Node_builtin, LEX_BUILTIN, do_sprintf},
+ {"sqrt", Node_builtin, LEX_BUILTIN, do_sqrt},
+ {"substr", Node_builtin, LEX_BUILTIN, do_substr},
+ {"while", Node_K_while, LEX_WHILE, 0},
+ {NULL, Node_illegal, ERROR, 0}
+};
+
+/* Read one token, getting characters through lexptr. */
+
+static int
+yylex ()
+{
+ register int c;
+ register int namelen;
+ register char *tokstart;
+ register struct token *toktab;
+ double atof(); /* JF know what happens if you forget this? */
+
+
+ static did_newline = 0; /* JF the grammar insists that actions end
+ with newlines. This was easier than hacking
+ the grammar. */
+ int do_concat;
+
+ int seen_e = 0; /* These are for numbers */
+ int seen_point = 0;
+
+ retry:
+
+ if(!lexptr)
+ return 0;
+
+ if (want_regexp) {
+ want_regexp = 0;
+ /* there is a potential bug if a regexp is followed by an equal sign:
+ "/foo/=bar" would result in assign_quotient being returned as the
+ next token. Nothing is done about it since it is not valid awk,
+ but maybe something should be done anyway. */
+
+ tokstart = lexptr;
+ while (c = *lexptr++) {
+ switch (c) {
+ case '\\':
+ if (*lexptr++ == '\0') {
+ yyerror ("unterminated regexp ends with \\");
+ return ERROR;
+ }
+ break;
+ case '/': /* end of the regexp */
+ lexptr--;
+ yylval.sval = tokstart;
+ return REGEXP;
+ case '\n':
+ case '\0':
+ yyerror ("unterminated regexp");
+ return ERROR;
+ }
+ }
+ }
+ do_concat=want_concat_token;
+ want_concat_token=0;
+
+ if(*lexptr=='\0') {
+ lexptr=0;
+ return NEWLINE;
+ }
+
+ /* if lexptr is at white space between two terminal tokens or parens,
+ it is a concatenation operator. */
+ if(do_concat && (*lexptr==' ' || *lexptr=='\t')) {
+ while (*lexptr == ' ' || *lexptr == '\t')
+ lexptr++;
+ if (isalnum(*lexptr) || *lexptr == '\"' || *lexptr == '('
+ || *lexptr == '.' || *lexptr == '$') /* the '.' is for decimal pt */
+ return CONCAT_OP;
+ }
+
+ while (*lexptr == ' ' || *lexptr == '\t')
+ lexptr++;
+
+ tokstart = lexptr; /* JF */
+
+ switch (c = *lexptr++) {
+ case 0:
+ return 0;
+
+ case '\n':
+ lineno++;
+ return NEWLINE;
+
+ case '#': /* it's a comment */
+ while (*lexptr != '\n' && *lexptr != '\0')
+ lexptr++;
+ goto retry;
+
+ case '\\':
+ if(*lexptr=='\n') {
+ lexptr++;
+ goto retry;
+ } else break;
+ case ')':
+ case ']':
+ ++want_concat_token;
+ /* fall through */
+ case '(': /* JF these were above, but I don't see why they should turn on concat. . . &*/
+ case '[':
+
+ case '{':
+ case ',': /* JF */
+ case '$':
+ case ';':
+ /* set node type to ILLEGAL because the action should set it to
+ the right thing */
+ yylval.nodetypeval = Node_illegal;
+ return c;
+
+ case '*':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_times;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '/':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_quotient;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '%':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_mod;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '+':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_plus;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ if(*lexptr=='+') {
+ yylval.nodetypeval=Node_illegal;
+ lexptr++;
+ return INCREMENT;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '!':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_notequal;
+ lexptr++;
+ return RELOP;
+ }
+ if(*lexptr=='~') {
+ yylval.nodetypeval=Node_nomatch;
+ lexptr++;
+ return MATCHOP;
+ }
+ yylval.nodetypeval=Node_illegal;
+ return c;
+
+ case '<':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_leq;
+ lexptr++;
+ return RELOP;
+ }
+ yylval.nodetypeval=Node_less;
+ return RELOP;
+
+ case '=':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_equal;
+ lexptr++;
+ return RELOP;
+ }
+ yylval.nodetypeval=Node_assign;
+ return ASSIGNOP;
+
+ case '>':
+ if(want_redirect) {
+ if (*lexptr == '>') {
+ yylval.nodetypeval = Node_redirect_append;
+ lexptr++;
+ } else
+ yylval.nodetypeval = Node_redirect_output;
+ return REDIRECT_OP;
+ }
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_geq;
+ lexptr++;
+ return RELOP;
+ }
+ yylval.nodetypeval=Node_greater;
+ return RELOP;
+
+ case '~':
+ yylval.nodetypeval=Node_match;
+ return MATCHOP;
+
+ case '}': /* JF added did newline stuff. Easier than hacking the grammar */
+ if(did_newline) {
+ did_newline=0;
+ return c;
+ }
+ did_newline++;
+ --lexptr;
+ return NEWLINE;
+
+ case '"':
+ while (*lexptr != '\0') {
+ switch (*lexptr++) {
+ case '\\':
+ if (*lexptr++ != '\0')
+ break;
+ /* fall through */
+ case '\n':
+ yyerror ("unterminated string");
+ return ERROR;
+ case '\"':
+ yylval.sval = tokstart + 1; /* JF Skip the doublequote */
+ ++want_concat_token;
+ return YSTRING;
+ }
+ }
+ return ERROR; /* JF this was one level up, wrong? */
+
+ case '-':
+ if(*lexptr=='=') {
+ yylval.nodetypeval=Node_assign_minus;
+ lexptr++;
+ return ASSIGNOP;
+ }
+ if(*lexptr=='-') {
+ yylval.nodetypeval=Node_illegal;
+ lexptr++;
+ return DECREMENT;
+ }
+ /* JF I think space tab comma and newline are the legal places for
+ a UMINUS. Have I missed any? */
+ if((!isdigit(*lexptr) && *lexptr!='.') || (lexptr>lexptr_begin+1 &&
+ !index(" \t,\n",lexptr[-2]))) {
+ /* set node type to ILLEGAL because the action should set it to
+ the right thing */
+ yylval.nodetypeval = Node_illegal;
+ return c;
+ }
+ /* FALL through into number code */
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ case '.':
+ /* It's a number */
+ if(c=='-') namelen=1;
+ else namelen=0;
+ for (; (c = tokstart[namelen]) != '\0'; namelen++) {
+ switch (c) {
+ case '.':
+ if (seen_point)
+ goto got_number;
+ ++seen_point;
+ break;
+ case 'e':
+ case 'E':
+ if (seen_e)
+ goto got_number;
+ ++seen_e;
+ if (tokstart[namelen+1] == '-' || tokstart[namelen+1] == '+')
+ namelen++;
+ break;
+ case '0': case '1': case '2': case '3': case '4':
+ case '5': case '6': case '7': case '8': case '9':
+ break;
+ default:
+ goto got_number;
+ }
+ }
+
+got_number:
+ lexptr = tokstart + namelen;
+ yylval.fval = atof(tokstart);
+ ++want_concat_token;
+ return NUMBER;
+
+ case '&':
+ if(*lexptr=='&') {
+ yylval.nodetypeval=Node_and;
+ lexptr++;
+ return LEX_AND;
+ }
+ return ERROR;
+
+ case '|':
+ if(want_redirect) {
+ lexptr++;
+ yylval.nodetypeval = Node_redirect_pipe;
+ return REDIRECT_OP;
+ }
+ if(*lexptr=='|') {
+ yylval.nodetypeval=Node_or;
+ lexptr++;
+ return LEX_OR;
+ }
+ return ERROR;
+ }
+
+ if (!isalpha(c)) {
+ yyerror ("Invalid char '%c' in expression\n", c);
+ return ERROR;
+ }
+
+ /* its some type of name-type-thing. Find its length */
+ for (namelen = 0; is_identchar(tokstart[namelen]); namelen++)
+ ;
+
+
+ /* See if it is a special token. */
+ for (toktab = tokentab; toktab->operator != NULL; toktab++) {
+ if(*tokstart==toktab->operator[0] &&
+ !strncmp(tokstart,toktab->operator,namelen) &&
+ toktab->operator[namelen]=='\0') {
+ lexptr=tokstart+namelen;
+ if(toktab->class == LEX_BUILTIN)
+ yylval.ptrval = toktab->ptr;
+ else
+ yylval.nodetypeval = toktab->value;
+ return toktab->class;
+ }
+ }
+
+ /* It's a name. See how long it is. */
+ yylval.sval = tokstart;
+ lexptr = tokstart+namelen;
+ ++want_concat_token;
+ return NAME;
+}
+
+/*VARARGS1*/
+yyerror (mesg,a1,a2,a3,a4,a5,a6,a7,a8)
+ char *mesg;
+{
+ register char *ptr,*beg;
+
+ /* Find the current line in the input file */
+ if(!lexptr) {
+ beg="(END OF FILE)";
+ ptr=beg+13;
+ } else {
+ if (*lexptr == '\n' && lexptr!=lexptr_begin)
+ --lexptr;
+ for (beg = lexptr;beg!=lexptr_begin && *beg != '\n';--beg)
+ ;
+ for (ptr = lexptr;*ptr && *ptr != '\n';ptr++) /*jfw: NL isn't guaranteed*/
+ ;
+ if(beg!=lexptr_begin)
+ beg++;
+ }
+ fprintf (stderr, "Error near line %d, '%.*s'\n",lineno, ptr-beg, beg);
+ /* figure out line number, etc. later */
+ fprintf (stderr, mesg, a1, a2, a3, a4, a5, a6, a7, a8);
+ fprintf (stderr,"\n");
+ exit (1);
+}
+
+/* Parse a C escape sequence. STRING_PTR points to a variable
+ containing a pointer to the string to parse. That pointer
+ is updated past the characters we use. The value of the
+ escape sequence is returned.
+
+ A negative value means the sequence \ newline was seen,
+ which is supposed to be equivalent to nothing at all.
+
+ If \ is followed by a null character, we return a negative
+ value and leave the string pointer pointing at the null character.
+
+ If \ is followed by 000, we return 0 and leave the string pointer
+ after the zeros. A value of 0 does not mean end of string. */
+
+static int
+parse_escape (string_ptr)
+ char **string_ptr;
+{
+ register int c = *(*string_ptr)++;
+ switch (c)
+ {
+ case 'a':
+ return '\a';
+ case 'b':
+ return '\b';
+ case 'e':
+ return 033;
+ case 'f':
+ return '\f';
+ case 'n':
+ return '\n';
+ case 'r':
+ return '\r';
+ case 't':
+ return '\t';
+ case 'v':
+ return '\v';
+ case '\n':
+ return -2;
+ case 0:
+ (*string_ptr)--;
+ return 0;
+ case '^':
+ c = *(*string_ptr)++;
+ if (c == '\\')
+ c = parse_escape (string_ptr);
+ if (c == '?')
+ return 0177;
+ return (c & 0200) | (c & 037);
+
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ {
+ register int i = c - '0';
+ register int count = 0;
+ while (++count < 3)
+ {
+ if ((c = *(*string_ptr)++) >= '0' && c <= '7')
+ {
+ i *= 8;
+ i += c - '0';
+ }
+ else
+ {
+ (*string_ptr)--;
+ break;
+ }
+ }
+ return i;
+ }
+ default:
+ return c;
+ }
+}
diff --git a/awk1.c b/awk1.c
new file mode 100644
index 00000000..79963bac
--- /dev/null
+++ b/awk1.c
@@ -0,0 +1,723 @@
+/*
+ * awk1 -- Expression tree constructors and main program for gawk.
+ *
+ * Copyright (C) 1986 Free Software Foundation
+ * Written by Paul Rubin, August 1986
+ *
+ */
+
+/*
+GAWK is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY. No author or distributor accepts responsibility to anyone
+for the consequences of using it or for whether it serves any
+particular purpose or works at all, unless he says so in writing.
+Refer to the GAWK General Public License for full details.
+
+Everyone is granted permission to copy, modify and redistribute GAWK,
+but only under the conditions described in the GAWK General Public
+License. A copy of this license is supposed to have been given to you
+along with GAWK so you can know your rights and responsibilities. It
+should be in a file named COPYING. Among other things, the copyright
+notice and this notice must be preserved on all copies.
+
+In other words, go ahead and share GAWK, but don't try to stop
+anyone else from sharing it farther. Help stamp out software hoarding!
+*/
+
+#include <stdio.h>
+#include "regex.h"
+#include "awk.h"
+
+/* Temporary nodes are stored here. ob_dummy is a dummy object used to
+ keep the obstack library from free()ing up the entire stack. */
+struct obstack temp_strings;
+char *ob_dummy;
+
+/* The parse tree and field nodes are stored here. Parse_end is a dummy
+ item used to free up unneeded fields without freeing the program being run
+ */
+struct obstack other_stack;
+char *parse_end;
+
+/* The global null string */
+NODE *Nnull_string;
+
+/* The special variable that contains the name of the current input file */
+extern NODE *FILENAME_node;
+
+/* The name the program was invoked under, for error messages */
+char *myname;
+
+/* A block of gAWK code to be run before running the program */
+NODE *begin_block = 0;
+
+/* A block of gAWK code to be run after the last input file */
+NODE *end_block = 0;
+
+FILE *input_file; /* Where to read from */
+
+#ifndef FAST
+/* non-zero means in debugging is enabled. Probably not very useful */
+int debugging;
+#endif
+
+char *index();
+
+
+main(argc, argv)
+ int argc;
+ char **argv;
+{
+ register int i;
+ register NODE *tmp;
+ char **do_vars;
+#ifndef FAST
+ /* Print out the parse tree. For debugging */
+ register int dotree = 0;
+ extern int yydebug;
+#endif
+ extern char *lexptr;
+ extern char *lexptr_begin;
+ FILE *fp,*fopen();
+
+ --argc;
+ myname= *argv++;
+ if(!argc)
+ usage();
+
+ /* Tell the regex routines how they should work. . . */
+ re_set_syntax(RE_NO_BK_PARENS|RE_NO_BK_VBAR);
+
+ /* Set up the stack for temporary strings */
+ obstack_init (&temp_strings);
+ ob_dummy=obstack_alloc(&temp_strings,0);
+
+ /* Set up the other stack for other things */
+ obstack_init(&other_stack);
+ /* initialize the null string */
+ Nnull_string = make_string("",0);
+ /* This was to keep Nnull_string from ever being free()d It didn't work */
+ /* Nnull_string->stref=32000; */
+ /* Set up the special variables */
+ /* Note that this must be done BEFORE arg parsing else -R and -F
+ break horribly */
+ init_vars();
+
+
+ for(;*argv && **argv=='-';argc--,argv++) {
+ switch(argv[0][1]) {
+#ifndef FAST
+ case 'd':
+ debugging++;
+ dotree++;
+ break;
+
+ case 'D':
+ debugging++;
+ yydebug=2;
+ break;
+#endif
+ /* This feature isn't in un*x awk, but might be useful */
+ case 'R':
+ set_rs(&argv[0][2]);
+ break;
+
+ case 'F':
+ set_fs(&argv[0][2]);
+ break;
+
+
+ /* It would be better to read the input file in as we parse
+ it. Its done this way for hysterical reasons. Feel
+ free to fix it. */
+ case 'f':
+ if(lexptr)
+ panic("Can only use one -f option");
+ if((fp=fopen(argv[1],"r"))==NULL)
+ er_panic(argv[1]);
+ else {
+ char *curptr;
+ int siz,nread;
+
+ curptr=lexptr=malloc(2000);
+ if(curptr==NULL)
+ panic("Memory exhausted"); /* jfw: instead of abort() */
+ siz=2000;
+ i=siz-1;
+ while((nread=fread(curptr,sizeof(char),i,fp)) > 0) {
+ curptr+=nread;
+ i-=nread;
+ if(i==0) {
+ lexptr=realloc(lexptr,siz*2);
+ if(lexptr==NULL)
+ panic("Memory exhausted"); /* jfw: instead of abort() */
+ curptr=lexptr+siz-1;
+ i=siz;
+ siz*=2;
+ }
+ }
+ *curptr='\0';
+ fclose(fp);
+ }
+ argc--;
+ argv++;
+ break;
+
+ case '\0': /* A file */
+ break;
+
+ default:
+ panic("Unknown option %s",argv[0]);
+ }
+ }
+ if (debugging) setbuf(stdout, 0); /* jfw: make debugging easier */
+ /* No -f option, use next arg */
+ if(!lexptr) {
+ if(!argc) usage();
+ lexptr= *argv++;
+ --argc;
+ }
+
+ /* Read in the program */
+ lexptr_begin=lexptr;
+ (void)yyparse ();
+
+ /* Anything allocated on the other_stack after here will be freed
+ when the next input line is read.
+ */
+ parse_end=obstack_alloc(&other_stack,0);
+
+#ifndef FAST
+ if(dotree)
+ print_parse_tree(expression_value);
+#endif
+ /* Set up the field variables */
+ init_fields();
+
+ /* Look for BEGIN and END blocks. Only one of each allowed */
+ for(tmp=expression_value;tmp;tmp=tmp->rnode) {
+ if(!tmp->lnode || !tmp->lnode->lnode)
+ continue;
+ if(tmp->lnode->lnode->type==Node_K_BEGIN)
+ begin_block=tmp->lnode->rnode;
+ else if(tmp->lnode->lnode->type==Node_K_END)
+ end_block=tmp->lnode->rnode;
+ }
+ if(begin_block && interpret(begin_block) == 0) exit(0); /* jfw */
+ do_vars=argv;
+ while(argc>0 && index(*argv,'=')) {
+ argv++;
+ --argc;
+ }
+ if(do_vars==argv) do_vars=0;
+ if(argc==0) {
+ static char *dumb[2]= { "-", 0};
+
+ argc=1;
+ argv= &dumb[0];
+ }
+ while(argc--) {
+ if(!strcmp(*argv,"-")) {
+ input_file=stdin;
+ FILENAME_node->var_value=Nnull_string;
+ ADD_ONE_REFERENCE(Nnull_string);
+ } else {
+ extern NODE *deref;
+
+ input_file=fopen(*argv,"r");
+ /* This should print the error message from errno */
+ if(!input_file)
+ er_panic(*argv);
+ /* This is a kludge. */
+ deref=FILENAME_node->var_value;
+ do_deref();
+ FILENAME_node->var_value=make_string(*argv,strlen(*argv));
+ }
+ /* This is where it spends all its time. The infamous MAIN LOOP */
+ if(inrec()==0) {
+ if(do_vars) {
+ while(do_vars!=argv && *do_vars) {
+ char *cp;
+
+ cp=index(*do_vars,'=');
+ *cp++='\0';
+ variable(*do_vars)->var_value=make_string(cp,strlen(cp));
+ do_vars++;
+ }
+ do_vars=0;
+ }
+ do
+ obstack_free(&temp_strings, ob_dummy);
+ while (interpret(expression_value) && inrec() == 0);
+ }
+ if(input_file!=stdin) fclose(input_file);
+ argv++;
+ }
+ if(end_block) (void)interpret(end_block);
+ exit(0);
+}
+
+/* These exit values are arbitrary */
+/*VARARGS1*/
+panic(str,arg)
+char *str;
+{
+ fprintf(stderr,"%s: ",myname);
+ fprintf(stderr,str,arg);
+ fprintf(stderr,"\n");
+ exit(12);
+}
+
+er_panic(str)
+char *str;
+{
+ fprintf(stderr,"%s: ",myname);
+ perror(str);
+ exit(15);
+}
+
+usage()
+{
+ fprintf(stderr,"%s: usage: %s {-f progfile | program } [-F{c} -R{c}] file . . .\n",myname,myname);
+ exit(11);
+}
+
+
+/* This allocates a new node of type ty. Note that this node will not go
+ away unless freed, so don't use it for tmp storage */
+NODE *
+newnode(ty)
+NODETYPE ty;
+{
+ register NODE *r;
+
+ r=(NODE *)malloc(sizeof(NODE));
+ if(r==NULL)
+ abort();
+ r->type=ty;
+ return r;
+}
+
+
+/* Duplicate a node. (For global strings, "duplicate" means crank up
+ the reference count.) This creates global nodes. . .*/
+NODE *
+dupnode(n)
+NODE *n;
+{
+ register NODE *r;
+
+ if(n->type==Node_string) {
+ n->stref++;
+ return n;
+ } else if(n->type==Node_temp_string) {
+ r=newnode(Node_string);
+ r->stlen=n->stlen;
+ r->stref=1;
+ r->stptr=malloc(n->stlen+1);
+ if(r->stptr==NULL)
+ abort();
+ bcopy (n->stptr, r->stptr, n->stlen);
+ r->stptr[r->stlen]='\0'; /* JF for hackval */
+ return r;
+ } else {
+ r=newnode(Node_illegal);
+ *r= *n;
+ return r;
+ }
+}
+
+/* This allocates a node with defined lnode and rnode. */
+/* This should only be used by yyparse+co while
+ reading in the program */
+NODE *
+node (left, op, right)
+ NODE *left, *right;
+ NODETYPE op;
+{
+ register NODE *r;
+
+ r = (NODE *)obstack_alloc(&other_stack,sizeof(NODE));
+ r->type=op;
+ r->lnode = left;
+ r->rnode = right;
+ return r;
+}
+
+/* This allocates a node with defined subnode and proc */
+/* Otherwise like node() */
+NODE *
+snode(subn, op, procp)
+NODETYPE op;
+NODE *(*procp)();
+NODE *subn;
+{
+ register NODE *r;
+
+ r=(NODE *)obstack_alloc(&other_stack,sizeof(NODE));
+ r->type=op;
+ r->subnode=subn;
+ r->proc=procp;
+ return r;
+}
+
+/* (jfw) This allocates a Node_line_range node
+ * with defined condpair and zeroes the trigger word
+ * to avoid the temptation of assuming that calling
+ * 'node( foo, Node_line_range, 0)' will properly initialize 'triggered'.
+ */
+/* Otherwise like node() */
+NODE *
+mkrangenode(cpair)
+NODE *cpair;
+{
+ register NODE *r;
+
+ r=(NODE *)obstack_alloc(&other_stack,sizeof(NODE));
+ r->type=Node_line_range;
+ r->condpair=cpair;
+ r->triggered = 0;
+ return r;
+}
+
+/* this allocates a node with defined numbr */
+/* This creates global nodes! */
+NODE *
+make_number (x)
+ AWKNUM x;
+{
+ register NODE *r;
+
+ r=newnode(Node_number);
+ r->numbr = x;
+ return r;
+}
+
+/* This creates temporary nodes. They go away quite quicly, so
+ don't use them for anything important */
+#ifndef FAST
+NODE *
+tmp_number(x)
+AWKNUM x;
+{
+#ifdef DONTDEF
+ return make_number(x);
+#endif
+ NODE *r;
+
+ r=(NODE *)obstack_alloc(&temp_strings,sizeof(NODE));
+ r->type=Node_number;
+ r->numbr=x;
+ return r;
+}
+#endif
+
+/* Make a string node. If len==0, the string passed in S is supposed to end
+ with a double quote, but have had the beginning double quote
+ already stripped off by yylex.
+ If LEN!=0, we don't care what s ends with. This creates a global node */
+
+NODE *
+make_string (s,len)
+ char *s;
+{
+ register NODE *r;
+ register char *pf,*pt;
+ register int c;
+
+ /* the aborts are impossible because yylex is supposed to have
+ already checked for unterminated strings */
+ if(len==-1) { /* Called from yyparse, find our own len */
+#ifndef FAST
+ if (s[-1] != '\"') /* Didn't start with " */
+ abort ();
+#endif
+
+ for(pf = pt = s; *pf != '\0' && *pf!='\"';) {
+ c= *pf++;
+ switch(c) {
+#ifndef FAST
+ case '\0':
+ abort();
+#endif
+
+ case '\\':
+#ifndef FAST
+ if(*pf=='\0')
+ abort();
+#endif
+
+ c= *pf++;
+ switch(c) {
+ case '\\': /* no massagary needed */
+ case '\'':
+ case '\"':
+ break;
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ c-='0';
+ while(*pf && *pf>='0' && *pf<='7') {
+ c=c*8+ *pf++ - '0';
+ }
+ break;
+ case 'b':
+ c='\b';
+ break;
+ case 'f':
+ c='\f';
+ break;
+ case 'n':
+ c='\n';
+ break;
+ case 'r':
+ c='\r';
+ break;
+ case 't':
+ c='\t';
+ break;
+ case 'v':
+ c='\v';
+ break;
+ default:
+ *pt++='\\';
+ break;
+ }
+ /* FALL THROUGH */
+ default:
+ *pt++=c;
+ break;
+ }
+ }
+#ifndef FAST
+ if(*pf=='\0')
+ abort(); /* JF hit the end of the buf */
+#endif
+ len = pt - s; /* JF was p - s - 1 */
+ }
+
+ r=newnode(Node_string);
+ r->stptr=(char *)malloc(len+1);
+ if(r->stptr==0)
+ abort();
+ r->type=Node_string;
+ r->stlen=len;
+ r->stref=1;
+ bcopy (s, r->stptr, len);
+ r->stptr[len]='\0'; /* JF a hack */
+
+ return r;
+}
+
+/* #ifndef FAST */
+/* This should be a macro for speed, but the C compiler chokes. */
+/* Read the warning under tmp_number */
+NODE *
+tmp_string(s,len)
+char *s;
+{
+ register NODE *r;
+
+#ifdef DONTDEF
+ return make_string(s,len);
+#endif
+ r=(NODE *)obstack_alloc(&temp_strings,sizeof(NODE));
+ r->stptr=(char *)obstack_alloc(&temp_strings,len+1);
+ r->type=Node_temp_string;
+ r->stlen=len;
+ r->stref=1;
+ bcopy (s, r->stptr, len);
+ r->stptr[len]='\0'; /* JF a hack */
+
+ return r;
+}
+/* #endif */
+
+/* Generate compiled regular expressions */
+struct re_pattern_buffer *
+make_regexp (s)
+ char *s;
+{
+ typedef struct re_pattern_buffer RPAT;
+ RPAT *rp;
+ char *p, *err;
+
+ rp = (RPAT *) obstack_alloc(&other_stack, sizeof (RPAT));
+ bzero((char *)rp,sizeof(RPAT));
+ rp->buffer = (char *)malloc(8); /* JF I'd obstack allocate it,
+ except the regex routines
+ try to realloc() it, which fails. */
+ /* Note that this means it may never be freed. Someone fix, please? */
+
+ rp->allocated = 8;
+ rp->fastmap = (char *)obstack_alloc(&other_stack, 256);
+
+ for (p = s; *p != '\0'; p++) {
+ if (*p == '\\')
+ p++;
+ else if (*p == '/')
+ break;
+ }
+#ifndef FAST
+ if (*p != '/')
+ abort (); /* impossible */
+#endif
+
+ /* JF was re_compile_pattern, but that mishandles ( ) and |,
+ so I had to write my own front end. Sigh. */
+
+ if ((err = re_compile_pattern (s, p - s, rp)) != NULL) {
+ fprintf (stderr, "illegal regexp: ");
+ yyerror (err); /* fatal */
+ }
+
+ return rp;
+}
+
+/* Build a for loop */
+FOR_LOOP_HEADER *
+make_for_loop (init, cond, incr)
+ NODE *init, *cond, *incr;
+{
+ register FOR_LOOP_HEADER *r;
+
+ r = (FOR_LOOP_HEADER *)obstack_alloc(&other_stack,sizeof (FOR_LOOP_HEADER));
+ r->init = init;
+ r->cond = cond;
+ r->incr = incr;
+ return r;
+}
+
+/* Name points to a variable name. Make sure its in the symbol table */
+NODE *
+variable (name)
+ char *name;
+{
+ register NODE *r;
+ NODE *lookup(), *install();
+
+ if ((r = lookup (variables, name)) == NULL) {
+ r = install (variables, name, node(Nnull_string, Node_var, (NODE *)NULL));
+ /* JF make_number (0.0) is WRONG */
+ }
+ return r;
+}
+
+/* Create a special variable */
+NODE *
+spc_var (name,value)
+char *name;
+NODE *value;
+{
+ register NODE *r;
+ NODE *lookup(), *install();
+
+ if ((r = lookup(variables, name)) == NULL)
+ r = install (variables, name, node(value, Node_var, (NODE *)NULL));
+ return r;
+}
+
+
+/*
+ * Install a name in the hash table specified, even if it is already there.
+ * Name stops with first non alphanumeric.
+ * Caller must check against redefinition if that is desired.
+ */
+NODE *
+install (table, name, value)
+ HASHNODE **table;
+ char *name;
+ NODE *value;
+{
+ register HASHNODE *hp;
+ register int i, len, bucket;
+ register char *p;
+
+ len = 0;
+ p = name;
+ while (is_identchar(*p))
+ p++;
+ len = p - name;
+
+ i = sizeof (HASHNODE) + len + 1;
+ hp = (HASHNODE *)obstack_alloc(&other_stack,i);
+ bucket = hashf(name, len, HASHSIZE);
+ hp->next = table[bucket];
+ table[bucket] = hp;
+ hp->length = len;
+ hp->value = value;
+ hp->name = ((char *) hp) + sizeof (HASHNODE);
+ hp->length = len;
+ bcopy (name, hp->name, len);
+ return hp->value;
+}
+
+/*
+ * find the most recent hash node for name name (ending with first
+ * non-identifier char) installed by install
+ */
+NODE *
+lookup (table, name)
+ HASHNODE **table;
+ char *name;
+{
+ register char *bp;
+ register HASHNODE *bucket;
+ register int len;
+
+ for (bp = name; is_identchar(*bp); bp++)
+ ;
+ len = bp - name;
+ bucket = table[hashf(name, len, HASHSIZE)];
+ while (bucket) {
+ if (bucket->length == len && strncmp(bucket->name, name, len) == 0)
+ return bucket->value;
+ bucket = bucket->next;
+ }
+ return NULL;
+}
+
+#define HASHSTEP(old, c) ((old << 1) + c)
+#define MAKE_POS(v) (v & ~0x80000000) /* make number positive */
+
+/*
+ * return hash function on name. must be compatible with the one
+ * computed a step at a time, elsewhere (JF: Where? I can't find it!)
+ */
+int
+hashf(name, len, hashsize)
+ register char *name;
+ register int len;
+ int hashsize;
+{
+ register int r = 0;
+
+ while (len--)
+ r = HASHSTEP(r, *name++);
+
+ return MAKE_POS(r) % hashsize;
+}
+
+/* Add new to the rightmost branch of LIST. This uses n^2 time, but
+ doesn't get used enough to make optimizing worth it. . . */
+/* You don't believe me? Profile it yourself! */
+
+NODE *
+append_right(list,new)
+NODE *list,*new;
+{
+ register NODE *oldlist;
+
+ oldlist = list;
+ while(list->rnode!=NULL)
+ list=list->rnode;
+ list->rnode = new;
+ return oldlist;
+}
diff --git a/awk2.c b/awk2.c
new file mode 100644
index 00000000..8f29e312
--- /dev/null
+++ b/awk2.c
@@ -0,0 +1,1129 @@
+/*
+ * awk2 --- gawk parse tree interpreter
+ *
+ * Copyright (C) 1986 Free Software Foundation
+ * Written by Paul Rubin, August 1986
+ *
+ */
+
+/*
+GAWK is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY. No author or distributor accepts responsibility to anyone
+for the consequences of using it or for whether it serves any
+particular purpose or works at all, unless he says so in writing.
+Refer to the GAWK General Public License for full details.
+
+Everyone is granted permission to copy, modify and redistribute GAWK,
+but only under the conditions described in the GAWK General Public
+License. A copy of this license is supposed to have been given to you
+along with GAWK so you can know your rights and responsibilities. It
+should be in a file named COPYING. Among other things, the copyright
+notice and this notice must be preserved on all copies.
+
+In other words, go ahead and share GAWK, but don't try to stop
+anyone else from sharing it farther. Help stamp out software hoarding!
+*/
+
+#include <setjmp.h>
+#include <stdio.h>
+
+#ifdef SYSV
+/* nasty nasty berkelixm */
+#define _setjmp setjmp
+#define _longjmp longjmp
+#endif
+
+#include "awk.h"
+
+NODE **get_lhs();
+
+extern NODE dumb[],*OFMT_node;
+/* BEGIN and END blocks need special handling, because we are handed them
+ * as raw Node_statement_lists, not as Node_rule_lists (jfw)
+ */
+extern NODE *begin_block, *end_block;
+NODE *do_sprintf();
+extern struct obstack other_stack;
+
+
+#define min(a,b) ((a) < (b) ? (a) : (b))
+
+/* More of that debugging stuff */
+#ifdef FAST
+#define DEBUG(X)
+#else
+#define DEBUG(X) print_debug X
+#endif
+
+/* longjmp return codes, must be nonzero */
+/* Continue means either for loop/while continue, or next input record */
+#define TAG_CONTINUE 1
+/* Break means either for/while break, or stop reading input */
+#define TAG_BREAK 2
+
+/* the loop_tag_valid variable allows continue/break-out-of-context
+ * to be caught and diagnosed (jfw) */
+#define PUSH_BINDING(stack, x) (bcopy ((char *)(x), (char *)(stack), sizeof (jmp_buf)), loop_tag_valid++)
+#define RESTORE_BINDING(stack, x) (bcopy ((char *)(stack), (char *)(x), sizeof (jmp_buf)), loop_tag_valid--)
+
+/* for "for(iggy in foo) {" */
+struct search {
+ int numleft;
+ AHASH **arr_ptr;
+ AHASH *bucket;
+ NODE *symbol;
+ NODE *retval;
+};
+
+struct search *assoc_scan(),*assoc_next();
+/* Tree is a bunch of rules to run.
+ Returns zero if it hit an exit() statement */
+interpret (tree)
+ NODE *tree;
+{
+ register NODE *t; /* temporary */
+
+ auto jmp_buf loop_tag_stack; /* shallow binding stack for loop_tag */
+ static jmp_buf loop_tag; /* always the current binding */
+ static int loop_tag_valid = 0;/* nonzero when loop_tag valid (jfw) */
+
+ static jmp_buf rule_tag; /* tag the rule currently being run,
+ for NEXT and EXIT statements. It is
+ static because there are no nested rules */
+
+ register NODE **lhs; /* lhs == Left Hand Side for assigns, etc */
+ register struct search *l; /* For array_for */
+
+
+ extern struct obstack temp_strings;
+ extern char *ob_dummy;
+ NODE *do_printf();
+
+ /* clean up temporary strings created by evaluating expressions in
+ previous recursive calls */
+ obstack_free (&temp_strings, ob_dummy);
+
+ if(tree == NULL)
+ return 1;
+ switch (tree->type) {
+#ifndef FAST
+ /* Can't run these! */
+ case Node_illegal:
+ case Node_rule_node:
+ case Node_if_branches:
+ case Node_expression_list:
+ case Node_K_BEGIN:
+ case Node_K_END:
+ case Node_redirect_output:
+ case Node_redirect_append:
+ case Node_redirect_pipe:
+ case Node_var_array:
+ abort();
+#endif
+
+ case Node_rule_list:
+ for (t = tree; t != NULL; t = t->rnode) {
+ switch (_setjmp(rule_tag)) {
+ case 0: /* normal non-jump */
+ if (eval_condition (t->lnode->lnode)) {
+ DEBUG(("Found a rule",t->lnode->rnode));
+ if (t->lnode->rnode == NULL) {
+ /* special case: pattern with no action is equivalent to
+ * an action of {print} (jfw) */
+ NODE printnode;
+ printnode.type = Node_K_print;
+ printnode.lnode = NULL;
+ printnode.rnode = NULL;
+ hack_print_node(&printnode);
+ } else
+ (void)interpret (t->lnode->rnode);
+ }
+ break;
+ case TAG_CONTINUE: /* NEXT statement */
+ return 1;
+ case TAG_BREAK:
+ return 0;
+ }
+ }
+ break;
+
+ case Node_statement_list:
+ /* print_a_node(tree); */
+ /* because BEGIN and END do not have Node_rule_list nature, yet can
+ * have exits and nexts, we special-case a setjmp of rule_tag here.
+ * (jfw)
+ */
+ if (tree == begin_block || tree == end_block) {
+ switch (_setjmp(rule_tag)) {
+ case TAG_CONTINUE: /* next */
+ panic("unexpected next");
+ return 1;
+ case TAG_BREAK: return 0;
+ }
+ }
+ for (t = tree; t != NULL; t = t->rnode) {
+ DEBUG(("Statements",t->lnode));
+ (void)interpret (t->lnode);
+ }
+ break;
+
+ case Node_K_if:
+ DEBUG(("IF",tree->lnode));
+ if (eval_condition(tree->lnode)) {
+ DEBUG(("True",tree->rnode->lnode));
+ (void)interpret (tree->rnode->lnode);
+ } else {
+ DEBUG(("False",tree->rnode->rnode));
+ (void)interpret (tree->rnode->rnode);
+ }
+ break;
+
+ case Node_K_while:
+ PUSH_BINDING (loop_tag_stack, loop_tag);
+
+ DEBUG(("WHILE",tree->lnode));
+ while (eval_condition (tree->lnode)) {
+ switch (_setjmp (loop_tag)) {
+ case 0: /* normal non-jump */
+ DEBUG(("DO",tree->rnode));
+ (void)interpret (tree->rnode);
+ break;
+ case TAG_CONTINUE: /* continue statement */
+ break;
+ case TAG_BREAK: /* break statement */
+ RESTORE_BINDING (loop_tag_stack, loop_tag);
+ return 1;
+#ifndef FAST
+ default:
+ abort (); /* never happens */
+#endif
+ }
+ }
+ RESTORE_BINDING (loop_tag_stack, loop_tag);
+ break;
+
+ case Node_K_for:
+ PUSH_BINDING (loop_tag_stack, loop_tag);
+
+ DEBUG(("FOR",tree->forloop->init));
+ (void)interpret (tree->forloop->init);
+
+ DEBUG(("FOR.WHILE",tree->forloop->cond));
+ while (eval_condition (tree->forloop->cond)) {
+ switch (_setjmp (loop_tag)) {
+ case 0: /* normal non-jump */
+ DEBUG(("FOR.DO",tree->lnode));
+ (void)interpret (tree->lnode);
+ /* fall through */
+ case TAG_CONTINUE: /* continue statement */
+ DEBUG(("FOR.INCR",tree->forloop->incr));
+ (void)interpret (tree->forloop->incr);
+ break;
+ case TAG_BREAK: /* break statement */
+ RESTORE_BINDING (loop_tag_stack, loop_tag);
+ return 1;
+#ifndef FAST
+ default:
+ abort (); /* never happens */
+#endif
+ }
+ }
+ RESTORE_BINDING (loop_tag_stack, loop_tag);
+ break;
+
+ case Node_K_arrayfor:
+#define hakvar forloop->init
+#define arrvar forloop->incr
+ PUSH_BINDING(loop_tag_stack, loop_tag);
+ DEBUG(("AFOR.VAR",tree->hakvar));
+ lhs=get_lhs(tree->hakvar);
+ do_deref();
+ for(l=assoc_scan(tree->arrvar);l;l=assoc_next(l)) {
+ *lhs=dupnode(l->retval);
+ DEBUG(("AFOR.NEXTIS",*lhs));
+ switch(_setjmp(loop_tag)) {
+ case 0:
+ DEBUG(("AFOR.DO",tree->lnode));
+ (void)interpret(tree->lnode);
+ case TAG_CONTINUE:
+ break;
+
+ case TAG_BREAK:
+ RESTORE_BINDING(loop_tag_stack, loop_tag);
+ return 1;
+#ifndef FAST
+ default:
+ abort();
+#endif
+ }
+ }
+ RESTORE_BINDING(loop_tag_stack, loop_tag);
+ break;
+
+ case Node_K_break:
+ DEBUG(("BREAK",NULL));
+ if (loop_tag_valid == 0) /* jfw */
+ panic("unexpected break or continue");
+ _longjmp (loop_tag, TAG_BREAK);
+ break;
+
+ case Node_K_continue:
+ DEBUG(("CONTINUE",NULL));
+ if (loop_tag_valid == 0) /* jfw */
+ panic("unexpected break or continue");
+ _longjmp (loop_tag, TAG_CONTINUE);
+ break;
+
+ case Node_K_print:
+ DEBUG(("PRINT",tree));
+ (void)hack_print_node (tree);
+ break;
+
+ case Node_K_printf:
+ DEBUG(("PRINTF",tree));
+ (void)do_printf(tree);
+ break;
+
+ case Node_K_next:
+ DEBUG(("NEXT",NULL));
+ _longjmp (rule_tag, TAG_CONTINUE);
+ break;
+
+ case Node_K_exit:
+ /* The unix awk doc says to skip the rest of the input. Does that
+ mean after performing all the rules on the current line?
+ Unix awk quits immediately, so this does too. */
+ /* The UN*X exit can also take an optional arg return code. We don't */
+ /* Well, we parse it, but never *DO* it */
+ DEBUG(("EXIT",NULL));
+ _longjmp (rule_tag, TAG_BREAK);
+ break;
+
+ default:
+ /* Appears to be an expression statement. Throw away the value. */
+ DEBUG(("E",NULL));
+ (void)tree_eval (tree);
+ break;
+ }
+ return 1;
+}
+
+/* evaluate a subtree, allocating strings on a temporary stack. */
+/* This used to return a whole NODE, instead of a ptr to one, but that
+ led to lots of obnoxious copying. I got rid of it (JF) */
+NODE *
+tree_eval (tree)
+ NODE *tree;
+{
+ register NODE *r, *t1, *t2; /* return value and temporary subtrees */
+ register NODE **lhs;
+ static AWKNUM x; /* Why are these static? */
+ extern struct obstack temp_strings;
+
+ if(tree == NULL) {
+ DEBUG(("NULL",NULL));
+ return Nnull_string;
+ }
+ switch (tree->type) {
+ /* trivial data */
+ case Node_string:
+ case Node_number:
+ DEBUG(("DATA",tree));
+ return tree;
+
+ /* Builtins */
+ case Node_builtin:
+ DEBUG(("builtin",tree));
+ return ((*tree->proc)(tree->subnode));
+
+ /* unary operations */
+
+ case Node_var:
+ case Node_subscript:
+ case Node_field_spec:
+ DEBUG(("var_type ref",tree));
+ lhs=get_lhs(tree);
+ return *lhs;
+
+ case Node_preincrement:
+ case Node_predecrement:
+ DEBUG(("+-X",tree));
+ lhs=get_lhs(tree->subnode);
+ assign_number(lhs,force_number(*lhs) + (tree->type==Node_preincrement ? 1.0 : -1.0));
+ return *lhs;
+
+ case Node_postincrement:
+ case Node_postdecrement:
+ DEBUG(("X+-",tree));
+ lhs=get_lhs(tree->subnode);
+ x = force_number(*lhs);
+ assign_number (lhs, x + (tree->type==Node_postincrement ? 1.0 : -1.0));
+ return tmp_number(x);
+
+ case Node_unary_minus:
+ DEBUG(("UMINUS",tree));
+ return tmp_number(-force_number(tree_eval(tree->subnode)));
+
+ /* assignments */
+ case Node_assign:
+ DEBUG(("ASSIGN",tree));
+ r = tree_eval (tree->rnode);
+ lhs=get_lhs(tree->lnode);
+ *lhs= dupnode(r);
+ do_deref();
+ /* FOO we have to regenerate $0 here! */
+ if(tree->lnode->type==Node_field_spec)
+ fix_fields();
+ return r;
+ /* other assignment types are easier because they are numeric */
+ case Node_assign_times:
+ r = tree_eval (tree->rnode);
+ lhs=get_lhs(tree->lnode);
+ assign_number(lhs, force_number(*lhs) * force_number(r));
+ do_deref();
+ return r;
+
+ case Node_assign_quotient:
+ r = tree_eval (tree->rnode);
+ lhs=get_lhs(tree->lnode);
+ assign_number(lhs, force_number(*lhs) / force_number(r));
+ do_deref();
+ return r;
+
+ case Node_assign_mod:
+ r = tree_eval (tree->rnode);
+ lhs=get_lhs(tree->lnode);
+ assign_number(lhs, (AWKNUM)(((int) force_number(*lhs)) % ((int) force_number(r))));
+ do_deref();
+ return r;
+
+ case Node_assign_plus:
+ r = tree_eval (tree->rnode);
+ lhs=get_lhs(tree->lnode);
+ assign_number(lhs, force_number(*lhs) + force_number(r));
+ do_deref();
+ return r;
+
+ case Node_assign_minus:
+ r = tree_eval (tree->rnode);
+ lhs=get_lhs(tree->lnode);
+ assign_number(lhs, force_number(*lhs) - force_number(r));
+ do_deref();
+ return r;
+ }
+ /* Note that if TREE is invalid, gAWK will probably bomb in one of these
+ tree_evals here. */
+ /* evaluate subtrees in order to do binary operation, then keep going */
+ t1 = tree_eval (tree->lnode);
+ t2 = tree_eval (tree->rnode);
+
+ switch (tree->type) {
+
+ case Node_concat:
+ t1=force_string(t1);
+ t2=force_string(t2);
+
+ r=(NODE *)obstack_alloc(&temp_strings,sizeof(NODE));
+ r->type=Node_temp_string;
+ r->stlen=t1->stlen+t2->stlen;
+ r->stref=1;
+ r->stptr=(char *)obstack_alloc(&temp_strings,r->stlen+1);
+ bcopy(t1->stptr,r->stptr,t1->stlen);
+ bcopy(t2->stptr,r->stptr+t1->stlen,t2->stlen);
+ r->stptr[r->stlen]='\0';
+ return r;
+
+ case Node_times:
+ return tmp_number(force_number(t1) * force_number(t2));
+
+ case Node_quotient:
+ x=force_number(t2);
+ if(x==(AWKNUM)0) return tmp_number((AWKNUM)0);
+ else return tmp_number(force_number(t1) / x);
+
+ case Node_mod:
+ x=force_number(t2);
+ if(x==(AWKNUM)0) return tmp_number((AWKNUM)0);
+ return tmp_number((AWKNUM) /* uggh... */
+ (((int) force_number(t1)) % ((int) x)));
+
+ case Node_plus:
+ return tmp_number(force_number(t1) + force_number(t2));
+
+ case Node_minus:
+ return tmp_number(force_number(t1) - force_number(t2));
+
+#ifndef FAST
+ default:
+ fprintf (stderr, "internal error: illegal numeric operation\n");
+ abort ();
+#endif
+ }
+ return 0;
+}
+
+/* We can't dereference a variable until after we've given it its new value.
+ This variable points to the value we have to free up */
+NODE *deref;
+
+/* This returns a POINTER to a node pointer.
+ *get_lhs(ptr) is the current value of the var, or where to store the
+ var's new value */
+
+NODE **
+get_lhs(ptr)
+NODE *ptr;
+{
+ register NODE *subexp;
+ register NODE **aptr;
+ register int num;
+ extern NODE **fields_arr;
+ extern f_arr_siz;
+ NODE **assoc_lookup();
+ extern char f_empty[]; /* jfw */
+
+#ifndef FAST
+ if(ptr == NULL)
+ abort();
+#endif
+ deref = NULL;
+ switch(ptr->type) {
+ case Node_var:
+ deref=ptr->var_value;
+ return &(ptr->var_value);
+
+ case Node_field_spec:
+ num=(int)force_number(tree_eval(ptr->lnode));
+ if(num<0) num=0; /* JF what should I do? */
+ if(num>f_arr_siz)
+ set_field(num,f_empty,0); /* jfw: so blank_strings can be simpler */
+ deref = NULL;
+ return &fields_arr[num];
+
+ case Node_subscript:
+ subexp = tree_eval(ptr->rnode);
+ aptr=assoc_lookup(ptr->lnode,subexp);
+ deref= *aptr;
+ return aptr;
+ }
+#ifndef FAST
+ abort();
+ return 0;
+#endif
+}
+
+do_deref()
+{
+ if(deref) {
+ switch(deref->type) {
+ case Node_string:
+ if(deref!=Nnull_string)
+ FREE_ONE_REFERENCE(deref);
+ break;
+ case Node_number:
+ free((char *)deref);
+ break;
+#ifndef FAST
+ default:
+ abort();
+#endif
+ }
+ deref = 0;
+ }
+}
+
+/* This makes numeric operations slightly more efficient.
+ Just change the value of a numeric node, if possible */
+assign_number (ptr, value)
+NODE **ptr;
+AWKNUM value;
+{
+ switch ((*ptr)->type) {
+ case Node_string:
+ if(*ptr!=Nnull_string)
+ FREE_ONE_REFERENCE (*ptr);
+ case Node_temp_string: /* jfw: dont crash if we say $2 += 4 */
+ *ptr=make_number(value);
+ return;
+ case Node_number:
+ (*ptr)->numbr = value;
+ deref=0;
+ break;
+#ifndef FAST
+ default:
+ printf("assign_number nodetype %d\n", (*ptr)->type); /* jfw: add mesg. */
+ abort ();
+#endif
+ }
+}
+
+
+/* Routines to deal with fields */
+#define ORIG_F 30
+
+NODE **fields_arr;
+NODE *fields_nodes;
+int f_arr_siz;
+char f_empty [] = "";
+
+init_fields()
+{
+ register NODE **tmp;
+ register NODE *xtmp;
+
+ f_arr_siz=ORIG_F;
+ fields_arr=(NODE **)malloc(ORIG_F * sizeof(NODE *));
+ fields_nodes=(NODE *)malloc(ORIG_F * sizeof(NODE));
+ tmp= &fields_arr[f_arr_siz];
+ xtmp= &fields_nodes[f_arr_siz];
+ while(--tmp>= &fields_arr[0]) {
+ --xtmp;
+ *tmp=xtmp;
+ xtmp->type=Node_temp_string;
+ xtmp->stlen=0;
+ xtmp->stref=1;
+ xtmp->stptr=f_empty;
+ }
+}
+
+blank_fields()
+{
+ register NODE **tmp;
+ extern char *parse_end;
+
+ tmp= &fields_arr[f_arr_siz];
+ while(--tmp>= &fields_arr[0]) {
+ switch(tmp[0]->type) {
+ case Node_number:
+ free((char *)*tmp);
+ *tmp= &fields_nodes[tmp-fields_arr];
+ break;
+ case Node_string:
+ if(*tmp!=Nnull_string)
+ FREE_ONE_REFERENCE(*tmp);
+ *tmp= &fields_nodes[tmp-fields_arr];
+ break;
+ case Node_temp_string:
+ break;
+#ifndef FAST
+ default:
+ abort();
+#endif
+ }
+ if ((*tmp)->stptr != f_empty) { /* jfw */
+ /*Then it was assigned a string with set_field */
+ /*out of a private buffer to inrec, so don't free it*/
+ (*tmp)->stptr = f_empty;
+ (*tmp)->stlen = 0;
+ (*tmp)->stref = 1;
+ }
+ /* *tmp=Nnull_string; */
+ }
+ /* Free the strings */
+ obstack_free(&other_stack,parse_end);
+}
+
+/* Danger! Must only be called for fields we know have just been blanked,
+ or fields we know don't exist yet. */
+set_field(n,str,len)
+char *str;
+{
+ NODE *field_string();
+
+ if(n>f_arr_siz) {
+ int t;
+
+ fields_arr=(NODE **)realloc((char *)fields_arr,(n+1)*sizeof(NODE *));
+ fields_nodes=(NODE *)realloc((char *)fields_nodes,(n+1)*sizeof(NODE));
+ for(t=f_arr_siz;t<=n;t++) {
+ fields_arr[t]= &fields_nodes[t];
+ fields_nodes[t].type=Node_temp_string;
+ fields_nodes[t].stlen=0;
+ fields_nodes[t].stref=1;
+ fields_nodes[t].stptr=f_empty;
+ }
+ f_arr_siz=n+1;
+ }
+ fields_nodes[n].stlen=len;
+ if(n==0) {
+ fields_nodes[n].stptr=(char*)obstack_alloc(&other_stack,len+1);
+ bcopy(str,fields_nodes[n].stptr,len);
+ fields_nodes[n].stptr[len]='\0';
+ } else {
+ fields_nodes[n].stptr=str;
+ str[len]='\0';
+ }
+}
+
+#ifdef DONTDEF
+/* Nodes created with this will go away when the next input line is read */
+NODE *
+field_string(s,len)
+char *s;
+{
+ register NODE *r;
+
+ r=(NODE *)obstack_alloc(&other_stack,sizeof(NODE));
+ r->type=Node_temp_string;
+ r->stref=1;
+ r->stlen=len;
+ r->stptr=(char*)obstack_alloc(&other_stack,len+1);
+ bcopy(s,r->stptr,len);
+ /* r->stptr=s;
+ r->stptr[len]='\0'; */
+
+ return r;
+}
+#endif
+
+/* Someone assigned a value to $(something). Fix up $0 to be right */
+fix_fields()
+{
+ register int tlen;
+ register NODE *tmp;
+ NODE *ofs;
+ char *ops;
+ register char *cops;
+ register NODE **ptr,**maxp;
+ extern NODE *OFS_node;
+
+ maxp=0;
+ tlen=0;
+ ofs=force_string(*get_lhs(OFS_node));
+ ptr= &fields_arr[f_arr_siz];
+ while(--ptr> &fields_arr[0]) {
+ tmp=force_string(*ptr);
+ tlen+=tmp->stlen;
+ if(tmp->stlen && !maxp)
+ maxp=ptr;
+ }
+ if(!maxp) {
+ if (fields_arr[0] != fields_nodes)
+ FREE_ONE_REFERENCE(fields_arr[0]);
+ fields_arr[0]=Nnull_string;
+ return;
+ }
+
+ tlen+=((maxp-fields_arr)-1)*ofs->stlen;
+ ops=(char *)malloc(tlen+1);
+ cops=ops;
+ for(ptr= &fields_arr[1];ptr<=maxp;ptr++) {
+ tmp=force_string(*ptr);
+ bcopy(tmp->stptr,cops,tmp->stlen);
+ cops+=tmp->stlen;
+ if(ptr!=maxp) {
+ bcopy(ofs->stptr,cops,ofs->stlen);
+ cops+=ofs->stlen;
+ }
+ }
+ tmp=newnode(Node_string);
+ tmp->stptr=ops;
+ tmp->stlen=tlen;
+ tmp->stref=1;
+ tmp->stptr[tlen]='\0';
+ /* don't free unless it's new */
+ if (fields_arr[0] != fields_nodes)
+ FREE_ONE_REFERENCE(fields_arr[0]);
+ fields_arr[0]=tmp;
+}
+
+
+/* Is TREE true or false? Returns 0==false, non-zero==true */
+int
+eval_condition (tree)
+NODE *tree;
+{
+ register int di;
+ register NODE *t1,*t2;
+
+ if(tree==NULL) /* Null trees are the easiest kinds */
+ return 1;
+ switch (tree->type) {
+ /* Maybe it's easy; check and see. */
+ /* BEGIN and END are always false */
+ case Node_K_BEGIN:
+ return 0;
+ break;
+
+ case Node_K_END:
+ return 0;
+ break;
+
+ case Node_and:
+ return eval_condition (tree->lnode)
+ && eval_condition (tree->rnode);
+
+ case Node_or:
+ return eval_condition (tree->lnode)
+ || eval_condition (tree->rnode);
+
+ case Node_not:
+ return !eval_condition (tree->lnode);
+
+ /* Node_line_range is kind of like Node_match, EXCEPT:
+ * the lnode field (more properly, the condpair field) is a node of
+ * a Node_cond_pair; whether we evaluate the lnode of that node or the
+ * rnode depends on the triggered word. More precisely: if we are not
+ * yet triggered, we tree_eval the lnode; if that returns true, we set
+ * the triggered word. If we are triggered (not ELSE IF, note), we
+ * tree_eval the rnode, clear triggered if it succeeds, and perform our
+ * action (regardless of success or failure). We want to be able to
+ * begin and end on a single input record, so this isn't an ELSE IF, as
+ * noted above.
+ * This feature was implemented by John Woods, jfw@eddie.mit.edu, during
+ * a rainy weekend.
+ */
+ case Node_line_range:
+ if (!tree->triggered)
+ if (!eval_condition(tree->condpair->lnode))
+ return 0;
+ else
+ tree->triggered = 1;
+ /* Else we are triggered */
+ if (eval_condition(tree->condpair->rnode))
+ tree->triggered = 0;
+ return 1;
+ }
+
+ /* Could just be J.random expression.
+ in which case, null and 0 are false,
+ anything else is true */
+
+ switch(tree->type) {
+ case Node_match:
+ case Node_nomatch:
+ case Node_equal:
+ case Node_notequal:
+ case Node_less:
+ case Node_greater:
+ case Node_leq:
+ case Node_geq:
+ break;
+
+ default: /* This is so 'if(iggy)', etc, will work */
+ /* Non-zero and non-empty are true */
+ t1=tree_eval(tree);
+ switch(t1->type) {
+ case Node_number:
+ return t1->numbr!=0.0;
+ case Node_string:
+ case Node_temp_string:
+ return t1->stlen!=0;
+#ifndef FAST
+ default:
+ abort();
+#endif
+ }
+ }
+ /* couldn't fob it off recursively, eval left subtree and
+ see if it's a pattern match operation */
+
+ t1 = tree_eval (tree->lnode);
+
+ if (tree->type == Node_match || tree->type == Node_nomatch) {
+ t1=force_string(t1);
+ return (re_search (tree->rereg, t1->stptr,
+ t1->stlen, 0, t1->stlen,
+ NULL) == -1)
+ ^ (tree->type == Node_match);
+ }
+
+ /* still no luck--- eval the right subtree and try binary ops */
+
+ t2 = tree_eval (tree->rnode);
+
+ di=cmp_nodes(t1,t2);
+
+ switch (tree->type) {
+ case Node_equal:
+ return di == 0;
+ case Node_notequal:
+ return di != 0;
+ case Node_less:
+ return di < 0;
+ case Node_greater:
+ return di > 0;
+ case Node_leq:
+ return di <= 0;
+ case Node_geq:
+ return di >= 0;
+#ifndef FAST
+ default:
+ fprintf(stderr,"Panic: unknown conditonal\n");
+ abort ();
+#endif
+ }
+ return 0;
+}
+
+/* FOO this doesn't properly compare "12.0" and 12.0 etc */
+/* or "1E1" and 10 etc */
+/* Perhaps someone should fix it. */
+/* Consider it fixed (jfw) */
+
+/* strtod() would have been better, except (1) real awk is needlessly
+ * restrictive in what strings it will consider to be numbers, and
+ * (2) I couldn't find the public domain version anywhere handy.
+ */
+is_a_number(str) /* does the string str have pure-numeric syntax? */
+char *str; /* don't convert it, assume that atof is better */
+{
+ if (*str == 0) return 1; /* null string has numeric value of0 */
+ /* This is still a bug: in real awk, an explicit "" string
+ * is not treated as a number. Perhaps it is only variables
+ * that, when empty, are also 0s. This bug-lette here at
+ * least lets uninitialized variables to compare equal to
+ * zero like they should.
+ */
+ if (*str == '-') str++;
+ if (*str == 0) return 0;
+ /* must be either . or digits (.4 is legal) */
+ if (*str != '.' && !isdigit(*str)) return 0;
+ while (isdigit(*str)) str++;
+ if (*str == '.') {
+ str++;
+ while (isdigit(*str)) str++;
+ }
+ /* curiously, real awk DOESN'T consider "1E1" to be equal to 10!
+ * Or even equal to 1E1 for that matter! For a laugh, try:
+ * awk 'BEGIN {if ("1E1" == 1E1) print "eq"; else print "neq";exit}'
+ * Since this behavior is QUITE curious, I include the code for the
+ * adventurous. One might also feel like skipping leading whitespace
+ * (awk doesn't) and allowing a leading + (awk doesn't).
+#ifdef Allow_Exponents
+ if (*str == 'e' || *str == 'E') {
+ str++;
+ if (*str == '+' || *str == '-') str++;
+ if (!isdigit(*str)) return 0;
+ while (isdigit(*str)) str++;
+ }
+#endif
+ /* if we have digested the whole string, we are successful */
+ return (*str == 0);
+}
+
+cmp_nodes(t1,t2)
+NODE *t1,*t2;
+{
+ register int di;
+ register AWKNUM d;
+
+
+ if(t1==t2) {
+ return 0;
+ }
+#ifndef FAST
+ if(!t1 || !t2) {
+ abort();
+ return t1 ? 1 : -1;
+ }
+
+#endif
+ if (t1->type == Node_number && t2->type == Node_number) {
+ d = t1->numbr - t2->numbr;
+ if (d < 0.0)
+ return -1;
+ if (d > 0.0)
+ return 1;
+ return 0;
+ }
+ t1=force_string(t1);
+ t2=force_string(t2);
+ /* "real" awk treats things as numbers if they both "look" like numbers. */
+ if (*t1->stptr && *t2->stptr /* don't allow both to be empty strings(jfw)*/
+ && is_a_number(t1->stptr) && is_a_number(t2->stptr)) {
+ double atof();
+ d = atof(t1->stptr) - atof(t2->stptr);
+ if (d < 0.0) return -1;
+ if (d > 0.0) return 1;
+ return 0;
+ }
+ di = strncmp (t1->stptr, t2->stptr, min (t1->stlen, t2->stlen));
+ if (di == 0)
+ di = t1->stlen - t2->stlen;
+ if(di>0) return 1;
+ if(di<0) return -1;
+ return 0;
+}
+
+
+#ifdef DONTDEF
+int primes[] = {31,61,127,257,509,1021,2053,4099,8191,16381};
+#endif
+
+/* routines for associative arrays. SYMBOL is the address of the node
+ (or other pointer) being dereferenced. SUBS is a number or string
+ used as the subscript. */
+
+/* #define ASSOC_HASHSIZE 1009 /* prime */
+#define ASSOC_HASHSIZE 29
+#define STIR_BITS(n) ((n) << 5 | (((n) >> 27) & 0x1f))
+#define HASHSTEP(old, c) ((old << 1) + c)
+#define MAKE_POS(v) (v & ~0x80000000) /* make number positive */
+
+/* static AHASH *assoc_table[ASSOC_HASHSIZE]; */
+
+
+/* Flush all the values in symbol[] before doing a split() */
+assoc_clear(symbol)
+NODE *symbol;
+{
+ int i;
+ AHASH *bucket,*next;
+
+ if(symbol->var_array==0)
+ return;
+ for(i=0;i<ASSOC_HASHSIZE;i++) {
+ for(bucket=symbol->var_array[i];bucket;bucket=next) {
+ next=bucket->next;
+ deref=bucket->name;
+ do_deref();
+ deref=bucket->value;
+ do_deref();
+ free((void *)bucket);
+ }
+ symbol->var_array[i]=0;
+ }
+}
+
+/* Find SYMBOL[SUBS] in the assoc array. Install it with value "" if it
+ isn't there. */
+/* Returns a pointer ala get_lhs to where its value is stored */
+NODE **
+assoc_lookup (symbol, subs)
+NODE *symbol,
+ *subs;
+{
+ int hash1 = 0, hashf(), i;
+ AHASH *bucket;
+ NODETYPE ty;
+
+ if(subs->type==Node_number) {
+ hash1=(int)subs->numbr;
+ ty=Node_number;
+ } else {
+ ty=Node_string;
+ subs=force_string(subs);
+ for(i=0;i<subs->stlen;i++)
+ hash1=HASHSTEP(hash1,subs->stptr[i]);
+
+ /* hash1 ^= (int) STIR_BITS((int)symbol); */
+ }
+ hash1 = MAKE_POS(STIR_BITS((int)hash1)) % ASSOC_HASHSIZE;
+
+ /* this table really should grow dynamically */
+ if(symbol->var_array==0) {
+ symbol->var_array=(AHASH **)malloc(sizeof(AHASH *)*ASSOC_HASHSIZE);
+ for(i=0;i<ASSOC_HASHSIZE;i++) {
+ symbol->var_array[i]=0;
+ }
+ } else {
+ for (bucket = symbol->var_array[hash1]; bucket; bucket = bucket->next) {
+ if (bucket->name->type!= ty || cmp_nodes(bucket->name,subs))
+ continue;
+ return &(bucket->value);
+ }
+ /* Didn't find it on first pass. Try again. */
+ for (bucket = symbol->var_array[hash1]; bucket; bucket = bucket->next) {
+ if (cmp_nodes(bucket->name,subs))
+ continue;
+ return &(bucket->value);
+ }
+ }
+ bucket = (AHASH *) malloc(sizeof (AHASH));
+ bucket->symbol = symbol;
+ bucket->name = dupnode(subs);
+ bucket->value = Nnull_string;
+ bucket->next = symbol->var_array[hash1];
+ symbol->var_array[hash1]=bucket;
+ return &(bucket->value);
+}
+
+struct search *
+assoc_scan(symbol)
+NODE *symbol;
+{
+ struct search *lookat;
+
+ if(!symbol->var_array)
+ return 0;
+ lookat=(struct search *)obstack_alloc(&other_stack,sizeof(struct search));
+ /* lookat->symbol=symbol; */
+ lookat->numleft=ASSOC_HASHSIZE;
+ lookat->arr_ptr=symbol->var_array;
+ lookat->bucket=symbol->var_array[0];
+ return assoc_next(lookat);
+}
+
+struct search *
+assoc_next(lookat)
+struct search *lookat;
+{
+ for(;lookat->numleft;lookat->numleft--) {
+ while(lookat->bucket!=0) {
+ lookat->retval=lookat->bucket->name;
+ lookat->bucket=lookat->bucket->next;
+ return lookat;
+ }
+ lookat->bucket= *++(lookat->arr_ptr);
+ }
+ return 0;
+}
+
+
+#ifdef FAST
+NODE *
+strforce(n)
+NODE *n;
+{
+ extern NODE dumb[],*OFMT_node;
+ NODE *do_sprintf();
+
+ dumb[1].lnode=n;
+ if(OFMT_node->var_value->type!=Node_string)
+ panic("Insane value for OFMT detected.");
+ return do_sprintf(&dumb[0]);
+}
+
+#else
+AWKNUM
+force_number (n)
+NODE *n;
+{
+ double atof(); /* Forgetting this is bad */
+
+ if(n==NULL)
+ abort();
+ switch (n->type) {
+ case Node_number:
+ return n->numbr;
+ case Node_string:
+ case Node_temp_string:
+ return atof(n->stptr);
+ default:
+ abort ();
+ }
+ return 0.0;
+}
+
+NODE *
+force_string(s)
+NODE *s;
+{
+ if(s==NULL)
+ abort();
+ switch(s->type) {
+ case Node_string:
+ case Node_temp_string:
+ return s;
+ case Node_number:
+ if((*get_lhs(OFMT_node))->type!=Node_string)
+ panic("Insane value for OFMT!",0);
+ dumb[1].lnode=s;
+ return do_sprintf(&dumb[0]);
+ default:
+ abort();
+ }
+ return NULL;
+}
+#endif
diff --git a/awk3.c b/awk3.c
new file mode 100644
index 00000000..1f58dfae
--- /dev/null
+++ b/awk3.c
@@ -0,0 +1,900 @@
+/* awk3.c -- Builtin functions and various utility procedures
+ Copyright (C) 1986,1987 Free Software Foundation
+ Written by Jay Fenlason, December 1986
+
+ */
+
+/*
+GAWK is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY. No author or distributor accepts responsibility to anyone
+for the consequences of using it or for whether it serves any
+particular purpose or works at all, unless he says so in writing.
+Refer to the GAWK General Public License for full details.
+
+Everyone is granted permission to copy, modify and redistribute GAWK,
+but only under the conditions described in the GAWK General Public
+License. A copy of this license is supposed to have been given to you
+along with GAWK so you can know your rights and responsibilities. It
+should be in a file named COPYING. Among other things, the copyright
+notice and this notice must be preserved on all copies.
+
+In other words, go ahead and share GAWK, but don't try to stop
+anyone else from sharing it farther. Help stamp out software hoarding!
+*/
+#include <stdio.h>
+#include "awk.h"
+
+#include <obstack.h>
+
+extern struct obstack temp_strings;
+
+/* This node is the cannonical null string, used everywhere */
+extern NODE *Nnull_string;
+
+/* These nodes store all the special variables gAWK uses */
+NODE *FS_node, *NF_node, *RS_node, *NR_node;
+NODE *FILENAME_node, *OFS_node, *ORS_node, *OFMT_node;
+
+/* This dumb kludge is used by force_string to turn a floating point
+ number into a string */
+NODE dumb[2];
+
+NODE **get_lhs();
+FILE *deal_redirect();
+
+struct redirect {
+ int flag; /* JF was NODETYPE */
+ NODE *value;
+ FILE *fp;
+};
+struct redirect reds[20]; /* An arbitrary limit, surely, but there's an
+ arbitrary limit on open files, too. So it
+ doesn't make much difference, does it? */
+
+
+long NR;
+int NF;
+
+/* The next #define tells how you find $0. Its a hack */
+extern NODE **fields_arr;
+#define WHOLELINE fields_arr[0]
+
+/* Set all the special variables to their initial values. Also sets up
+ the dumb[] array for force_string */
+init_vars()
+{
+ NODE *spc_var();
+ NODE *do_sprintf();
+
+ FS_node=spc_var("FS",make_string(" ",1));
+ NF_node=spc_var("NF",make_number(0.0));
+ RS_node=spc_var("RS",make_string("\n",1));
+ NR_node=spc_var("NR",make_number(0.0));
+ FILENAME_node=spc_var("FILENAME",Nnull_string);
+ OFS_node=spc_var("OFS",make_string(" ",1));
+ ORS_node=spc_var("ORS",make_string("\n",1));
+ OFMT_node=spc_var("OFMT",make_string("%.6g",4));
+
+ /* This ugly hack is used by force_string
+ to fake a call to sprintf */
+ dumb[0].type=Node_expression_list;
+ dumb[0].lnode=OFMT_node;
+ dumb[0].rnode= &dumb[1];
+ dumb[1].type=Node_expression_list;
+ dumb[1].lnode=(NODE *)0; /* fill in the var here */
+ dumb[1].rnode=(NODE *)0;
+ reds[0].flag=0; /* Don't depend on uninit data being zero, although it should be */
+}
+
+/* OFMT is special because we don't dare use force_string on it for fear of
+ infinite loops. Thus, if it isn't a string, we return the default "%.6g"
+ This may or may not be the right thing to do, but its the easiest */
+/* This routine isn't used! It should be. */
+char *get_ofmt()
+{
+ register NODE *tmp;
+
+ tmp= *get_lhs(OFMT_node);
+ if(tmp->type!=Node_string || tmp->stlen==0)
+ return "%.6g";
+ return tmp->stptr;
+}
+
+int
+get_fs()
+{
+ register NODE *tmp;
+
+ tmp=force_string(FS_node->var_value);
+ if(tmp->stlen==0) return 0;
+ return *(tmp->stptr);
+}
+
+set_fs(str)
+char *str;
+{
+ register NODE **tmp;
+
+ tmp= get_lhs(FS_node);
+ do_deref();
+ /* stupid special case so -F\t works as documented in awk */
+ /* even though the shell hands us -Ft. Bleah! (jfw) */
+ if (*str == 't') *str == '\t';
+ *tmp=make_string(str,1);
+}
+
+set_rs(str)
+char *str;
+{
+ register NODE **tmp;
+
+ tmp= get_lhs(RS_node);
+ do_deref();
+ /* stupid special case to be consistent with -F (jfw) */
+ if (*str == 't') *str == '\t';
+ *tmp=make_string(str,1);
+}
+
+
+int
+get_rs()
+{
+ register NODE *tmp;
+
+ tmp=force_string(RS_node->var_value);
+ if(tmp->stlen==0) return 0;
+ return *(tmp->stptr);
+}
+
+
+/* Builtin functions */
+NODE *
+do_exp(tree)
+NODE *tree;
+{
+ NODE *tmp;
+ double exp();
+
+ get_one(tree,&tmp);
+ return tmp_number(exp(force_number(tmp)));
+}
+
+/* JF: I don't know what this should return. */
+/* jfw: 1 if successful or by land, 0 if end of file or by sea */
+NODE *
+do_getline(tree)
+NODE *tree;
+{
+ if(inrec() == 0)
+ return tmp_number(1.0);
+ else
+ return tmp_number(0.0);
+}
+
+NODE *
+do_index(tree)
+NODE *tree;
+{
+ NODE *s1,*s2;
+ register char *p1,*p2;
+ register int l1,l2;
+
+ get_two(tree,&s1,&s2);
+ p1=s1->stptr;
+ p2=s2->stptr;
+ l1=s1->stlen;
+ l2=s2->stlen;
+ while(l1) {
+ if(!strncmp(p1,p2,l2))
+ return tmp_number((AWKNUM)(1+s1->stlen-l1));
+ l1--;
+ p1++;
+ }
+ return tmp_number(0.0);
+}
+
+NODE *
+do_int(tree)
+NODE *tree;
+{
+ NODE *tmp;
+ double floor();
+
+ get_one(tree,&tmp);
+ return tmp_number(floor(force_number(tmp)));
+}
+
+NODE *
+do_length(tree)
+NODE *tree;
+{
+ NODE *tmp;
+
+ get_one(tree,&tmp);
+ return tmp_number((AWKNUM)(force_string(tmp)->stlen));
+}
+
+NODE *
+do_log(tree)
+NODE *tree;
+{
+ NODE *tmp;
+ double log();
+
+ get_one(tree,&tmp);
+ return tmp_number(log(force_number(tmp)));
+}
+
+
+NODE *
+do_printf(tree)
+NODE *tree;
+{
+ register FILE *fp;
+ NODE *do_sprintf();
+
+ fp=deal_redirect(tree->rnode);
+ print_simple(do_sprintf(tree->lnode),fp);
+ return Nnull_string;
+}
+
+
+NODE *
+do_split(tree)
+NODE *tree;
+{
+ NODE *t1,*t2,*t3;
+ register int splitc;
+ register int num,snum,olds;
+ register char *ptr,*oldp;
+ NODE **assoc_lookup();
+
+ if(a_get_three(tree,&t1,&t2,&t3)<3)
+ splitc= get_fs();
+ else
+ splitc= *(force_string(t3)->stptr);
+ num=0;
+ tree=force_string(t1);
+ olds=snum=tree->stlen;
+ oldp=ptr=tree->stptr;
+ assoc_clear(t2);
+ while(snum--) {
+ if(*ptr++==splitc) {
+ *assoc_lookup(t2,make_number((AWKNUM)(++num)))=make_string(oldp,(olds-snum)-1);
+ oldp=ptr;
+ olds=snum;
+ }
+ }
+ *assoc_lookup(t2,make_number((AWKNUM)(++num)))=make_string(oldp,(olds-snum)-1);
+ return tmp_number((AWKNUM)num);
+}
+
+/* Note that the output buffer cannot be static because sprintf may get called
+ recursively by force_string. Hence the wasteful alloca calls */
+
+/* %e and %f formats are not properly implemented. Someone should fix them */
+NODE *
+do_sprintf(tree)
+NODE *tree;
+{
+#define bchunk(s,l) if(l) {\
+ if((l)>ofre) {\
+ char *tmp;\
+ tmp=(char *)alloca(osiz*2);\
+ bcopy(obuf,tmp,olen);\
+ obuf=tmp;\
+ ofre+=osiz;\
+ osiz*=2;\
+ }\
+ bcopy(s,obuf+olen,(l));\
+ olen+=(l);\
+ ofre-=(l);\
+ }
+
+/* Is there space for something L big in the buffer? */
+#define chksize(l) if((l)>ofre) {\
+ char *tmp;\
+ tmp=(char *)alloca(osiz*2);\
+ bcopy(obuf,tmp,olen);\
+ obuf=tmp;\
+ ofre+=osiz;\
+ osiz*=2;\
+ }
+/* Get the next arg to be formatted. If we've run out of args, return
+ "" (Null string) */
+#define parse_next_arg() {\
+ if(!carg) arg= Nnull_string;\
+ else {\
+ get_one(carg,&arg);\
+ carg=carg->rnode;\
+ }\
+ }
+
+ char *obuf;
+ int osiz,ofre,olen;
+ static char chbuf[] = "0123456789abcdef";
+ static char sp[] =" ";
+ char *s0,*s1;
+ int n0;
+ NODE *sfmt,*arg;
+ register NODE *carg;
+ long fw,prec,lj,alt,big;
+ long *cur;
+ long val;
+ unsigned long uval;
+ int sgn;
+ int base;
+ char cpbuf[30]; /* if we have numbers bigger than 30 */
+ char *cend= &cpbuf[30]; /* chars, we lose, but seems unlikely */
+ char *cp;
+ char *fill;
+ double tmpval;
+ char *pr_str;
+
+
+ obuf=(char *)alloca(120);
+ osiz=120;
+ ofre=osiz;
+ olen=0;
+ get_one(tree,&sfmt);
+ sfmt=force_string(sfmt);
+ carg=tree->rnode;
+ for(s0=s1=sfmt->stptr,n0=sfmt->stlen;n0-->0;) {
+ if(*s1!='%') {
+ s1++;
+ continue;
+ }
+
+ bchunk(s0,s1-s0);
+ s0=s1;
+ cur= &fw;
+ fw=0;
+ prec=0;
+ lj=alt=big=0;
+ fill= sp;
+ cp=cend;
+ s1++;
+
+ retry:
+ --n0;
+ switch(*s1++) {
+ case '%':
+ bchunk("%",1);
+ s0=s1;
+ break;
+
+ case '0':
+ if(fill!=sp || lj) goto lose;
+ fill="0"; /* FALL through */
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ if(cur==0)
+ goto lose;
+ *cur= s1[-1]-'0';
+ while(n0>0 && *s1>='0' && *s1<='9') {
+ --n0;
+ *cur= *cur * 10 + *s1++ - '0';
+ }
+ goto retry;
+ case '-':
+ if(lj || fill!=sp) goto lose;
+ lj++;
+ goto retry;
+ case '.':
+ if(cur!=&fw) goto lose;
+ cur= &prec;
+ goto retry;
+ case '#':
+ if(alt) goto lose;
+ alt++;
+ goto retry;
+ case 'l':
+ if(big) goto lose;
+ big++;
+ goto retry;
+ case '*':
+ if(cur==0) goto lose;
+ parse_next_arg();
+ *cur=(int)arg;
+ goto retry;
+ case 'c':
+ parse_next_arg();
+ if(arg->type==Node_number) {
+ uval=(unsigned long)arg->numbr;
+ cpbuf[0]=uval;
+ prec=1;
+ pr_str=cpbuf;
+ goto dopr_string;
+ }
+ if(!prec || prec>arg->stlen)
+ prec=arg->stlen;
+ pr_str=cpbuf;
+ goto dopr_string;
+ case 's':
+ parse_next_arg();
+ arg=force_string(arg);
+ if(!prec || prec>arg->stlen)
+ prec=arg->stlen;
+ pr_str=arg->stptr;
+
+ dopr_string:
+ if(fw>prec && !lj) {
+ while(fw>prec) {
+ bchunk(sp,1);
+ fw--;
+ }
+ }
+ bchunk(pr_str,(int)prec);
+ if(fw>prec) {
+ while(fw>prec) {
+ bchunk(sp,1);
+ fw--;
+ }
+ }
+ s0=s1;
+ break;
+ case 'd':
+ parse_next_arg();
+ val=(long)force_number(arg);
+ if(val<0) {
+ sgn=1;
+ val= -val;
+ } else sgn=0;
+ do {
+ *--cp='0'+val%10;
+ val/=10;
+ } while (val);
+ if(sgn) *--cp='-';
+ prec=cend-cp;
+ if(fw>prec && !lj) {
+ if(fill!=sp && *cp=='-') {
+ bchunk(cp,1);
+ cp++;
+ prec--;
+ fw--;
+ }
+ while(fw>prec) {
+ bchunk(fill,1);
+ fw--;
+ }
+ }
+ bchunk(cp,(int)prec);
+ if(fw>prec) {
+ while(fw>prec) {
+ bchunk(fill,1);
+ fw--;
+ }
+ }
+ s0=s1;
+ break;
+ case 'u':
+ base=10;
+ goto pr_unsigned;
+ case 'o':
+ base=8;
+ goto pr_unsigned;
+ case 'x':
+ base=16;
+ goto pr_unsigned;
+ pr_unsigned:
+ parse_next_arg();
+ uval=(unsigned long)force_number(arg);
+ do {
+ *--cp=chbuf[uval%base];
+ uval/=base;
+ } while(uval);
+ prec=cend-cp;
+ if(fw>prec && !lj) {
+ while(fw>prec) {
+ bchunk(fill,1);
+ fw--;
+ }
+ }
+ bchunk(cp,(int)prec);
+ if(fw>prec) {
+ while(fw>prec) {
+ bchunk(fill,1);
+ fw--;
+ }
+ }
+ s0=s1;
+ break;
+ case 'g':
+ parse_next_arg();
+ tmpval=force_number(arg);
+ if(prec==0) prec=13;
+ gcvt(tmpval,prec,cpbuf);
+ prec=strlen(cpbuf);
+ cp=cpbuf;
+ if(fw>prec && !lj) {
+ if(fill!=sp && *cp=='-') {
+ bchunk(cp,1);
+ cp++;
+ prec--;
+ } /* Deal with .5 as 0.5 */
+ if(fill==sp && *cp=='.') {
+ --fw;
+ while(--fw>=prec) {
+ bchunk(fill,1);
+ }
+ bchunk("0",1);
+ } else
+ while(fw-->prec) bchunk(fill,1);
+ } else { /* Turn .5 into 0.5 */
+ /* FOO */
+ if(*cp=='.' && fill==sp) {
+ bchunk("0",1);
+ --fw;
+ }
+ }
+ bchunk(cp,(int)prec);
+ if(fw>prec) while(fw-->prec) bchunk(fill,1);
+ s0=s1;
+ break;
+ /* JF how to handle these!? */
+ case 'f':
+ parse_next_arg();
+ tmpval=force_number(arg);
+ chksize(fw+prec+5); /* 5==slop */
+/* cp=fcvt(tmpval,prec,&dec,&sgn);
+ prec=strlen(cp);
+ if(sgn) prec++; */
+ cp=cpbuf;
+ *cp++='%';
+ if(lj) *cp++='-';
+ if(fill!=sp) *cp++='0';
+ if(prec!=0) {
+ strcpy(cp,"*.*f");
+ sprintf(obuf+olen,cpbuf,fw,prec,(double)tmpval);
+ } else {
+ strcpy(cp,"*f");
+ sprintf(obuf+olen,cpbuf,fw,(double)tmpval);
+ }
+ cp=obuf+olen;
+ ofre-=strlen(obuf+olen);
+ olen+=strlen(obuf+olen);/* There may be nulls */
+ s0=s1;
+ break;
+ case 'e':
+ parse_next_arg();
+ tmpval=force_number(arg);
+ chksize(fw+prec+5); /* 5==slop */
+ cp=cpbuf;
+ *cp++='%';
+ if(lj) *cp++='-';
+ if(fill!=sp) *cp++='0';
+ if(prec!=0) {
+ strcpy(cp,"*.*e");
+ sprintf(obuf+olen,cpbuf,fw,prec,(double)tmpval);
+ } else {
+ strcpy(cp,"*e");
+ sprintf(obuf+olen,cpbuf,fw,(double)tmpval);
+ }
+ cp=obuf+olen;
+ ofre-=strlen(obuf+olen);
+ olen+=strlen(obuf+olen);/* There may be nulls */
+ s0=s1;
+ break;
+ break;
+ /* case 'g':
+ parse_next_arg();
+ tmpval=force_number(arg);
+ if(prec!=0) sprintf(obuf+osiz-ofre,"%*.*g",fw,prec,(double)tmpval);
+ else sprintf(obuf+osiz-ofre,"%*g",fw,(double)tmpval);
+ ofre-=strlen(obuf+osiz-ofre);
+ s0=s1;
+ break; */
+ default:
+ lose:
+ break;
+ }
+ }
+ bchunk(s0,s1-s0);
+ return tmp_string(obuf,olen);
+}
+
+NODE *
+do_sqrt(tree)
+NODE *tree;
+{
+ NODE *tmp;
+ double sqrt();
+
+ get_one(tree,&tmp);
+ return tmp_number(sqrt(force_number(tmp)));
+}
+
+NODE *
+do_substr(tree)
+NODE *tree;
+{
+ NODE *t1,*t2,*t3;
+ register int n1,n2;
+
+ if(get_three(tree,&t1,&t2,&t3)<3)
+ n2=32000;
+ else
+ n2=(int)force_number(t3);
+ n1=(int)force_number(t2)-1;
+ tree=force_string(t1);
+ if(n1<0 || n1>=tree->stlen || n2<=0)
+ return Nnull_string;
+ if(n1+n2>tree->stlen)
+ n2=tree->stlen-n1;
+ return tmp_string(tree->stptr+n1,n2);
+}
+
+/* The print command. Its name is historical */
+hack_print_node(tree)
+NODE *tree;
+{
+ register FILE *fp;
+
+#ifndef FAST
+ if(!tree || tree->type != Node_K_print)
+ abort();
+#endif
+ fp=deal_redirect(tree->rnode);
+ tree=tree->lnode;
+ if(!tree) tree=WHOLELINE;
+ if(tree->type!=Node_expression_list) {
+ print_simple(tree,fp);
+ } else {
+ while(tree) {
+ print_simple(tree_eval(tree->lnode),fp);
+ tree=tree->rnode;
+ if(tree) print_simple(OFS_node->var_value,fp);
+ }
+ }
+ print_simple(ORS_node->var_value,fp);
+}
+
+
+/* Get the arguments to functions. No function cares if you give it
+ too many args (they're ignored). Only a few fuctions complain
+ about being given too few args. The rest have defaults */
+
+get_one(tree,res)
+NODE *tree,**res;
+{
+ if(!tree) {
+ *res= WHOLELINE;
+ return;
+ }
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res=tree_eval(tree->lnode);
+}
+
+get_two(tree,res1,res2)
+NODE *tree,**res1,**res2;
+{
+ if(!tree) {
+ *res1= WHOLELINE;
+ return;
+ }
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res1=tree_eval(tree->lnode);
+ if(!tree->rnode)
+ return;
+ tree=tree->rnode;
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res2=tree_eval(tree->lnode);
+}
+
+get_three(tree,res1,res2,res3)
+NODE *tree,**res1,**res2,**res3;
+{
+ if(!tree) {
+ *res1= WHOLELINE;
+ return 0;
+ }
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res1=tree_eval(tree->lnode);
+ if(!tree->rnode)
+ return 1;
+ tree=tree->rnode;
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res2=tree_eval(tree->lnode);
+ if(!tree->rnode)
+ return 2;
+ tree=tree->rnode;
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res3=tree_eval(tree->lnode);
+ return 3;
+}
+
+a_get_three(tree,res1,res2,res3)
+NODE *tree,**res1,**res2,**res3;
+{
+ if(!tree) {
+ *res1= WHOLELINE;
+ return 0;
+ }
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res1=tree_eval(tree->lnode);
+ if(!tree->rnode)
+ return 1;
+ tree=tree->rnode;
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res2=tree->lnode;
+ if(!tree->rnode)
+ return 2;
+ tree=tree->rnode;
+#ifndef FAST
+ if(tree->type!=Node_expression_list)
+ abort();
+#endif
+ *res3=tree_eval(tree->lnode);
+ return 3;
+}
+
+/* FOO this should re-allocate the buffer if it isn't big enough.
+ Also, it should do RMS style only-parse-enough stuff. */
+/* This reads in a line from the input file */
+inrec()
+{
+ static char *buf,*buf_end;
+ static bsz;
+ register char *cur;
+ register char *tmp;
+ register char *ttmp;
+ int cnt;
+ int tcnt;
+ register int c;
+ int rs;
+ int fs;
+ extern FILE *input_file;
+ NODE **get_lhs();
+
+ rs = get_rs();
+ fs = get_fs();
+ blank_fields();
+ NR++;
+ NF=0;
+ if(!buf) {
+ buf=malloc(128);
+ bsz=128;
+ buf_end=buf+bsz;
+ }
+ cur=buf;
+ cnt=0;
+ while ((c=getc(input_file))!=EOF) {
+ if((!rs && c=='\n' && cur[-1]=='\n' && cur!=buf) || (c == rs))
+ break;
+ *cur++=c;
+ cnt++;
+ if(cur==buf_end) {
+ buf=realloc(buf,bsz*2);
+ cur=buf+bsz;
+ bsz*=2;
+ buf_end=buf+bsz;
+ }
+ }
+ *cur='\0';
+ set_field(0,buf,cnt);
+ assign_number(&(NF_node->var_value),0.0);
+ if(c==EOF && cnt==0)
+ return 1;
+ assign_number(&(NR_node->var_value),1.0+force_number(NR_node->var_value));
+ for(tmp=buf;tmp<cur;tmp++) {
+ if(fs==' ') {
+ while((*tmp==' ' || *tmp=='\t') && tmp<cur)
+ tmp++;
+ if(tmp>=cur)
+ break;
+ }
+ tcnt=0;
+ ttmp=tmp;
+ if(fs==' ') {
+ while(*tmp!=' ' && *tmp!='\t' && tmp<cur) {
+ tmp++;
+ tcnt++;
+ }
+ } else {
+ while(*tmp!=fs && tmp<cur) {
+ tmp++;
+ tcnt++;
+ }
+ }
+ set_field(++NF,ttmp,tcnt);
+ }
+ assign_number(&(NF_node->var_value),(AWKNUM)NF);
+ return 0;
+}
+
+/* Redirection for printf and print commands */
+FILE *
+deal_redirect(tree)
+NODE *tree;
+{
+ register NODE *tmp;
+ register struct redirect *rp;
+ register char *str;
+ register FILE *fp;
+ FILE *popen();
+ int tflag;
+
+
+ if(!tree) return stdout;
+ tflag= (tree->type==Node_redirect_pipe) ? 1 : 2;
+ tmp=tree_eval(tree->subnode);
+ for(rp=reds;rp->flag!=0 && rp<&reds[20];rp++) { /* That limit again */
+ if(rp->flag==tflag && cmp_nodes(rp->value,tmp)==0)
+ break;
+ }
+ if(rp==&reds[20]) {
+ panic("too many redirections",0);
+ return 0;
+ }
+ if(rp->flag!=0)
+ return rp->fp;
+ rp->flag=tflag;
+ rp->value=dupnode(tmp);
+ str=force_string(tmp)->stptr;
+ switch(tree->type) {
+ case Node_redirect_output:
+ fp=rp->fp=fopen(str,"w");
+ break;
+ case Node_redirect_append:
+ fp=rp->fp=fopen(str,"a");
+ break;
+ case Node_redirect_pipe:
+ fp=rp->fp=popen(str,"w");
+ break;
+ }
+ if(fp==0) panic("can't redirect to '%s'\n",str);
+ rp++;
+ rp->flag=0;
+ return fp;
+}
+
+print_simple(tree,fp)
+NODE *tree;
+FILE *fp;
+{
+#ifndef FAST
+ /* Deal with some obscure bugs */
+ if(tree==(NODE *)0x55000000) {
+ fprintf(fp,"***HUH***");
+ return;
+ }
+ if((int)tree&01) {
+ fprintf(fp,"$that's odd$");
+ return;
+ }
+#endif
+ tree=force_string(tree);
+ fwrite(tree->stptr,sizeof(char),tree->stlen,fp);
+}
+
diff --git a/debug.c b/debug.c
new file mode 100644
index 00000000..59033abf
--- /dev/null
+++ b/debug.c
@@ -0,0 +1,485 @@
+/*
+ Debug.c -- Various debugging routines
+
+ Copyright (C) 1986 Free Software Foundation
+ Written by Jay Fenlason, December 1986
+
+ */
+
+/*
+GAWK is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY. No author or distributor accepts responsibility to anyone
+for the consequences of using it or for whether it serves any
+particular purpose or works at all, unless he says so in writing.
+Refer to the GAWK General Public License for full details.
+
+Everyone is granted permission to copy, modify and redistribute GAWK,
+but only under the conditions described in the GAWK General Public
+License. A copy of this license is supposed to have been given to you
+along with GAWK so you can know your rights and responsibilities. It
+should be in a file named COPYING. Among other things, the copyright
+notice and this notice must be preserved on all copies.
+
+In other words, go ahead and share GAWK, but don't try to stop
+anyone else from sharing it farther. Help stamp out software hoarding!
+*/
+#include "awk.h"
+#include <stdio.h>
+
+#ifndef FAST
+
+extern NODE **fields_arr;
+extern f_arr_siz;
+
+
+/* This is all debugging stuff. Ignore it and maybe it'll go away. */
+
+/* Some of it could be turned into a really cute trace command, if anyone
+ wants to. */
+char *nnames[] = {
+ "Illegal Node",
+ "Times", "Divide", "Mod", "Plus", "Minus",
+ "Cond-pair" /* jfw */, "Subscript", "Concat",
+ "++Pre", "--Pre", "Post++",
+ "Post--", "Uminus", "Field",
+ "Assign", "*=", "/=", "%=",
+ "+=", "-=",
+ "And", "Or",
+ "Equal", "!=", "Less", "Greater", "<=", ">=",
+ "Not",
+ "Match", "Nomatch",
+ "String", "TmpString", "Number",
+ "Rule_list", "Rule_node", "State_list", "If_branches", "Exp_list",
+ "BEGIN", "END", "IF", "WHILE", "FOR",
+ "arrayfor", "BREAK", "CONTINUE", "PRINT", "PRINTF",
+ "next", "exit", "redirect", "Append",
+ "Pipe", "variable", "Varray", "builtin",
+ "Line-range" /*jfw*/,
+};
+
+ptree(n)
+{
+ print_parse_tree((NODE *)n);
+}
+
+pt()
+{
+ int x;
+ scanf("%x",&x);
+ printf("0x%x\n",x);
+ print_parse_tree((NODE *)x);
+ fflush(stdout);
+}
+
+static depth = 0;
+print_parse_tree(ptr)
+NODE *ptr;
+{
+ register int n;
+
+ if(!ptr) {
+ printf("NULL\n");
+ return;
+ }
+ if((int)(ptr->type)<0 || (int)(ptr->type)>sizeof(nnames)/sizeof(nnames[0])) {
+ printf("(0x%x Type %d??)\n",ptr,ptr->type);
+ return;
+ }
+ printf("(%d)%*s",depth,depth,"");
+ switch((int)ptr->type) {
+ case (int)Node_string:
+ case (int)Node_temp_string:
+ printf("(0x%x String \"%.*s\")\n",ptr,ptr->stlen,ptr->stptr);
+ return;
+ case (int)Node_number:
+ printf("(0x%x Number %g)\n",ptr,ptr->numbr);
+ return;
+ case (int)Node_var_array:
+ printf("(0x%x Array of %d)\n",ptr,ptr->arrsiz);
+ for(n=0;n<ptr->arrsiz;n++) {
+ printf("'");
+ print_simple((ptr->array)[n*2],stdout);
+ printf("' is '");
+ print_simple((ptr->array)[n*2+1],stdout);
+ printf("'\n");
+ }
+ return;
+ }
+ if(ptr->lnode) printf("0x%x = left<--",ptr->lnode);
+ printf("(0x%x %s.%d)",ptr,nnames[(int)(ptr->type)],ptr->type);
+ if(ptr->rnode) printf("-->right = 0x%x",ptr->rnode);
+ printf("\n");
+ depth++;
+ if(ptr->lnode)
+ print_parse_tree(ptr->lnode);
+ switch((int)ptr->type) {
+ case (int)Node_line_range: /* jfw */
+ case (int)Node_match:
+ case (int)Node_nomatch:
+ break;
+ case (int)Node_builtin:
+ printf("Builtin: %d\n",ptr->proc); /* jfw: was \N */
+ break;
+ case (int)Node_K_for:
+ case (int)Node_K_arrayfor:
+ printf("(%s:)\n",nnames[(int)(ptr->type)]);
+ print_parse_tree(ptr->forloop->init);
+ printf("looping:\n");
+ print_parse_tree(ptr->forloop->cond);
+ printf("doing:\n");
+ print_parse_tree(ptr->forloop->incr);
+ break;
+ default:
+ if(ptr->rnode)
+ print_parse_tree(ptr->rnode);
+ break;
+ }
+ --depth;
+}
+#endif
+
+#ifndef FAST
+/*
+ * print out all the variables in the world
+ */
+
+dump_vars()
+{
+ register int n;
+ register HASHNODE *buc;
+
+ printf("Fields:");
+ dump_fields();
+ printf("Vars:\n");
+ for(n=0;n<HASHSIZE;n++) {
+ for(buc=variables[n];buc;buc=buc->next) {
+ printf("'%.*s': ",buc->length,buc->name);
+ print_simple(buc->value->var_value,stdout);
+ printf(":");
+ print_parse_tree(buc->value->lnode);
+ /* print_parse_tree(buc->value); */
+ }
+ }
+ printf("End\n");
+}
+#endif
+
+#ifndef FAST
+dump_fields()
+{
+ register NODE **p;
+ register int n;
+
+ printf("%d fields\n",f_arr_siz);
+ for(n=0,p= &fields_arr[0];n<f_arr_siz;n++,p++) {
+ printf("$%d is '",n);
+ print_simple(*p,stdout);
+ printf("'\n");
+ }
+}
+#endif
+
+#ifndef FAST
+/*VARARGS1*/
+print_debug(str,n)
+char *str;
+{
+ extern int debugging;
+
+ if(debugging)
+ printf("%s:%d\n",str,n);
+}
+
+int indent = 0;
+
+print_a_node(ptr)
+NODE *ptr;
+{
+ NODE *p1;
+ char *str,*str2;
+ int n;
+ HASHNODE *buc;
+
+ if(!ptr) return; /* don't print null ptrs */
+ switch(ptr->type) {
+ case Node_number:
+ printf("%g",ptr->numbr);
+ return;
+ case Node_string:
+ printf("\"%.*s\"",ptr->stlen,ptr->stptr);
+ return;
+ case Node_times:
+ str="*";
+ goto pr_twoop;
+ case Node_quotient:
+ str="/";
+ goto pr_twoop;
+ case Node_mod:
+ str="%";
+ goto pr_twoop;
+ case Node_plus:
+ str="+";
+ goto pr_twoop;
+ case Node_minus:
+ str="-";
+ goto pr_twoop;
+ case Node_concat:
+ str=" ";
+ goto pr_twoop;
+ case Node_assign:
+ str="=";
+ goto pr_twoop;
+ case Node_assign_times:
+ str="*=";
+ goto pr_twoop;
+ case Node_assign_quotient:
+ str="/=";
+ goto pr_twoop;
+ case Node_assign_mod:
+ str="%=";
+ goto pr_twoop;
+ case Node_assign_plus:
+ str="+=";
+ goto pr_twoop;
+ case Node_assign_minus:
+ str="-=";
+ goto pr_twoop;
+ case Node_and:
+ str="&&";
+ goto pr_twoop;
+ case Node_or:
+ str="||";
+ goto pr_twoop;
+ case Node_equal:
+ str="==";
+ goto pr_twoop;
+ case Node_notequal:
+ str="!=";
+ goto pr_twoop;
+ case Node_less:
+ str="<";
+ goto pr_twoop;
+ case Node_greater:
+ str=">";
+ goto pr_twoop;
+ case Node_leq:
+ str="<=";
+ goto pr_twoop;
+ case Node_geq:
+ str=">=";
+ goto pr_twoop;
+
+ pr_twoop:
+ print_a_node(ptr->lnode);
+ printf("%s",str);
+ print_a_node(ptr->rnode);
+ return;
+
+ case Node_not:
+ str="!";
+ str2="";
+ goto pr_oneop;
+ case Node_field_spec:
+ str="$(";
+ str2=")";
+ goto pr_oneop;
+ case Node_postincrement:
+ str="";
+ str2="++";
+ goto pr_oneop;
+ case Node_postdecrement:
+ str="";
+ str2="--";
+ goto pr_oneop;
+ case Node_preincrement:
+ str="++";
+ str2="";
+ goto pr_oneop;
+ case Node_predecrement:
+ str="--";
+ str2="";
+ goto pr_oneop;
+ pr_oneop:
+ printf(str);
+ print_a_node(ptr->subnode);
+ printf(str2);
+ return;
+
+ case Node_expression_list:
+ print_a_node(ptr->lnode);
+ if(ptr->rnode) {
+ printf(",");
+ print_a_node(ptr->rnode);
+ }
+ return;
+
+ case Node_var:
+ for(n=0;n<HASHSIZE;n++) {
+ for(buc=variables[n];buc;buc=buc->next) {
+ if(buc->value==ptr) {
+ printf("%.*s",buc->length,buc->name);
+ n=HASHSIZE;
+ break;
+ }
+ }
+ }
+ return;
+ case Node_subscript:
+ print_a_node(ptr->lnode);
+ printf("[");
+ print_a_node(ptr->rnode);
+ printf("]");
+ return;
+ case Node_builtin:
+ printf("some_builtin(");
+ print_a_node(ptr->subnode);
+ printf(")");
+ return;
+
+ case Node_statement_list:
+ printf("{\n");
+ indent++;
+ for(n=indent;n;--n)
+ printf(" ");
+ while(ptr) {
+ print_maybe_semi(ptr->lnode);
+ if(ptr->rnode)
+ for(n=indent;n;--n)
+ printf(" ");
+ ptr=ptr->rnode;
+ }
+ --indent;
+ for(n=indent;n;--n)
+ printf(" ");
+ printf("}\n");
+ for(n=indent;n;--n)
+ printf(" ");
+ return;
+
+ case Node_K_if:
+ printf("if(");
+ print_a_node(ptr->lnode);
+ printf(") ");
+ ptr=ptr->rnode;
+ if(ptr->lnode->type==Node_statement_list) {
+ printf("{\n");
+ indent++;
+ for(p1=ptr->lnode;p1;p1=p1->rnode) {
+ for(n=indent;n;--n)
+ printf(" ");
+ print_maybe_semi(p1->lnode);
+ }
+ --indent;
+ for(n=indent;n;--n)
+ printf(" ");
+ if(ptr->rnode) {
+ printf("} else ");
+ } else {
+ printf("}\n");
+ return;
+ }
+ } else {
+ print_maybe_semi(ptr->lnode);
+ if(ptr->rnode) {
+ for(n=indent;n;--n)
+ printf(" ");
+ printf("else ");
+ } else return;
+ }
+ if(!ptr->rnode) return;
+ deal_with_curls(ptr->rnode);
+ return;
+
+ case Node_K_for:
+ printf("for(");
+ print_a_node(ptr->forloop->init);
+ printf(";");
+ print_a_node(ptr->forloop->cond);
+ printf(";");
+ print_a_node(ptr->forloop->incr);
+ printf(") ");
+ deal_with_curls(ptr->forsub);
+ return;
+ case Node_K_arrayfor:
+ printf("for(");
+ print_a_node(ptr->forloop->init);
+ printf(" in ");
+ print_a_node(ptr->forloop->incr);
+ printf(") ");
+ deal_with_curls(ptr->forsub);
+ return;
+
+ case Node_K_printf:
+ printf("printf(");
+ print_a_node(ptr->lnode);
+ printf(")");
+ return;
+ case Node_K_print:
+ printf("print(");
+ print_a_node(ptr->lnode);
+ printf(")");
+ return;
+ case Node_K_next:
+ printf("next");
+ return;
+ case Node_K_break:
+ printf("break");
+ return;
+ default:
+ print_parse_tree(ptr);
+ return;
+ }
+}
+
+print_maybe_semi(ptr)
+NODE *ptr;
+{
+ print_a_node(ptr);
+ switch(ptr->type) {
+ case Node_K_if:
+ case Node_K_for:
+ case Node_K_arrayfor:
+ case Node_statement_list:
+ break;
+ default:
+ printf(";\n");
+ break;
+ }
+}
+deal_with_curls(ptr)
+NODE *ptr;
+{
+ int n;
+
+ if(ptr->type==Node_statement_list) {
+ printf("{\n");
+ indent++;
+ while(ptr) {
+ for(n=indent;n;--n)
+ printf(" ");
+ print_maybe_semi(ptr->lnode);
+ ptr=ptr->rnode;
+ }
+ --indent;
+ for(n=indent;n;--n)
+ printf(" ");
+ printf("}\n");
+ } else {
+ print_maybe_semi(ptr);
+ }
+}
+
+NODE *
+do_prvars()
+{
+ dump_vars();
+ return Nnull_string;
+}
+
+NODE *
+do_bp()
+{
+ return Nnull_string;
+}
+
+#endif
diff --git a/obstack.c b/obstack.c
new file mode 100644
index 00000000..66148106
--- /dev/null
+++ b/obstack.c
@@ -0,0 +1,157 @@
+/* obstack.c - subroutines used implicitly by object stack macros
+ Copyright (c) 1986 Free Software Foundation, Inc.
+
+ NO WARRANTY
+
+ BECAUSE THIS PROGRAM IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY
+NO WARRANTY, TO THE EXTENT PERMITTED BY APPLICABLE STATE LAW. EXCEPT
+WHEN OTHERWISE STATED IN WRITING, FREE SOFTWARE FOUNDATION, INC,
+RICHARD M. STALLMAN AND/OR OTHER PARTIES PROVIDE THIS PROGRAM "AS IS"
+WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
+FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY
+AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE
+DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR
+CORRECTION.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW WILL RICHARD M.
+STALLMAN, THE FREE SOFTWARE FOUNDATION, INC., AND/OR ANY OTHER PARTY
+WHO MAY MODIFY AND REDISTRIBUTE THIS PROGRAM AS PERMITTED BELOW, BE
+LIABLE TO YOU FOR DAMAGES, INCLUDING ANY LOST PROFITS, LOST MONIES, OR
+OTHER SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR
+DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY THIRD PARTIES OR
+A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS) THIS
+PROGRAM, EVEN IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH
+DAMAGES, OR FOR ANY CLAIM BY ANY OTHER PARTY.
+
+ GENERAL PUBLIC LICENSE TO COPY
+
+ 1. You may copy and distribute verbatim copies of this source file
+as you receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy a valid copyright notice "Copyright
+(C) 1986 Free Software Foundation, Inc."; and include following the
+copyright notice a verbatim copy of the above disclaimer of warranty
+and of this License.
+
+ 2. You may modify your copy or copies of this source file or
+any portion of it, and copy and distribute such modifications under
+the terms of Paragraph 1 above, provided that you also do the following:
+
+ a) cause the modified files to carry prominent notices stating
+ that you changed the files and the date of any change; and
+
+ b) cause the whole of any work that you distribute or publish,
+ that in whole or in part contains or is a derivative of this
+ program or any part thereof, to be freely distributed
+ and licensed to all third parties on terms identical to those
+ contained in this License Agreement (except that you may choose
+ to grant more extensive warranty protection to third parties,
+ at your option).
+
+ 3. You may copy and distribute this program or any portion of it in
+compiled, executable or object code form under the terms of Paragraphs
+1 and 2 above provided that you do the following:
+
+ a) cause each such copy to be accompanied by the
+ corresponding machine-readable source code, which must
+ be distributed under the terms of Paragraphs 1 and 2 above; or,
+
+ b) cause each such copy to be accompanied by a
+ written offer, with no time limit, to give any third party
+ free (except for a nominal shipping charge) a machine readable
+ copy of the corresponding source code, to be distributed
+ under the terms of Paragraphs 1 and 2 above; or,
+
+ c) in the case of a recipient of this program in compiled, executable
+ or object code form (without the corresponding source code) you
+ shall cause copies you distribute to be accompanied by a copy
+ of the written offer of source code which you received along
+ with the copy you received.
+
+ 4. You may not copy, sublicense, distribute or transfer this program
+except as expressly provided under this License Agreement. Any attempt
+otherwise to copy, sublicense, distribute or transfer this program is void and
+your rights to use the program under this License agreement shall be
+automatically terminated. However, parties who have received computer
+software programs from you with this License Agreement will not have
+their licenses terminated so long as such parties remain in full compliance.
+*/
+
+#include <obstack.h>
+
+void
+_obstack_begin (h, chunkfun)
+ struct obstack *h;
+ int (*chunkfun) ();
+{
+ register _Ll* chunk; /* points to new chunk */
+ chunk = h->chunk =
+ (_Ll*) (*chunkfun) (h->chunk_size);
+ h->next_free = h->object_base = chunk->obstack_l_0;
+ h->chunk_limit = chunk->obstack_l_limit
+ = (char *) chunk + h->chunk_size;
+ chunk->obstack_l_prev = 0;
+}
+
+/* Allocate a new current chunk for the obstack *H
+ on the assumption that LENGTH bytes need to be added
+ to the current object, or a new object of length LENGTH allocated.
+ Copies any partial object from the end of the old chunk
+ to the beginning of the new one. */
+
+void
+_obstack_newchunk (h, chunkfun, length)
+ struct obstack *h;
+ int (*chunkfun) ();
+ int length;
+{
+ register _Ll* old_chunk = h->chunk;
+ register _Ll* new_chunk;
+ register long new_size;
+ register int obj_size = h->next_free - h->object_base;
+
+ /* Compute size for new chunk. */
+ new_size = (obj_size + length) << 1;
+ if (new_size < h->chunk_size)
+ new_size = h->chunk_size;
+
+ /* Allocate and initialize the new chunk. */
+ new_chunk = h->chunk = (_Ll*) (*chunkfun) (new_size);
+ new_chunk->obstack_l_prev = old_chunk;
+ new_chunk->obstack_l_limit = h->chunk_limit = (char *) new_chunk + new_size;
+
+ /* Move the existing object to the new chunk. */
+ bcopy (h->object_base, new_chunk->obstack_l_0, obj_size);
+ h->object_base = new_chunk->obstack_l_0;
+ h->next_free = h->object_base + obj_size;
+ };
+
+void
+_obstack_free (h, freechunkfun, obj)
+ struct obstack *h;
+ void (*freechunkfun) ();
+ char *obj;
+{
+ register _Ll* lp; /* below addr of any objects in this chunk */
+ register _Ll* plp; /* point to previous chunk if any */
+
+ lp = (h)->chunk;
+ while (lp != 0 && ((char *)lp > obj || (h)->chunk_limit < obj))
+ {
+ plp = lp -> obstack_l_prev;
+ (*freechunkfun) (lp);
+ if(lp==plp)
+ plp=0;
+ lp = plp;
+ }
+ if (lp)
+ {
+ (h)->object_base = (h)->next_free = (char *)(obj);
+ (h)->chunk_limit = lp->obstack_l_limit;
+ (h)->chunk = lp;
+ }
+ else if (obj != 0)
+ /* obj is not in any of the chunks! */
+ abort ();
+}
diff --git a/obstack.h b/obstack.h
new file mode 100644
index 00000000..772a5baa
--- /dev/null
+++ b/obstack.h
@@ -0,0 +1,204 @@
+/* obstack.h - object stack macros
+ Copyright (c) 1986 Free Software Foundation, Inc.
+
+Summary:
+
+All the apparent functions defined here are macros. The idea
+is that you would use these pre-tested macros to solve a
+very specific set of problems, and they would run fast.
+Caution: no side-effects in arguments please!! They may be
+evaluated MANY times!!
+
+These macros operate a stack of objects. Each object starts life
+small, and may grow to maturity. (Consider building a word syllable
+by syllable.) An object can move while it is growing. Once it has
+been "finished" it never changes address again. So the "top of the
+stack" is typically an immature growing object, while the rest of the
+stack is of mature, fixed size and fixed address objects.
+
+These routines grab large chunks of memory, using a function you
+supply, called `obstack_chunk_alloc'. On occasion, they free chunks,
+by calling `obstack_chunk_free'. You must define them and declare
+them before using any obstack macros.
+
+Each independent stack is represented by a `struct obstack'.
+Each of the obstack macros expects a pointer to such a structure
+as the first argument.
+
+One motivation for this package is the problem of growing char strings
+in symbol tables. Unless you are "facist pig with a read-only mind"
+[Gosper's immortal quote from HAKMEM item 154, out of context] you
+would not like to put any arbitrary upper limit on the length of your
+symbols.
+
+In practice this often means you will build many short symbols and a
+few long symbols. At the time you are reading a symbol you don't know
+how long it is. One traditional method is to read a symbol into a
+buffer, realloc()ating the buffer every time you try to read a symbol
+that is longer than the buffer. This is beaut, but you still will
+want to copy the symbol from the buffer to a more permanent
+symbol-table entry say about half the time.
+
+With obstacks, you can work differently. Use one obstack for all symbol
+names. As you read a symbol, grow the name in the obstack gradually.
+When the name is complete, finalize it. Then, if the symbol exists already,
+free the newly read name.
+
+The way we do this is to take a large chunk, allocating memory from
+low addresses. When you want to build a aymbol in the chunk you just
+add chars above the current "high water mark" in the chunk. When you
+have finished adding chars, because you got to the end of the symbol,
+you know how long the chars are, and you can create a new object.
+Mostly the chars will not burst over the highest address of the chunk,
+because you would typically expect a chunk to be (say) 100 times as
+long as an average object.
+
+In case that isn't clear, when we have enough chars to make up
+the object, THEY ARE ALREADY CONTIGUOUS IN THE CHUNK (guaranteed)
+so we just point to it where it lies. No moving of chars is
+needed and this is the second win: potentially long strings need
+never be explicitly shuffled. Once an object is formed, it does not
+change its address during its lifetime.
+
+When the chars burst over a chunk boundary, we allocate a larger
+chunk, and then copy the partly formed object from the end of the old
+chunk to the beggining of the new larger chunk. We then carry on
+accreting characters to the end of the object as we normaly would.
+
+A special macro is provided to add a single char at a time to a
+growing object. This allows the use of register variables, which
+break the ordinary 'growth' macro.
+
+Summary:
+ We allocate large chunks.
+ We carve out one object at a time from the current chunk.
+ Once carved, an object never moves.
+ We are free to append data of any size to the currently
+ growing object.
+ Exactly one object is growing in an obstack at any one time.
+ You can run one obstack per control block.
+ You may have as many control blocks as you dare.
+ Because of the way we do it, you can `unwind' a obstack
+ back to a previous state. (You may remove objects much
+ as you would with a stack.)
+*/
+
+#ifndef obstackH
+#define obstackH
+ /* these #defines keep it brief */
+#define _Ll struct obstack_chunk
+#define _LL (8) /* _L length in chars */
+
+struct obstack_chunk /* Lives at front of each chunk. */
+{
+ char *obstack_l_limit; /* 1 past end of this chunk */
+ _Ll *obstack_l_prev; /* address of prior chunk or NULL */
+ char obstack_l_0[4]; /* objects begin here */
+};
+
+#if 0
+This function, called like malloc but not returning on failure,
+must return a chunk of the size given to it as argument,
+aligned on a boundary of 2**OBSTACK_LOG_DEFAULT_ALIGNMENT bytes.
+
+struct obstack_chunk * obstack_chunk_alloc();
+#endif /* 0 */
+
+struct obstack /* control current object in current chunk */
+{
+ long chunk_size; /* preferred size to allocate chunks in */
+ _Ll* chunk; /* address of current struct obstack_chunk */
+ char *object_base; /* address of object we are building */
+ char *next_free; /* where to add next char to current object */
+ char *chunk_limit; /* address of char after current chunk */
+ int temp; /* Temporary for some macros. */
+ int alignment_mask; /* Mask of alignment for each object. */
+};
+
+/* Pointer to beginning of object being allocated or to be allocated next.
+ Note that this might not be the final address of the object
+ because a new chunk might be needed to hold the final size. */
+
+#define obstack_base(h) ((h)->object_base)
+
+/* Pointer to next byte not yet allocated in current chunk. */
+
+#define obstack_next_free(h) ((h)->next_free)
+
+/* Size of object currently growing */
+
+#define obstack_object_size(h) ((h)->next_free - (h)->object_base)
+
+/* Mask specifying low bits that should be clear in address of an object. */
+
+#define obstack_alignment_mask(h) ((h)->alignment_mask)
+
+#define obstack_init(h) obstack_begin (h, 4096 - 4 - _LL)
+
+#define obstack_begin(h,try_length) \
+((h)->chunk_size = (try_length) + (_LL), \
+ (h)->alignment_mask = ((1 << 2) - 1), \
+ _obstack_begin ((h), obstack_chunk_alloc))
+
+#define obstack_grow(h,where,length) \
+( (h)->temp = (length), \
+ (((h)->next_free + (h)->temp > (h)->chunk_limit) \
+ ? _obstack_newchunk ((h), obstack_chunk_alloc, (h)->temp) : 0), \
+ bcopy (where, (h)->next_free, (h)->temp), \
+ (h)->next_free += (h)->temp)
+
+#define obstack_grow0(h,where,length) \
+( (h)->temp = (length), \
+ (((h)->next_free + (h)->temp + 1 > (h)->chunk_limit) \
+ ? _obstack_newchunk ((h), obstack_chunk_alloc, (h)->temp + 1) : 0), \
+ bcopy (where, (h)->next_free, (h)->temp), \
+ (h)->next_free += (h)->temp, \
+ *((h)->next_free)++ = 0)
+
+#define obstack_1grow(h,datum) \
+( (((h)->next_free + 1 > (h)->chunk_limit) \
+ ? _obstack_newchunk ((h), obstack_chunk_alloc, 1) : 0), \
+ *((h)->next_free)++ = (datum))
+
+#define obstack_blank(h,length) \
+( (h)->temp = (length), \
+ (((h)->next_free + (h)->temp > (h)->chunk_limit) \
+ ? _obstack_newchunk ((h), obstack_chunk_alloc, (h)->temp) : 0), \
+ (h)->next_free += (h)->temp)
+
+#define obstack_alloc(h,length) \
+ (obstack_blank ((h), (length)), obstack_finish (h))
+
+#define obstack_copy(h,where,length) \
+ (obstack_grow ((h), (where), (length)), obstack_finish (h))
+
+#define obstack_copy0(h,where,length) \
+ (obstack_grow0 ((h), (where), (length)), obstack_finish (h))
+
+#define obstack_room(h) ((long unsigned int) \
+ ((h)->chunk_limit - (h)->next_free))
+
+#define obstack_1grow_fast(h,achar) (*((h)->next_free)++ = achar)
+
+#define obstack_blank_fast(h,n) ((h)->next_free += (n))
+
+#define obstack_finish(h) \
+ ((h)->temp = (int) (h)->object_base, \
+ (h)->next_free \
+ = (char*)((int)((h)->next_free+(h)->alignment_mask) \
+ & ~ ((h)->alignment_mask)), \
+ (((h)->next_free - (char *)(h)->chunk \
+ > (h)->chunk_limit - (char *)(h)->chunk) \
+ ? (h)->next_free = (h)->chunk_limit : 0), \
+ (h)->object_base = (h)->next_free, \
+ (char *) (h)->temp)
+
+#define obstack_free(h,obj) \
+(((h)->temp = (char *)(obj) - (char *) (h)->chunk), \
+ (((h)->temp >= 0 && (h)->temp < (h)->chunk_limit - (char *) (h)->chunk)\
+ ? (int) ((h)->next_free = (h)->object_base \
+ = (h)->temp + (char *) (h)->chunk) \
+ : (int) _obstack_free ((h), obstack_chunk_free, \
+ (h)->temp + (char *) (h)->chunk)))
+
+#endif /* #ifndef obstackH */
diff --git a/regex.c b/regex.c
new file mode 100644
index 00000000..ebfd612e
--- /dev/null
+++ b/regex.c
@@ -0,0 +1,1689 @@
+/* Extended regular expression matching and search.
+ Copyright (C) 1985 Free Software Foundation, Inc.
+
+ NO WARRANTY
+
+ BECAUSE THIS PROGRAM IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY
+NO WARRANTY, TO THE EXTENT PERMITTED BY APPLICABLE STATE LAW. EXCEPT
+WHEN OTHERWISE STATED IN WRITING, FREE SOFTWARE FOUNDATION, INC,
+RICHARD M. STALLMAN AND/OR OTHER PARTIES PROVIDE THIS PROGRAM "AS IS"
+WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
+FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY
+AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE
+DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR
+CORRECTION.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW WILL RICHARD M.
+STALLMAN, THE FREE SOFTWARE FOUNDATION, INC., AND/OR ANY OTHER PARTY
+WHO MAY MODIFY AND REDISTRIBUTE THIS PROGRAM AS PERMITTED BELOW, BE
+LIABLE TO YOU FOR DAMAGES, INCLUDING ANY LOST PROFITS, LOST MONIES, OR
+OTHER SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR
+DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY THIRD PARTIES OR
+A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS) THIS
+PROGRAM, EVEN IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH
+DAMAGES, OR FOR ANY CLAIM BY ANY OTHER PARTY.
+
+ GENERAL PUBLIC LICENSE TO COPY
+
+ 1. You may copy and distribute verbatim copies of this source file
+as you receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy a valid copyright notice "Copyright
+(C) 1985 Free Software Foundation, Inc."; and include following the
+copyright notice a verbatim copy of the above disclaimer of warranty
+and of this License.
+
+ 2. You may modify your copy or copies of this source file or
+any portion of it, and copy and distribute such modifications under
+the terms of Paragraph 1 above, provided that you also do the following:
+
+ a) cause the modified files to carry prominent notices stating
+ that you changed the files and the date of any change; and
+
+ b) cause the whole of any work that you distribute or publish,
+ that in whole or in part contains or is a derivative of this
+ program or any part thereof, to be freely distributed
+ and licensed to all third parties on terms identical to those
+ contained in this License Agreement (except that you may choose
+ to grant more extensive warranty protection to third parties,
+ at your option).
+
+ 3. You may copy and distribute this program or any portion of it in
+compiled, executable or object code form under the terms of Paragraphs
+1 and 2 above provided that you do the following:
+
+ a) cause each such copy to be accompanied by the
+ corresponding machine-readable source code, which must
+ be distributed under the terms of Paragraphs 1 and 2 above; or,
+
+ b) cause each such copy to be accompanied by a
+ written offer, with no time limit, to give any third party
+ free (except for a nominal shipping charge) a machine readable
+ copy of the corresponding source code, to be distributed
+ under the terms of Paragraphs 1 and 2 above; or,
+
+ c) in the case of a recipient of this program in compiled, executable
+ or object code form (without the corresponding source code) you
+ shall cause copies you distribute to be accompanied by a copy
+ of the written offer of source code which you received along
+ with the copy you received.
+
+ 4. You may not copy, sublicense, distribute or transfer this program
+except as expressly provided under this License Agreement. Any attempt
+otherwise to copy, sublicense, distribute or transfer this program is void and
+your rights to use the program under this License agreement shall be
+automatically terminated. However, parties who have received computer
+software programs from you with this License Agreement will not have
+their licenses terminated so long as such parties remain in full compliance.
+
+
+In other words, you are welcome to use, share and improve this program.
+You are forbidden to forbid anyone else to use, share and improve
+what you give them. Help stamp out software-hoarding! */
+
+
+/* To test, compile with -Dtest.
+ This Dtestable feature turns this into a self-contained program
+ which reads a pattern, describes how it compiles,
+ then reads a string and searches for it. */
+
+/* JF this var has taken on whole new meanings as time goes by. Various bits
+in this int tell how certain pieces of syntax should work */
+
+static int obscure_syntax = 0;
+
+#ifdef emacs
+
+/* The `emacs' switch turns on certain special matching commands
+ that make sense only in emacs. */
+
+#include "config.h"
+#include "lisp.h"
+#include "buffer.h"
+#include "syntax.h"
+
+#else /* not emacs */
+
+/*
+ * Define the syntax stuff, so we can do the \<...\> things.
+ */
+
+#ifndef Sword /* must be non-zero in some of the tests below... */
+#define Sword 1
+#endif
+
+#define SYNTAX(c) re_syntax_table[c]
+
+#ifdef SYNTAX_TABLE
+
+char *re_syntax_table;
+
+#else
+
+static char re_syntax_table[256];
+
+static void
+init_syntax_once ()
+{
+ register int c;
+ static int done = 0;
+
+ if (done)
+ return;
+
+ bzero (re_syntax_table, sizeof re_syntax_table);
+
+ for (c = 'a'; c <= 'z'; c++)
+ re_syntax_table[c] = Sword;
+
+ for (c = 'A'; c <= 'Z'; c++)
+ re_syntax_table[c] = Sword;
+
+ for (c = '0'; c <= '9'; c++)
+ re_syntax_table[c] = Sword;
+
+ done = 1;
+}
+
+#endif /* SYNTAX_TABLE */
+#endif /* not emacs */
+
+#include "regex.h"
+
+/* Number of failure points to allocate space for initially,
+ when matching. If this number is exceeded, more space is allocated,
+ so it is not a hard limit. */
+
+#ifndef NFAILURES
+#define NFAILURES 80
+#endif NFAILURES
+
+/* width of a byte in bits */
+
+#define BYTEWIDTH 8
+
+#ifndef SIGN_EXTEND_CHAR
+#define SIGN_EXTEND_CHAR(x) (x)
+#endif
+
+/* compile_pattern takes a regular-expression descriptor string in the user's format
+ and converts it into a buffer full of byte commands for matching.
+
+ pattern is the address of the pattern string
+ size is the length of it.
+ bufp is a struct re_pattern_buffer * which points to the info
+ on where to store the byte commands.
+ This structure contains a char * which points to the
+ actual space, which should have been obtained with malloc.
+ compile_pattern may use realloc to grow the buffer space.
+
+ The number of bytes of commands can be found out by looking in
+ the struct re_pattern_buffer that bufp pointed to,
+ after compile_pattern returns.
+*/
+
+#define PATPUSH(ch) (*b++ = (char) (ch))
+
+#define PATFETCH(c) \
+ {if (p == pend) goto end_of_pattern; \
+ c = * (unsigned char *) p++; \
+ if (translate) c = translate[c]; }
+
+#define PATFETCH_RAW(c) \
+ {if (p == pend) goto end_of_pattern; \
+ c = * (unsigned char *) p++; }
+
+#define PATUNFETCH p--
+
+#define EXTEND_BUFFER \
+ { char *old_buffer = bufp->buffer; \
+ if (bufp->allocated == (1<<16)) goto too_big; \
+ bufp->allocated *= 2; \
+ if (bufp->allocated > (1<<16)) bufp->allocated = (1<<16); \
+ if (!(bufp->buffer = (char *) realloc (bufp->buffer, bufp->allocated))) \
+ goto memory_exhausted; \
+ c = bufp->buffer - old_buffer; \
+ b += c; \
+ if (fixup_jump) \
+ fixup_jump += c; \
+ if (laststart) \
+ laststart += c; \
+ begalt += c; \
+ if (pending_exact) \
+ pending_exact += c; \
+ }
+
+static int store_jump (), insert_jump ();
+
+/* JF this function is used to compile UN*X style regexps. In particular,
+ ( ) and | don't have to be \ed to have a special meaning */
+
+int
+re_set_syntax(syntax)
+{
+ int ret;
+
+ ret=obscure_syntax;
+ obscure_syntax=syntax;
+ return ret;
+}
+
+
+char *
+re_compile_pattern (pattern, size, bufp)
+ char *pattern;
+ int size;
+ struct re_pattern_buffer *bufp;
+{
+ register char *b = bufp->buffer;
+ register char *p = pattern;
+ char *pend = pattern + size;
+ register unsigned c, c1;
+ char *p1;
+ unsigned char *translate = (unsigned char *) bufp->translate;
+
+ /* address of the count-byte of the most recently inserted "exactn" command.
+ This makes it possible to tell whether a new exact-match character
+ can be added to that command or requires a new "exactn" command. */
+
+ char *pending_exact = 0;
+
+ /* address of the place where a forward-jump should go
+ to the end of the containing expression.
+ Each alternative of an "or", except the last, ends with a forward-jump
+ of this sort. */
+
+ char *fixup_jump = 0;
+
+ /* address of start of the most recently finished expression.
+ This tells postfix * where to find the start of its operand. */
+
+ char *laststart = 0;
+
+ /* In processing a repeat, 1 means zero matches is allowed */
+
+ char zero_times_ok;
+
+ /* In processing a repeat, 1 means many matches is allowed */
+
+ char many_times_ok;
+
+ /* address of beginning of regexp, or inside of last \( */
+
+ char *begalt = b;
+
+ /* Stack of information saved by \( and restored by \).
+ Four stack elements are pushed by each \(:
+ First, the value of b.
+ Second, the value of fixup_jump.
+ Third, the value of regnum.
+ Fourth, the value of begalt. */
+
+ int stackb[40];
+ int *stackp = stackb;
+ int *stacke = stackb + 40;
+ int *stackt;
+
+ /* Counts \('s as they are encountered. Remembered for the matching \),
+ where it becomes the "register number" to put in the stop_memory command */
+
+ int regnum = 1;
+
+ bufp->fastmap_accurate = 0;
+
+#ifndef emacs
+#ifndef SYNTAX_TABLE
+ /*
+ * Initialize the syntax table.
+ */
+ init_syntax_once();
+#endif
+#endif
+
+ if (bufp->allocated == 0)
+ {
+ bufp->allocated = 28;
+ if (bufp->buffer)
+ /* EXTEND_BUFFER loses when bufp->allocated is 0 */
+ bufp->buffer = (char *) realloc (bufp->buffer, 28);
+ else
+ /* Caller did not allocate a buffer. Do it for him */
+ bufp->buffer = (char *) malloc (28);
+ if (!bufp->buffer) goto memory_exhausted;
+ begalt = b = bufp->buffer;
+ }
+
+ while (p != pend)
+ {
+ if (b - bufp->buffer > bufp->allocated - 10)
+ /* Note that EXTEND_BUFFER clobbers c */
+ EXTEND_BUFFER;
+
+ PATFETCH (c);
+
+ switch (c)
+ {
+ case '$':
+ /* $ means succeed if at end of line, but only in special contexts.
+ If randonly in the middle of a pattern, it is a normal character. */
+ if (p == pend || (*p == '\\' && (p[1] == ')' || p[1] == '|')))
+ {
+ PATPUSH (endline);
+ break;
+ }
+ goto normal_char;
+
+ case '^':
+ /* ^ means succeed if at beg of line, but only if no preceding pattern. */
+ if (laststart) goto normal_char;
+ PATPUSH (begline);
+ break;
+
+ case '*':
+ case '+':
+ case '?':
+ /* If there is no previous pattern, char not special. */
+ if (!laststart)
+ goto normal_char;
+ /* If there is a sequence of repetition chars,
+ collapse it down to equivalent to just one. */
+ zero_times_ok = 0;
+ many_times_ok = 0;
+ while (1)
+ {
+ zero_times_ok |= c != '+';
+ many_times_ok |= c != '?';
+ if (p == pend)
+ break;
+ PATFETCH (c);
+ if (!(c == '*' || c == '+' || c == '?'))
+ {
+ PATUNFETCH;
+ break;
+ }
+ }
+
+ /* Now we know whether 0 matches is allowed,
+ and whether 2 or more matches is allowed. */
+ if (many_times_ok)
+ {
+ /* If more than one repetition is allowed,
+ put in a backward jump at the end. */
+ store_jump (b, maybe_finalize_jump, laststart - 3);
+ b += 3;
+ }
+ insert_jump (on_failure_jump, laststart, b + 3, b);
+ pending_exact = 0;
+ b += 3;
+ if (!zero_times_ok)
+ {
+ /* At least one repetition required: insert before the loop
+ a skip over the initial on-failure-jump instruction */
+ insert_jump (dummy_failure_jump, laststart, laststart + 6, b);
+ b += 3;
+ }
+ break;
+
+ case '.':
+ laststart = b;
+ PATPUSH (anychar);
+ break;
+
+ case '[':
+ if (b - bufp->buffer
+ > bufp->allocated - 3 - (1 << BYTEWIDTH) / BYTEWIDTH)
+ /* Note that EXTEND_BUFFER clobbers c */
+ EXTEND_BUFFER;
+
+ laststart = b;
+ if (*p == '^')
+ PATPUSH (charset_not), p++;
+ else
+ PATPUSH (charset);
+ p1 = p;
+
+ PATPUSH ((1 << BYTEWIDTH) / BYTEWIDTH);
+ /* Clear the whole map */
+ bzero (b, (1 << BYTEWIDTH) / BYTEWIDTH);
+ /* Read in characters and ranges, setting map bits */
+ while (1)
+ {
+ PATFETCH (c);
+ if (c == ']' && p != p1 + 1) break;
+ if (*p == '-')
+ {
+ PATFETCH (c1);
+ PATFETCH (c1);
+ while (c <= c1)
+ b[c / BYTEWIDTH] |= 1 << (c % BYTEWIDTH), c++;
+ }
+ else
+ {
+ b[c / BYTEWIDTH] |= 1 << (c % BYTEWIDTH);
+ }
+ }
+ /* Discard any bitmap bytes that are all 0 at the end of the map.
+ Decrement the map-length byte too. */
+ while (b[-1] > 0 && b[b[-1] - 1] == 0)
+ b[-1]--;
+ b += b[-1];
+ break;
+
+ case '(':
+ if(!(obscure_syntax&RE_NO_BK_PARENS)) goto normal_char;
+ if (stackp == stacke) goto nesting_too_deep;
+ if (regnum < RE_NREGS)
+ {
+ PATPUSH (start_memory);
+ PATPUSH (regnum);
+ }
+ *stackp++ = b - bufp->buffer;
+ *stackp++ = fixup_jump ? fixup_jump - bufp->buffer + 1 : 0;
+ *stackp++ = regnum++;
+ *stackp++ = begalt - bufp->buffer;
+ fixup_jump = 0;
+ laststart = 0;
+ begalt = b;
+ break;
+
+ case ')':
+ if(!(obscure_syntax&RE_NO_BK_PARENS)) goto normal_char;
+ if (stackp == stackb) goto unmatched_close;
+ begalt = *--stackp + bufp->buffer;
+ if (fixup_jump)
+ store_jump (fixup_jump, jump, b);
+ if (stackp[-1] < RE_NREGS)
+ {
+ PATPUSH (stop_memory);
+ PATPUSH (stackp[-1]);
+ }
+ stackp -= 2;
+ fixup_jump = 0;
+ if (*stackp)
+ fixup_jump = *stackp + bufp->buffer - 1;
+ laststart = *--stackp + bufp->buffer;
+ break;
+
+ case '|':
+ if(!(obscure_syntax&RE_NO_BK_VBAR)) goto normal_char;
+ insert_jump (on_failure_jump, begalt, b + 6, b);
+ pending_exact = 0;
+ b += 3;
+ if (fixup_jump)
+ store_jump (fixup_jump, jump, b);
+ fixup_jump = b;
+ b += 3;
+ laststart = 0;
+ begalt = b;
+ break;
+
+ case '\\':
+ if (p == pend) goto invalid_pattern;
+ PATFETCH_RAW (c);
+ switch (c)
+ {
+#ifdef emacs
+ case '=':
+ PATPUSH (at_dot);
+ break;
+
+ case 's':
+ laststart = b;
+ PATPUSH (syntaxspec);
+ PATFETCH (c);
+ PATPUSH (syntax_spec_code[c]);
+ break;
+
+ case 'S':
+ laststart = b;
+ PATPUSH (notsyntaxspec);
+ PATFETCH (c);
+ PATPUSH (syntax_spec_code[c]);
+ break;
+#endif emacs
+
+ case '(':
+ if(obscure_syntax&RE_NO_BK_PARENS) goto normal_backsl;
+ if (stackp == stacke) goto nesting_too_deep;
+ if (regnum < RE_NREGS)
+ {
+ PATPUSH (start_memory);
+ PATPUSH (regnum);
+ }
+ *stackp++ = b - bufp->buffer;
+ *stackp++ = fixup_jump ? fixup_jump - bufp->buffer + 1 : 0;
+ *stackp++ = regnum++;
+ *stackp++ = begalt - bufp->buffer;
+ fixup_jump = 0;
+ laststart = 0;
+ begalt = b;
+ break;
+
+ case ')':
+ if(obscure_syntax&RE_NO_BK_PARENS) goto normal_backsl;
+ if (stackp == stackb) goto unmatched_close;
+ begalt = *--stackp + bufp->buffer;
+ if (fixup_jump)
+ store_jump (fixup_jump, jump, b);
+ if (stackp[-1] < RE_NREGS)
+ {
+ PATPUSH (stop_memory);
+ PATPUSH (stackp[-1]);
+ }
+ stackp -= 2;
+ fixup_jump = 0;
+ if (*stackp)
+ fixup_jump = *stackp + bufp->buffer - 1;
+ laststart = *--stackp + bufp->buffer;
+ break;
+
+ case '|':
+ if(obscure_syntax&RE_NO_BK_VBAR) goto normal_backsl;
+ insert_jump (on_failure_jump, begalt, b + 6, b);
+ pending_exact = 0;
+ b += 3;
+ if (fixup_jump)
+ store_jump (fixup_jump, jump, b);
+ fixup_jump = b;
+ b += 3;
+ laststart = 0;
+ begalt = b;
+ break;
+
+ case 'w':
+ laststart = b;
+ PATPUSH (wordchar);
+ break;
+
+ case 'W':
+ laststart = b;
+ PATPUSH (notwordchar);
+ break;
+
+ case '<':
+ PATPUSH (wordbeg);
+ break;
+
+ case '>':
+ PATPUSH (wordend);
+ break;
+
+ case 'b':
+ PATPUSH (wordbound);
+ break;
+
+ case 'B':
+ PATPUSH (notwordbound);
+ break;
+
+ case '`':
+ PATPUSH (begbuf);
+ break;
+
+ case '\'':
+ PATPUSH (endbuf);
+ break;
+
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ c1 = c - '0';
+ if (c1 >= regnum)
+ goto normal_char;
+ for (stackt = stackp - 2; stackt > stackb; stackt -= 4)
+ if (*stackt == c1)
+ goto normal_char;
+ laststart = b;
+ PATPUSH (duplicate);
+ PATPUSH (c1);
+ break;
+ default:
+ normal_backsl:
+ /* You might think it wuld be useful for \ to mean
+ not to translate; but if we don't translate it
+ it will never match anything. */
+ if (translate) c = translate[c];
+ goto normal_char;
+ }
+ break;
+
+ default:
+ normal_char:
+ if (!pending_exact || pending_exact + *pending_exact + 1 != b
+ || *pending_exact == 0177 || *p == '*' || *p == '^'
+ || *p == '+' || *p == '?')
+ {
+ laststart = b;
+ PATPUSH (exactn);
+ pending_exact = b;
+ PATPUSH (0);
+ }
+ PATPUSH (c);
+ (*pending_exact)++;
+ }
+ }
+
+ if (fixup_jump)
+ store_jump (fixup_jump, jump, b);
+
+ if (stackp != stackb) goto unmatched_open;
+
+ bufp->used = b - bufp->buffer;
+ return 0;
+
+ invalid_pattern:
+ return "Invalid regular expression";
+
+ unmatched_open:
+ return "Unmatched \\(";
+
+ unmatched_close:
+ return "Unmatched \\)";
+
+ end_of_pattern:
+ return "Premature end of regular expression";
+
+ nesting_too_deep:
+ return "Nesting too deep";
+
+ too_big:
+ return "Regular expression too big";
+
+ memory_exhausted:
+ return "Memory exhausted";
+}
+
+/* Store where `from' points a jump operation to jump to where `to' points.
+ `opcode' is the opcode to store. */
+
+static int
+store_jump (from, opcode, to)
+ char *from, *to;
+ char opcode;
+{
+ from[0] = opcode;
+ from[1] = (to - (from + 3)) & 0377;
+ from[2] = (to - (from + 3)) >> 8;
+}
+
+/* Open up space at char FROM, and insert there a jump to TO.
+ CURRENT_END gives te end of the storage no in use,
+ so we know how much data to copy up.
+ OP is the opcode of the jump to insert.
+
+ If you call this function, you must zero out pending_exact. */
+
+static int
+insert_jump (op, from, to, current_end)
+ char op;
+ char *from, *to, *current_end;
+{
+ register char *pto = current_end + 3;
+ register char *pfrom = current_end;
+ while (pfrom != from)
+ *--pto = *--pfrom;
+ store_jump (from, op, to);
+}
+
+/* Given a pattern, compute a fastmap from it.
+ The fastmap records which of the (1 << BYTEWIDTH) possible characters
+ can start a string that matches the pattern.
+ This fastmap is used by re_search to skip quickly over totally implausible text.
+
+ The caller must supply the address of a (1 << BYTEWIDTH)-byte data area
+ as bufp->fastmap.
+ The other components of bufp describe the pattern to be used. */
+
+void
+re_compile_fastmap (bufp)
+ struct re_pattern_buffer *bufp;
+{
+ unsigned char *pattern = (unsigned char *) bufp->buffer;
+ int size = bufp->used;
+ register char *fastmap = bufp->fastmap;
+ register unsigned char *p = pattern;
+ register unsigned char *pend = pattern + size;
+ register int j, k;
+ unsigned char *translate = (unsigned char *) bufp->translate;
+
+ unsigned char *stackb[NFAILURES];
+ unsigned char **stackp = stackb;
+
+ bzero (fastmap, (1 << BYTEWIDTH));
+ bufp->fastmap_accurate = 1;
+ bufp->can_be_null = 0;
+
+ while (p)
+ {
+ if (p == pend)
+ {
+ bufp->can_be_null = 1;
+ break;
+ }
+#ifdef SWITCH_ENUM_BUG
+ switch ((int) ((enum regexpcode) *p++))
+#else
+ switch ((enum regexpcode) *p++)
+#endif
+ {
+ case exactn:
+ if (translate)
+ fastmap[translate[p[1]]] = 1;
+ else
+ fastmap[p[1]] = 1;
+ break;
+
+ case begline:
+ case before_dot:
+ case at_dot:
+ case after_dot:
+ case begbuf:
+ case endbuf:
+ case wordbound:
+ case notwordbound:
+ case wordbeg:
+ case wordend:
+ continue;
+
+ case endline:
+ if (translate)
+ fastmap[translate['\n']] = 1;
+ else
+ fastmap['\n'] = 1;
+ if (bufp->can_be_null != 1)
+ bufp->can_be_null = 2;
+ break;
+
+ case finalize_jump:
+ case maybe_finalize_jump:
+ case jump:
+ case dummy_failure_jump:
+ bufp->can_be_null = 1;
+ j = *p++ & 0377;
+ j += SIGN_EXTEND_CHAR (*(char *)p) << 8;
+ p += j + 1; /* The 1 compensates for missing ++ above */
+ if (j > 0)
+ continue;
+ /* Jump backward reached implies we just went through
+ the body of a loop and matched nothing.
+ Opcode jumped to should be an on_failure_jump.
+ Just treat it like an ordinary jump.
+ For a * loop, it has pushed its failure point already;
+ if so, discard that as redundant. */
+ if ((enum regexpcode) *p != on_failure_jump)
+ continue;
+ p++;
+ j = *p++ & 0377;
+ j += SIGN_EXTEND_CHAR (*(char *)p) << 8;
+ p += j + 1; /* The 1 compensates for missing ++ above */
+ if (stackp != stackb && *stackp == p)
+ stackp--;
+ continue;
+
+ case on_failure_jump:
+ j = *p++ & 0377;
+ j += SIGN_EXTEND_CHAR (*(char *)p) << 8;
+ p++;
+ *++stackp = p + j;
+ continue;
+
+ case start_memory:
+ case stop_memory:
+ p++;
+ continue;
+
+ case duplicate:
+ bufp->can_be_null = 1;
+ fastmap['\n'] = 1;
+ case anychar:
+ for (j = 0; j < (1 << BYTEWIDTH); j++)
+ if (j != '\n')
+ fastmap[j] = 1;
+ if (bufp->can_be_null)
+ return;
+ /* Don't return; check the alternative paths
+ so we can set can_be_null if appropriate. */
+ break;
+
+ case wordchar:
+ for (j = 0; j < (1 << BYTEWIDTH); j++)
+ if (SYNTAX (j) == Sword)
+ fastmap[j] = 1;
+ break;
+
+ case notwordchar:
+ for (j = 0; j < (1 << BYTEWIDTH); j++)
+ if (SYNTAX (j) != Sword)
+ fastmap[j] = 1;
+ break;
+
+#ifdef emacs
+ case syntaxspec:
+ k = *p++;
+ for (j = 0; j < (1 << BYTEWIDTH); j++)
+ if (SYNTAX (j) == (enum syntaxcode) k)
+ fastmap[j] = 1;
+ break;
+
+ case notsyntaxspec:
+ for (j = 0; j < (1 << BYTEWIDTH); j++)
+ if (SYNTAX (j) != (enum syntaxcode) k)
+ fastmap[j] = 1;
+ break;
+#endif emacs
+
+ case charset:
+ for (j = *p++ * BYTEWIDTH - 1; j >= 0; j--)
+ if (p[j / BYTEWIDTH] & (1 << (j % BYTEWIDTH)))
+ {
+ if (translate)
+ fastmap[translate[j]] = 1;
+ else
+ fastmap[j] = 1;
+ }
+ break;
+
+ case charset_not:
+ /* Chars beyond end of map must be allowed */
+ for (j = *p * BYTEWIDTH; j < (1 << BYTEWIDTH); j++)
+ if (translate)
+ fastmap[translate[j]] = 1;
+ else
+ fastmap[j] = 1;
+
+ for (j = *p++ * BYTEWIDTH - 1; j >= 0; j--)
+ if (!(p[j / BYTEWIDTH] & (1 << (j % BYTEWIDTH))))
+ {
+ if (translate)
+ fastmap[translate[j]] = 1;
+ else
+ fastmap[j] = 1;
+ }
+ break;
+ }
+
+ /* Get here means we have successfully found the possible starting characters
+ of one path of the pattern. We need not follow this path any farther.
+ Instead, look at the next alternative remembered in the stack. */
+ if (stackp != stackb)
+ p = *stackp--;
+ else
+ break;
+ }
+}
+
+/* Like re_search_2, below, but only one string is specified. */
+
+int
+re_search (pbufp, string, size, startpos, range, regs)
+ struct re_pattern_buffer *pbufp;
+ char *string;
+ int size, startpos, range;
+ struct re_registers *regs;
+{
+ return re_search_2 (pbufp, 0, 0, string, size, startpos, range, regs, size);
+}
+
+/* Like re_match_2 but tries first a match starting at index `startpos',
+ then at startpos + 1, and so on.
+ `range' is the number of places to try before giving up.
+ If `range' is negative, the starting positions tried are
+ startpos, startpos - 1, etc.
+ It is up to the caller to make sure that range is not so large
+ as to take the starting position outside of the input strings.
+
+The value returned is the position at which the match was found,
+ or -1 if no match was found. */
+
+int
+re_search_2 (pbufp, string1, size1, string2, size2, startpos, range, regs, mstop)
+ struct re_pattern_buffer *pbufp;
+ char *string1, *string2;
+ int size1, size2;
+ int startpos;
+ register int range;
+ struct re_registers *regs;
+ int mstop;
+{
+ register char *fastmap = pbufp->fastmap;
+ register char *translate = pbufp->translate;
+ int total = size1 + size2;
+
+ /* Update the fastmap now if not correct already */
+ if (fastmap && !pbufp->fastmap_accurate)
+ re_compile_fastmap (pbufp);
+
+ while (1)
+ {
+ /* If a fastmap is supplied, skip quickly over characters
+ that cannot possibly be the start of a match.
+ Note, however, that if the pattern can possibly match
+ the null string, we must test it at each starting point
+ so that we take the first null string we get. */
+
+ if (fastmap && startpos < total && pbufp->can_be_null != 1)
+ {
+ if (range > 0)
+ {
+ register int lim = 0;
+ register char *p;
+ int irange = range;
+ if (startpos < size1 && startpos + range >= size1)
+ lim = range - (size1 - startpos);
+
+ p = &(startpos >= size1 ? string2 - size1 : string1)[startpos];
+
+ if (translate)
+ {
+ while (range > lim && !fastmap[translate[*p++]])
+ range--;
+ }
+ else
+ {
+ while (range > lim && !fastmap[*p++])
+ range--;
+ }
+ startpos += irange - range;
+ }
+ else
+ {
+ register char c;
+ if (startpos >= size1) c = string2[startpos - size1];
+ else c = string1[startpos];
+ if (translate ? !fastmap[translate[c]] : !fastmap[c])
+ goto advance;
+ }
+ }
+
+ if (range >= 0 && startpos == total
+ && fastmap && pbufp->can_be_null == 0)
+ return -1;
+
+ if (0 <= re_match_2 (pbufp, string1, size1, string2, size2, startpos, regs, mstop))
+ return startpos;
+
+#ifdef C_ALLOCA
+ alloca (0);
+#endif /* C_ALLOCA */
+
+ advance:
+ if (!range) break;
+ if (range > 0) range--, startpos++; else range++, startpos--;
+ }
+ return -1;
+}
+
+#ifndef emacs /* emacs never uses this */
+int
+re_match (pbufp, string, size, pos, regs)
+ struct re_pattern_buffer *pbufp;
+ char *string;
+ int size, pos;
+ struct re_registers *regs;
+{
+ return re_match_2 (pbufp, 0, 0, string, size, pos, regs, size);
+}
+#endif /* emacs */
+
+/* Match the pattern described by `pbufp'
+ against data which is the virtual concatenation of `string1' and `string2'.
+ `size1' and `size2' are the sizes of the two data strings.
+ Start the match at position `pos'.
+ Do not consider matching past the position `mstop'.
+
+ If pbufp->fastmap is nonzero, then it had better be up to date.
+
+ The reason that the data to match is specified as two components
+ which are to be regarded as concatenated
+ is so that this function can be used directly on the contents of an Emacs buffer.
+
+ -1 is returned if there is no match. Otherwise the value is the length
+ of the substring which was matched.
+*/
+
+int
+re_match_2 (pbufp, string1, size1, string2, size2, pos, regs, mstop)
+ struct re_pattern_buffer *pbufp;
+ char *string1, *string2;
+ int size1, size2;
+ int pos;
+ struct re_registers *regs;
+ int mstop;
+{
+ register char *p = pbufp->buffer;
+ register char *pend = p + pbufp->used;
+ /* End of first string */
+ char *end1;
+ /* End of second string */
+ char *end2;
+ /* Pointer just past last char to consider matching */
+ char *end_match_1, *end_match_2;
+ register char *d, *dend;
+ register int mcnt;
+ char *translate = pbufp->translate;
+
+ /* Failure point stack. Each place that can handle a failure further down the line
+ pushes a failure point on this stack. It consists of two char *'s.
+ The first one pushed is where to resume scanning the pattern;
+ the second pushed is where to resume scanning the strings.
+ If the latter is zero, the failure point is a "dummy".
+ If a failure happens and the innermost failure point is dormant,
+ it discards that failure point and tries the next one. */
+
+ char **stackb = (char **) alloca (2 * NFAILURES * sizeof (char *));
+ char **stackp = stackb, **stacke = &stackb[2 * NFAILURES];
+
+ /* Information on the "contents" of registers.
+ These are pointers into the input strings; they record
+ just what was matched (on this attempt) by some part of the pattern.
+ The start_memory command stores the start of a register's contents
+ and the stop_memory command stores the end.
+
+ At that point, regstart[regnum] points to the first character in the register,
+ regend[regnum] points to the first character beyond the end of the register,
+ and regstart_segend[regnum] is either the same as regend[regnum]
+ or else points to the end of the input string into which regstart[regnum] points.
+ The latter case happens when regstart[regnum] is in string1 and
+ regend[regnum] is in string2. */
+
+ char *regstart[RE_NREGS];
+ char *regstart_segend[RE_NREGS];
+ char *regend[RE_NREGS];
+
+ /* Set up pointers to ends of strings.
+ Don't allow the second string to be empty unless both are empty. */
+ if (!size2)
+ {
+ string2 = string1;
+ size2 = size1;
+ string1 = 0;
+ size1 = 0;
+ }
+ end1 = string1 + size1;
+ end2 = string2 + size2;
+
+ /* Compute where to stop matching, within the two strings */
+ if (mstop <= size1)
+ {
+ end_match_1 = string1 + mstop;
+ end_match_2 = string2;
+ }
+ else
+ {
+ end_match_1 = end1;
+ end_match_2 = string2 + mstop - size1;
+ }
+
+ /* Initialize \( and \) text positions to -1
+ to mark ones that no \( or \) has been seen for. */
+
+ for (mcnt = 0; mcnt < sizeof (regstart) / sizeof (*regstart); mcnt++)
+ regstart[mcnt] = (char *) -1;
+
+ /* `p' scans through the pattern as `d' scans through the data.
+ `dend' is the end of the input string that `d' points within.
+ `d' is advanced into the following input string whenever necessary,
+ but this happens before fetching;
+ therefore, at the beginning of the loop,
+ `d' can be pointing at the end of a string,
+ but it cannot equal string2. */
+
+ if (pos <= size1)
+ d = string1 + pos, dend = end_match_1;
+ else
+ d = string2 + pos - size1, dend = end_match_2;
+
+/* Write PREFETCH; just before fetching a character with *d. */
+#define PREFETCH \
+ while (d == dend) \
+ { if (dend == end_match_2) goto fail; /* end of string2 => failure */ \
+ d = string2; /* end of string1 => advance to string2. */ \
+ dend = end_match_2; }
+
+ /* This loop loops over pattern commands.
+ It exits by returning from the function if match is complete,
+ or it drops through if match fails at this starting point in the input data. */
+
+ while (1)
+ {
+ if (p == pend)
+ /* End of pattern means we have succeeded! */
+ {
+ /* If caller wants register contents data back, convert it to indices */
+ if (regs)
+ {
+ regend[0] = d;
+ regstart[0] = string1;
+ for (mcnt = 0; mcnt < RE_NREGS; mcnt++)
+ {
+ if ((mcnt != 0) && regstart[mcnt] == (char *) -1)
+ {
+ regs->start[mcnt] = -1;
+ regs->end[mcnt] = -1;
+ continue;
+ }
+ if (regstart[mcnt] - string1 < 0 ||
+ regstart[mcnt] - string1 > size1)
+ regs->start[mcnt] = regstart[mcnt] - string2 + size1;
+ else
+ regs->start[mcnt] = regstart[mcnt] - string1;
+ if (regend[mcnt] - string1 < 0 ||
+ regend[mcnt] - string1 > size1)
+ regs->end[mcnt] = regend[mcnt] - string2 + size1;
+ else
+ regs->end[mcnt] = regend[mcnt] - string1;
+ }
+ regs->start[0] = pos;
+ }
+ if (d - string1 >= 0 && d - string1 <= size1)
+ return d - string1 - pos;
+ else
+ return d - string2 + size1 - pos;
+ }
+
+ /* Otherwise match next pattern command */
+#ifdef SWITCH_ENUM_BUG
+ switch ((int) ((enum regexpcode) *p++))
+#else
+ switch ((enum regexpcode) *p++)
+#endif
+ {
+
+ /* \( is represented by a start_memory, \) by a stop_memory.
+ Both of those commands contain a "register number" argument.
+ The text matched within the \( and \) is recorded under that number.
+ Then, \<digit> turns into a `duplicate' command which
+ is followed by the numeric value of <digit> as the register number. */
+
+ case start_memory:
+ regstart[*p] = d;
+ regstart_segend[*p++] = dend;
+ break;
+
+ case stop_memory:
+ regend[*p] = d;
+ if (regstart_segend[*p] == dend)
+ regstart_segend[*p] = d;
+ p++;
+ break;
+
+ case duplicate:
+ {
+ int regno = *p++; /* Get which register to match against */
+ register char *d2, *dend2;
+
+ d2 = regstart[regno];
+ dend2 = regstart_segend[regno];
+ while (1)
+ {
+ /* Advance to next segment in register contents, if necessary */
+ while (d2 == dend2)
+ {
+ if (dend2 == end_match_2) break;
+ if (dend2 == regend[regno]) break;
+ d2 = string2, dend2 = regend[regno]; /* end of string1 => advance to string2. */
+ }
+ /* At end of register contents => success */
+ if (d2 == dend2) break;
+
+ /* Advance to next segment in data being matched, if necessary */
+ PREFETCH;
+
+ /* mcnt gets # consecutive chars to compare */
+ mcnt = dend - d;
+ if (mcnt > dend2 - d2)
+ mcnt = dend2 - d2;
+ /* Compare that many; failure if mismatch, else skip them. */
+ if (translate ? bcmp_translate (d, d2, mcnt, translate) : bcmp (d, d2, mcnt))
+ goto fail;
+ d += mcnt, d2 += mcnt;
+ }
+ }
+ break;
+
+ case anychar:
+ /* fetch a data character */
+ PREFETCH;
+ /* Match anything but a newline. */
+ if ((translate ? translate[*d++] : *d++) == '\n')
+ goto fail;
+ break;
+
+ case charset:
+ case charset_not:
+ {
+ /* Nonzero for charset_not */
+ int not = 0;
+ register int c;
+ if (*(p - 1) == (char) charset_not)
+ not = 1;
+
+ /* fetch a data character */
+ PREFETCH;
+
+ if (translate)
+ c = translate [*(unsigned char *)d];
+ else
+ c = *(unsigned char *)d;
+
+ if (c < *p * BYTEWIDTH
+ && p[1 + c / BYTEWIDTH] & (1 << (c % BYTEWIDTH)))
+ not = !not;
+
+ p += 1 + *p;
+
+ if (!not) goto fail;
+ d++;
+ break;
+ }
+
+ case begline:
+ if (d == string1 || d[-1] == '\n')
+ break;
+ goto fail;
+
+ case endline:
+ if (d == end2
+ || (d == end1 ? (size2 == 0 || *string2 == '\n') : *d == '\n'))
+ break;
+ goto fail;
+
+ /* "or" constructs ("|") are handled by starting each alternative
+ with an on_failure_jump that points to the start of the next alternative.
+ Each alternative except the last ends with a jump to the joining point.
+ (Actually, each jump except for the last one really jumps
+ to the following jump, because tensioning the jumps is a hassle.) */
+
+ /* The start of a stupid repeat has an on_failure_jump that points
+ past the end of the repeat text.
+ This makes a failure point so that, on failure to match a repetition,
+ matching restarts past as many repetitions have been found
+ with no way to fail and look for another one. */
+
+ /* A smart repeat is similar but loops back to the on_failure_jump
+ so that each repetition makes another failure point. */
+
+ case on_failure_jump:
+ if (stackp == stacke)
+ {
+ char **stackx = (char **) alloca (2 * (stacke - stackb) * sizeof (char *));
+ bcopy (stackb, stackx, (stacke - stackb) * sizeof (char *));
+ stackp += stackx - stackb;
+ stacke = stackx + 2 * (stacke - stackb);
+ stackb = stackx;
+ }
+ mcnt = *p++ & 0377;
+ mcnt += SIGN_EXTEND_CHAR (*p) << 8;
+ p++;
+ *stackp++ = mcnt + p;
+ *stackp++ = d;
+ break;
+
+ /* The end of a smart repeat has an maybe_finalize_jump back.
+ Change it either to a finalize_jump or an ordinary jump. */
+
+ case maybe_finalize_jump:
+ mcnt = *p++ & 0377;
+ mcnt += SIGN_EXTEND_CHAR (*p) << 8;
+ p++;
+ /* Compare what follows with the begining of the repeat.
+ If we can establish that there is nothing that they would
+ both match, we can change to finalize_jump */
+ if (p == pend)
+ p[-3] = (char) finalize_jump;
+ else if (*p == (char) exactn || *p == (char) endline)
+ {
+ register int c = *p == (char) endline ? '\n' : p[2];
+ register char *p1 = p + mcnt;
+ /* p1[0] ... p1[2] are an on_failure_jump.
+ Examine what follows that */
+ if (p1[3] == (char) exactn && p1[5] != c)
+ p[-3] = (char) finalize_jump;
+ else if (p1[3] == (char) charset || p1[3] == (char) charset_not)
+ {
+ int not = p1[3] == (char) charset_not;
+ if (c < p1[4] * BYTEWIDTH
+ && p1[5 + c / BYTEWIDTH] & (1 << (c % BYTEWIDTH)))
+ not = !not;
+ /* not is 1 if c would match */
+ /* That means it is not safe to finalize */
+ if (!not)
+ p[-3] = (char) finalize_jump;
+ }
+ }
+ p -= 2;
+ if (p[-1] != (char) finalize_jump)
+ {
+ p[-1] = (char) jump;
+ goto nofinalize;
+ }
+
+ /* The end of a stupid repeat has a finalize-jump
+ back to the start, where another failure point will be made
+ which will point after all the repetitions found so far. */
+
+ case finalize_jump:
+ stackp -= 2;
+
+ case jump:
+ nofinalize:
+ mcnt = *p++ & 0377;
+ mcnt += SIGN_EXTEND_CHAR (*p) << 8;
+ p += mcnt + 1; /* The 1 compensates for missing ++ above */
+ break;
+
+ case dummy_failure_jump:
+ if (stackp == stacke)
+ {
+ char **stackx = (char **) alloca (2 * (stacke - stackb) * sizeof (char *));
+ bcopy (stackb, stackx, (stacke - stackb) * sizeof (char *));
+ stackp += stackx - stackb;
+ stacke = stackx + 2 * (stacke - stackb);
+ stackb = stackx;
+ }
+ *stackp++ = 0;
+ *stackp++ = 0;
+ goto nofinalize;
+
+ case wordbound:
+ if (d == string1 /* Points to first char */
+ || d == end2 /* Points to end */
+ || (d == end1 && size2 == 0)) /* Points to end */
+ break;
+ if ((SYNTAX (((unsigned char *)d)[-1]) == Sword)
+ != (SYNTAX (d == end1 ? *(unsigned char *)string2 : *(unsigned char *)d) == Sword))
+ break;
+ goto fail;
+
+ case notwordbound:
+ if (d == string1 /* Points to first char */
+ || d == end2 /* Points to end */
+ || (d == end1 && size2 == 0)) /* Points to end */
+ goto fail;
+ if ((SYNTAX (((unsigned char *)d)[-1]) == Sword)
+ != (SYNTAX (d == end1 ? *(unsigned char *)string2 : *(unsigned char *)d) == Sword))
+ goto fail;
+ break;
+
+ case wordbeg:
+ if (d == end2 /* Points to end */
+ || (d == end1 && size2 == 0) /* Points to end */
+ || SYNTAX (*(unsigned char *) (d == end1 ? string2 : d)) != Sword) /* Next char not a letter */
+ goto fail;
+ if (d == string1 /* Points to first char */
+ || SYNTAX (((unsigned char *)d)[-1]) != Sword) /* prev char not letter */
+ break;
+ goto fail;
+
+ case wordend:
+ if (d == string1 /* Points to first char */
+ || SYNTAX (((unsigned char *)d)[-1]) != Sword) /* prev char not letter */
+ goto fail;
+ if (d == end2 /* Points to end */
+ || (d == end1 && size2 == 0) /* Points to end */
+ || SYNTAX (d == end1 ? *(unsigned char *)string2 : *(unsigned char *)d) != Sword) /* Next char not a letter */
+ break;
+ goto fail;
+
+#ifdef emacs
+ case before_dot:
+ if (((d - string2 <= (unsigned) size2)
+ ? d - (char *) bf_p2 : d - (char *) bf_p1)
+ <= point)
+ goto fail;
+ break;
+
+ case at_dot:
+ if (((d - string2 <= (unsigned) size2)
+ ? d - (char *) bf_p2 : d - (char *) bf_p1)
+ == point)
+ goto fail;
+ break;
+
+ case after_dot:
+ if (((d - string2 <= (unsigned) size2)
+ ? d - (char *) bf_p2 : d - (char *) bf_p1)
+ >= point)
+ goto fail;
+ break;
+
+ case wordchar:
+ mcnt = (int) Sword;
+ goto matchsyntax;
+
+ case syntaxspec:
+ mcnt = *p++;
+ matchsyntax:
+ PREFETCH;
+ if (SYNTAX (*(unsigned char *)d++) != (enum syntaxcode) mcnt) goto fail;
+ break;
+
+ case notwordchar:
+ mcnt = (int) Sword;
+ goto matchnotsyntax;
+
+ case notsyntaxspec:
+ mcnt = *p++;
+ matchnotsyntax:
+ PREFETCH;
+ if (SYNTAX (*(unsigned char *)d++) == (enum syntaxcode) mcnt) goto fail;
+ break;
+#else
+ case wordchar:
+ PREFETCH;
+ if (SYNTAX (*(unsigned char *)d++) == 0) goto fail;
+ break;
+
+ case notwordchar:
+ PREFETCH;
+ if (SYNTAX (*(unsigned char *)d++) != 0) goto fail;
+ break;
+#endif not emacs
+
+ case begbuf:
+ if (d == string1) /* Note, d cannot equal string2 */
+ break; /* unless string1 == string2. */
+ goto fail;
+
+ case endbuf:
+ if (d == end2 || (d == end1 && size2 == 0))
+ break;
+ goto fail;
+
+ case exactn:
+ /* Match the next few pattern characters exactly.
+ mcnt is how many characters to match. */
+ mcnt = *p++;
+ if (translate)
+ {
+ do
+ {
+ PREFETCH;
+ if (translate[*(unsigned char *)d++] != *p++) goto fail;
+ }
+ while (--mcnt);
+ }
+ else
+ {
+ do
+ {
+ PREFETCH;
+ if (*d++ != *p++) goto fail;
+ }
+ while (--mcnt);
+ }
+ break;
+ }
+ continue; /* Successfully matched one pattern command; keep matching */
+
+ /* Jump here if any matching operation fails. */
+ fail:
+ if (stackp != stackb)
+ /* A restart point is known. Restart there and pop it. */
+ {
+ if (!stackp[-2])
+ { /* If innermost failure point is dormant, flush it and keep looking */
+ stackp -= 2;
+ goto fail;
+ }
+ d = *--stackp;
+ p = *--stackp;
+ if (d >= string1 && d <= end1)
+ dend = end_match_1;
+ }
+ else break; /* Matching at this starting point really fails! */
+ }
+ return -1; /* Failure to match */
+}
+
+static int
+bcmp_translate (s1, s2, len, translate)
+ char *s1, *s2;
+ register int len;
+ char *translate;
+{
+ register char *p1 = s1, *p2 = s2;
+ while (len)
+ {
+ if (translate [*p1++] != translate [*p2++]) return 1;
+ len--;
+ }
+ return 0;
+}
+
+/* Entry points compatible with bsd4.2 regex library */
+
+#ifndef emacs
+
+static struct re_pattern_buffer re_comp_buf;
+
+char *
+re_comp (s)
+ char *s;
+{
+ if (!s)
+ {
+ if (!re_comp_buf.buffer)
+ return "No previous regular expression";
+ return 0;
+ }
+
+ if (!re_comp_buf.buffer)
+ {
+ if (!(re_comp_buf.buffer = (char *) malloc (200)))
+ return "Memory exhausted";
+ re_comp_buf.allocated = 200;
+ if (!(re_comp_buf.fastmap = (char *) malloc (1 << BYTEWIDTH)))
+ return "Memory exhausted";
+ }
+ return re_compile_pattern (s, strlen (s), &re_comp_buf);
+}
+
+int
+re_exec (s)
+ char *s;
+{
+ int len = strlen (s);
+ return 0 <= re_search (&re_comp_buf, s, len, 0, len, 0);
+}
+
+#endif /* emacs */
+
+#ifdef test
+
+#include <stdio.h>
+
+/* Indexed by a character, gives the upper case equivalent of the character */
+
+static char upcase[0400] =
+ { 000, 001, 002, 003, 004, 005, 006, 007,
+ 010, 011, 012, 013, 014, 015, 016, 017,
+ 020, 021, 022, 023, 024, 025, 026, 027,
+ 030, 031, 032, 033, 034, 035, 036, 037,
+ 040, 041, 042, 043, 044, 045, 046, 047,
+ 050, 051, 052, 053, 054, 055, 056, 057,
+ 060, 061, 062, 063, 064, 065, 066, 067,
+ 070, 071, 072, 073, 074, 075, 076, 077,
+ 0100, 0101, 0102, 0103, 0104, 0105, 0106, 0107,
+ 0110, 0111, 0112, 0113, 0114, 0115, 0116, 0117,
+ 0120, 0121, 0122, 0123, 0124, 0125, 0126, 0127,
+ 0130, 0131, 0132, 0133, 0134, 0135, 0136, 0137,
+ 0140, 0101, 0102, 0103, 0104, 0105, 0106, 0107,
+ 0110, 0111, 0112, 0113, 0114, 0115, 0116, 0117,
+ 0120, 0121, 0122, 0123, 0124, 0125, 0126, 0127,
+ 0130, 0131, 0132, 0173, 0174, 0175, 0176, 0177,
+ 0200, 0201, 0202, 0203, 0204, 0205, 0206, 0207,
+ 0210, 0211, 0212, 0213, 0214, 0215, 0216, 0217,
+ 0220, 0221, 0222, 0223, 0224, 0225, 0226, 0227,
+ 0230, 0231, 0232, 0233, 0234, 0235, 0236, 0237,
+ 0240, 0241, 0242, 0243, 0244, 0245, 0246, 0247,
+ 0250, 0251, 0252, 0253, 0254, 0255, 0256, 0257,
+ 0260, 0261, 0262, 0263, 0264, 0265, 0266, 0267,
+ 0270, 0271, 0272, 0273, 0274, 0275, 0276, 0277,
+ 0300, 0301, 0302, 0303, 0304, 0305, 0306, 0307,
+ 0310, 0311, 0312, 0313, 0314, 0315, 0316, 0317,
+ 0320, 0321, 0322, 0323, 0324, 0325, 0326, 0327,
+ 0330, 0331, 0332, 0333, 0334, 0335, 0336, 0337,
+ 0340, 0341, 0342, 0343, 0344, 0345, 0346, 0347,
+ 0350, 0351, 0352, 0353, 0354, 0355, 0356, 0357,
+ 0360, 0361, 0362, 0363, 0364, 0365, 0366, 0367,
+ 0370, 0371, 0372, 0373, 0374, 0375, 0376, 0377
+ };
+
+main ()
+{
+ char pat[80];
+ struct re_pattern_buffer buf;
+ int i;
+ char c;
+ char fastmap[(1 << BYTEWIDTH)];
+ char *gets();
+
+ buf.allocated = 40;
+ buf.buffer = (char *) malloc (buf.allocated);
+ buf.fastmap = fastmap;
+ buf.translate = upcase;
+
+ while (gets(pat))
+ {
+
+ if (*pat)
+ {
+ re_compile_pattern (pat, strlen(pat), &buf);
+
+ for (i = 0; i < buf.used; i++)
+ printchar (buf.buffer[i]);
+
+ putchar ('\n');
+
+ printf ("%d allocated, %d used.\n", buf.allocated, buf.used);
+
+ re_compile_fastmap (&buf);
+ printf ("Allowed by fastmap: ");
+ for (i = 0; i < (1 << BYTEWIDTH); i++)
+ if (fastmap[i]) printchar (i);
+ putchar ('\n');
+ }
+
+ gets (pat); /* Now read the string to match against */
+
+ i = re_match (&buf, pat, strlen (pat), 0, 0);
+ printf ("Match value %d.\n", i);
+ }
+}
+
+#ifdef NOTDEF
+print_buf (bufp)
+ struct re_pattern_buffer *bufp;
+{
+ int i;
+
+ printf ("buf is :\n----------------\n");
+ for (i = 0; i < bufp->used; i++)
+ printchar (bufp->buffer[i]);
+
+ printf ("\n%d allocated, %d used.\n", bufp->allocated, bufp->used);
+
+ printf ("Allowed by fastmap: ");
+ for (i = 0; i < (1 << BYTEWIDTH); i++)
+ if (bufp->fastmap[i])
+ printchar (i);
+ printf ("\nAllowed by translate: ");
+ if (bufp->translate)
+ for (i = 0; i < (1 << BYTEWIDTH); i++)
+ if (bufp->translate[i])
+ printchar (i);
+ printf ("\nfastmap is%s accurate\n", bufp->fastmap_accurate ? "" : "n't");
+ printf ("can %s be null\n----------", bufp->can_be_null ? "" : "not");
+}
+#endif
+
+printchar (c)
+ char c;
+{
+ if (c < 041 || c >= 0177)
+ {
+ putchar ('\\');
+ putchar (((c >> 6) & 3) + '0');
+ putchar (((c >> 3) & 7) + '0');
+ putchar ((c & 7) + '0');
+ }
+ else
+ putchar (c);
+}
+
+error (string)
+ char *string;
+{
+ puts (string);
+ exit (1);
+}
+
+#endif test
diff --git a/regex.h b/regex.h
new file mode 100644
index 00000000..b87b47e6
--- /dev/null
+++ b/regex.h
@@ -0,0 +1,213 @@
+/* Definitions for data structures callers pass the regex library.
+ Copyright (C) 1985 Free Software Foundation, Inc.
+
+ NO WARRANTY
+
+ BECAUSE THIS PROGRAM IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY
+NO WARRANTY, TO THE EXTENT PERMITTED BY APPLICABLE STATE LAW. EXCEPT
+WHEN OTHERWISE STATED IN WRITING, FREE SOFTWARE FOUNDATION, INC,
+RICHARD M. STALLMAN AND/OR OTHER PARTIES PROVIDE THIS PROGRAM "AS IS"
+WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
+FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY
+AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE
+DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR
+CORRECTION.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW WILL RICHARD M.
+STALLMAN, THE FREE SOFTWARE FOUNDATION, INC., AND/OR ANY OTHER PARTY
+WHO MAY MODIFY AND REDISTRIBUTE THIS PROGRAM AS PERMITTED BELOW, BE
+LIABLE TO YOU FOR DAMAGES, INCLUDING ANY LOST PROFITS, LOST MONIES, OR
+OTHER SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR
+DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY THIRD PARTIES OR
+A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS) THIS
+PROGRAM, EVEN IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH
+DAMAGES, OR FOR ANY CLAIM BY ANY OTHER PARTY.
+
+ GENERAL PUBLIC LICENSE TO COPY
+
+ 1. You may copy and distribute verbatim copies of this source file
+as you receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy a valid copyright notice "Copyright
+(C) 1985 Free Software Foundation, Inc."; and include following the
+copyright notice a verbatim copy of the above disclaimer of warranty
+and of this License.
+
+ 2. You may modify your copy or copies of this source file or
+any portion of it, and copy and distribute such modifications under
+the terms of Paragraph 1 above, provided that you also do the following:
+
+ a) cause the modified files to carry prominent notices stating
+ that you changed the files and the date of any change; and
+
+ b) cause the whole of any work that you distribute or publish,
+ that in whole or in part contains or is a derivative of this
+ program or any part thereof, to be freely distributed
+ and licensed to all third parties on terms identical to those
+ contained in this License Agreement (except that you may choose
+ to grant more extensive warranty protection to third parties,
+ at your option).
+
+ 3. You may copy and distribute this program or any portion of it in
+compiled, executable or object code form under the terms of Paragraphs
+1 and 2 above provided that you do the following:
+
+ a) cause each such copy to be accompanied by the
+ corresponding machine-readable source code, which must
+ be distributed under the terms of Paragraphs 1 and 2 above; or,
+
+ b) cause each such copy to be accompanied by a
+ written offer, with no time limit, to give any third party
+ free (except for a nominal shipping charge) a machine readable
+ copy of the corresponding source code, to be distributed
+ under the terms of Paragraphs 1 and 2 above; or,
+
+ c) in the case of a recipient of this program in compiled, executable
+ or object code form (without the corresponding source code) you
+ shall cause copies you distribute to be accompanied by a copy
+ of the written offer of source code which you received along
+ with the copy you received.
+
+ 4. You may not copy, sublicense, distribute or transfer this program
+except as expressly provided under this License Agreement. Any attempt
+otherwise to copy, sublicense, distribute or transfer this program is void and
+your rights to use the program under this License agreement shall be
+automatically terminated. However, parties who have received computer
+software programs from you with this License Agreement will not have
+their licenses terminated so long as such parties remain in full compliance.
+
+
+In other words, you are welcome to use, share and improve this program.
+You are forbidden to forbid anyone else to use, share and improve
+what you give them. Help stamp out software-hoarding! */
+
+
+#ifndef RE_NREGS
+#define RE_NREGS 10
+#endif
+
+
+/* JF for syntax stuff */
+/* To add more variable-syntax features, just use more bits. If we go over 16,
+ we probably should make obscure_syntax a long. (JF: Yes, virgina, there
+really are 16 bit machines out there) */
+#define RE_NO_BK_PARENS (1<<0)
+#define RE_NO_BK_VBAR (1<<1)
+
+/* This data structure is used to represent a compiled pattern. */
+
+struct re_pattern_buffer
+ {
+ char *buffer; /* Space holding the compiled pattern commands. */
+ int allocated; /* Size of space that buffer points to */
+ int used; /* Length of portion of buffer actually occupied */
+ char *fastmap; /* Pointer to fastmap, if any, or zero if none. */
+ /* re_search uses the fastmap, if there is one,
+ to skip quickly over totally implausible characters */
+ char *translate; /* Translate table to apply to all characters before comparing.
+ Or zero for no translation.
+ The translation is applied to a pattern when it is compiled
+ and to data when it is matched. */
+ char fastmap_accurate;
+ /* Set to zero when a new pattern is stored,
+ set to one when the fastmap is updated from it. */
+ char can_be_null; /* Set to one by compiling fastmap
+ if this pattern might match the null string.
+ It does not necessarily match the null string
+ in that case, but if this is zero, it cannot.
+ 2 as value means can match null string
+ but at end of range or before a character
+ listed in the fastmap. */
+ };
+
+/* Structure to store "register" contents data in.
+
+ Pass the address of such a structure as an argument to re_match, etc.,
+ if you want this information back.
+
+ start[i] and end[i] record the string matched by \( ... \) grouping i,
+ for i from 1 to RE_NREGS - 1.
+ start[0] and end[0] record the entire string matched. */
+
+struct re_registers
+ {
+ int start[RE_NREGS];
+ int end[RE_NREGS];
+ };
+
+/* These are the command codes that appear in compiled regular expressions, one per byte.
+ Some command codes are followed by argument bytes.
+ A command code can specify any interpretation whatever for its arguments.
+ Zero-bytes may appear in the compiled regular expression. */
+
+enum regexpcode
+ {
+ unused,
+ exactn, /* followed by one byte giving n, and then by n literal bytes */
+ begline, /* fails unless at beginning of line */
+ endline, /* fails unless at end of line */
+ jump, /* followed by two bytes giving relative address to jump to */
+ on_failure_jump, /* followed by two bytes giving relative address of place
+ to resume at in case of failure. */
+ finalize_jump, /* Throw away latest failure point and then jump to address. */
+ maybe_finalize_jump, /* Like jump but finalize if safe to do so.
+ This is used to jump back to the beginning
+ of a repeat. If the command that follows
+ this jump is clearly incompatible with the
+ one at the beginning of the repeat, such that
+ we can be sure that there is no use backtracking
+ out of repetitions already completed,
+ then we finalize. */
+ dummy_failure_jump, /* jump, and push a dummy failure point.
+ This failure point will be thrown away
+ if an attempt is made to use it for a failure.
+ A + construct makes this before the first repeat. */
+ anychar, /* matches any one character */
+ charset, /* matches any one char belonging to specified set.
+ First following byte is # bitmap bytes.
+ Then come bytes for a bit-map saying which chars are in.
+ Bits in each byte are ordered low-bit-first.
+ A character is in the set if its bit is 1.
+ A character too large to have a bit in the map
+ is automatically not in the set */
+ charset_not, /* similar but match any character that is NOT one of those specified */
+ start_memory, /* starts remembering the text that is matched
+ and stores it in a memory register.
+ followed by one byte containing the register number.
+ Register numbers must be in the range 0 through NREGS. */
+ stop_memory, /* stops remembering the text that is matched
+ and stores it in a memory register.
+ followed by one byte containing the register number.
+ Register numbers must be in the range 0 through NREGS. */
+ duplicate, /* match a duplicate of something remembered.
+ Followed by one byte containing the index of the memory register. */
+ before_dot, /* Succeeds if before dot */
+ at_dot, /* Succeeds if at dot */
+ after_dot, /* Succeeds if after dot */
+ begbuf, /* Succeeds if at beginning of buffer */
+ endbuf, /* Succeeds if at end of buffer */
+ wordchar, /* Matches any word-constituent character */
+ notwordchar, /* Matches any char that is not a word-constituent */
+ wordbeg, /* Succeeds if at word beginning */
+ wordend, /* Succeeds if at word end */
+ wordbound, /* Succeeds if at a word boundary */
+ notwordbound, /* Succeeds if not at a word boundary */
+ syntaxspec, /* Matches any character whose syntax is specified.
+ followed by a byte which contains a syntax code, Sword or such like */
+ notsyntaxspec /* Matches any character whose syntax differs from the specified. */
+ };
+
+extern char *re_compile_pattern ();
+/* Is this really advertised? */
+extern void re_compile_fastmap ();
+extern int re_search (), re_search_2 ();
+extern int re_match (), re_match_2 ();
+
+/* 4.2 bsd compatibility (yuck) */
+extern char *re_comp ();
+extern int re_exec ();
+
+#ifdef SYNTAX_TABLE
+extern char *re_syntax_table;
+#endif