Diffstat (limited to 'gawk.info-1')
-rw-r--r-- gawk.info-1 1069
1 files changed, 1069 insertions, 0 deletions
diff --git a/gawk.info-1 b/gawk.info-1
new file mode 100644
index 00000000..e6f2e750
--- /dev/null
+++ b/gawk.info-1
@@ -0,0 +1,1069 @@
+This is Info file gawk.info, produced by Makeinfo-1.54 from the input
+file gawk.texi.
+
+ This file documents `awk', a program that you can use to select
+particular records in a file and perform operations upon them.
+
+ This is Edition 0.15 of `The GAWK Manual',
+for the 2.15 version of the GNU implementation
+of AWK.
+
+ Copyright (C) 1989, 1991, 1992, 1993 Free Software Foundation, Inc.
+
+ Permission is granted to make and distribute verbatim copies of this
+manual provided the copyright notice and this permission notice are
+preserved on all copies.
+
+ Permission is granted to copy and distribute modified versions of
+this manual under the conditions for verbatim copying, provided that
+the entire resulting derived work is distributed under the terms of a
+permission notice identical to this one.
+
+ Permission is granted to copy and distribute translations of this
+manual into another language, under the above conditions for modified
+versions, except that this permission notice may be stated in a
+translation approved by the Foundation.
+
+
+File: gawk.info, Node: Top, Next: Preface, Prev: (dir), Up: (dir)
+
+General Introduction
+********************
+
+ This file documents `awk', a program that you can use to select
+particular records in a file and perform operations upon them.
+
+ This is Edition 0.15 of `The GAWK Manual',
+for the 2.15 version of the GNU implementation
+of AWK.
+
+* Menu:
+
+* Preface:: What you can do with `awk'; brief history
+ and acknowledgements.
+* Copying:: Your right to copy and distribute `gawk'.
+* This Manual:: Using this manual.
+ Includes sample input files that you can use.
+* Getting Started:: A basic introduction to using `awk'.
+ How to run an `awk' program.
+ Command line syntax.
+* Reading Files:: How to read files and manipulate fields.
+* Printing:: How to print using `awk'. Describes the
+ `print' and `printf' statements.
+ Also describes redirection of output.
+* One-liners:: Short, sample `awk' programs.
+* Patterns:: The various types of patterns
+ explained in detail.
+* Actions:: The various types of actions are
+ introduced here. Describes
+ expressions and the various operators in
+ detail. Also describes comparison expressions.
+* Expressions:: Expressions are the basic building
+ blocks of statements.
+* Statements:: The various control statements are
+ described in detail.
+* Arrays:: The description and use of arrays.
+ Also includes array-oriented control
+ statements.
+* Built-in:: The built-in functions are summarized here.
+* User-defined:: User-defined functions are described in detail.
+* Built-in Variables:: Built-in Variables
+* Command Line:: How to run `gawk'.
+* Language History:: The evolution of the `awk' language.
+* Installation:: Installing `gawk' under
+ various operating systems.
+* Gawk Summary:: `gawk' Options and Language Summary.
+* Sample Program:: A sample `awk' program with a
+ complete explanation.
+* Bugs:: Reporting Problems and Bugs.
+* Notes:: Something about the
+ implementation of `gawk'.
+* Glossary:: An explanation of some unfamiliar terms.
+* Index::
+
+
+File: gawk.info, Node: Preface, Next: Copying, Prev: Top, Up: Top
+
+Preface
+*******
+
+ If you are like many computer users, you would frequently like to
+make changes in various text files wherever certain patterns appear, or
+extract data from parts of certain lines while discarding the rest. To
+write a program to do this in a language such as C or Pascal is a
+time-consuming inconvenience that may take many lines of code. The job
+may be easier with `awk'.
+
+ The `awk' utility interprets a special-purpose programming language
+that makes it possible to handle simple data-reformatting jobs easily
+with just a few lines of code.
+
+ The GNU implementation of `awk' is called `gawk'; it is fully upward
+compatible with the System V Release 4 version of `awk'. `gawk' is
+also upward compatible with the POSIX (draft) specification of the
+`awk' language. This means that all properly written `awk' programs
+should work with `gawk'. Thus, we usually don't distinguish between
+`gawk' and other `awk' implementations in this manual.
+
+ This manual teaches you what `awk' does and how you can use `awk'
+effectively. You should already be familiar with basic system commands
+such as `ls'. Using `awk' you can:
+
+ * manage small, personal databases
+
+ * generate reports
+
+ * validate data
+
+ * produce indexes, and perform other document preparation tasks
+
+ * even experiment with algorithms that can be adapted later to other
+ computer languages
+
+* Menu:
+
+* History:: The history of `gawk' and
+ `awk'. Acknowledgements.
+
+
+File: gawk.info, Node: History, Prev: Preface, Up: Preface
+
+History of `awk' and `gawk'
+===========================
+
+ The name `awk' comes from the initials of its designers: Alfred V.
+Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version
+of `awk' was written in 1977. In 1985 a new version made the
+programming language more powerful, introducing user-defined functions,
+multiple input streams, and computed regular expressions. This new
+version became generally available with System V Release 3.1. The
+version in System V Release 4 added some new features and also cleaned
+up the behavior in some of the "dark corners" of the language. The
+specification for `awk' in the POSIX Command Language and Utilities
+standard further clarified the language based on feedback from both the
+`gawk' designers and the original `awk' designers.
+
+ The GNU implementation, `gawk', was written in 1986 by Paul Rubin
+and Jay Fenlason, with advice from Richard Stallman. John Woods
+contributed parts of the code as well. In 1988 and 1989, David
+Trueman, with help from Arnold Robbins, thoroughly reworked `gawk' for
+compatibility with the newer `awk'. Current development (1992) focuses
+on bug fixes, performance improvements, and standards compliance.
+
+ We need to thank many people for their assistance in producing this
+manual. Jay Fenlason contributed many ideas and sample programs.
+Richard Mlynarik and Robert J. Chassell gave helpful comments on early
+drafts of this manual. The paper `A Supplemental Document for `awk'',
+by John W. Pierce of the Chemistry Department at UC San Diego,
+pinpointed several issues relevant both to `awk' implementation and to
+this manual that would otherwise have escaped us. David Trueman, Pat
+Rankin, and Michal Jaegermann also contributed sections of the manual.
+
+ The following people provided many helpful comments on this edition
+of the manual: Rick Adams, Michael Brennan, Rich Burridge, Diane Close,
+Christopher ("Topher") Eliot, Michael Lijewski, Pat Rankin, Miriam
+Robbins, and Michal Jaegermann. Robert J. Chassell provided much
+valuable advice on the use of Texinfo.
+
+ Finally, we would like to thank Brian Kernighan of Bell Labs for
+invaluable assistance during the testing and debugging of `gawk', and
+for help in clarifying numerous points about the language.
+
+
+File: gawk.info, Node: Copying, Next: This Manual, Prev: Preface, Up: Top
+
+GNU GENERAL PUBLIC LICENSE
+**************************
+
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ 675 Mass Ave, Cambridge, MA 02139, USA
+
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+Preamble
+========
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it in
+new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software,
+and (2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 1. This License applies to any program or other work which contains a
+ notice placed by the copyright holder saying it may be distributed
+ under the terms of this General Public License. The "Program",
+ below, refers to any such program or work, and a "work based on
+ the Program" means either the Program or any derivative work under
+ copyright law: that is to say, a work containing the Program or a
+ portion of it, either verbatim or with modifications and/or
+ translated into another language. (Hereinafter, translation is
+ included without limitation in the term "modification".) Each
+ licensee is addressed as "you".
+
+ Activities other than copying, distribution and modification are
+ not covered by this License; they are outside its scope. The act
+ of running the Program is not restricted, and the output from the
+ Program is covered only if its contents constitute a work based on
+ the Program (independent of having been made by running the
+ Program). Whether that is true depends on what the Program does.
+
+ 2. You may copy and distribute verbatim copies of the Program's
+ source code as you receive it, in any medium, provided that you
+ conspicuously and appropriately publish on each copy an appropriate
+ copyright notice and disclaimer of warranty; keep intact all the
+ notices that refer to this License and to the absence of any
+ warranty; and give any other recipients of the Program a copy of
+ this License along with the Program.
+
+ You may charge a fee for the physical act of transferring a copy,
+ and you may at your option offer warranty protection in exchange
+ for a fee.
+
+ 3. You may modify your copy or copies of the Program or any portion
+ of it, thus forming a work based on the Program, and copy and
+ distribute such modifications or work under the terms of Section 1
+ above, provided that you also meet all of these conditions:
+
+ a. You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b. You must cause any work that you distribute or publish, that
+ in whole or in part contains or is derived from the Program
+ or any part thereof, to be licensed as a whole at no charge
+ to all third parties under the terms of this License.
+
+ c. If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display
+ an announcement including an appropriate copyright notice and
+ a notice that there is no warranty (or else, saying that you
+ provide a warranty) and that users may redistribute the
+ program under these conditions, and telling the user how to
+ view a copy of this License. (Exception: if the Program
+ itself is interactive but does not normally print such an
+ announcement, your work based on the Program is not required
+ to print an announcement.)
+
+ These requirements apply to the modified work as a whole. If
+ identifiable sections of that work are not derived from the
+ Program, and can be reasonably considered independent and separate
+ works in themselves, then this License, and its terms, do not
+ apply to those sections when you distribute them as separate
+ works. But when you distribute the same sections as part of a
+ whole which is a work based on the Program, the distribution of
+ the whole must be on the terms of this License, whose permissions
+ for other licensees extend to the entire whole, and thus to each
+ and every part regardless of who wrote it.
+
+ Thus, it is not the intent of this section to claim rights or
+ contest your rights to work written entirely by you; rather, the
+ intent is to exercise the right to control the distribution of
+ derivative or collective works based on the Program.
+
+ In addition, mere aggregation of another work not based on the
+ Program with the Program (or with a work based on the Program) on
+ a volume of a storage or distribution medium does not bring the
+ other work under the scope of this License.
+
+ 4. You may copy and distribute the Program (or a work based on it,
+ under Section 2) in object code or executable form under the terms
+ of Sections 1 and 2 above provided that you also do one of the
+ following:
+
+ a. Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of
+ Sections 1 and 2 above on a medium customarily used for
+ software interchange; or,
+
+ b. Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a
+ medium customarily used for software interchange; or,
+
+ c. Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with
+ such an offer, in accord with Subsection b above.)
+
+ The source code for a work means the preferred form of the work for
+ making modifications to it. For an executable work, complete
+ source code means all the source code for all modules it contains,
+ plus any associated interface definition files, plus the scripts
+ used to control compilation and installation of the executable.
+ However, as a special exception, the source code distributed need
+ not include anything that is normally distributed (in either
+ source or binary form) with the major components (compiler,
+ kernel, and so on) of the operating system on which the executable
+ runs, unless that component itself accompanies the executable.
+
+ If distribution of executable or object code is made by offering
+ access to copy from a designated place, then offering equivalent
+ access to copy the source code from the same place counts as
+ distribution of the source code, even though third parties are not
+ compelled to copy the source along with the object code.
+
+ 5. You may not copy, modify, sublicense, or distribute the Program
+ except as expressly provided under this License. Any attempt
+ otherwise to copy, modify, sublicense or distribute the Program is
+ void, and will automatically terminate your rights under this
+ License. However, parties who have received copies, or rights,
+ from you under this License will not have their licenses
+ terminated so long as such parties remain in full compliance.
+
+ 6. You are not required to accept this License, since you have not
+ signed it. However, nothing else grants you permission to modify
+ or distribute the Program or its derivative works. These actions
+ are prohibited by law if you do not accept this License.
+ Therefore, by modifying or distributing the Program (or any work
+ based on the Program), you indicate your acceptance of this
+ License to do so, and all its terms and conditions for copying,
+ distributing or modifying the Program or works based on it.
+
+ 7. Each time you redistribute the Program (or any work based on the
+ Program), the recipient automatically receives a license from the
+ original licensor to copy, distribute or modify the Program
+ subject to these terms and conditions. You may not impose any
+ further restrictions on the recipients' exercise of the rights
+ granted herein. You are not responsible for enforcing compliance
+ by third parties to this License.
+
+ 8. If, as a consequence of a court judgment or allegation of patent
+ infringement or for any other reason (not limited to patent
+ issues), conditions are imposed on you (whether by court order,
+ agreement or otherwise) that contradict the conditions of this
+ License, they do not excuse you from the conditions of this
+ License. If you cannot distribute so as to satisfy simultaneously
+ your obligations under this License and any other pertinent
+ obligations, then as a consequence you may not distribute the
+ Program at all. For example, if a patent license would not permit
+ royalty-free redistribution of the Program by all those who
+ receive copies directly or indirectly through you, then the only
+ way you could satisfy both it and this License would be to refrain
+ entirely from distribution of the Program.
+
+ If any portion of this section is held invalid or unenforceable
+ under any particular circumstance, the balance of the section is
+ intended to apply and the section as a whole is intended to apply
+ in other circumstances.
+
+ It is not the purpose of this section to induce you to infringe any
+ patents or other property right claims or to contest validity of
+ any such claims; this section has the sole purpose of protecting
+ the integrity of the free software distribution system, which is
+ implemented by public license practices. Many people have made
+ generous contributions to the wide range of software distributed
+ through that system in reliance on consistent application of that
+ system; it is up to the author/donor to decide if he or she is
+ willing to distribute software through any other system and a
+ licensee cannot impose that choice.
+
+ This section is intended to make thoroughly clear what is believed
+ to be a consequence of the rest of this License.
+
+ 9. If the distribution and/or use of the Program is restricted in
+ certain countries either by patents or by copyrighted interfaces,
+ the original copyright holder who places the Program under this
+ License may add an explicit geographical distribution limitation
+ excluding those countries, so that distribution is permitted only
+ in or among countries not thus excluded. In such case, this
+ License incorporates the limitation as if written in the body of
+ this License.
+
+ 10. The Free Software Foundation may publish revised and/or new
+ versions of the General Public License from time to time. Such
+ new versions will be similar in spirit to the present version, but
+ may differ in detail to address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+ Program specifies a version number of this License which applies
+ to it and "any later version", you have the option of following
+ the terms and conditions either of that version or of any later
+ version published by the Free Software Foundation. If the Program
+ does not specify a version number of this License, you may choose
+ any version ever published by the Free Software Foundation.
+
+ 11. If you wish to incorporate parts of the Program into other free
+ programs whose distribution conditions are different, write to the
+ author to ask for permission. For software which is copyrighted
+ by the Free Software Foundation, write to the Free Software
+ Foundation; we sometimes make exceptions for this. Our decision
+ will be guided by the two goals of preserving the free status of
+ all derivatives of our free software and of promoting the sharing
+ and reuse of software generally.
+
+ NO WARRANTY
+
+ 12. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO
+ WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE
+ LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+ HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT
+ WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT
+ NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
+ FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE
+ QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+ PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
+ SERVICING, REPAIR OR CORRECTION.
+
+ 13. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
+ WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY
+ MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE
+ LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL,
+ INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR
+ INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+ DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU
+ OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY
+ OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
+ ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+How to Apply These Terms to Your New Programs
+=============================================
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these
+terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
+ Copyright (C) 19YY NAME OF AUTHOR
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+ Also add information on how to contact you by electronic and paper
+mail.
+
+ If the program is interactive, make it output a short notice like
+this when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) 19YY NAME OF AUTHOR
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
+ type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+ The hypothetical commands `show w' and `show c' should show the
+appropriate parts of the General Public License. Of course, the
+commands you use may be called something other than `show w' and `show
+c'; they could even be mouse-clicks or menu items--whatever suits your
+program.
+
+ You should also get your employer (if you work as a programmer) or
+your school, if any, to sign a "copyright disclaimer" for the program,
+if necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ SIGNATURE OF TY COON, 1 April 1989
+ Ty Coon, President of Vice
+
+ This General Public License does not permit incorporating your
+program into proprietary programs. If your program is a subroutine
+library, you may consider it more useful to permit linking proprietary
+applications with the library. If this is what you want to do, use the
+GNU Library General Public License instead of this License.
+
+
+File: gawk.info, Node: This Manual, Next: Getting Started, Prev: Copying, Up: Top
+
+Using this Manual
+*****************
+
+ The term `awk' refers to a particular program, and to the language
+you use to tell this program what to do. When we need to be careful,
+we call the program "the `awk' utility" and the language "the `awk'
+language." The term `gawk' refers to a version of `awk' developed as
+part of the GNU project. The purpose of this manual is to explain both the
+`awk' language and how to run the `awk' utility.
+
+ While concentrating on the features of `gawk', the manual will also
+attempt to describe important differences between `gawk' and other
+`awk' implementations. In particular, any features that are not in the
+POSIX standard for `awk' will be noted.
+
+ The term "`awk' program" refers to a program written by you in the
+`awk' programming language.
+
+ *Note Getting Started with `awk': Getting Started, for the bare
+essentials you need to know to start using `awk'.
+
+ Some useful "one-liners" are included to give you a feel for the
+`awk' language (*note Useful "One-liners": One-liners.).
+
+ A sample `awk' program has been provided for you (*note Sample
+Program::.).
+
+ If you find terms that you aren't familiar with, try looking them up
+in the glossary (*note Glossary::.).
+
+ The entire `awk' language is summarized for quick reference in *Note
+`gawk' Summary: Gawk Summary. Look there if you just need to refresh
+your memory about a particular feature.
+
+ Most of the time, complete `awk' programs are used as examples, but in
+some of the more advanced sections, only the part of the `awk' program
+that illustrates the concept being described is shown.
+
+* Menu:
+
+* Sample Data Files:: Sample data files for use in the `awk'
+ programs illustrated in this manual.
+
+
+File: gawk.info, Node: Sample Data Files, Prev: This Manual, Up: This Manual
+
+Data Files for the Examples
+===========================
+
+ Many of the examples in this manual take their input from two sample
+data files. The first, called `BBS-list', represents a list of
+computer bulletin board systems together with information about those
+systems. The second data file, called `inventory-shipped', contains
+information about shipments on a monthly basis. Each line of these
+files is one "record".
+
+ In the file `BBS-list', each record contains the name of a computer
+bulletin board, its phone number, the board's baud rate, and a code for
+the number of hours it is operational. An `A' in the last column means
+the board operates 24 hours a day. A `B' in the last column means the
+board operates only during evening and weekend hours. A `C' means the
+board operates only on weekends.
+
+     aardvark     555-5553     1200/300          B
+     alpo-net     555-3412     2400/1200/300     A
+     barfly       555-7685     1200/300          A
+     bites        555-1675     2400/1200/300     A
+     camelot      555-0542     300               C
+     core         555-2912     1200/300          C
+     fooey        555-1234     2400/1200/300     B
+     foot         555-6699     1200/300          B
+     macfoo       555-6480     1200/300          A
+     sdace        555-3430     2400/1200/300     A
+     sabafoo      555-2127     1200/300          C
+
+ The second data file, called `inventory-shipped', represents
+information about shipments during the year. Each record contains the
+month of the year, the number of green crates shipped, the number of
+red boxes shipped, the number of orange bags shipped, and the number of
+blue packages shipped, respectively. There are 16 entries, covering
+the 12 months of one year and 4 months of the next year.
+
+     Jan  13  25  15 115
+     Feb  15  32  24 226
+     Mar  15  24  34 228
+     Apr  31  52  63 420
+     May  16  34  29 208
+     Jun  31  42  75 492
+     Jul  24  34  67 436
+     Aug  15  34  47 316
+     Sep  13  55  37 277
+     Oct  29  54  68 525
+     Nov  20  87  82 577
+     Dec  17  35  61 401
+
+     Jan  21  36  64 620
+     Feb  26  58  80 652
+     Mar  24  75  70 495
+     Apr  21  70  74 514
+
+ If you are reading this in GNU Emacs using Info, you can copy the
+regions of text showing these sample files into your own test files.
+This way you can try out the examples shown in the remainder of this
+document. You do this by using the command `M-x write-region' to copy
+text from the Info file into a file for use with `awk' (*Note Misc File
+Ops: (emacs)Misc File Ops, for more information). Using this
+information, create your own `BBS-list' and `inventory-shipped' files,
+and practice what you learn in this manual.
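+
+ If you are not reading this in Emacs, a shell here-document is
+another way to create the two files. The following is only a sketch
+and not part of the original manual: it assumes a POSIX shell, the
+directory name `awk-samples' is an arbitrary choice, and the column
+spacing is our own (any run of blanks between the columns will do).
+

```shell
# Sketch only: assumes a POSIX shell. The directory name is our own choice.
mkdir -p awk-samples && cd awk-samples || exit 1

# First sample file: one bulletin board per record.
cat > BBS-list <<'EOF'
aardvark     555-5553     1200/300          B
alpo-net     555-3412     2400/1200/300     A
barfly       555-7685     1200/300          A
bites        555-1675     2400/1200/300     A
camelot      555-0542     300               C
core         555-2912     1200/300          C
fooey        555-1234     2400/1200/300     B
foot         555-6699     1200/300          B
macfoo       555-6480     1200/300          A
sdace        555-3430     2400/1200/300     A
sabafoo      555-2127     1200/300          C
EOF

# Second sample file: monthly shipments; the blank line separates the years.
cat > inventory-shipped <<'EOF'
Jan  13  25  15 115
Feb  15  32  24 226
Mar  15  24  34 228
Apr  31  52  63 420
May  16  34  29 208
Jun  31  42  75 492
Jul  24  34  67 436
Aug  15  34  47 316
Sep  13  55  37 277
Oct  29  54  68 525
Nov  20  87  82 577
Dec  17  35  61 401

Jan  21  36  64 620
Feb  26  58  80 652
Mar  24  75  70 495
Apr  21  70  74 514
EOF

# Sanity check: count the lines written to each file.
bbs_lines=$(wc -l < BBS-list | tr -d ' ')
inv_lines=$(wc -l < inventory-shipped | tr -d ' ')
echo "BBS-list: $bbs_lines lines, inventory-shipped: $inv_lines lines"
```

+
+ The check should report 11 lines for `BBS-list' and 17 lines for
+`inventory-shipped' (16 entries plus the blank separator line).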
+
+
+File: gawk.info, Node: Getting Started, Next: Reading Files, Prev: This Manual, Up: Top
+
+Getting Started with `awk'
+**************************
+
+ The basic function of `awk' is to search files for lines (or other
+units of text) that contain certain patterns. When a line matches one
+of the patterns, `awk' performs specified actions on that line. `awk'
+keeps processing input lines in this way until the end of the input
+file is reached.
+
+ When you run `awk', you specify an `awk' "program" which tells `awk'
+what to do. The program consists of a series of "rules". (It may also
+contain "function definitions", but that is an advanced feature, so we
+will ignore it for now. *Note User-defined Functions: User-defined.)
+Each rule specifies one pattern to search for, and one action to
+perform when that pattern is found.
+
+ Syntactically, a rule consists of a pattern followed by an action.
+The action is enclosed in curly braces to separate it from the pattern.
+Rules are usually separated by newlines. Therefore, an `awk' program
+looks like this:
+
+     PATTERN { ACTION }
+     PATTERN { ACTION }
+     ...
+
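+ As an illustration of this layout, here is a small sketch with two
+rules, one per line. It is not taken from the original manual: the
+labels are invented, the data is only an excerpt of
+`inventory-shipped', and the temporary directory comes from `mktemp',
+which we assume is available on your system.
+

```shell
# Sketch only: work in a scratch directory (assumes mktemp is available).
dir=$(mktemp -d) && cd "$dir" || exit 1

# Excerpt of the inventory-shipped sample file.
cat > inventory-shipped <<'EOF'
Jan  13  25  15 115
Feb  15  32  24 226
Nov  20  87  82 577
Dec  17  35  61 401
Jan  21  36  64 620
EOF

# Two rules, each a pattern followed by an action in curly braces,
# separated by a newline.
out=$(awk '
/Jan/ { print "start of year:", $0 }
/Dec/ { print "end of year:", $0 }
' inventory-shipped)

printf '%s\n' "$out"
```

+
+ Every input line is tested against both patterns; only the `Jan' and
+`Dec' records match one of them, so only those three records produce
+output.
+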
+* Menu:
+
+* Very Simple:: A very simple example.
+* Two Rules:: A less simple one-line example with two rules.
+* More Complex:: A more complex example.
+* Running gawk:: How to run `gawk' programs;
+ includes command line syntax.
+* Comments:: Adding documentation to `gawk' programs.
+* Statements/Lines:: Subdividing or combining statements into lines.
+* When:: When to use `gawk' and
+ when to use other things.
+
+
+File: gawk.info, Node: Very Simple, Next: Two Rules, Prev: Getting Started, Up: Getting Started
+
+A Very Simple Example
+=====================
+
+ The following command runs a simple `awk' program that searches the
+input file `BBS-list' for the string of characters `foo'. (A string
+of characters is usually called a "string". The term "string" is
+perhaps based on similar usage in English, such as "a string of
+pearls" or "a string of cars in a train.")
+
+ awk '/foo/ { print $0 }' BBS-list
+
+When lines containing `foo' are found, they are printed, because
+`print $0' means print the current line. (Just `print' by itself means
+the same thing, so we could have written that instead.)
+
+ You will notice that slashes, `/', surround the string `foo' in the
+actual `awk' program. The slashes indicate that `foo' is a pattern to
+search for. This type of pattern is called a "regular expression", and
+is covered in more detail later (*note Regular Expressions as Patterns:
+Regexp.). There are single-quotes around the `awk' program so that the
+shell won't interpret any of it as special shell characters.
+
+ Here is what this program prints:
+
+ fooey 555-1234 2400/1200/300 B
+ foot 555-6699 1200/300 B
+ macfoo 555-6480 1200/300 A
+ sabafoo 555-2127 1200/300 C
+
+ In an `awk' rule, either the pattern or the action can be omitted,
+but not both. If the pattern is omitted, then the action is performed
+for *every* input line. If the action is omitted, the default action
+is to print all lines that match the pattern.
+
+ Thus, we could leave out the action (the `print' statement and the
+curly braces) in the above example, and the result would be the same:
+all lines matching the pattern `foo' would be printed. By comparison,
+omitting the `print' statement but retaining the curly braces makes an
+empty action that does nothing; then no lines would be printed.
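+
+ As a minimal sketch, the four forms of a rule described above can be
+compared side by side. The three data lines are stand-ins for part of
+`BBS-list', and the shell variable names are made up for illustration.
+
```shell
# Hypothetical sample data standing in for a few lines of BBS-list.
data='fooey 555-1234 2400/1200/300 B
barfly 555-7685 1200/300 A
macfoo 555-6480 1200/300 A'

# Pattern and action: print the lines that contain foo.
with_action=$(printf '%s\n' "$data" | awk '/foo/ { print $0 }')

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$data" | awk '/foo/')

# Action only: the action runs for every input line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

# Empty action: lines still match, but nothing is printed.
empty_action=$(printf '%s\n' "$data" | awk '/foo/ { }')

printf '%s\n' "$pattern_only"
```
+
+ The first two commands produce the same two lines, the third prints
+the first field of all three lines, and the last prints nothing.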
+
+
+File: gawk.info, Node: Two Rules, Next: More Complex, Prev: Very Simple, Up: Getting Started
+
+An Example with Two Rules
+=========================
+
+ The `awk' utility reads the input files one line at a time. For
+each line, `awk' tries the patterns of each of the rules. If several
+patterns match, then several actions are run, in the order in which they
+appear in the `awk' program. If no patterns match, then no actions are
+run.
+
+ After processing all the rules (perhaps none) that match the line,
+`awk' reads the next line (however, *note The `next' Statement: Next
+Statement.). This continues until the end of the file is reached.
+
+ For example, the `awk' program:
+
+ /12/ { print $0 }
+ /21/ { print $0 }
+
+contains two rules. The first rule has the string `12' as the pattern
+and `print $0' as the action. The second rule has the string `21' as
+the pattern and also has `print $0' as the action. Each rule's action
+is enclosed in its own pair of braces.
+
+ This `awk' program prints every line that contains the string `12'
+*or* the string `21'. If a line contains both strings, it is printed
+twice, once by each rule.
+
+ If we run this program on our two sample data files, `BBS-list' and
+`inventory-shipped', as shown here:
+
+ awk '/12/ { print $0 }
+ /21/ { print $0 }' BBS-list inventory-shipped
+
+we get the following output:
+
+ aardvark 555-5553 1200/300 B
+ alpo-net 555-3412 2400/1200/300 A
+ barfly 555-7685 1200/300 A
+ bites 555-1675 2400/1200/300 A
+ core 555-2912 1200/300 C
+ fooey 555-1234 2400/1200/300 B
+ foot 555-6699 1200/300 B
+ macfoo 555-6480 1200/300 A
+ sdace 555-3430 2400/1200/300 A
+ sabafoo 555-2127 1200/300 C
+ sabafoo 555-2127 1200/300 C
+ Jan 21 36 64 620
+ Apr 21 70 74 514
+
+Note how the line in `BBS-list' beginning with `sabafoo' was printed
+twice, once for each rule.
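+
+ To see which rule printed which line, a small variation can label
+the output. This is only a sketch: the `rule 1:' and `rule 2:' labels
+are additions for illustration, and the three input lines are copied
+from the sample output above rather than read from the data files.
+
```shell
# Three input lines: the first matches /12/ only, the second matches
# /21/ only, and the third matches both patterns.
out=$(printf '%s\n' 'core 555-2912 1200/300 C' \
                    'Jan 21 36 64 620' \
                    'sabafoo 555-2127 1200/300 C' |
      awk '/12/ { print "rule 1: " $0 }
           /21/ { print "rule 2: " $0 }')

printf '%s\n' "$out"
```
+
+ The `sabafoo' line appears twice in the output, once with each label,
+because both rules are tried for every input line.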
+
+
+File: gawk.info, Node: More Complex, Next: Running gawk, Prev: Two Rules, Up: Getting Started
+
+A More Complex Example
+======================
+
+ Here is an example to give you an idea of what typical `awk'
+programs do. This example shows how `awk' can be used to summarize,
+select, and rearrange the output of another utility. It uses features
+that haven't been covered yet, so don't worry if you don't understand
+all the details.
+
+ ls -l | awk '$5 == "Nov" { sum += $4 }
+ END { print sum }'
+
+ This command prints the total number of bytes in all the files in the
+current directory that were last modified in November (of any year).
+(In the C shell you would need to type a semicolon and then a backslash
+at the end of the first line; in a POSIX-compliant shell, such as the
+Bourne shell or the Bourne-Again shell, you can type the example as
+shown.)
+
+ The `ls -l' part of this example is a command that gives you a
+listing of the files in a directory, including file size and date. Its
+output looks like this:
+
+ -rw-r--r-- 1 close 1933 Nov 7 13:05 Makefile
+ -rw-r--r-- 1 close 10809 Nov 7 13:03 gawk.h
+ -rw-r--r-- 1 close 983 Apr 13 12:14 gawk.tab.h
+ -rw-r--r-- 1 close 31869 Jun 15 12:20 gawk.y
+ -rw-r--r-- 1 close 22414 Nov 7 13:03 gawk1.c
+ -rw-r--r-- 1 close 37455 Nov 7 13:03 gawk2.c
+ -rw-r--r-- 1 close 27511 Dec 9 13:07 gawk3.c
+ -rw-r--r-- 1 close 7989 Nov 7 13:03 gawk4.c
+
+The first field contains read-write permissions, the second field
+contains the number of links to the file, and the third field
+identifies the owner of the file. The fourth field contains the size
+of the file in bytes. The fifth, sixth, and seventh fields contain the
+month, day, and time, respectively, that the file was last modified.
+Finally, the eighth field contains the name of the file.
+
+ The `$5 == "Nov"' in our `awk' program is an expression that tests
+whether the fifth field of the output from `ls -l' matches the string
+`Nov'. Each time a line has the string `Nov' in its fifth field, the
+action `{ sum += $4 }' is performed. This adds the fourth field (the
+file size) to the variable `sum'. As a result, when `awk' has finished
+reading all the input lines, `sum' is the sum of the sizes of files
+whose lines matched the pattern. (This works because `awk' variables
+are automatically initialized to zero.)
+
+ After the last line of output from `ls' has been processed, the
+`END' rule is executed, and the value of `sum' is printed. In this
+example, the value of `sum' would be 80600.
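+
+ The arithmetic can be checked without depending on a real directory.
+The sketch below feeds the listing shown above to the same `awk'
+program; the shell variable names are made up for illustration. Note
+that many current versions of `ls -l' also print a group column, which
+would shift the size and the month to the fifth and sixth fields, so
+the field numbers here are tied to the listing format shown above.
+
```shell
# The sample listing from the text, used in place of live ls -l output.
listing='-rw-r--r-- 1 close 1933 Nov 7 13:05 Makefile
-rw-r--r-- 1 close 10809 Nov 7 13:03 gawk.h
-rw-r--r-- 1 close 983 Apr 13 12:14 gawk.tab.h
-rw-r--r-- 1 close 31869 Jun 15 12:20 gawk.y
-rw-r--r-- 1 close 22414 Nov 7 13:03 gawk1.c
-rw-r--r-- 1 close 37455 Nov 7 13:03 gawk2.c
-rw-r--r-- 1 close 27511 Dec 9 13:07 gawk3.c
-rw-r--r-- 1 close 7989 Nov 7 13:03 gawk4.c'

# Add up field 4 (the size) on lines whose field 5 is Nov, then print
# the total once all input has been read.
total=$(printf '%s\n' "$listing" |
        awk '$5 == "Nov" { sum += $4 }
             END { print sum }')

printf '%s\n' "$total"
```
+
+ Five of the eight lines have `Nov' in the fifth field, and their
+sizes add up to 80600.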
+
+ These more advanced `awk' techniques are covered in later sections
+(*note Overview of Actions: Actions.). Before you can move on to more
+advanced `awk' programming, you have to know how `awk' interprets your
+input and displays your output. By manipulating fields and using
+`print' statements, you can produce some very useful and
+spectacular-looking reports.
+
+
+File: gawk.info, Node: Running gawk, Next: Comments, Prev: More Complex, Up: Getting Started
+
+How to Run `awk' Programs
+=========================
+
+ There are several ways to run an `awk' program. If the program is
+short, it is easiest to include it in the command that runs `awk', like
+this:
+
+ awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ...
+
+where PROGRAM consists of a series of patterns and actions, as
+described earlier.
+
+ When the program is long, it is usually more convenient to put it in
+a file and run it with a command like this:
+
+ awk -f PROGRAM-FILE INPUT-FILE1 INPUT-FILE2 ...
+
+* Menu:
+
+* One-shot:: Running a short throw-away `awk' program.
+* Read Terminal:: Using no input files (input from
+ terminal instead).
+* Long:: Putting permanent `awk' programs in files.
+* Executable Scripts:: Making self-contained `awk' programs.
+
+
+File: gawk.info, Node: One-shot, Next: Read Terminal, Prev: Running gawk, Up: Running gawk
+
+One-shot Throw-away `awk' Programs
+----------------------------------
+
+ Once you are familiar with `awk', you will often type simple
+programs at the moment you want to use them. Then you can write the
+program as the first argument of the `awk' command, like this:
+
+ awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ...
+
+where PROGRAM consists of a series of PATTERNS and ACTIONS, as
+described earlier.
+
+ This command format instructs the shell to start `awk' and use the
+PROGRAM to process records in the input file(s). There are single
+quotes around PROGRAM so that the shell doesn't interpret any `awk'
+characters as special shell characters. They also cause the shell to
+treat all of PROGRAM as a single argument for `awk' and allow PROGRAM
+to be more than one line long.
+
+ This format is also useful for running short or medium-sized `awk'
+programs from shell scripts, because it avoids the need for a separate
+file for the `awk' program. A self-contained shell script is more
+reliable since there are no other files to misplace.
+
+
+File: gawk.info, Node: Read Terminal, Next: Long, Prev: One-shot, Up: Running gawk
+
+Running `awk' without Input Files
+---------------------------------
+
+ You can also run `awk' without any input files. If you type the
+command line:
+
+ awk 'PROGRAM'
+
+then `awk' applies the PROGRAM to the "standard input", which usually
+means whatever you type on the terminal. This continues until you
+indicate end-of-file by typing `Control-d'.
+
+ For example, if you execute this command:
+
+ awk '/th/'
+
+whatever you type next is taken as data for that `awk' program. If you
+go on to type the following data:
+
+ Kathy
+ Ben
+ Tom
+ Beth
+ Seth
+ Karen
+ Thomas
+ `Control-d'
+
+then `awk' prints this output:
+
+ Kathy
+ Beth
+ Seth
+
+as matching the pattern `th'. Notice that it did not recognize
+`Thomas' as matching the pattern. The `awk' language is "case
+sensitive", and matches patterns exactly. (However, you can override
+this with the variable `IGNORECASE'. *Note Case-sensitivity in
+Matching: Case-sensitivity.)
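+
+ The same session can be reproduced without typing at the terminal by
+piping the names into `awk'. The second command in this sketch is not
+taken from this section: it assumes the standard `tolower' function,
+which is one portable way to ignore case when `IGNORECASE' is not
+available. The shell variable names are made up for illustration.
+
```shell
names='Kathy
Ben
Tom
Beth
Seth
Karen
Thomas'

# Case-sensitive match, as in the text: Thomas is not selected.
exact=$(printf '%s\n' "$names" | awk '/th/')

# Fold each line to lower case before matching, so Thomas is selected.
folded=$(printf '%s\n' "$names" | awk 'tolower($0) ~ /th/')

printf '%s\n' "$exact"
printf '%s\n' "$folded"
```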
+
+
+File: gawk.info, Node: Long, Next: Executable Scripts, Prev: Read Terminal, Up: Running gawk
+
+Running Long Programs
+---------------------
+
+ Sometimes your `awk' programs can be very long. In this case it is
+more convenient to put the program into a separate file. To tell `awk'
+to use that file for its program, you type:
+
+ awk -f SOURCE-FILE INPUT-FILE1 INPUT-FILE2 ...
+
+ The `-f' instructs the `awk' utility to get the `awk' program from
+the file SOURCE-FILE. Any file name can be used for SOURCE-FILE. For
+example, you could put the program:
+
+ /th/
+
+into the file `th-prog'. Then this command:
+
+ awk -f th-prog
+
+does the same thing as this one:
+
+ awk '/th/'
+
+which was explained earlier (*note Running `awk' without Input Files:
+Read Terminal.). Note that you don't usually need single quotes around
+the file name that you specify with `-f', because most file names don't
+contain any of the shell's special characters. Notice that in
+`th-prog', the `awk' program did not have single quotes around it. The
+quotes are only needed for programs that are provided on the `awk'
+command line.
+
+ If you want to identify your `awk' program files clearly as such,
+you can add the extension `.awk' to the file name. This doesn't affect
+the execution of the `awk' program, but it does make "housekeeping"
+easier.
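+
+ As a rough sketch, the equivalence between the two forms can be
+checked from a shell. The temporary file created by `mktemp' stands in
+for `th-prog', and the three input names are arbitrary.
+
```shell
# Write the one-rule program to a temporary file; no single quotes are
# stored in the file itself.
prog=$(mktemp)
printf '%s\n' '/th/' > "$prog"

# Run the same input through the program file and the inline program.
from_file=$(printf '%s\n' Kathy Ben Beth | awk -f "$prog")
inline=$(printf '%s\n' Kathy Ben Beth | awk '/th/')

rm -f "$prog"
printf '%s\n' "$from_file"
```
+
+ Both commands print the same two lines, `Kathy' and `Beth'.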
+
+
+File: gawk.info, Node: Executable Scripts, Prev: Long, Up: Running gawk
+
+Executable `awk' Programs
+-------------------------
+
+ Once you have learned `awk', you may want to write self-contained
+`awk' scripts, using the `#!' script mechanism. You can do this on
+many Unix systems (1) (and someday on GNU).
+
+ For example, you could create a text file named `hello', containing
+the following (where `BEGIN' is a feature we have not yet discussed):
+
+ #! /bin/awk -f
+
+ # a sample awk program
+ BEGIN { print "hello, world" }
+
+After making this file executable (with the `chmod' command), you can
+simply type:
+
+ hello
+
+at the shell, and the system will arrange to run `awk' (2) as if you
+had typed:
+
+ awk -f hello
+
+Self-contained `awk' scripts are useful when you want to write a
+program which users can invoke without knowing that the program is
+written in `awk'.
+
+ If your system does not support the `#!' mechanism, you can get a
+similar effect using a regular shell script. It would look something
+like this:
+
+ : The colon makes sure this script is executed by the Bourne shell.
+ awk 'PROGRAM' "$@"
+
+ Using this technique, it is *vital* to enclose the PROGRAM in single
+quotes to protect it from interpretation by the shell. If you omit the
+quotes, only a shell wizard can predict the results.
+
+ The `"$@"' causes the shell to forward all the command line
+arguments to the `awk' program, without interpretation. The first
+line, which starts with a colon, is used so that this shell script will
+work even if invoked by a user who uses the C shell.
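+
+ Here is a minimal sketch of such a wrapper with a concrete PROGRAM
+filled in. The file names `hello' and `names', the temporary
+directory, and the `match:' label are all made up for illustration.
+The wrapper is started with `sh' explicitly so that the example does
+not depend on how the calling shell treats a script without `#!'.
+
```shell
dir=$(mktemp -d)

# The wrapper: a colon line, then awk with its program in single quotes
# and "$@" forwarding any file names given to the wrapper.
cat > "$dir/hello" <<'EOF'
: The colon makes sure this script is executed by the Bourne shell.
awk 'BEGIN { print "hello, world" }
     /th/  { print "match: " $0 }' "$@"
EOF

# A small data file to pass through the wrapper.
printf '%s\n' Kathy Ben Beth > "$dir/names"

out=$(sh "$dir/hello" "$dir/names")
rm -rf "$dir"
printf '%s\n' "$out"
```
+
+ Because the here-document delimiter is quoted, the shell copies the
+wrapper text unchanged, so the single quotes and `"$@"' reach the
+wrapper file exactly as written.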
+
+ ---------- Footnotes ----------
+
+ (1) The `#!' mechanism works on Unix systems derived from Berkeley
+Unix, System V Release 4, and some System V Release 3 systems.
+
+ (2) The line beginning with `#!' lists the full pathname of an
+interpreter to be run, and an optional initial command line argument to
+pass to that interpreter. The operating system then runs the
+interpreter with the given argument and the full argument list of the
+executed program. The first argument in the list is the full pathname
+of the `awk' program. The rest of the argument list will either be
+options to `awk', or data files, or both.
+
+
+File: gawk.info, Node: Comments, Next: Statements/Lines, Prev: Running gawk, Up: Getting Started
+
+Comments in `awk' Programs
+==========================
+
+ A "comment" is some text that is included in a program for the sake
+of human readers, and that is not really part of the program. Comments
+can explain what the program does, and how it works. Nearly all
+programming languages have provisions for comments, because programs are
+typically hard to understand without this extra help.
+
+ In the `awk' language, a comment starts with the sharp sign
+character, `#', and continues to the end of the line. The `awk'
+language ignores the rest of a line following a sharp sign. For
+example, we could have put the following into `th-prog':
+
+ # This program finds records containing the pattern `th'. This is how
+ # you continue comments on additional lines.
+ /th/
+
+ You can put comment lines into keyboard-composed throw-away `awk'
+programs also, but this usually isn't very useful; the purpose of a
+comment is to help you or another person understand the program at a
+later time.
+