Diffstat (limited to 'gawk.info-1')
-rw-r--r-- | gawk.info-1 | 1069 |
1 files changed, 1069 insertions, 0 deletions
diff --git a/gawk.info-1 b/gawk.info-1 new file mode 100644 index 00000000..e6f2e750 --- /dev/null +++ b/gawk.info-1 @@ -0,0 +1,1069 @@ +This is Info file gawk.info, produced by Makeinfo-1.54 from the input +file gawk.texi. + + This file documents `awk', a program that you can use to select +particular records in a file and perform operations upon them. + + This is Edition 0.15 of `The GAWK Manual', +for the 2.15 version of the GNU implementation +of AWK. + + Copyright (C) 1989, 1991, 1992, 1993 Free Software Foundation, Inc. + + Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + + Permission is granted to copy and distribute modified versions of +this manual under the conditions for verbatim copying, provided that +the entire resulting derived work is distributed under the terms of a +permission notice identical to this one. + + Permission is granted to copy and distribute translations of this +manual into another language, under the above conditions for modified +versions, except that this permission notice may be stated in a +translation approved by the Foundation. + + +File: gawk.info, Node: Top, Next: Preface, Prev: (dir), Up: (dir) + +General Introduction +******************** + + This file documents `awk', a program that you can use to select +particular records in a file and perform operations upon them. + + This is Edition 0.15 of `The GAWK Manual', +for the 2.15 version of the GNU implementation +of AWK. + +* Menu: + +* Preface:: What you can do with `awk'; brief history + and acknowledgements. +* Copying:: Your right to copy and distribute `gawk'. +* This Manual:: Using this manual. + Includes sample input files that you can use. +* Getting Started:: A basic introduction to using `awk'. + How to run an `awk' program. + Command line syntax. +* Reading Files:: How to read files and manipulate fields. +* Printing:: How to print using `awk'. 
Describes the + `print' and `printf' statements. + Also describes redirection of output. +* One-liners:: Short, sample `awk' programs. +* Patterns:: The various types of patterns + explained in detail. +* Actions:: The various types of actions are + introduced here. Describes + expressions and the various operators in + detail. Also describes comparison expressions. +* Expressions:: Expressions are the basic building + blocks of statements. +* Statements:: The various control statements are + described in detail. +* Arrays:: The description and use of arrays. + Also includes array-oriented control + statements. +* Built-in:: The built-in functions are summarized here. +* User-defined:: User-defined functions are described in detail. +* Built-in Variables:: Built-in Variables +* Command Line:: How to run `gawk'. +* Language History:: The evolution of the `awk' language. +* Installation:: Installing `gawk' under + various operating systems. +* Gawk Summary:: `gawk' Options and Language Summary. +* Sample Program:: A sample `awk' program with a + complete explanation. +* Bugs:: Reporting Problems and Bugs. +* Notes:: Something about the + implementation of `gawk'. +* Glossary:: An explanation of some unfamiliar terms. +* Index:: + + +File: gawk.info, Node: Preface, Next: Copying, Prev: Top, Up: Top + +Preface +******* + + If you are like many computer users, you would frequently like to +make changes in various text files wherever certain patterns appear, or +extract data from parts of certain lines while discarding the rest. To +write a program to do this in a language such as C or Pascal is a +time-consuming inconvenience that may take many lines of code. The job +may be easier with `awk'. + + The `awk' utility interprets a special-purpose programming language +that makes it possible to handle simple data-reformatting jobs easily +with just a few lines of code. 
+ + The GNU implementation of `awk' is called `gawk'; it is fully upward +compatible with the System V Release 4 version of `awk'. `gawk' is +also upward compatible with the POSIX (draft) specification of the +`awk' language. This means that all properly written `awk' programs +should work with `gawk'. Thus, we usually don't distinguish between +`gawk' and other `awk' implementations in this manual. + + This manual teaches you what `awk' does and how you can use `awk' +effectively. You should already be familiar with basic system commands +such as `ls'. Using `awk' you can: + + * manage small, personal databases + + * generate reports + + * validate data + + * produce indexes, and perform other document preparation tasks + + * even experiment with algorithms that can be adapted later to other + computer languages + +* Menu: + +* History:: The history of `gawk' and + `awk'. Acknowledgements. + + +File: gawk.info, Node: History, Prev: Preface, Up: Preface + +History of `awk' and `gawk' +=========================== + + The name `awk' comes from the initials of its designers: Alfred V. +Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version +of `awk' was written in 1977. In 1985 a new version made the +programming language more powerful, introducing user-defined functions, +multiple input streams, and computed regular expressions. This new +version became generally available with System V Release 3.1. The +version in System V Release 4 added some new features and also cleaned +up the behavior in some of the "dark corners" of the language. The +specification for `awk' in the POSIX Command Language and Utilities +standard further clarified the language based on feedback from both the +`gawk' designers, and the original `awk' designers. + + The GNU implementation, `gawk', was written in 1986 by Paul Rubin +and Jay Fenlason, with advice from Richard Stallman. John Woods +contributed parts of the code as well. 
In 1988 and 1989, David +Trueman, with help from Arnold Robbins, thoroughly reworked `gawk' for +compatibility with the newer `awk'. Current development (1992) focuses +on bug fixes, performance improvements, and standards compliance. + + We need to thank many people for their assistance in producing this +manual. Jay Fenlason contributed many ideas and sample programs. +Richard Mlynarik and Robert J. Chassell gave helpful comments on early +drafts of this manual. The paper `A Supplemental Document for `awk'' +by John W. Pierce of the Chemistry Department at UC San Diego, +pinpointed several issues relevant both to `awk' implementation and to +this manual, that would otherwise have escaped us. David Trueman, Pat +Rankin, and Michal Jaegermann also contributed sections of the manual. + + The following people provided many helpful comments on this edition +of the manual: Rick Adams, Michael Brennan, Rich Burridge, Diane Close, +Christopher ("Topher") Eliot, Michael Lijewski, Pat Rankin, Miriam +Robbins, and Michal Jaegermann. Robert J. Chassell provided much +valuable advice on the use of Texinfo. + + Finally, we would like to thank Brian Kernighan of Bell Labs for +invaluable assistance during the testing and debugging of `gawk', and +for help in clarifying numerous points about the language. + + +File: gawk.info, Node: Copying, Next: This Manual, Prev: Preface, Up: Top + +GNU GENERAL PUBLIC LICENSE +************************** + + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 675 Mass Ave, Cambridge, MA 02139, USA + + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +Preamble +======== + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. 
This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it in +new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, +and (2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. 
We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 1. This License applies to any program or other work which contains a + notice placed by the copyright holder saying it may be distributed + under the terms of this General Public License. The "Program", + below, refers to any such program or work, and a "work based on + the Program" means either the Program or any derivative work under + copyright law: that is to say, a work containing the Program or a + portion of it, either verbatim or with modifications and/or + translated into another language. (Hereinafter, translation is + included without limitation in the term "modification".) Each + licensee is addressed as "you". + + Activities other than copying, distribution and modification are + not covered by this License; they are outside its scope. The act + of running the Program is not restricted, and the output from the + Program is covered only if its contents constitute a work based on + the Program (independent of having been made by running the + Program). Whether that is true depends on what the Program does. + + 2. You may copy and distribute verbatim copies of the Program's + source code as you receive it, in any medium, provided that you + conspicuously and appropriately publish on each copy an appropriate + copyright notice and disclaimer of warranty; keep intact all the + notices that refer to this License and to the absence of any + warranty; and give any other recipients of the Program a copy of + this License along with the Program. 
+ + You may charge a fee for the physical act of transferring a copy, + and you may at your option offer warranty protection in exchange + for a fee. + + 3. You may modify your copy or copies of the Program or any portion + of it, thus forming a work based on the Program, and copy and + distribute such modifications or work under the terms of Section 1 + above, provided that you also meet all of these conditions: + + a. You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b. You must cause any work that you distribute or publish, that + in whole or in part contains or is derived from the Program + or any part thereof, to be licensed as a whole at no charge + to all third parties under the terms of this License. + + c. If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display + an announcement including an appropriate copyright notice and + a notice that there is no warranty (or else, saying that you + provide a warranty) and that users may redistribute the + program under these conditions, and telling the user how to + view a copy of this License. (Exception: if the Program + itself is interactive but does not normally print such an + announcement, your work based on the Program is not required + to print an announcement.) + + These requirements apply to the modified work as a whole. If + identifiable sections of that work are not derived from the + Program, and can be reasonably considered independent and separate + works in themselves, then this License, and its terms, do not + apply to those sections when you distribute them as separate + works. 
But when you distribute the same sections as part of a + whole which is a work based on the Program, the distribution of + the whole must be on the terms of this License, whose permissions + for other licensees extend to the entire whole, and thus to each + and every part regardless of who wrote it. + + Thus, it is not the intent of this section to claim rights or + contest your rights to work written entirely by you; rather, the + intent is to exercise the right to control the distribution of + derivative or collective works based on the Program. + + In addition, mere aggregation of another work not based on the + Program with the Program (or with a work based on the Program) on + a volume of a storage or distribution medium does not bring the + other work under the scope of this License. + + 4. You may copy and distribute the Program (or a work based on it, + under Section 2) in object code or executable form under the terms + of Sections 1 and 2 above provided that you also do one of the + following: + + a. Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of + Sections 1 and 2 above on a medium customarily used for + software interchange; or, + + b. Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a + medium customarily used for software interchange; or, + + c. Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with + such an offer, in accord with Subsection b above.) 
+ + The source code for a work means the preferred form of the work for + making modifications to it. For an executable work, complete + source code means all the source code for all modules it contains, + plus any associated interface definition files, plus the scripts + used to control compilation and installation of the executable. + However, as a special exception, the source code distributed need + not include anything that is normally distributed (in either + source or binary form) with the major components (compiler, + kernel, and so on) of the operating system on which the executable + runs, unless that component itself accompanies the executable. + + If distribution of executable or object code is made by offering + access to copy from a designated place, then offering equivalent + access to copy the source code from the same place counts as + distribution of the source code, even though third parties are not + compelled to copy the source along with the object code. + + 5. You may not copy, modify, sublicense, or distribute the Program + except as expressly provided under this License. Any attempt + otherwise to copy, modify, sublicense or distribute the Program is + void, and will automatically terminate your rights under this + License. However, parties who have received copies, or rights, + from you under this License will not have their licenses + terminated so long as such parties remain in full compliance. + + 6. You are not required to accept this License, since you have not + signed it. However, nothing else grants you permission to modify + or distribute the Program or its derivative works. These actions + are prohibited by law if you do not accept this License. + Therefore, by modifying or distributing the Program (or any work + based on the Program), you indicate your acceptance of this + License to do so, and all its terms and conditions for copying, + distributing or modifying the Program or works based on it. + + 7. 
Each time you redistribute the Program (or any work based on the + Program), the recipient automatically receives a license from the + original licensor to copy, distribute or modify the Program + subject to these terms and conditions. You may not impose any + further restrictions on the recipients' exercise of the rights + granted herein. You are not responsible for enforcing compliance + by third parties to this License. + + 8. If, as a consequence of a court judgment or allegation of patent + infringement or for any other reason (not limited to patent + issues), conditions are imposed on you (whether by court order, + agreement or otherwise) that contradict the conditions of this + License, they do not excuse you from the conditions of this + License. If you cannot distribute so as to satisfy simultaneously + your obligations under this License and any other pertinent + obligations, then as a consequence you may not distribute the + Program at all. For example, if a patent license would not permit + royalty-free redistribution of the Program by all those who + receive copies directly or indirectly through you, then the only + way you could satisfy both it and this License would be to refrain + entirely from distribution of the Program. + + If any portion of this section is held invalid or unenforceable + under any particular circumstance, the balance of the section is + intended to apply and the section as a whole is intended to apply + in other circumstances. + + It is not the purpose of this section to induce you to infringe any + patents or other property right claims or to contest validity of + any such claims; this section has the sole purpose of protecting + the integrity of the free software distribution system, which is + implemented by public license practices. 
Many people have made + generous contributions to the wide range of software distributed + through that system in reliance on consistent application of that + system; it is up to the author/donor to decide if he or she is + willing to distribute software through any other system and a + licensee cannot impose that choice. + + This section is intended to make thoroughly clear what is believed + to be a consequence of the rest of this License. + + 9. If the distribution and/or use of the Program is restricted in + certain countries either by patents or by copyrighted interfaces, + the original copyright holder who places the Program under this + License may add an explicit geographical distribution limitation + excluding those countries, so that distribution is permitted only + in or among countries not thus excluded. In such case, this + License incorporates the limitation as if written in the body of + this License. + + 10. The Free Software Foundation may publish revised and/or new + versions of the General Public License from time to time. Such + new versions will be similar in spirit to the present version, but + may differ in detail to address new problems or concerns. + + Each version is given a distinguishing version number. If the + Program specifies a version number of this License which applies + to it and "any later version", you have the option of following + the terms and conditions either of that version or of any later + version published by the Free Software Foundation. If the Program + does not specify a version number of this License, you may choose + any version ever published by the Free Software Foundation. + + 11. If you wish to incorporate parts of the Program into other free + programs whose distribution conditions are different, write to the + author to ask for permission. For software which is copyrighted + by the Free Software Foundation, write to the Free Software + Foundation; we sometimes make exceptions for this. 
Our decision + will be guided by the two goals of preserving the free status of + all derivatives of our free software and of promoting the sharing + and reuse of software generally. + + NO WARRANTY + + 12. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO + WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE + LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT + HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT + WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT + NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND + FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE + QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE + PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY + SERVICING, REPAIR OR CORRECTION. + + 13. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN + WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY + MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE + LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, + INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR + INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF + DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU + OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY + OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN + ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + +How to Apply These Terms to Your New Programs +============================================= + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these +terms. + + To do so, attach the following notices to the program. 
It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES. + Copyright (C) 19YY NAME OF AUTHOR + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + Also add information on how to contact you by electronic and paper +mail. + + If the program is interactive, make it output a short notice like +this when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) 19YY NAME OF AUTHOR + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details + type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + + The hypothetical commands `show w' and `show c' should show the +appropriate parts of the General Public License. Of course, the +commands you use may be called something other than `show w' and `show +c'; they could even be mouse-clicks or menu items--whatever suits your +program. + + You should also get your employer (if you work as a programmer) or +your school, if any, to sign a "copyright disclaimer" for the program, +if necessary. 
Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + SIGNATURE OF TY COON, 1 April 1989 + Ty Coon, President of Vice + + This General Public License does not permit incorporating your +program into proprietary programs. If your program is a subroutine +library, you may consider it more useful to permit linking proprietary +applications with the library. If this is what you want to do, use the +GNU Library General Public License instead of this License. + + +File: gawk.info, Node: This Manual, Next: Getting Started, Prev: Copying, Up: Top + +Using this Manual +***************** + + The term `awk' refers to a particular program, and to the language +you use to tell this program what to do. When we need to be careful, +we call the program "the `awk' utility" and the language "the `awk' +language." The term `gawk' refers to a version of `awk' developed as +part of the GNU project. The purpose of this manual is to explain both the +`awk' language and how to run the `awk' utility. + + While concentrating on the features of `gawk', the manual will also +attempt to describe important differences between `gawk' and other +`awk' implementations. In particular, any features that are not in the +POSIX standard for `awk' will be noted. + + The term "`awk' program" refers to a program written by you in the +`awk' programming language. + + *Note Getting Started with `awk': Getting Started, for the bare +essentials you need to know to start using `awk'. + + Some useful "one-liners" are included to give you a feel for the +`awk' language (*note Useful "One-liners": One-liners.). + + A sample `awk' program has been provided for you (*note Sample +Program::.). + + If you find terms that you aren't familiar with, try looking them up +in the glossary (*note Glossary::.). 
+ + The entire `awk' language is summarized for quick reference in *Note +`gawk' Summary: Gawk Summary. Look there if you just need to refresh +your memory about a particular feature. + + Most of the time complete `awk' programs are used as examples, but in +some of the more advanced sections, only the part of the `awk' program +that illustrates the concept being described is shown. + +* Menu: + +* Sample Data Files:: Sample data files for use in the `awk' + programs illustrated in this manual. + + +File: gawk.info, Node: Sample Data Files, Prev: This Manual, Up: This Manual + +Data Files for the Examples +=========================== + + Many of the examples in this manual take their input from two sample +data files. The first, called `BBS-list', represents a list of +computer bulletin board systems together with information about those +systems. The second data file, called `inventory-shipped', contains +information about shipments on a monthly basis. Each line of these +files is one "record". + + In the file `BBS-list', each record contains the name of a computer +bulletin board, its phone number, the board's baud rate, and a code for +the number of hours it is operational. An `A' in the last column means +the board operates 24 hours a day. A `B' in the last column means the +board operates evening and weekend hours, only. A `C' means the board +operates only on weekends. + + aardvark 555-5553 1200/300 B + alpo-net 555-3412 2400/1200/300 A + barfly 555-7685 1200/300 A + bites 555-1675 2400/1200/300 A + camelot 555-0542 300 C + core 555-2912 1200/300 C + fooey 555-1234 2400/1200/300 B + foot 555-6699 1200/300 B + macfoo 555-6480 1200/300 A + sdace 555-3430 2400/1200/300 A + sabafoo 555-2127 1200/300 C + + The second data file, called `inventory-shipped', represents +information about shipments during the year. 
Each record contains the +month of the year, the number of green crates shipped, the number of +red boxes shipped, the number of orange bags shipped, and the number of +blue packages shipped, respectively. There are 16 entries, covering +the 12 months of one year and 4 months of the next year. + + Jan 13 25 15 115 + Feb 15 32 24 226 + Mar 15 24 34 228 + Apr 31 52 63 420 + May 16 34 29 208 + Jun 31 42 75 492 + Jul 24 34 67 436 + Aug 15 34 47 316 + Sep 13 55 37 277 + Oct 29 54 68 525 + Nov 20 87 82 577 + Dec 17 35 61 401 + + Jan 21 36 64 620 + Feb 26 58 80 652 + Mar 24 75 70 495 + Apr 21 70 74 514 + + If you are reading this in GNU Emacs using Info, you can copy the +regions of text showing these sample files into your own test files. +This way you can try out the examples shown in the remainder of this +document. You do this by using the command `M-x write-region' to copy +text from the Info file into a file for use with `awk' (*Note Misc File +Ops: (emacs)Misc File Ops, for more information). Using this +information, create your own `BBS-list' and `inventory-shipped' files, +and practice what you learn in this manual. + + +File: gawk.info, Node: Getting Started, Next: Reading Files, Prev: This Manual, Up: Top + +Getting Started with `awk' +************************** + + The basic function of `awk' is to search files for lines (or other +units of text) that contain certain patterns. When a line matches one +of the patterns, `awk' performs specified actions on that line. `awk' +keeps processing input lines in this way until the end of the input +file is reached. + + When you run `awk', you specify an `awk' "program" which tells `awk' +what to do. The program consists of a series of "rules". (It may also +contain "function definitions", but that is an advanced feature, so we +will ignore it for now. *Note User-defined Functions: User-defined.) +Each rule specifies one pattern to search for, and one action to +perform when that pattern is found. 
+ + Syntactically, a rule consists of a pattern followed by an action. +The action is enclosed in curly braces to separate it from the pattern. +Rules are usually separated by newlines. Therefore, an `awk' program +looks like this: + + PATTERN { ACTION } + PATTERN { ACTION } + ... + +* Menu: + +* Very Simple:: A very simple example. +* Two Rules:: A less simple one-line example with two rules. +* More Complex:: A more complex example. +* Running gawk:: How to run `gawk' programs; + includes command line syntax. +* Comments:: Adding documentation to `gawk' programs. +* Statements/Lines:: Subdividing or combining statements into lines. +* When:: When to use `gawk' and + when to use other things. + + +File: gawk.info, Node: Very Simple, Next: Two Rules, Prev: Getting Started, Up: Getting Started + +A Very Simple Example +===================== + + The following command runs a simple `awk' program that searches the +input file `BBS-list' for the string of characters `foo'. (A string +of characters is usually called a "string". The term "string" is +perhaps based on similar usage in English, such as "a string of +pearls" or "a string of cars in a train.") + + awk '/foo/ { print $0 }' BBS-list + +When lines containing `foo' are found, they are printed, because +`print $0' means print the current line. (Just `print' by itself means +the same thing, so we could have written that instead.) + + You will notice that slashes, `/', surround the string `foo' in the +actual `awk' program. The slashes indicate that `foo' is a pattern to +search for. This type of pattern is called a "regular expression", and +is covered in more detail later (*note Regular Expressions as Patterns: +Regexp.). There are single-quotes around the `awk' program so that the +shell won't interpret any of it as special shell characters. 
+ + Here is what this program prints: + + fooey 555-1234 2400/1200/300 B + foot 555-6699 1200/300 B + macfoo 555-6480 1200/300 A + sabafoo 555-2127 1200/300 C + + In an `awk' rule, either the pattern or the action can be omitted, +but not both. If the pattern is omitted, then the action is performed +for *every* input line. If the action is omitted, the default action +is to print all lines that match the pattern. + + Thus, we could leave out the action (the `print' statement and the +curly braces) in the above example, and the result would be the same: +all lines matching the pattern `foo' would be printed. By comparison, +omitting the `print' statement but retaining the curly braces makes an +empty action that does nothing; then no lines would be printed. + + +File: gawk.info, Node: Two Rules, Next: More Complex, Prev: Very Simple, Up: Getting Started + +An Example with Two Rules +========================= + + The `awk' utility reads the input files one line at a time. For +each line, `awk' tries the patterns of each of the rules. If several +patterns match then several actions are run, in the order in which they +appear in the `awk' program. If no patterns match, then no actions are +run. + + After processing all the rules (perhaps none) that match the line, +`awk' reads the next line (however, *note The `next' Statement: Next +Statement.). This continues until the end of the file is reached. + + For example, the `awk' program: + + /12/ { print $0 } + /21/ { print $0 } + +contains two rules. The first rule has the string `12' as the pattern +and `print $0' as the action. The second rule has the string `21' as +the pattern and also has `print $0' as the action. Each rule's action +is enclosed in its own pair of braces. + + This `awk' program prints every line that contains the string `12' +*or* the string `21'. If a line contains both strings, it is printed +twice, once by each rule. 
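The double printing can be checked on a small scale. The following sketch feeds three hand-picked records to the same two-rule program through a pipe; the choice of records and the shell variable `out' are assumptions made for the illustration. The `sabafoo' record contains both `12' and `21', the second record contains only `21', and the third contains neither.

```shell
# Feed three records to the two-rule program on standard input.
out=$(printf '%s\n' \
        'sabafoo 555-2127 1200/300 C' \
        'Jan 21 36 64 620' \
        'Mar 15 24 34 228' |
      awk '/12/ { print $0 }
           /21/ { print $0 }')
printf '%s\n' "$out"
```

The `sabafoo' record should appear twice (once per matching rule), the `Jan' record once, and the `Mar' record not at all.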
+ + If we run this program on our two sample data files, `BBS-list' and +`inventory-shipped', as shown here: + + awk '/12/ { print $0 } + /21/ { print $0 }' BBS-list inventory-shipped + +we get the following output: + + aardvark 555-5553 1200/300 B + alpo-net 555-3412 2400/1200/300 A + barfly 555-7685 1200/300 A + bites 555-1675 2400/1200/300 A + core 555-2912 1200/300 C + fooey 555-1234 2400/1200/300 B + foot 555-6699 1200/300 B + macfoo 555-6480 1200/300 A + sdace 555-3430 2400/1200/300 A + sabafoo 555-2127 1200/300 C + sabafoo 555-2127 1200/300 C + Jan 21 36 64 620 + Apr 21 70 74 514 + +Note how the line in `BBS-list' beginning with `sabafoo' was printed +twice, once for each rule. + + +File: gawk.info, Node: More Complex, Next: Running gawk, Prev: Two Rules, Up: Getting Started + +A More Complex Example +====================== + + Here is an example to give you an idea of what typical `awk' +programs do. This example shows how `awk' can be used to summarize, +select, and rearrange the output of another utility. It uses features +that haven't been covered yet, so don't worry if you don't understand +all the details. + + ls -l | awk '$5 == "Nov" { sum += $4 } + END { print sum }' + + This command prints the total number of bytes in all the files in the +current directory that were last modified in November (of any year). +(In the C shell you would need to type a semicolon and then a backslash +at the end of the first line; in a POSIX-compliant shell, such as the +Bourne shell or the Bourne-Again shell, you can type the example as +shown.) + + The `ls -l' part of this example is a command that gives you a +listing of the files in a directory, including file size and date. 
Its +output looks like this: + + -rw-r--r-- 1 close 1933 Nov 7 13:05 Makefile + -rw-r--r-- 1 close 10809 Nov 7 13:03 gawk.h + -rw-r--r-- 1 close 983 Apr 13 12:14 gawk.tab.h + -rw-r--r-- 1 close 31869 Jun 15 12:20 gawk.y + -rw-r--r-- 1 close 22414 Nov 7 13:03 gawk1.c + -rw-r--r-- 1 close 37455 Nov 7 13:03 gawk2.c + -rw-r--r-- 1 close 27511 Dec 9 13:07 gawk3.c + -rw-r--r-- 1 close 7989 Nov 7 13:03 gawk4.c + +The first field contains the file's permissions, the second field +contains the number of links to the file, and the third field +identifies the owner of the file. The fourth field contains the size +of the file in bytes. The fifth, sixth, and seventh fields contain the +month, day, and time, respectively, that the file was last modified. +Finally, the eighth field contains the name of the file. + + The `$5 == "Nov"' in our `awk' program is an expression that tests +whether the fifth field of the output from `ls -l' is equal to the string +`Nov'. Each time a line has the string `Nov' as its fifth field, the +action `{ sum += $4 }' is performed. This adds the fourth field (the +file size) to the variable `sum'. As a result, when `awk' has finished +reading all the input lines, `sum' is the sum of the sizes of files +whose lines matched the pattern. (This works because `awk' variables +are automatically initialized to zero.) + + After the last line of output from `ls' has been processed, the +`END' rule is executed, and the value of `sum' is printed. In this +example, the value of `sum' would be 80600. + + These more advanced `awk' techniques are covered in later sections +(*note Overview of Actions: Actions.). Before you can move on to more +advanced `awk' programming, you have to know how `awk' interprets your +input and displays your output. By manipulating fields and using +`print' statements, you can produce some very useful and +spectacular-looking reports. 
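The total can be reproduced without depending on the contents of a real directory. The sketch below saves the listing shown in this node to a scratch file and runs the same `awk' program on it; the scratch file and the shell variable `total' are assumptions made for the illustration. Canned input is used on purpose: on many current systems `ls -l' also prints a group column, which would shift the size and month to the fifth and sixth fields.

```shell
# Save the sample `ls -l' listing from the text in a scratch file.
listing=$(mktemp)
cat > "$listing" <<'EOF'
-rw-r--r-- 1 close 1933 Nov 7 13:05 Makefile
-rw-r--r-- 1 close 10809 Nov 7 13:03 gawk.h
-rw-r--r-- 1 close 983 Apr 13 12:14 gawk.tab.h
-rw-r--r-- 1 close 31869 Jun 15 12:20 gawk.y
-rw-r--r-- 1 close 22414 Nov 7 13:03 gawk1.c
-rw-r--r-- 1 close 37455 Nov 7 13:03 gawk2.c
-rw-r--r-- 1 close 27511 Dec 9 13:07 gawk3.c
-rw-r--r-- 1 close 7989 Nov 7 13:03 gawk4.c
EOF

# Add up field 4 (size) for every record whose field 5 (month) is Nov.
total=$(awk '$5 == "Nov" { sum += $4 }
             END { print sum }' "$listing")
echo "$total"    # prints 80600

rm -f "$listing"
```

The five November entries (1933 + 10809 + 22414 + 37455 + 7989) account for the printed total.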
+ + +File: gawk.info, Node: Running gawk, Next: Comments, Prev: More Complex, Up: Getting Started + +How to Run `awk' Programs +========================= + + There are several ways to run an `awk' program. If the program is +short, it is easiest to include it in the command that runs `awk', like +this: + + awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ... + +where PROGRAM consists of a series of patterns and actions, as +described earlier. + + When the program is long, it is usually more convenient to put it in +a file and run it with a command like this: + + awk -f PROGRAM-FILE INPUT-FILE1 INPUT-FILE2 ... + +* Menu: + +* One-shot:: Running a short throw-away `awk' program. +* Read Terminal:: Using no input files (input from + terminal instead). +* Long:: Putting permanent `awk' programs in files. +* Executable Scripts:: Making self-contained `awk' programs. + + +File: gawk.info, Node: One-shot, Next: Read Terminal, Prev: Running gawk, Up: Running gawk + +One-shot Throw-away `awk' Programs +---------------------------------- + + Once you are familiar with `awk', you will often type simple +programs at the moment you want to use them. Then you can write the +program as the first argument of the `awk' command, like this: + + awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ... + +where PROGRAM consists of a series of PATTERNS and ACTIONS, as +described earlier. + + This command format instructs the shell to start `awk' and use the +PROGRAM to process records in the input file(s). There are single +quotes around PROGRAM so that the shell doesn't interpret any `awk' +characters as special shell characters. They also cause the shell to +treat all of PROGRAM as a single argument for `awk' and allow PROGRAM +to be more than one line long. + + This format is also useful for running short or medium-sized `awk' +programs from shell scripts, because it avoids the need for a separate +file for the `awk' program. 
A self-contained shell script is more +reliable since there are no other files to misplace. + + +File: gawk.info, Node: Read Terminal, Next: Long, Prev: One-shot, Up: Running gawk + +Running `awk' without Input Files +--------------------------------- + + You can also run `awk' without any input files. If you type the +command line: + + awk 'PROGRAM' + +then `awk' applies the PROGRAM to the "standard input", which usually +means whatever you type on the terminal. This continues until you +indicate end-of-file by typing `Control-d'. + + For example, if you execute this command: + + awk '/th/' + +whatever you type next is taken as data for that `awk' program. If you +go on to type the following data: + + Kathy + Ben + Tom + Beth + Seth + Karen + Thomas + `Control-d' + +then `awk' prints this output: + + Kathy + Beth + Seth + +as matching the pattern `th'. Notice that it did not recognize +`Thomas' as matching the pattern. The `awk' language is "case +sensitive", and matches patterns exactly. (However, you can override +this with the variable `IGNORECASE'. *Note Case-sensitivity in +Matching: Case-sensitivity.) + + +File: gawk.info, Node: Long, Next: Executable Scripts, Prev: Read Terminal, Up: Running gawk + +Running Long Programs +--------------------- + + Sometimes your `awk' programs can be very long. In this case it is +more convenient to put the program into a separate file. To tell `awk' +to use that file for its program, you type: + + awk -f SOURCE-FILE INPUT-FILE1 INPUT-FILE2 ... + + The `-f' instructs the `awk' utility to get the `awk' program from +the file SOURCE-FILE. Any file name can be used for SOURCE-FILE. For +example, you could put the program: + + /th/ + +into the file `th-prog'. Then this command: + + awk -f th-prog + +does the same thing as this one: + + awk '/th/' + +which was explained earlier (*note Running `awk' without Input Files: +Read Terminal.). 
Note that you don't usually need single quotes around +the file name that you specify with `-f', because most file names don't +contain any of the shell's special characters. Notice that in +`th-prog', the `awk' program did not have single quotes around it. The +quotes are only needed for programs that are provided on the `awk' +command line. + + If you want to identify your `awk' program files clearly as such, +you can add the extension `.awk' to the file name. This doesn't affect +the execution of the `awk' program, but it does make "housekeeping" +easier. + + +File: gawk.info, Node: Executable Scripts, Prev: Long, Up: Running gawk + +Executable `awk' Programs +------------------------- + + Once you have learned `awk', you may want to write self-contained +`awk' scripts, using the `#!' script mechanism. You can do this on +many Unix systems (1) (and someday on GNU). + + For example, you could create a text file named `hello', containing +the following (where `BEGIN' is a feature we have not yet discussed): + + #! /bin/awk -f + + # a sample awk program + BEGIN { print "hello, world" } + +After making this file executable (with the `chmod' command), you can +simply type: + + hello + +at the shell, and the system will arrange to run `awk' (2) as if you +had typed: + + awk -f hello + +Self-contained `awk' scripts are useful when you want to write a +program which users can invoke without knowing that the program is +written in `awk'. + + If your system does not support the `#!' mechanism, you can get a +similar effect using a regular shell script. It would look something +like this: + + : The colon makes sure this script is executed by the Bourne shell. + awk 'PROGRAM' "$@" + + Using this technique, it is *vital* to enclose the PROGRAM in single +quotes to protect it from interpretation by the shell. If you omit the +quotes, only a shell wizard can predict the results. 
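As a rough sketch of the shell script technique, the following commands build such a wrapper around the `/th/' program and run it on a small data file. The names `th-finder' and `names', the scratch directory, and the shell variable `matches' are all hypothetical. The wrapper is run through `sh' here to keep the sketch independent of how the scratch directory is mounted; after `chmod +x' it could also be invoked by name.

```shell
# Work in a scratch directory so nothing is left behind.
dir=$(mktemp -d)

# The wrapper: the awk program is single-quoted, the arguments are forwarded.
cat > "$dir/th-finder" <<'EOF'
: The colon makes sure this script is executed by the Bourne shell.
awk '/th/' "$@"
EOF
chmod +x "$dir/th-finder"

# A small data file to search.
printf '%s\n' Kathy Ben Tom Beth Seth Karen Thomas > "$dir/names"

# Run the wrapper; its argument is handed on to awk as an input file.
matches=$(sh "$dir/th-finder" "$dir/names")
printf '%s\n' "$matches"

rm -rf "$dir"
```

The output should list `Kathy', `Beth', and `Seth'; `Thomas' is skipped because matching is case sensitive.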
+ + The `"$@"' causes the shell to forward all the command line +arguments to the `awk' program, without interpretation. The first +line, which starts with a colon, is used so that this shell script will +work even if invoked by a user who uses the C shell. + + ---------- Footnotes ---------- + + (1) The `#!' mechanism works on Unix systems derived from Berkeley +Unix, System V Release 4, and some System V Release 3 systems. + + (2) The line beginning with `#!' lists the full pathname of an +interpreter to be run, and an optional initial command line argument to +pass to that interpreter. The operating system then runs the +interpreter with the given argument and the full argument list of the +executed program. The first argument in the list is the full pathname +of the `awk' program. The rest of the argument list will either be +options to `awk', or data files, or both. + + +File: gawk.info, Node: Comments, Next: Statements/Lines, Prev: Running gawk, Up: Getting Started + +Comments in `awk' Programs +========================== + + A "comment" is some text that is included in a program for the sake +of human readers, and that is not really part of the program. Comments +can explain what the program does, and how it works. Nearly all +programming languages have provisions for comments, because programs are +typically hard to understand without their extra help. + + In the `awk' language, a comment starts with the sharp sign +character, `#', and continues to the end of the line. The `awk' +language ignores the rest of a line following a sharp sign. For +example, we could have put the following into `th-prog': + + # This program finds records containing the pattern `th'. This is how + # you continue comments on additional lines. + /th/ + + You can put comment lines into keyboard-composed throw-away `awk' +programs also, but this usually isn't very useful; the purpose of a +comment is to help you or another person understand the program at a +later time. + |
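Comments are most at home in program files run with `-f'. The sketch below writes a commented version of the `/th/' program to a scratch file and runs it; the scratch file, the sample names, and the shell variable `found' are assumptions made for the illustration. It also shows that a comment may follow a rule on the same line, since everything after the sharp sign is ignored.

```shell
# Put a commented awk program in a scratch file.
prog=$(mktemp)
cat > "$prog" <<'EOF'
# This program finds records containing the pattern `th'.
/th/    # the default action prints each matching record
EOF

# Run the program file on a few names supplied on standard input.
found=$(printf '%s\n' Kathy Ben Thomas Seth | awk -f "$prog")
printf '%s\n' "$found"

rm -f "$prog"
```

Only `Kathy' and `Seth' should be printed; the comment lines have no effect on the result.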