Diffstat (limited to 'doc/gawk.texi')
-rw-r--r-- doc/gawk.texi 766
1 files changed, 766 insertions, 0 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index 5b3dd71c..2d68b9cc 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -285,6 +285,7 @@ particular records in a file and perform operations upon them.
* Functions:: Built-in and user-defined functions.
* Internationalization:: Getting @command{gawk} to speak your
language.
+* Arbitrary Precision Arithmetic:: Arbitrary precision arithmetic with @command{gawk}.
* Advanced Features:: Stuff for advanced users, specific to
@command{gawk}.
* Library Functions:: A Library of @command{awk} Functions.
@@ -551,6 +552,21 @@ particular records in a file and perform operations upon them.
* I18N Portability:: @command{awk}-level portability issues.
* I18N Example:: A simple i18n example.
* Gawk I18N:: @command{gawk} is also internationalized.
+* Floating-point Programming:: Effective floating-point programming.
+* Floating-point Representation:: Binary floating-point representation.
+* Floating-point Context:: Floating-point context.
+* Rounding Mode:: Floating-point rounding mode.
+* Arbitrary Precision Floats:: Arbitrary precision floating-point
+ arithmetic with @command{gawk}.
+* Setting Precision:: Setting the working precision.
+* Setting Rounding Mode:: Setting the rounding mode.
+* Floating-point Constants:: Representing floating-point constants.
+* Changing Precision:: Changing the precision of a number.
+* Exact Arithmetic:: Exact arithmetic with floating-point numbers.
+* Integer Programming:: Effective integer programming.
+* Arbitrary Precision Integers:: Arbitrary precision integer
+ arithmetic with @command{gawk}.
+* MPFR and GMP Libraries:: Information about the MPFR and GMP libraries.
* Nondecimal Data:: Allowing nondecimal input data.
* Array Sorting:: Facilities for controlling array traversal
and sorting arrays.
@@ -3212,6 +3228,14 @@ when eliminating problems pointed out by @option{--lint}, you should take
care to search for all occurrences of each inappropriate construct. As
@command{awk} programs are usually short, doing so is not burdensome.
+@item -M
+@itemx --bcmath
+@cindex @code{-M} option
+@cindex @code{--bcmath} option
+Force arbitrary precision arithmetic on numbers. This option has no effect
+if @command{gawk} is not compiled to use the GNU MPFR and MP libraries
+(@pxref{Arbitrary Precision Arithmetic}).
+
@item -n
@itemx --non-decimal-data
@cindex @code{-n} option
@@ -18294,6 +18318,748 @@ then @command{gawk} produces usage messages, warnings,
and fatal errors in the local language.
@c ENDOFRANGE inloc
+@node Arbitrary Precision Arithmetic
+@chapter Arbitrary Precision Arithmetic with @command{gawk}
+@cindex arbitrary precision
+@cindex multiple precision
+@cindex infinite precision
+@cindex floating-point numbers, arbitrary precision
+@cindex MPFR
+@cindex GMP
+
+@cindex Knuth, Donald
+@quotation
+@i{There's a credibility gap: We don't know how much of the computer's answers
+to believe. Novice computer users solve this problem by implicitly trusting
+in the computer as an infallible authority; they tend to believe that all
+digits of a printed answer are significant. Disillusioned computer users have
+just the opposite approach; they are constantly afraid that their answers
+are almost meaningless.}@footnote{
+Donald E. Knuth. The Art of Computer Programming. Volume 2,
+Seminumerical Algorithms, 3rd edition, 1998, ISBN 0-201-89683-4, p. 229.
+}
+
+Donald Knuth
+@end quotation
+
+
+This chapter describes how to use the arbitrary precision
+(also known as multiple precision or infinite precision) numeric
+capabilities in @command{gawk} to produce maximally accurate results
+when you need them. But first you should check whether your version of
+@command{gawk} supports arbitrary precision arithmetic.
+The easiest way to find out is to look at the output of
+the following command:
+
+@example
+$ @kbd{gawk --version}
+@print{} GNU Awk 4.1.0 (GNU MPFR 3.1.0, GNU MP 5.0.3)
+@print{} Copyright (C) 1989, 1991-2012 Free Software Foundation.
+..
+@end example
+
+Gawk uses the GNU MPFR and MP libraries for arbitrary precision arithmetic
+on numbers. So if you do not see the names of these libraries in the output above,
+then your version of @command{gawk} does not support arbitrary precision math.
+
+Even if you aren't interested in arbitrary precision arithmetic, you
+may still benefit from knowing how @command{gawk} handles numbers
+in general, and the limitations of doing arithmetic with ordinary
+@command{gawk} numbers.
+
+@menu
+* Floating-point Programming:: Effective Floating-point Programming.
+* Floating-point Representation:: Binary Floating-point Representation.
+* Floating-point Context:: Floating-point Context.
+* Rounding Mode:: Floating-point Rounding Mode.
+* Arbitrary Precision Floats:: Arbitrary Precision Floating-point
+ Arithmetic with @command{gawk}.
+* Setting Precision:: Setting the Working Precision.
+* Setting Rounding Mode:: Setting the Rounding Mode.
+* Floating-point Constants:: Representing Floating-point Constants.
+* Changing Precision:: Changing the Precision of a Number.
+* Exact Arithmetic:: Exact Arithmetic with Floating-point Numbers.
+* Integer Programming:: Effective Integer Programming.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer
+ Arithmetic with @command{gawk}.
+* MPFR and GMP Libraries:: Information About the MPFR and GMP Libraries.
+@end menu
+
+@node Floating-point Programming
+@section Effective Floating-point Programming
+
+Numerical programming is an extensive area; if you need to develop
+sophisticated numerical algorithms then @command{gawk} may not be
+the ideal tool, and this documentation may not be sufficient.
+It might require a book or two to communicate how to compute
+with ideal accuracy and precision, and the result often depends
+on the particular application.
+
+Binary floating-point representations and arithmetic are inexact.
+Simple values like 0.1 cannot be precisely represented using
+binary floating-point numbers, and the limited precision of
+floating-point numbers means that slight changes in
+the order of operations or the precision of intermediate storage
+can change the result. Arbitrary precision floating-point adds a further
+complication: the precision is set before a computation starts, and a
+different choice of precision can lead to a different final result.
+
+Sometimes you need to think more about what you really want
+and what's really happening. Consider the two numbers
+in the following example:
+
+@example
+ x = 0.875 # 1/2 + 1/4 + 1/8
+ y = 0.425
+@end example
+
+Unlike the number in @code{y}, the number stored in @code{x} is exactly representable
+in binary since it can be written as a finite sum of one or
+more fractions whose denominators are all powers of two.
+When @command{gawk} reads a floating-point number from
+a program source, it automatically rounds that number to whatever
+precision your machine supports. If you try to print the numeric
+content of a variable using an output format string of @code{"%.17g"},
+it may not produce the same number as you assigned to it:
+
+@example
+$ @kbd{gawk 'BEGIN @{ x = 0.875; y = 0.425; printf("%0.17g, %0.17g\n", x, y) @}'}
+@print{} 0.875, 0.42499999999999999
+@end example
+
+Often the error is so small you do not even notice it, and if you do,
+you can always specify how much precision you would like in your output.
+Usually this is a format string like @code{"%.15g"}, which when
+used in the example above produces output identical to the input.
+
+Because the underlying representation can be a little bit off from the exact value,
+comparing floats to see if they are equal is generally not a good idea.
+Here is an example where it does not work as you might expect:
+
+@example
+$ @kbd{gawk 'BEGIN @{ print (0.1 + 12.2 == 12.3) @}'}
+@print{} 0
+@end example
+
+The loss of accuracy during a single computation with floating-point numbers
+usually isn't enough to worry about. However, if you compute a value
+which is the result of a sequence of floating-point operations,
+the error can accumulate and greatly affect the computation itself.
+Here is an attempt to compute the value of the constant @samp{pi} using one of its many
+series representations:
+
+@example
+$ cat pi.awk
+BEGIN @{
+ x = 1.0 / sqrt(3.0)
+ n = 6
+ for (i = 1; i < 30; i++) @{
+ n = n * 2.0
+ x = (sqrt(x * x + 1) - 1) / x
+ printf("%.15f\n", n * x)
+ @}
+@}
+@end example
+
+When run, the early errors propagate through later computations and
+cause the loop to terminate prematurely after an attempt to divide by zero.
+Here is one more example where the inaccuracies in internal representations
+yield an unexpected result:
+
+@example
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{for (d = 1.1; d <= 1.5; d += 0.1)}
+> @kbd{i++}
+> @kbd{print i}
+> @kbd{@}'}
+@print{} 4
+@end example
+
+Can computation using arbitrary precision help with the examples above?
+If you are impatient to know,
+see @ref{Exact Arithmetic}.
+Instead of arbitrary precision floating-point arithmetic,
+often all you need is an adjustment of your logic
+or a different order for the operations in your calculation.
+The stability and the accuracy of the computation of the constant @samp{pi}
+in the example above can be enhanced by using the following
+simple algebraic transformation:
+
+@example
+ (sqrt(x * x + 1) - 1) / x = x / (sqrt(x * x + 1) + 1)
+@end example
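+
+As a sketch, assuming the rest of @file{pi.awk} is left unchanged,
+the loop might be rewritten with the transformed expression like this:
+
+@example
+BEGIN @{
+    x = 1.0 / sqrt(3.0)
+    n = 6
+    for (i = 1; i < 30; i++) @{
+        n = n * 2.0
+        # Same quantity as before, written without subtracting
+        # two nearly equal numbers.
+        x = x / (sqrt(x * x + 1) + 1)
+        printf("%.15f\n", n * x)
+    @}
+@}
+@end example
+
+@noindent
+The printed values should then settle toward @samp{pi} instead of
+degrading as @code{x} gets small.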
+
+There is no need to be unduly suspicious about the results from
+floating-point arithmetic. The lesson to remember is that
+floating-point math is always more complex than the math using
+pencil and paper. In order to take advantage of the power
+of computer floating-point, you need to know its limitations
+and work within them. For most casual use of floating-point arithmetic,
+you will often get the expected result in the end if you simply round
+the display of your final results to the correct number of significant
+decimal digits. Avoid presenting numerical data in a manner that
+implies better precision than is actually the case.
+
+@node Floating-point Representation
+@section Binary Floating-point Representation
+@cindex IEEE-754 format
+
+Although floating-point representations vary from machine to machine,
+the most commonly encountered representation is that defined by the
+IEEE 754 Standard. An IEEE-754 format has three components:
+a sign bit telling whether the number is positive or negative,
+an exponent giving its order of magnitude @var{e}, and a significand @var{s}
+specifying the actual digits of the number. The value of the
+number is then @var{s} * 2^@var{e}. The first bit of a non-zero binary significand
+is always one so the significand in an IEEE-754 format only includes the
+fractional part leaving the leading one implicit.
+
+Three of the standard IEEE-754 types are 32-bit single precision,
+64-bit double precision and 128-bit quadruple precision.
+The standard also specifies extended precision formats
+to allow greater precisions and larger exponent ranges.
+
+
+@node Floating-point Context
+@section Floating-point Context
+@cindex context, floating-point
+
+A floating-point context defines the environment for arithmetic operations.
+It governs precision, sets rules for rounding and limits range for exponents.
+The context has the following primary components:
+
+@table @code
+@item precision
+Precision of the floating-point format in bits.
+@item emax
+Maximum exponent allowed for this format.
+@item emin
+Minimum exponent allowed for this format.
+@item subnormal behavior
+The format may or may not support gradual underflow.
+@item rounding
+The rounding mode of this context.
+@end table
+
+@ref{table-ieee-formats} lists the precision and exponent
+field values for the basic IEEE-754 binary formats:
+
+@float Table,table-ieee-formats
+@caption{Basic IEEE Formats}
+@multitable @columnfractions .20 .20 .20 .20 .20
+@headitem Name @tab Total bits @tab Precision @tab emin @tab emax
+@item Single @tab 32 @tab 24 @tab -126 @tab +127
+@item Double @tab 64 @tab 53 @tab -1022 @tab +1023
+@item Quadruple @tab 128 @tab 113 @tab -16382 @tab +16383
+@end multitable
+@end float
+
+@quotation NOTE
+The precision numbers include the implied leading one that gives them
+one extra bit of significand.
+@end quotation
+
+A floating-point context can also determine which signals are treated as exceptions,
+or can set rules for arithmetic with special values. The interested reader should
+consult the IEEE-754 standard or other resources for details.
+
+Gawk ordinarily uses the hardware double precision for a number.
+On most systems, it is in IEEE-754 floating-point format which corresponds
+to 64-bit binary with 53 bits of precision.
+
+
+@quotation NOTE
+When an underflow occurs, the standard allows, but does not require, the result
+of an arithmetic operation to lose precision gradually when it is not
+exactly zero but is too close to zero to be a normal number. Such numbers do not
+have as many significant digits as normal numbers, and are called denormals or subnormals.
+The basic IEEE-754 binary formats support subnormal numbers.
+@end quotation
+
+
+@node Rounding Mode
+@section Floating-point Rounding Mode
+@cindex rounding mode, floating-point
+
+Rounding mode specifies the behavior for the results of numerical operations when
+discarding extra precision. Each rounding mode indicates how the
+least significant returned digit of a rounded result is to be calculated.
+@ref{table-rounding-modes} lists the IEEE-754 defined rounding modes:
+
+@float Table,table-rounding-modes
+@caption{Rounding Modes}
+@multitable @columnfractions .45 .25 .30
+@headitem Rounding Mode @tab IEEE Name @tab @code{RNDMODE} (@pxref{Setting Rounding Mode})
+@item Round to nearest, ties to even @tab @code{roundTiesToEven} @tab @code{"N"} or @code{"n"}
+@item Round toward plus Infinity @tab @code{roundTowardPositive} @tab @code{"U"} or @code{"u"}
+@item Round toward negative Infinity @tab @code{roundTowardNegative} @tab @code{"D"} or @code{"d"}
+@item Round toward zero @tab @code{roundTowardZero} @tab @code{"Z"} or @code{"z"}
+@item Round to nearest, ties away from zero @tab @code{roundTiesToAway} @tab @code{"A"} or @code{"a"}
+@end multitable
+@end float
+
+The default mode @samp{roundTiesToEven} is the most preferred,
+but the least intuitive. This method does the obvious thing for most values,
+by rounding them up or down to the nearest digit.
+For example, rounding 1.132 to two digits yields 1.13,
+and rounding 1.157 yields 1.16.
+When it comes to rounding a value that is exactly halfway between,
+it probably does not work the way you learned in school.
+In this case, the number is rounded to the nearest even digit.
+So rounding 0.125 to two digits rounds down to 0.12,
+but rounding 0.6875 to three digits rounds up to 0.688.
+You probably have already encountered this rounding mode when
+using the @code{printf} routine to format floating-point numbers.
+For example:
+
+@example
+BEGIN @{
+ x = -4.5
+ for (i = 1; i < 10; i++) @{
+ x += 1.0
+ printf("%4.1f => %2.0f\n", x, x)
+ @}
+@}
+@end example
+
+@noindent
+produces the following output when run@footnote{
+It is possible for the output to be completely different if the
+C library in your system does not use the IEEE-754 even-rounding
+rule to round halfway cases for @code{printf()}.}:
+
+@example
+-3.5 => -4
+-2.5 => -2
+-1.5 => -2
+-0.5 => 0
+ 0.5 => 0
+ 1.5 => 2
+ 2.5 => 2
+ 3.5 => 4
+ 4.5 => 4
+@end example
+
+The theory behind the rounding mode @samp{roundTiesToEven} is that
+it more or less evenly distributes upward and downward rounds
+of exact halves, which might cause the round-off error
+to cancel itself out. This is the default rounding mode used
+in IEEE-754 computing functions and operators.
+
+The other rounding modes are rarely used.
+Round toward positive infinity @samp{roundTowardPositive}
+and round toward negative infinity @samp{roundTowardNegative}
+are often used to implement interval arithmetic,
+where you adjust the rounding mode to calculate upper and lower bounds
+for the range of output. The @samp{roundTowardZero}
+mode can be used for converting floating-point numbers to integers.
+The rounding mode @samp{roundTiesToAway} rounds the result to the
+nearest number and selects the number with the larger magnitude
+if a tie occurs.
+
+Some numerical analysts will tell you that your choice of rounding style
+has tremendous impact on the final outcome, and advise you to wait until
+final output for any rounding. This goal can often be achieved by
+setting the precision initially to some value sufficiently larger than
+the final desired precision so that the accumulation of round-off error
+does not influence the outcome.
+If you suspect that results from your computation are
+sensitive to accumulation of round-off error,
+one way to be sure is to look for a significant difference in output
+when you change the rounding mode.
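+
+As a rough sketch of such a check, assuming a @command{gawk} built with
+MPFR support and a program in a hypothetical file @file{prog.awk},
+you might run the same program under two directed rounding modes
+(@pxref{Setting Rounding Mode}) and compare the output:
+
+@example
+$ @kbd{gawk -M -vRNDMODE="U" -f prog.awk > up.out}
+$ @kbd{gawk -M -vRNDMODE="D" -f prog.awk > down.out}
+$ @kbd{diff up.out down.out}
+@end example
+
+@noindent
+Differences in more than the last digit or two suggest that
+the computation is sensitive to round-off error.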
+
+
+@node Arbitrary Precision Floats
+@section Arbitrary Precision Floating-point Arithmetic with @command{gawk}
+
+Gawk uses the GNU MPFR library for arbitrary precision floating-point arithmetic.
+The MPFR library provides precise control over precisions and rounding modes,
+and gives correctly rounded reproducible platform-independent results.
+With the command-line option @option{--bcmath} or @option{-M}, all floating-point
+arithmetic operators and numeric functions can yield results to any
+desired precision level supported by MPFR. Two built-in variables @code{PREC}
+(@pxref{Setting Precision})
+and @code{RNDMODE}
+(@pxref{Setting Rounding Mode})
+give a simple way of controlling the working precision and the rounding mode in @command{gawk}.
+The precision and the rounding mode are set globally for every operation to follow.
+The default working precision for arbitrary precision floats is 53@footnote{The
+default precision is 53, since according to the MPFR documentation, MPFR should be able to exactly
+reproduce all computations with double-precision machine floating-point numbers (double type in C),
+except the default exponent range is much wider and subnormal numbers are not implemented.}
+and the default value for @code{RNDMODE} is @code{"N"} which selects the IEEE-754
+@samp{roundTiesToEven} (@pxref{Rounding Mode}) rounding mode.
+The default exponent range in MPFR (@var{emax} = 2^30 - 1, @var{emin} = -@var{emax})
+is used by @command{gawk} for all floating-point contexts.
+There is no explicit mechanism in @command{gawk} to adjust the exponent range.
+MPFR does not implement subnormal numbers by default,
+and this behavior cannot be changed in @command{gawk}.
+
+@quotation NOTE
+When emulating an IEEE-754 format (@pxref{Setting Precision}),
+@command{gawk} internally adjusts the exponent range
+to the value defined for the format and also performs computations needed for
+gradual underflow (subnormal numbers).
+@end quotation
+
+@quotation NOTE
+MPFR numbers are variable-size entities, consuming only as much space as needed to store
+the significant digits. Since the performance using MPFR numbers pales in comparison to
+doing math on the underlying machine types, you should consider using only as much
+precision as needed by your program.
+@end quotation
+
+
+@node Setting Precision
+@section Setting the Working Precision
+@cindex @code{PREC} variable
+
+Gawk uses a global working precision; it does not keep track of
+the precision or accuracy of individual numbers. Performing an arithmetic
+operation or calling a built-in function rounds the result to the current
+working precision. The default working precision is 53 which can be
+modified using the built-in variable @code{PREC}. You can also set the
+value to one of the following pre-defined case-insensitive strings
+to emulate an IEEE-754 binary format:
+
+@multitable {double} {12345678901234567890123456789012345}
+@headitem @code{PREC} @tab IEEE-754 Binary Format
+@item @code{"half"} @tab 16-bit half-precision.
+@item @code{"single"} @tab Basic 32-bit single precision.
+@item @code{"double"} @tab Basic 64-bit double precision.
+@item @code{"quad"} @tab Basic 128-bit quadruple precision.
+@item @code{"oct"} @tab 256-bit octuple precision.
+@end multitable
+
+The following example illustrates the effects of changing precision
+on arithmetic operations:
+
+@example
+$ @kbd{gawk -M -vPREC=100 'BEGIN @{ x = 1.0e-400; print x + 0; \}
+> @kbd{PREC = "double"; print x + 0 @}'}
+@print{} 1e-400
+@print{} 0
+@end example
+
+Binary and decimal precisions are related approximately according to the
+formula @code{prec = 3.322 * dps}, where @code{prec} denotes the binary precision
+(measured in bits) and @code{dps} (short for decimal places)
+is the number of decimal digits. We can easily calculate how many decimal
+digits the 53-bit significand of an IEEE double is equivalent to:
+53 / 3.322 which is equal to about 15.95.
+But what does 15.95 digits actually mean? It depends on whether you are
+concerned about how many digits you can rely on, or how many digits
+you need.
+
+It is important to know how many decimal digits it takes to uniquely
+identify a double. If you want to round-trip from double to decimal and
+back to double (saving a double representing an intermediate result
+to a file, and later reading it back to restart the computation for instance)
+then a few more decimal digits are required. 17 digits will generally
+be enough for a double.
+
+It can also be important to know what decimal numbers can be uniquely
+represented with a floating-point double. If you want to round-trip
+from decimal to double and back again, 15 is the most that
+you can get. Stated differently, you should not present
+the numbers from your floating-point computations with more than 15
+significant digits in them.
+
+Conversely, it takes a precision of 332 bits to hold an approximation
+of the constant @samp{pi} that is accurate to 100 decimal places.
+You should always add a few extra bits in order to avoid confusing round-off
+issues that occur because numbers are stored internally in binary.
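+
+The rule of thumb above can be wrapped in a small helper. This is only
+a sketch; the function name @code{bits_for_digits} and the choice of
+eight guard bits are assumptions made for illustration:
+
+@example
+# Return a binary precision for dps decimal digits, using
+# prec = 3.322 * dps (log(10) / log(2) is about 3.322),
+# rounded up, plus a few guard bits.
+function bits_for_digits(dps,    guard)
+@{
+    guard = 8
+    return int(dps * log(10) / log(2)) + 1 + guard
+@}
+
+BEGIN @{ PREC = bits_for_digits(100) @}
+@end example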
+
+
+@node Setting Rounding Mode
+@section Setting the Rounding Mode
+@cindex @code{RNDMODE} variable
+
+The built-in variable @code{RNDMODE} has the default value @code{"N"} which selects
+the IEEE-754 rounding mode @samp{roundTiesToEven}.
+The other possible values for @code{RNDMODE} are @code{"U"} for rounding mode
+@samp{roundTowardPositive}, @code{"D"} for @samp{roundTowardNegative},
+and @code{"Z"} for @samp{roundTowardZero}.
+Gawk also accepts @code{"A"} to select the IEEE-754 mode @samp{roundTiesToAway}
+if the version of your MPFR library supports it, otherwise setting
+@code{RNDMODE} to this value has no effect. @xref{Rounding Mode},
+for the meanings of the various rounding modes.
+
+Here is an example of how to change the default rounding behavior of
+the @code{printf} output:
+
+@example
+$ @kbd{gawk -M -vRNDMODE="Z" 'BEGIN@{ printf("%.2f\n", 1.378)@}'}
+@print{} 1.37
+@end example
+
+
+@node Floating-point Constants
+@section Representing Floating-point Constants
+@cindex constants, floating-point
+
+Be wary of floating-point constants! When reading a floating-point constant
+from a program source, @command{gawk} uses the default precision, unless overridden
+by an assignment to the special variable @code{PREC} on the command
+line, to store it internally as an MPFR number.
+Changing the precision using @code{PREC} in the program text does
+not change the precision of a constant. If you need to
+represent a floating-point constant at a higher precision than the
+default and cannot use a command line assignment to @code{PREC},
+you should either specify the constant as a string, or
+a rational number whenever possible. The following example
+illustrates the differences among various ways to
+print a floating-point constant:
+
+@example
+$ @kbd{gawk -M 'BEGIN @{ PREC=113; printf("%0.25f\n", 0.1) @}'}
+@print{} 0.1000000000000000055511151
+$ @kbd{gawk -M -vPREC=113 'BEGIN @{ printf("%0.25f\n", 0.1) @}'}
+@print{} 0.1000000000000000000000000
+$ @kbd{gawk -M 'BEGIN @{ PREC=113; printf("%0.25f\n", "0.1") @}'}
+@print{} 0.1000000000000000000000000
+$ @kbd{gawk -M 'BEGIN @{ PREC=113; printf("%0.25f\n", 1/10) @}'}
+@print{} 0.1000000000000000000000000
+@end example
+
+In the first case above, the number is stored with the default precision of 53.
+
+
+@node Changing Precision
+@section Changing the Precision of a Number
+
+@cindex Laurie, Dirk
+@quotation
+@i{.. The point is that in any variable-precision package,
+a decision is made on how to treat numbers given as data,
+or arising in intermediate results, which are represented in
+floating-point format to a precision lower than working precision.
+Do we promote them to full membership of the high-precision club,
+or do we treat them and all their associates as second-class citizens?
+Sometimes the first course is proper, sometimes the second, and it takes
+careful analysis to tell which.}@footnote{
+Dirk Laurie. Variable-precision Arithmetic Considered Perilous - A Detective Story.
+Electronic Transactions on Numerical Analysis. Volume 28, pp. 168-173, 2008.
+}
+
+Dirk Laurie
+@end quotation
+
+
+Gawk does not implicitly modify the precision of any previously computed results
+when the working precision is changed with an assignment to @code{PREC} in the
+program. The precision of a number is always the one that was used at the time
+of its creation, and there is no way for the user to explicitly change it
+thereafter. However, since the result of a floating-point arithmetic operation
+is always an arbitrary precision float with a precision set by the value
+of @code{PREC}, the following workaround will effectively accomplish
+the same desired behavior:
+
+@example
+ x = x + 0.0
+@end example
+
+@node Exact Arithmetic
+@section Exact Arithmetic with Floating-point Numbers
+
+@quotation CAUTION
+Never depend on the exactness of floating-point arithmetic,
+even for apparently simple expressions!
+@end quotation
+
+Can arbitrary precision arithmetic give exact results? There are
+no easy answers. The standard rules of algebra often do not apply
+when using floating-point arithmetic.
+Among other things, the distributive and associative laws
+do not hold completely, and order of operation may be important
+for your computation. Rounding error, cumulative precision loss,
+and underflow are often troublesome.
+
+When @command{gawk} tests the expressions 0.1 + 12.2 and 12.3 for equality
+using the machine double precision arithmetic, it decides that they
+are not equal
+(@pxref{Floating-point Programming})!
+You can get the result you want by increasing the precision;
+56 in this case will get the job done:
+
+@example
+$ @kbd{gawk -M -vPREC=56 'BEGIN @{ print (0.1 + 12.2 == 12.3) @}'}
+@print{} 1
+@end example
+
+Using an even larger value of @code{PREC}:
+
+@example
+$ @kbd{gawk -M -vPREC=201 'BEGIN @{ print (0.1 + 12.2 == 12.3) @}'}
+@print{} 0
+@end example
+
+This is not a bug in @command{gawk} or in the MPFR library.
+It is easy to forget that the finite number of bits used to store the value
+is often just an approximation after proper rounding.
+The test for equality succeeds if and only if all bits in the two operands
+are exactly the same. Since this is not necessarily true after floating-point
+computations with a particular precision and the effective rounding rule,
+a straight test for equality may not work.
+
+So don't assume that floating-point values can be compared for equality.
+You should also exercise caution when using other forms of comparisons.
+The standard way to compare floating-point numbers is to determine
+how much error (or tolerance) you will allow in a comparison and
+check to see if one value is within this error range of the other.
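+
+A minimal sketch of such a comparison follows. The function name
+@code{approx_equal} and the tolerance are illustrative choices only;
+the right tolerance depends on your application:
+
+@example
+# Return 1 if a and b differ by no more than tol, 0 otherwise.
+function approx_equal(a, b, tol,    diff)
+@{
+    diff = a - b
+    if (diff < 0)
+        diff = -diff
+    return (diff <= tol)
+@}
+
+BEGIN @{ print approx_equal(0.1 + 12.2, 12.3, 1e-9) @}
+@end example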
+
+In applications where 15 or fewer decimal places suffice,
+hardware double precision arithmetic can be adequate, and is usually much faster.
+But you do need to keep in mind that every floating-point operation
+can suffer a new rounding error with catastrophic consequences, as illustrated
+by our attempt to compute the value of the constant @samp{pi}
+(@pxref{Floating-point Programming}).
+Extra precision can greatly enhance the stability and the accuracy
+of your computation in such cases.
+
+Repeated addition is not necessarily equivalent to multiplication
+in floating-point arithmetic. In the last example
+(@pxref{Floating-point Programming}),
+you may or may not succeed in getting the correct result by choosing
+an arbitrarily large value for @code{PREC}. Reformulation of
+the problem at hand is often the correct approach in such situations.
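+
+For the loop in that example, one possible reformulation, shown here
+as a sketch, is to count with integers and derive the floating-point
+value from the counter:
+
+@example
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{for (k = 11; k <= 15; k++) @{}
+> @kbd{d = k / 10}
+> @kbd{i++}
+> @kbd{@}}
+> @kbd{print i}
+> @kbd{@}'}
+@print{} 5
+@end example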
+
+
+@node Integer Programming
+@section Effective Integer Programming
+
+As has been mentioned already, @command{gawk} ordinarily uses hardware double
+precision with 64-bit IEEE binary floating-point representation
+for numbers on most systems. A large integer like 9007199254740997
+has a binary representation that, although finite, is more than 53 bits long;
+it must also be rounded to 53 bits.
+The biggest integer that can be stored in a double is usually the same
+as the largest possible value of a double. If your system double is
+an IEEE 64-bit double, this largest value is an integer and can be represented precisely.
+What more should one know about integers?
+
+If you want to know the largest integer such that it and
+all smaller integers can be stored in 64-bit doubles without losing precision,
+then the answer is 2^53. The next representable number is the even number
+2^53 + 2, meaning it is unlikely that you will be able to make
+@command{gawk} print 2^53 + 1 in integer format.
+The range of integers exactly representable by a 64-bit double
+is [-2^53, 2^53]. If you ever see an integer outside this range in @command{gawk}
+using 64-bit doubles, you have reason to be very suspicious about
+the accuracy of the output. Here is a simple program with erroneous output:
+
+@example
+$ @kbd{gawk 'BEGIN @{ i = 2^53 - 1; for (j = 0; j < 4; j++) print i + j @}'}
+@print{} 9007199254740991
+@print{} 9007199254740992
+@print{} 9007199254740992
+@print{} 9007199254740994
+@end example
+
+The lesson is not to assume that a large integer printed by @command{gawk}
+is an exact result of your computation, especially if it wraps around on
+your terminal screen.
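+
+A program can guard against this with a simple range check. The
+following is a sketch for hardware doubles only; the function name
+@code{is_safe_integer} is an illustrative choice:
+
+@example
+# Return 1 if n is an integer strictly inside (-2^53, 2^53),
+# where every integer has an exact double representation.
+function is_safe_integer(n)
+@{
+    return (n == int(n) && n > -2^53 && n < 2^53)
+@}
+@end example
+
+@noindent
+The bounds are strict because a result of exactly 2^53 may itself
+have been rounded from 2^53 + 1.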
+
+@node Arbitrary Precision Integers
+@section Arbitrary Precision Integer Arithmetic with @command{gawk}
+@cindex integer, arbitrary precision
+
+If the option @option{--bcmath} or @option{-M} is specified, @command{gawk} will perform all
+integer arithmetic using GMP arbitrary precision integers.
+Any number that looks like an integer in a program source or data file
+will be stored as an arbitrary precision integer.
+The size of the integer is limited only by your computer's memory.
+The current floating-point context has no effect on operations involving integers.
+For example, the following computes 5^4^3^2, the result of which is beyond the
+limits of ordinary @command{gawk} numbers:
+
+@example
+$ @kbd{gawk -M 'BEGIN @{}
+> @kbd{x = 5^4^3^2}
+> @kbd{print "# of digits =", length(x)}
+> @kbd{print substr(x, 1, 20), "...", substr(x, length(x) - 19, 20)}
+> @kbd{@}'}
+@print{} # of digits = 183231
+@print{} 62060698786608744707 ... 92256259918212890625
+@end example
+
+If you were to compute the same using arbitrary precision floats instead,
+the precision needed for correct output,
+using the formula @code{prec = 3.322 * dps},
+would be 3.322 * 183231 or 608693.
+
+The result from an arithmetic operation with an integer and a float
+is a float with a precision equal to the working precision.
+The following program calculates the eighth term in
+Sylvester's sequence using a recurrence:
+
+@example
+$ @kbd{gawk -M 'BEGIN @{}
+> @kbd{s = 2.0}
+> @kbd{for (i = 1; i <= 7; i++)}
+> @kbd{s = s * (s - 1) + 1}
+> @kbd{print s@}'}
+@print{} 113423713055421845118910464
+@end example
+
+The output differs from the actual number 113423713055421844361000443
+because the default precision 53 is not enough to represent the
+floating-point results exactly. You can either increase
+the precision (100 in this case is enough), or replace the float 2.0 with
+an integer to perform all computations using integer arithmetic to
+get the correct output.
+
+It will sometimes be necessary for @command{gawk} to implicitly convert an
+arbitrary precision integer into an arbitrary precision float.
+This is primarily because the MPFR library does not always provide the
+relevant interface to process arbitrary precision integers or mixed-mode
+numbers as needed by an operation or function.
+In such a case, the precision is set to the minimum value necessary
+for exact conversion, and the working precision is not used for this purpose.
+If this is not what you need or want, you can employ a subterfuge
+like this:
+
+@example
+$ @kbd{gawk -M 'BEGIN @{ n = 13; print (n + 0.0) % 2.0 @}'}
+@end example
+
+You can avoid this issue altogether by specifying the number as a float
+to begin with:
+
+@example
+$ @kbd{gawk -M 'BEGIN @{ n = 13.0; print n % 2.0 @}'}
+@end example
+
+Note that for the particular example above, there is unlikely to be a
+reason for not simply using the following:
+
+@example
+$ @kbd{gawk -M 'BEGIN @{ n = 13; print n % 2 @}'}
+@end example
+
+
+@node MPFR and GMP Libraries
+@section Information About the MPFR and GMP Libraries
+@cindex @code{PROCINFO} array
+
+The following elements of the @code{PROCINFO} array (@pxref{Built-in Variables})
+are available to provide information about the MPFR and GMP libraries:
+
+@table @code
+@item PROCINFO["mpfr_version"]
+The version of the GNU MPFR library.
+
+@item PROCINFO["gmp_version"]
+The version of the GNU MP library.
+
+@item PROCINFO["prec_max"]
+The maximum precision supported by MPFR.
+
+@item PROCINFO["prec_min"]
+The minimum precision required by MPFR.
+@end table
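+
+These elements also give a program a way to check for arbitrary
+precision support at run time. The following sketch assumes that
+the elements are absent from @code{PROCINFO} when arbitrary precision
+arithmetic is not available:
+
+@example
+BEGIN @{
+    if ("mpfr_version" in PROCINFO) @{
+        print "MPFR version:", PROCINFO["mpfr_version"]
+        print "GMP version:", PROCINFO["gmp_version"]
+        print "precision range:", PROCINFO["prec_min"], "to", PROCINFO["prec_max"]
+    @} else
+        print "no arbitrary precision support"
+@}
+@end example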
+
+
@node Advanced Features
@chapter Advanced Features of @command{gawk}
@cindex advanced features, network connections, See Also networks, connections