blob: 899b1a3db1b493d3cbc666c7ecc0ef03279d766a [file] [log] [blame]
# User Documentation for the IMath Library
Author: [M. J. Fromberger](https://github.com/creachadair)
## Installation
1. Edit Makefile to select compiler and options. The default is to use gcc.
You may want to change CC to `clang` instead of `gcc` (and on macOS that
what you will get anyway), but you should be able to use the default GCC
settings for either.
By default, the Makefile assumes you can use 64-bit integer types, even
though they were not standard in ANSI C90. If you cannot, add
`-DUSE_32BIT_WORDS` to the compiler options.
2. Type `make` or `make test` to build the test driver and run the unit tests.
None of these should fail. If they do, see below for how you can report
bugs.
To build with debugging enabled (and optimization disabled), run `make
DEBUG=Y`. This sets the preprocessor macro `DEBUG` to 1, and several other
things (see Makefile for details).
To use the library in your code, include "imath.h" wherever you intend to use
the library's routines. The integer library is just a single source file, so
you can compile it into your project in whatever way makes sense. If you wish
to use rational arithmetic, you will also need to include "imrat.h".
## Background
The basic types defined by the imath library are `mpz_t`, an arbitrary
precision signed integer, and `mpq_t`, an arbitrary precision signed rational
number. The type `mp_int` is a pointer to an `mpz_t`, and `mp_rat` is a
pointer to an `mpq_t`.
Most of the functions in the imath library return a value of type `mp_result`.
This is a signed integer type which can be used to convey status information
and also return small values. Any negative value is considered to be a status
message. The following constants are defined for processing these:
| Status | Description |
| ----------- | -------------------------------------------- |
| `MP_OK` | operation successful, all is well (= 0) |
| `MP_FALSE` | boolean false (= `MP_OK`) |
| `MP_TRUE` | boolean true |
| `MP_MEMORY` | out of memory |
| `MP_RANGE` | parameter out of range |
| `MP_UNDEF` | result is undefined (e.g., division by zero) |
| `MP_TRUNC` | output value was truncated |
| `MP_BADARG` | an invalid parameter was passed |
If you obtain a zero or negative value of an `mp_result`, you can use the
`mp_error_string()` routine to obtain a pointer to a brief human-readable
string describing the error. These strings are statically allocated, so they
need not be freed by the caller; the same strings are re-used from call to
call.
Unless otherwise noted, it is legal to use the same parameter for both inputs
and output with most of the functions in this library. For example, you can
add a number to itself and replace the original by writing:
mp_int_add(a, a, a); /* a = a + a */
Any cases in which this is not legal will be noted in the function summaries
below (if you discover that this is not so, please report it as a bug; I will
fix either the function or the documentation :)
## The IMath API
Each of the API functions is documented here. The general format of the
entries is:
> ------------
> <pre>
> return_type function_name(parameters ...)
> </pre>
> - English description.
Unless otherwise noted, any API function that returns `mp_result` may be
expected to return `MP_OK`, `MP_BADARG`, or `MP_MEMORY`. Other return values
should be documented in the description. Please let me know if you discover
this is not the case.
The following macros are defined in "imath.h", to define the sizes of the
various data types used in the library:
| Constant | Description
| --------------- | ----------------------------------------
| `MP_DIGIT_BIT` | the number of bits in a single `mpz_t` digit.
| `MP_WORD_BIT` | the number of bits in a `mpz_t` word.
| `MP_SMALL_MIN` | the minimum value representable by an `mp_small`.
| `MP_SMALL_MAX` | the maximum value representable by an `mp_small`.
| `MP_USMALL_MAX` | the maximum value representable by an `mp_usmall`.
| `MP_MIN_RADIX` | the minimum radix accepted for base conversion.
| `MP_MAX_RADIX` | the maximum radix accepted for base conversion.
#### Initialization
An `mp_int` must be initialized before use. By default, an `mp_int` is
initialized with a certain minimum amount of storage for digits, and the
storage is expanded automatically as needed. To initialize an `mp_int`, use
the following functions:
{{insert "imath.h"
mp_int_init mp_int_alloc mp_int_init_size
mp_int_init_copy
mp_int_init_value
}}
#### Cleanup
When you are finished with an `mp_int`, you must free the memory it uses:
{{insert "imath.h" mp_int_clear mp_int_free}}
#### Setting Values
To set an `mp_int` which has already been initialized to a small integer value,
use:
{{insert "imath.h" mp_int_set_value mp_int_set_uvalue}}
To copy one initialized `mp_int` to another, use:
{{insert "imath.h" mp_int_copy}}
### Arithmetic Functions
{{insert "imath.h"
mp_int_is_odd mp_int_is_even
mp_int_zero
mp_int_abs
mp_int_neg
mp_int_add mp_int_add_value
mp_int_sub mp_int_sub_value
mp_int_mul mp_int_mul_value mp_int_mul_pow2
mp_int_sqr
mp_int_root mp_int_sqrt
mp_int_div mp_int_div_value mp_int_div_pow2
mp_int_mod mp_int_mod_value
mp_int_expt mp_int_expt_value mp_int_expt_full
}}
### Comparison Functions
Unless otherwise specified, comparison between values `x` and `y` returns a
**comparator**, an integer value < 0 if `x` is less than `y`, 0 if `x` is equal
to `y`, and > 0 if `x` is greater than `y`.
{{insert "imath.h"
mp_int_compare mp_int_compare_unsigned mp_int_compare_zero
mp_int_compare_value mp_int_compare_uvalue
mp_int_divisible_value mp_int_is_pow2
}}
### Modular Operations
{{insert "imath.h"
mp_int_exptmod mp_int_exptmod_evalue mp_int_exptmod_bvalue
mp_int_exptmod_known mp_int_redux_const
mp_int_invmod
mp_int_gcd mp_int_egcd mp_int_lcm
}}
### Conversion of Values
{{insert "imath.h"
mp_int_to_int mp_int_to_uint
mp_int_to_string mp_int_string_len
mp_int_read_string mp_int_read_cstring
mp_int_count_bits
mp_int_to_binary mp_int_read_binary mp_int_binary_len
mp_int_to_unsigned mp_int_read_unsigned mp_int_unsigned_len
}}
### Other Functions
Ordinarily, integer multiplication and squaring are done using the simple
quadratic "schoolbook" algorithm. However, for sufficiently large values,
there is a more efficient algorithm usually attributed to Karatsuba and Ofman
that is usually faster. See Knuth Vol. 2 for more details about how this
algorithm works.
The breakpoint between the "normal" and the recursive algorithm is controlled
by a static digit threshold defined in `imath.c`. Values with fewer significant
digits use the standard algorithm. This value can be modified by calling
`mp_int_multiply_threshold(n)`. The `imtimer` program and the
`findthreshold.py` script (Python) can help you find a suitable value for for
your particular platform.
{{insert "imath.h" mp_error_string}}
## Rational Arithmetic
{{insert "imrat.h"}}
## Representation Details
> NOTE: You do not need to read this section to use IMath. This is provided
> for the benefit of developers wishing to extend or modify the internals of
> the library.
IMath uses a signed magnitude representation for arbitrary precision integers.
The magnitude is represented as an array of radix-R digits in increasing order
of significance; the value of R is chosen to be half the size of the largest
available unsigned integer type, so typically 16 or 32 bits. Digits are
represented as mp_digit, which must be an unsigned integral type.
Digit arrays are allocated using `malloc(3)` and `realloc(3)`. Because this
can be an expensive operation, the library takes pains to avoid allocation as
much as possible. For this reason, the `mpz_t` structure distinguishes between
how many digits are allocated and how many digits are actually consumed by the
representation. The fields of an `mpz_t` are:
mp_digit single; /* single-digit value (see note) */
mp_digit *digits; /* array of digits */
mp_size alloc; /* how many digits are allocated */
mp_size used; /* how many digits are in use */
mp_sign sign; /* the sign of the value */
The elements of `digits` at indices less than `used` are the significant
figures of the value; the elements at indices greater than or equal to `used`
are undefined (and may contain garbage). At all times, `used` must be at least
1 and at most `alloc`.
To avoid interaction with the memory allocator, single-digit values are stored
directly in the `mpz_t` structure, in the `single` field. The semantics of
access are the same as the more general case.
The number of digits allocated for an `mpz_t` is referred to in the library
documentation as its "precision". Operations that affect an `mpz_t` cause
precision to increase as needed. In any case, all allocations are measured in
digits, and rounded up to the nearest `mp_word` boundary. There is a default
minimum precision stored as a static constant default_precision (`imath.c`).
This value can be set using `mp_int_default_precision(n)`.
Note that the allocated size of an `mpz_t` can only grow; the library never
reallocates in order to decrease the size. A simple way to do so explicitly is
to use `mp_int_init_copy()`, as in:
```
mpz_t big, new;
/* ... */
mp_int_init_copy(&new, &big);
mp_int_swap(&new, &big);
mp_int_clear(&new);
```
The value of `sign` is 0 for positive values and zero, 1 for negative values.
Constants `MP_ZPOS` and `MP_NEG` are defined for these; no other sign values
are used.
If you are adding to this library, you should be careful to preserve the
convention that inputs and outputs can overlap, as described above. So, for
example, `mp_int_add(a, a, a)` is legal. Often, this means you must maintain
one or more temporary mpz_t structures for intermediate values. The private
macros `DECLARE_TEMP(N)`, `CLEANUP_TEMP()`, and `TEMP(K)` can be used to
maintain a conventional structure like this:
```c
{
/* Declare how many temp values you need.
Use TEMP(i) to access the ith value (0-indexed). */
DECLARE_TEMP(8);
...
/* Perform actions that must return MP_OK or fail. */
REQUIRE(mp_int_copy(x, TEMP(1)));
...
REQUIRE(mp_int_expt(TEMP(1), TEMP(2), TEMP(3)));
...
/* You can also use REQUIRE directly for more complex cases. */
if (some_difficult_question(TEMP(3)) != answer(x)) {
REQUIRE(MP_RANGE); /* falls through to cleanup (below) */
}
/* Ensure temporary values are cleaned up at exit.
If control reaches here via a REQUIRE failure, the code below
the cleanup will not be executed.
*/
CLEANUP_TEMP();
return MP_OK;
}
```
Under the covers, these macros are just maintaining an array of `mpz_t` values,
and a jump label to handle cleanup. You may only have one `DECLARE_TEMP` and
its corresponding `CLEANUP_TEMP` per function body.
"Small" integer values are represented by the types `mp_small` and `mp_usmall`,
which are mapped to appropriately-sized types on the host system. The default
for `mp_small` is "long" and the default for `mp_usmall` is "unsigned long".
You may change these, provided you insure that `mp_small` is signed and
`mp_usmall` is unsigned. You will also need to adjust the size macros:
MP_SMALL_MIN, MP_SMALL_MAX
MP_USMALL_MIN, MP_USMALL_MAX
... which are defined in `<imath.h>`, if you change these.
Rational numbers are represented using a pair of arbitrary precision integers,
with the convention that the sign of the numerator is the sign of the rational
value, and that the result of any rational operation is always represented in
lowest terms. The canonical representation for rational zero is 0/1. See
"imrat.h".
## Testing and Reporting of Bugs
Test vectors are included in the `tests/` subdirectory of the imath
distribution. When you run `make test`, it builds the `imtest` program and
runs all available test vectors. If any tests fail, you will get a line like
this:
x y FAILED v
Here, _x_ is the line number of the test which failed, _y_ is index of the test
within the file, and _v_ is the value(s) actually computed. The name of the
file is printed at the beginning of each test, so you can find out what test
vector failed by executing the following (with x, y, and v replaced by the
above values, and where "foo.t" is the name of the test file that was being
processed at the time):
% tail +x tests/foo.t | head -1
None of the tests should fail (but see [Note 2](#note2)); if any do, it
probably indicates a bug in the library (or at the very least, some assumption
I made which I shouldn't have). Please [file an
issue](https://github.com/creachadair/imath/issues/new), including the `FAILED`
test line(s), as well as the output of the above `tail` command (so I know what
inputs caused the failure).
If you build with the preprocessor symbol `DEBUG` defined as a positive
integer, the digit allocators (`s_alloc`, `s_realloc`) fill all new buffers
with the value `0xdeadbeefabad1dea`, or as much of it as will fit in a digit,
so that you can more easily catch uninitialized reads in the debugger.
## Notes
1. <a name="note1"></a>You can generally use the same variables for both input
and output. One exception is that you may not use the same variable for
both the quotient and the remainder of `mp_int_div()`.
2. <a name="note2"></a>Many of the tests for this library were written under
the assumption that the `mp_small` type is 32 bits or more. If you compile
with a smaller type, you may see `MP_RANGE` errors in some of the tests that
otherwise pass (due to conversion failures). Also, the pi generator (pi.c)
will not work correctly if `mp_small` is too short, as its algorithm for arc
tangent is fairly simple-minded.
## Contacts
The IMath library was written by Michael J. Fromberger.
If you discover any bugs or testing failures, please [open an
issue](https://github.com/creachadair/imath/issues/new). Please be sure to
include a complete description of what went wrong, and if possible, a test
vector for `imtest` and/or a minimal test program that will demonstrate the bug
on your system. Please also let me know what hardware, operating system, and
compiler you're using.
## Acknowledgements
The algorithms used in this library came from Vol. 2 of Donald Knuth's "The Art
of Computer Programming" (Seminumerical Algorithms). Thanks to Nelson Bolyard,
Bryan Olson, Tom St. Denis, Tushar Udeshi, and Eric Silva for excellent
feedback on earlier versions of this code. Special thanks to Jonathan Shapiro
for some very helpful design advice, as well as feedback and some clever ideas
for improving performance in some common use cases.
## License and Disclaimers
IMath is Copyright 2002-2009 Michael J. Fromberger
You may use it subject to the following Licensing Terms:
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.