The Multiprecision Library provides integer, rational and floating-point types in C++ that have more range and precision than C++'s ordinary built-in types. The big number types in Multiprecision can be used with a wide selection of basic mathematical operations, elementary transcendental functions as well as the functions in Boost.Math. The Multiprecision types can also interoperate with the built-in types in C++ using clearly defined conversion rules. This allows Boost.Multiprecision to be used for all kinds of mathematical calculations involving integer, rational and floating-point types requiring extended range and precision.
Multiprecision consists of a generic interface to the mathematics of large numbers as well as a selection of big number back-ends, with support for integer, rational and floating-point types. Boost.Multiprecision provides a selection of back-ends off-the-rack, including interfaces to GMP, MPFR, MPIR and TomMath, as well as its own collection of Boost-licensed, header-only back-ends for integers, rationals and floats. In addition, user-defined back-ends can be created and used with the Multiprecision interface, provided the class implementation adheres to the necessary concepts.
Depending upon the number type, precision may be arbitrarily large (limited only by available memory), fixed at compile time (for example 50 or 100 decimal digits), or a variable controlled at run-time by member functions. The types are expression-template-enabled for better performance than naive user-defined types.
The Multiprecision library comes in two distinct parts:

* An expression-template-enabled front-end, number, that handles all the operator overloading, expression evaluation optimization, and code reduction.
* A selection of back-ends that implement the actual arithmetic operations, and need conform only to the reduced interface requirements of the front-end.
Separation of front-end and back-end allows the use of highly refined but restrictively licensed libraries where possible, while providing Boost-licensed alternatives for users who need a portable, unconstrained license. Which is to say: some back-ends rely on 3rd party libraries, but a header-only Boost-licensed version is always available (if somewhat slower).
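Because the front-end is generic, code written against number is largely back-end-agnostic. As a minimal sketch (assuming only the header-only cpp_int back-end; the factorial function and its name are illustrative, not part of the library), the same function template works unchanged with any of the integer types:

#include <boost/multiprecision/cpp_int.hpp>

// Illustrative helper: works with cpp_int, mpz_int, int512_t, etc.
template <class Integer>
Integer factorial(unsigned n)
{
   Integer result = 1;
   for (unsigned i = 2; i <= n; ++i)
      result *= i;
   return result;
}

int main()
{
   boost::multiprecision::cpp_int f = factorial<boost::multiprecision::cpp_int>(100);
   // Switching to the GMP-backed mpz_int (from <boost/multiprecision/gmp.hpp>)
   // requires changing only the template argument, not factorial() itself.
}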
Should you just wish to cut to the chase and use a fully Boost-licensed number type, then skip to cpp_int for multiprecision integers, cpp_dec_float for multiprecision floating-point types and cpp_rational for rational types.
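For example, a minimal sketch using only those Boost-licensed, header-only types (the literal values are made up for illustration):

#include <boost/multiprecision/cpp_int.hpp>       // cpp_int and cpp_rational
#include <boost/multiprecision/cpp_dec_float.hpp> // cpp_dec_float

namespace mp = boost::multiprecision;

mp::cpp_int i("12345678901234567890123456789");   // arbitrary precision integer
mp::cpp_rational r = mp::cpp_rational(1) / 3;     // exact rational one-third
mp::cpp_dec_float_50 f("2.718281828459045235360287471352662497757"); // 50 decimal digits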
The library is often used via one of the predefined typedefs: for example if you wanted an arbitrary precision integer type using GMP as the underlying implementation then you could use:
#include <boost/multiprecision/gmp.hpp>  // Defines the wrappers around the GMP library's types

boost::multiprecision::mpz_int myint;    // Arbitrary precision integer type.
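Continuing that snippet (a small sketch, additionally including <iostream>), the wrapper behaves just like a built-in integer, except that it never overflows:

myint = 1;
for (unsigned i = 1; i <= 100; ++i)
   myint *= i;                   // compute 100!, a 158 digit number
std::cout << myint << std::endl; // streams just like a built-in type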
Alternatively, you can compose your own multiprecision type by combining number with one of the predefined back-end types. For example, suppose you wanted a 300 decimal digit floating-point type based on the MPFR library. In this case, there's no predefined typedef with that level of precision, so instead we compose our own:
#include <boost/multiprecision/mpfr.hpp>  // Defines the Backend type that wraps MPFR

namespace mp = boost::multiprecision;     // Reduce the typing a bit later...

typedef mp::number<mp::mpfr_float_backend<300> > my_float;

my_float a, b, c; // These variables have 300 decimal digits precision
We can repeat the above example, but with the expression templates disabled (for faster compile times, but slower runtimes) by passing a second template argument to number:
#include <boost/multiprecision/mpfr.hpp>  // Defines the Backend type that wraps MPFR

namespace mp = boost::multiprecision;     // Reduce the typing a bit later...

typedef mp::number<mp::mpfr_float_backend<300>, mp::et_off> my_float;

my_float a, b, c; // These variables have 300 decimal digits precision
We can also mix arithmetic operations between different types, provided there is an unambiguous implicit conversion from one type to the other:
#include <boost/multiprecision/cpp_int.hpp>

namespace mp = boost::multiprecision;  // Reduce the typing a bit later...

mp::int128_t a(3), b(4);
mp::int512_t c(50), d;

d = c * a;   // OK, result of mixed arithmetic is an int512_t
Conversions are also allowed:
d = a;       // OK, widening conversion.
d = a * b;   // OK, can convert from an expression template too.
However, conversions that are inherently lossy are either declared explicit or else forbidden altogether:

d = 3.14;                            // Error: implicit conversion from a floating-point value not allowed.
d = static_cast<mp::int512_t>(3.14); // OK: explicit construction is allowed
Mixed arithmetic will fail if the conversion is either ambiguous or explicit:
number<cpp_int_backend<>, et_off> a(2);
number<cpp_int_backend<>, et_on>  b(3);

b = a * b;    // Error, implicit conversion could go either way.
b = a * 3.14; // Error, no operator overload if the conversion would be explicit.
On compilers that support rvalue-references, class number is move-enabled if the underlying backend is.
In addition, the non-expression-template operator overloads (see below) are move-aware and have overloads that look something like:
template <class B>
number<B, et_off> operator + (number<B, et_off>&& a, const number<B, et_off>& b)
{
   return std::move(a += b);
}
These operator overloads ensure that many expressions can be evaluated without actually generating any temporaries. However, there are still many simple expressions such as:
a = b * c;
which don't noticeably benefit from move support. Therefore, optimal performance comes from having both move-support and expression templates enabled.
Note that while "moved-from" objects are left in a sane state, they have an unspecified value, and the only permitted operations on them are destruction or the assignment of a new value. Any other operation should be considered a programming error and all of our backends will trigger an assertion if any other operation is attempted. This behavior allows for optimal performance on move-construction (i.e. no allocation required, we just take ownership of the existing object's internal state), while maintaining usability in the standard library containers.
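For example, here is a minimal sketch (using the header-only cpp_int type) of moving a large value into a container and then safely reusing the moved-from variable:

#include <boost/multiprecision/cpp_int.hpp>
#include <vector>
#include <utility>

namespace mp = boost::multiprecision;

int main()
{
   std::vector<mp::cpp_int> v;
   mp::cpp_int big(1);
   big <<= 10000;                // a value with dynamically allocated storage
   v.push_back(std::move(big));  // no allocation: the element takes ownership
   big = 42;                     // OK: assignment of a new value is permitted
   // Any other use of 'big' before this reassignment would be an error.
}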
Class number is expression-template-enabled: that means that rather than having a multiplication operator that looks like this:
template <class Backend>
number<Backend> operator * (const number<Backend>& a, const number<Backend>& b)
{
   number<Backend> result(a);
   result *= b;
   return result;
}
Instead, the operator looks more like this:
template <class Backend>
unmentionable-type operator * (const number<Backend>& a, const number<Backend>& b);
Where the "unmentionable" return type is an implementation detail that, rather than containing the result of the multiplication, contains instructions on how to compute the result. In effect it's just a pair of references to the arguments of the function, plus some compile-time information that stores what the operation is.
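A purely illustrative sketch of the idea (this is not Boost.Multiprecision's actual implementation detail, and the names are made up):

// An expression node simply captures references to its operands plus a
// compile-time tag identifying the operation; no arithmetic happens here.
template <class Arg1, class Arg2>
struct multiply_expression
{
   const Arg1& arg1;  // left operand of the deferred multiplication
   const Arg2& arg2;  // right operand of the deferred multiplication
   // Assigning this node to a number later triggers something like:
   //    eval_multiply(result, arg1, arg2);
};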
The great advantage of this method is the elimination of temporaries: for example, the "naive" implementation of operator* above requires one temporary for computing the result, and at least another one to return it. It's true that sometimes this overhead can be reduced by using move-semantics, but it can't be eliminated completely. For example, let's suppose we're evaluating a polynomial via Horner's method, something like this:
T a[7] = { /* some values */ };
//....
y = (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) * x + a[1]) * x + a[0];
If type T is a number, then this expression is evaluated without creating a single temporary value. In contrast, if we were using the mpfr_class C++ wrapper for MPFR, then this expression would result in no fewer than 11 temporaries (this is true even though mpfr_class does use expression templates to reduce the number of temporaries somewhat). Had we used an even simpler wrapper around MPFR like mpreal, things would have been even worse and no fewer than 24 temporaries are created for this simple expression (note: we actually measure the number of memory allocations performed rather than the number of temporaries directly; note also that the mpf_class wrapper that will be supplied with GMP-5.1 reduces the number of temporaries to pretty much zero). Note that if we compile with expression templates disabled and rvalue-reference support on, then we actually still have no wasted memory allocations, as even though temporaries are created, their contents are moved rather than copied. [1]
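For reference, here is a minimal, self-contained version of the Horner evaluation above, using the header-only cpp_dec_float back-end instead of MPFR, with made-up coefficient values:

#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iostream>

namespace mp = boost::multiprecision;
typedef mp::cpp_dec_float_50 T;  // any number type would work here

int main()
{
   T a[7] = { 1, 2, 3, 4, 5, 6, 7 };  // made-up coefficients
   T x = 0.5;
   T y = (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) * x + a[1]) * x + a[0];
   std::cout << y << std::endl;       // evaluated without a single temporary
}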
Important: Expression templates can radically reorder the operations in an expression. For example:

a = (b * c) * a;

will get transformed into:

a *= c;
a *= b;

If this is likely to be an issue for a particular application, then they should be disabled.
This library also extends expression template support to standard library functions like abs or sin with number arguments. This means that an expression such as:
y = abs(x);
can be evaluated without a single temporary being calculated. Even expressions like:
y = sin(x);
get this treatment, so that variable 'y' is used as "working storage" within the implementation of sin, thus reducing the number of temporaries used by one. Of course, should you write:
x = sin(x);
then we clearly can't use x as working storage during the calculation, so a temporary variable is created in this case.
Given the comments above, you might be forgiven for thinking that expression-templates are some kind of universal panacea: sadly though, all tricks like this have their downsides. For one thing, expression template libraries like this one tend to be slower to compile than their simpler cousins, they're also harder to debug (should you actually want to step through our code!), and they rely on compiler optimizations being turned on to give really good performance. Also, since the return type from expressions involving numbers is an "unmentionable implementation detail", you have to be careful to cast the result of an expression to the actual number type when passing an expression to a template function. For example, given:
template <class T>
void my_proc(const T&);
Then calling:
my_proc(a+b);
will very likely result in obscure error messages inside the body of my_proc, since we've passed it an expression template type, and not a number type. Instead we probably need:
my_proc(my_number_type(a+b));
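Putting the pieces together, a small self-contained sketch (my_proc and its empty body are made up for illustration):

#include <boost/multiprecision/cpp_int.hpp>

namespace mp = boost::multiprecision;

template <class T>
void my_proc(const T&) { /* expects a genuine number type */ }

int main()
{
   mp::cpp_int a(2), b(3);
   // my_proc(a + b);           // T would deduce to the expression template type
   my_proc(mp::cpp_int(a + b)); // OK: T deduces to cpp_int
}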
Having said that, these situations don't occur that often - or indeed not at all for non-template functions. In addition, all the functions in the Boost.Math library will automatically convert expression-template arguments to the underlying number type without you having to do anything, so:
mpfr_float_100 a(20), delta(0.125);

boost::math::gamma_p(a, a + delta);
will work just fine, with the a + delta expression template argument getting converted to an mpfr_float_100 internally by the Boost.Math library.
One other potential pitfall that's only possible in C++11: you should never store an expression template using:
auto my_expression = a + b - c;
unless you're absolutely sure that the lifetimes of a, b and c will outlive that of my_expression.
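A minimal sketch of the safe alternative (using the header-only cpp_int type): name the number type explicitly so the expression is evaluated immediately rather than stored as references:

#include <boost/multiprecision/cpp_int.hpp>

namespace mp = boost::multiprecision;

mp::cpp_int risky_vs_safe(const mp::cpp_int& a, const mp::cpp_int& b, const mp::cpp_int& c)
{
   // auto expr = a + b - c;       // dangerous: expr holds references to a, b and c
   mp::cpp_int result = a + b - c; // safe: forces immediate evaluation
   return result;
}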
And finally... the performance improvements from an expression template library like this are often not as dramatic as the reduction in the number of temporaries would suggest. For example, if we compare this library with mpfr_class and mpreal, with all three using the underlying MPFR library at 50 decimal digits precision, then we see the following typical results for polynomial execution:
Table 1.1. Evaluation of Order 6 Polynomial.

Library | Relative Time | Relative number of memory allocations
---|---|---
number | 1.0 (0.00957s) | 1.0 (2996 total)
mpfr_class | 1.1 (0.0102s) | 4.3 (12976 total)
mpreal | 1.6 (0.0151s) | 9.3 (27947 total)
As you can see, the execution time increases a lot more slowly than the number of memory allocations. There are a number of reasons for this:

* The cost of an in-place multiplication (using operator*=) tends to be more than that of an out-of-place operator* (typically operator*= has to create a temporary workspace to carry out the multiplication, whereas operator* can use the target variable as workspace; see the sketch below). Since the expression templates carry out their magic by converting out-of-place operators to in-place ones, we necessarily take this hit. Even so, the transformation is more efficient than creating the extra temporary variable, just not by as much as one would hope.
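To make the in-place/out-of-place distinction concrete, a minimal sketch (using cpp_int; the function names are made up):

#include <boost/multiprecision/cpp_int.hpp>

namespace mp = boost::multiprecision;

// In-place style: the product can't be written into 'a' while 'a' is still
// being read, so a scratch temporary is needed, then swapped in.
void inplace_style(mp::cpp_int& a, const mp::cpp_int& b)
{
   mp::cpp_int workspace = a * b;
   a.swap(workspace);
}

// Out-of-place style: a distinct target can serve as the workspace directly.
void outofplace_style(mp::cpp_int& result, const mp::cpp_int& b, const mp::cpp_int& c)
{
   result = b * c;  // evaluates straight into 'result', no extra temporary
}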
Finally, note that number takes a second template argument which, when set to et_off, disables all the expression template machinery. The result is much faster to compile, but slower at runtime.
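For example, a brief sketch with the header-only cpp_dec_float back-end:

#include <boost/multiprecision/cpp_dec_float.hpp>

namespace mp = boost::multiprecision;

// Expression templates on (the default) and off, for the same back-end:
typedef mp::number<mp::cpp_dec_float<50>, mp::et_on>  et_float;     // best runtime
typedef mp::number<mp::cpp_dec_float<50>, mp::et_off> simple_float; // faster builds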
We'll conclude this section by providing some more performance comparisons between these three libraries, again, all are using MPFR to carry out the underlying arithmetic, and all are operating at the same precision (50 decimal digits):
Table 1.2. Evaluation of Boost.Math's Bessel function test data

Library | Relative Time | Relative Number of Memory Allocations
---|---|---
mpfr_float_50 | 1.0 (5.78s) | 1.0 (1611963)
number<mpfr_float_backend<50>, et_off> | 1.1 (6.29s) | 2.64 (4260868)
mpfr_class | 1.1 (6.28s) | 2.45 (3948316)
mpreal | 1.65 (9.54s) | 8.21 (13226029)
Table 1.3. Evaluation of Boost.Math's Non-Central T distribution test data

Library | Relative Time | Relative Number of Memory Allocations
---|---|---
number | 1.0 (263s) | 1.0 (127710873)
number<mpfr_float_backend<50>, et_off> | 1.0 (260s) | 1.2 (156797871)
mpfr_class | 1.1 (287s) | 2.1 (268336640)
mpreal | 1.5 (389s) | 3.6 (466960653)
The above results were generated on Win32 compiling with Visual C++ 2010, all optimizations on (/Ox), with MPFR 3.0 and MPIR 2.3.0.
[1] The actual number generated will depend on the compiler, how well it optimises the code, and whether it supports rvalue references. The number of 11 temporaries was generated with Visual C++ 2010.