An Open-Source Unbounded Integer in C++

Bill Seymour
2022-12-18


Contents


Introduction

This paper describes the public API of an open-source unbounded integer distributed under the Boost Software License.  (This is not part of Boost.  The author just likes their open-source license.)

This paper contains links, usually to cppreference.com, to explain some terms to readers who aren’t familiar with features that were introduced in C++11 or later.
Text with a gray background provides additional information or rationale for the design.
Text with a yellow background points out issues that you might want to watch out for.

The bigint class is intended to provide a C++ equivalent to SQL’s NUMERIC and DECIMAL types (for values that happen to be integers) for use in a database access library that the author is developing; so it’s in that library’s dbacc namespace; and like that library, it deliberately doesn’t require any language or library features past C++11.
The class is written with only C++11 in mind because the principal target audience is engaged in good old “business data processing”, and large companies are often slow to approve new development tools (for good reason).

Bitwise operations are only partially supported, and the efficiency of the author’s implementation might be unaceptable for serious numerical work.  Pete Becker’s more interesting integer class might be coming Real Soon Now (see C++ Numerics Work In Progress §9.3).

As expected for integers, the class has no NaN or infinity values; and it will never yield a negative zero that’s visible outside the class.

Division and conversions from finite floating point values quietly truncate their results toward zero.

Exceptions:

In general, all functions provide at least the strong exception guarantee, but only the basic guarantee when bad_alloc is thrown.
The strong guarantee for bad_alloc slowed the author’s arithmetic operations quite a bit, so that didn’t seem worthwhile since failure to allocate memory is unlikely, and most programs that fail in that manner probably can’t continue anyway.
But note that the author’s implementation might not be suitable for some safety-critical applications absent the strong guarantee for bad_alloc.


The source files


Synopsis

namespace dbacc {

class bigint final
{
public:
  //
  // Special member functions and swap
  //
    bigint();

    bigint(const bigint&);
    bigint(bigint&&);

    bigint& operator=(const bigint&);
    bigint& operator=(bigint&&);

    ~bigint() noexcept;

    void swap(bigint&);

  //
  // Contruction from arithmetic types
  //
    template<typename FundamentalArithmeticType>
      bigint(FundamentalArithmeticType);

  //
  // Contruction from strings
  //
    explicit bigint(const char*, int radix = 0, const char** = nullptr);
    explicit bigint(const std::string&, int radix = 0, std::size_t* = nullptr);

  //
  // Assignment
  //
    template<typename FundamentalArithmeticType>
      bigint& operator=(FundamentalArithmeticType);

    bigint& assign(const char*, int radix = 0, const char** = nullptr);
    bigint& assign(const std::string&, int radix = 0, std::size_t* = nullptr);

  //
  // Conversion to other types
  //
    explicit operator bool() const noexcept;
    explicit operator char() const;
    explicit operator signed char() const;
    explicit operator unsigned char() const;
    explicit operator short() const;
    explicit operator unsigned short() const;
    explicit operator int() const;
    explicit operator unsigned int() const;
    explicit operator long() const;
    explicit operator unsigned long() const;
    explicit operator long long() const;
    explicit operator unsigned long long() const;
    explicit operator float() const;
    explicit operator double() const;
    explicit operator long double() const;

    std::string to_string(int radix = 10) const;

  //
  // Observers
  //
    bool is_neg()  const noexcept;
    bool is_pos()  const noexcept;
    bool is_odd()  const noexcept;
    bool is_even() const noexcept;
    bool is_zero() const noexcept;
    bool is_one()  const noexcept;
    bool is_one(bool) const noexcept;

    int signum() const noexcept;

    int compare(const bigint&) const noexcept;

  //
  // Mutators
  //
    void clear() noexcept;

    bigint& set_to_zero() noexcept;
    bigint& set_to_one(bool negative = false);

    bigint& negate() noexcept;
    bigint& abs() noexcept;

  //
  // Arithmetic operations
  //
    bigint& operator++();
    bigint& operator--();

    bigint  operator++(int);
    bigint  operator--(int);

    bigint& operator+=(const bigint&);
    bigint& operator-=(const bigint&);
    bigint& operator*=(const bigint&);
    bigint& operator/=(const bigint&);
    bigint& operator%=(const bigint&);

    struct div_t;
    div_t div(const bigint&) const;

  //
  // Bitwise operations
  //
    bigint& complement() noexcept;
    bigint& operator&=(const bigint&) noexcept;
    bigint& operator|=(const bigint&);
    bigint& operator^=(const bigint&);
    bigint& operator<<=(int);
    bigint& operator>>=(int);

  //
  // Size, etc.
  //
    bool empty() const noexcept;
    std::size_t size() const noexcept;
    std::size_t capacity() const noexcept;
    void shrink_to_fit();
};

//
// Miscellaneous functions
//
using std::swap;
void swap(bigint&, bigint&);
void swap(bigint::div_t&, bigint::div_t&);

bigint abs(const bigint&);

std::string to_string(const bigint&, int radix = 10);

bigint gcd(const bigint&, const bigint&);
bigint lcm(const bigint&, const bigint&);

//
// Overloaded operators
//
bool operator==(const bigint&, const bigint&) noexcept;
bool operator!=(const bigint&, const bigint&) noexcept;
bool operator< (const bigint&, const bigint&) noexcept;
bool operator> (const bigint&, const bigint&) noexcept;
bool operator<=(const bigint&, const bigint&) noexcept;
bool operator>=(const bigint&, const bigint&) noexcept;

bigint operator+(const bigint&);
bigint operator-(const bigint&);

bigint operator+(const bigint&, const bigint&);
bigint operator-(const bigint&, const bigint&);
bigint operator*(const bigint&, const bigint&);
bigint operator/(const bigint&, const bigint&);
bigint operator%(const bigint&, const bigint&);

bigint::div_t div(const bigint&, const bigint&);

bigint operator~(const bigint&);
bigint operator&(const bigint&, const bigint&);
bigint operator|(const bigint&, const bigint&);
bigint operator^(const bigint&, const bigint&);
bigint operator<<(const bigint&, int);
bigint operator>>(const bigint&, int);

//
// Formatted I/O
//
template<typename CharType, typename CharTraits>
  std::basic_ostream<CharType,CharTraits>&
    operator<<(std::basic_ostream<CharType,CharTraits>&, const bigint&);

template<typename CharType, typename CharTraits>
  std::basic_istream<CharType,CharTraits>&
    operator>>(std::basic_istream<CharType,CharTraits>&, bigint&);

} // namespace dbacc


Detailed descriptions


Member functions


Special member functions and swap

bigint();
The default constructor constructs a bigint with a value of zero.
~bigint() noexcept;
The destructor is non-trivial.
bigint(const bigint&);
bigint& operator=(const bigint&);

bigint(bigint&&);
bigint& operator=(bigint&&);

void swap(bigint&);
bigints are freely copyable, moveable, and swappable.


Construction from arithmetic types

template<typename FundamentalArithmeticType>
  bigint(FundamentalArithmeticType);
The class provides implicit conversion from all the fundamental arithmetic types.  The conversion from finite floating point values quietly truncates its argument toward zero.  Attempting to convert from a floating-point NaN or infinity will throw an invalid_argument exception.

There are actually four constructor templates that provide this functionality.  The details aren’t documented here because they would just obscure the high-level purpose of these constructors.  C++ coders who are familiar with type traits templates will likely find the details unsurprising (and probably tedious as well).


Construction from strings

explicit bigint(const char*        value, int radix = 0, const char** termptr = nullptr);
explicit bigint(const std::string& value, int radix = 0, std::size_t* termpos = nullptr);
The class provides explicit conversions from C-style strings and std::strings.  If value is an empty string, the constructed value will be zero.  The conversion from C-style strings will throw an invalid_argument exception if value is nullptr.

In general, value should contain only digits; but the first character may be a '+' or a '-'; and hexadecimal values may begin with "0x" or "0X".  If value begins with a sign, any "0x" or "0X" must immediately follow that.

Only octal, decimal and hexadecimal representations are supported.  Any character that’s not allowed will just quietly terminate the parse.
And note that whitespace and thousands separators are not allowed.

Any value is allowed for the radix argument; but if it’s not 8, 10, or 16, or it’s not supplied at all, the radix will be inferred from the string:

Hexadecimal digits may be upper or lower case, and the case may be mixed in a single string.

An optional third argument can be used to discover the character that terminated the parse:


Assignment

template<typename FundamentalArithmeticType>
  bigint& operator=(FundamentalArithmeticType);
Values of any fundamental arithmetic type may be assigned to a bigint.  If an operand is a finite floating point value, the value will be quietly truncated toward zero.  If an operand is a floating point NaN or infinity, the function will throw an invalid_argument exception.
bigint& assign(const char*        value, int radix = 0, const char** termptr = nullptr);
bigint& assign(const std::string& value, int radix = 0, std::size_t* termpos = nullptr);
There’s no operator=(some string) because implicit conversion from strings is not allowed.  Instead, there are assign member functions which also allow for additional arguments.  All the arguments have the same semantics as they do in the constructors that take string arguments.

These functions all return *this.


Conversion to other types

explicit operator bool() const noexcept;
The conversion to bool returns whether *this is non-zero.  It’s intended so support the if(my_bigint) idiom.
explicit operator char() const;
explicit operator signed char() const;
explicit operator unsigned char() const;
explicit operator short() const;
explicit operator unsigned short() const;
explicit operator int() const;
explicit operator unsigned int() const;
explicit operator long() const;
explicit operator unsigned long() const;
explicit operator long long() const;
explicit operator unsigned long long() const;
explicit operator float() const;
explicit operator double() const;
explicit operator long double() const;
Also provided are direct initialization of, and explicit casts to, all the rest of the usual fundamental arithmetic types.  If the value won’t fit in the requested type, including attempting to convert a negative value to an unsigned type, the function will throw an overflow_error.  The conversions to floating point types can quietly result in loss of precision.
We probably could have gotten away with just long long, unsigned long long and long double; but having all of them allows for a finer grained check for overflow; and it also allows including your C++ implementation’s type_info::name(), whatever that is, in the overflow_error’s message, which might be helpful during debugging.
If your C++ implementation’s long double lacks quiet NaNs and infinities, operator long double() might not throw an overflow_error, but instead return HUGE_VALL with errno set to ERANGE.  This could also happen with operator double() if double and long double have the same representation.
The conversions to floating point types require access to the floating point environment.  If your compiler doesn’t understand “#pragma STDC FENV_ACCESS ON” at block scope, overflow might not be detected.  (It seems to work OK with a Microsoft compiler (_MSC_VER == 1928) that doesn’t grok any STDC pragmas at all; but YMMV.)
std::string to_string(int radix = 10) const;
If to_string’s argument is not 8, 10 or 16, or is not supplied at all, it defaults to 10.  Hexadecimal digits will be upper case.  There could be a leading '-', but there will never be a leading '+' nor any indication of the radix.  If you need something fancier, you can write to a std::basic_ostringstream<> and get the whole nine yards (or a localized version of that 😊).


Observers

bool is_zero() const noexcept;
bool is_even() const noexcept;
bool is_odd()  const noexcept;
bool is_neg()  const noexcept;
bool is_pos()  const noexcept;
is_zero() returns whether *this is zero;
is_even() returns whether *this is an even number;
is_odd()  returns whether *this is an odd number;
is_neg()  returns whether *this is less than zero;
is_pos()  returns whether *this is greater than zero.
A better name for is_pos() might be is_pos_nonzero() since it returns false if the value is zero; but some users might prefer the less prolix name.
bool is_one() const noexcept;
bool is_one(bool negative) const noexcept;
is_one() with no argument returns whether *this is ±1;
is_one(false) returns whether *this is +1;
is_one(true) returns whether *this is −1.
int signum() const noexcept;
signum() returns +1 if *this is greater than zero, 0 if *this is equal to zero, or −1 if *this is less than zero.
int compare(const bigint& rhs) const noexcept;
compare(rhs) returns a value greater than zero if *this > rhs, zero if *this == rhs, or a value less than zero if *this < rhs.


Mutators

void clear() noexcept;
bigint& set_to_zero() noexcept;
clear() and set_to_zero() both assign zero to *this.
Some users might prefer clear() because it’s a common member of classes that allocate memory on the free store (a.k.a., “heap”); some users might prefer the more expressive name or a function that returns *this.
bigint& set_to_one(bool negative = false);
set_to_one(false) assigns +1 to *this;
set_to_one(true) assigns −1 to *this.
bigint& negate() noexcept;
negate() changes the sign of *this if *this is non-zero.  It will never create a negative zero.
bigint& abs() noexcept;
abs() unconditionally makes *this non-negative.

All but clear() return *this.


Arithmetic operations

bigint& operator++();
bigint& operator--();

bigint  operator++(int);
bigint  operator--(int);

bigint& operator+=(const bigint&);
bigint& operator-=(const bigint&);
bigint& operator*=(const bigint&);
bigint& operator/=(const bigint&);
bigint& operator%=(const bigint&);
Overloads of all the usual arithmetic operators that apply to integers are provided.  Division truncates the quotient toward zero.  Division by zero throws a domain_error exception.
Overloads of the operators that don’t also do assignment are provided as nonmember functions shown below.
struct div_t final
{
    bigint quot, rem;
    void swap(div_t&);
};
div_t div(const bigint& divisor) const;
bigint::div is inspired by std::div.  It divides *this by divisor and returns the quotient and remainder in a bigint::div_t.  The quotient will be truncated toward zero; the remainder will have the same sign as *this or be zero.  quot * divisor + rem == *this.

bigint::div_ts are freely copyable, moveable, and swappable.


Bitwise operations

bigint& complement() noexcept;
bigint& operator&=(const bigint&) noexcept;
bigint& operator|=(const bigint&);
bigint& operator^=(const bigint&);
bigint& operator<<=(int bit_count);
bigint& operator>>=(int bit_count);
The bitwise operations should be mostly unsurprising, but note:
  • The complement() function toggles exactly those bits already present in the object.  It doesn’t generate any representation of an infinite number of most significant 1s, so the result of this->complement() isn’t really unbounded; and leading zeros can be lost.
  • Bit counts for shifts are limited to values that can be stored in an int.  You can’t shift by some huge bigint number of bits in a single operation.
  • Bitwise operations never change the sign of the object except to insure that there’s never a negative zero.

If the bit_count argument to a shift operator is less than zero, the function will shift in the opposite direction by the absolute value of the bit count even if the bit count is 2’s-comp. INT_MIN.


Size, etc.

bool empty() const noexcept;
std::size_t size() const noexcept;
std::size_t capacity() const noexcept;
void shrink_to_fit();
These functions relate to the number of bytes that bigint objects allocate on the free store (a.k.a., “heap”).

The bigint class is implemented, in part, as a std::vector<something unsigned>.  These functions all just call the vector’s member functions of the same name, and then size() and capacity() multiply by sizeof(the vector’s value_type) so that users don’t have to care what the value_type is, and you always get the same answer regardless of the details of the implementation.

Class invariant:  is_zero() == empty().


Nonmember functions


Miscellaneous functions

using std::swap;
void swap(bigint&, bigint&);
void swap(bigint::div_t&, bigint::div_t&);

bigint abs(const bigint&);

std::string to_string(const bigint&, int radix = 10);
The above all have the same behavior as their corresponding member functions.
bigint gcd(const bigint&, const bigint&);
bigint lcm(const bigint&, const bigint&);
gcd() returns the greatest common divisor of two bigints.  If either argument is zero, it returns the absolute value of the other one.  If both arguments are zero, it throws an invalid_argument exception because gcd(0,0) would otherwise return zero which is no divisor of any kind.

lcm() returns the least common multiple, the smallest integer greater than zero that’s divisible by both arguments.  If either argument is zero, it throws an invalid_argument exception because no number is divisible by zero.

I’ve heard that some would prefer that lcm() just return zero if either argument is zero.  If anybody knows of a source other than Wikipedia that offers a cogent rationale for that, or an interesting use case, please let me know.


Overloaded operators

bool operator==(const bigint&, const bigint&) noexcept;
bool operator!=(const bigint&, const bigint&) noexcept;
bool operator< (const bigint&, const bigint&) noexcept;
bool operator> (const bigint&, const bigint&) noexcept;
bool operator<=(const bigint&, const bigint&) noexcept;
bool operator>=(const bigint&, const bigint&) noexcept;
Note that there’s no operator<=> (a.k.a., “spaceship operator”).  That’s because the database access library that the bigint class is intended to support tries very hard not to require any language or library features past C++11; and note that we always expect strong ordering for integers anyway since we don’t have NaNs to worry about.
bigint operator+(const bigint&);
bigint operator-(const bigint&);
The unary + operator just returns a copy of its argument; the unary − operator returns a negated copy of its argument.
bigint operator+(const bigint&, const bigint&);
bigint operator-(const bigint&, const bigint&);
bigint operator*(const bigint&, const bigint&);
bigint operator/(const bigint&, const bigint&);
bigint operator%(const bigint&, const bigint&);

bigint::div_t div(const bigint& dividend, const bigint& divisor);
The nonmember arithmetic operations have the same semantics as do their corresponding member functions.
bigint operator~(const bigint&);
bigint operator&(const bigint&, const bigint&);
bigint operator|(const bigint&, const bigint&);
bigint operator^(const bigint&, const bigint&);
bigint operator<<(const bigint&, int);
bigint operator>>(const bigint&, int);
The nonmember bitwise operators have the same semantics as do their corresponding member functions, except that bigint::complement() and the &= operator will never throw an exception, but the nonmember ~ and & operators can throw bad_alloc because they make copies of their arguments.


Formatted I/O

template<typename Ch, typename Tr>
std::basic_ostream<Ch,Tr>& operator<<(std::basic_ostream<Ch,Tr>&, const bigint&);

template<typename Ch, typename Tr>
std::basic_istream<Ch,Tr>& operator>>(std::basic_istream<Ch,Tr>&, bigint&);
The canonical iostream insertion and extraction operators are provided.  They’re localized (ctype and numpunct facets) and recognize all the ios_base and basic_ios<> formatting facilities that make sense for integers.

C++20’s formatting library is not supported because, as with the spaceship operator, we deliberately refrain from requiring any language or library features past C++11.

Issue:  the author’s current implementation of the >> operator recognizes localized thousands separators but just ignores them.  It doesn’t require that they be in the right places.

Although it’s not shown above, if you won’t be doing any I/O of bigints in a particular translation unit (TU), you can #define BIGINT_NO_IO_OPERATORS before you include bigint.hpp, which might save compilation time in complicated builds, although it won’t reduce code size or running time.  You can define that macro in some TUs and not in others without fear of violating the one definition rule (ODR).

The actual code in bigint.hpp is:
  } // end of namespace dbacc
  #ifndef BIGINT_NO_IO_OPERATORS
    #include "bigint_io.hpp"
  #endif
and bigint_io.hpp contains the actual template definitions in the dbacc namespace.  The hope is that the compiler’s lexer won’t even have to scan that code if it doesn’t need to.

bigint_io.hpp is a proper header file beginning and ending in the global namespace, and with the usual include guard.  If you include bigint.hpp with BIGINT_NO_IO_OPERATORS defined, you can still get the I/O operators later in the same TU by explicitly including bigint_io.hpp yourself.  It’s not clear why you’d want to do that, but there it is.
And note that any explicit #include of bigint_io.hpp may appear only in the global namespace if you want ADL to work right.


All suggestions and corrections will be welcome; all flames will be amusing.
Mail to was@pobox.com.