An Open-Source Unbounded Integer in C++

Bill Seymour
2024-02-02


Contents


Introduction

This paper describes the public API of an open-source unbounded integer distributed under the Boost Software License.  (This is not part of Boost.  The author just likes their open-source license.)

This paper contains links, usually to cppreference.com, to explain some terms to readers who aren’t familiar with features that were introduced in C++11 or later.
Text with a gray background provides additional information or rationale for the design.
Text with a yellow background points out issues that you might want to watch out for.

The integer class is intended to provide a C++ equivalent to SQL’s NUMERIC and DECIMAL types (for values that happen to be integers) for use in a database access library that the author is developing; so it deliberately doesn’t require any language or library features past C++11.
The class is written with only C++11 in mind because the principal target audience is engaged in good old “business data processing”, and large companies are often slow to approve new development tools (for good reason).

Bitwise operations are only partially supported, and the efficiency of the author’s implementation might be unaceptable for serious numerical work.

As expected for integers, the class has no NaN or infinity values; and it will never yield a negative zero that’s visible outside the class.

Division and conversions from finite floating point values quietly truncate their results toward zero.

This is part of a larger library that includes this integer class, a big decimal that can be used as either a fixed point or a floating point type, and a rational number.  All the sources files are available here.

All symbols are declared in the bignum namespace.

Exceptions:

In general, all functions provide at least the strong exception guarantee, but only the basic guarantee when bad_alloc is thrown.
The strong guarantee for bad_alloc slowed the author’s arithmetic operations quite a bit, so that didn’t seem worthwhile since failure to allocate memory is unlikely, and most programs that fail in that manner probably can’t continue anyway.


The source files


Synopsis

namespace bignum {

class integer final
{
public:
  //
  // Special member functions and swap
  //
    integer();

    integer(const integer&);
    integer(integer&&);

    integer& operator=(const integer&);
    integer& operator=(integer&&);

    ~integer() noexcept;

    void swap(integer&);

  //
  // Contruction from arithmetic types
  // (SFINAE not shown)
  //
    template<typename FundamentalArithmeticType>
      integer(FundamentalArithmeticType);

  //
  // Contruction from strings
  //
    explicit integer(const char*, int radix = 0, const char** = nullptr);
    explicit integer(const std::string&, int radix = 0, std::size_t* = nullptr);

  //
  // Assignment
  // (SFINAE for template not shown)
  //
    template<typename FundamentalArithmeticType>
      integer& operator=(FundamentalArithmeticType);

    integer& assign(const char*, int radix = 0, const char** = nullptr);
    integer& assign(const std::string&, int radix = 0, std::size_t* = nullptr);

  //
  // Conversion to other types
  //
    explicit operator bool() const noexcept;
    explicit operator char() const;
    explicit operator signed char() const;
    explicit operator unsigned char() const;
    explicit operator short() const;
    explicit operator unsigned short() const;
    explicit operator int() const;
    explicit operator unsigned int() const;
    explicit operator long() const;
    explicit operator unsigned long() const;
    explicit operator long long() const;
    explicit operator unsigned long long() const;
    explicit operator float() const;
    explicit operator double() const;
    explicit operator long double() const;

    std::string to_string(int radix = 10) const;

  //
  // Observers
  //
    bool is_neg()  const noexcept;
    bool is_pos()  const noexcept;
    bool is_odd()  const noexcept;
    bool is_even() const noexcept;
    bool is_zero() const noexcept;
    bool is_one()  const noexcept;
    bool is_one(bool) const noexcept;

    int signum() const noexcept;

    int compare(const integer&) const noexcept;

  //
  // Mutators
  //
    void clear() noexcept;

    integer& set_to_zero() noexcept;
    integer& set_to_one(bool negative = false);

    integer& negate() noexcept;
    integer& abs() noexcept;

  //
  // Arithmetic operations
  //
    integer& operator++();
    integer& operator--();

    integer  operator++(int);
    integer  operator--(int);

    integer& operator+=(const integer&);
    integer& operator-=(const integer&);
    integer& operator*=(const integer&);
    integer& operator/=(const integer&);
    integer& operator%=(const integer&);

    struct div_t;
    div_t div(const integer&) const;

  //
  // Bitwise operations
  //
    integer& complement() noexcept;
    integer& operator&=(const integer&) noexcept;
    integer& operator|=(const integer&);
    integer& operator^=(const integer&);
    integer& operator<<=(int);
    integer& operator>>=(int);

  //
  // Size, etc.
  //
    bool empty() const noexcept;
    std::size_t size() const noexcept;
    std::size_t capacity() const noexcept;
    void shrink_to_fit();
};

//
// Miscellaneous functions
//
using std::swap;
void swap(integer&, integer&);
void swap(integer::div_t&, integer::div_t&);

integer abs(const integer&);

std::string to_string(const integer&, int radix = 10);

integer gcd(const integer&, const integer&);
integer lcm(const integer&, const integer&);

//
// Overloaded operators
//
bool operator==(const integer&, const integer&) noexcept;
bool operator!=(const integer&, const integer&) noexcept;
bool operator< (const integer&, const integer&) noexcept;
bool operator> (const integer&, const integer&) noexcept;
bool operator<=(const integer&, const integer&) noexcept;
bool operator>=(const integer&, const integer&) noexcept;

integer operator+(const integer&);
integer operator-(const integer&);

integer operator+(const integer&, const integer&);
integer operator-(const integer&, const integer&);
integer operator*(const integer&, const integer&);
integer operator/(const integer&, const integer&);
integer operator%(const integer&, const integer&);

integer::div_t div(const integer&, const integer&);

integer operator~(const integer&);
integer operator&(const integer&, const integer&);
integer operator|(const integer&, const integer&);
integer operator^(const integer&, const integer&);
integer operator<<(const integer&, int);
integer operator>>(const integer&, int);

//
// Formatted I/O
//
template<typename CharType, typename CharTraits>
  std::basic_ostream<CharType,CharTraits>&
    operator<<(std::basic_ostream<CharType,CharTraits>&, const integer&);

template<typename CharType, typename CharTraits>
  std::basic_istream<CharType,CharTraits>&
    operator>>(std::basic_istream<CharType,CharTraits>&, integer&);

} // namespace bignum


Detailed descriptions


Member functions


Special member functions and swap

integer();
The default constructor constructs a integer with a value of zero.
~integer() noexcept;
The destructor is non-trivial.
integer(const integer&);
integer& operator=(const integer&);

integer(integer&&);
integer& operator=(integer&&);

void swap(integer&);
integers are freely copyable, moveable, and swappable.


Construction from arithmetic types

template<typename FundamentalArithmeticType>
  integer(FundamentalArithmeticType);
The class provides implicit conversion from all the fundamental arithmetic types.  The conversion from finite floating point values quietly truncates its argument toward zero.  Attempting to convert from a floating-point NaN or infinity will throw an invalid_argument exception.

This functionality is actually provided by four constructor templates that use SFINAE to guarantee that we get the correct overload for the argument type.  The details aren’t documented here because they would just obscure the high-level purpose of these constructors.  C++ coders who are familiar with type traits templates will likely find the details unsurprising (and probably tedious as well).


Construction from strings

explicit integer(const char*        value, int radix = 0, const char** termptr = nullptr);
explicit integer(const std::string& value, int radix = 0, std::size_t* termpos = nullptr);
The class provides explicit conversions from C-style strings and std::strings.  If value is an empty string, the constructed value will be zero.  The conversion from C-style strings will throw an invalid_argument exception if value is nullptr.

In general, value should contain only digits; but the first character may be a '+' or a '-'; and hexadecimal values may begin with "0x" or "0X".  If value begins with a sign, any "0x" or "0X" must immediately follow that.

Only octal, decimal and hexadecimal representations are supported.  Any character that’s not allowed will just quietly terminate the parse.
And note that whitespace and thousands separators are not allowed.

Any value is allowed for the radix argument; but if it’s not 8, 10, or 16, or it’s not supplied at all, the radix will be inferred from the string:

Hexadecimal digits may be upper or lower case, and the case may be mixed in a single string.

An optional third argument can be used to discover the character that terminated the parse:


Assignment

template<typename FundamentalArithmeticType>
  integer& operator=(FundamentalArithmeticType);
Values of any fundamental arithmetic type may be assigned to a integer.  If an operand is a finite floating point value, the value will be quietly truncated toward zero.  If an operand is a floating point NaN or infinity, the function will throw an invalid_argument exception.  As with implicit construction, SFINAE guarantees that we get the correct overload.
integer& assign(const char*        value, int radix = 0, const char** termptr = nullptr);
integer& assign(const std::string& value, int radix = 0, std::size_t* termpos = nullptr);
There’s no operator=(some string) because implicit conversion from strings is not allowed.  Instead, there are assign member functions which also allow for additional arguments.  All the arguments have the same semantics as they do in the constructors that take string arguments.

These functions all return *this.


Conversion to other types

explicit operator bool() const noexcept;
The conversion to bool returns whether *this is non-zero.  It’s intended so support the if(my_integer) idiom.
explicit operator char() const;
explicit operator signed char() const;
explicit operator unsigned char() const;
explicit operator short() const;
explicit operator unsigned short() const;
explicit operator int() const;
explicit operator unsigned int() const;
explicit operator long() const;
explicit operator unsigned long() const;
explicit operator long long() const;
explicit operator unsigned long long() const;
explicit operator float() const;
explicit operator double() const;
explicit operator long double() const;
Also provided are direct initialization of, and explicit casts to, all the rest of the usual fundamental arithmetic types.  If the value won’t fit in the requested type, including attempting to convert a negative value to an unsigned type, the function will throw an overflow_error.  The conversions to floating point types can quietly result in loss of precision.
We probably could have gotten away with just long long, unsigned long long and long double; but having all of them allows for a finer grained check for overflow; and it also allows including the target type name in the overflow_error’s message, which might be helpful during debugging.
If your C++ implementation’s long double lacks quiet NaNs and infinities, operator long double() might not throw an overflow_error, but instead return HUGE_VALL with errno set to ERANGE.  This could also happen with operator double() if double and long double have the same representation.
The conversions to floating point types require access to the floating point environment.  If your compiler doesn’t understand “#pragma STDC FENV_ACCESS ON” at block scope, overflow might not be detected.  (It seems to work OK with an older Microsoft compiler (_MSC_VER == 1928) that doesn’t grok any STDC pragmas at all; but YMMV.)
std::string to_string(int radix = 10) const;
If to_string’s argument is not 8, 10 or 16, or is not supplied at all, it defaults to 10.  Hexadecimal digits will be upper case.  There could be a leading '-', but there will never be a leading '+' nor any indication of the radix.  If you need something fancier, you can write to a std::basic_ostringstream<> and get the whole nine yards (or a localized version of that 😊).


Observers

bool is_zero() const noexcept;
bool is_even() const noexcept;
bool is_odd()  const noexcept;
bool is_neg()  const noexcept;
bool is_pos()  const noexcept;
is_zero() returns whether *this is zero;
is_even() returns whether *this is an even number;
is_odd()  returns whether *this is an odd number;
is_neg()  returns whether *this is less than zero;
is_pos()  returns whether *this is greater than zero.
A better name for is_pos() might be is_pos_nonzero() since it returns false if the value is zero; but some users might prefer the less prolix name.
bool is_one() const noexcept;
bool is_one(bool negative) const noexcept;
is_one() with no argument returns whether *this is ±1;
is_one(false) returns whether *this is +1;
is_one(true) returns whether *this is −1.
int signum() const noexcept;
signum() returns +1 if *this is greater than zero, 0 if *this is equal to zero, or −1 if *this is less than zero.
int compare(const integer& rhs) const noexcept;
compare(rhs) returns a value greater than zero if *this > rhs, zero if *this == rhs, or a value less than zero if *this < rhs.


Mutators

void clear() noexcept;
integer& set_to_zero() noexcept;
clear() and set_to_zero() both assign zero to *this.
Some users might prefer clear() because it’s a common member of classes that allocate memory on the free store (a.k.a., “heap”); some users might prefer the more expressive name or a function that returns *this.
integer& set_to_one(bool negative = false);
set_to_one(false) assigns +1 to *this;
set_to_one(true) assigns −1 to *this.
integer& negate() noexcept;
negate() changes the sign of *this if *this is non-zero.  It will never create a negative zero.
integer& abs() noexcept;
abs() unconditionally makes *this non-negative.

All but clear() return *this.


Arithmetic operations

integer& operator++();
integer& operator--();

integer  operator++(int);
integer  operator--(int);

integer& operator+=(const integer&);
integer& operator-=(const integer&);
integer& operator*=(const integer&);
integer& operator/=(const integer&);
integer& operator%=(const integer&);
Overloads of all the usual arithmetic operators that apply to integers are provided.  Division truncates the quotient toward zero.  Division by zero throws a domain_error exception.
Overloads of the operators that don’t also do assignment are provided as nonmember functions shown below.
struct div_t final
{
    integer quot, rem;
    void swap(div_t&);
};
div_t div(const integer& divisor) const;
integer::div is inspired by std::div.  It divides *this by divisor and returns the quotient and remainder in a integer::div_t.  The quotient will be truncated toward zero; the remainder will have the same sign as *this or be zero.  quot * divisor + rem == *this.

integer::div_ts are freely copyable, moveable, and swappable.


Bitwise operations

integer& complement() noexcept;
integer& operator&=(const integer&) noexcept;
integer& operator|=(const integer&);
integer& operator^=(const integer&);
integer& operator<<=(int bit_count);
integer& operator>>=(int bit_count);
The bitwise operations should be mostly unsurprising, but note:
  • The complement() function toggles exactly those bits already present in the object.  It doesn’t generate any representation of an infinite number of most significant 1s, so the result of this->complement() isn’t really unbounded; and leading zeros can be lost.
  • Bit counts for shifts are limited to values that can be stored in an int.  You can’t shift by some huge integer number of bits in a single operation.
  • Bitwise operations never change the sign of the object except to insure that there’s never a negative zero.

If the bit_count argument to a shift operator is less than zero, the function will shift in the opposite direction by the absolute value of the bit count even if the bit count is 2’s-comp. INT_MIN.


Size, etc.

bool empty() const noexcept;
std::size_t size() const noexcept;
std::size_t capacity() const noexcept;
void shrink_to_fit();
These functions relate to the number of bytes that integer objects allocate on the free store (a.k.a., “heap”).

The integer class is implemented, in part, as a std::vector<something unsigned>.  These functions all just call the vector’s member functions of the same name, and then size() and capacity() multiply by sizeof(the vector’s value_type) so that users don’t have to care what the value_type is, and you always get the same answer regardless of the details of the implementation.

Class invariant:  is_zero() == empty().


Nonmember functions


Miscellaneous functions

using std::swap;
void swap(integer&, integer&);
void swap(integer::div_t&, integer::div_t&);

integer abs(const integer&);

std::string to_string(const integer&, int radix = 10);
The above all have the same behavior as their corresponding member functions.
integer gcd(const integer&, const integer&);
integer lcm(const integer&, const integer&);
gcd() returns the greatest common divisor of two integers.  If either argument is zero, it returns the absolute value of the other one.  If both arguments are zero, it throws an invalid_argument exception because gcd(0,0) would otherwise return zero which is no divisor of any kind.

lcm() returns the least common multiple, the smallest integer greater than zero that’s divisible by both arguments.  If either argument is zero, it throws an invalid_argument exception because no number is divisible by zero.

I’ve heard that some would prefer that lcm() just return zero if either argument is zero.  If anybody knows of a source other than Wikipedia that offers a cogent rationale for that, or an interesting use case, please let me know.


Overloaded operators

bool operator==(const integer&, const integer&) noexcept;
bool operator!=(const integer&, const integer&) noexcept;
bool operator< (const integer&, const integer&) noexcept;
bool operator> (const integer&, const integer&) noexcept;
bool operator<=(const integer&, const integer&) noexcept;
bool operator>=(const integer&, const integer&) noexcept;
Note that there’s no operator<=> (a.k.a., “spaceship operator”).  That’s because the database access library that the integer class is intended to support tries very hard not to require any language or library features past C++11; and note that we always expect strong ordering for integers anyway since we don’t have NaNs to worry about.
integer operator+(const integer&);
integer operator-(const integer&);
The unary + operator just returns a copy of its argument; the unary − operator returns a negated copy of its argument.
integer operator+(const integer&, const integer&);
integer operator-(const integer&, const integer&);
integer operator*(const integer&, const integer&);
integer operator/(const integer&, const integer&);
integer operator%(const integer&, const integer&);

integer::div_t div(const integer& dividend, const integer& divisor);
The nonmember arithmetic operations have the same semantics as do their corresponding member functions.
integer operator~(const integer&);
integer operator&(const integer&, const integer&);
integer operator|(const integer&, const integer&);
integer operator^(const integer&, const integer&);
integer operator<<(const integer&, int);
integer operator>>(const integer&, int);
The nonmember bitwise operators have the same semantics as do their corresponding member functions, except that integer::complement() and the &= operator will never throw an exception, but the nonmember ~ and & operators can throw bad_alloc because they make copies of their arguments.


Formatted I/O

template<typename Ch, typename Tr>
std::basic_ostream<Ch,Tr>& operator<<(std::basic_ostream<Ch,Tr>&, const integer&);

template<typename Ch, typename Tr>
std::basic_istream<Ch,Tr>& operator>>(std::basic_istream<Ch,Tr>&, integer&);
The canonical iostream insertion and extraction operators are provided.  They’re localized (ctype and numpunct facets) and recognize all the ios_base and basic_ios<> formatting facilities that make sense for integers.

C++20’s formatting library is not supported because, as with the spaceship operator, we deliberately refrain from requiring any language or library features past C++11.

Issue:  the author’s current implementation of the >> operator recognizes localized thousands separators but just ignores them.  It doesn’t require that they be in the right places.

Although it’s not shown above, if you won’t be doing any I/O of integers in a particular translation unit (TU), you can #define INTEGER_NO_IO_OPERATORS before you include integer.hpp, which might save compilation time in complicated builds, although it won’t reduce code size or running time.  You can define that macro in some TUs and not in others without fear of violating the one definition rule (ODR).

The actual code in integer.hpp is:
  } // end of namespace bignum
  #ifndef INTEGER_NO_IO_OPERATORS
    #include "integer_io.hpp"
  #endif
and integer_io.hpp contains the actual template definitions in the bignum namespace.  The hope is that the compiler’s lexer won’t even have to scan that code if it doesn’t need to.

integer_io.hpp is a proper header file beginning and ending in the global namespace, and with the usual include guard.  If you include integer.hpp with INTEGER_NO_IO_OPERATORS defined, you can still get the I/O operators later in the same TU by explicitly including integer_io.hpp yourself.  It’s not clear why you’d want to do that, but there it is.
And note that any explicit #include of integer_io.hpp may appear only in the global namespace if you want ADL to work right.


All suggestions and corrections will be welcome; all flames will be amusing.
Mail to was@pobox.com.