A Database Access Library in C++

Bill Seymour
2021-05-05


Contents


1. Introduction

This paper describes the public API of a proposed open-source database access library.  It’s distributed under the Boost Software License which isn’t viral as the GPL and others are said to be. (This is not part of Boost. The author just likes their open-source license (which, recursively, is open-source).)

The intent is that the library be implementable using any C++ implementation that conforms at least to C++11 with maybe some additional data types.

For the sake of brevity, this paper generally doesn’t show things like copy constructors and swap functions. Except as noted, all classes are copyable, copy-assignable, moveable, move-assignable, and noexcept-swappable; and they all have public non-virtual destructors.

For simplicity of exposition, this paper assumes that the database command language is SQL. It’s intended that the actual library be indifferent to that. (An acceptable command should be anything that the database engine’s parser accepts. This library shouldn’t do any parsing on its own or require any particular command syntax. Indeed, to the extent that it’s possible, the library should support Web service clients and even the cloud; although the principal target audience is folk wanting to connect to SQL databases.)


2. Acknowledgements

I’m indebted to Thomas Neumann of the Technische Univeristät München for the transaction class, the nullval template, row::into(Args&...), and basic_statement<>::operator()(const Args&...), all of which are based on code that appeared in WG21 paper N3458.

Thanks also to Howard Hinnant for his exquisite civil time extensions to <chrono>.


3. Synopsis

#include <stdexcept>
#include <string>
#include <iterator>
#include <memory>
#include <chrono>

#include "bigint.hpp"
#include "bigdec.hpp"
namespace dbacc {

using std::string;    // for brevity in this paper
using std::size_t;    //            "
using std::ptrdiff_t; //            "
class connection final { public: connection(const string&, const string&, const string&); explicit connection(const string&); explicit connection(void (*)(string&, string&, string&)); // optional: connection(const string&, const string&, const string&, implementation-defined); connection(const string&, implementation-defined); connection(void (*)(string&, string&, string&), implementation-defined); bool commit() noexcept; bool rollback() noexcept; bool auto_commit() const; void auto_commit(bool); void begin_nested_transaction(); bool commit_nested_transaction() noexcept; bool rollback_nested_transaction() noexcept; }; class transaction final { public: explicit transaction(connection&) noexcept; transaction(const transaction&) = delete; transaction(transaction&&) = delete; transaction& operator=(const transaction&) = delete; transaction& operator=(transaction&&) = delete; }; class nested_transaction final { public: explicit nested_transaction(connection&, bool = true); nested_transaction(const nested_transaction&) = delete; nested_transaction(nested_transaction&&) = delete; nested_transaction& operator=(const nested_transaction&) = delete; nested_transaction& operator=(nested_transaction&&) = delete; void begin(); bool commit() noexcept; bool rollback() noexcept; }; // civil time types new in C++20: using std::chrono::months; using std::chrono::days; using std::chrono::seconds; using std::chrono::microseconds; using time = std::chrono::time_of_day<seconds>; using precise_time = std::chrono::time_of_day<microseconds>; using date = std::chrono::local_time<days>; using datetime = std::chrono::local_time<seconds>; using timestamp = std::chrono::local_time<microseconds>; using std::chrono::time_zone; using zoned_date = std::chrono::zoned_time<days>; using zoned_datetime = std::chrono::zoned_time<seconds>; using zoned_timestamp = std::chrono::zoned_time<microseconds>; class blob final { public: constexpr blob() noexcept; blob(const void*, size_t); bool empty() const noexcept; size_t size() const noexcept; const void* data() const noexcept; blob& assign(const void*, size_t); void clear() noexcept; }; class column final { public: bool null() const noexcept; template<typename T> bool get(T&) const; template<typename T> T get() const; }; class row final { public: // // A const vector<column>-like API: // using size_type = size_t; using difference_type = ptrdiff_t; using value_type = const column; using const_reference = const column&; using const_iterator = implementation-detail; using const_reverse_iterator = implementation-detail; size_type size() const; bool empty() const; const_iterator begin() const; const_iterator end() const; const_reverse_iterator rbegin() const; const_reverse_iterator rend() const; const_iterator cbegin() const const_iterator cend() const; const_reverse_iterator crbegin() const; const_reverse_iterator crend() const; const_reference front() const; const_reference back() const; const_reference operator[](size_type) const; const_reference at(size_type) const; // // Extra added attractions: // const_reference operator[](const char*) const; const_reference operator[](const string&) const; const_reference at(const char*) const; const_reference at(const string&) const; template<typename ...Args> void into(Args&...) const; template<typename First, typename Second, typename ...Rest> void acquire(const First&, Second&, Rest&...) const; void acquire() const; }; class cursor_base { public: using value_type = row; cursor_base(const cursor_base&) = delete; cursor_base& operator=(const cursor_base&) = delete; virtual ~cursor_base() = 0; value_type operator*(); bool operator==(const cursor_base&) const noexcept; bool operator!=(const cursor_base&) const noexcept; bool operator< (const cursor_base&) const noexcept; bool operator> (const cursor_base&) const noexcept; bool operator<=(const cursor_base&) const noexcept; bool operator>=(const cursor_base&) const noexcept; }; struct cursor_tag { }; struct scrolling_cursor_tag : cursor_tag { }; template<typename IteratorCategory> class basic_cursor; template<> class basic_cursor<cursor_tag> : public cursor_base { public: using iterator_category = cursor_tag; basic_cursor(const basic_cursor&) = delete; basic_cursor& operator=(const basic_cursor&) = delete; basic_cursor& operator++(); }; using cursor = basic_cursor<cursor_tag>; template<> class basic_cursor<scrolling_cursor_tag> : public cursor_base { public: using iterator_category = scrolling_cursor_tag; using difference_type = ptrdiff_t; using position_type = size_t; static constexpr position_type npos = position_type(-1); basic_cursor(const basic_cursor&) = delete; basic_cursor& operator=(const basic_cursor&) = delete; basic_cursor& operator++(); basic_cursor& operator--(); basic_cursor& operator+=(difference_type); basic_cursor& operator-=(difference_type); basic_cursor& operator=(position_type); difference_type operator-(const basic_cursor&) const; }; using scrolling_cursor = basic_cursor<scrolling_cursor_tag>; using std::advance; using std::next; using std::prev; using std::distance; void advance(scrolling_cursor&, scrolling_cursor::difference_type); scrolling_cursor& next(scrolling_cursor&, scrolling_cursor::difference_type = 1); scrolling_cursor& prev(scrolling_cursor&, scrolling_cursor::difference_type = 1); scrolling_cursor::difference_type distance(const scrolling_cursor&, const scrolling_cursor&); template<typename CursorType> class basic_table { public: using const_iterator = CursorType; const_iterator begin() const; const_iterator end() const; const_iterator cbegin() const; const_iterator cend() const; }; using table = basic_table<cursor>; using scrolling_table = basic_table<scrolling_cursor>; template<typename T> struct nullval { }; using null_number = nullval<int>; using null_string = nullval<const char*>; template<typename ResultType> class basic_statement { public: using result_type = ResultType; using row_count = int; explicit basic_statement(connection&, row_count = 0); basic_statement(connection&, const string&, row_count = 0); row_count prefetch_size() const; void prefetch_size(row_count); void prepare(const string&); basic_statement& operator=(const string&); template<typename T> void set(int, const T&); template<typename T> void set(int, nullval<T>); template<typename T> void set_null(int); template<typename T> void set(const string&, const T&); template<typename T> void set(const string&, nullval<T>); template<typename T> void set_null(const string&); basic_statement& execute(); basic_statement& execute(const string&); result_type result(); template<typename ...Args> result_type operator()(const Args&...); template<typename ...Args> result_type run(const Args&...); }; using statement = basic_statement<void>; using dml_statement = basic_statement<unsigned int>; using call_statement = basic_statement<row>; using query = basic_statement<table>; using scrolling_query = basic_statement<scrolling_table>; class db_error : public std::logic_error { public: static constexpr size_t statelen = implementation-defined; db_error(const char*, int, const char* = nullptr); db_error(const char*, int, const string&); db_error(const string&, int, const char* = nullptr); db_error(const string&, int, const string&); const char* errmsg() const noexcept; int vendor_code() const noexcept; const char* db_state() const noexcept; }; [[noreturn]] void throw_err(const char*, int, const char* = nullptr); [[noreturn]] void throw_err(const char*, int, const string&); [[noreturn]] void throw_err(const string&, int, const char* = nullptr); [[noreturn]] void throw_err(const string&, int, const string&); enum class error_code; const char* error_state(error_code); [[noreturn]] void throw_err(const char*, error_code); [[noreturn]] void throw_err(const string&, error_code); class logger { public: static logger& get() noexcept; static logger* set(logger*) noexcept; using memfuncptr = bool (logger::*)(const string&); protected: virtual bool dberr(const db_error&, memfuncptr) noexcept; constexpr logger() noexcept { } public: virtual ~logger() noexcept; virtual bool error(const char*) noexcept = 0; virtual bool error(const string&) noexcept; virtual bool error(const db_error&) noexcept; virtual bool warning(const char*) noexcept; virtual bool warning(const string&) noexcept; virtual bool warning(const db_error&) noexcept; virtual bool info(const char*) noexcept; virtual bool info(const string&) noexcept; virtual bool info(const db_error&) noexcept; }; bool log_err(const char*, error_code) noexcept; bool log_err(const std::string&, error_code) noexcept; bool log_warn(const char*, error_code) noexcept; bool log_warn(const std::string&, error_code) noexcept; bool log_info(const char*, error_code) noexcept; bool log_info(const std::string&, error_code) noexcept; } // namespace dbacc


4. Detailed descriptions


4.1. The connection class

class connection final
{
public:
    connection(const string& uri, const string& usr, const string& pwd);
    explicit connection(const string& credentials);
    explicit connection(void (*func_ptr)(string& uri, string& usr, string& pwd));
A connection can be constructed with a database URI, a user ID and a password, or with some implementation-defined credentials string. It can also be constructed given a function that supplies the URI, user ID and password to strings passed by non-const reference. The purpose of the third constructor is to provide an easy way to supply default credentials that depend on the environment in which the program is running, for example, development, testing and production boxes.
    // optional:
    connection(const string&, const string&, const string&, implementation-defined);
    connection(const string&, implementation-defined);
    connection(void (*)(string&, string&, string&), implementation-defined);
An additional argument can specify the kind of middleware that the connection will need and thus the kind of commands that will be accepted. The type and semantics of such an actual argument are implementation-defined. One possible use might be to specify some dynamic library to load. The idea is that user-defined middleware should be allowed somehow.

All implementations must support at least SQL commands as specified in ISO/IEC 9075. Implementations may also chose to support other kinds of commands, for example, HTTP or HTTPS URLs intended to call WEB services. Implementations that support only SQL may have only the one-argument and three-argument constructors.

And note that, under the as-if rule, implementations may have only three constructors with the final argument defaulting to support of SQL commands.

Issue:  should the constructor arguments be basic_string<>s? Is there a popular commercial database that allows localized URIs, user IDs and passwords?
Issue:  should the library provide some kind of connection pool? Is there an existing connection pool implementation that would be suitable for standardization?

    bool commit() noexcept;
    bool rollback() noexcept;
The commit() and rollback() functions never throw exceptions, but rather return bool indicating success. If either would actually throw, presumably after the program has already done something successfully, we’d have bigger problems than that. This is particularly important for rollback() which might very well be called while another exception is being handled.
    bool auto_commit() const;
    void auto_commit(bool);
Some databases have an auto-commit mode in which every modification of the database is automatically committed as soon as it completes. In this author’s opinion, that’s evil; so at a minimum, we need a way to turn that off. auto_commit() returns the current state, and auto_commit(bool) turns it on or off. If a database lacks that feature, it’s acceptable for auto_commit() to always return false and auto_commit(bool) to quietly do nothing. (It’s also possible, but not recommended, for some implementations, for example those that support Web services, to always auto-commit. In this case, auto_commit() would always return true.)
    void begin_nested_transaction();
    bool commit_nested_transaction() noexcept;
    bool rollback_nested_transaction() noexcept;
};
Some databases have nested transactions, independent transactions that can interrupt another ongoing transaction. Nested transactions are an optional feature, and begin_nested_transaction() will throw a db_error if they’re not supported. The commit_nested_transaction() and rollback_nested_transaction() functions end the nested transaction putting the previously ongoing transaction back in effect. If they’re called when a nested transaction is not in effect, they quietly do nothing but return false. If the unnested commit() or rollback() is called while a nested transaction is in effect, it’s unspecified what work is actually committed or rolled back.

Note that a connection is not a statement factory as it is in some similar libraries. Instead, connections are passed as arguments to statement constructors.


4.2. The transaction class

class transaction final
{
    connection* cp; // exposition only
public:
    explicit transaction(connection& c) noexcept : cp(&c) { }
    ~transaction() noexcept { cp->rollback(); }

    // neither copyable nor moveable:

    transaction(const transaction&) = delete;
    transaction(transaction&&) = delete;

    transaction& operator=(const transaction&) = delete;
    transaction& operator=(transaction&&) = delete;
};

An instance of this class simply rolls back any database activity that has yet to be committed when the object goes out of scope. The intent is to use RAII to provide exception safety in situations like:

    void foo(connection& conn)
    {
        transaction t(conn);
        // do something that might throw
        conn.commit();
    }

If the block is exited before conn.commit() is executed, “something that might throw” gets rolled back. “Less is more.”

Note that a “transaction”, in the database world, does not guarantee atomicity. Modifications to the database itself can be committed or rolled back as a group; but that has nothing to do with side effects in the program that uses the database.


4.3. The nested_transaction class

Like the transaction class above:
class nested_transaction final
{
    connection* cp; // exposition only
public:
    explicit nested_transaction(connection& c, bool auto_begin = true) : cp(&c)
    {
        // as if:
        if (auto_begin)
        {
            begin();
        }
    }
    ~nested_transaction() noexcept
    {
        cp->rollback_nested_transaction();
    }

    nested_transaction(const nested_transaction&) = delete;
    nested_transaction(nested_transaction&&) = delete;

    nested_transaction& operator=(const nested_transaction&) = delete;
    nested_transaction& operator=(nested_transaction&&) = delete;

    void begin() { cp->begin_nested_transaction(); }
    bool commit() noexcept { cp->commit_nested_transaction(); }
    bool rollback() noexcept { cp->rollback_nested_transaction(); }
};
This class is intended to work like the transaction class except that it creates a nested transaction.

The optional second constructor argument controls whether the nested transaction is started immediately or waits until begin() is called explicitly.

begin() will throw a db_error if nested transactions are not supported.


4.4. Retrieved data


4.4.1. Newer types that you might not have

In general, we don’t want to require C++ implementations beyond C++11 because this library’s target audience includes folk working in the traditional “business data processing” domain; and many large companies are not quick to approve new versions of development tools (for good reason). There are, however, some additional data types that we hope to have.


4.4.1.1. Newer numeric types:

Not really std:: yet:
using std::integer;   // an unbounded integer
using std::decimal;   // an unbounded decimal (fixed-point and/or floating-point)

using std::decimal32; // IEEE 754 decimal floating-point types
using std::decimal64;
using std::decimal128;
We’d really like to have these types as the C++ equivalent of SQL’s DECIMAL and NUMERIC types. Without them, we’d have to use binary floating-point types which wouldn’t be exact; and even long double might have as few as ten decimal digits of precision.

Unfortunately, none of these is a standard C++ type. The maybe-not-so-bad news is that there’s an open-source implementation of the IEEE 754 decimal floating-point types available at https://github.com/bloomberg/bde/tree/master/groups/bdl/bdldfp, but it assumes Intel’s decimal floating-point hardware, so it might not be generally usable.

And even decimal128 guarantees only 34 decimal digits of precision which might still not be enough (for example, Oracle has a NUMBER(38) type); so we also hope to have a decimal type with unbounded precision. And if we have an unbounded integer, that will work for those DECIMAL and NUMERIC values that happen to be integers.

Both the unbounded integer and the IEEE decimal floating-point types will probably appear in a Technical Specification (a kind of warning about future standardization) that’s currently being developed by WG21’s Study Group 6 – Numerics.

In the mean time, the library provides the following two classes and one scoped enumeration, all in the dbacc namespace.


4.4.1.2. Dates and times:

Howard Hinnant’s civil time extensions to <chrono> are included in C++20. If you don’t have that yet, there’s an open-source implementation at https://github.com/HowardHinnant/date.

using std::chrono::months;
using std::chrono::days;
using std::chrono::seconds;
using std::chrono::microseconds;

using time = std::chrono::time_of_day<seconds>;
using precise_time = std::chrono::time_of_day<microseconds>;

using date = std::chrono::local_time<days>;
using datetime = std::chrono::local_time<seconds>;
using timestamp = std::chrono::local_time<microseconds>;

using std::chrono::time_zone;
using zoned_date = std::chrono::zoned_time<days>;
using zoned_datetime = std::chrono::zoned_time<seconds>;
using zoned_timestamp = std::chrono::zoned_time<microseconds>;
Domain Proposed
C++ type
Approximate SQL type
(inferred from ISO/IEC 9075-2:2011)
Remarks
Amounts
of time
months INTERVAL YEAR TO MONTH  a number of months
days INTERVAL DAY TO DAY  a number of days
seconds INTERVAL DAY TO SECOND(0)  a number of seconds
microseconds INTERVAL DAY TO SECOND(6)  a number of microseconds
Time
of day
time TIME(0)  seconds after local midnight
precise_time TIME(6)  microseconds after local midnight 
Local
points
in time
date DATE  to nearest day
datetime TIMESTAMP(0)  to nearest second
timestamp TIMESTAMP(6)  to nearest microsecond
Absolute
points
in time
time_zone  historical offsets from UTC 
zoned_date  to nearest day
zoned_datetime TIMESTAMP(0) WITH TIME ZONE  to nearest second
zoned_timestamp TIMESTAMP(6) WITH TIME ZONE  to nearest microsecond

Note that SQL doesn’t seem to have a DATE WITH TIME ZONE type, and so it’s not clear how useful a TIME WITH TIME ZONE would really be.

Note also that not all databases have all the standard datetime types. Oracle, for example, lacks the TIME type; and what it calls DATE is really a kind of TIMESTAMP(0), but with addition and subtraction being done with a (possibly fractional) number of days rather than with INTERVALs.

(There should probably also be types that hold values to the nearest minute, but this author hasn’t yet figured out what to call them.)


4.4.2. The blob class

This library supports SQL’s “binary large objects”, basically just a bunch of bytes of arbitrary type. (The so-called “character large objects” are supported by basic_string<>s in the same way that SQL’s other string types are.)

The contained object must be TriviallyCopyable, and the implementation is free to memcpy the raw bytes (and probably does).

class blob final
{
public:
    constexpr blob() noexcept;
    blob(const void*, size_t);

    bool empty() const noexcept;
    size_t size() const noexcept;
    const void* data() const noexcept;

    blob& assign(const void*, size_t);
    void clear() noexcept;
};

The default constructor will create an empty blob, as will calling the two-argument constructor or the assign member function passing either nullptr as the first argument or zero as the second. An existing blob can be emptied by calling clear().

Class invariants:

If the first constructor or assign argument is nullptr, or the second argument is zero, the other argument will be ignored. If the first argument is not nullptr, the blob class will not take ownership of the pointer. This implies that, if the user allocated the memory that the pointer points to, the user is responsible for freeing the memory. (Thus the passed pointer need not point to memory allocated in any particular way, or even to the free store (“heap”) at all.)

data() just returns a pointer to the raw bytes.

Nothing in the blob class will ever throw a db_error exception, but the two-argument constructor and assign member function can throw a std::bad_alloc if memory allocation fails.


4.4.3. The column class

A column is a datum returned from the database. It’s this class that knows how to convert SQL types to C++ types.

Instances of this class are constructed only by the library.

class column final
{
public:
    bool null() const noexcept;

    template<typename T> bool get(T& destination) const
    {
        // If !null(), store the value in destination;
        // otherwise store T() in destination.
        // Return null().
    }
    template<typename T> T get() const
    {
        // as if:
        T val;
        (void)get(val);
        return val;
    }
};

Note that T must be default-constructible, and that the template argument to the no-argument get() can’t be deduced.

Issue:  should get(T& destination) leave destination untouched when null()?


4.4.4. The row class

A row is a const random-access range of columns. It has much of the same public API as would a const vector<column> with a few additional features.

It acts much like a const sequence container; but note that instances of this class are constructed only by the library. Also, there’s no pointer or const_pointer member type.

class row final
{
public:
  // the const vector-like API:

    using size_type = size_t;
    using difference_type = ptrdiff_t;
    using value_type = const column;
    using const_reference = const column&;
    using const_iterator = implementation-detail;
    using const_reverse_iterator = implementation-detail;

    size_type size() const;
    bool empty() const;

    const_iterator begin() const;
    const_iterator end() const;
    const_reverse_iterator rbegin() const;
    const_reverse_iterator rend() const;

    const_iterator cbegin() const
    const_iterator cend() const;
    const_reverse_iterator crbegin() const;
    const_reverse_iterator crend() const;

    const_reference front() const;
    const_reference back() const;
It’s not clear how useful most of the vector-like API really is, but vector<column> is an obvious way to implement row behind the scenes, and so this API is easy to implement and providing it does no harm.
    const_reference operator[](size_type) const;
    const_reference operator[](const char*) const;
    const_reference operator[](const string&) const;

    const_reference at(size_type) const;
    const_reference at(const char*) const;
    const_reference at(const string&) const;
The array subscript operator and at() member function have overloads that take string arguments to provide the ability to retrieve columns by name. As with vector, passing an invalid argument to the [] operator yields undefined behavior, while at(), in that case, throws an exception. Integer arguments to [] and at() are zero-based like array indices (not one-based like in Java’s ResultSet.getFoo() zoo).
    template<typename ...Args> void into(Args&...) const;
The into variadic template stores zero or more columns in C++ variables in the order in which they were returned from the database.
    template<typename First, typename Second, typename ...Rest>
      void acquire(const First&, Second&, Rest&...) const;
    void acquire() const;
};
The acquire template is similar to into, except that it takes pairs of arguments: the column’s ordinal or name followed by a non-const reference to the C++ variable into which that column should be stored. This template must be passed an even number of arguments otherwise you’ll get a compile-time error.

The no-argument acquire() is used internally to halt the variadic template recursion. Making it public can do no harm, and it allows calling acquire() with no argument which one can imagine machine-generated code doing.


4.4.5. The basic_table template

A basic_table<> instance is a const range of rows; but not all rows are in memory at the same time; so this template does not satisfy container requirements. It’s really just an iterator factory.

Instances are constructed only by the library.

template<typename CursorType> class basic_table
{
public:
    using const_iterator = CursorType;

    const_iterator begin() const;
    const_iterator end() const;

    const_iterator cbegin() const;
    const_iterator cend() const;
};

Its iterators are called “cursors”, a term of art in the database world.

using table = basic_table<cursor>;
using scrolling_table = basic_table<scrolling_cursor>;
(“Scrolling cursor” is also a recognized term of art; and since scrolling cursors are not often used in practice, we’re not worried about making the user type a few extra keystrokes in the type name.)


4.5. Cursors

A cursor is an iterator. The library supplies two new iterator categories:
    struct cursor_tag { };
    struct scrolling_cursor_tag : cursor_tag { };
There are cursor and scrolling_cursor classes that are specializations of:
    template<typename IteratorCategory> class basic_cursor;
Note that this library’s cursors model C++ iterators and so they don’t magically “fetch” data into C++ variables. That’s done by the row and column classes as stated above.


4.5.1. The cursor_base class

Most of the work can be refactored into a non-template base class.
class cursor_base
{
public:
    using value_type = row;

    // move-only:
    cursor_base(const cursor_base&) = delete;
    cursor_base& operator=(const cursor_base&) = delete;

    virtual ~cursor_base() = 0;

    value_type operator*();

    bool operator==(const cursor_base&) const noexcept;
    bool operator!=(const cursor_base&) const noexcept;
    bool operator< (const cursor_base&) const noexcept;
    bool operator> (const cursor_base&) const noexcept;
    bool operator<=(const cursor_base&) const noexcept;
    bool operator>=(const cursor_base&) const noexcept;
};
== implies that both cursors were produced by the same basic_table and both have iterated the same distance.

Cursors keep track of how far they’ve iterated, and so they can also be less-than comparable. If two cursors were not produced by the same basic_table, all of <, >, <= and >= will return false. (Note that this could lead to surprising behavior in programs that aren’t careful, although database users are already aware of a similar situation in that all comparisons to NULL are false.)

Note that a cursor is not even an input iterator:  it lacks pointer and reference types and the -> operator; the unary * operator returns a row by value, not by reference; and as we’ll see shortly, derived classes lack the postfix ++ and -- operators.

Cursors are constructed only by the library, and once they’re constructed, they’re move-only and non-swappable.


4.5.2. The cursor class

For ordinary cursors, cursor_base already does almost everything we need. All we have to add are the iterator category and the prefix ++ operator (and delete the copy constructor and copy-assignment operator because all cursors are move-only).
template<> class basic_cursor<cursor_tag> : public cursor_base
{
public:
    using iterator_category = cursor_tag;

    basic_cursor(const basic_cursor&) = delete;
    basic_cursor& operator=(const basic_cursor&) = delete;

    basic_cursor& operator++();
};
using cursor = basic_cursor<cursor_tag>;


4.5.3. The scrolling_cursor class

A scrolling cursor adds to the usual cursor functionality the ability to move back and forth by arbitrary distances; but note that this does not make it a random-access iterator. It’s still not even an input iterator for reasons already stated.

Furthermore, even a scrolling_cursor, when it’s in the past-the-end state, is permitted to be a singularity that can’t be decremented; so there’s no such thing as a “reverse_scrolling_cursor”. (But operator=(position_type), which provides the equivalent of “fetch absolute” functionality, is required to work correctly, even in the past-the-end state; so it’s possible to make multiple passes on the same scrolling_cursor.)

And note that there’s no overloaded [] operator which would necessarily fetch a row and so reposition the cursor itself, which would seem to be a more serious violation of the principle of least astonishment.

template<> class basic_cursor<scrolling_cursor_tag> : public cursor_base
{
public:
    using iterator_category = scrolling_cursor_tag;

    using difference_type = ptrdiff_t;
    using position_type = size_t;

    static constexpr position_type npos = position_type(-1);

    basic_cursor(const basic_cursor&) = delete;
    basic_cursor& operator=(const basic_cursor&) = delete;

    basic_cursor& operator++(); // “fetch next”
    basic_cursor& operator--(); // “fetch prior”

    basic_cursor& operator+=(difference_type); // “fetch relative”
    basic_cursor& operator-=(difference_type); //       "

    basic_cursor& operator=(position_type); // “fetch absolute”

    // distance between *this and rhs:
    difference_type operator-(const basic_cursor& rhs) const;
};
using scrolling_cursor = basic_cursor<scrolling_cursor_tag>;

For scrolling cursors, the following overloads of standard library iterator operations will run in constant time:

using std::advance;
using std::next;
using std::prev;
using std::distance;

void advance(scrolling_cursor&, scrolling_cursor::difference_type);

scrolling_cursor& next(scrolling_cursor&,
                       scrolling_cursor::difference_type = 1);

scrolling_cursor& prev(scrolling_cursor&,
                       scrolling_cursor::difference_type = 1);

scrolling_cursor::difference_type
  distance(const scrolling_cursor&, const scrolling_cursor&);


4.6. The basic_statement template

A possible argument to basic_statement<>::set():
template<typename T> struct nullval { };  // a “null” value that has a C++ type
using null_number = nullval<int>;         // a common use case
using null_string = nullval<const char*>; //        "
The class template itself:
template<typename ResultType> class basic_statement
{
public:
    using result_type = ResultType;
    using row_count = int;

    explicit basic_statement(connection&, row_count = 0);

    basic_statement(connection&, const string& sql, row_count = 0);

    row_count prefetch_size() const;
    void prefetch_size(row_count);
The optional row_count constructor argument and the prefetch_size(row_count) function give a hint suggesting the number of rows that a SELECT statement might return in a single round trip to the database. Any row_count value not greater than zero is the absence of such a hint.
    void prepare(const string& sql);

    basic_statement& operator=(const string& sql)
    {
        // as if:
        prepare(sql);
        return *this;
    }
Issue:  should the sql arguments be basic_string<>s to provide what database folk call “national language support”? That would seem to be useful only for character literals in database commands, and one obvious workaround is to use placeholders for those and assign values using one of the templates immediately below.
    template<typename T> void set(int placeholder_ordinal, const T& value);
    template<typename T> void set(int placeholder_ordinal, nullval<T>);
    template<typename T> void set_null(int placeholder_ordinal);

    template<typename T> void set(const string& placeholder_name, const T& value);
    template<typename T> void set(const string& placeholder_name, nullval<T>);
    template<typename T> void set_null(const string& placeholder_name);
The set() and set_null() member templates assign values to placeholders in the SQL. Placeholders do not have to be set in any particular order; and once they’re set, the value remains in effect until the SQL changes. (The SQL changes when you call any of prepare(const string&), operator=(const string&), or execute(const string&).)

Note that set_null()’s template argument can’t be deduced.

Placeholder ordinals are zero-based like array indices (not one-based like in Java’s PreparedStatement.setFoo() zoo).

Setting placeholders by name is an optional feature, and any set(const string&) will throw a db_error if that’s not supported.

    basic_statement& execute()
    {
        // Do the deed.
        return *this;
    }

    basic_statement& execute(const string& sql) // “execute immediate”
    {
        // as if:
        prepare(sql);
        return execute();
    }

    result_type result();
The execute() member function doesn’t return a result because not every database operation does; even when it does, we might not care; and constructing a result_type instance might be expensive. Instead, we have a separate result() member function that constructs and returns the result; and execute() returns *this so that we can write “execute().result() when that’s what’s wanted.
    template<typename ...Args> result_type operator()(const Args&...);
    template<typename ...Args> result_type run(const Args&...);
};

The overloaded function-call operator assigns zero or more values to placeholders in the order in which they’re declared in the SQL, executes the statement, and returns the result. If it’s called with no argument, it’s just return execute().result().

The run() template is similar to operator()(), except that it takes pairs of arguments: a placeholder ordinal or name followed by the value that should be assigned. This template must be called with an even number of arguments, otherwise you’ll get a compile-time error.


4.6.1. basic_statement<ResultType> aliases

DDL statements (CREATE TABLE, etc.) don’t return results; and sometimes we don’t care about results:

    using statement = basic_statement<void>;
DML statements (INSERT, UPDATE, DELETE) return row counts, so when we care:
    using dml_statement = basic_statement<unsigned int>;
SELECT statements return tables (zero or more rows):
    using query = basic_statement<table>;
    using scrolling_query = basic_statement<scrolling_table>;
Some databases have stored procedures and functions; and calling them can return zero or more data, which is just what a row is:
    using call_statement = basic_statement<row>;


4.7. Error handling


4.7.1. The db_error class

class db_error : public std::logic_error
{
public:
    static constexpr size_t statelen = implementation-defined;

    db_error(const char*   message, int sqlcode, const char*   sqlstate = nullptr);
    db_error(const char*   message, int sqlcode, const string& sqlstate);
    db_error(const string& message, int sqlcode, const char*   sqlstate = nullptr);
    db_error(const string& message, int sqlcode, const string& sqlstate);

    // like logic_error, noexcept-copyable:
    db_error(const db_error&) = default;

    // but neither moveable nor assignable:
    db_error(db_error&&) = delete;
    db_error& operator=(const db_error&) = delete;
    db_error& operator=(db_error&&) = delete;

    const char* errmsg() const noexcept;   // returns message
    int vendor_code() const noexcept;      // returns sqlcode
    const char* db_state() const noexcept; // returns sqlstate (but never nullptr)
};
It’s not clear whether a db_error should be a logic error or a runtime error; but in this author’s experience, exceptions get thrown when he goofs; so it’s most likely a logic error.

message is passed to the base class constructor and so is also returned by logic_error::what().

A const char* sqlstate may be nullptr, and defaults to nullptr, in which case db_state() will return a valid pointer to a '\0' (an empty C-style string).

The implementation may quietly truncate sqlstate to statelen characters. For example, an implementation that assumes connecting to an SQL database would probably truncate it to five characters since that’s what SQL’s SQLSTATE is. (The intent of this permission is to allow implementations to store state strings in fixed-length arrays of char and so not have to worry about allocating memory without throwing exceptions.)

The implementation may also define zero or more error codes and state strings for db_errors that it can throw on its own authority, and it will define ranges of error codes and state strings available to users of the library. For example, an implementation for an SQL database might reserve for its own use error codes ascending from INT_MIN+1 up to some maximum value and SQLSTATE classes "90" and "91". Users could then use codes from the stated maximum value to some value specified by the database vendor and the other SQLSTATE classes reserved for the users.

enum class error_code
{
    min_err, // implementation-defined most-negative value used by the library
    max_err, // implementation-defined most-positive value used by the library
    // other implementation-defined values
};

If the implementation throws no exceptions on its own authority, min_err and max_err can both be INT_MIN.

There’s a convenience function that returns a SQLSTATE string that corresponds to an error_code (or nullptr if the argument is invalid):

const char* error_state(error_code);

The library also supplies convenience functions for throwing db_errors. The first four have the same arguments as db_error’s constructors.

[[noreturn]] void throw_err(const char*,   int, const char* = nullptr);
[[noreturn]] void throw_err(const char*,   int, const string&);
[[noreturn]] void throw_err(const string&, int, const char* = nullptr);
[[noreturn]] void throw_err(const string&, int, const string&);

[[noreturn]] void throw_err(const char*,   error_code);
[[noreturn]] void throw_err(const string&, error_code);


4.7.2. The logger class

The library provides an abstract base class from which users may derive a custom error logging mechanism.
class logger
{
public:
    static logger& get() noexcept;
    static logger* set(logger*) noexcept;

The static get() function, by default, returns a reference to a library-supplied logger that just writes all messages to the standard error device (cerr) in a thread-safe manner.

You can cause get() to return an instance of your own derived class by passing a pointer to it to set() which returns a pointer to the previously installed logger. Passing nullptr won’t turn off logging, but rather will restore the library-supplied default.

Both get() and set(logger*) are atomic operations, so they’re thread-safe, and they never throw exceptions.

    using memfuncptr = bool (logger::*)(const string&);
protected:
    // a helper to create a string from a db_error and log it:
    virtual bool dberr(const db_error& err, memfuncptr func) noexcept
    {
        // as if:
        try
        {
            std::ostringstream ostr;
            ostr << "Error " << err.db_state()
                 << " (" << err.vendor_code()
                 << "):  " << err.errmsg();
            return (this->*func)(ostr.str());
        }
        catch (...) // NB: noexcept
        {
            return false;
        }
    }

The protected dberr function creates a single string from a db_error instance and calls the member function a pointer to which is passed as the second argument. You may override this function if you want behavior that’s different from what’s shown above.

    constexpr logger() noexcept { }
public:
    virtual ~logger() noexcept;

    virtual bool error(const char*) noexcept = 0;
    virtual bool error(const string& msg) noexcept
    {
        // as if:
        return error(msg.c_str());
    }
    virtual bool error(const db_error& err) noexcept
    {
        // as if:
        return dberr(err, &logger::error(const string&));
    }

    virtual bool warning(const char* msg) noexcept
    {
        // as if:
        return error(msg);
    }
    virtual bool warning(const string& msg) noexcept
    {
        // as if:
        return warning(msg.c_str());
    }
    virtual bool warning(const db_error& err) noexcept
    {
        // as if:
        return dberr(err, &logger::warning(const string&));
    }

    // with similar default definitions:
    virtual bool info(const char*) noexcept;
    virtual bool info(const string&) noexcept;
    virtual bool info(const db_error&) noexcept;
};

Note that only error(const char*) is pure virtual, so that’s the only one that you must override. The others all have default implementations that eventually call error(const char*) in what are intended to be obvious ways. You may override any of them if you want them to have special behavior.

All the member functions that write error messages are noexcept and return bools indicating whether the message was written successfully. (If you’re writing an error message, you might well be handling an exception already, in which case you certainly wouldn’t want to throw another one.)

As stated, the library-supplied default logger is thread-safe, but you might have some work to do if you need a thread-safe custom logger. (Aside from the static get() and set() functions, the abstract logger class has no thread safety built in.)


5. Quick tutorials by example

Here are a couple of toy programs that illustrate some of the library’s features. They assume that we’re using the legendary SCOTT database.

These examples assume that the database command language is SQL and that placeholders in the SQL are just question marks. That might not be the case in real life.

(Both examples are written rather stupidly, but decent code wouldn’t illustrate the features I want to point out.)


5.1. Example 1

The following program generates an across-the-board salary reduction of ten dollars for everyone in the Research department.  Don’t ask Management why.

#include <string>    // the standard string class
#include <cstdlib>   // EXIT_SUCCESS, EXIT_FAILURE
using namespace std; // for brevity in this toy program

#include "dbacc.hpp" // the author’s proof of concept
using namespace dbacc;

int main()
{
    try
    {
        connection cn("scott/tiger");
        cn.auto_commit(false); // just in case

        query qry(cn,
                  "SELECT empno, sal FROM emp, dept "
                  "WHERE emp.deptno = dept.deptno AND dept.dname = ?",
                  10);

        statement upd(cn, "UPDATE emp SET sal = ? WHERE empno = ?");

        int empno, sal;

        transaction tr(cn);

        //
        // A query’s result is a table (a range of rows), so we can use
        // a range-based for loop.  But note that the cursor’s operator*
        // returns a row by value, so there’s no point in writing “const row&”.
        //
        for (row r : qry("RESEARCH"))
        {
            r.into(empno, sal);   // stores the row’s columns into C++ variables
            upd(sal - 10, empno); // sets placeholders and executes
        }

        cn.commit();

        //
        // tr’s destructor will call cn.rollback() when we exit the try block;
        // but that’s OK since we just committed everything.
        //
        // Note that the normal C++ rules about unwinding the stack assure
        // that the transaction will be destructed before the connection is.
        //

        return EXIT_SUCCESS;
    }
    catch (const db_error& e)
    {
        //
        // If we get here, the transaction’s destructor
        // will have already rolled back any changes.
        //
        logger::get().error(e);
        return EXIT_FAILURE;
    }
}


5.2. Example 2

Give a ten dollar raise to those employees in a department specified by a command-line argument. Write a message to the standard output device showing the number of employees getting the raises.

#include <exception>
#include <iostream>
#include <sstream>
#include <cstdlib>
using namespace std;

#include "dbacc.hpp"
using namespace dbacc;

namespace {

//
// A custom logger that writes info messages to cout,
// but everything else to cerr:
//
class my_logger : public logger
{
    static bool write(ostream& os, const char* msg) noexcept
    {
        try
        {
            return (os << msg << '\n').good();
        }
        catch (...)  // NB: noexcept
        {
            return false;
        }
    }

public:
    bool error(const char* msg) noexcept // We must override this
    {                                    // because it’s pure-virtual.
        return write(cerr, msg);
    }
    bool info(const char* msg) noexcept  // This is the special behavior
    {                                    // that we want.
        return write(cout, msg);
    }
    // logger’s default behavior for all the rest
};

} // anonymous namespace

int main(int argc, char** argv)
{
    my_logger log;
    logger::set(&log); // Make it the default because the middleware
                       // might log messages of its own.

    static constexpr int mindep = 10, maxdep = 40;
    int dep = 0;
    if (argc == 2)
    {
        dep = atoi(argv[1]);
    }
    if (dep < mindep || dep > maxdep || dep % 10 != 0)
    {
        log.error("Usage:  tutorial2 department-number");
        return EXIT_FAILURE;
    }

    try
    {
        connection cn("scott/tiger");
        cn.auto_commit(false);

        query qry(cn, "SELECT empno FROM emp");
        qry.prefetch_size(10); // We can set this after construction.

        dml_statement upd(cn, "UPDATE emp SET sal = sal + 10 WHERE empno = ? AND deptno = ?");

        upd.set(1, dep); // Placeholder ordinals are zero-based,
                         // so this is the value for deptno.

        dml_statement::result_type empcnt = 0;

        transaction tr(cn);

        for (row r : qry())
        {
            empcnt += upd(r["empno"].get<int>()); // see below
        }

        cn.commit();

        ostringstream os;
        os << "We gave raises to " << empcnt << " employees.";

        log.info(os.str()); // Calls logger::info(const string&) which,
                            // in turn, calls the virtual info(const char*)
                            // which we’ve overridden above.

        return EXIT_SUCCESS;
    }
    catch (const db_error& e)
    {
        log.error(e);
        return EXIT_FAILURE;
    }
    catch (const exception& e) // Could something else bad have happened?
    {
        log.error(e.what());   // If so, just write the message.
        return EXIT_FAILURE;
    }
    catch (...)
    {
        log.error("A non-standard exception was thrown.");
        return EXIT_FAILURE;
    }
}

About “empcnt += upd(r["empno"].get<int>())” in the for loop:


6. A sample implementation

This section is about a proof of concept that’s comming Real Soon Now. As of this writing, it’s still vaporware. When it’s finished, it’ll probably work as described below.

It’s intended that this library be proposed to ISO/IEC JTC1 SC22 WG21 for possible international standardization.

But it’s not WG21’s business to say how to communicate with databases; that’s SC32 WG3’s job; so this proof of concept attempts to show that the proposed library is not in conflict with WG3’s ISO/IEC 9075 by implementing the library on top of ODBC.

<if you care>
ISO/IEC 9075-3 specifies an API called “Structured Query Language Call-Level Interface (SQL/CLI)”. Its Annex H1 specifies a C binding that is to ODBC much like ISO/IEC 10646 is to Unicode (modulo de facto vs. de jure).
</if you care>

This initial proof of concept is mostly oblivious to internationalization and localization:  the only standard library string type supported is std::string, and string literals may be only const char*s. There’s also no support for the IEEE 754 decimal floating-point types; and connections and statements may not be shared among threads. A real implementation would need to do a much better job.


6.1. Implementation-defined features


6.2. Compiling

This version of the library requires at least C++11.

The library’s sources are will be available as dbacc.tar.gz (with POSIX newlines), and dbacc.zip (with Teletype® newlines for Windows®).

More coming Real Soon Now.


Appendix A: User-defined implementations

Not documented above are:
namespace dbacc {
namespace impl {

class conn;
class stmt;

class libloader final
{
public:
    constexpr libloader(const char* dynamic_library_filename,
                        const char* conn_creation_function_name) noexcept;
    void unload();
    // ...
};

} // namespace impl
two libloaders that the library supplies:
extern impl::libloader sql_loader, // loads ODBC middleware
                       web_loader; // loads WEB client support
and the three connection constructors with the final argument defaulting:
class connection final
{
public:
    connection(const string&, const string&, const string&, impl::libloader& = sql_loader);
    explicit connection(const string&, impl::libloader& = sql_loader);
    explicit connection(void (*)(string&,string&,string&), impl::libloader& = sql_loader);
    // ...
};

} // namespace dbacc
conn is an abstract base class for a class that actually implements the library’s connection class; stmt is an abstract base class for a class that actually implements the library’s basic_statement template along with most of the rest of the library.

Your two derived classes are expected to be in a dynamic library of some kind that the libloader class will lazily load the first time it’s needed. By default, the implementation is expected to be a POSIX shared library, but if we’re running under Microsoft® Windows® (including under Cygwin), the implementation should be a Windows DLL. See this library’s implementation files for examples of how to write a custom implementation. (More helpful documentation is coming Real Soon Now. For a general outline, see below.)

libloader instances are intended to be singletons. They’re neither copyable nor moveable, and you can’t derive another class from it. If you want to use a custom implementation, you should define its libloader at namespace scope with string literals as arguments so that the constructor really is constexpr, and the libloader itself doesn’t get destructed until the end of the complete program.

If your dynamic library actually does get loaded, the libloader’s destructor will unload it.  You can also call the unload() function to unload the dynamic library in a thread-safe manner at some earlier time if you’re sure that all connections and statements that use it have been destructed and you won’t be constructing any more.

Warning:  the constructor just copies its argument pointers, so the actual strings must survive at least the construction of the first connection instance that uses the implementation. (Constructing the first connection lazily loads your dynamic library.) Once you’ve constructed the first connection that uses the dynamic library, and you haven’t explicitly unloaded it by calling unload(), the libloader constructor’s arguments will no longer be needed, and any number of connections may be constructed and destructed in a thread-safe manner. As long as your libloader instance is constructed at namespace scope, it won’t be destructed until the end of the complete program, so this shouldn’t be an issue.

The libloader constructor’s first argument is a full or relative path to your shared library or DLL.

The libloader constructor’s second argument is the name of a function in your dynamic library’s global namespace that takes no argument and returns a pointer to a newed instance of your conn subclass. It might need some decoration depending on what operating system you’re compiling for. In Windows, for example:

    extern "C" __declspec(dllexport) dbacc::impl::conn* your_connection_function()
    {
        return new your_conn_subclass();
    }
Other code will open the actual connection given the credential(s) passed to the dbacc::connection constructor.


The general idea for creating a custom implementation is:

#include "dbacc.hpp" // the author’s proof of concept

// other headers, possibly O/S-specific

namespace
{
    class my_connection : public dbacc::impl::conn
    {
        // ...
    };

    class my_statement : public dbacc::impl::stmt
    {
        // ...
    };
}

// then in the global namespace:

extern "C" DBACC_EXPORT_FUNCTION dbacc::impl::conn* create_my_connection()
{
    return new my_connection();
}

// And you might also need:

#ifdef _MSC_VER
bool __stdcall DllMain(const void*, unsigned long, const void*) { return true; }
#endif
The second argument to your libloader’s constructor would be "create_my_connection".

The DBACC_EXPORT_FUNCTION macro is defined in dbacc_config.hpp as:

#ifndef DBACC_EXPORT_FUNCTION
  #if defined(_MSC_VER) || defined(__CYGWIN__)
    #ifdef __GNUC__
      #define DBACC_EXPORT_FUNCTION __attribute__((dllexport))
    #else
      #define DBACC_EXPORT_FUNCTION __declspec(dllexport)
    #endif
  #elif defined(__GNUC__) && __GNUC__ >= 4
    #define DBACC_EXPORT_FUNCTION __attribute__((visibility("default")))
  #else
    #define DBACC_EXPORT_FUNCTION
  #endif
#endif

Note that you might need to provide your own definition prior to #include "dbacc.hpp" if you're doing something other than using the Microsoft compiler to compile a Windows program, or using Gnu C++ to compile a POSIX or Cygwin program.


All suggestions and corrections will be welcome; all flames will be amusing.
Mail to stdbill.h@pobox.com.