Making Your Own Container Compatible With C++20 Ranges
With some of my spare time lately, I’ve been enjoying learning about some of the new features in
C++20. Concepts and the closely-related
requires
clauses are two great
extensions to template syntax that remove the necessity for all the SFINAE junk we used to have to
do, making our code both more readable and more precise, and providing much better error messages
(although MSVC has sadly been lagging in the error messages department,
at the time of this writing).
Another interesting C++20 feature is the addition of the ranges library (also ranges algorithms), which provides a nicer, more composable abstraction for operating on containers and sequences of objects. At the most basic level, a range wraps an iterator begin/end pair, but there’s much more to it than that. This article isn’t going to be a tutorial on ranges, but here’s a talk to watch if you want to see more of what it’s all about.
What I’m going to discuss today is the process of adding “ranges compatibility” to your own container class. Many of the C++ codebases we work in have their own set of container classes beyond the STL ones, for a variety of reasons—better performance, more control over memory layouts, more customized interfaces, and so on. With a little work, it’s possible to make your custom containers also function as ranges and interoperate with the C++20 ranges library. Here’s how to do it.
Making Your Container an Input Range
At the high level, there are two basic ways that a container class can interact with ranges. First, it can be readable as a range, meaning that we can iterate over it, pipe it into views and pass it to range algorithms, and so forth. In the parlance of the ranges library, this is known as being an input range: a range that can provide input to other things.
The other direction is to accept output from ranges, storing the output into your container. We’ll do that later. To begin with, let’s see how to make your container act as an input range.
Range Concepts
The first decision we have to make is what particular kind of input range we can model. The C++20 STL defines a number of different concepts for ranges, depending on the capabilities of their iterators and other things. Several of these form a hierarchy from more general to more specific kinds of ranges with tighter requirements. Generally speaking, it’s best for your container to implement the most specific range concept it’s able to. This enables code that works with ranges to make better decisions and use more optimal code paths. (We’ll see some examples of this in a minute.)
The relevant input range concepts are:
std::ranges::input_range
: the most bare-bones version. It requires only that you have iterators that can retrieve the contents of the range. In particular, it doesn’t require that the range can be iterated more than once: iterators are not required to be copyable, andbegin
/end
are not required to give you the iterators more than once. This could be an appropriate concept for ranges that are actually generating their contents as the result of some algorithm that’s not easily/cheaply repeatable, or receiving data from a network connection or suchlike.std::ranges::forward_range
: the range can be iterated as many times as you like, but only in the forward direction. Iterators can be copied and saved off to later resume iteration from an earlier point, for example.std::ranges::bidirectional_range
: iterators can be decremented as well as incremented.std::ranges::random_access_range
: you can efficiently do arithmetic on iterators—you can offset them forward or backward by a given number of steps, or subtract them to find the number of steps between.std::ranges::contiguous_range
: the elements are actually stored as a contiguous array in memory; the iterators are essentially fancy pointers (or literally are just pointers).
In addition to this hierarchy of input range concepts, there are a couple of other standalone ones worth mentioning:
std::ranges::sized_range
: you can efficiently get the size of the range, i.e. how many elements from begin to end. Note that this is a much looser constraint thanrandom_access_range
: the latter requires you be able to efficiently measure the distance between any pair of iterators inside the range, whilesized_range
only requires that the size of the whole range is known.std::ranges::borrowed_range
: indicates that a range doesn’t own its data, i.e. it’s referencing (“borrowing”) data that lives somewhere else. This can be useful because it allows references/iterators into the data to survive beyond the lifetime of the range object itself.
The reason all these concepts are important is that if I’m writing code that operates on ranges, I might need to
require some of these concepts in order to do my work efficiently. For example, a sorting routine
would be very difficult to write for anything less than a random_access_range
(and indeed you’ll
see that std::ranges::sort
requires that).
In other cases, I might be able to do things more optimally when the range satisfies certain
concepts—for instance, if it’s a sized_range
, I could preallocate some storage for results,
while if it’s only an input_range
and no more, then I’ll have to dynamically reallocate, as I have
no idea how many elements there are going to be.
The rest of the ranges library is written in terms of these concepts (and you can write your own code that operates generically on ranges using these concepts as well). So, once your container satisfies the relevant concepts, it will automatically be recognized and function as a range!
In C++20, concepts act as boolean expressions, so you can check whether your container satisfies the concepts you expect by just writing asserts for them:
#include <ranges>
static_assert(std::ranges::forward_range<MyCoolContainer<int>>);
// int is just an arbitrarily chosen element type, since we
// can't assert a concept for an uninstantiated template
Checks like this are great to add to your test suite—I’m big in favor of writing compile-time tests for generic/metaprogramming stuff, in addition to the usual runtime tests.
However, when you first drop that assert into your code, it will almost certainly fail. Let’s see now what you need to do to actually satisfy the range concepts.
Defining Range-Compatible Iterators
In order to satisfy the input range concepts, you need to do two things:
- Have
begin
andend
functions that return some iterator and sentinel types. (We’ll discuss these in a little bit.) - The iterator type must satisfy the iterator concept that matches your range concept.
Each one of the concepts from input_range
down to contiguous_range
has a corresponding
iterator concept:
std::input_iterator
, std::forward_iterator
, and so on. It’s these concepts that contain the real
meat of the requirements that define the different types of ranges: they list all the operations
each kind of iterator must support.
To begin with, there are a couple of member type aliases that any iterator class will need to define:
difference_type
: some signed integer type, usuallystd::ptrdiff_t
value_type
: the type of elements that the iterator references
The second one seems pretty understandable, but I honestly have no idea why the difference_type
requirement is here. Taking the difference between iterators doesn’t make sense until you get to
random-access iterators, which actually define that operation. As far as I can tell, the
difference_type
for more general iterators isn’t actually used by anything. Nevertheless,
according to the C++ standard, it has to be there. It seems that the usual idiom is to set it to
std::ptrdiff_t
in such cases, although it can be any signed integer type.
(Technically you can also define these types by specializing std::iterator_traits
for your iterator,
but here we’re just going to put them in the class.)
Beyond that, the requirements for std::input_iterator
are pretty straightforward:
- The iterator must be default-initializable and movable. (It doesn’t have to be copyable.)
- It must be equality-comparable with its sentinel (the value marking the end of the range). It doesn’t have to be equality-comparable with other iterators.
- It must implement
operator ++
, in both preincrement and postincrement positions. However, the postincrement version does not have to return anything. - It must have an
operator *
that returns a reference to whatever thevalue_type
is.
One point of interest here is that the default-initializable requirement means that the iterator class can’t contain references, e.g. a reference to the container it comes from. It can store pointers, though.
A prototype input iterator class could look like this:
template <typename T>
class Iterator
{
public:
using difference_type = std::ptrdiff_t;
using value_type = T;
Iterator(); // default-initializable
bool operator == (const Sentinel&) const; // equality with sentinel
T& operator * () const; // dereferenceable
Iterator& operator ++ () // pre-incrementable
{ /*do stuff...*/ return *this; }
void operator ++ (int) // post-incrementable
{ ++*this; }
private:
// implementation...
};
For a std::forward_iterator
, the requirements are just slightly tighter:
- The iterator must be copyable.
- It must be equality-comparable with other iterators of the same container.
- The postincrement operator must return a copy of the iterator before modification.
A prototype forward iterator class could look like:
template <typename T>
class Iterator
{
public:
// ...same as the previous one, except:
bool operator == (const Iterator&) const; // equality with iterators
Iterator operator ++ (int) // post-incrementable, returns prev value
{ Iterator temp = *this; ++*this; return temp; }
};
I’m not going to go through the rest of them in detail; you can read the details on cppreference.
Begin, End, Size
Once your container is equipped with an iterator class that satisfies the relevant concepts, you’ll
need to provide begin
and end
functions to get those iterators. There are three ways to do this:
they can be member functions on the container, they can be free functions that live next to the
container in the same namespace, or they can be “hidden friends”;
they just need to be findable by ADL.
The return types from begin
and end
don’t have to be the same. In some cases, it can be useful
to have end
return a different type of object, a “sentinel”, which isn’t actually an iterator; it
just needs to be equality-comparable with iterators, so you can tell when you’ve gotten to the end
of the container.
Also, these are the same begin
/end
used for range-based for
loops.
One oddity worth mentioning here is that if you go the free/friend functions route, you’ll need to add overloads for both const and non-const versions of your container:
class MyCoolContainer;
auto begin(const MyCoolContainer& c);
auto end(const MyCoolContainer& c);
auto begin(MyCoolContainer& c);
auto end(MyCoolContainer& c);
You might think it would be enough to provide just the const overloads, but if you do that, only the const version of the container will be recognized as a range! The non-const overloads must be present as well for non-const containers to work.
Curiously, if you provide begin
/end
as member functions instead, then this doesn’t come up:
const overloads will work for both.
This behavior is surprising, and I’m not sure if it was intended. However, it’s worth noting that
iterators generally need to remember the constness of the container they came from: a const
container should give you a “const iterator” that doesn’t allow mutating its elements. Therefore,
the const and non-const overloads of begin
/end
will generally need to return different
iterator types, and so you’ll need to have both in any case. (The exception would be if you’re
building an immutable container; then it only needs a const iterator type.)
In addition to begin
and end
, you’ll also want to implement a size
function, if applicable.
Again, this can be either a member function, a free function, or a hidden friend. The
presence of this function satisfies std::ranges::sized_range
, which (as mentioned earlier) can
enable range algorithms to operate more efficiently.
So, to sum up: to allow your custom container class to be readable as a range, you’ll need to:
- Decide which range concept(s) you can model, which mainly comes down to what level of iterator capabilities you can provide;
- Implement iterator classes (both const and non-const, if applicable) that fulfill all the requirements of the chosen iterator concept;
- Implement
begin
,end
, andsize
functions.
Once we’ve done this, the ranges library should recognize your container as a range. It will automatically be accepted by range algorithms, we can take views of it, we can iterate over it in range-for loops, and so on.
As before, you can test that you’ve done everything correctly by asserting that your container satisfies the expected range concepts. If you’re working with gcc or clang, this will even give you some pretty reasonable error messages if you didn’t get it right! (In MSVC, for the time being, you’ll have to narrow down errors by popping open the hood and asserting each of the concept’s sub-clauses one at a time, to see which one(s) failed.)
Accepting Output From Ranges
We’ve discussed how to make a custom container serve as input to the C++20 ranges library. Now, we need to come back to the other direction: how to let your container capture output from the ranges library.
There are a couple of different forms this can take. One way is to accept generic ranges as parameters to a constructor (or other methods, such as append or insert methods) of your container class. This allows, for example, easily converting other containers (that are also range-compatible) to your container. It also allows capturing the output of a ranges “pipeline” (a series of views chained together).
Another form of range output, which comes up with certain of the range algorithms, is via output iterators, which are iterators that allow storing or inserting values into your container.
Constructor From A Range
To write a constructor (or other method) that takes a generic range parameter, we can use the same range concepts we saw earlier. One neat new feature in C++20 is writing functions with a parameter type (or return type) constrained to match a given concept. The syntax looks like this:
#include <ranges>
class MyCoolContainer
{
public:
explicit MyCoolContainer(std::ranges::input_range auto&& range)
{
for (auto&& item : range)
{
// process the item
}
}
};
The syntax concept-name auto
for the parameter type reminds us that concepts aren’t types; this
is still, under the hood, a template function that’s performing argument type deduction (hence the
auto
). In other words, the above is syntactic sugar for:
template <std::ranges::input_range R>
explicit MyCoolContainer(R&& range)
{
// ...
}
which is in turn sugar for:
template <typename R>
requires(std::ranges::input_range<R>)
explicit MyCoolContainer(R&& range)
{
// ...
}
I prefer the shorthand std::ranges::input_range auto
syntax, but at the time of this writing
MSVC’s support for it is still shaky. (Update: fixed in 16.10! 😊) If in doubt, use
the syntax template <std::ranges::input_range R>
.
In any case, constraining the parameter type to satisfy input_range
allows this constructor
overload to accept anything out there that implements begin
, end
, and iterators, as we’ve seen
in previous sections. You can then iterate over it generically and do whatever you want with the
results.
The range parameter is declared as auto&&
to make it a universal reference,
meaning that it can accept either lvalues or rvalues; in particular, it can accept the result of a
function call returning a range, and it can accept the result of a pipeline:
MyCoolContainer c{ another_range |
std::views::transform(blah) |
std::views::filter(blah) };
A completely generic range-accepting method like this might not be the most useful thing. If we have
a container storing int
values, for example, it wouldn’t make a lot of sense for us to accept
ranges of strings or other arbitrary types. We’d like to be able to put some additional constraints
on the element type of the range: perhaps we only want element types that are convertible to int
.
Helpfully, the ranges library provides a template range_value_t
that retrieves the element type of a range—namely, the value_type
declared by the range’s
iterator. With this, we can state additional constraints like so:
explicit MyCoolContainer(std::ranges::input_range auto&& range)
requires(std::convertible_to<std::ranges::range_value_t<decltype(range)>, int>)
{
// ...
}
We can even define a concept that wraps up these requirements:
template <typename R, typename T>
concept input_range_of =
std::ranges::input_range<R> &&
std::convertible_to<std::ranges::range_value_t<R>, T>;
and then use it as follows:
explicit MyCoolContainer(input_range_of<int> auto&& range)
{
// ...
}
Something like this should be in the standard library, IMO.
You can also choose to require one of the more specialized concepts, like forward_range
or
random_access_range
, if you need those extra capabilities for whatever you’re doing.
However, just as a container should generally implement the most specific range concept it can
provide, a function that takes a range parameter should generally require the most general range
concept it can deal with, or it will unduly restrict what kind of ranges can be passed to it.
That said, there might be cases where you can switch to a more efficient implementation if the range
satisfies some extra requirements. For example, if it’s a sized_range
, then you might be able to
reserve storage before inserting the elements. You can test for this inside your function body using
if constexpr
:
explicit MyCoolContainer(input_range_of<int> auto&& range)
{
if constexpr (std::ranges::sized_range<decltype(range)>)
{
reserve(std::ranges::size(range));
}
for (auto&& item : range)
{
// process the item
}
}
Here, std::ranges::size
is a convenience wrapper
that knows how to call the range’s associated size
function, whether it’s implemented as a method
or a free function.
You could also do things like: check if the range is a contiguous_range
and the item is something
trivially copyable, and switch to memcpy
rather than iterating over all the items.
Output Iterators
Range views and pipelines operate on a “pull” model, where the pipeline is represented by a proxy
range object that generates its results lazily when you iterate it. Taking generic range objects as
parameters to your container is an easy and useful way to consume such objects, and that probably suffices
for most uses. However, there are a handful of bits in the ranges library that operate on a “push”
model, where you call a function that wants to store values into your container via an output
iterator. This comes up with certain ranges algorithms
like ranges::copy
, ranges::transform
, and ranges::generate
.
Personally, I don’t see a hugely compelling reason to worry about these, as it’s also possible to use views to express the same operations; but for the sake of completeness, I’ll discuss them briefly here.
At this point, it won’t surprise you to learn that just as there were concepts for input ranges,
there are also concepts std::ranges::output_range
and std::output_iterator
.
In this case there’s just that one concept, not a hierarchy of refinements of them; however, if you
peruse the definitions of some of the ranges algorithms, you’ll find that many of them don’t actually
use output_iterator
, but state slightly different, less- or more-specific requirements of their
own. (This part of the standard library feels a little less fully baked than the rest; I wouldn’t be
surprised if some of this gets elaborated or polished a bit more in C++23 or later revisions.)
The requirements for an output iterator (broadly construed) are very similar to those for an input
iterator, only adding that the value returned by dereferencing the iterator must be writable by
assigning to it: you must be able to do *iter = foo;
for some appropriate type of foo
. If you’ve
implemented a non-const input iterator, it probably satisfies the requirement already.
It’s also possible to do slightly more exotic things with an output iterator, like returning a proxy
object that accepts assignment and does “something” with the value assigned. An example of this is
the STL’s std::back_insert_iterator
,
which takes whatever is assigned to it and appends to its container (as opposed to overwriting an
existing value in the container). The STL has a few more things like that, including an iterator
that writes characters out to an ostream
.
There are also some cases amongst the ranges algorithms of “input-output” iterators, such as for
operations that reorder a range in place, like sorting. These often have a bidirectional or
random-access iterator requirement, plus needing the dereferenced types to be swappable, movable,
and varying other constraints. Those details probably aren’t going to be relevant to you unless
you’re doing something tricky, like making a container that generates elements on the fly somehow,
or returns proxy objects rather than direct references to elements (like std::vector<bool>
).
Conclusion
The C++20 ranges library provides a lot of powerful, composable tools for manipulating sequences of objects, and a range of specificity from the most generic and abstract container-shaped things down to the very concrete, efficient, and practical. When working with your own container types, it would be nice to be able to take advantage of these tools.
As we’ve seen, it’s hardly an onerous task to implement ranges compatibility for your own containers.
Most of the necessaries are things you were probably already doing: you probably already had an
iterator class and begin/end methods. It only takes a little bit of attention to satisfying certain
details—like adding the difference_type
and value_type
aliases, and making sure you can both
preincrement and postincrement—to make your iterators satisfy the STL iterator concepts, and thus
have your containers recognized as ranges. It’s also no sweat to write functions accepting generic
ranges as input, letting you store the output of other range operations into your container.
I hope this has been a useful peek under the hood and has given you some ideas about how your container classes can benefit from the new C++20 features.