Creating an input range

Input ranges are D's improvement to the iterators of C++ that are safer, potentially more efficient, and can also generate their own data. Here, we'll create an input range that generates numbers in a Fibonacci sequence to see how it can be done and briefly explore how well it integrates with std.algorithm automatically.

How to do it…

Let's execute the following steps to create an input range that generates numbers in a Fibonacci sequence:

  1. Create a struct with, minimally, the three core input range members: the properties front, the properties of empty, and the function popFront. It should also hold whatever state is necessary for iteration with successive calls to popFront.
  2. Add all other functionality as possible without breaking the complexity guarantee. Our Fibonacci generator can also implement a simple save property, so we will add it, upgrading it to a forward range.
  3. Test it with static assert for the interface you need and perform unit tests.
  4. Write a helper method to create it.
  5. Use it! We'll print out the first several entries with std.range.take and writeln.

The code is as follows:

import std.range;
struct FibonacciRange {
    // input range required methods
    // Fibonacci is an infinite sequence, so it is never empty
    enum bool empty = false;
    @property int front() { return current; }
    void popFront() {
        previous = current;
        current = next;
        next = current + previous;
    }
    // data members for state between calls
    private {
        int previous = 0;
        int current = 0;
        int next = 1;
    }
    // other range functions we can implement cheaply
    @property FibonacciRange save() { return this; }
}
// explicit check that it fulfils the interface we want
static assert(isForwardRange!FibonacciRange);

// helper function to create the range
FibonacciRange fibonacci() {
     return FibonacciRange();
}

void main() {
    import std.stdio;
    auto numbers = fibonacci();
    writeln(numbers.take(15));
}

Running it will print the first 15 Fibonacci numbers as follows:

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]

How it works…

Input ranges are defined in terms of two required properties and one required function, and they can be extended with a number of optional functions and properties. Any object that implements the functions and properties is considered an input range.

The three required members are front, empty, and popFront. The front property is the current item at the front at the list. The empty property is true when there are no more items left. The popFront member advances to the next item.

Note

The front property should always be valid unless the range is empty, including immediately after construction. For some ranges, this means you should prime it by calling popFront in the constructor, or you can do what we did here and set up valid data with the variable's initializers.

As they can be called multiple times in a single loop, front and empty should always be very fast properties or, when appropriate, direct data members. The popFront property should execute as fast as possible, but it is also permitted to perform expensive computations to prepare the next value if necessary.

The optional range primitives, such as save, which returns a new range that is a copy of the current range and can be advanced independently, or opIndex, which offers random access to the range's contents, should only be implemented if you can do so quickly (O(1) algorithmic complexity). As a general rule, if your design for save has to perform a large memory copy or allocation, don't implement it. The typical save property does exactly what we did here: return this inside a struct. Assigning a struct automatically performs a shallow copy—a fast operation. A struct copy will be usable independently as long as there are no mutable references. Here, our only members are integers, so the copy works.

Generally, if you are writing a loop inside a range primitive, you should rethink if your range really supports that function. Range primitives are typically used in loops, so a loop inside these functions risks higher algorithmic complexity than the consumer expects, and that will make the function slow. We'll experiment with this later in this chapter.

While implementing the range properties, you may implement them as a data member, an enum, or a @property function. The @property functions are the most common. The @property function is a function that can be replaced with a data member, ideally without breaking any code. They aren't quite interchangeable as the @property function can restrict access to the variable (for example, a getter function without a corresponding setter results in a read-only variable to which references cannot exist—something impossible with regular data members). However, for the most part, the goal of a @property function is to look like a variable to the outside world.

The enum storage class in D is also similar to a data member, but is different in one important way; an enum variable, at the usage point, is identical to a data literal and not a variable. If you write enum a = 10; int b = a;, the compiler will translate that into int b = 10;. It is impossible to get a reference or pointer to an enum type, just like it is impossible to get a pointer to an integer literal; neither have an address in memory.

The enum variables are interesting because their value is always accessible at compile time and the fact that they are enum variables is also apparent at compile time. This means an enum variable can be tested in a static if statement, in a template constraint or any other compile-time context. It also means their initializer is always evaluated at compile time, giving a key gateway into the world of compile-time function evaluation (CTFE). Here, however, we didn't need to evaluate any function. Our Fibonacci range is simply never empty, since the Fibonacci sequence is infinite. The enum bool empty = false; statement (or simply enum empty = false; as the type could be automatically deduced) advertises this fact, and unlike bool empty = false, the enum version can always be tested at compile time. Thus, our range would pass the std.range.isInfinite test; a fact our average function in the previous recipe tested. It is impossible to get the average of all Fibonacci numbers.

The enum keyword is also used to create enumerations; this is a new type that has members associated with a literal, similar to enum variables in other languages. The implementation of these two features is mostly the same, which is why they share a keyword, though the two usages are generally separate.

If you were writing a range that wasn't infinite, the empty or @property function should be a data member, just like front.

The Fibonacci struct is our range. It is customary, however, to write at least two other pieces of code: the static assert(s) to ensure the range actually fulfils the required interface and a helper instantiator function to help create the range. You will often also wish to write unit tests for your range, testing it with a variety of input and ensuring it generates correct outputs.

A static assert is given a condition and, optionally, a message that is tested at compile time. If the test fails, the compiler prints the location of the failed assertion, the message (if provided), and the call stack of any template instantiation. A failing static assert will result in a failed build with a compile error. Static asserts are flexible tools used to test code, compile flags, and the supported platforms by generating user-defined compile errors. Here, we use it to ensure our code does what it needs to do to implement the interface. It isn't strictly necessary, but helps us to ensure that we didn't make any mistakes that otherwise wouldn't appear until a user tried to actually create our range.

Finally, the helper function, fibonacci, serves as a convenience constructor. Here, we didn't really need it, since our range is fairly simple. However, with more complex ranges, the helper function gives us a chance to enforce our invariants (for example, ensure the constructor that primes the range—ensuring front is always valid unless empty—is called) and can significantly simplify the creation due to implicit function template instantiation. Recall how compile-time arguments are often automatically deduced when calling a function. Let's start with the following code:

to!string(10) == to!(string, int)(10)

Deductions don't happen with the struct constructors, but do happen with helper functions, as shown in the following code:

struct Test(T) { T t; }
auto test = Test(10); // compile error!
auto test = Test!int(10); // this is needed instead

When you use the auto helper function, you can save the user from having to list out the exact types by doing it for them just once, as shown in the following code:

auto helper(T)(T t) { return Test!T(t); }
auto test = helper(10); // works!
auto test2 = helper("foo"); // also works!

Helper functions' ability to deduce types simplifies the use of complex ranges considerably. This, combined with the ability to call the constructor on your terms, makes ranges and helper functions often go hand-in-hand.

There's more…

Many algorithms from std.algorithm work with our Fibonacci range. Experiment with them to see the possibilities! It is interesting to note that even complex functions such as std.algorithm.map and std.algorithm.filter will work with our range, despite the fact that it is infinite. This is accomplished because they use lazy evaluation. They only evaluate a result upon request, so passing them an infinite amount of data does not mean they must do an infinite amount of work.

If you need to pass it to a function that cannot work with infinite data, such as writeln or our average function, std.range.take can be used to limit the size to something more manageable.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.191.181.36