Using ranges when implementing an algorithm

D code favors the use of ranges when implementing algorithms because ranges are a natural abstraction for operating on sequences of elements. We'll implement a function that takes a range and calculates the average of the values.

Getting ready

First, we need to sketch out our algorithm and determine what functions and types we'll need from the range. The algorithm should accept a range as generic as possible for element types that make sense for us, without sacrificing efficiency. Since our algorithm is taking an average value, the elements should be numeric. We'll also need to be able to iterate over each individual item. However, we don't need to know the length ahead of time (we can keep a count while calculating the sum); we don't need random access and we don't need to save our position. All we need to know is whether it is empty, the current item, and be able to advance the iteration. The result is we need an input range, nothing more. Other types of ranges are as follows:

  • Forward ranges: These are input ranges where a particular point can be saved and used later. You'll want a forward range when you need to look ahead and return the part of the range already passed for review.
  • Bidirectional ranges: These offer backward iteration as well as forward iteration.
  • Random access ranges: These are input ranges that also offer fast access to any index. You'll want a random access range when your algorithm needs to jump around rapidly.
  • Output ranges: These are ranges that receive data instead of iterating over data.

    Tip

    You can go to http://dlang.org/phobos/std_range.html for the official documentation on ranges and detailed definitions of each type.

There are also additional features you may offer on top of the core range primitives. These include infiniteness, slicing, length, and there may be more in the future. They are checked independently of the range type and are used, if available, for added efficiency in algorithms.

It is also important to look at what kind of ranges our algorithm cannot handle. Our simple average algorithm will loop over all the values, which is impossible if we have passed an infinite range—an input range that is never empty. Thus, we will reject infinite ranges in our function signature when implementing the average algorithm.

Note

It is possible to use additional facilities if and only if they are actually available. For example, std.range.walkLength has two constraints; it needs to be iterable, so it needs to be an input range, and it must not be infinite. However, the function body of std.range.walkLength starts with the following code:

static if(hasLength!Range) return range.length;

The hasLength function is a function from std.range that checks for the presence of a length member. If a length member is available from this range, it returns that value instead of looping over the values and calculating a count, increasing the efficiency of the algorithm.

How to do it…

The following steps show how to use ranges when implementing an algorithm:

  1. Import the std.range and std.traits modules. They have functions to help perform compile-time checks on range capabilities.
  2. Create a function that takes the range we need, with a constraint for both the range type and the range's element type.
  3. Determine a good type for the return value. For our average, we want a type that can hold a large sum of the range's elements. So, if the range we received has an element type of floating point, we'll use double. Otherwise, we'll use long.
  4. Implement your algorithm in terms of the range primitives you required in the constraint. For our average function, we'll loop through the values and add them to a running total and count, and then we'll calculate the arithmetic average by dividing the sum by the count.
  5. Return the result. The result should be as useful as possible. The code is as follows:
    import std.range, std.traits;
    auto average(Range)(Range range)
    if(isInputRange!Range && !isInfinite!Range && isNumeric!(ElementType!Range))
    {
         static if(isFloatingPoint!(ElementType!Range))
            double sum = 0;
         else
            long sum = 0;
         int count = 0;
         foreach(item; range) {
            count++;
            sum += item;
         }
         return sum / count;
    }
  6. Test it by passing it an array (or any other numeric range object!), as shown in the following line of code:
    writeln(average([1, 2, 3])); // prints 2

How it works…

Ranges are generally used by constrained function templates. That's what the first few lines of code are: a function template declaration and the constraint. The syntax is as follows:

return_type name(compile, time, args)(run, time, args)
if(constraint_condition)

The compile-time arguments are all available in the constraint condition. If the constraint fails, the template is not considered a match for the arguments. This allows you to be specific about what you will and won't accept, and also to overload your templates on different conditions. For example, we could write a separate function with the same name and same arguments, but leave out the !isInfinite!Range condition. Then, if we try to call the function with an infinite range, it would use the second function instead of the first.

The constraint condition's contents are just like any other if statement. You can call functions; combine conditions with &&, ||, or parenthesis; and you can reference the compile-time arguments along with any literals or manifest constants (enum values defined elsewhere in the module). Here, we used three functions from std.range and tone from std.traits to describe exactly what we need. These functions perform type checks to ensure that the passed object conforms to the range interface we need. When accepting a generic input range, this does not mean to accept anything. It simply means to accept the most generic input that makes sense for you. If data transformation is required to get it into the form you need, don't accept that data—let the user do the transformation if they choose to. This keeps the cost of your algorithm predictable.

Note

It is also possible to write unconstrained templates that match types with pure duck typing. Duck typing is named from the saying "if it quacks like a duck, it's a duck". In code, it means if duck.quack() compiles, it will be used as a duck, regardless of whether it explicitly implements the duck interface or not. In fact, this is exactly how isInputRange and friends are implemented. However, it is generally considered bad practice to write an unconstrained template in D. This is because figuring out what their requirements are can be difficult and it is easy to miss bugs in the implementation, since it may be passed objects that it has no idea how to handle.

Once the compile-time arguments are validated by the constraint, they can also be used in the runtime arguments or inside the function as types or values, depending on the compile-time argument. That's what (Range range) is. The compile-time argument Range is used as a wildcard type for the runtime argument range.

The other notable aspect of this function's signature is the return type. Here, we returned auto. This means, like with auto local variables, that the compiler will automatically determine the return type by looking at the first return statement in the function body. An auto template function must return one specific type—it is illegal to try to return both a string and an integer, for example—but you don't have to specify the type yourself. This is useful for more than just convenience. Here, we used the auto return type because the specific type returned depends on the result of the static if statement in the function body. Automatic type deduction lets us have more complex logic inside the function to return the best possible type for any particular input type. Functions returning auto are also useful to return nested struct types whose name may not be accessible outside the function's scope. Phobos uses this pattern when implementing some higher order ranges.

Our function's body is pretty straightforward, with one exception: the use of static if. The static if function works like regular if, but with two key differences that are given as follows:

  • It runs at compile time, meaning only compile-time information, such as variable types or compile-time template arguments, is available to it.
  • It does not introduce a new scope. Variables declared inside the static if function are still available outside the static if function. This is why we were able to use sum in the rest of the function.

The static if function is very useful when specializing functions and looking at information received by compile-time reflection. As you can customize just part of a template, it helps with code reuse. There is no need to write a whole new function when the only difference in sum is a float value instead of an int value. The branch of static if that is not true does not get compiled into the program at all. It must have valid syntax, but the code isn't actually compiled. So, you can safely use static if for conditional compilation or to filter out code that does not work with a given set of arguments.

Finally, we can look at how the function is called. Note how simple it is. The caller doesn't have to worry about the specifics of compile-time arguments, as they are 100 percent automatically deduced, nor does the caller need to worry about passing it a specific type, since the function is very generic! It just works.

Tip

There are times when the user may have to care about the specifics. The most frequent case is when they pass the function types it doesn't support. The D compiler's error messages may be difficult to read and may falsely place the error in a library function instead of the user's code. If this happens to you, the key information to watch for is the constraint the compiler says does not match. If your call says no matching template[…] candidates are: foo(T) if(isInputRange!T), look at where you called foo and be sure it is actually being passed an input range. Note that static arrays are not input ranges; they need to be sliced with the [] operator.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.122.68