Phobos' std.algorthm
includes sorting algorithms. Let's look at how they are used, what requirements they have, and the dangers of trying to implement range primitives without minding their efficiency requirements.
Let's make a linked list container that exposes an appropriate view, a forward range, and an inappropriate view, a random access range that doesn't meet its efficiency requirements. A singly-linked list can only efficiently implement forward iteration due to its nature; the only tool it has is a pointer to the next element. Implementing any other range primitives will require loops, which is not recommended. Here, however, we'll implement a fully functional range, with assignable elements, length, bidirectional iteration, random access, and even slicing on top of a linked list to see the negative effects this has when we try to use it. The code for this can be found at Chapter03Code 3
of your code bundle.
We'll also create a skeleton main program that initializes the list. You can find the code for this at Chapter03Code 3
of your code bundle.
We're going to both sort and benchmark this program.
Let's sort ranges by executing the following steps:
std.algorithm
.(a, b) => a < b
, which results in an ascending order when the sorting is complete (for example, [1,2,3]
). If you want ascending order, you don't have to specify a predicate at all. If you need descending order, you can pass a greater-than predicate instead, as shown in the following line of code:auto sorted = sort!((a, b) => a > b)([1,2,3]); // results: [3,2,1]
std.string.cmp
(case-sensitive) or std.string.icmp
(case-insensitive) may be used, as is done in the following code:auto sorted = sort!((a, b) => cmp(a, b) < 0)(["b", "c", "a"]); // results: a, b, c
struct
member, as shown in the following code:auto sorted = sort!((a, b) => a.value < b.value)(structArray);
std.range
, as shown in the following code:auto sorted = sort(fibanocci().take(10)); // won't compile, not enough capabilities auto sorted = sort(fibanocci().take(10).array); // ok, good
Let's sort objects using benchmark by executing the following steps:
std.datetime.benchmark
to test the sorting of an array from the appropriate walker against the slow walker and print the results at the end of main
. The code is as follows:auto result = benchmark!( { auto sorted = sort(list.walker.array); }, { auto sorted = sort(list.slowWalker); } )(100); writefln("Emulation resulted in a sort that was %d times slower.", result[1].hnsecs / result[0].hnsecs);
Emulation resulted in a sort that was 16 times slower.
dmd –inline –O yourfile.d
) and see the difference. The emulated version will be consistently slower, and as the list becomes longer, the gap will widen.On my computer, a growing list size led to a growing slowdown factor, as shown in the following table:
List size |
Slowdown factor |
---|---|
500 |
13 |
1000 |
16 |
2000 |
29 |
4000 |
73 |
The interface to Phobos' main sort
function hides much of the complexity of the implementation. As long as we follow the efficiency rules when writing our ranges, things either just work or fail, telling us we must call an array in the range before we can sort it. Building an array has a cost in both time and memory, which is why it isn't performed automatically (std.algorithm
prefers lazy evaluation whenever possible for best speed and minimum memory use). However, as you can see in our benchmark, building an array is much cheaper than emulating unsupported functions.
The sort
algorithms require a full-featured range and will modify the range you pass to it instead of allocating memory for a copy. Thus, the range you pass to it must support random access, slicing, and either assignable or swappable elements. The prime example of such a range is a mutable array. This is why it is often necessary to use the array function when passing data to sort
.
Our linked list code used static if
with a compile-time parameter as a configuration tool. The implemented functions include opSlice
and properties that return ref
. The ref value can only be used on function return values or parameters. Assignments to a ref
value are forwarded to the original item. The opSlice
function is called when the user tries to use the slice syntax: obj[start .. end]
.
Inside the beSlow
condition, we broke the main rule of implementing range functions: avoid loops. Here, we see the consequences of breaking that rule; it ruined algorithm restrictions and optimizations, resulting in code that performs very poorly. If we follow the rules, we at least know where a performance problem will arise and can handle it gracefully.
For ranges that do not implement the fast length
property, std.algorithm
includes a function called walkLength
that determines the length by looping through all items (like we did in the slow length
property). The walkLength
function has a longer name than length precisely to warn you that it is a slower function, running in O(n) (linear with length) time instead of O(1) (constant) time. Slower functions are OK, they just need to be explicit so that the user isn't surprised.
std.algorithm
module also includes other sorting algorithms that may fit a specific use case better than the generic (automatically specialized) function. See the documentation at http://dlang.org/phobos/std_algorithm.html for more information.18.218.151.44