Recursive sorting

We will implement the quick sort with an extra class that is in the qsort package along with the partitioning class, which is as follows:

package packt.java9.by.example.ch03.qsort; 

// imports deleted from the print

public class Qsort<E> {
// constructor injected final fields deleted from the print
public void qsort(SortableCollection<E> sortable, int start, int end) {
if (start < end) {
final E pivot = sortable.get(start);
final Partitioner<E> partitioner = new Partitioner<>(comparator, swapper);
int cutIndex = partitioner.partition(sortable, start, end, pivot);
if (cutIndex == start) {
cutIndex++;
}
qsort(sortable, start, cutIndex - 1);
qsort(sortable, cutIndex, end);
}
}
}

The method gets SortableCollection<E> and two index parameters. It does not sort the whole collection; it sorts only the elements between the start and the end index.

It is always important to be extremely precise with the indexing. Usually, there is no problem with the start index in Java, but a lot of bugs source from how the end index is interpreted.
In this method, the value of end can mean that the index is already not part of the to-be-sorted interval. In that case, the partition method should be invoked with end-1 and the first recursive call with cutIndex as last parameter. It is a matter of taste. The important thing is to be precise and define the interpretation of index parameters.

If there is only one element (start == end), then there is nothing to be sorted and the method returns. This is the end criterion of the recursion. The method also assumes that the end index is never smaller than the start index. As this method is used only inside the library that we are developing at the moment, such an assumption is not too risky to make.

If there is something to be sorted, then the method takes the first element of the to-be-sorted interval and uses it as pivot and calls the partition method. When the partition is done, the method recursively calls itself for the two halves.

This algorithm is recursive. This means that the method calls itself. When a method call is executed, the processor allocates some memory in an area called stack and it stores the local variables there. This area that belongs to the method in the stack is called stack frame. When the method returns, this area is released and the stack is restored, simply moving the stack pointer where it was to the previous state. This way a method can continue its execution after calling another method; the local variables are there.

When a method calls itself, it is not different. The local variables are local to the actual call of the method. When the method calls itself, it allocates space for the local variables again on the stack. In other words, these are new instances of the local variables.

We will use recursive methods in Java, and in other programming languages, when the definition of the algorithm is recursive. It is extremely important to understand that when the processor code runs, it is not recursive any more. On that level, there are instructions, register stores, and memory loads and jumps. There is nothing like function or method and therefore, on that level, there is nothing like recursion.

If you get that, it is easy to understand that any recursion can be coded as a loop.

As a matter of fact, it is also true the other way around—every loop can be coded as recursion but that is not really interesting until you start functional programming.

The problem with the recursion in Java, and in many other programming languages, is that it may run out of stack space. In the case of quick sort, this is not the case. You can safely assume that the stack for method calling in Java is a few hundreds of levels. Quick sort needs a stack that is approximately log2n deep, where n is the number of elements to be sorted. In the case of one billion elements, this is 30 that should just fit.

Why is the stack not moved or resized? That is because the code that runs out of the stack space is usually bad style. They can be expressed more readable in form of some loop. A more robust stack implementation would only lure the novice programmer to do some less readable recursive coding.

There is a special case of recursion named tail recursion. A tail recursive method calls itself as the last instruction of the method. When the recursive call returns the code, executing the method does nothing else but release the stack frame that was used for this method invocation. In other words, we will keep the stack frame during the recursive call just to throw it away afterwards. Why not throw it away before the call? In that case, the actual frame, which has the same size and call, will allocate because this is just the same method that is kept and the recursive call is transformed into a jump instruction. This is an optimization that Java does not do. Functional languages are doing it, but Java is not really a functional language and therefore tail-recursive functions should rather be avoided and transformed to a loop in the Java source level.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.20.231