Recursive algorithms are used because they’re often clearer and more elegant than the alternatives, and therefore have a lower maintenance cost than equivalent iterative algorithms. However, recursion often (but not always) has a cost to it; recursive algorithms are frequently slower. So it is useful to understand the costs associated with recursion, and how to improve the performance of recursive algorithms when necessary.
Recursive code can be optimized by a clever compiler (as is done with some C compilers), but only if it is presented in the right way (typically, it needs to be tail-recursive: see Tail Recursion). For example, Jon Bentley[49] found that a functionally identical recursive method was optimized by a C compiler if he used if statements instead of the ?: conditional operator, but was not optimized if he did use the ?: conditional operator. He also found that recursion can be very expensive, taking up to 20 times longer for some operations that are naturally iterative. Bentley’s article also looks briefly at optimizing partial-match searching in ternary search trees by transforming a tail recursion in the search into an iteration. See Chapter 11 for an example of tuning a ternary search tree, including an example of converting a recursive algorithm to an iterative one.
Generally, the advice for dealing with methods that are naturally recursive (because that is the natural way to code them for clarity) is to go ahead with the recursive solution. You only need to spend time counting the cost (if any) when your profiling shows that this particular method call is a bottleneck in the application. At that stage, it is worth pursuing alternative implementations or avoiding the method call completely with a different structure.
In case you need to tune a recursive algorithm or convert it into an iterative one, I provide some examples here. I start with an extremely simple recursive algorithm for calculating factorial numbers, as this illustrates several tuning points:
```java
public static long factorial1(int n) {
    if (n < 2)
        return 1L;
    else
        return n * factorial1(n-1);
}
```
I have limited the function to long values, which means that you cannot use the function beyond factorial 20, as that overflows the long data type. This is to keep the function simple for this illustration.
Since this function is easily converted to a tail-recursive version, it is natural to test the tail-recursive version to see if it performs any better. For this particular function, the tail-recursive version does not perform any better, which is not typical. Here, the factorial function consists of a very simple fast calculation, and the extra function call overhead in the tail-recursive version is enough of an overhead that it negates the benefit that is normally gained. (Note that the HotSpot 1.0 VM does manage to optimize the tail-recursive version to be faster than the original, after the compiler optimizations have had a chance to be applied. See Table 7-3.)
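To make the discussion concrete, here is a sketch of what the tail-recursive version might look like (the method names and the use of an accumulator parameter are my illustration; the source does not show its tail-recursive code). The partial product is passed along as an accumulator, so the recursive call is the last action in the method and a suitably clever compiler could replace it with a jump:

```java
public class FactorialTail {
    // Public entry point, same signature as factorial1.
    public static long factorial1Tail(int n) {
        return factorialAcc(n, 1L);
    }

    // acc carries the partial result; the recursive call is in tail
    // position, so nothing remains to be done after it returns.
    private static long factorialAcc(int n, long acc) {
        if (n < 2)
            return acc;
        else
            return factorialAcc(n - 1, n * acc);
    }
}
```

Note that whether this form actually runs faster depends entirely on the VM: the JVM is not required to perform tail-call elimination, which is consistent with the mixed results in Table 7-3.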
Let’s look at other ways this function can be optimized. Start with the classic recursive-to-iterative conversion, and note that the factorial method contains just one value that is successively operated on to give a new value (the result), along with a parameter specifying how to operate on the partial result (the current input to the factorial). A standard way to convert this type of recursive method is to replace the parameters passed to the method with temporary variables in a loop. In this case, you need two variables, one of which is passed into the method and can be reused. The converted method looks like:
```java
public static long factorial2(int n) {
    long result = 1;
    while (n > 1) {
        result *= n--;
    }
    return result;
}
```
Measuring the performance, you see that this method calculates the result in 88% of the time taken by the original recursive factorial1( ) method (using the JDK 1.2 results;[50] see Table 7-3).
Table 7-3. Timings of the Various Factorial Implementations

| | 1.2 | 1.2 no JIT | 1.3 | HotSpot 1.0 2nd Run | 1.1.6 |
|---|---|---|---|---|---|
| factorial1 (recursive) | 100% | 572% | 152% | 137% | 101% |
| factorial1 (tail-recursive version) | 110% | 609% | 173% | 91% | 111% |
| factorial2 (iterative) | 88% | 344% | 129% | 177% | 88% |
| factorial3 (15-element cache) | 46% | 278% | 71% | 74% | 46% |
| factorial4 (hardcoded 15-element cache) | 41% | 231% | 67% | 57% | 40% |
| factorial3 (21-element cache) | 4% | 56% | 11% | 8% | 4% |
The recursion-to-iteration technique as illustrated here is general, and another example in a different domain may help make this generality clear. Consider a linked list, with singly linked nodes consisting of a next pointer to the next node and a value instance variable holding (in this case) just an integer. A simple linear search method to find the first node holding a particular integer looks like:
```java
Node find_recursive(int i) {
    if (value == i)
        return this;
    else if (next != null)
        return next.find_recursive(i);
    else
        return null;
}
```
To convert this to an iterative method, use a temporary variable to hold the “current” node, and reassign that variable with the next node in the list at each iteration. The method is clear, and its only drawback compared to the recursive method is that it violates encapsulation (this one method directly accesses the instance variable of each node object):
```java
Node find_iterative(int i) {
    Node node = this;
    while (node != null) {
        if (node.value == i)
            return node;
        else
            node = node.next;
    }
    return null;
}
```
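For completeness, here is a minimal sketch of the Node class implied by the text (the constructor and class shape are my assumptions; only the value and next members and the search method come from the discussion), together with a small usage example:

```java
class Node {
    int value;
    Node next;   // null for the last node in the list

    Node(int value) { this.value = value; }

    // The iterative linear search from the text.
    Node find_iterative(int i) {
        Node node = this;
        while (node != null) {
            if (node.value == i)
                return node;
            else
                node = node.next;
        }
        return null;
    }

    public static void main(String[] args) {
        // Build the list 1 -> 2 -> 3 and search it.
        Node head = new Node(1);
        head.next = new Node(2);
        head.next.next = new Node(3);
        System.out.println(head.find_iterative(3) == head.next.next); // true
        System.out.println(head.find_iterative(9) == null);           // true
    }
}
```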
Before looking at general techniques for converting other types of recursive methods to iterative ones, I will revisit the original factorial method to illustrate some other techniques for improving the performance of recursive methods.
To test the timing of the factorial method, I put it into a loop to recalculate factorial(20) many times; otherwise, the time taken is too short to be reliably measured. When this situation is close to the actual problem, a good tuning technique is to cache the intermediate results. This technique can be applied when some recursive function is repeatedly being called and some of the intermediate results are repeatedly being identified. The technique is simple to illustrate for the factorial method:
```java
public static final int CACHE_SIZE = 15;
public static final long[] factorial3Cache = new long[CACHE_SIZE];

public static long factorial3(int n) {
    if (n < 2)
        return 1L;
    else if (n < CACHE_SIZE) {
        if (factorial3Cache[n] == 0)
            factorial3Cache[n] = n * factorial3(n-1);
        return factorial3Cache[n];
    }
    else
        return n * factorial3(n-1);
}
```
With the choice of 15 elements for the cache, the factorial3( ) method takes 46% of the time taken by factorial1( ). If you choose a cache with 21 elements, so that every call except the first call to factorial3(20) simply returns from the cache with no calculations at all, the time taken is just 4% of the time taken by factorial1( ) (using the JDK 1.2 results; see Table 7-3).
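The timing loop itself can be as simple as the following sketch (the repetition count, the class name, and the use of System.currentTimeMillis( ) are my assumptions, not the book’s actual measurement harness):

```java
public class FactorialTiming {
    public static long factorial1(int n) {
        if (n < 2) return 1L;
        else return n * factorial1(n - 1);
    }

    public static void main(String[] args) {
        int reps = 1000000;  // repeat enough times for the total to be measurable
        long result = 0;
        long start = System.currentTimeMillis();
        for (int i = 0; i < reps; i++)
            result = factorial1(20);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("factorial1(20) x " + reps + " took "
                           + elapsed + "ms (result " + result + ")");
    }
}
```

A loop like this measures the method call together with loop overhead, so it is only useful for comparing implementations against each other, not for absolute timings.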
In this particular situation, you can make one further improvement: precalculate the values at implementation time and hardcode them into the class:
```java
public static final long[] factorial4Cache = {
    1L, 1L, 2L, 6L, 24L, 120L, 720L, 5040L, 40320L, 362880L,
    3628800L, 39916800L, 479001600L, 6227020800L, 87178291200L};
public static final int CACHE_SIZE = factorial4Cache.length;

public static long factorial4(int n) {
    if (n < CACHE_SIZE)
        return factorial4Cache[n];
    else
        return n * factorial4(n-1);
}
```
This is a valid technique that applies when you can identify and calculate partial solutions that can be included with the class at compilation time.[51]
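Hand-maintained tables are easy to get wrong (a single mistyped digit silently corrupts every result built on it), so a one-off sanity check that the hardcoded cache agrees with the plain recursive version is worthwhile. This sketch combines the methods shown above into one class (the class name and check loop are my illustration):

```java
public class FactorialCheck {
    public static long factorial1(int n) {
        if (n < 2) return 1L;
        else return n * factorial1(n - 1);
    }

    public static final long[] factorial4Cache = {
        1L, 1L, 2L, 6L, 24L, 120L, 720L, 5040L, 40320L, 362880L,
        3628800L, 39916800L, 479001600L, 6227020800L, 87178291200L};
    public static final int CACHE_SIZE = factorial4Cache.length;

    public static long factorial4(int n) {
        if (n < CACHE_SIZE) return factorial4Cache[n];
        else return n * factorial4(n - 1);
    }

    public static void main(String[] args) {
        // Compare the two implementations over the full range a long can hold.
        for (int n = 0; n <= 20; n++)
            if (factorial4(n) != factorial1(n))
                throw new AssertionError("cache mismatch at n=" + n);
        System.out.println("factorial4 agrees with factorial1 for 0..20");
    }
}
```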
[49] “The Cost of Recursion,” Dr. Dobb’s Journal, June 1998.
[50] HotSpot optimized the recursive version sufficiently to make it faster than the iterative version.
[51] My editor points out that a variation on hardcoded values, used by state-of-the-art high-performance mathematical functions, is a partial table of values together with an interpolation method to calculate intermediate values.