Optimizing Fibonacci

The first optimization we are going to perform eliminates a method call, as shown in Listing 1–3. As this implementation is recursive, removing a single call in the method dramatically reduces the total number of calls. For example, computeRecursively(30) generated 2,692,537 calls while computeRecursivelyWithLoop(30) generated “only” 1,346,269. However, the performance of this method is still not acceptable considering the response-time criterion defined above, 100 milliseconds or less, as computeRecursivelyWithLoop(30) takes about 270 milliseconds to complete.

Listing 1–3. Optimized Recursive Implementation of Fibonacci Series

public class Fibonacci {
    public static long computeRecursivelyWithLoop (int n)
    {
        if (n > 1) {
            // Uses the identity F(n) = 1 + F(0) + F(1) + ... + F(n-2):
            // the loop below adds F(n-2), F(n-3), ..., F(0) to the initial 1
            long result = 1;
            do {
                result += computeRecursivelyWithLoop(n-2);
                n--;
            } while (n > 1);
            return result;
        }
        return n;
    }
}

NOTE: This is not a true tail-recursion optimization.
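
The call counts quoted above are easy to reproduce. The instrumented variant below is a hypothetical addition (it is not one of the numbered listings): it simply increments a static counter on every invocation.

public class FibonacciCallCounter {
    static long calls; // hypothetical counter, used only for instrumentation

    static long computeRecursivelyWithLoop(int n) {
        calls++;
        if (n > 1) {
            long result = 1;
            do {
                result += computeRecursivelyWithLoop(n - 2);
                n--;
            } while (n > 1);
            return result;
        }
        return n;
    }

    public static void main(String[] args) {
        computeRecursivelyWithLoop(30);
        System.out.println(calls); // prints 1346269, the figure quoted above
    }
}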

From Recursive To Iterative

For the second optimization, we switch from a recursive implementation to an iterative one. Recursive algorithms often have a bad reputation with developers, especially on embedded systems without much memory, because they tend to consume a lot of stack space and, as we just saw, can generate too many method calls. Even when performance is acceptable, a recursive algorithm can cause a stack overflow and crash an application. An iterative implementation is therefore often preferred whenever possible. Listing 1–4 shows what is considered a textbook iterative implementation of the Fibonacci series.

Listing 1–4. Iterative Implementation of Fibonacci Series

public class Fibonacci {
    public static long computeIteratively (int n)
    {
        if (n > 1) {
            long a = 0, b = 1; // two consecutive terms, starting with F(0) and F(1)
            do {
                long tmp = b;
                b += a;
                a = tmp;
            } while (--n > 1);
            return b;
        }
        return n;
    }
}

Because the nth term of the Fibonacci series is simply the sum of the two previous terms, a simple loop can do the job. Compared to the recursive algorithms, the complexity of this iterative algorithm is also greatly reduced because it is linear. Consequently, its performance is also much better, and computeIteratively(30) takes less than 1 millisecond to complete. Because of its linear nature, you can use such an algorithm to compute terms beyond the 30th. For example, computeIteratively(50000) takes only 2 milliseconds to return a result and, by extrapolation, you could guess computeIteratively(500000) would take between 20 and 30 milliseconds to complete.
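
Timings like these can be reproduced with a minimal harness such as the sketch below (hypothetical, not one of the numbered listings). It wraps a single call in System.nanoTime(), so absolute numbers will vary with the device and with JIT warm-up, but the relative comparison holds; on Android, android.os.Debug.threadCpuTimeNanos() can be used instead to measure CPU time for the current thread.

public class FibonacciBenchmark {
    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = Fibonacci.computeIteratively(30); // Listing 1–4
        long elapsedMs = (System.nanoTime() - start) / 1000000;
        System.out.println("F(30) = " + result + " in about " + elapsedMs + " ms");
    }
}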

While such performance is more than acceptable, it is possible to achieve even faster results with a slightly modified version of the same algorithm, as shown in Listing 1–5. This new version computes two terms per iteration, so the total number of iterations is halved. Because the number of iterations in the original iterative algorithm could be odd, the initial values for a and b are modified accordingly: the series starts with a=0 and b=1 when n is odd, and it starts with a=1 and b=1 (Fib(2)=1) when n is even.

Listing 1–5. Modified Iterative Implementation of Fibonacci Series

public class Fibonacci {
    public static long computeIterativelyFaster (int n)
    {
        if (n > 1) {
            long a, b = 1;
            n--;
            a = n & 1;  // 0 if the original n was odd, 1 if it was even (see above)
            n /= 2;     // each iteration of the loop below computes two terms
            while (n-- > 0) {
                a += b;
                b += a;
            }
            return b;
        }
        return n;
    }
}

Results show this modified iterative version is about twice as fast as the original one.

While these iterative implementations are fast, they do have one major problem: they don't return correct results for all values of n. The issue lies with the return value being stored in a long, which is 64 bits. The largest Fibonacci number that can fit in a signed 64-bit value is 7,540,113,804,746,346,429 or, in other words, the 92nd Fibonacci number. While the methods will still return without crashing the application for values of n greater than 92, the results will be incorrect because of an overflow: the 93rd Fibonacci number would be negative! The recursive implementations actually have the same limitation, but one would have to be quite patient to eventually find out.
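
The overflow is easy to observe with the implementation from Listing 1–5: the 92nd term is the last one that fits, and the 93rd wraps around to a negative value.

public class FibonacciOverflowDemo {
    public static void main(String[] args) {
        System.out.println(Fibonacci.computeIterativelyFaster(92)); // 7540113804746346429, still correct
        System.out.println(Fibonacci.computeIterativelyFaster(93)); // negative: the true value no longer fits in 64 bits
    }
}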

NOTE: Java specifies the size of all primitive types (except boolean): long is 64-bit, int is 32-bit, short is 16-bit, and byte is 8-bit. These integer types are all signed.

BigInteger

Java offers just the right class to fix this overflow problem: java.math.BigInteger. A BigInteger object can hold a signed integer of arbitrary size and the class defines all the basic math operations (in addition to some not-so-basic ones). Listing 1–6 shows the BigInteger version of computeIterativelyFaster.

TIP: The java.math package also defines BigDecimal in addition to BigInteger, while java.lang.Math provides math constants and operations. If your application does not need double precision, use Android's FloatMath instead of Math for performance (although gains may vary depending on platform).
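
As a quick illustration of the tip, the hypothetical helper below computes a distance entirely in single precision instead of going through the double-based Math.sqrt. Note that FloatMath was deprecated in later Android releases, where the regular Math methods are typically just as fast, so measure on your target devices before relying on it.

import android.util.FloatMath;

public class DistanceHelper {
    // Double-based equivalent: (float) Math.sqrt(dx * dx + dy * dy)
    public static float distance(float dx, float dy) {
        return FloatMath.sqrt(dx * dx + dy * dy); // stays in single precision
    }
}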

Listing 1–6. BigInteger Version of Fibonacci.computeIterativelyFaster

import java.math.BigInteger;

public class Fibonacci {
    public static BigInteger computeIterativelyFasterUsingBigInteger (int n)
    {
        if (n > 1) {
            BigInteger a, b = BigInteger.ONE;
            n--;
            a = BigInteger.valueOf(n & 1);
            n /= 2;
            while (n-- > 0) {
                a = a.add(b);
                b = b.add(a);
            }
            return b;
        }
        return (n == 0) ? BigInteger.ZERO : BigInteger.ONE;
    }
}

That implementation guarantees correctness as overflows can no longer occur. However, it is not without problems because, again, it is quite slow: a call to computeIterativelyFasterUsingBigInteger(50000) takes about 1.3 seconds to complete. The lackluster performance can be explained by three things:

  • BigInteger is immutable.
  • BigInteger is implemented using BigInt and native code.
  • The larger the numbers, the longer it takes to add them together.

Since BigInteger is immutable, we have to write “a = a.add(b)” instead of simply “a.add(b)”. Many would assume “a.add(b)” is the equivalent of “a += b”, and many would be wrong: it is actually the equivalent of “a + b”, so the result has to be assigned explicitly. That small detail is extremely significant as “a.add(b)” creates a new BigInteger object that holds the result of the addition.
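
A short experiment makes the difference obvious:

import java.math.BigInteger;

public class ImmutabilityDemo {
    public static void main(String[] args) {
        BigInteger a = BigInteger.ONE;
        BigInteger b = BigInteger.TEN;
        a.add(b);              // computes 11, but the result is discarded; a is not modified
        System.out.println(a); // prints 1
        a = a.add(b);          // the new BigInteger returned by add() must be assigned back
        System.out.println(a); // prints 11
    }
}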

Because of BigInteger's current internal implementation, an additional BigInt object is created for every BigInteger object that is allocated. This results in twice as many objects being allocated during the execution of computeIterativelyFasterUsingBigInteger: about 100,000 objects are created when calling computeIterativelyFasterUsingBigInteger(50000) (and all of them but one will become available for garbage collection almost immediately). Also, BigInt is implemented using native code and calling native code from Java (using JNI) has a certain overhead.

The third reason is that very large numbers do not fit in a single 64-bit long value. For example, the 50,000th Fibonacci number is 34,711 bits long.
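
You can check this with BigInteger.bitLength(), here using the implementation from Listing 1–6:

import java.math.BigInteger;

public class BitLengthDemo {
    public static void main(String[] args) {
        BigInteger f = Fibonacci.computeIterativelyFasterUsingBigInteger(50000);
        System.out.println(f.bitLength()); // prints 34711
    }
}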

NOTE: BigInteger's internal implementation (BigInteger.java) may change in future Android releases. In fact, the internal implementation of any class can change.

For performance reasons, memory allocations should be avoided whenever possible in critical paths of the code. Unfortunately, there are some cases where allocations are needed, for example when working with immutable objects like BigInteger. The next optimization focuses on reducing the number of allocations by switching to a different algorithm. Based on the Fibonacci Q-matrix, we have the following:

F(2n-1) = F(n)^2 + F(n-1)^2

F(2n) = (2*F(n-1) + F(n)) * F(n)

For example, with n = 5: F(9) = F(5)^2 + F(4)^2 = 25 + 9 = 34, and F(10) = (2*F(4) + F(5)) * F(5) = (2*3 + 5) * 5 = 55. This can be implemented using BigInteger again (to guarantee correct results), as shown in Listing 1–7.

Listing 1–7. Faster Recursive Implementation of Fibonacci Series Using BigInteger

import java.math.BigInteger;

public class Fibonacci {
    public static BigInteger computeRecursivelyFasterUsingBigInteger (int n)
    {
        if (n > 1) {
            int m = (n / 2) + (n & 1); // not obvious at first – wouldn't it be great to have a better comment here?
            BigInteger fM = computeRecursivelyFasterUsingBigInteger(m);
            BigInteger fM_1 = computeRecursivelyFasterUsingBigInteger(m - 1);
            if ((n & 1) == 1) {
                // F(m)^2 + F(m-1)^2
                return fM.pow(2).add(fM_1.pow(2)); // three BigInteger objects created
            } else {
                // (2*F(m-1) + F(m)) * F(m)
                return fM_1.shiftLeft(1).add(fM).multiply(fM); // three BigInteger objects created
            }
        }
        return (n == 0) ? BigInteger.ZERO : BigInteger.ONE; // no BigInteger object created
    }

    public static long computeRecursivelyFasterUsingBigIntegerAllocations(int n)
    {
        long allocations = 0;
        if (n > 1) {
            int m = (n / 2) + (n & 1);
            allocations += computeRecursivelyFasterUsingBigIntegerAllocations(m);
            allocations += computeRecursivelyFasterUsingBigIntegerAllocations(m - 1);

            // 3 more BigInteger objects allocated
            allocations += 3;
        }
        return allocations; // approximate number of BigInteger objects allocated when computeRecursivelyFasterUsingBigInteger(n) is called
    }
}

A call to computeRecursivelyFasterUsingBigInteger(50000) returns in about 1.6 seconds. This shows that this latest implementation is actually slower than the fastest iterative implementation we have so far. Again, the number of allocations is the culprit, as around 200,000 objects were allocated (and almost immediately marked as eligible for garbage collection).

NOTE: The actual number of allocations is less than what computeRecursivelyFasterUsingBigIntegerAllocations would return. Because BigInteger's implementation uses preallocated objects such as BigInteger.ZERO, BigInteger.ONE, or BigInteger.TEN, there may be no need to allocate a new object for some operations. You would have to look at Android's BigInteger implementation to know exactly how many objects are allocated.

This implementation is slower, but it is a step in the right direction nonetheless. The main thing to notice is that even though we need to use BigInteger to guarantee correctness, we don't have to use BigInteger for every value of n. Since we know the primitive type long can hold results for n less than or equal to 92, we can slightly modify the recursive implementation to mix BigInteger and primitive type, as shown in Listing 1–8.

Listing 1–8. Faster Recursive Implementation of Fibonacci Series Using BigInteger and long Primitive Type

import java.math.BigInteger;

public class Fibonacci {
    public static BigInteger computeRecursivelyFasterUsingBigIntegerAndPrimitive(int n)
    {
        if (n > 92) {
            int m = (n / 2) + (n & 1);
            BigInteger fM = computeRecursivelyFasterUsingBigIntegerAndPrimitive(m);
            BigInteger fM_1 = computeRecursivelyFasterUsingBigIntegerAndPrimitive(m - 1);
            if ((n & 1) == 1) {
                return fM.pow(2).add(fM_1.pow(2));
            } else {
                return fM_1.shiftLeft(1).add(fM).multiply(fM); // shiftLeft(1) to multiply by 2
            }
        }
        return BigInteger.valueOf(computeIterativelyFaster(n));
    }

    private static long computeIterativelyFaster(int n)
    {
        // same implementation as Listing 1–5
        if (n > 1) {
            long a, b = 1;
            n--;
            a = n & 1;
            n /= 2;
            while (n-- > 0) {
                a += b;
                b += a;
            }
            return b;
        }
        return n;
    }
}

A call to computeRecursivelyFasterUsingBigIntegerAndPrimitive(50000) returns in about 73 milliseconds and results in about 11,000 objects being allocated: a small modification in the algorithm yields results about 20 times faster and about 20 times fewer objects being allocated. Quite impressive! It is possible to improve the performance even further by reducing the number of allocations, as shown in Listing 1–9. Precomputed results can be quickly generated when the Fibonacci class is first loaded, and these results can later be used directly.

Listing 1–9. Faster Recursive Implementation of Fibonacci Series Using BigInteger and Precomputed Results

import java.math.BigInteger;

public class Fibonacci {
    static final int PRECOMPUTED_SIZE = 512;
    static final BigInteger[] PRECOMPUTED = new BigInteger[PRECOMPUTED_SIZE];

    static {
        PRECOMPUTED[0] = BigInteger.ZERO;
        PRECOMPUTED[1] = BigInteger.ONE;
        for (int i = 2; i < PRECOMPUTED_SIZE; i++) {
            PRECOMPUTED[i] = PRECOMPUTED[i-1].add(PRECOMPUTED[i-2]);
        }
    }

    public static BigInteger computeRecursivelyFasterUsingBigIntegerAndTable(int n)
    {
        if (n > PRECOMPUTED_SIZE - 1) {
            int m = (n / 2) + (n & 1);
            BigInteger fM = computeRecursivelyFasterUsingBigIntegerAndTable(m);
            BigInteger fM_1 = computeRecursivelyFasterUsingBigIntegerAndTable(m - 1);
            if ((n & 1) == 1) {
                return fM.pow(2).add(fM_1.pow(2));
            } else {
                return fM_1.shiftLeft(1).add(fM).multiply(fM);
            }
        }
        return PRECOMPUTED[n];
    }
}

The performance of this implementation depends on PRECOMPUTED_SIZE: the bigger, the faster. However, memory usage may become an issue since many BigInteger objects will be created and remain in memory for as long as the Fibonacci class is loaded. It is possible to merge the implementations shown in Listing 1–8 and Listing 1–9, and use a combination of precomputed results and computations with primitive types. For example, terms 0 to 92 could be computed using computeIterativelyFaster, terms 93 to 127 using precomputed results, and any other term using recursion (a sketch of such a merged version appears after the list below). As a developer, you are responsible for choosing the best implementation, which may not always be the fastest. Your choice will be based on various factors, including:

  • What devices and Android versions your application targets
  • Your resources (people and time)
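
As a starting point, the sketch below shows one possible way to merge Listings 1–8 and 1–9 along the lines described above. It is a hypothetical combination (the class name, constant value, and method name compute are illustrative): long arithmetic for terms up to 92, a precomputed table up to term 127, and the recursive identities beyond that. Where exactly to place the cutovers is a trade-off between class-loading cost, memory usage, and the range of n your application actually needs.

import java.math.BigInteger;

public class FibonacciMerged {
    static final int PRECOMPUTED_SIZE = 128;
    static final BigInteger[] PRECOMPUTED = new BigInteger[PRECOMPUTED_SIZE];

    static {
        PRECOMPUTED[0] = BigInteger.ZERO;
        PRECOMPUTED[1] = BigInteger.ONE;
        for (int i = 2; i < PRECOMPUTED_SIZE; i++) {
            PRECOMPUTED[i] = PRECOMPUTED[i - 1].add(PRECOMPUTED[i - 2]);
        }
    }

    public static BigInteger compute(int n) {
        if (n <= 92) {
            return BigInteger.valueOf(computeIterativelyFaster(n)); // long is large enough
        }
        if (n < PRECOMPUTED_SIZE) {
            return PRECOMPUTED[n]; // no allocation at call time
        }
        int m = (n / 2) + (n & 1);
        BigInteger fM = compute(m);
        BigInteger fM_1 = compute(m - 1);
        if ((n & 1) == 1) {
            return fM.pow(2).add(fM_1.pow(2)); // F(2m-1) = F(m)^2 + F(m-1)^2
        }
        return fM_1.shiftLeft(1).add(fM).multiply(fM); // F(2m) = (2*F(m-1) + F(m)) * F(m)
    }

    private static long computeIterativelyFaster(int n) {
        // same as Listing 1–5
        if (n > 1) {
            long a, b = 1;
            n--;
            a = n & 1;
            n /= 2;
            while (n-- > 0) {
                a += b;
                b += a;
            }
            return b;
        }
        return n;
    }
}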

As you may have already noticed, optimizations tend to make the source code harder to read, understand, and maintain, sometimes to such an extent that you would not recognize your own code weeks or months later. For this reason, it is important to carefully think about what optimizations you really need and how they will affect your application development, both in the short term and in the long term. It is always recommended you first implement a working solution before you think of optimizing it (and make sure you save a copy of the working solution). After all, you may realize optimizations are not needed at all, which could save you a lot of time. Also, make sure you include comments in your code for everything that is not obvious to a person with ordinary skill in the art. Your coworkers will thank you, and you may give yourself a pat on the back as well when you stumble on some of your old code. My poor comment in Listing 1–7 is proof.

NOTE: All implementations disregard the fact that n could be negative. This was done intentionally to make a point, but your code, at least in all public APIs, should throw an IllegalArgumentException whenever appropriate.
