Chapter 4

Writing Core Algorithms

Many of the algorithms you are asked to define or demonstrate in interviews are operations on lists, usually sorting or searching. This chapter examines several ways to assemble a list into an ordered list, and discusses the advantages of each approach. It also looks at a common approach to searching lists for a given value.

The algorithms described here are often covered in a core computer science course, so if you have ever studied computer science you will probably be able to treat this chapter as a refresher.

Looking at Big O Notation

A formal approach to describing the performance or complexity of an algorithm is called Big O Notation. This describes how an algorithm performs as its input changes.

For instance, an algorithm described as having a worst-case performance of O(n2) means that as the size of the input doubles, the algorithm takes four times longer to run. An O(n2) algorithm is often not the most efficient implementation, although this is entirely dependent on the exact goal of the algorithm.

Big O tends to be applicable for larger values of n. When n is small, the running time or space consumption of an algorithm is often negligible. Regardless, for any algorithm you do write, it is always worth considering Big O notation, even for small values of n, as, over time, you may find you are running your algorithm with larger and larger input.

An algorithm often has three values of complexity: best-case, worst-case, and average-case. As you would expect, a best-case performance is how the algorithm performs when the input given means the algorithm does as little work as possible.

The performance descriptions here are often called the time complexity of the algorithm. Algorithms have a space complexity, too; that is, how much extra space the algorithm needs to do its work.

There is often a trade-off. You may find that the best performing algorithm uses much more space than you can afford, or that the algorithm that uses only one extra piece of memory is much less efficient than alternatives that use more space.

With any algorithm you write, try to think in terms of both time and space complexity. Although computer memory is cheap nowadays and often large enough that it is not a concern, it is a good exercise to understand what exactly your algorithm is doing.

Sorting Lists

A common operation on collections—lists in particular—is to rearrange the list into some kind of sorted order. Lists are often sorted using natural ordering, such as sorting numbers from lowest to highest or sorting characters into alphabetical order. However, you may have different requirements for your sorting. Java provides two interfaces for helping with sorting: Comparable and Comparator.


What is the difference between the Comparable and Comparator interfaces?


Because these interfaces are public, they can be put to any use. By convention, the Comparable interface is used for natural ordering, and Comparator is used to give exact control over the ordering.

When sorting an array, you generally use a built-in library, such as the implementation included in the Arrays and Collections classes. However, for an interview you may be asked to write your own implementation.

The Arrays and Collections classes have several overloaded sort methods, which can broadly be split into two: one that takes an array as a parameter, and one that takes an array and a Comparator object. The method is overloaded for each of the primitive types and one for the reference type (Object).

For the implementation without a Comparator object, the type is sorted by its natural ordering. Listing 4-1 shows this for ints, sorting the array from low to high.

Listing 4-1: Naturally sorting an array of ints

@Test
public void sortInts() {
    final int[] numbers = {-3, -5, 1, 7, 4, -2};
    final int[] expected = {-5, -3, -2, 1, 4, 7};

    Arrays.sort(numbers);
    assertArrayEquals(expected, numbers);
}

For an array of Objects, the type being sorted must implement the Comparable interface, as shown in Listing 4-2.

Listing 4-2: Sorting objects naturally

@Test
public void sortObjects() {
    final String[] strings = {"z", "x", "y", "abc", "zzz", "zazzy"};
    final String[] expected = {"abc", "x", "y", "z", "zazzy", "zzz"};

    Arrays.sort(strings);
    assertArrayEquals(expected, strings);
}

The String class implements the Comparable interface, so the sorting works as you would expect. If the type being sorted does not implement Comparable, this will throw a ClassCastException.

For your own class definitions, you will need to implement Comparable, such as the implementation described in Listing 4-3.

Listing 4-3: Sorting without a Comparable interface

private static class NotComparable {
    private int i;
    private NotComparable(final int i) {
        this.i = i;
    }
}

@Test
public void sortNotComparable() {
    final List<NotComparable> objects = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
        objects.add(new NotComparable(i));
    }

    try {
        Arrays.sort(objects.toArray());
    } catch (Exception e) {
        // correct behavior – cannot sort
        return;
    }

    fail();
}

It is not possible to use the Collections.sort method because the compiler expects the type of the parameter to be an implementation of Comparable. The method signature is:

public static <T extends Comparable<? super T>> void sort(List<T> list)

If you want to provide your own ordering, you provide an implementation of the Comparator interface to the sort method. This interface has two methods: int compare(T o1, T o2) for the implementing type T, and boolean equals(Object o). The compare method returns an int in one of three states: negative if the first argument is sorted before the second, zero if the two are equal, or positive if the second argument is sorted first.

If you were to provide a reverse numerical order Comparator, the implementation may look like Listing 4-4.

Listing 4-4: A reverse numerical order Comparator

public class ReverseNumericalOrder implements Comparator<Integer> {
    @Override
    public int compare(Integer o1, Integer o2) {
        return o2 - o1;
    }
    // equals omitted
}

Listing 4-5 uses this Comparator.

Listing 4-5: Using a custom ordering

@Test
public void customSorting() {
    final List<Integer> numbers = Arrays.asList(4, 7, 1, 6, 3, 5, 4);
    final List<Integer> expected = Arrays.asList(7, 6, 5, 4, 4, 3, 1);

    Collections.sort(numbers, new ReverseNumericalOrder());
    assertEquals(expected, numbers);
}

For the examples in this chapter, a simple natural ordering is assumed.


How would you implement a bubble sort algorithm?


The bubble sort algorithm is extremely simple to describe and implement. Here is an example in pseudocode, assuming a zero-indexed array:

for i between 0 and (array length – 2):
  if (array(i + 1) < array(i)):
    switch array(i) and array(i + 1)

repeat until a complete iteration where no elements are switched.

Following is a simple example with a small list:

6, 4, 9, 5 -> 4, 6, 9, 5: When i = 0, the numbers 6 and 4 are switched
4, 6, 9, 5 -> 4, 6, 5, 9: When i = 2, the numbers 9 and 5 are switched
4, 6, 5, 9: The first iteration switched numbers, so iterate again

4, 6, 5, 9 -> 4, 5, 6, 9: When i = 1, the numbers 6 and 5 are switched
4, 5, 6, 9: The second iteration swiched numbers, so iterate again

4, 5, 6, 9: No more numbers to switch, so this is the sorted list.

Listing 4-6 shows an implementation of the bubble sort in Java.

Listing 4-6: A bubble sort implementation

public void bubbleSort(int[] numbers) {
    boolean numbersSwitched;
    do {
        numbersSwitched = false;
        for (int i = 0; i < numbers.length - 1; i++) {
            if (numbers[i + 1] < numbers[i]) {
                int tmp = numbers[i + 1];
                numbers[i + 1] = numbers[i];
                numbers[i] = tmp;
                numbersSwitched = true;
            }
        }
    } while (numbersSwitched);
}

Although this implementation is simple, it is extremely inefficient. The worst case, when you want to sort a list that is already sorted in reverse order, is a performance of O(n2): For each iteration, you are only switching one element. The best case is when a list is already sorted: You make one pass through the list, and because you have not switched any elements, you can stop. This has a performance of O(n).


How would you implement the insert sort algorithm?


The insert sort algorithm is another simple algorithm to describe:

Given a list l, and a new list nl
for each element originallistelem in list l:
  for each element newlistelem in list nl:
    if (originallistelem < newlistelem):
      insert originallistelem in nl before newlistelem
    else move to the next element
  if originallistelem has not been inserted:
    insert at end of nl

Listing 4-7 shows an implementation.

Listing 4-7: An insert sort implementation

public static List<Integer> insertSort(final List<Integer> numbers) {
    final List<Integer> sortedList = new LinkedList<>();

    originalList: for (Integer number : numbers) {
        for (int i = 0; i < sortedList.size(); i++) {
            if (number < sortedList.get(i)) {
                sortedList.add(i, number);
                continue originalList;
            }
        }
        sortedList.add(sortedList.size(), number);
    }

    return sortedList;
}

There are several points to note about this implementation. Notice that the method returns a new List unlike the bubble sort, which sorted the elements in place. This is mainly the choice of the implementation. Because this algorithm created a new List, it made sense to return that.

Also, the list implementation returned is a LinkedList instance. A linked list is very efficient in adding elements in the middle of the list, simply by rearranging the pointers of the nodes in the list. If an ArrayList had been used, adding elements to the middle would be expensive. An ArrayList is backed by an array, so inserting at the front or middle of the list means that all subsequent elements must be shifted along by one to a new slot in the array. This can be very expensive if you have a list with several million rows, especially if you are inserting early in the list. If the difference between list data structures, such as array lists and linked lists, are new to you, look at Chapter 5 on data structures and their implementations.

Finally, the outer loop in the implementation is labeled originalList. When an appropriate place to insert the element has been found, you can move on to the next element in the original array. Calling continue inside a loop will move on to the next iteration of the enclosing loop. If you want to continue in an outer loop, you need to label the loop and call continue with its identifier.

As hinted at with the outer loop labeling and the continue statement, once the element has successfully been placed, the algorithm moves on to the next element. This is an advantage over bubble sort: Once the list to return has been constructed, it can be returned immediately; there are no redundant cycles to check that the list is in order.

The worst-case performance for this algorithm is still O(n2), though: If you attempt to sort an already-sorted list, you need to iterate to the end of the new list with each element to insert. Conversely, if you sort a reverse-order list, you will be putting each element into the new list at the head of the list, which is O(n).

The algorithm described here uses twice as much space to sort the list because a new list is returned. The bubble sort algorithm only uses one extra slot in memory, to hold the value temporarily while being swapped.


How would you implement the quicksort algorithm?


The description for the quicksort algorithm is:

method quicksort(list l):
  if l.size < 2:
    return l

  let pivot = l(0)
  let lower = new list
  let higher = new list
  for each element e in between l(0) and the end of the list:
    if e < pivot:
      add e to lower
    else add e to higher

  let sortedlower = quicksort(lower)
  let sortedhigher = quicksort(higher)
  return sortedlower + pivot + sortedhigher

This algorithm is recursive. The base case is when a list is provided with zero or one element. You can simply return that list because it is already sorted.

The second part of the algorithm is to pick an arbitrary element from the list, called the pivot. In this case the first element in the list was used, but any element could have been picked. The remaining elements are separated into two: those lower than the pivot, and those equal to or larger than the pivot.

The method is then called on each of the two smaller lists, which will return in each segment being sorted. The final list is the smaller elements, now sorted, followed by the pivot, and then the higher sorted elements.

Listing 4-8 is an implementation in Java.

Listing 4-8: Quicksort

public static List<Integer> quicksort(List<Integer> numbers) {
    if (numbers.size() < 2) {
        return numbers;
    }

    final Integer pivot = numbers.get(0);
    final List<Integer> lower = new ArrayList<>();
    final List<Integer> higher = new ArrayList<>();

    for (int i = 1; i < numbers.size(); i++) {
        if (numbers.get(i) < pivot) {
            lower.add(numbers.get(i));
        } else {
            higher.add(numbers.get(i));
        }
    }

    final List<Integer> sorted = quicksort(lower);

    sorted.add(pivot);
    sorted.addAll(quicksort(higher));

    return sorted;
}

If you ever write a recursive algorithm, you must make sure that it terminates. The algorithm in Listing 4-8 is guaranteed to terminate, because each recursive call is made with a smaller list, and the base case is on a zero- or one-element list.

The performance of Listing 4-8 is much more efficient than the bubble sort and insertion sort algorithms. The separation of the elements into two separate lists is O(n), and each recursive call happens on half of each list, resulting in O(n log n). This is an average performance. The worst case is still O(n2). The choice of the pivot can make a difference: For the implementation given here, if you always pick the first element as the pivot, and the list is ordered in reverse, then the recursive steps only reduce by one each time.

It is worth noting that each division of the list and the subsequent recursive call is independent of any other sorting necessary, and could be performed in parallel.


How would you implement the merge sort algorithm?


The final sorting algorithm to examine in this chapter is the merge sort algorithm. The following pseudocode describes this recursive algorithm:

method mergesort(list l):
  if list.size < 2:
    return l

  let middleIndex = l.size / 2
  let leftList = elements between l(0) and l(middleIndex - 1)
  let rightList = elements between l(middleIndex) and l(size – 1)

  let sortedLeft = mergesort(leftList)
  let sortedRight = mergesort(rightList)

  return merge(sortedLeft, sortedRight)

method merge(list l, list r):
  let leftPtr = 0
  let rightPtr = 0
  let toReturn = new list

  while (leftPtr < l.size and rightPtr < r.size):
    if(l(leftPtr) < r(rightPtr)):
      toReturn.add(l(leftPtr))
      leftPtr++
    else:
      toReturn.add(r(rightPtr))
      rightPtr++

  while(leftPtr < l.size):
      toReturn.add(l(leftPtr))
      leftPtr++

  while(rightPtr < r.size):
      toReturn.add(r(rightPtr))
      rightPtr++

  return toReturn

This is another divide-and-conquer algorithm: Split the list into two, sort each sublist, and then merge the two lists together.

The main code is in merging the two lists efficiently. The pseudocode holds a pointer to each sublist, adds the lowest value pointed to, and then increments the appropriate pointer. Once one pointer reaches the end of its list, you can add all of the remaining elements from the other list. From the second and third while statements in the preceding merge method, one while statement will immediately return false, because all the elements will have been consumed from that sublist in the original while statement.

Listing 4-9 shows an implementation.

Listing 4-9: A merge sort implementation

public static List<Integer> mergesort(final List<Integer> values) {
    if (values.size() < 2) {
        return values;
    }

    final List<Integer> leftHalf =
            values.subList(0, values.size() / 2);
    final List<Integer> rightHalf =
            values.subList(values.size() / 2, values.size());

    return merge(mergesort(leftHalf), mergesort(rightHalf));
}

private static List<Integer> merge(final List<Integer> left,
                                   final List<Integer> right) {
    int leftPtr = 0;
    int rightPtr = 0;

    final List<Integer> merged = 
            new ArrayList<>(left.size() + right.size());

    while (leftPtr < left.size() && rightPtr < right.size()) {
        if (left.get(leftPtr) < right.get(rightPtr)) {
            merged.add(left.get(leftPtr));
            leftPtr++;
        } else {
            merged.add(right.get(rightPtr));
            rightPtr++;
        }
    }

    while (leftPtr < left.size()) {
        merged.add(left.get(leftPtr));
        leftPtr++;
    }

    while (rightPtr < right.size()) {
        merged.add(right.get(rightPtr));
        rightPtr++;
    }

    return merged;
}

There should be no surprise here; it is very similar to the pseudocode. Note that the subList method on List takes two parameters, from and to: from is inclusive, and to is exclusive.

A common question in interviews is often to ask you to merge two sorted lists, as was shown in the merge method in the preceding listing.

Again, merge sort has a performance of O(n log n). Each merge operation is O(n), and each recursive call works on only half of the given list.

Searching Lists


How would you implement a binary search?


When searching through a list, unless the list is sorted in some fashion, the only sure way to find a given value is to look at every value in the list.

But if you are given a sorted list, or if you sort the list before searching, a binary search is a very efficient method to see if a given value is in the list:

method binarySearch(list l, element e):
  if l is empty:
    return false

  let value = l(l.size / 2)
  if (value == e):
    return true

  if (e < value):
    return binarySearch(elements between l(0) and l(l.size / 2 - 1)
  else:
    return binarySearch(elements between l(l.size / 2 + 1) and l(l.size)

The beauty of this algorithm is that you use the property of the sorted list to your advantage. You can throw away many elements without even examining them, because you know they definitely cannot be equal to the given element. If you have a list with one million elements, you can find a given element in as little as twenty comparisons. This algorithm has the performance of O(n).

Listing 4-10 shows an implementation of a binary search.

Listing 4-10: Binary search

public static boolean binarySearch(final List<Integer> numbers,
                                   final Integer value) {
    if (numbers == null ||_numbers.isEmpty()) {
        return false;
    }

    final Integer comparison = numbers.get(numbers.size() / 2);
    if (value.equals(comparison)) {
        return true;
    }

    if (value < comparison) {
        return binarySearch(
                numbers.subList(0, numbers.size() / 2), 
                value);
    } else {
        return binarySearch(
                numbers.subList(numbers.size() / 2 + 1, numbers.size()), 
                value);
    }
}

Summary

For many interviews, the core of the assessment is often around implementing algorithms and understanding the performance. Make sure you understand how the core sorting and searching algorithms work and how they perform, because this will prepare you for any complex algorithms you are asked to perform during an interview.

Some of the algorithms shown here are recursive. Make sure you understand the implications of a recursive algorithm: They can look elegant, but can have surprising side effects due to the practicalities of calling a new method. The call stack has new values added, and if your call stack is too deep, you run the risk of a StackOverflowException.

Outside of an interview situation, whenever you need to sort a collection, use a library implementation. A tried and tested algorithm will almost always be more efficient than your own ad-hoc implementation. Any performance and memory concerns will have been examined. In fact, some of the sorting algorithms inside the Java standard libraries are implemented differently depending on the size of the list: Small lists are sorted in place using insert sort, but after a defined threshold, a merge sort is used.

The next chapter complements this chapter by providing an examination of some basic data structures, including lists, maps, and sets.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.137.164.24