Merge Sort

We have presented, so far, two sorting algorithms: bubble sort and insertion sort. The performance of both of them will be better if the data is partially sorted. The third algorithm presented in this chapter is the merge sort algorithm, which was developed in 1940 by John von Neumann. The defining feature of this algorithm is that its performance is not dependent on whether the input data is sorted. Like MapReduce and other big data algorithms, it is based on a divide and conquer strategy. In the first phase, called splitting, the algorithm keeps on dividing the data into two parts recursively, until the size of the data is less than the defined threshold. In the second phase, called merging, the algorithm keeps on merging and processing until we get the final result. The logic of this algorithm is explained in the following diagram:

Let's first look into the pseudocode of the merge sort algorithm:

mergeSort(list, start, end) 
if(start < end)
midPoint = (end - start) / 2 + start
mergeSort(list, start, midPoint)
mergeSort(list, midPoint + 1, start)
merge(list, start, midPoint, end)

As we can see the algorithm has the following three steps:

  1. It divides the input list into two equal parts
  2. It uses recursion to split until the length of each list is 1
  3. Then, it merges the sorted parts into a sorted list and returns it

The code for implementing MergeSort is shown here:

When the preceding Python code is run, it generates an output, as follows:

Note that the code results are in a sorted list. 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.22.71.106