Sunday, 8 January 2012

What is the quickest sorting method to use?

The answer depends on what you mean by quickest. For most sorting problems, it just doesn’t matter how quick the sort is because it is done infrequently or other operations take significantly more time anyway. Even in cases in which sorting speed is of the essence, there is no one answer. It depends on not only the size and nature of the data, but also the likely order. No algorithm is best in all cases.

There are three sorting methods in this author’s “toolbox” that are all very fast and that are useful in different situations. Those methods are quick sort, merge sort, and radix sort.

The Quick Sort


The quick sort algorithm is of the “divide and conquer” type. That means it works by reducing a sorting problem into several easier sorting problems and solving each of them. A “dividing” value is chosen from the input data, and the data is partitioned into three sets: elements that belong before the dividing value, the value itself, and elements that come after the dividing value. The partitioning is performed by exchanging elements that are in the first set but belong in the third with elements that are in the third set but belong in the first. Elements that are equal to the dividing element can be put in any of the three sets—the algorithm will still work properly.

After the three sets are formed, the middle set (the dividing element itself ) is already sorted, so quick sort is applied to the first and third sets, recursively. At some point, the set being sorting becomes too small for quick sort. Obviously, a set of two or fewer elements cannot be divided into three sets. At this point, some other sorting method is used. The cutoff point at which a different method of sorting is applied is up to the person implementing the sort. This cutoff point can dramatically affect the efficiency of the sort, because there are methods that are faster than quick sort for relatively small sets of data.

The string sorting example we discussed in the last post will be rewritten using a quick sort. Excuse the preprocessor trickery, but the goal is to make the code readable and fast. Below example shows myQsort(), an implementation of the quick sort algorithm from scratch.

The function myQsort() sorts an array of strings into ascending order. First it checks for the simplest cases. On line 17 it checks for the case of zero or one element in the array, in which case it can return—the array is already sorted. Line 19 checks for the case of an array of two elements, because this is too small an array to be handled by the rest of the function. If there are two elements, either the array is sorted or the two elements are exchanged to make the array sorted.

Line 28 selects the middle element of the array as the one to use to partition the data. It moves that element to the beginning of the array and begins partitioning the data into two sets. Lines 37–39 find the first element in the array that belongs in the second set, and lines 45–47 find the last element in the array that belongs in the first set.

Line 49 checks whether the first element that belongs in the second set is after the last element that belongs in the first set. If this is the case, all the elements in the first set come before the elements in the second set, so the data are partitioned. Otherwise, the algorithm swaps the two elements so that they will be in the proper set, and then continues.

After the array has been properly partitioned into two sets, line 55 puts the middle element back into its proper place between the two sets, which turns out to be its correct position in the sorted array. Lines 57 and 58 sort each of the two sets by calling myQsort() recursively. When each set is sorted, the entire array is sorted.

An implementation of quick sort that doesn’t use the qsort() function.


No comments:

Post a Comment