Sorting

Goals

Discover how sorting algorithms work at an intuitive level.
Learn the general categories of sorting algorithms.
Be able to make intelligent decisions when choosing sorting algorithms.
Understand and use the Comparable<T> interface for natural order comparisons in Java.

Concepts

bubble sort
bucket sort
comparison sort
counting sort
heap
heap sort
in place
insertion sort
merge sort
natural ordering
pivot
recursion
quicksort
searching
selection sort
sorting
space complexity
stability
stable
time complexity

Library

Lesson

In your first encounter with algorithms in this course, you learned that finding things and ordering things are two common uses for algorithms. You've already implemented an algorithm for searching that combines with a special data structure called a hash table. You also looked at another technique called “binary search” which required that the items first be placed in order. Now you will learn more about sorting techniques for placing items in some defined order, so that the data may be used with other algorithms such as binary search or simply because the user prefers results to be shown in some order.

Computer scientists have spent decades researching efficient ways of sorting items with little memory. This was particularly important in the early days of computing, because processors had limited power and computer had little memory. With more and more data being produced, efficiently searching and sorting data remains important. Although most programming languages already contain libraries for sorting simple lists, studying the different categories of sorting algorithms helps understand the development and evaluation of algorithms in general. Moreover some sorting techniques are better applied to some types of data than others, and each comes with its own benefits and limitations. As with searching, you'll discover that sometimes certain sorting algorithms require or at least prefer the information to be stored in certain data structures to be most effective.

Evaluating Algorithms

If there has been so much research into sorting algorithms, why can't we just pick the best one and always use it? There is no “perfect” sorting algorithm because each technique has ways in which it excels, each one usually has a downside. One algorithm might be extremely fast, however, but might only work with certain types of data. Other algorithms might use large amounts of memory. And sometimes an algorithm may perform better in most cases but in some cases may perform worse than others if the input data is arranged in a certain way. (Nevertheless some sorting algorithms have become very popular in general-purpose libraries, explained later in the Quicksort section.)

There are several different aspects that allow us to compare algorithms.

time complexity: How much longer does an algorithm take when presented with more data? This is the order of grown you studied in the initial lesson on algorithms. You learned that often what concerns us most is the worst case running time of algorithms, but with sorting algorithms the analysis become more nuanced: there are some sorting algorithms with horrible worst case performance, but that in the average case performs pretty well.
space complexity: How much more space does the algorithm use? Some sorting algorithms make their changes in place, modifying the original data directly. Other algorithms require copying the data, resulting doubling or more the memory used.
stability: Does the algorithm maintain the original relative order of the items? It might seem odd to ask if a sorting algorithm maintains an original order, because sorting by definition is meant to change the order. But you if the key on which you are searching has duplicates, the items grouped together may or not be in their original order. Only a stable sorting algorithm will maintain the same relative order for grouped items. Suppose you had a list of flights in order of landing at some airport. You might sort the flights by the airline name, the landing times of each airline might be shuffled if the algorithm were not stable.

O(`n`²) Sorting Algorithms

Bubble Sort

One of the most famous sorting algorithms is also one of the easiest to understand. The algorithm is conceptually two steps:

Pass through the items two at a time, for each pair switching them if they are out of order.
Repeat step #1 until nothing gets switched during a pass.

Bubble Sort

/**
 * Sorts an array in place using the Bubble Sort algorithm.
 * @param items The items to sort.
 */
public static void bubbleSort(final int[] items) {
  final int length = items.length;
  int n = length;
  boolean somethingChanged;
  do { //each pass
    somethingChanged = false;
    for(int i = 0; i < n - 1, i++) {
      if(items[i] > items[i + 1]) {
        final int temp = items[i];  //swap
        items[i] = items[i + 1];
        items[i + 1] = temp;
        somethingChanged = true;
      }
    }
    n--;  //optimization
  } while(somethingChanged);
}

On each pass we loop until the next-to-last item n - 1, because we compare each item with the one after it.

If an item is out of order, each pass will slowly move it to the right—it will “bubble up” to the top, which is how the algorithm gets its name: the bubble sort. In fact each pass results in the largest item moving all the way to the right. In the next pass, the next-largest item is moved to the second-to-top position, and so on. Thus the bubble sort performs two different activities: a comparison and a swap.

You might realize that if on each pass the largest item is sorted to last, there will be one fewer swap that could be made. In fact you could add an optimization n-- on each pass so that the last items never even need to be compared. This cuts the number of comparisons down to n²/2 which is the same as 1.5n². As we ignore the constants in big O notation, this is still O(n²) complexity.

In a worst case scenario, in which the original items are in reverse order, the first pass will move the first (largest) item all the way to the right. Put another way, the first pass moves the item basically n positions. Then the second pass moves the next item almost n positions. In rough terms, we have to move around n positions for each of the n items, or n × n = n² movements altogether.

In a worse case therefore the bubble sort has O(n²) time complexity. You will remember from the algorithms discussion on big O notation, O(n²) time is not very fast, to put it mildly. Fortunately there are more efficient algorithms than bubble sort.

Selection Sort

The selection sort is another approach that is simple to understand.

Look through the list and find the smallest item.
Swap it with the first item.
Repeat steps #1 and #2 with each next-to-smallest item until you reach the end of the array.

This doesn't make the algorithm any faster than bubble sort. In fact it makes things worse: because it repeatedly searches for each next-to-smallest item, it will always look through the list n² times even when the list is already sorted! In other words, selection sort has O(n²) running time even in the best case scenario, when nothing needs to be changed.

Insertion Sort

The insertion sort at first appears similar to the selection sort, but it is more efficient. Rather than searching for the next smallest element, it simply takes the next element and moves it down to the correct position.

Start with the first item in the array.
Find the first previous item that is smaller than the item and insert the current item after that item. The first time performing these steps there will be no previous items at all.
Go to the next item and repeat steps #1 and #2.

This algorithm moves through the array, taking each item and moving it down (effectively inserting it) to its correct position. At the end of each step, all items to the left of the current position will be sorted.

The benefit of the insertion sort is that it has less work to do if the list is already partially sorted. In fact if the list is already sorted, insertion sort performs no changes, going through the list a single time, resulting in O(n) performance! Nevertheless in general, for random unsorted data, insertion sort has an order of growth of O(n²) like selection sort and bubble sort.

O(`n` log `n`) Sorting Algorithms

Merge Sort

A lot of the overhead of the O(n²) algorithms involve comparing an item over and over with different items, and moving an item back and forth, one step at a time, over long distances. We need some way to cut down on the comparisons.

What if you had two arrays that were already sorted. Would you be able to put them together into a single, sorted array? That wouldn't be too difficult: you'd simply look at each array and take the smallest available from each array until both arrays were finished.

Create an array big enough to hold all the values.
Keep an index into both input arrays and into the output array.
See which value is smaller in one of the input arrays, and copy it to the output array.
Continue repeating step #3 until both arrays have been copied.

Merging two sorted lists.

/**
 * Merges two sorted arrays into one.
 * @param array1 The first array, already sorted.
 * @param array2 The second array, already sorted.
 * @return The two arrays merged in the correct order.
 */
static int[] merge(final int[] array1, final int[] array2) {
  final int outputLength = array1.length + array2.length;
  final int[] output = new int[outputLength];
  int index1 = 0;
  int index2 = 0;
  for(int outputIndex = 0; outputIndex < outputLength; outputIndex++) {
    final int value;
    if(index1 >= array1.length) { //array 1 finished
      value = index2[index2++]; //copy from array 2
    } else if(index2 >= array2.length) { //array 2 finished
      value = index1[index1++]; //copy from array 1
    } else { //haven't reached end of either array
      if(array1[index1] <= array2[index2]) {
        value = array1[index1++];
      } else {
        assert array1[index1] > array2[index2];
        value = array2[index2++];
      }
    }
    output[outputIndex] = value;
  }
  return output;
}

You should easily see that the time complexity of this merge is only O(n), which is pretty fast. The algorithm needs to iterate over all the arrays a single time, at the same time.

Merge sort. — Steps effectively taken to perform a merge sort (Wikimedia Commons).

Even though it is easy and fast to merge two sorted arrays, you may wonder how to get the two sorted input arrays to begin with. After all, sorting arrays is the problem we started out with! But first think of a couple of possibilities. If you were given the following array, with a single element (value 38 at index 0)? Would it be easy to sort?

0
`38`

A single value is already sorted; there is nothing to do. You are done before you start. What about two values: 38 and 27?

0	1
`38`	`27`

That array is easy to sort as well. Simply compare the two values, and if they are out of order, switch the two values as you did in the bubble sort. Then you are finished. What about four values: 38, 27, 43, 3?

0	1	2	3
`38`	`27`	`43`	`3`

Before you start deciding which sort algorithm to use, remember that this array of four values can be separated into two arrays of two values each:

0	1
`38`	`27`

0	1
`43`	`3`

Graph of O(1), O(log n), O(n), O(n log n), and O(n^2) running times. — Graph of O(1), O(log `n`), O(`n`), O(`n` log `n`), and O(`n`²) running times.

By now you already know how to sort an array of two items, so that is simple enough. Then you will have two sorted arrays. But how do you get them back together? You probably already remember at the beginning of this section that we already created a method for merging two sorted arrays.

In the lesson on algorithms you learned about a general technique called divide and conquer. By dividing up the problem into pieces, it can be more efficient than working on the whole. The merge sort uses this approach to continually divide up the array into pieces, sort them, and then merge the pieces together. You should recognize that this sort of approach lends itself well to the technique of recursion, in which a method continually calls itself with each smaller divided piece.

Divide the array in half.
For each piece, divide that piece in half again using step #1.
When you reach a piece that is one or two elements long, stop dividing and switch those values if needed to make them sorted.
Merge the sorted piece with the other half, which has now been sorted as well using the same approach.
Keep merging the sorted pieces using step #4 until you arrive back at the full size of the original array, which is now sorted!

Dividing up the problem into smaller pieces drastically cuts down on the number of comparisons need to be made. It also lowers the number of times needed to move an item into its correct place relative to the others. The binary search algorithm, which also uses divide and conquer, was able to reduce the time to O(log n) because each division effectively discarded the rest of the data. Unfortunately the merge sort can't discard data (it is sorting, not searching), and each merge step takes a separate amount of time. Nevertheless, the merge sort is able to significantly reduce its running time in comparison with the bubble sort, resulting in an order of grown called O(n log n). This means O(n × log n), which is slower than O(log n) but still much faster than O(n²)!

Merge sort is stable; for any keys that are equal, their original order is maintained. Merge sort's biggest drawback is that it does not sort in place. Rather it requires values to be copied to some other location, sorted, and then merged back into the original array. A well written merge sort implementation can make this work with a single array the same size of the original array, but still this requires double the memory than would be required by an in-place sorting algorithm. For small data sets this will not be a problem, but if the amount of data to be sorted already takes up most of the memory, it might not be feasible allocate a complete second copy just to use merge sort.

Quicksort

The divide-and-conquer approach works well for making searching more efficient than O(n²). But the merge sort uses a lot of memory and results in a lot of copying. The quicksort algorithm manages to recursively sort an array in place, without making a copy of the array.

Choose some item in the array and designate it as the pivot. The algorithm will work using any value as the pivot, although some pivots perform better than others.
Swap elements as needed so that all items below the pivot are less than the pivot, and all items above the pivot are greater than the pivot. The technique described here varies slightly from the technique shown in the diagram, but the result is the same.
1. Find the first item greater than the pivot.
2. Find the last item greater than the pivot.
3. Swap the two items.
4. Keep searching towards the center of list until there are no more items to swap.
5. Swap the pivot with the center item; the pivot is now in its sorted order, but the items on each side still need to be sorted.
Recursively repeat steps #1 and #2 with the subarray on each side of the pivot.

A crucial factor in the quicksort algorithm is the choice of the pivot. Ideally, to make the most of “divide and conquer”, one would want to “divide” as much as possible on each recursive pass. That is, you would want to choose a pivot that winds up close to the middle of the array. Knowing ahead of time which item would be in the middle of the array is impossible, because that would require knowing how the values will be sorted! If nothing is known about the data, some quicksort algorithms choose a value at random to be the pivot. You can improve your chances at getting a pivot near the middle if you look at the first, last, and middle value of each segment and choose the median of the three.

The quicksort algorithm is clever and is usually pretty fast. Its average order of growth is O(n log n). The problem with comes with its worse case, which degrades to O(n²). Making matters worse, if a pivot is chosen on the end, quicksort exhibits worst-case performance for already-sorted data! Compare this to the insertion sort algorithm, which most of the time performs with O(n²) complexity, but for already sorted data provides a speedy O(n) performance.

Thus you can see that three algorithms we have discussed have trade-offs:

Insertion sort will give slow performance of O(n²) most of the time, but if the data is almost sorted already it can give an amazing O(n).
Merge sort consistently gives O(n log n) performance, but it takes a lot of memory.
Quicksort will perform faster than the other two most of the time, but if the data is mostly sorted already it can deteriorate into a horrible O(n²) performance.

Quicksort has become very popular and is the default sorting algorithm uses on many systems and in many programming languages. The Unix C library function qsort() is often implemented using quicksort, although some versions of Linux use heap sort instead. Notably recent versions of the Java standard library methods for sorting arrays of numbers such as Arrays.sort(int[] a) and Arrays.sort(long[] a) use a variation of quicksort.

Heap Sort

One other O(n log n) sorting algorithm is heap sort. You saw in the lesson on trees that placing items into a binary search tree and then using a depth-first, in-order traversal would iterate over the items in sorted order. A heap sort uses a special type of tree (usually a binary tree) called a heap.

The heap can be efficiently represented as a sequence of array elements instead of actual node objects with references. Sorting is performed in place in the array by manipulating the order of the items based upon the tree structure the array represents. Heap sort is usually not quite as fast as quicksort, but it doesn't degrade to O(n²) performance as quicksort does.

Sorting Algorithms Faster than O(`n` log `n`)

The sorting algorithms you have seen so far are examples of comparison sorts, algorithms that work by comparing two numbers at a time. Because numbers logically have to be compared with each other a certain number of times, it turns out that O(n log n) is the fastest theoretical time possible for a comparison sort. Although by luck some input data might sort faster, such as almost-sorted data with insertion sort, but there is no way to improve the order of growth on average with arbitrary data. Some comparison sorts will still be faster than others depending on the type of data, but will still exhibit O(n log n) time.

Counting Sort

It is however possible to sort certain types of data very quickly if you do not need to compare the actual values with each other. Suppose that you know in advance that all the values to be sorted fall in the range 0 – 999. Remembering that array lookup is a constant time operation, rather than comparing the numbers to each other, you simply use the number as an index into some separate array. The numbers inherently have an order, even without comparing them to each other. And array indexes in an array are also inherently in order.

The counting sort uses a separate “counting” array to record the presence of each number as it is encountered. Because it is assumed numbers can appear more than once, rather than recording a true mark in the counting array, a total count of each number is kept. After going through the original array once and marking the presence of each number, the counting array is used to rewrite the original array with the numbers in order.

Create a separate “counting” array with enough positions to hold the highest largest as an index.
Look at each value in the original array, and use it as an index into the counting array to increase the total count for that number.
Step through the counting array and rewrite the values into the original array, now in order.

Counting Sort

/**
 * Sorts an array in place using the Counting Sort algorithm.
 * @impSpec This implementation only supports values up to 999.
 * @param items The items to sort.
 * @throws IllegalArgumentException if one of the items is greater
 *     than 999.
 */
public static bubbleSort(final int[] items) throws IllegalArgumentException {
  final int[] counts = new int[999]; //these default to 0
  for(int item = 0; i < items.length; i++) {
    final int item = items[i];
    if(item >= counts.length) {
      throw new IllegalArgumentException("Item size not supported.");
    }
    counts[item]++;  //use the item as an index into the counting array
  }
  int itemIndex = 0;  //write the values back
  for(int item = 0; item < counts.length; item++) {
    final int count = counts[item]; //each index is the item
    for(int i = 0; i < count; i++) { //could have duplicate items
      items[itemIndex++] = item
    }
  }
}

The maximum value 999 is arbitrary. Supporting larger values would require more memory for the counting array, because it would need more indexes.

You can see that the counting sort goes through two loops separately. The first loop depends on the number of items which you can refer to as n. The second loop depends on the highest value supported, in this example 999, which you can refer to as k. Thus the running time is O(n + k). If the highest value supported is close to the number of values to sort, the result is close to O(n) time, which is extremely fast for sorting!

Bucket Sort

The memory required for counting sort depends on the highest value supported. Supporting very high values would require creating a counting array with the same number of indexes. Supporting values up to 1,000,000 for example would require a counting array of a million indexes, each holding an int requiring four bytes, for a total of four megabytes!

You might have noticed that the counting sort technique of uses values as indexes in an array resembles the approach used by hash tables. You recall that hash tables would have the same memory problem with large hash values; hash tables mitigate this problem by placing the grouping hash codes into “buckets”. Because this would result in hash collisions, each bucket contains a list of values which will need to be used to distinguish values with the same hash code.

The bucket sort uses the same approach. Each value is mapped to some bucket, using the modulus operator used in hash tables for example. A list is used to keep separate counts for the separate values in the buckets. Each bucket list itself will have to be sorted one of the other sorting techniques such as insertion sort.

Overall the efficiency of the bucket sort is similar to the O(n + k) of the counting sort, except that less memory is used and higher values can be sorted. With a well chosen number of buckets the running time can even approach O(n + k), which depending on the highest value supported can be close to O(n) time.

`Comparable<T>`

Comparable.compareTo(T object)

Return Value	Meaning
`< 0`	`this < object`
`0`	`this.equals(object)` See “consistency with equals” below.
`> 0`	`this > object`

The first two categories of sorting algorithms, those taking O(n²) and O(n log n), work by comparison of two values being sorted. But more needs to be said about that it means to “compare” two values. The examples so far used numbers, the primitive int type, which has a natural ordering—the order you would expect when counting with numbers for example. The natural order for primitive numbers is built into the JVM, which is why the less than < and greater than > operators work for them.

But how does one indicate the natural ordering of objects such as Vehicle or Animal? Java comes with an interface java.lang.Comparable<T> to be used with non-primitive types, that is classes, that should have a natural ordering. Its method Comparable.compareTo(T o) compares any instance of the object with another instance of the object. It returns an int indicating which object is greater than the other. Interestingly the actual magnitude of the returned value is not important; what is important is the sign. If a negative value is returned, the first object is less than the second. If a positive value is returned, the first object is greater than the second. If zero is returned, the objects are considered equal for purposes of their natural ordering.

java.util.Comparable<T>

package java.util;

public interface Comparable<T> {

  /**
    * Compares this object with the specified object for order.
    * @param object the object to be compared.
    * @return A negative integer, zero, or a positive integer depending on
    *     whether this object is less than, equal to, or greater than the
    *     specified object.
    * @throws NullPointerException if the specified object is <code>null</code>.
    * @throws ClassCastException if the specified object's type prevents it
    *     from being compared to this object.
    */
  int compareTo(T object);

}

Imagine that a class PlayerScore encapsulates the score of someone who has just played a video game. You might want to sort the scores to show the top-scoring players on the screen. To have a natural ordering, the class would need to implement Comparable<PlayerScore>, and in the implementation of compareTo(final PlayerScore playerScore) compare the object's score with the score of the given player.

Using Comparable<T> for natural order.

package com.example;

/**
 * Encapsulates the name and score of a video game player.
 * @apiNote This class has a natural ordering that is
 *     inconsistent with equals.
 */
public class PlayerScore implements Comparable<PlayerScore> {

  final String name;
  final int score;
  
  public String getName() {return name;}
  public int getScore() {return score;}

  /**
   * Constructor.
   * @param name The name of the player.
   * @param int The player's score.
   */
  PlayerScore(@Nonnull final String name, final int score) {
    …
  }

  /**
   * {@inheritDoc}
   * @implSpec This implementation compares the scores of the objects.
   * @implNote This implementation is inconsistent with {@link #equals()}.
   * @see #getScore()
   */
  @Override
  public int compareTo(final PlayerScore playerScore) {
    if(getScore() < playerScore.getScore()) {
      return -1; //any negative number will do
    } else if(getScore() > playerScore.getScore()) {
      return 1; //any positive number will do
    } else {
      assert getScore() == playerScore.getScore();
      return 0;
    }
  }

}

Performing so many comparisons when implementing Comparable.compareTo(T o) is normally not necessary. You've already see the primitive “wrapper” classes java.lang.Integer and java.lang.Long appear in previous lessons. Besides being able to represent a primitive value as an object, each of these classes contain utility methods for working with their respective primitive types. In particular the Integer.compare(int x, int y) and Long.compare(long x, long y) methods will compare any two primitive values and return a result that is compatible with the contract of Comparable.compareTo(T o). In essence these methods perform the same logic as the example above. Therefore the example could be shortened, simplified, and made less prone to errors simply by delegating to one of these comparison methods.

@Override
public int compareTo(final PlayerScore playerScore) {
  return Integer.compare(getScore(), player.getScore());
}

You might hear of a "trick" of implementing Comparable.compareTo(T o) for integers by simply subtracting the two values and returning the result. Unfortunately this trick is broken! If one integer is very large and the other very small, the difference may be larger than the largest allowed positive integer, resulting in an overflow (and a negative number). See Java Integer compareTo() - why use comparison vs. subtraction? (Stack Overflow)

The Comparable<T> API strongly recommends that the implementation be “consistent with equals”. This means that if the two classes return a Comparable.compareTo(T o) value of 0, indicating that they one is neither greater than nor less than the other, then Object.equals(Object obj) should return true. This is not an absolute requirement, but if the two don't return the same answer you might encounter inconsistent results in some of Java's collection implementations when sorting or checking for containment. Nevertheless, in the above example just because two players have the same score doesn't mean that they are the same player, so Object.equals(Object obj) was not implemented to return true for the same score. If you do not ensure that your Object.equals(Object obj) method returns true for objects that compare the same, you should note this in the contract of your Comparable implementation, such as using the @apiNote tag. Java recommends the following language:

"Note: this class has a natural ordering that is inconsistent with equals."

See Consequence when compareTo() is inconsistent with equals() for more discussion.

Now a general-purpose sorting method can sort a sequence of any objects with natural ordering, not just primitive numbers such as those stored in int. The Java standard library, for example, has a java.util.Arrays containing general utilities for working with arrays. The Arrays.sort(Object[] a) method will sort any array of objects that have a natural ordering. By now you should realize that this means each object should implement Comparable<T>, and should all be comparable to each other. Rather than using the less than < and greater than > operators, the Arrays.sort(Object[] a) method calls Comparable.compareTo(T o). For example, instead of comparing object1 < object2, the method compares object1.compareTo(object2) < 0.

At first you might wonder why this method signature is not Arrays.sort(Comparable<T>[] array) to ensure that all the objects implement Comparable<T>. This would be a good idea, except for one problem: Java arrays do not support generics, as you might recall from the previous chapter.

Review

Summary

Algorithm	Time Complexity	Stable	In Place
Bubble Sort	O(`n`²)	yes	yes
Selection Sort	O(`n`²)	no	yes
Insertion Sort	O(`n`²) Best case O(`n`) for partially sorted data.	yes	yes
Merge Sort	O(`n` log `n`)	yes	no
Quicksort	O(`n` log `n`) Very fast in practice. Worst case O(`n`²).	no	yes
Heap Sort	O(`n` log `n`)	no	yes
Counting Sort	O(`n` + `k`)	yes	no
Bucket Sort	O(`n` + `k`)	yes	no

Gotchas

Not every type has a natural ordering that makes sense. If you want to sort some items that have no inherent ordering, you may not want to implement Comparable<T> and instead use a sorting strategy, explained in a future lesson.
Don't return the difference of two integers inside Comparable.compareTo(…), as this can result in an integer overflow.
What does it mean for a Comparable<T> implementation to be “inconsistent with equals”? What problems can this cause?

In the Real World

Most of the time the built-in sorting algorithm for your programming language library has been tuned and will work fine for general sorting. Normally this algorithm winds up being a variant of quicksort or merge sort. In certain circumstances you may need to verify whether the algorithm is stable, if that is important to your use case. Otherwise, it is rare to implement sorting algorithms from scratch nowadays. However understanding the considerations provide important background when approaching larger data sets, especially those that may not all fit in memory and/or require several levels of sorting.

Think About It

Think back to the supplier addresses in an early lesson regarding furniture and toy suppliers. One Address class instance might represent “123 Some Street”, while another Address instance might represent “456 Some Street”. You might say that the Address class then has a natural ordering, because obviously “123 Some Street” comes before “456 Some Street”.

But what about other streets? Is the address “123 Some Street” considered the same as “123 Other Street”? You may want to have two levels of comparison: first to sort the street names, and then if the street names are equal (i.e. you would otherwise have returned 0 for the comparison), compare the actual address number on the street.

Self Evaluation

If most programming languages already contain libraries for sorting, why is it useful to study sorting algorithms?
What is one of the biggest drawback of the merge sort? What is one of its benefits, besides its quick running time?
The performance of merge sort depends on what factor, and how does this relate to the design of hash tables?
What is the crucial choice that needs to be made that will determine how quicksort will perform?
Which general type of sorting algorithm does Java use for sorting numbers in its core libraries? What about objects? Why?

Task

When you created a binary search tree in a previous lesson, you had no way to make it work generally for objects. You defined its comparisons in terms of publication date, naming it PublicationBinarySearchTree and putting it in the booker project. But now you have a way to make comparisons for objects in general, as long as they implement Comparable<T>.

Improve your publication hierarchy to have a natural order based upon publication date—the same date that you are currently using for comparison in the PublicationBinarySearchTree.
- Your publication interface will extend Comparable<T> interface.
- Specify in the the publication API documentation that implementations must make comparisons based upon the date. This is part of the publication interface contract.
- Because you are not overriding equals(), note that the comparison implementation is inconsistent with equals.
Refactor your PublicationBinarySearchTree to make comparisons using any instance of Comparable<T> instead of just Publication.
- Remove “Publication” from the name of the interface and its implementation.
- Move the tree interface and implementation to the datastruct project to be with your other data structures such as linked lists and hash tables. Your unit tests that test the tree with Publication will of course need to stay in the booker project.
- Add unit tests that show the binary search tree works with Integer and Long instances, which also implement Comparable<T>.
Print out your publications in order of publication date using your official user publication display methods. Currently you print out the publications from the application's linked list. Now you will sort the publications before printing them. You are already printing the publications using toString() by traversing the binary search tree; you can leave this for now. This will result in publications being printed twice: once in user display format, and the second time using the toString() format.
1. Copy your publications from a linked list to an array.
2. Sort the publications using Arrays.sort(Object[] a). This works because your publications now implement Comparable<T>.
3. Copy the publication, now sorted, from the array into a new linked list and print them as normal.

References

A Visual Overview of Sorting Algorithms (Adam A. Smith)

Acknowledgments

A Common-Sense Guide to Data Structures and Algorithms (Jay Wengrow - The Pragmatic Bookshelf, 2017)
Algorithms, Fourth Edition (Robert Sedgewick, Kevin Wayne - Addison-Wesley, 2011)
Algorithms in a Nutshell, Second Edition (George T. Heineman, Gary Pollice, Stanley Selkow - O'Reilly, 2016)
Data Structures and Algorithms in Java, Second Edition (Robert LaFore - Sams, 2003)
Essential Algorithms (Rod Stephens - Wiley, 2013)
Grokking Algorithms (Aditya Y. Bhargava - Manning, 2016)
Introduction to Algorithms, Third Edition (Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein - The MIT Press, 2009)
Merge sort algorithm diagram by VineetKumar at English Wikipedia, via Wikimedia Commons.
Quicksort diagram by Znupi [Public domain] at English Wikipedia, via Wikimedia Commons.
Big O notation graph modified from Comparison computational complexity by Cmglee (Own work) [CC BY-SA 4.0 or GFDL], via Wikimedia Commons.

Sorting

Goals

Concepts

Library

Lesson

Evaluating Algorithms

O(n2) Sorting Algorithms

Bubble Sort

Selection Sort

Insertion Sort

O(n log n) Sorting Algorithms

Merge Sort

Quicksort

Heap Sort

Sorting Algorithms Faster than O(n log n)

Counting Sort

Bucket Sort

Comparable<T>

Review

Summary

Gotchas

In the Real World

Think About It

Self Evaluation

Task

See Also

References

Acknowledgments

O(`n`²) Sorting Algorithms

O(`n` log `n`) Sorting Algorithms

Sorting Algorithms Faster than O(`n` log `n`)

`Comparable<T>`