COMP 202: Principles of Object-Oriented Programming II

Heaps and Heap Sort

Heap Sort

• Heap Sort, like Selection Sort, is a hard-split, easy-join method.
• Think of Heap Sort as an improved (faster) version of Selection Sort.
• Specifically, split(), which finds the largest (or smallest) element in the subarray, is made to run in O(log n) steps instead of O(n) steps, where n is the subarray length.
• Since split() is performed n times, where n is the (overall) array length, Heap Sort takes O(n log n) steps.

How is split() sped up?

• The elements in the unsorted portion of the array are organized into a heap.  A heap is a data structure that is optimized for repeatedly finding and removing the largest (or smallest) element.
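As a quick illustration of that usage pattern, here is a minimal sketch using Python's standard heapq module. Note the assumptions: heapq is a min-heap (it removes the smallest element first), the sample list is made up, and this is not the course's own implementation.

```python
import heapq

data = [5, 2, 9, 1]          # hypothetical sample data
heapq.heapify(data)          # organize the list into a (min-)heap in place

removed = []
while data:
    # Each pop finds and removes the smallest remaining element
    # in O(log n) steps instead of scanning the whole list.
    removed.append(heapq.heappop(data))

print(removed)  # [1, 2, 5, 9]
```

Because the elements come out in order, repeatedly removing from a heap is already a sort; Heap Sort does the same thing inside a single array.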

What is a Heap?

• A "complete" tree is a minimum height tree with all the nodes on the lowest level in their left-most positions. That is, the tree is completely filled from the top with any extra elements strictly on one side (left) of the lowest level. Note that in a complete tree, there is at most a variation of 1 in path lengths from the root to the leaves.

• A heap is (conceptually) a binary tree that
1. is complete and
2. also exhibits the heap property:
• the root, if non-null, is the largest (or smallest) key in the tree, and
• its left and right subtrees are themselves heaps.
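The definition above can be checked mechanically once the tree is stored in an array. The sketch below assumes the common 0-based array layout, in which the children of index i sit at 2i+1 and 2i+2; that layout, and the name is_max_heap, are assumptions for illustration and are not stated in these notes. With this layout, completeness comes for free, so only the heap property needs checking.

```python
def is_max_heap(a):
    """Return True if list `a`, read as a complete binary tree, is a max-heap.

    Assumed layout: the parent of index c is at (c - 1) // 2.
    """
    for child in range(1, len(a)):
        parent = (child - 1) // 2
        # Every node must be no larger than its parent; applied to all
        # nodes, this makes each subtree's root the largest key in it.
        if a[parent] < a[child]:
            return False
    return True

print(is_max_heap([9, 7, 8, 1, 3, 5]))  # True
print(is_max_heap([1, 2, 3]))           # False
```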

Initializing the Heap: HeapSorter()

To sort an array using the ordering capabilities of a heap, you must first transform the arbitrarily ordered data in the array into a heap.  This is called "heapifying" the array, and it can be accomplished by sifting down the elements.  Even under the loose bound of O(n log n) time for this step (a tighter analysis shows that it is O(n)), it occurs only once, so it does not change the overall O(n log n) complexity of the algorithm.

Note that we only really have to sift down half the array, i.e. half the "tree".  This is because a single-element array (tree) is already a heap, so we can bypass all the leaves and immediately start working on the layer right above the leaves.
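The heapifying step might be sketched as follows. This is an illustration under assumptions, not the course's HeapSorter code: it builds a max-heap, uses the 0-based layout with children at 2i+1 and 2i+2, and the names sift_down and heapify are hypothetical.

```python
def sift_down(a, start, end):
    """Move a[start] down until the subtree rooted there is a max-heap.

    Only indices below `end` are treated as part of the heap.
    """
    item = a[start]
    cur = start
    child = 2 * cur + 1
    while child < end:
        # Pick the larger of the two children, if a right child exists.
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1
        if a[child] <= item:
            break                      # item is already in a valid spot
        a[cur] = a[child]              # larger child moves up
        cur = child
        child = 2 * cur + 1
    a[cur] = item


def heapify(a):
    """Turn list `a` into a max-heap in place."""
    # Leaves are already one-element heaps, so start at the last node
    # that has a child and work back toward the root.
    for i in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, i, len(a))
```

Starting at index len(a) // 2 - 1 is what skips the leaves: every index after it has no children within the array.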

Inserting into an existing Heap: siftUp()

To insert a data element into an existing heap, we are forced to initially insert the element at the bottom of the tree, which is at the end of the array.  Since this may break the heap property, we need to "sift up" the data through the tree to find a spot for it where the overall heap property will be restored.  When sifting up, we take a data element (starting with the one being inserted) and compare it to its parent.  If the element is larger (or smaller, for a min-heap) than its parent, the parent moves down into the lower position and the element moves up.  The process repeats with the same element and the next higher parent until the element no longer beats its parent or the top of the heap is reached.
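A minimal sketch of insertion with sift-up, assuming a max-heap stored in a 0-based Python list where the parent of index c is (c - 1) // 2. The function names and the sample heap are hypothetical and are not taken from the course code.

```python
def sift_up(a, cur):
    """Move a[cur] up toward the root until its parent is at least as large."""
    item = a[cur]
    while cur > 0:
        parent = (cur - 1) // 2
        if a[parent] >= item:
            break                  # heap property holds; stop here
        a[cur] = a[parent]         # smaller parent drops into the lower slot
        cur = parent
    a[cur] = item


def heap_insert(a, item):
    """Append `item` at the bottom of the heap, then sift it up."""
    a.append(item)
    sift_up(a, len(a) - 1)


heap = [9, 7, 8, 1, 3]             # already a max-heap
heap_insert(heap, 10)
print(heap)  # [10, 7, 9, 1, 3, 8]
```

The new element climbs at most one level per comparison, so an insertion costs O(log n) steps.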

Sample Code
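What follows is a minimal sketch of the whole algorithm, not the course's own sample code. It assumes an ascending sort built on a max-heap, a 0-based array layout with children at 2i+1 and 2i+2, and hypothetical function names.

```python
def sift_down(a, start, end):
    """Restore the max-heap property for the subtree at `start` within a[:end]."""
    item = a[start]
    cur = start
    child = 2 * cur + 1
    while child < end:
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1             # use the larger child
        if a[child] <= item:
            break
        a[cur] = a[child]
        cur = child
        child = 2 * cur + 1
    a[cur] = item


def heap_sort(a):
    """Sort list `a` in place in ascending order."""
    n = len(a)
    # Heapify: sift down every non-leaf node, last one first.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Split n - 1 times: the root is the largest unsorted element, so
    # swap it to the end of the unsorted portion, then repair the heap
    # in O(log n) steps.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)


data = [5, 2, 9, 1, 5, 6]
heap_sort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```

The sorted portion grows from the right end of the array while the heap shrinks on the left, so no extra storage is needed.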

Sort        Best-case Cost    Worst-case Cost
Selection   O(n^2)            O(n^2)
Insertion   O(n)              O(n^2)
Heap        O(n log n)        O(n log n)
Merge       O(n log n)        O(n log n)
Quick       O(n log n)        O(n^2)

• Selection sort performs the fewest swaps, O(n), in the worst case.
• Insertion sort is best if the array is nearly or already sorted.
• Heap sort performs more comparisons than Merge sort, but only by a constant factor.
• Merge sort requires extra storage proportional in size to the input.
• Quick sort typically (expected case) outperforms Heap and Merge sort because of its simplicity.

URL: http://www.cs.rice.edu/teaching/202/08-fall/lectures/sort3/index.shtml