COMP 202: Principles of Object-Oriented Programming II
Heaps and Heap Sort
Heap Sort
- Heap Sort, like Selection Sort, is a hard-split, easy-join method.
- Think of Heap Sort as an improved (faster) version of Selection Sort.
- Specifically, split(), which finds the largest (or smallest) element in the subarray, is made to run in O(log n) steps instead of O(n) steps, where n is the subarray length.
- Since split() is performed n times, where n is the (overall) array length, Heap Sort takes O(n log n) steps.
How is split() sped up?
- The elements in the unsorted portion of the array are organized into a heap. A heap is a data structure that is optimized for repeatedly finding and removing the largest (or smallest) element.
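- For a sense of the difference: with n = 1,000,000 elements, n log2 n is roughly 20 million steps, whereas the n^2 steps of Selection Sort are on the order of a trillion.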
What is a Heap?
- A "complete" tree is a minimum height tree with all the nodes on
the lowest level in their left-most positions. That is, the tree is
completely filled from the top with any extra elements strictly on one side
(left) of the lowest level. Note that in a complete tree, there is at most a
variation of 1 in path lengths from the root to the leaves.
- A heap is (conceptually) a binary tree that
  - is complete and
  - also exhibits the heap property:
    - the root, if non-null, is the largest (or smallest) key in the tree, and
    - its left and right subtrees are themselves heaps.
Implementing a Heap?
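A heap is typically not built out of linked node objects; because the tree is complete, it can be stored directly in an array with no gaps. As a rough sketch (assuming the common convention that the heap occupies positions 0 through size-1 of the array; the helper names below are illustrative, not the course's exact code), the parent/child links reduce to index arithmetic:

    // Sketch: array-based layout of a complete binary tree.
    // The node at index i has its children at 2*i + 1 and 2*i + 2,
    // and its parent at (i - 1) / 2 (integer division).
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    static int parent(int i)     { return (i - 1) / 2; }

The root sits at index 0 and the leaves fill the end of the array, so "the bottom of the tree" is simply the last used slot of the array.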
Heap Sort Basics
Heap Sort: split()
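Under that array layout, split() can be sketched as follows (an illustrative version, not necessarily the course's exact code): swap the root, which holds the largest unsorted element, with the last element of the unsorted region, shrink the heap by one, and sift the new root down, using a siftDown() helper like the one sketched below, to restore the heap property.

    // Sketch: one split() step on a max-heap stored in a[0..size-1].
    // The largest element ends up in its final sorted position, a[size - 1].
    static void split(int[] a, int size) {
        int last = size - 1;
        int tmp = a[0];          // the root holds the largest element
        a[0] = a[last];
        a[last] = tmp;           // the largest element is now in its final position
        siftDown(a, 0, last);    // restore the heap property in a[0..last-1]
    }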
Example of siftDown()
siftDown(): The Implementation
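A minimal sketch of siftDown() for a max-heap of ints, using the array layout assumed above (the signature and names are illustrative):

    // Sketch: sift the element at index 'start' down through the heap
    // a[0..size-1] until neither of its children is larger than it.
    static void siftDown(int[] a, int start, int size) {
        int i = start;
        while (2 * i + 1 < size) {                // while node i has a left child
            int child = 2 * i + 1;
            if (child + 1 < size && a[child + 1] > a[child]) {
                child = child + 1;                // use the larger of the two children
            }
            if (a[i] >= a[child]) {
                break;                            // heap property already holds
            }
            int tmp = a[i];                       // swap with the larger child
            a[i] = a[child];
            a[child] = tmp;
            i = child;                            // continue one level down
        }
    }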
Initializing the Heap: HeapSorter()
In order to sort an array using the ordering capabilities of a heap, you must first transform the randomly arranged data in the array into a heap. This is called "heapifying" the array, and it can be accomplished by sifting down the elements. Luckily, even though this operation takes at most O(n log n) time, it occurs only once, so it does not change the overall O(n log n) complexity of the algorithm.
Note that we only really have to sift down half the array, i.e., half the "tree". This is because a single-node tree (a leaf) is already a heap, so we can bypass all the leaves and immediately start working on the layer right above them.
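Sifting down the non-leaf half of the array could look like this (a sketch built on the siftDown() assumed above, not the course's actual HeapSorter() constructor):

    // Sketch: turn an arbitrary array into a max-heap by sifting down every
    // non-leaf node, starting at the last node that has a child.
    static void heapify(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            siftDown(a, i, a.length);
        }
    }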
Inserting into an existing Heap: siftUp()
To insert a data element into an existing heap, we have to start by placing the element at the bottom of the tree, which is at the end of the array. Since this may break the heap property, we need to "sift up" the data through the tree until it reaches a spot where the overall heap property is restored. When sifting up, we compare the data element (starting with the one being inserted) to its parent; the larger (or smaller) of the pair moves into the parent's position, and the other stays below it. The process repeats with the element now in the parent's position and the next higher parent, until the heap property holds or the top of the heap is reached.
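A minimal sketch of siftUp() under the same assumed max-heap layout: the new value is appended at the end of the array and then swapped upward past any smaller parent.

    // Sketch: insert 'value' into a max-heap currently occupying a[0..size-1];
    // the array is assumed to have a free slot at index 'size'.
    static void siftUp(int[] a, int size, int value) {
        int i = size;
        a[i] = value;                             // insert at the bottom of the tree
        while (i > 0 && a[(i - 1) / 2] < a[i]) {
            int tmp = a[(i - 1) / 2];             // parent is smaller: swap upward
            a[(i - 1) / 2] = a[i];
            a[i] = tmp;
            i = (i - 1) / 2;                      // continue from the parent's position
        }
    }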
Sample Code
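Putting the sketches above together (again an illustrative version rather than the course's HeapSorter class), the whole sort is one heapify followed by n-1 split() steps:

    // Sketch: heap sort of an int array into ascending order.
    static void heapSort(int[] a) {
        heapify(a);                               // build the heap once
        for (int size = a.length; size > 1; size--) {
            split(a, size);                       // move the current max to a[size-1]
        }
    }

For example, heapSort(new int[]{5, 1, 4, 2, 3}) rearranges the array into {1, 2, 3, 4, 5}.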
Sort      | Best-case Cost | Worst-case Cost
Selection | O(n²)          | O(n²)
Insertion | O(n)           | O(n²)
Heap      | O(n log n)     | O(n log n)
Merge     | O(n log n)     | O(n log n)
Quick     | O(n log n)     | O(n²)
- Selection sort performs the fewest swaps, O(n), in the worst case.
- Insertion sort is best if the array is nearly or already sorted.
- Heap sort performs a constant factor more comparisons than Merge sort.
- Merge sort requires extra storage proportional to the size of the input.
- Quick sort typically (expected case) outperforms Heap and Merge sort because of the simplicity (low constant factors) of its inner loop.
URL: http://www.cs.rice.edu/teaching/202/08-fall/lectures/sort3/index.shtml
Copyright © 2008-2010 Mathias Ricken and Stephen Wong