Introsort
Encyclopedia
Introsort or introspective sort is a sorting algorithm
Sorting algorithm
In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order...

 designed by David Musser
David Musser
David Musser is a professor of computer science at the Rensselaer Polytechnic Institute in Troy, New York, U.S.He is known for his work in generic programming, particularly as applied to C++. His research with Alexander Stepanov led to the creation of the C++ Standard Template Library...

 in 1997. It begins with quicksort and switches to heapsort
Heapsort
Heapsort is a comparison-based sorting algorithm to create a sorted array , and is part of the selection sort family. Although somewhat slower in practice on most machines than a well implemented quicksort, it has the advantage of a more favorable worst-case O runtime...

 when the recursion depth exceeds a level based on (the logarithm
Logarithm
The logarithm of a number is the exponent by which another fixed value, the base, has to be raised to produce that number. For example, the logarithm of 1000 to base 10 is 3, because 1000 is 10 to the power 3: More generally, if x = by, then y is the logarithm of x to base b, and is written...

 of) the number of elements being sorted. It is the best of both worlds, with a worst-case O(n log n) runtime and practical performance comparable to quicksort on typical data sets. Since both algorithms it uses are comparison sort
Comparison sort
A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list...

s, it too is a comparison sort.

In quicksort, one of the critical operations is choosing the pivot: the element around which the list is partitioned. The simplest pivot selection algorithm is to take the first or the last element of the list as the pivot, causing poor behavior for the case of sorted or nearly-sorted input. Niklaus Wirth
Niklaus Wirth
Niklaus Emil Wirth is a Swiss computer scientist, best known for designing several programming languages, including Pascal, and for pioneering several classic topics in software engineering. In 1984 he won the Turing Award for developing a sequence of innovative computer languages.-Biography:Wirth...

's variant uses the middle element to prevent these occurrences, degenerating to O(n²) for contrived sequences. The median-of-3 pivot selection algorithm takes the median of the first, middle, and last elements of the list; however, even though this performs well on many real-world inputs, it is still possible to contrive a median-of-3 killer list that will cause dramatic slowdown of a quicksort based on this pivot selection technique. Such inputs could potentially be exploited by an aggressor, for example by sending such a list to an Internet server for sorting as a denial of service attack.

Musser reported that on a median-of-3 killer sequence of 100,000 elements, introsort's running time was 1/200 that of median-of-3 quicksort. Musser also considered the effect on caches
CPU cache
A CPU cache is a cache used by the central processing unit of a computer to reduce the average time to access memory. The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations...

 of Sedgewick
Robert Sedgewick (computer scientist)
Robert Sedgewick is a computer science professor at Princeton University and a member of the board of directors of Adobe Systems....

's delayed small sorting, where small ranges are sorted at the end in a single pass of insertion sort
Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort...

. He reported that it could double the number of cache misses, but that its performance with double-ended queues was significantly better and should be retained for template libraries, in part because the gain in other cases from doing the sorts immediately was not great.

Similarly, Musser also introduced a worst-case linear selection algorithm
Selection algorithm
In computer science, a selection algorithm is an algorithm for finding the kth smallest number in a list . This includes the cases of finding the minimum, maximum, and median elements. There are O, worst-case linear time, selection algorithms...

 with time comparable to that of Hoare's algorithm, a simple adaptation of quicksort that is the most efficient selection algorithm used in practice. This is called introspection selection or Introselect.

The June 2000 SGI
Silicon Graphics
Silicon Graphics, Inc. was a manufacturer of high-performance computing solutions, including computer hardware and software, founded in 1981 by Jim Clark...

 C++ Standard Template Library
Standard Template Library
The Standard Template Library is a C++ software library which later evolved into the C++ Standard Library. It provides four components called algorithms, containers, functors, and iterators. More specifically, the C++ Standard Library is based on the STL published by SGI. Both include some...

 stl_algo.h implementation of unstable sort uses the Musser introsort approach with the recursion depth to switch to heapsort passed as a parameter, median-of-3 pivot selection and the Sedgewick final insertion sort pass. The element threshold for switching to the simple insertion sort was 16.

External links

  • "A guide to Introsort" Paper created over the course of a student research project by Ralph Unden. Contains a complete implementation in Java.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK