=== Efficient sorts ===
Practical general sorting algorithms are almost always based on an algorithm with average time complexity (and generally worst-case complexity) O(''n'' log ''n''), of which the most common are heapsort, merge sort, and quicksort. Each has advantages and drawbacks, with the most significant being that a simple implementation of merge sort uses O(''n'') additional space and a simple implementation of quicksort has O(''n''<sup>2</sup>) worst-case complexity. These problems can be solved or ameliorated at the cost of a more complex algorithm.

While these algorithms are asymptotically efficient on random data, various modifications are used for practical efficiency on real-world data. First, the overhead of these algorithms becomes significant on smaller data, so often a hybrid algorithm is used, commonly switching to insertion sort once the data is small enough. Second, the algorithms often perform poorly on already sorted or almost sorted data, which is common in real-world data and can be sorted in O(''n'') time by appropriate algorithms. Finally, they may also be [[unstable sort|unstable]], and stability is often a desirable property in a sort. Thus more sophisticated algorithms are often employed, such as [[Timsort]] (based on merge sort) or [[introsort]] (based on quicksort, falling back to heapsort).

==== Merge sort ====
{{Main|Merge sort}}
''Merge sort'' takes advantage of the ease of merging already sorted lists into a new sorted list. It starts by comparing every two elements (i.e., 1 with 2, then 3 with 4...) and swapping them if the first should come after the second. It then merges each of the resulting lists of two into lists of four, then merges those lists of four, and so on, until at last two lists are merged into the final sorted list.<ref>{{harvnb|Wirth|1986|pp=101–102}}</ref> Of the algorithms described here, this is the first that scales well to very large lists, because its worst-case running time is O(''n'' log ''n''). It is also easily applied to lists, not only arrays, as it requires only sequential access, not random access. However, it has additional O(''n'') space complexity and involves a large number of copies in simple implementations.

Merge sort has seen a relatively recent surge in popularity for practical implementations, due to its use in the sophisticated algorithm [[Timsort]], which is used for the standard sort routine in the programming languages [[Python (programming language)|Python]]<ref>{{cite web|url=http://svn.python.org/projects/python/trunk/Objects/listsort.txt|title=Tim Peters's original description of timsort|website=python.org|access-date=14 April 2018|archive-date=22 January 2018|archive-url=https://web.archive.org/web/20180122024335/http://svn.python.org/projects/python/trunk/Objects/listsort.txt|url-status=live}}</ref> and [[Java (programming language)|Java]] (as of [[JDK7]]<ref>{{cite web|url=http://cr.openjdk.java.net/~martin/webrevs/openjdk7/timsort/raw_files/new/src/share/classes/java/util/TimSort.java|title=OpenJDK's TimSort.java|website=java.net|access-date=14 April 2018|archive-date=14 August 2011|archive-url=https://web.archive.org/web/20110814013719/http://cr.openjdk.java.net/~martin/webrevs/openjdk7/timsort/raw_files/new/src/share/classes/java/util/TimSort.java|url-status=dead}}</ref>). Merge sort itself is the standard routine in [[Perl]],<ref>{{cite web|url=http://perldoc.perl.org/functions/sort.html|title=sort – perldoc.perl.org|website=perldoc.perl.org|access-date=14 April 2018|archive-date=14 April 2018|archive-url=https://web.archive.org/web/20180414233802/http://perldoc.perl.org/functions/sort.html|url-status=live}}</ref> among others, and has been used in Java at least since 2000 in [[Java version history#J2SE 1.3|JDK1.3]].<ref name="mergesort_in_jdk13">[http://java.sun.com/j2se/1.3/docs/api/java/util/Arrays.html#sort(java.lang.Object%5B%5D) Merge sort in Java 1.3], Sun. {{Webarchive|url=https://web.archive.org/web/20090304021927/http://java.sun.com/j2se/1.3/docs/api/java/util/Arrays.html#sort(java.lang.Object%5B%5D) |date=2009-03-04 }}</ref>
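The bottom-up merging just described can be sketched in a few lines of Python. This is a minimal teaching example, not the heavily optimized variant used by library routines such as Timsort:

<syntaxhighlight lang="python">
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # <= keeps equal elements in order (stable)
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])              # one side may have elements left over
    result.extend(right[j:])
    return result

def merge_sort(data):
    """Bottom-up merge sort: merge runs of length 1, then 2, then 4, ..."""
    runs = [[x] for x in data]           # start with n sorted runs of length 1
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0] if runs else []
</syntaxhighlight>

Each pass halves the number of runs while doubling their length, giving the O(log ''n'') passes over O(''n'') elements that produce the O(''n'' log ''n'') worst case; the auxiliary lists built in <code>merge</code> account for the O(''n'') extra space.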
==== Heapsort ====
{{Main|Heapsort}}
''Heapsort'' is a much more efficient version of [[selection sort]]. It also works by determining the largest (or smallest) element of the list, placing that at the end (or beginning) of the list, then continuing with the rest of the list, but it accomplishes this task efficiently by using a data structure called a [[heap (data structure)|heap]], a special type of [[binary tree]].<ref>{{harvnb|Wirth|1986|pp=87–89}}</ref> Once the data list has been made into a heap, the root node is guaranteed to be the largest (or smallest) element. When it is removed and placed at the end of the list, the heap is rearranged so that the largest remaining element moves to the root. Using the heap, finding the next largest element takes O(log ''n'') time, instead of O(''n'') for a linear scan as in simple selection sort. This allows heapsort to run in O(''n'' log ''n'') time, and this is also the worst-case complexity.
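A bare illustrative Python sketch of this heap-based selection follows; it shows the standard sift-down approach rather than a tuned library implementation:

<syntaxhighlight lang="python">
def sift_down(a, start, end):
    """Restore the max-heap property for the subtree rooted at `start`,
    considering only the elements a[start..end]."""
    root = start
    while 2 * root + 1 <= end:                      # while the root has a child
        child = 2 * root + 1
        if child + 1 <= end and a[child] < a[child + 1]:
            child += 1                              # pick the larger child
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]
            root = child
        else:
            return

def heapsort(a):
    """Sort the list `a` in place in O(n log n) worst-case time."""
    n = len(a)
    # Build a max-heap, so a[0] becomes the largest element.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(a, start, n - 1)
    # Repeatedly swap the root (current maximum) to the end of the list,
    # then re-heapify the shrunken prefix.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end - 1)
</syntaxhighlight>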
==== Recombinant sort ====
Recombinant sort is a non-comparison-based sorting algorithm developed by Peeyush Kumar et al. in 2020. The algorithm combines bucket sort, counting sort, radix sort, hashing, and dynamic programming techniques. It employs an n-dimensional Cartesian space mapping approach consisting of two primary phases: a hashing cycle that maps elements to a multidimensional array using a special hash function, and an extraction cycle that retrieves elements in sorted order. Recombinant sort achieves O(''n'') time complexity for best, average, and worst cases, and can process both numerical and string data types, including mixed decimal and non-decimal numbers.<ref>{{citation |url=https://www.iieta.org/journals/isi/paper/10.18280/isi.250513 |title=Recombinant Sort: N-Dimensional Cartesian Spaced Algorithm Designed from Synergetic Combination of Hashing, Bucket, Counting and Radix Sort|date=2020 |doi=10.18280/isi.250513 |last1=Kumar |first1=Peeyush |last2=Gangal |first2=Ayushe |last3=Kumari |first3=Sunita |journal=Ingénierie des Systèmes D Information |volume=25 |issue=5 |pages=655–668 |arxiv=2107.01391 }}</ref>

==== Quicksort ====
{{Main|Quicksort}}
''Quicksort'' is a [[divide-and-conquer algorithm]] which relies on a ''partition'' operation: to partition an array, an element called a ''pivot'' is selected.<ref>{{harvnb|Wirth|1986|p=93}}</ref><ref>{{citation |last1=Cormen |first1=Thomas H. |author1-link=Thomas H. Cormen |last2=Leiserson |first2=Charles E. |author2-link=Charles E. Leiserson |last3=Rivest |first3=Ronald L. |author3-link=Ron Rivest |last4=Stein |first4=Clifford |author4-link=Clifford Stein |title=Introduction to Algorithms |edition=3rd |place=Cambridge, MA |publisher=The MIT Press |year=2009 |isbn=978-0262033848 |pages=171–172}}</ref> All elements smaller than the pivot are moved before it and all greater elements are moved after it. This can be done efficiently in linear time and [[in-place]]. The lesser and greater sublists are then recursively sorted. This yields an average time complexity of O(''n'' log ''n'') with low overhead, making quicksort a popular algorithm. Efficient implementations of quicksort (with in-place partitioning) are typically unstable sorts and somewhat complex, but they are among the fastest sorting algorithms in practice. Together with its modest O(log ''n'') space usage, this makes quicksort one of the most popular sorting algorithms, available in many standard programming libraries.

The important caveat about quicksort is that its worst-case performance is O(''n''<sup>2</sup>); while this is rare, in naive implementations (choosing the first or last element as pivot) it occurs for sorted data, which is a common case. The most complex issue in quicksort is thus choosing a good pivot element, as consistently poor choices of pivots can result in drastically slower O(''n''<sup>2</sup>) performance, while good choices yield O(''n'' log ''n'') performance, which is asymptotically optimal. For example, if at each step the [[median]] is chosen as the pivot then the algorithm works in O(''n'' log ''n''). Finding the median, such as by the [[median of medians]] [[selection algorithm]], is however an O(''n'') operation on unsorted lists and therefore adds significant overhead to sorting. In practice, choosing a random pivot almost certainly yields O(''n'' log ''n'') performance.

If a guarantee of O(''n'' log ''n'') performance is important, there is a simple modification to achieve it. The idea, due to Musser, is to set a limit on the maximum depth of recursion.<ref>{{citation |last1=Musser |first1=David R. |title=Introspective Sorting and Selection Algorithms |journal=Software: Practice and Experience |year=1997 |volume=27 |issue=8 |pages=983–993|doi=10.1002/(SICI)1097-024X(199708)27:8<983::AID-SPE117>3.0.CO;2-# }}</ref> If that limit is exceeded, then sorting is continued using the heapsort algorithm. Musser proposed that the limit should be <math> 1 + 2 \lfloor \log_2(n) \rfloor</math>, which is approximately twice the maximum recursion depth one would expect on average with a randomly [[ordered array]].
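The partition operation and random pivot choice might be sketched in Python as follows. This illustrative version uses the Lomuto partition scheme; an introsort-style variant would additionally track recursion depth and fall back to heapsort once Musser's limit is exceeded:

<syntaxhighlight lang="python">
import random

def quicksort(a, lo=0, hi=None):
    """In-place quicksort with a random pivot (Lomuto partition scheme).
    A random pivot makes the O(n^2) worst case vanishingly unlikely."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Move a randomly chosen pivot to the end, then partition around it.
    p = random.randint(lo, hi)
    a[p], a[hi] = a[hi], a[p]
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:                 # smaller elements go before the pivot
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]            # pivot lands in its final position
    quicksort(a, lo, i - 1)              # recursively sort the lesser sublist
    quicksort(a, i + 1, hi)              # recursively sort the greater sublist
</syntaxhighlight>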
==== Shellsort ====
[[File:Shell_sorting_algorithm_color_bars.svg|right|thumb|A Shellsort, different from bubble sort in that it moves elements to numerous [[Swap (computer science)|swapping positions]].]]
{{Main|Shellsort}}
''Shellsort'' was invented by [[Donald Shell]] in 1959.<ref name="Shell">{{Cite journal |url=http://penguin.ewu.edu/cscd300/Topic/AdvSorting/p30-shell.pdf |last=Shell |first=D. L. |title=A High-Speed Sorting Procedure |journal=Communications of the ACM |volume=2 |issue=7 |year=1959 |pages=30–32 |doi=10.1145/368370.368387 |s2cid=28572656 |access-date=2020-03-23 |archive-date=2017-08-30 |archive-url=https://web.archive.org/web/20170830020037/http://penguin.ewu.edu/cscd300/Topic/AdvSorting/p30-shell.pdf |url-status=dead }}</ref> It improves upon insertion sort by moving out-of-order elements more than one position at a time.

The concept behind Shellsort is that insertion sort performs in {{tmath|O(kn)}} time, where ''k'' is the greatest distance between two out-of-place elements. This means that in general it performs in O(''n''<sup>2</sup>), but on data that is mostly sorted, with only a few elements out of place, it performs faster. So, by first sorting elements far apart and progressively shrinking the gap between the elements to sort, the final sort computes much faster. One implementation can be described as arranging the data sequence in a two-dimensional array and then sorting the columns of the array using insertion sort.

The worst-case time complexity of Shellsort is an [[open problem]] and depends on the gap sequence used, with known complexities ranging from O(''n''<sup>2</sup>) to O(''n''<sup>4/3</sup>) and Θ(''n'' log<sup>2</sup> ''n''). This, combined with the fact that Shellsort is [[in-place]], needs only a relatively small amount of code, and does not require use of the [[call stack]], makes it useful in situations where memory is at a premium, such as in [[embedded system]]s and [[operating system kernel]]s.
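A short illustrative Python sketch of Shellsort, using Shell's original gap sequence ⌊''n''/2⌋, ⌊''n''/4⌋, ..., 1 (practical implementations typically use gap sequences with better known bounds):

<syntaxhighlight lang="python">
def shellsort(a):
    """In-place Shellsort. Each pass is a gapped insertion sort; the final
    pass (gap = 1) is a plain insertion sort, by then running on data that
    is already nearly ordered."""
    gap = len(a) // 2
    while gap > 0:
        for i in range(gap, len(a)):
            temp = a[i]
            j = i
            # Shift earlier gap-distant elements right until the slot is found.
            while j >= gap and a[j - gap] > temp:
                a[j] = a[j - gap]
                j -= gap
            a[j] = temp
        gap //= 2
</syntaxhighlight>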