Counting sort
{{Short description|Sorting algorithm}}
{{Infobox algorithm|class=[[Sorting Algorithm]]|data=[[Array data structure|Array]]|time=<math>O(n+k)</math>, where k is the range of the non-negative key values.|space=<math>O(n+k)</math>}}
In [[computer science]], '''counting sort''' is an [[algorithm]] for [[sorting algorithm|sorting]] a collection of objects according to keys that are small non-negative [[integer]]s; that is, it is an [[integer sorting]] algorithm. It operates by counting the number of objects that possess each distinct key value, and applying a prefix sum to those counts to determine the position of each key value in the output sequence. Its running time is linear in the number of items and the difference between the maximum key value and the minimum key value, so it is only suitable for direct use in situations where the range of keys is not significantly greater than the number of items. It is often used as a subroutine in [[radix sort]], another sorting algorithm, which can handle larger keys more efficiently.<ref name="clrs">{{citation | last1 = Cormen | first1 = Thomas H. | author1-link = Thomas H. Cormen | last2 = Leiserson | first2 = Charles E. | author2-link = Charles E. Leiserson | last3 = Rivest | first3 = Ronald L. | author3-link = Ron Rivest | last4 = Stein | first4 = Clifford | author4-link = Clifford Stein | contribution = 8.2 Counting Sort | edition = 2nd | isbn = 0-262-03293-7 | pages = 168–170 | publisher = [[MIT Press]] and [[McGraw-Hill]] | title = [[Introduction to Algorithms]] | year = 2001}}.
See also the historical notes on page 181.</ref><ref name="edmonds">{{citation|first=Jeff|last=Edmonds|contribution=5.2 Counting Sort (a Stable Sort)|pages=72–75|title=How to Think about Algorithms|publisher=Cambridge University Press|year=2008|isbn=978-0-521-84931-9}}.</ref><ref name="sedgewick">{{citation|first=Robert|last=Sedgewick|author-link=Robert Sedgewick (computer scientist)|contribution=6.10 Key-Indexed Counting|title=Algorithms in Java, Parts 1-4: Fundamentals, Data Structures, Sorting, and Searching|edition=3rd|publisher=Addison-Wesley|year=2003|pages=312–314}}.</ref>

Counting sort is not a [[comparison sort]]; it uses key values as indexes into an array, so the {{math|[[Big O notation#Family of Bachmann–Landau notations|Ω]](''n'' log ''n'')}} [[lower bound]] for comparison sorting does not apply.<ref name="clrs"/> [[Bucket sort]] may be used in lieu of counting sort, and entails a similar time analysis. However, compared to counting sort, bucket sort requires [[linked list]]s, [[dynamic array]]s, or a large amount of pre-allocated memory to hold the sets of items within each bucket, whereas counting sort stores a single number (the count of items) per bucket.<ref name="knuth"/>

==Input and output assumptions==
In the most general case, the input to counting sort consists of a [[collection (abstract data type)|collection]] of {{mvar|n}} items, each of which has a non-negative integer key whose maximum value is at most {{mvar|k}}.<ref name="sedgewick"/> In some descriptions of counting sort, the input to be sorted is assumed to be more simply a sequence of integers itself,<ref name="clrs"/> but this simplification does not accommodate many applications of counting sort. For instance, when used as a subroutine in [[radix sort]], the keys for each call to counting sort are individual digits of larger item keys; it would not suffice to return only a sorted list of the key digits, separated from the items.
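The distinction between sorting bare key digits and sorting the items themselves can be illustrated with a short Python sketch. The helper <code>digit_key</code> and the sample records are hypothetical, and Python's built-in stable <code>sorted</code> is used only as a stand-in for a counting sort pass:

```python
def digit_key(value, position, base=10):
    """Return the digit of `value` at `position` (0 = least significant)."""
    return (value // base ** position) % base


# Hypothetical records: (numeric key, payload).
records = [(329, "c"), (457, "a"), (657, "b"), (839, "d")]

# Sorting only the extracted digits loses the link to the records.
digits_only = sorted(digit_key(v, 0) for v, _ in records)

# What one radix-sort pass needs instead: the records themselves,
# reordered by that digit, with ties kept in their input order.
# sorted() is stable, so it stands in for counting sort here.
by_digit = sorted(records, key=lambda r: digit_key(r[0], 0))
```

Here <code>digits_only</code> is just a list of digits, while <code>by_digit</code> keeps each record together with its payload, which is the form the next pass of a radix sort would consume.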
In applications such as radix sort, a bound on the maximum key value {{mvar|k}} will be known in advance, and can be assumed to be part of the input to the algorithm. However, if the value of {{mvar|k}} is not already known, then it may be computed, as a first step, by an additional loop over the data to determine the maximum key value.

The output is an [[Array data structure|array]] of the elements ordered by their keys. Because of its application to radix sorting, counting sort must be a [[stable sort]]; that is, if two elements share the same key, their relative order in the output array and their relative order in the input array should match.<ref name="clrs"/><ref name="edmonds"/>

==Pseudocode==
In pseudocode, the algorithm may be expressed as:

 '''function''' CountingSort(input, ''k'')
     
     count ← array of ''k'' + 1 zeros
     output ← array of same length as input
     
     '''for''' ''i'' = 0 '''to''' length(input) - 1 '''do'''
         ''j'' = key(input[''i''])
         count[''j''] = count[''j''] + 1
 
     '''for''' ''i'' = 1 '''to''' ''k'' '''do'''
         count[''i''] = count[''i''] + count[''i'' - 1]
 
     '''for''' ''i'' = length(input) - 1 '''down to''' 0 '''do'''
         ''j'' = key(input[''i''])
         count[''j''] = count[''j''] - 1
         output[count[''j'']] = input[''i'']
 
     '''return''' output

Here <code>input</code> is the input array to be sorted, <code>key</code> returns the numeric key of each item in the input array, <code>count</code> is an auxiliary array used first to store the numbers of items with each key, and then (after the second loop) to store the positions where items with each key should be placed, <code>k</code> is the maximum value of the non-negative key values and <code>output</code> is the sorted output array.

In summary, the algorithm loops over the items in the first loop, computing a [[histogram]] of the number of times each key occurs within the <code>input</code> collection.
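The pseudocode above can be transcribed into Python roughly as follows. This is a minimal sketch rather than a reference implementation; the name <code>counting_sort</code>, the default identity <code>key</code> function, and the sample pairs are assumptions made for illustration:

```python
def counting_sort(items, k, key=lambda x: x):
    """Stable counting sort of `items` whose keys are integers in 0..k."""
    count = [0] * (k + 1)
    output = [None] * len(items)

    # First loop: histogram of how often each key occurs.
    for item in items:
        count[key(item)] += 1

    # Second loop: inclusive prefix sum, so count[j] becomes the
    # number of items whose key is <= j.
    for j in range(1, k + 1):
        count[j] += count[j - 1]

    # Third loop: walk the input backwards so that items with equal
    # keys keep their original relative order.
    for item in reversed(items):
        j = key(item)
        count[j] -= 1
        output[count[j]] = item

    return output


# Hypothetical usage: sort (key, label) pairs by their integer key.
pairs = [(3, "a"), (1, "b"), (3, "c"), (0, "d")]
result = counting_sort(pairs, k=3, key=lambda p: p[0])
# result == [(0, "d"), (1, "b"), (3, "a"), (3, "c")]
```

The two items with key 3 appear in the output in the same order as in the input.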
After that, in the second loop, it performs a [[prefix sum]] computation on <code>count</code> in order to determine, for each key, the position range where the items having that key should be placed; that is, items with key <math>i</math> belong in positions <code>count[<math>i</math> - 1]</code> through <code>count[<math>i</math>] - 1</code>, where <code>count[-1]</code> is taken to be 0. Finally, in the third loop, it loops over the items of <code>input</code> again, but in reverse order, moving each item into its sorted position in the <code>output</code> array.<ref name="clrs"/><ref name="edmonds"/><ref name="sedgewick"/>

The relative order of items with equal keys is preserved here; i.e., this is a [[:Category:Stable sorts|stable sort]].

==Complexity analysis==
Because the algorithm uses only simple <code>for</code> loops, without recursion or subroutine calls, it is straightforward to analyze. The initialization of the count array, and the second for loop which performs a prefix sum on the count array, each iterate at most {{math|''k'' + 1}} times and therefore take {{math|''O''(''k'')}} time. The other two for loops, and the initialization of the output array, each take {{math|''O''(''n'')}} time. Therefore, the time for the whole algorithm is the sum of the times for these steps, {{math|''O''(''n'' + ''k'')}}.<ref name="clrs"/><ref name="edmonds"/>

Because it uses arrays of length {{math|''k'' + 1}} and {{mvar|n}}, the total space usage of the algorithm is also {{math|''O''(''n'' + ''k'')}}.<ref name="clrs"/> For problem instances in which the maximum key value is significantly smaller than the number of items, counting sort can be highly space-efficient, as the only storage it uses other than its input and output arrays is the <code>count</code> array, which uses space {{math|''O''(''k'')}}.<ref>{{citation | last1 = Burris | first1 = David S.
| last2 = Schember | first2 = Kurt | contribution = Sorting sequential files with limited auxiliary storage | doi = 10.1145/503838.503855 | location = New York, NY, USA | pages = 23–31 | publisher = ACM | title = Proceedings of the 18th annual Southeast Regional Conference | year = 1980| isbn = 0897910141 | s2cid = 5670614 }}.</ref>

==Variant algorithms==
If each item to be sorted is itself an integer, and is used as its own key, then the second and third loops of counting sort can be combined; in the second loop, instead of computing the position where items with key <code>i</code> should be placed in the output, simply append <code>count[i]</code> copies of the number <code>i</code> to the output.

This algorithm may also be used to eliminate duplicate keys, by replacing the <code>count</code> array with a [[bit vector]] that stores a <code>one</code> for a key that is present in the input and a <code>zero</code> for a key that is not present. If additionally the items are the integer keys themselves, both the second and third loops can be omitted entirely and the bit vector will itself serve as output, representing the values as offsets of the non-<code>zero</code> entries, added to the range's lowest value. Thus the keys are sorted and the duplicates are eliminated in this variant just by being placed into the bit array.

For data in which the maximum key size is significantly smaller than the number of data items, counting sort may be [[parallel algorithm|parallelized]] by splitting the input into subarrays of approximately equal size, processing each subarray in parallel to generate a separate count array for each subarray, and then merging the count arrays. When used as part of a parallel radix sort algorithm, the key size (base of the radix representation) should be chosen to match the size of the split subarrays.<ref>{{citation | last1 = Zagha | first1 = Marco | last2 = Blelloch | first2 = Guy E.
| author2-link = Guy Blelloch | contribution = Radix sort for vector multiprocessors | doi = 10.1145/125826.126164 | pages = 712–721 | publisher = IEEE Computer Society / ACM | title = Proceedings of Supercomputing '91, November 18-22, 1991, Albuquerque, NM, USA | year = 1991| isbn = 0897914597 | url = https://www.cs.cmu.edu/~scandal/papers/cray-sort-supercomputing91.ps.gz| doi-access = free }}.</ref> The simplicity of the counting sort algorithm and its use of the easily parallelizable prefix sum primitive also make it usable in more fine-grained parallel algorithms.<ref>{{citation | last = Reif | first = John H. | author-link = John Reif | contribution = An optimal parallel algorithm for integer sorting | doi = 10.1109/SFCS.1985.9 | pages = 496–504 | title = [[Symposium on Foundations of Computer Science|Proc. 26th Annual Symposium on Foundations of Computer Science (FOCS 1985)]] | year = 1985| isbn = 0-8186-0644-4 | s2cid = 5694693 }}.</ref>

As described, counting sort is not an [[in-place algorithm]]; even disregarding the count array, it needs separate input and output arrays. It is possible to modify the algorithm so that it places the items into sorted order within the same array that was given to it as the input, using only the count array as auxiliary storage; however, the modified in-place version of counting sort is not stable.<ref name="sedgewick"/>

==History==
Although radix sorting itself dates back far longer, counting sort, and its application to radix sorting, were both invented by [[Harold H. Seward]] in 1954.<ref name="clrs"/><ref name="knuth">{{citation|first=D. E.|last=Knuth|author-link=Donald Knuth|title=[[The Art of Computer Programming]], Volume 3: Sorting and Searching|edition=2nd|publisher=Addison-Wesley|year=1998|isbn=0-201-89685-0}}. Section 5.2, Sorting by counting, pp. 75–80, and historical notes, p. 170.</ref><ref>{{citation|first=H.
H.|last=Seward|title=Information sorting in the application of electronic digital computers to business operations|series=Master's thesis, Report R-232|year=1954|publisher=[[Massachusetts Institute of Technology]], Digital Computer Laboratory|url=http://bitsavers.org/pdf/mit/whirlwind/R-series/R-232_Information_Sorting_in_the_Application_of_Electronic_Digital_Computers_to_Business_Operations_May54.pdf|contribution=2.4.6 Internal Sorting by Floating Digital Sort|pages=25–28}}.</ref>

==References==
{{reflist}}

==External links==
{{Wikibooks|Algorithm implementation|Sorting/Counting sort|Counting sort}}
* [http://www.cs.usfca.edu/~galles/visualization/CountingSort.html Counting Sort html5 visualization]
* [http://users.cs.cf.ac.uk/C.L.Mumford/tristan/CountingSort.html Demonstration applet from Cardiff University] {{Webarchive|url=https://web.archive.org/web/20130602221151/http://users.cs.cf.ac.uk/C.L.Mumford/tristan/CountingSort.html |date=2013-06-02 }}
*{{citation|first=Art S.|last=Kagel|contribution=counting sort|title=Dictionary of Algorithms and Data Structures|editor-first=Paul E.|editor-last=Black|publisher=U.S. National Institute of Standards and Technology|date=2 June 2006|url=https://xlinux.nist.gov/dads/HTML/countingsort.html|access-date=2011-04-21}}.

{{sorting}}

{{DEFAULTSORT:Counting Sort}}
[[Category:Sorting algorithms]]
[[Category:Stable sorts]]