Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Sorting algorithm
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Memory usage patterns and index sorting == When the size of the array to be sorted approaches or exceeds the available primary memory, so that (much slower) disk or swap space must be employed, the memory usage pattern of a sorting algorithm becomes important, and an algorithm that might have been fairly efficient when the array fit easily in RAM may become impractical. In this scenario, the total number of comparisons becomes (relatively) less important, and the number of times sections of memory must be copied or swapped to and from the disk can dominate the performance characteristics of an algorithm. Thus, the number of passes and the localization of comparisons can be more important than the raw number of comparisons, since comparisons of nearby elements to one another happen at [[computer bus|system bus]] speed (or, with caching, even at [[CPU]] speed), which, compared to disk speed, is virtually instantaneous. For example, the popular recursive [[quicksort]] algorithm provides quite reasonable performance with adequate RAM, but due to the recursive way that it copies portions of the array it becomes much less practical when the array does not fit in RAM, because it may cause a number of slow copy or move operations to and from disk. In that scenario, another algorithm may be preferable even if it requires more total comparisons. One way to work around this problem, which works well when complex records (such as in a [[relational database]]) are being sorted by a relatively small key field, is to create an index into the array and then sort the index, rather than the entire array. (A sorted version of the entire array can then be produced with one pass, reading from the index, but often even that is unnecessary, as having the sorted index is adequate.) Because the index is much smaller than the entire array, it may fit easily in memory where the entire array would not, effectively eliminating the disk-swapping problem. This procedure is sometimes called "tag sort".<ref>{{cite web|url=https://www.pcmag.com/encyclopedia_term/0%2C2542%2Ct%3Dtag+sort%26i%3D52532%2C00.asp|title=tag sort Definition from PC Magazine Encyclopedia|website=Pcmag.com|access-date=14 April 2018|archive-date=6 October 2012|archive-url=https://web.archive.org/web/20121006015634/http://www.pcmag.com/encyclopedia_term/0%2C2542%2Ct%3Dtag+sort%26i%3D52532%2C00.asp|url-status=live}}</ref> Another technique for overcoming the memory-size problem is using [[external sorting]], for example, one of the ways is to combine two algorithms in a way that takes advantage of the strength of each to improve overall performance. For instance, the array might be subdivided into chunks of a size that will fit in RAM, the contents of each chunk sorted using an efficient algorithm (such as [[quicksort]]), and the results merged using a ''k''-way merge similar to that used in [[merge sort]]. This is faster than performing either merge sort or quicksort over the entire list.<ref>[[Donald Knuth]], ''[[The Art of Computer Programming]]'', Volume 3: ''Sorting and Searching'', Second Edition. Addison-Wesley, 1998, {{ISBN|0-201-89685-0}}, Section 5.4: External Sorting, pp. 248β379.</ref><ref>[[Ellis Horowitz]] and [[Sartaj Sahni]], ''Fundamentals of Data Structures'', H. Freeman & Co., {{ISBN|0-7167-8042-9}}.</ref> Techniques can also be combined. For sorting very large sets of data that vastly exceed system memory, even the index may need to be sorted using an algorithm or combination of algorithms designed to perform reasonably with [[virtual memory]], i.e., to reduce the amount of swapping required.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)