Selection algorithm
== Lower bounds ==

The <math>O(n)</math> running time of the selection algorithms described above is necessary, because a selection algorithm that can handle inputs in an arbitrary order must take that much time to look at all of its inputs. If any one of its input values is not compared, that one value could be the one that should have been selected, and the algorithm can be made to produce an incorrect answer.{{r|kkzz}} Beyond this simple argument, there has been a significant amount of research on the exact number of comparisons needed for selection, both in the randomized and deterministic cases.

Selecting the minimum of <math>n</math> values requires <math>n-1</math> comparisons, because the <math>n-1</math> values that are not selected must each have been determined to be non-minimal, by being the largest in some comparison, and no two of these values can be largest in the same comparison. The same argument applies symmetrically to selecting the {{nowrap|maximum.{{r|knuth}}}}

The next simplest case is selecting the second-smallest. After several incorrect attempts, the first tight lower bound on this case was published in 1964 by Soviet mathematician [[Sergey Kislitsyn]]. The bound can be shown by observing that selecting the second-smallest also requires distinguishing the smallest value from the rest, and by considering the number <math>p</math> of comparisons involving the smallest value that an algorithm for this problem makes. Each of the <math>p</math> items that were compared to the smallest value is a candidate for second-smallest, and <math>p-1</math> of these values must be found larger than another value in a second comparison in order to rule them out as second-smallest. With <math>n-1</math> values being the larger in at least one comparison, and <math>p-1</math> values being the larger in at least two comparisons, there are a total of at least <math>n+p-2</math> comparisons.
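The <math>n-1</math> bound for the minimum is matched by a single linear scan. As an illustrative sketch (not taken from the article's sources; the comparison counter is added only to make the count visible):

```python
def select_min(values):
    """Find the minimum of a nonempty list, counting comparisons.

    Each of the other len(values) - 1 values loses exactly one
    comparison, so exactly len(values) - 1 comparisons are made,
    matching the lower bound.
    """
    comparisons = 0
    smallest = values[0]
    for v in values[1:]:
        comparisons += 1
        if v < smallest:
            smallest = v
    return smallest, comparisons
```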
An [[Adversary model|adversary argument]], in which the outcome of each comparison is chosen in order to maximize <math>p</math> (subject to consistency with at least one possible ordering) rather than by the numerical values of the given items, shows that it is possible to force <math>p</math> to be {{nowrap|at least <math>\log_2 n</math>.}} Therefore, the worst-case number of comparisons needed to select the second smallest {{nowrap|is <math>n+\lceil\log_2 n\rceil-2</math>,}} the same number that would be obtained by holding a single-elimination tournament with a run-off tournament among the values that lost to the smallest value. However, the expected number of comparisons of a randomized selection algorithm can be better than this bound; for instance, selecting the second-smallest of six elements requires seven comparisons in the worst case, but may be done by a randomized algorithm with an expected number of {{nowrap|6.5 comparisons.{{r|knuth}}}} More generally, selecting the {{nowrap|<math>k</math>th}} element out of <math>n</math> requires at least <math>n+\min(k,n-k)-O(1)</math> comparisons, in the average case, matching the number of comparisons of the Floyd–Rivest algorithm up to its <math>o(n)</math> term. 
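The tournament method mentioned above can be sketched in Python; this is an illustrative implementation, not one drawn from the cited sources. It plays a single-elimination tournament (odd players receive byes), records which values lost directly to the eventual winner, and then runs off among those losers:

```python
def second_smallest(values):
    """Select the second-smallest of n >= 2 values by a
    single-elimination tournament plus a run-off among the values
    that lost a comparison directly to the minimum.

    The tournament uses n - 1 comparisons in total, the winner meets
    at most ceil(log2(n)) opponents, and the run-off among those
    opponents adds at most ceil(log2(n)) - 1 more comparisons, for
    n + ceil(log2(n)) - 2 comparisons in the worst case.
    """
    comparisons = 0
    # Each player: (value, list of values it has beaten directly).
    players = [(v, []) for v in values]
    while len(players) > 1:
        nxt = []
        for i in range(0, len(players) - 1, 2):
            (a, a_beat), (b, b_beat) = players[i], players[i + 1]
            comparisons += 1
            if a <= b:
                nxt.append((a, a_beat + [b]))
            else:
                nxt.append((b, b_beat + [a]))
        if len(players) % 2 == 1:  # odd player out gets a bye
            nxt.append(players[-1])
        players = nxt
    _, losers_to_min = players[0]
    # The second-smallest must have lost directly to the minimum.
    second = losers_to_min[0]
    for v in losers_to_min[1:]:
        comparisons += 1
        if v < second:
            second = v
    return second, comparisons
```

For six distinct elements this uses at most <math>6+\lceil\log_2 6\rceil-2=7</math> comparisons, the worst-case figure quoted above.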
The argument is made directly for deterministic algorithms, with a number of comparisons that is averaged over all possible [[permutation]]s of the input {{nowrap|values.{{r|cunmun}}}} By [[Yao's principle]], it also applies to the expected number of comparisons for a randomized algorithm on its worst-case {{nowrap|input.{{r|chan}}}} For deterministic algorithms, it has been shown that selecting the {{nowrap|<math>k</math>th}} element requires <math>\bigl(1+H(k/n)\bigr)n+\Omega(\sqrt n)</math> comparisons, where <math display=block>H(x)=x\log_2\frac1x + (1-x)\log_2\frac1{1-x}</math> is the {{nowrap|[[binary entropy function]].{{r|benjoh}}}} The special case of median-finding has a slightly larger lower bound on the number of comparisons, {{nowrap|at least <math>(2+\varepsilon)n</math>,}} for {{nowrap|<math>\varepsilon\approx 2^{-80}</math>.{{r|dz01}}}}
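For a concrete reading of the deterministic bound, the leading term <math>\bigl(1+H(k/n)\bigr)n</math> can be evaluated directly; a small illustrative Python helper (not from the cited sources) shows that for the median, <math>k/n=1/2</math> gives <math>H=1</math> and hence a leading term of <math>2n</math>, consistent with the <math>(2+\varepsilon)n</math> median bound:

```python
from math import log2

def binary_entropy(x):
    """Binary entropy H(x) = x*log2(1/x) + (1-x)*log2(1/(1-x)),
    with H(0) = H(1) = 0 by the usual convention."""
    if x in (0.0, 1.0):
        return 0.0
    return x * log2(1 / x) + (1 - x) * log2(1 / (1 - x))

def entropy_bound_leading_term(k, n):
    """Leading (1 + H(k/n)) * n term of the deterministic
    comparison lower bound for selecting the k-th of n elements."""
    return (1 + binary_entropy(k / n)) * n
```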