Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Bucket sort
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Analysis == === Worst-case analysis === When the input contains several keys that are close to each other (clustering), those elements are likely to be placed in the same bucket, which results in some buckets containing more elements than average. The worst-case scenario occurs when all the elements are placed in a single bucket. The overall performance would then be dominated by the algorithm used to sort each bucket, for example <math>O(n^2)</math> [[insertion sort]] or <math>O(n \log(n))</math> [[comparison sort]] algorithms, such as [[merge sort]]. === Average-case analysis === Consider the case that the input is uniformly distributed. The first step, which is '''initialize''' the buckets and '''find the maximum key value''' in the array, can be done in <math>O(n)</math> time. If division and multiplication can be done in constant time, then '''scattering''' each element to its bucket also costs <math>O(n)</math>. Assume insertion sort is used to sort each bucket, then the third step costs <math>O(\textstyle \sum_{i=1}^k \displaystyle n_i^2)</math>, where <math>n_i</math> is the length of the bucket indexed <math>i</math>. Since we are concerning the average time, the expectation <math>E(n_i^2)</math> has to be evaluated instead. Let <math>X_{ij}</math> be the random variable that is <math>1</math> if element <math>j</math> is placed in bucket <math>i</math>, and <math>0</math> otherwise. We have <math>n_i = \sum_{j=1}^n X_{ij}</math>. Therefore, :<math>\begin{align} E(n_i^2) & = E\left(\sum_{j=1}^n X_{ij} \sum_{l=1}^n X_{il}\right) \\ & = E\left(\sum_{j=1}^n \sum_{l=1}^n X_{ij}X_{il}\right) \\ & = E\left(\sum_{j=1}^n X_{ij}^2\right) + E\left(\sum_{1\leq j,l\leq n}\sum_{j\neq l}X_{ij}X_{il}\right) \end{align} </math> The last line separates the summation into the case <math>j=l</math> and the case <math>j\neq l</math>. Since the chance of an object distributed to bucket <math>i</math> is <math>1/k</math>, <math>X_{ij} </math> is 1 with probability <math>1/k</math> and 0 otherwise. :<math>E(X_{ij}^2) = 1^2\cdot \left(\frac{1}{k}\right) + 0^2\cdot \left(1-\frac{1}{k}\right) = \frac{1}{k}</math> :<math>E(X_{ij}X_{ik}) = 1\cdot \left(\frac{1}{k}\right)\left(\frac{1}{k}\right) = \frac{1}{k^2} </math> With the summation, it would be :<math>E\left(\sum_{j=1}^n X_{ij}^2\right) + E\left(\sum_{1\leq j,k\leq n}\sum_{j\neq k}X_{ij}X_{ik}\right) = n\cdot\frac{1}{k} + n(n-1)\cdot\frac{1}{k^2} = \frac{n^2+nk-n}{k^2}</math> Finally, the complexity would be <math>O\left(\sum_{i=1}^kE(n_i^2)\right) = O\left(\sum_{i=1}^k \frac{n^2+nk-n}{k^2}\right) = O\left(\frac{n^2}{k}+n\right) </math>. The last step of bucket sort, which is '''concatenating''' all the sorted objects in each bucket, requires <math>O(k)</math> time. Therefore, the total complexity is <math>O\left(n+\frac{n^2}{k}+k\right)</math>. Note that if k is chosen to be <math>k = \Theta(n)</math>, then bucket sort runs in <math>O(n)</math> average time, given a uniformly distributed input.<ref name="lfcs" />
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)