==== Variable bin widths ====
Rather than choosing evenly spaced bins, for some applications it is preferable to vary the bin width. This avoids bins with low counts. A common case is to choose ''equiprobable bins'', where the number of samples in each bin is expected to be approximately equal. The bins may be chosen according to some known distribution or may be chosen based on the data so that each of the <math>k</math> bins has <math>\approx n/k</math> of the <math>n</math> samples. When plotting the histogram, the ''frequency density'' (the bin count divided by the bin width) is used for the dependent axis. While all bins have approximately equal area, the heights of the histogram approximate the density distribution.

For equiprobable bins, the following rule for the number of bins is suggested:<ref>{{cite web |author1=Jack Prins |author2=Don McCormack |author3=Di Michelson |author4=Karen Horrell |title=Chi-square goodness-of-fit test |url=https://itl.nist.gov/div898/handbook/prc/section2/prc211.htm |website=NIST/SEMATECH e-Handbook of Statistical Methods |publisher=NIST/SEMATECH |access-date=29 March 2019 |page=7.2.1.1}}</ref>
:<math>k = 2 n^{2/5}</math>
This choice of bins is motivated by maximizing the power of a [[Pearson chi-squared test]] testing whether the bins do contain equal numbers of samples. More specifically, for a given significance level <math>\alpha</math> it is recommended to choose between 1/2 and 1 times the following value:<ref>{{cite book |last1=Moore |first1=David |editor1-last=D'Agostino |editor1-first=Ralph |editor2-last=Stephens |editor2-first=Michael |title=Goodness-of-Fit Techniques |date=1986 |publisher=Marcel Dekker Inc. |location=New York, NY, US |isbn=0-8247-7487-6 |page=70 |chapter=3}}</ref>
:<math>k = 4 \left( \frac{2 n^2}{\Phi^{-1}(\alpha)^2} \right)^\frac{1}{5}</math>
where <math>\Phi^{-1}</math> is the [[probit]] function.
Following this rule for <math>\alpha = 0.05</math> would give between <math>1.88n^{2/5}</math> and <math>3.77n^{2/5}</math>; the coefficient of 2 is chosen as an easy-to-remember value from this broad optimum.
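
As an illustration only, the following Python sketch builds data-based equiprobable bins and the corresponding frequency densities, and evaluates the coefficient of <math>n^{2/5}</math> for a given <math>\alpha</math>. The function names are hypothetical, the coefficient is computed with the squared probit (the form that reproduces the values 1.88 and 3.77 quoted above), and the data are assumed to be continuous with no ties, so that every bin has nonzero width.

```python
from statistics import NormalDist

import numpy as np


def suggested_bin_count(n):
    """Rule of thumb k = 2 * n**(2/5), rounded to the nearest integer (at least 1)."""
    return max(1, round(2 * n ** 0.4))


def optimum_coefficient(alpha):
    """Coefficient c such that k = c * n**(2/5), using the squared probit of alpha."""
    z = NormalDist().inv_cdf(alpha)  # probit of alpha (standard normal quantile)
    return 4 * (2 / z ** 2) ** 0.2


def equiprobable_histogram(samples, k=None):
    """Return (edges, counts, density) for k bins holding about n/k samples each.

    Assumes continuous data without ties; repeated values could produce
    zero-width bins and a division by zero in the density.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    if k is None:
        k = suggested_bin_count(n)
    # Bin edges are the empirical quantiles at 0, 1/k, 2/k, ..., 1.
    edges = np.quantile(samples, np.linspace(0.0, 1.0, k + 1))
    counts, edges = np.histogram(samples, bins=edges)
    # Frequency density: fraction of samples in the bin divided by the bin width,
    # so that the bar areas sum to 1.
    density = counts / (n * np.diff(edges))
    return edges, counts, density


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=1000)
    edges, counts, density = equiprobable_histogram(data)
    print("number of bins:", len(counts))
    print("counts per bin range from", counts.min(), "to", counts.max())
    upper = optimum_coefficient(0.05)
    print("coefficient range:", round(upper / 2, 2), "to", round(upper, 2))
```

Because the edges are taken from the sample quantiles, the bins are narrow where the data are dense and wide in the tails; dividing by the bin width is what keeps the bar heights comparable to a density.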