Multimodal distribution
==Statistical tests==
A number of tests are available to determine whether a data set is distributed in a bimodal (or multimodal) fashion.

===Graphical methods===
In the study of sediments, particle size is frequently bimodal. Empirically, it has been found useful to plot the frequency against the log(size) of the particles.<ref name=Folk1957>{{cite journal | last1 = Folk | first1 = RL | last2 = Ward | first2 = WC | year = 1957 | title = Brazos River bar: a study in the significance of grain size parameters | url = https://doi.pangaea.de/10.1594/PANGAEA.896129| journal = Journal of Sedimentary Research | volume = 27 | issue = 1| pages = 3–26 | doi=10.1306/74d70646-2b21-11d7-8648000102c1865d|bibcode = 1957JSedR..27....3F }}</ref><ref name=Dyer1970>{{cite journal | last1 = Dyer | first1 = KR | year = 1970 | title = Grain-size parameters for sandy gravels | journal = Journal of Sedimentary Research | volume = 40 | issue = 2| pages = 616–620 |doi=10.1306/74D71FE6-2B21-11D7-8648000102C1865D}}</ref> This usually gives a clear separation of the particles into a bimodal distribution. In geological applications the [[logarithm]] is normally taken to the base 2. The log-transformed values are referred to as phi (Φ) units. This system is known as the [[Grain size|Krumbein]] (or phi) scale.

An alternative method is to plot the log of the particle size against the cumulative frequency. This graph will usually consist of two reasonably straight lines with a connecting line corresponding to the antimode.

;Statistics
Approximate values for several statistics can be derived from the graphic plots.<ref name=Folk1957/>

<math display="block">\begin{align}
\text{mean} &= \frac{ \phi_{16} + \phi_{50} + \phi_{84} }{ 3 } \\[1ex]
\text{std. dev.} &= \frac{ \phi_{84} - \phi_{16} }{ 4 } + \frac{ \phi_{95} - \phi_5 }{ 6.6 } \\[1ex]
\text{skewness} &= \frac{ \phi_{84} + \phi_{16} - 2 \phi_{50} }{ 2 ( \phi_{84} - \phi_{16} ) } + \frac{ \phi_{95} + \phi_{ 5 } - 2 \phi_{50} }{ 2( \phi_{95} - \phi_5 ) } \\[1ex]
\text{kurtosis} &= \frac{ \phi_{95} - \phi_5 }{ 2.44 ( \phi_{75} - \phi_{25} ) }
\end{align}</math>

where ''ϕ''<sub>''x''</sub> is the value of the variate ''ϕ'' at the ''x''<sup>th</sup> percentile of the distribution.

===Unimodal vs. bimodal distribution===
Pearson in 1894 was the first to devise a procedure to test whether a distribution could be resolved into two normal distributions.<ref name=Pearson1894>{{cite journal | last1 = Pearson | first1 = K | year = 1894 | title = Contributions to the mathematical theory of evolution: On the dissection of asymmetrical frequency-curves | journal = Philosophical Transactions of the Royal Society A | volume = 185 | pages = 71–90 | doi=10.1098/rsta.1894.0003| bibcode = 1894RSPTA.185...71P| doi-access = free }}</ref> This method required the solution of a ninth-order [[polynomial]]. In a subsequent paper Pearson reported that for any distribution skewness<sup>2</sup> + 1 < kurtosis.<ref name=Pearson1916/> Later Pearson showed that<ref name=Pearson1929>{{cite journal | last1 = Pearson | first1 = K | year = 1929 | title = Editorial note | journal = Biometrika | volume = 21 | pages = 370–375 }}</ref>

<math display="block"> b_2 - b_1 \ge 1 </math>

where ''b''<sub>2</sub> is the kurtosis and ''b''<sub>1</sub> is the square of the skewness. Equality holds only for the two-point [[Bernoulli distribution]] or the sum of two different [[Dirac delta function]]s. These are the most extreme cases of bimodality possible. With equal weights on the two points, the kurtosis is 1 and, by symmetry, the skewness is 0, so the difference is 1.
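
The inequality can be checked numerically. The following is only an illustrative sketch: the function name <code>pearson_gap</code> and the example distributions are assumptions made here, not taken from Pearson's papers. It computes ''b''<sub>2</sub> − ''b''<sub>1</sub> from the central moments of a discrete distribution.

```python
def pearson_gap(values, weights):
    """Return b2 - b1 for a discrete distribution.

    b1 is the squared skewness and b2 the (non-excess) kurtosis.
    `values` are the support points, `weights` their relative masses.
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(p * x for p, x in zip(probs, values))

    def central(k):
        # k-th central moment of the discrete distribution
        return sum(p * (x - mean) ** k for p, x in zip(probs, values))

    m2, m3, m4 = central(2), central(3), central(4)
    b1 = m3 ** 2 / m2 ** 3
    b2 = m4 / m2 ** 2
    return b2 - b1


# Hypothetical example distributions
symmetric_two_point = pearson_gap([0, 1], [0.5, 0.5])  # equal weights
skewed_two_point = pearson_gap([0, 1], [0.2, 0.8])     # unequal weights
three_point = pearson_gap([-1, 0, 1], [1, 1, 1])       # uniform on three points
```

In this sketch both two-point distributions give a gap of 1 (up to floating-point rounding), while the three-point distribution gives a gap strictly greater than 1.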

Baker proposed a transformation to convert a bimodal to a unimodal distribution.<ref name=Baker1930>{{cite journal | last1 = Baker | first1 = GA | year = 1930 | title = Transformations of bimodal distributions | journal = Annals of Mathematical Statistics | volume = 1 | issue = 4| pages = 334–344 | doi=10.1214/aoms/1177733063| doi-access = free }}</ref>

Several tests of unimodality versus bimodality have been proposed. Haldane suggested one based on second central differences.<ref name=Haldane1951>{{cite journal | last1 = Haldane | first1 = JBS | year = 1951 | title = Simple tests for bimodality and bitangentiality | journal = Annals of Eugenics | volume = 16 | issue = 1| pages = 359–364 | doi = 10.1111/j.1469-1809.1951.tb02488.x | pmid = 14953132 }}</ref> Larkin later introduced a test based on the F test,<ref name=Larkin1979>{{cite journal | last1 = Larkin | first1 = RP | year = 1979 | title = An algorithm for assessing bimodality vs. unimodality in a univariate distribution | journal = Behavior Research Methods & Instrumentation | volume = 11 | issue = 4| pages = 467–468 | doi = 10.3758/BF03205709 | doi-access = free }}</ref> and Bennett created one based on [[G-test|Fisher's G test]].<ref name=Bennett1992>{{cite journal | last1 = Bennett | first1 = SC | year = 1992 | title = Sexual dimorphism of ''Pteranodon'' and other pterosaurs, with comments on cranial crests | journal = Journal of Vertebrate Paleontology | volume = 12 | issue = 4| pages = 422–434 | doi=10.1080/02724634.1992.10011472}}</ref> Tokeshi has proposed a fourth test.<ref name=Tokeshi1992>{{cite journal | last1 = Tokeshi | first1 = M | year = 1992 | title = Dynamics and distribution in animal communities; theory and analysis | journal = Researches on Population Ecology | volume = 34 | issue = 2| pages = 249–273 | doi=10.1007/bf02514796| s2cid = 22912914 }}</ref><ref name=Barreto2003>{{cite journal | last1 = Barreto | first1 = S | last2 = Borges | first2 = PAV | last3 = Guo | first3 = Q | year = 2003 | title = A typing error in Tokeshi's test of bimodality | journal = Global Ecology and Biogeography | volume = 12 | issue = 2| pages = 173–174 | doi=10.1046/j.1466-822x.2003.00018.x| hdl = 10400.3/1408 | hdl-access = free }}</ref> A test based on a likelihood ratio has been proposed by Holzmann and Vollmer.<ref name=Holzmann2008/>

A method based on the score and Wald tests has been proposed.<ref name=Carolan2001>{{cite journal | last1 = Carolan | first1 = AM | last2 = Rayner | first2 = JCW | year = 2001 | title = One sample tests for the location of modes of nonnormal data | journal = Journal of Applied Mathematics and Decision Sciences| volume = 5 | issue = 1| pages = 1–19 | doi=10.1155/s1173912601000013| citeseerx = 10.1.1.504.4999 | doi-access = free }}</ref> This method can distinguish between unimodal and bimodal distributions when the underlying distributions are known.

===Antimode tests===
Statistical tests for the antimode are known.<ref name=Hartigan2000>{{cite book |last=Hartigan |first=J. A. |date=2000 |chapter=Testing for Antimodes |editor1=Gaul W |editor2=Opitz O |editor3=Schader M |title=Data Analysis |series=Studies in Classification, Data Analysis, and Knowledge Organization |publisher=Springer |pages=169–181 |isbn=3-540-67731-3 |chapter-url=https://books.google.com/books?id=WVDmCAAAQBAJ&pg=PA169 }}</ref>

;Otsu's method
[[Otsu's method]] is commonly employed in computer graphics to determine the optimal separation between two distributions.
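
As a rough illustration of the idea behind Otsu's method, the sketch below chooses the cut point that maximises the between-class variance of a one-dimensional sample. It is a simplification that works on raw sample values rather than an image histogram; the function name <code>otsu_threshold</code> and the sample data are assumptions made for this example.

```python
def otsu_threshold(samples):
    """Return the cut point maximising the between-class variance.

    Candidate cuts are the midpoints between consecutive distinct
    sorted values. Returns None if all samples are identical.
    """
    values = sorted(samples)
    n = len(values)
    total = sum(values)
    best_cut, best_variance = None, -1.0
    running_sum = 0.0
    for i in range(1, n):
        running_sum += values[i - 1]
        if values[i] == values[i - 1]:
            continue  # no cut possible between equal values
        w0 = i / n                               # weight of the lower class
        w1 = 1.0 - w0                            # weight of the upper class
        mu0 = running_sum / i                    # mean of the lower class
        mu1 = (total - running_sum) / (n - i)    # mean of the upper class
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_variance:
            best_variance = between
            best_cut = (values[i - 1] + values[i]) / 2
    return best_cut


# Hypothetical bimodal sample: one cluster near 3, another near 11
sample = [1, 2, 2, 3, 3, 3, 4, 10, 11, 11, 12, 12, 13]
threshold = otsu_threshold(sample)
```

For this sample the selected cut falls in the gap between the two clusters.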

===General tests===
To test if a distribution is other than unimodal, several additional tests have been devised: the [[bandwidth test (multimodal)|bandwidth test]],<ref name=Silverman1981/> the [[dip test]],<ref name=Hartigan1985>{{cite journal | last1 = Hartigan | first1 = JA | last2 = Hartigan | first2 = PM | year = 1985 | title = The dip test of unimodality | journal = Annals of Statistics | volume = 13 | issue = 1| pages = 70–84 | doi=10.1214/aos/1176346577| doi-access = free }}</ref> the [[excess mass test]],<ref name=Mueller1991>{{cite journal | last1 = Mueller | first1 = DW | last2 = Sawitzki | first2 = G | year = 1991 | title = Excess mass estimates and tests for multimodality | journal = Journal of the American Statistical Association | volume = 86 | issue = 415| pages = 738–746 |jstor=2290406 | doi=10.1080/01621459.1991.10475103}}</ref> the MAP test,<ref name="Rozál1994">{{cite journal | last1 = Rozál | first1 = GPM Hartigan JA | year = 1994 | title = The MAP test for multimodality | journal = Journal of Classification | volume = 11 | issue = 1| pages = 5–36 | doi = 10.1007/BF01201021 | s2cid = 118500771 }}</ref> the [[mode existence test]],<ref name=Minnotte1997>{{cite journal | last1 = Minnotte | first1 = MC | year = 1997 | title = Nonparametric testing of the existence of modes | journal = Annals of Statistics | volume = 25 | issue = 4| pages = 1646–1660 | doi=10.1214/aos/1031594735| doi-access = free }}</ref> the [[runt test]],<ref name=Hartigan1992>{{cite journal | last1 = Hartigan | first1 = JA | last2 = Mohanty | first2 = S | year = 1992 | title = The RUNT test for multimodality | journal = Journal of Classification | volume = 9 | pages = 63–70 | doi=10.1007/bf02618468| s2cid = 121960832 }}</ref><ref name=Andrushkiw2008>{{cite journal |author1=Andrushkiw RI |author2=Klyushin DD |author3=Petunin YI |date=2008 |title=A new test for unimodality |journal=Theory of Stochastic Processes |volume=14 |issue=1 |pages=1–6}}</ref> the [[span test]],<ref name=Hartigan1988>{{cite book |last=Hartigan |first=J. A. |year=1988 |chapter=The Span Test of Multimodality |title=Classification and Related Methods of Data Analysis |editor-first=H. H. |editor-last=Bock |publisher=North-Holland |location=Amsterdam |pages=229–236 |isbn=0-444-70404-3 }}</ref> and the [[saddle test]].

An implementation of the dip test is available for the [[R (programming language)|R programming language]].<ref>{{cite web|url=https://cran.r-project.org/web/packages/diptest/index.html|title=diptest: Hartigan's Dip Test Statistic for Unimodality - Corrected|first1=Martin Maechler (originally from Fortran and S.-plus by Dario|last1=Ringach|last2=NYU.edu)|date=5 December 2016|via=R-Packages}}</ref> The p-values for the dip statistic range between 0 and 1. P-values less than 0.05 indicate significant multimodality, and p-values greater than 0.05 but less than 0.10 suggest multimodality with marginal significance.<ref name=FreemanDale2012>{{cite journal | last1 = Freeman | last2 = Dale | year = 2012 | title = Assessing bimodality to detect the presence of a dual cognitive process | journal = Behavior Research Methods | volume = 45 | issue = 1 | pages = 83–97 | doi = 10.3758/s13428-012-0225-x | pmid = 22806703 | s2cid = 14500508 | url = http://psych.nyu.edu/freemanlab/pubs/2012_BRM.pdf| doi-access = free }}</ref>

===Silverman's test===
Silverman introduced a bootstrap method for testing the number of modes.<ref name=Silverman1981>{{cite journal | last1 = Silverman | first1 = B. W. | year = 1981 | title = Using kernel density estimates to investigate multimodality | journal = Journal of the Royal Statistical Society, Series B | volume = 43 | issue = 1| pages = 97–99 |jstor=2985156| bibcode = 1981JRSSB..43...97S | doi=10.1111/j.2517-6161.1981.tb01155.x}}</ref> The test uses a fixed bandwidth, which reduces the power of the test and its interpretability. Undersmoothed densities may have an excessive number of modes whose count during bootstrapping is unstable.
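
The dependence of the mode count on the bandwidth can be illustrated with a small sketch. It assumes a Gaussian kernel density estimate evaluated on a grid and finds, by bisection, the smallest bandwidth at which the estimate has at most a given number of modes (the "critical bandwidth" on which Silverman's approach is based). The bootstrap step that turns this into a test is omitted, and all function names, the grid size and the sample are assumptions made for this example.

```python
import math


def kde_on_grid(samples, bandwidth, grid_size=400):
    """Gaussian kernel density estimate evaluated on an evenly spaced grid."""
    lo = min(samples) - 3 * bandwidth
    hi = max(samples) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    density = []
    for i in range(grid_size):
        x = lo + i * step
        kernel_sum = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        density.append(norm * kernel_sum)
    return density


def count_modes(samples, bandwidth, grid_size=400):
    """Count local maxima of the gridded density estimate."""
    d = kde_on_grid(samples, bandwidth, grid_size)
    return sum(1 for i in range(1, len(d) - 1) if d[i - 1] < d[i] >= d[i + 1])


def critical_bandwidth(samples, max_modes=1, tol=1e-3):
    """Approximate the smallest bandwidth giving at most `max_modes` modes.

    Relies on the mode count of a Gaussian kernel estimate not increasing
    as the bandwidth grows; the upper bracket is assumed large enough.
    """
    lo, hi = tol, max(samples) - min(samples)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if count_modes(samples, mid) <= max_modes:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical sample with two well-separated clusters
sample = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
modes_small_bandwidth = count_modes(sample, 0.5)
modes_large_bandwidth = count_modes(sample, 5.0)
h_crit = critical_bandwidth(sample, max_modes=1)
```

A large critical bandwidth relative to the spread of the data indicates that heavy smoothing is needed to remove the second mode, which is the evidence a bootstrap procedure would then calibrate.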

===Bajgier-Aggarwal test===
Bajgier and Aggarwal have proposed a test based on the kurtosis of the distribution.<ref name=Bajgier1991>{{cite journal |author1=Bajgier SM |author2=Aggarwal LK |date=1991 |title=Powers of goodness-of-fit tests in detecting balanced mixed normal distributions |journal=Educational and Psychological Measurement |volume=51 |issue=2 |pages=253–269 |doi=10.1177/0013164491512001|s2cid=121113601 }}</ref>

===Special cases===
Additional tests are available for a number of special cases:

;Mixture of two normal distributions
A study of data from a mixture density of two normal distributions found that separation into the two normal distributions was difficult unless the means were separated by 4–6 standard deviations.<ref name=Jackson1898>{{cite journal | last1 = Jackson | first1 = PR | last2 = Tucker | first2 = GT | last3 = Woods | first3 = HF | year = 1989 | title = Testing for bimodality in frequency distributions of data suggesting polymorphisms of drug metabolism--hypothesis testing | journal = British Journal of Clinical Pharmacology | volume = 28 | issue = 6| pages = 655–662 | doi=10.1111/j.1365-2125.1989.tb03558.x| pmid = 2611088 | pmc = 1380036 }}</ref>

In [[astronomy]] the Kernel Mean Matching algorithm is used to decide if a data set belongs to a single normal distribution or to a mixture of two normal distributions.

;Beta-normal distribution
This distribution is bimodal for certain values of its parameters. A test for these values has been described.<ref>{{cite conference|url=http://www.amstat.org/sections/srms/Proceedings/y2002/Files/JSM2002-000150.pdf|archive-url=https://web.archive.org/web/20160304051936/http://www.amstat.org/sections/srms/Proceedings/y2002/Files/JSM2002-000150.pdf|archive-date=2016-03-04|contribution=Beta-normal distribution: Bimodality properties and application|first1=Felix|last1=Famoye|first2=Carl|last2=Lee|first3=Nicholas|last3=Eugene|title=Joint Statistical Meetings - Section on Physical & Engineering Sciences (SPES)|publisher=American Statistical Society|pages=951–956}}</ref>
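
For the mixture of two normal distributions discussed earlier in this section, the following sketch counts the modes of the mixture density itself as the separation between the component means grows. It assumes equal weights and unit standard deviations, for which the density is known to become bimodal once the means are more than two standard deviations apart; this is a weaker condition than the 4–6 standard deviations reported as necessary for reliably separating the components from data. The function names and grid settings are assumptions made for this example.

```python
import math


def mixture_density(x, separation, weight=0.5):
    """Density of weight*N(0, 1) + (1 - weight)*N(separation, 1)."""
    def normal_pdf(z):
        return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return weight * normal_pdf(x) + (1 - weight) * normal_pdf(x - separation)


def count_density_modes(separation, weight=0.5, grid_size=2001):
    """Count local maxima of the mixture density on an evenly spaced grid."""
    lo, hi = -4.0, separation + 4.0
    step = (hi - lo) / (grid_size - 1)
    d = [mixture_density(lo + i * step, separation, weight) for i in range(grid_size)]
    return sum(1 for i in range(1, grid_size - 1) if d[i - 1] < d[i] >= d[i + 1])


# Separations are in units of the common standard deviation (hypothetical values)
modes_by_separation = {d: count_density_modes(d) for d in (1.0, 3.0, 4.0, 6.0)}
```

In this sketch a separation of one standard deviation yields a single mode, while separations of three or more yield two.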