
{{Short description|Difficulties arising when analyzing data with many aspects ("dimensions")}}
{{Machine learning}}
The '''curse of dimensionality''' refers to various phenomena that arise when analyzing and organizing data in [[high-dimensional space]]s that do not occur in low-dimensional settings such as the [[three-dimensional space|three-dimensional]] [[physical space]] of everyday experience. The expression was coined by [[Richard E. Bellman]] when considering problems in [[dynamic programming]].<ref>{{Cite book|first=Richard Ernest |last=Bellman|author2=Rand Corporation|title=Dynamic programming|url=https://books.google.com/books?id=wdtoPwAACAAJ|year=1957|publisher=Princeton University Press|isbn=978-0-691-07951-6|page=ix}},<br />Republished: {{Cite book|first=Richard Ernest |last=Bellman|title=Dynamic Programming|url=https://books.google.com/books?id=fyVtp3EMxasC|year=2003|publisher=Courier Dover Publications|isbn=978-0-486-42809-3}}</ref><ref>{{Cite book|first=Richard Ernest |last=Bellman|title=Adaptive control processes: a guided tour|url=https://books.google.com/books?id=POAmAAAAMAAJ|year=1961|publisher=Princeton University Press|isbn=9780691079011}}</ref> The curse generally refers to issues that arise when the number of data points is small (in a suitably defined sense) relative to the intrinsic dimension of the data.

The curse of dimensionality arises in domains such as [[numerical analysis]], [[Sampling (statistics)|sampling]], [[combinatorics]], [[machine learning]], [[data mining]] and [[database]]s. The common theme of these problems is that when the dimensionality increases, the [[volume]] of the space increases so fast that the available data become sparse. To obtain a reliable result, the amount of data needed often grows exponentially with the dimensionality. Also, organizing and searching data often relies on detecting areas where objects form groups with similar properties; in high-dimensional data, however, all objects appear to be sparse and dissimilar in many ways, which prevents common data organization strategies from being efficient.
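
The two effects described above can be illustrated with a minimal sketch, which is not drawn from the cited sources. It assumes data spread over the unit hypercube, and the function names <code>grid_points_needed</code> and <code>relative_contrast</code> are hypothetical helpers introduced only for this illustration. The first counts the points in a regular grid with a fixed number of samples per axis, which grows exponentially with the dimension. The second measures how much the distances from one corner of the hypercube to uniformly random points differ from each other, relative to the smallest distance.

```python
import math
import random


def grid_points_needed(dim, points_per_axis=10):
    """Points in a regular grid with `points_per_axis` samples along each axis.

    Assumption for illustration: "covering" the space means sampling the
    unit hypercube at a fixed resolution per axis.
    """
    return points_per_axis ** dim


def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of the Euclidean distances from the origin
    to `n_points` uniformly random points in the unit hypercube [0, 1]^dim.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same value
    dists = [
        math.sqrt(sum(rng.random() ** 2 for _ in range(dim)))
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    for d in (1, 2, 10, 100):
        print(d, grid_points_needed(d), round(relative_contrast(d), 3))
```

With ten samples per axis, the grid size rises from 10 points in one dimension to 10<sup>10</sup> in ten dimensions. In this uniform setting the relative contrast tends to shrink as the dimension grows, so the nearest and farthest points become harder to tell apart; real data with lower intrinsic dimension may behave differently.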

{{toclimit|3}}