== Overview ==
When performing PCA, the first principal component of a set of <math>p</math> variables is the derived variable formed as a linear combination of the original variables that explains the most variance. The second principal component explains the most variance in what is left once the effect of the first component is removed, and we may proceed through <math>p</math> iterations until all the variance is explained. PCA is most commonly used when many of the variables are highly correlated with each other and it is desirable to reduce their number to an [[linear independence|independent set]].

The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data. The <math>i</math>-th principal component can be taken as a direction orthogonal to the first <math>i-1</math> principal components that maximizes the variance of the projected data. For either objective, it can be shown that the principal components are [[eigenvectors]] of the data's [[covariance matrix]]. Thus, the principal components are often computed by [[Eigendecomposition of a matrix|eigendecomposition]] of the data covariance matrix or [[singular value decomposition]] of the data matrix.

PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to [[factor analysis]]. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to [[Canonical correlation|canonical correlation analysis (CCA)]]. CCA defines coordinate systems that optimally describe the [[cross-covariance]] between two datasets, while PCA defines a new [[orthogonal coordinate system]] that optimally describes variance in a single dataset.<ref>{{Cite journal |author1=Barnett, T. P. |author2=R. Preisendorfer. |name-list-style=amp |title=Origins and levels of monthly and seasonal forecast skill for United States surface air temperatures determined by canonical correlation analysis |journal=Monthly Weather Review |volume=115 |issue=9 |pages=1825 |year=1987 |doi=10.1175/1520-0493(1987)115<1825:oaloma>2.0.co;2 |bibcode=1987MWRv..115.1825B |doi-access=free}}</ref><ref>{{Cite book |last1=Hsu |first1=Daniel |first2=Sham M. |last2=Kakade |first3=Tong |last3=Zhang |title=A spectral algorithm for learning hidden Markov models |arxiv=0811.4413 |year=2008 |bibcode=2008arXiv0811.4413H}}</ref><ref name="mark2017">{{cite journal |last1=Markopoulos |first1=Panos P. |last2=Kundu |first2=Sandipan |last3=Chamadia |first3=Shubham |last4=Pados |first4=Dimitris A. |title=Efficient L1-Norm Principal-Component Analysis via Bit Flipping |journal=IEEE Transactions on Signal Processing |date=15 August 2017 |volume=65 |issue=16 |pages=4252–4264 |doi=10.1109/TSP.2017.2708023 |arxiv=1610.01959 |bibcode=2017ITSP...65.4252M |s2cid=7931130}}</ref><ref name="l1tucker">{{cite journal |last1=Chachlakis |first1=Dimitris G. |last2=Prater-Bennette |first2=Ashley |last3=Markopoulos |first3=Panos P. |title=L1-norm Tucker Tensor Decomposition |journal=IEEE Access |date=22 November 2019 |volume=7 |pages=178454–178465 |doi=10.1109/ACCESS.2019.2955134 |arxiv=1904.06455 |doi-access=free |bibcode=2019IEEEA...7q8454C}}</ref> [[Robust principal component analysis|Robust]] and [[Lp space|L1-norm]]-based variants of standard PCA have also been proposed.<ref name="mark2014">{{cite journal |last1=Markopoulos |first1=Panos P. |last2=Karystinos |first2=George N. |last3=Pados |first3=Dimitris A. |title=Optimal Algorithms for L1-subspace Signal Processing |journal=IEEE Transactions on Signal Processing |date=October 2014 |volume=62 |issue=19 |pages=5046–5058 |doi=10.1109/TSP.2014.2338077 |arxiv=1405.6785 |bibcode=2014ITSP...62.5046M |s2cid=1494171}}</ref><ref>{{cite journal |last1=Zhan |first1=J. |last2=Vaswani |first2=N. |date=2015 |title=Robust PCA With Partial Subspace Knowledge |url=https://doi.org/10.1109/tsp.2015.2421485 |journal=IEEE Transactions on Signal Processing |volume=63 |issue=13 |pages=3332–3347 |doi=10.1109/tsp.2015.2421485 |arxiv=1403.1591 |bibcode=2015ITSP...63.3332Z |s2cid=1516440}}</ref><ref>{{cite book |last1=Kanade |first1=T. |last2=Ke |first2=Qifa |title=2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) |chapter=Robust L₁ Norm Factorization in the Presence of Outliers and Missing Data by Alternative Convex Programming |volume=1 |pages=739–746 |date=June 2005 |doi=10.1109/CVPR.2005.309 |publisher=IEEE |isbn=978-0-7695-2372-9 |citeseerx=10.1.1.63.4605 |s2cid=17144854}}</ref><ref name="l1tucker" />
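The computation outlined above can be illustrated with a minimal sketch using the [[NumPy]] library (the data, variable names, and component count here are illustrative only, not part of any standard): the principal directions are taken from the [[singular value decomposition]] of the mean-centered data matrix and coincide, up to sign, with the eigenvectors of the sample covariance matrix.

<syntaxhighlight lang="python">
import numpy as np

# Illustrative data: n = 200 observations of p = 3 correlated variables
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [1.5, 1.0, 0.0],
                                          [0.5, 0.3, 0.2]])

# Center each variable by subtracting its column mean
Xc = X - X.mean(axis=0)

# PCA via singular value decomposition of the centered data matrix:
# the rows of Vt are the principal directions, and the singular values
# give the variance explained by each component as s**2 / (n - 1).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_variance = s**2 / (Xc.shape[0] - 1)

# Equivalent route: eigendecomposition of the sample covariance matrix.
# Its eigenvectors match the right singular vectors up to sign.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)              # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending

# Project the data onto the principal components (the "scores")
scores = Xc @ Vt.T
</syntaxhighlight>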