== Generalizations ==

=== Sparse PCA ===
{{main|Sparse PCA}}
A particular disadvantage of PCA is that the principal components are usually linear combinations of all input variables. [[Sparse PCA]] overcomes this disadvantage by finding linear combinations that contain just a few input variables; it extends classic PCA by adding a sparsity constraint on the input variables. Several approaches have been proposed, including
* a regression framework,<ref>{{cite journal |author1=Hui Zou |author2=Trevor Hastie |author3=Robert Tibshirani |year=2006 |title=Sparse principal component analysis |journal=[[Journal of Computational and Graphical Statistics]] |volume=15 |issue=2 |pages=262–286 |url=http://www-stat.stanford.edu/~hastie/Papers/spc_jcgs.pdf |doi=10.1198/106186006x113430 |citeseerx=10.1.1.62.580 |s2cid=5730904}}</ref>
* a convex relaxation/semidefinite programming framework,<ref name="SDP">{{cite journal |author1=Alexandre d'Aspremont |author2=Laurent El Ghaoui |author3=Michael I. Jordan |author4=Gert R. G. Lanckriet |year=2007 |title=A Direct Formulation for Sparse PCA Using Semidefinite Programming |journal=[[SIAM Review]] |volume=49 |issue=3 |pages=434–448 |url=http://www.cmap.polytechnique.fr/~aspremon/PDF/sparsesvd.pdf |doi=10.1137/050645506 |arxiv=cs/0406021 |s2cid=5490061}}</ref>
* a generalized power method framework,<ref>{{cite journal |author1=Michel Journee |author2=Yurii Nesterov |author3=Peter Richtarik |author4=Rodolphe Sepulchre |year=2010 |title=Generalized Power Method for Sparse Principal Component Analysis |journal=[[Journal of Machine Learning Research]] |volume=11 |pages=517–553 |url=http://jmlr.csail.mit.edu/papers/volume11/journee10a/journee10a.pdf |arxiv=0811.4724 |id=CORE Discussion Paper 2008/70 |bibcode=2008arXiv0811.4724J}}</ref>
* an alternating maximization framework,<ref>{{cite arXiv |author1=Peter Richtarik |author2=Martin Takac |author3=S. Damla Ahipasaoglu |year=2012 |title=Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes |eprint=1212.4137 |class=stat.ML}}</ref>
* forward-backward greedy search and exact methods using branch-and-bound techniques,<ref>{{cite conference |author1=Baback Moghaddam |author2=Yair Weiss |author3=Shai Avidan |year=2005 |chapter=Spectral Bounds for Sparse PCA: Exact and Greedy Algorithms |title=Advances in Neural Information Processing Systems |volume=18 |publisher=MIT Press |chapter-url=http://books.nips.cc/papers/files/nips18/NIPS2005_0643.pdf}}</ref>
* a Bayesian formulation framework.<ref>{{cite journal |author1=Yue Guan |author2=Jennifer Dy |author2-link=Jennifer Dy |year=2009 |title=Sparse Probabilistic Principal Component Analysis |journal=[[Journal of Machine Learning Research Workshop and Conference Proceedings]] |volume=5 |page=185 |url=http://jmlr.csail.mit.edu/proceedings/papers/v5/guan09a/guan09a.pdf}}</ref>
The methodological and theoretical developments of sparse PCA, as well as its applications in scientific studies, were recently reviewed in a survey paper.<ref>{{cite journal |author1=Hui Zou |author2=Lingzhou Xue |year=2018 |title=A Selective Overview of Sparse Principal Component Analysis |journal=[[Proceedings of the IEEE]] |volume=106 |issue=8 |pages=1311–1320 |doi=10.1109/JPROC.2018.2846588 |doi-access=free}}</ref>
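As a concrete illustration, a regression-type formulation is available in common software. The following is a minimal sketch using scikit-learn's <code>SparsePCA</code>; the toy data, the number of components, and the penalty <code>alpha</code> are assumptions chosen for the example, not recommendations:

<syntaxhighlight lang="python">
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.RandomState(0)
X = rng.randn(100, 30)  # toy data: 100 samples, 30 input variables

# Fit five sparse components; alpha controls the strength of the
# L1 (lasso) penalty and hence how many loadings are exactly zero.
spca = SparsePCA(n_components=5, alpha=1.0, random_state=0)
X_reduced = spca.fit_transform(X)  # shape (100, 5)

# Unlike classic PCA, most loadings vanish, so each component
# depends on only a few of the 30 original variables.
print("fraction of zero loadings:", np.mean(spca.components_ == 0))
</syntaxhighlight>

Larger values of the penalty drive more loadings to exactly zero, trading explained variance for interpretability.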
=== Nonlinear PCA ===
[[File:Elmap breastcancer wiki.png|thumb|300px|Linear PCA versus nonlinear principal manifolds<ref>[[Alexander Nikolaevich Gorban|A. N. Gorban]], A. Y. Zinovyev, [https://arxiv.org/abs/0809.0490 "Principal Graphs and Manifolds"], in: ''Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods and Techniques'', Olivas E.S. et al. (Eds.), Information Science Reference, IGI Global: Hershey, PA, USA, 2009. 28–59.</ref> for [[Scientific visualization|visualization]] of [[breast cancer]] [[microarray]] data: a) configuration of nodes and 2D principal surface in the 3D PCA linear manifold. The dataset is curved and cannot be mapped adequately onto a 2D principal plane; b) the distribution in the internal 2D nonlinear principal surface coordinates (ELMap2D) together with an estimation of the density of points; c) the same as b), but for the linear 2D PCA manifold (PCA2D). The "basal" breast cancer subtype is visualized more adequately with ELMap2D, and some features of the distribution become better resolved in comparison to PCA2D. Principal manifolds are produced by the [[elastic map]]s algorithm. Data are available for public competition.<ref>{{cite journal |last1=Wang |first1=Y. |last2=Klijn |first2=J. G. |last3=Zhang |first3=Y. |last4=Sieuwerts |first4=A. M. |last5=Look |first5=M. P. |last6=Yang |first6=F. |last7=Talantov |first7=D. |last8=Timmermans |first8=M. |last9=Meijer-van Gelder |first9=M. E. |last10=Yu |first10=J. |title=Gene expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer |journal=[[The Lancet]] |volume=365 |issue=9460 |pages=671–679 |year=2005 |doi=10.1016/S0140-6736(05)17947-1 |pmid=15721472 |s2cid=16358549 |display-authors=etal}} [https://www.ihes.fr/~zinovyev/princmanif2006/ Data online]</ref> Software is available for free non-commercial use.<ref>{{cite web |first=A. |last=Zinovyev |url=http://bioinfo-out.curie.fr/projects/vidaexpert/ |title=ViDaExpert – Multidimensional Data Visualization Tool |work=[[Curie Institute (Paris)|Institut Curie]] |location=Paris}} (free for non-commercial use)</ref>]]
Most of the modern methods for [[nonlinear dimensionality reduction]] find their theoretical and algorithmic roots in PCA or K-means. Pearson's original idea was to take a straight line (or plane) that would be "the best fit" to a set of data points. [[Trevor Hastie]] expanded on this concept by proposing '''principal [[curve]]s'''<ref>{{cite journal |author1-last=Hastie |author1-first=T. |author1-link=Trevor Hastie |author2-last=Stuetzle |author2-first=W. |title=Principal Curves |journal=[[Journal of the American Statistical Association]] |date=June 1989 |volume=84 |issue=406 |pages=502–506 |doi=10.1080/01621459.1989.10478797 |url=https://web.stanford.edu/~hastie/Papers/Principal_Curves.pdf}}</ref> as the natural extension of the geometric interpretation of PCA, which explicitly constructs a manifold for data [[approximation]] and then [[Projection (mathematics)|projects]] the points onto it. See also the [[elastic map]] algorithm and [[principal geodesic analysis]].<ref>A.N. Gorban, B. Kegl, D.C. Wunsch, A. Zinovyev (Eds.), [https://www.researchgate.net/publication/271642170_Principal_Manifolds_for_Data_Visualisation_and_Dimension_Reduction_LNCSE_58 Principal Manifolds for Data Visualisation and Dimension Reduction], LNCSE 58, Springer, Berlin – Heidelberg – New York, 2007. {{isbn|978-3-540-73749-0}}</ref>

Another popular generalization is [[kernel PCA]], which corresponds to PCA performed in a reproducing kernel Hilbert space associated with a positive definite kernel.
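A minimal sketch of kernel PCA using scikit-learn's <code>KernelPCA</code>; the concentric-circles toy data and the RBF kernel width <code>gamma</code> are assumptions chosen for illustration:

<syntaxhighlight lang="python">
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA, PCA

# Two concentric circles: no linear projection separates the classes.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# PCA in the feature space induced by an RBF kernel: the classes
# become separable along the leading nonlinear components.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0)
X_kpca = kpca.fit_transform(X)

# Ordinary linear PCA, for comparison, only rotates the data.
X_pca = PCA(n_components=2).fit_transform(X)
</syntaxhighlight>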
In [[multilinear subspace learning]],<ref name="Vasilescu2003">{{cite conference |first1=M.A.O. |last1=Vasilescu |first2=D. |last2=Terzopoulos |url=http://www.cs.toronto.edu/~maov/tensorfaces/cvpr03.pdf |title=Multilinear Subspace Analysis of Image Ensembles |conference=Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'03) |location=Madison, WI |year=2003}}</ref><ref name="Vasilescu2002tensorfaces">{{cite book |first1=M.A.O. |last1=Vasilescu |first2=D. |last2=Terzopoulos |url=http://www.cs.toronto.edu/~maov/tensorfaces/Springer%20ECCV%202002_files/eccv02proceeding_23500447.pdf |title=Multilinear Analysis of Image Ensembles: TensorFaces |series=Lecture Notes in Computer Science 2350; (Presented at Proc. 7th European Conference on Computer Vision (ECCV'02), Copenhagen, Denmark) |publisher=Springer, Berlin, Heidelberg |doi=10.1007/3-540-47969-4_30 |isbn=978-3-540-43745-1 |year=2002}}</ref><ref name="MPCA-MICA2005">{{cite conference |first1=M.A.O. |last1=Vasilescu |first2=D. |last2=Terzopoulos |url=http://www.media.mit.edu/~maov/mica/mica05.pdf |title=Multilinear Independent Component Analysis |conference=Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05) |location=San Diego, CA |date=June 2005 |volume=1 |pages=547–553}}</ref> PCA is generalized to [[multilinear principal component analysis|multilinear PCA]] (MPCA), which extracts features directly from tensor representations. MPCA is solved by performing PCA in each mode of the tensor iteratively. It has been applied to face recognition, gait recognition, and other problems, and has been further extended to uncorrelated MPCA, non-negative MPCA and robust MPCA.
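The mode-wise iteration can be sketched directly in NumPy. The following illustrative implementation alternately computes a projection matrix for each mode of a third-order data tensor; the ranks, iteration count, and this particular alternating scheme are simplifying assumptions, not the exact algorithm of any specific paper:

<syntaxhighlight lang="python">
import numpy as np

def mpca(X, ranks=(5, 5), n_iter=5):
    """X: array of shape (n_samples, I1, I2). Returns mode projections and features."""
    n, I1, I2 = X.shape
    Xc = X - X.mean(axis=0)                     # center over samples
    U1 = np.eye(I1)[:, :ranks[0]]               # initial mode-1 projection
    U2 = np.eye(I2)[:, :ranks[1]]               # initial mode-2 projection
    for _ in range(n_iter):
        # Mode-1 scatter with mode 2 projected out; keep top eigenvectors.
        Y = np.einsum('nij,jb->nib', Xc, U2)
        S1 = np.einsum('nib,njb->ij', Y, Y)
        U1 = np.linalg.eigh(S1)[1][:, ::-1][:, :ranks[0]]
        # Mode-2 scatter with mode 1 projected out.
        Y = np.einsum('nij,ia->naj', Xc, U1)
        S2 = np.einsum('naj,nak->jk', Y, Y)
        U2 = np.linalg.eigh(S2)[1][:, ::-1][:, :ranks[1]]
    # Each sample projected into the reduced multilinear space.
    Z = np.einsum('nij,ia,jb->nab', Xc, U1, U2)
    return U1, U2, Z
</syntaxhighlight>

Because each mode keeps its own projection matrix, a sample such as an image is reduced without first being flattened into a long vector, which is the point of the tensor formulation.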
''N''-way principal component analysis may be performed with models such as [[Tucker decomposition]], [[PARAFAC]], multiple factor analysis, co-inertia analysis, STATIS, and DISTATIS.

=== Robust PCA ===
While PCA finds the mathematically optimal solution in the least-squares sense, it is still sensitive to [[outlier]]s in the data that produce large errors, something that the method tries to avoid in the first place. It is therefore common practice to remove outliers before computing PCA. However, in some contexts, outliers can be difficult to identify.<ref>{{cite conference |author=Kirill Simonov, Fedor V. Fomin, Petr A. Golovach, Fahad Panolan |title=Refined Complexity of PCA with Outliers |book-title=Proceedings of the 36th International Conference on Machine Learning (ICML 2019) |date=June 9–15, 2019 |location=Long Beach, California, USA |publisher=PMLR |volume=97 |pages=5818–5826 |url=http://proceedings.mlr.press/v97/simonov19a.html |editor=Kamalika Chaudhuri, Ruslan Salakhutdinov}}</ref> For example, in [[data mining]] algorithms like [[correlation clustering]], the assignment of points to clusters and outliers is not known beforehand. A recently proposed generalization of PCA<ref>{{Cite book |doi=10.1007/978-3-540-69497-7_27 |isbn=978-3-540-69476-2 |series=Lecture Notes in Computer Science |year=2008 |last1=Kriegel |first1=H. P. |last2=Kröger |first2=P. |last3=Schubert |first3=E. |last4=Zimek |first4=A. |title=Scientific and Statistical Database Management |chapter=A General Framework for Increasing the Robustness of PCA-Based Correlation Clustering Algorithms |volume=5069 |pages=418–435 |citeseerx=10.1.1.144.4864}}</ref> based on weighted PCA increases robustness by assigning different weights to data objects based on their estimated relevancy. Outlier-resistant variants of PCA have also been proposed, based on L1-norm formulations ([[L1-norm principal component analysis|L1-PCA]]).<ref name="mark2014"/><ref name="mark2017"/>

[[Robust principal component analysis]] (RPCA) via decomposition into low-rank and sparse matrices is a modification of PCA that works well with respect to grossly corrupted observations.<ref name=RPCA>{{cite journal |last=Emmanuel J. Candes |author2=Xiaodong Li |author3=Yi Ma |author4=John Wright |title=Robust Principal Component Analysis? |journal=Journal of the ACM |volume=58 |issue=3 |pages=11 |doi=10.1145/1970392.1970395 |arxiv=0912.3599 |year=2011 |s2cid=7128002}}</ref><ref name=RPCA-BOUWMANS>{{cite journal |last=T. Bouwmans |author2=E. Zahzah |title=Robust PCA via Principal Component Pursuit: A Review for a Comparative Evaluation in Video Surveillance |journal=Computer Vision and Image Understanding |volume=122 |pages=22–34 |year=2014 |doi=10.1016/j.cviu.2013.11.009}}</ref><ref name=RPCA-BOUWMANS-COSREV>{{Cite journal |last=T. Bouwmans |author2=A. Sobral |author3=S. Javed |author4=S. Jung |author5=E. Zahzah |title=Decomposition into Low-rank plus Additive Matrices for Background/Foreground Separation: A Review for a Comparative Evaluation with a Large-Scale Dataset |journal=Computer Science Review |volume=23 |pages=1–71 |arxiv=1511.01245 |year=2015 |doi=10.1016/j.cosrev.2016.11.001 |bibcode=2015arXiv151101245B |s2cid=10420698}}</ref>
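A minimal sketch of the principal component pursuit approach to RPCA, alternating singular-value thresholding for the low-rank part with entrywise soft-thresholding for the sparse part; the step-size heuristic and stopping rule below are common assumptions rather than the tuned choices of production solvers:

<syntaxhighlight lang="python">
import numpy as np

def shrink(A, tau):
    """Entrywise soft-thresholding."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def rpca_pcp(M, max_iter=500, tol=1e-7):
    """Decompose M into low-rank L plus sparse S (augmented Lagrangian sketch)."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))           # standard PCP sparsity weight
    mu = m * n / (4.0 * np.abs(M).sum())     # common step-size heuristic
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                     # dual variable
    for _ in range(max_iter):
        # Singular-value thresholding updates the low-rank part.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(shrink(sig, 1.0 / mu)) @ Vt
        # Soft-thresholding updates the sparse (gross-corruption) part.
        S = shrink(M - L + Y / mu, lam / mu)
        Y = Y + mu * (M - L - S)
        if np.linalg.norm(M - L - S) <= tol * np.linalg.norm(M):
            break
    return L, S
</syntaxhighlight>

In background/foreground separation for video, for instance, the columns of M are video frames, L recovers the static background, and S captures the moving objects.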