Principal component analysis
=== Robust PCA === While PCA finds the mathematically optimal solution (in the sense of minimizing the squared reconstruction error), it is still sensitive to [[outlier]]s in the data that produce large errors, something that the method tries to avoid in the first place. It is therefore common practice to remove outliers before computing PCA. However, in some contexts, outliers can be difficult to identify.<ref>{{cite conference | author = Kirill Simonov, Fedor V. Fomin, Petr A. Golovach, Fahad Panolan | title = Refined Complexity of PCA with Outliers | book-title = Proceedings of the 36th International Conference on Machine Learning (ICML 2019) | date = June 9–15, 2019 | location = Long Beach, California, USA | publisher = PMLR | volume = 97 | pages = 5818–5826 | url = http://proceedings.mlr.press/v97/simonov19a.html | editor = Kamalika Chaudhuri, Ruslan Salakhutdinov }}</ref> For example, in [[data mining]] algorithms like [[correlation clustering]], the assignment of points to clusters and outliers is not known beforehand. A proposed generalization of PCA<ref>{{Cite book | doi = 10.1007/978-3-540-69497-7_27 | isbn = 978-3-540-69476-2 | series = Lecture Notes in Computer Science | year = 2008 | last1 = Kriegel | first1 = H. P. | last2 = Kröger | first2 = P. | last3 = Schubert | first3 = E. | last4 = Zimek | first4 = A. | title = Scientific and Statistical Database Management | chapter = A General Framework for Increasing the Robustness of PCA-Based Correlation Clustering Algorithms | volume = 5069 | pages = 418–435 | citeseerx = 10.1.1.144.4864 }}</ref> based on weighted PCA increases robustness by assigning different weights to data objects according to their estimated relevancy.
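The effect of weighting can be illustrated with a short sketch: down-weighting a suspected outlier when forming the covariance matrix keeps the leading principal direction aligned with the bulk of the data. The weighting scheme below is a simplified illustration, not the cited correlation-clustering algorithm, and the data and weights are invented for the example.

```python
import numpy as np

def weighted_pca(X, w, k=1):
    """Illustrative weighted PCA: eigendecomposition of a weighted
    covariance matrix (simplified sketch, not the cited algorithm)."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                    # normalize relevance weights
    mu = w @ X                         # weighted mean
    Xc = X - mu                        # center the data
    C = (Xc * w[:, None]).T @ Xc       # weighted covariance matrix
    vals, vecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]     # sort descending
    return vecs[:, order[:k]], vals[order[:k]]

# Four points on the line y = x plus one gross outlier,
# which is assigned a small relevance weight.
X = np.array([[0., 0.], [1., 1.], [2., 2.], [3., 3.], [10., -10.]])
w = np.array([1., 1., 1., 1., 0.01])
V, _ = weighted_pca(X, w, k=1)
```

With uniform weights the outlier dominates the covariance and pulls the first component toward the direction (1, −1); the small weight keeps it close to the trend direction (1, 1) of the remaining points.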
Outlier-resistant variants of PCA have also been proposed, based on L1-norm formulations ([[L1-norm principal component analysis|L1-PCA]]).<ref name="mark2014"/><ref name="mark2017" /> [[Robust principal component analysis]] (RPCA) via decomposition into low-rank and sparse matrices is a modification of PCA that works well with respect to grossly corrupted observations.<ref name=RPCA>{{cite journal|last=Emmanuel J. Candes|author2=Xiaodong Li |author3=Yi Ma |author4=John Wright |title=Robust Principal Component Analysis?|journal=Journal of the ACM|volume=58|issue=3|pages=11 |doi=10.1145/1970392.1970395|arxiv=0912.3599|year=2011 |s2cid=7128002 }}</ref><ref name=RPCA-BOUWMANS>{{cite journal|last=T. Bouwmans|author2= E. Zahzah|title=Robust PCA via Principal Component Pursuit: A Review for a Comparative Evaluation in Video Surveillance|journal=Computer Vision and Image Understanding|volume= 122|pages= 22–34|year=2014|doi= 10.1016/j.cviu.2013.11.009}}</ref><ref name=RPCA-BOUWMANS-COSREV>{{Cite journal|last=T. Bouwmans|author2= A. Sobral|author3= S. Javed|author4= S. Jung|author5= E. Zahzah|title=Decomposition into Low-rank plus Additive Matrices for Background/Foreground Separation: A Review for a Comparative Evaluation with a Large-Scale Dataset |journal= Computer Science Review|volume= 23|pages= 1–71|arxiv=1511.01245|year=2015|doi= 10.1016/j.cosrev.2016.11.001|bibcode= 2015arXiv151101245B|s2cid= 10420698}}</ref>
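The low-rank plus sparse decomposition can be computed by Principal Component Pursuit, which alternates singular-value thresholding (for the low-rank part) with elementwise soft-thresholding (for the sparse part). The following is a minimal augmented-Lagrangian sketch in that spirit; the parameter defaults follow choices commonly used with the method of Candès et al., and the function name and stopping rule are illustrative, not a reference implementation.

```python
import numpy as np

def shrink(M, tau):
    """Elementwise soft-thresholding operator."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svd_shrink(M, tau):
    """Soft-threshold the singular values of M (singular-value thresholding)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca_pcp(M, lam=None, mu=None, iters=500, tol=1e-7):
    """Sketch of Principal Component Pursuit: split M into a low-rank
    part L and a sparse part S by alternating thresholding steps.
    Defaults lam = 1/sqrt(max(m, n)) and the mu heuristic are common
    choices, not tuned values."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)               # dual variable
    normM = np.linalg.norm(M)
    for _ in range(iters):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)   # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)       # sparse update
        R = M - L - S                              # residual
        Y = Y + mu * R                             # dual ascent
        if np.linalg.norm(R) <= tol * normM:
            break
    return L, S
```

On a synthetic matrix built as rank-one structure plus a few large spikes, the recovered L stays close to the clean low-rank component even though the spikes would badly distort ordinary PCA.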