Principal component analysis
=== Limitations ===
As noted above, the results of PCA depend on the scaling of the variables. This can be remedied by scaling each feature by its standard deviation, so that one ends up with dimensionless features with unit variance.<ref name=Leznik>Leznik, M; Tofallis, C. 2005 [https://uhra.herts.ac.uk/bitstream/handle/2299/715/S56.pdf Estimating Invariant Principal Components Using Diagonal Regression.]</ref>

The applicability of PCA as described above is limited by certain (tacit) assumptions<ref>Jonathon Shlens, [https://arxiv.org/abs/1404.1100 A Tutorial on Principal Component Analysis.]</ref> made in its derivation. In particular, PCA can capture linear correlations between the features but fails when the relationships between them are not linear (see Figure 6a in the reference). In some cases, coordinate transformations can restore the linearity assumption and PCA can then be applied (see [[Kernel principal component analysis|kernel PCA]]).

Another limitation is the mean-removal process before constructing the covariance matrix for PCA.
In fields such as astronomy, all the signals are non-negative, and the mean-removal process will force the mean of some astrophysical exposures to be zero, which consequently creates unphysical negative fluxes,<ref name="soummer12"/> and forward modeling has to be performed to recover the true magnitude of the signals.<ref name="pueyo16">{{Cite journal|arxiv= 1604.06097 |last1= Pueyo|first1= Laurent |title= Detection and Characterization of Exoplanets using Projections on Karhunen Loeve Eigenimages: Forward Modeling |journal= The Astrophysical Journal |volume= 824|issue= 2|pages= 117|year= 2016|doi= 10.3847/0004-637X/824/2/117|bibcode = 2016ApJ...824..117P|s2cid= 118349503|doi-access= free}}</ref> As an alternative method, [[non-negative matrix factorization]], which focuses only on the non-negative elements in the matrices, is well-suited for astrophysical observations.<ref name="blantonRoweis07"/><ref name="zhu16"/><ref name="ren18"/> See more at [[#Non-negative matrix factorization|the relation between PCA and non-negative matrix factorization]].

PCA is at a disadvantage if the data has not been standardized before the algorithm is applied to it. PCA transforms the original data into data expressed in terms of the principal components of that data, which means that the new variables cannot be interpreted in the same way as the originals. They are linear combinations of the original variables. Also, if PCA is not performed properly, there is a high likelihood of information loss.<ref>{{cite web | title=What are the Pros and cons of the PCA? | website=i2tutorials | date=September 1, 2019 | url=https://www.i2tutorials.com/what-are-the-pros-and-cons-of-the-pca/ | access-date=June 4, 2021}}</ref>

PCA relies on a linear model.
If a dataset has a pattern hidden inside it that is nonlinear, then PCA can actually steer the analysis in the opposite direction of progress.<ref name=abbott>{{cite book | title=Applied Predictive Analytics | last=Abbott | first=Dean | isbn=9781118727966 | date=May 2014 | publisher=Wiley}}</ref>{{Page needed|date=June 2021}}

Researchers at Kansas State University discovered that the sampling error in their experiments affected the bias of PCA results. "If the number of subjects or blocks is smaller than 30, and/or the researcher is interested in PC's beyond the first, it may be better to first correct for the serial correlation, before PCA is conducted".<ref name=jiang /> The researchers at Kansas State also found that PCA could be "seriously biased if the autocorrelation structure of the data is not correctly handled".<ref name=jiang>{{cite journal| title=Bias in Principal Components Analysis Due to Correlated Observations| url=https://newprairiepress.org/agstatconference/2000/proceedings/13/ |last1=Jiang | first1=Hong| last2=Eskridge | first2=Kent M.| year=2000 | journal=Conference on Applied Statistics in Agriculture |issn=2475-7772| doi=10.4148/2475-7772.1247| doi-access=free}}</ref>
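
The scaling dependence described at the start of this section can be illustrated with a minimal sketch. The data below are synthetic and purely hypothetical (a height-like and a weight-like variable), and the helper names <code>first_pc_2d</code> and <code>standardize</code> are assumptions made for this illustration, not part of any library. For two variables, the direction of the first principal component can be obtained in closed form from the 2×2 sample covariance matrix, so only the Python standard library is needed.

```python
import math
import random


def first_pc_2d(xs, ys):
    """Unit vector of the first principal component of 2-D data.

    The data are mean-centred, the 2x2 sample covariance matrix
    [[sxx, sxy], [sxy, syy]] is formed, and the orientation of its
    leading eigenvector is taken from the closed-form angle
    theta = 0.5 * atan2(2 * sxy, sxx - syy).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]


# Hypothetical synthetic data: heights in metres, weights in kilograms.
random.seed(0)
height_m = [random.gauss(1.7, 0.1) for _ in range(200)]
weight_kg = [60 + 50 * (h - 1.7) + random.gauss(0, 5) for h in height_m]

# The same heights expressed in millimetres: only the unit changes.
height_mm = [1000 * h for h in height_m]

# Without scaling, the first component follows whichever variable has
# the larger numerical variance, so it changes with the choice of unit.
print("metres, kg:      ", first_pc_2d(height_m, weight_kg))
print("millimetres, kg: ", first_pc_2d(height_mm, weight_kg))

# After standardization both versions give the same direction.
print("standardized (m): ",
      first_pc_2d(standardize(height_m), standardize(weight_kg)))
print("standardized (mm):",
      first_pc_2d(standardize(height_mm), standardize(weight_kg)))
```

In this sketch the unscaled first component points almost entirely along the weight axis when height is in metres and almost entirely along the height axis when height is in millimetres, whereas the standardized data yield the same direction in both cases. For two positively correlated standardized variables that direction weights both variables equally.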