=== Singular value decomposition ===
{{Main|Singular value decomposition}}

The principal components transformation can also be associated with another matrix factorization, the [[singular value decomposition]] (SVD) of '''X''',

:<math>\mathbf{X} = \mathbf{U}\mathbf{\Sigma}\mathbf{W}^\mathsf{T}</math>

Here '''Σ''' is an ''n''-by-''p'' [[Diagonal matrix|rectangular diagonal matrix]] of positive numbers ''σ''<sub>(''k'')</sub>, called the singular values of '''X'''; '''U''' is an ''n''-by-''n'' matrix whose columns are orthogonal unit vectors of length ''n'', called the left singular vectors of '''X'''; and '''W''' is a ''p''-by-''p'' matrix whose columns are orthogonal unit vectors of length ''p'', called the right singular vectors of '''X'''.

In terms of this factorization, the matrix '''X'''<sup>T</sup>'''X''' can be written

:<math>\begin{align} \mathbf{X}^\mathsf{T}\mathbf{X} & = \mathbf{W}\mathbf{\Sigma}^\mathsf{T} \mathbf{U}^\mathsf{T} \mathbf{U}\mathbf{\Sigma}\mathbf{W}^\mathsf{T} \\ & = \mathbf{W}\mathbf{\Sigma}^\mathsf{T} \mathbf{\Sigma} \mathbf{W}^\mathsf{T} \\ & = \mathbf{W}\mathbf{\hat{\Sigma}}^2 \mathbf{W}^\mathsf{T} \end{align}</math>

where <math>\mathbf{\hat{\Sigma}}</math> is the square diagonal matrix with the singular values of '''X''' on its diagonal and the excess zeros chopped off, so that <math>\mathbf{\hat{\Sigma}}^2 = \mathbf{\Sigma}^\mathsf{T} \mathbf{\Sigma}</math>. Comparison with the eigenvector factorization of '''X'''<sup>T</sup>'''X''' establishes that the right singular vectors '''W''' of '''X''' are equivalent to the eigenvectors of '''X'''<sup>T</sup>'''X''', while the singular values ''σ''<sub>(''k'')</sub> of <math>\mathbf{X}</math> are equal to the square roots of the eigenvalues ''λ''<sub>(''k'')</sub> of '''X'''<sup>T</sup>'''X'''.

Using the singular value decomposition, the score matrix '''T''' can be written

:<math>\begin{align} \mathbf{T} & = \mathbf{X} \mathbf{W} \\ & = \mathbf{U}\mathbf{\Sigma}\mathbf{W}^\mathsf{T} \mathbf{W} \\ & = \mathbf{U}\mathbf{\Sigma} \end{align}</math>

so each column of '''T''' is given by one of the left singular vectors of '''X''' multiplied by the corresponding singular value. This form is also the [[polar decomposition]] of '''T'''.

Efficient algorithms exist to calculate the SVD of '''X''' without having to form the matrix '''X'''<sup>T</sup>'''X''', so computing the SVD is now the standard way to calculate a principal components analysis from a data matrix,<ref>{{Cite book |last1=Boyd |first1=Stephen |url=http://dx.doi.org/10.1017/cbo9780511804441 |title=Convex Optimization |last2=Vandenberghe |first2=Lieven |date=2004-03-08 |publisher=Cambridge University Press |doi=10.1017/cbo9780511804441 |isbn=978-0-521-83378-3}}</ref> unless only a handful of components are required.

As with the eigendecomposition, a truncated {{math|''n'' × ''L''}} score matrix '''T'''<sub>''L''</sub> can be obtained by considering only the first ''L'' largest singular values and their singular vectors:

:<math>\mathbf{T}_L = \mathbf{U}_L\mathbf{\Sigma}_L = \mathbf{X} \mathbf{W}_L </math>

Truncating a matrix '''M''' or '''T''' using a truncated singular value decomposition in this way produces a matrix that is the nearest possible matrix of [[Rank (linear algebra)|rank]] ''L'' to the original, in the sense that the difference between the two has the smallest possible [[Frobenius norm]], a result known as the [[Low-rank approximation#Proof of Eckart–Young–Mirsky theorem (for Frobenius norm)|Eckart–Young theorem]] (1936).
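The relations above can be checked numerically. The following is a minimal sketch, not drawn from the article's sources, assuming NumPy and a small random, mean-centered data matrix used purely for illustration; it verifies that the SVD-based scores '''T''' = '''XW''' = '''UΣ''' agree with the eigendecomposition of '''X'''<sup>T</sup>'''X''' and forms the truncated scores '''T'''<sub>''L''</sub>.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical mean-centered data matrix X (n samples, p variables), for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X -= X.mean(axis=0)

# Thin SVD: X = U @ diag(s) @ Wt, where the rows of Wt are the right singular vectors.
U, s, Wt = np.linalg.svd(X, full_matrices=False)

# Scores from the SVD: T = X W = U Sigma.
T_svd = U * s                # same as U @ np.diag(s)
T_proj = X @ Wt.T            # projection of the data onto the right singular vectors
assert np.allclose(T_svd, T_proj)

# The right singular vectors are eigenvectors of X^T X, and the squared
# singular values equal its eigenvalues (compared here in descending order).
eigvals, _ = np.linalg.eigh(X.T @ X)
assert np.allclose(eigvals[::-1], s**2)

# Truncated n-by-L score matrix: keep only the first L components.
L = 2
T_L = U[:, :L] * s[:L]
assert np.allclose(T_L, X @ Wt[:L].T)
</syntaxhighlight>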
<blockquote> '''Theorem (optimal ''k''-dimensional fit).''' Let '''P''' be an ''n''×''m'' data matrix whose columns have been mean-centered and scaled, and let <math>\mathbf{P} = \mathbf{U}\,\mathbf{\Sigma}\,\mathbf{V}^\mathsf{T}</math> be its singular value decomposition. Then the best rank-''k'' approximation to '''P''' in the least-squares (Frobenius-norm) sense is <math>\mathbf{P}_{k} = \mathbf{U}_{k}\,\mathbf{\Sigma}_{k}\,\mathbf{V}_{k}^\mathsf{T}</math>, where '''V'''<sub>''k''</sub> consists of the first ''k'' columns of '''V'''. Moreover, the relative residual variance is <math>R(k)=\frac{\sum_{j=k+1}^{m}\sigma_{j}^{2}}{\sum_{j=1}^{m}\sigma_{j}^{2}}</math>. </blockquote><ref name="Holmes2023" />
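As a further illustration of the theorem, again assuming NumPy and a random matrix standing in for '''P''', the sketch below forms the rank-''k'' approximation from the truncated SVD and checks that the squared Frobenius norm of the residual equals the sum of the discarded squared singular values, from which the relative residual variance ''R''(''k'') follows.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical mean-centered, column-scaled data matrix P, for illustration only.
rng = np.random.default_rng(1)
P = rng.normal(size=(50, 8))
P -= P.mean(axis=0)
P /= P.std(axis=0)

U, sigma, Vt = np.linalg.svd(P, full_matrices=False)

# Best rank-k approximation in the Frobenius-norm sense (Eckart-Young).
k = 3
P_k = U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k]

# Squared Frobenius norm of the residual equals the sum of the discarded
# squared singular values ...
residual = np.linalg.norm(P - P_k, 'fro')**2
assert np.isclose(residual, np.sum(sigma[k:]**2))

# ... so the relative residual variance R(k) is:
R_k = np.sum(sigma[k:]**2) / np.sum(sigma**2)
print(f"R({k}) = {R_k:.3f}")
</syntaxhighlight>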