=== Kernel principal component analysis ===
Perhaps the most widely used algorithm for dimensionality reduction is [[kernel principal component analysis|kernel PCA]].<ref>{{cite journal |first1=B. |last1=Schölkopf |first2=A. |last2=Smola |author3-link=Klaus-Robert Müller |first3=K.-R. |last3=Müller |title=Nonlinear Component Analysis as a Kernel Eigenvalue Problem |journal=Neural Computation |volume=10 |issue=5 |pages=1299–1319 |date=1998 |publisher=[[MIT Press]] |doi=10.1162/089976698300017467 |s2cid=6674407}}</ref> PCA begins by computing the covariance matrix of the <math>m \times n</math> matrix <math>\mathbf{X}</math>,
:<math>C = \frac{1}{m}\sum_{i=1}^m{\mathbf{x}_i\mathbf{x}_i^\mathsf{T}}.</math>
It then projects the data onto the first ''k'' eigenvectors of that matrix. By comparison, KPCA begins by computing the covariance matrix of the data after it has been transformed into a higher-dimensional space,
:<math>C = \frac{1}{m}\sum_{i=1}^m{\Phi(\mathbf{x}_i)\Phi(\mathbf{x}_i)^\mathsf{T}}.</math>
It then projects the transformed data onto the first ''k'' eigenvectors of that matrix, just like PCA. It uses the [[Kernel_method#Mathematics:_the_kernel_trick|kernel trick]] to factor away much of the computation, so that the entire process can be performed without ever computing <math>\Phi(\mathbf{x})</math> explicitly. Of course, <math>\Phi</math> must be chosen such that it has a known corresponding kernel. Unfortunately, it is not trivial to find a good kernel for a given problem, so KPCA with standard kernels does not yield good results on some problems. For example, it is known to perform poorly with these kernels on the [[Swiss roll]] manifold. However, certain other methods that perform well in such settings (e.g., Laplacian Eigenmaps, LLE) can be viewed as special cases of kernel PCA obtained by constructing a data-dependent kernel matrix.<ref>{{cite conference |first1=Jihun |last1=Ham |first2=Daniel D. |last2=Lee |first3=Sebastian |last3=Mika |first4=Bernhard |last4=Schölkopf |title=A kernel view of the dimensionality reduction of manifolds |book-title=Proceedings of the 21st International Conference on Machine Learning, Banff, Canada, 2004 |doi=10.1145/1015330.1015417}}</ref>

KPCA has an internal model, so it can also map points that were not available at training time onto its embedding.
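The following is a minimal illustrative sketch of this procedure on training data, written in Python with NumPy. It assumes a Gaussian (RBF) kernel; the function name, kernel choice, and parameter values are assumptions for the example rather than part of the cited method. The sketch builds the kernel matrix, centers it in feature space, and projects the training points onto the leading eigenvectors.

<syntaxhighlight lang="python">
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Illustrative kernel PCA sketch with an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    m = X.shape[0]

    # Kernel matrix K_ij = k(x_i, x_j) from pairwise squared Euclidean distances.
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * X @ X.T
    K = np.exp(-gamma * sq_dists)

    # Center the kernel matrix, which corresponds to subtracting the mean of Phi(x) in feature space.
    one_m = np.full((m, m), 1.0 / m)
    K_centered = K - one_m @ K - K @ one_m + one_m @ K @ one_m

    # Eigendecomposition of the centered kernel matrix (eigh returns eigenvalues in ascending order).
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    idx = np.argsort(eigvals)[::-1][:n_components]

    # Projections of the training points onto the first k principal components in feature space.
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Example: embed a noisy circle into two components.
theta = np.random.uniform(0, 2 * np.pi, 200)
X = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * np.random.randn(200, 2)
embedding = kernel_pca(X, n_components=2, gamma=15.0)
</syntaxhighlight>

Embedding a new, unseen point would instead use the (centered) kernel values between that point and the training set, which is what allows KPCA to map out-of-sample points as noted above.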