
=== Kernel trick ===
{{Main|Kernel method}}
[[Image:Kernel trick idea.svg|thumbnail|right|A training example of SVM with kernel induced by the feature map φ((''a'', ''b'')) = (''a'', ''b'', ''a''<sup>2</sup> + ''b''<sup>2</sup>)]]

Suppose now that we would like to learn a nonlinear classification rule which corresponds to a linear classification rule for the transformed data points <math> \varphi(\mathbf{x}_i).</math> Moreover, we are given a kernel function <math> k</math> which satisfies <math> k(\mathbf{x}_i, \mathbf{x}_j) = \varphi(\mathbf{x}_i) \cdot \varphi(\mathbf{x}_j)</math>.
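The identity between the kernel and the dot product in the transformed space can be checked numerically. The following is a minimal sketch, assuming the feature map from the figure caption, φ((''a'', ''b'')) = (''a'', ''b'', ''a''<sup>2</sup> + ''b''<sup>2</sup>), for which the matching kernel is <math>k(\mathbf{x}, \mathbf{y}) = \mathbf{x} \cdot \mathbf{y} + \|\mathbf{x}\|^2 \|\mathbf{y}\|^2</math>. The names <code>dot</code>, <code>phi</code> and <code>kernel</code> and the sample points are illustrative choices, not part of any library.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def phi(x):
    """Feature map from the figure caption: (a, b) -> (a, b, a^2 + b^2)."""
    a, b = x
    return (a, b, a * a + b * b)

def kernel(x, y):
    """The same quantity computed in the input space, without forming phi."""
    return dot(x, y) + dot(x, x) * dot(y, y)

# Illustrative sample points (an assumption, not taken from the article).
x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))  # dot product in the transformed space
implicit = kernel(x, y)         # kernel evaluation in the input space
print(explicit, implicit)       # prints "51.0 51.0"
```

Both routes give the same value, but the kernel evaluation never constructs the three-dimensional vectors, which is the point of the kernel trick when the transformed space is high-dimensional.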

We know the classification vector <math>\mathbf{w}</math> in the transformed space satisfies

<math display="block">  \mathbf{w} = \sum_{i=1}^n c_iy_i\varphi(\mathbf{x}_i),</math>

where the <math>c_i</math> are obtained by solving the optimization problem

<math display="block"> \begin{align}
\text{maximize}\,\, f(c_1, \ldots, c_n) &=  \sum_{i=1}^n c_i - \frac 1 2 \sum_{i=1}^n\sum_{j=1}^n y_ic_i(\varphi(\mathbf{x}_i) \cdot \varphi(\mathbf{x}_j))y_jc_j \\
                                      &=  \sum_{i=1}^n c_i - \frac 1 2 \sum_{i=1}^n\sum_{j=1}^n y_ic_ik(\mathbf{x}_i, \mathbf{x}_j)y_jc_j \\
\text{subject to } \sum_{i=1}^n c_i y_i &= 0,\,\text{and } 0 \leq c_i \leq \frac{1}{2n\lambda}\;\text{for all }i.
\end{align}
</math>
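The dual objective and its constraints depend on the data only through the kernel values <math>k(\mathbf{x}_i, \mathbf{x}_j)</math>. As a rough sketch, the snippet below evaluates the objective and checks feasibility for a few candidate coefficient vectors. It is not a quadratic programming solver. The two-point data set, the value of <math>\lambda</math> and the function names <code>dual_objective</code> and <code>is_feasible</code> are assumptions made for illustration.

```python
def dual_objective(c, y, K):
    """f(c) = sum_i c_i - 1/2 * sum_ij y_i c_i K[i][j] y_j c_j."""
    n = len(c)
    quadratic = sum(y[i] * c[i] * K[i][j] * y[j] * c[j]
                    for i in range(n) for j in range(n))
    return sum(c) - 0.5 * quadratic

def is_feasible(c, y, lam, tol=1e-12):
    """Check sum_i c_i y_i = 0 and 0 <= c_i <= 1 / (2 n lam)."""
    upper = 1.0 / (2 * len(c) * lam)
    balanced = abs(sum(ci * yi for ci, yi in zip(c, y))) <= tol
    boxed = all(-tol <= ci <= upper + tol for ci in c)
    return balanced and boxed

# Hypothetical toy problem: x_1 = (0, 0) with y_1 = -1 and x_2 = (1, 0) with y_2 = +1.
# K[i][j] = k(x_i, x_j) for the kernel k(x, y) = x.y + |x|^2 |y|^2.
y = [-1, 1]
K = [[0.0, 0.0], [0.0, 2.0]]
lam = 0.1  # assumed regularization parameter, so the upper bound is about 2.5

for c in ([0.5, 0.5], [1.0, 1.0], [1.5, 1.5]):
    print(c, is_feasible(c, y, lam), dual_objective(c, y, K))
```

On this toy problem the equality constraint forces <math>c_1 = c_2 = c</math>, and the objective reduces to <math>2c - c^2</math>, so among the three candidates the value is largest at <math>c = 1</math>.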

The coefficients <math> c_i</math> can be solved for using quadratic programming, as before. Again, we can find some index <math> i</math> such that <math> 0 < c_i <(2n\lambda)^{-1}</math>, so that <math> \varphi(\mathbf{x}_i)</math> lies on the boundary of the margin in the transformed space, and then solve

<math display="block"> \begin{align}
b = \mathbf{w}^\mathsf{T} \varphi(\mathbf{x}_i) - y_i &= \left[\sum_{j=1}^n c_jy_j\varphi(\mathbf{x}_j) \cdot \varphi(\mathbf{x}_i)\right] - y_i \\
  &= \left[\sum_{j=1}^n c_jy_jk(\mathbf{x}_j, \mathbf{x}_i)\right] - y_i.
\end{align}</math>
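The offset can likewise be computed from kernel evaluations alone. The sketch below assumes a two-point toy data set and the kernel <math>k(\mathbf{x}, \mathbf{z}) = \mathbf{x} \cdot \mathbf{z} + \|\mathbf{x}\|^2 \|\mathbf{z}\|^2</math> induced by the feature map in the figure caption. The coefficients <math>c = (1, 1)</math> are assumed to be the dual solution, which holds for this data when the upper bound <math>(2n\lambda)^{-1}</math> exceeds 1. The function name <code>offset</code> is illustrative.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def kernel(x, z):
    """Kernel induced by phi((a, b)) = (a, b, a^2 + b^2)."""
    return dot(x, z) + dot(x, x) * dot(z, z)

def offset(c, y, X, i, k=kernel):
    """b = sum_j c_j y_j k(x_j, x_i) - y_i for a margin index i."""
    return sum(cj * yj * k(xj, X[i]) for cj, yj, xj in zip(c, y, X)) - y[i]

# Hypothetical toy data; c is an assumed dual solution (see the lead-in).
X = [(0.0, 0.0), (1.0, 0.0)]
y = [-1, 1]
c = [1.0, 1.0]

b_from_first = offset(c, y, X, 0)
b_from_second = offset(c, y, X, 1)
print(b_from_first, b_from_second)  # prints "1.0 1.0"
```

Both points lie on the margin boundary here, so either index yields the same offset. In practice, averaging over all such indices is a common way to reduce numerical error.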

Finally, new points can be classified by computing

<math display="block"> \mathbf{z} \mapsto \sgn(\mathbf{w}^\mathsf{T} \varphi(\mathbf{z}) - b) = \sgn \left(\left[\sum_{i=1}^n c_iy_ik(\mathbf{x}_i, \mathbf{z})\right] - b\right).</math>
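A minimal sketch of the resulting classifier follows, again assuming the kernel induced by the feature map in the figure caption. The training points, coefficients <math>c = (1, 1)</math> and offset <math>b = 1</math> are hypothetical values for a two-point toy problem, and the function name <code>classify</code> is illustrative. Mapping a score of exactly zero to +1 is a convention chosen for this example.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def kernel(x, z):
    """Kernel induced by phi((a, b)) = (a, b, a^2 + b^2)."""
    return dot(x, z) + dot(x, x) * dot(z, z)

def classify(z, c, y, X, b, k=kernel):
    """Return the sign of sum_i c_i y_i k(x_i, z) - b (a zero score maps to +1)."""
    score = sum(ci * yi * k(xi, z) for ci, yi, xi in zip(c, y, X)) - b
    return 1 if score >= 0 else -1

# Hypothetical toy model: training points, labels, dual coefficients and offset.
X = [(0.0, 0.0), (1.0, 0.0)]
y = [-1, 1]
c = [1.0, 1.0]
b = 1.0

for z in [(2.0, 0.0), (-0.5, 0.0), (0.0, 2.0)]:
    print(z, classify(z, c, y, X, b))
```

For this toy model the decision boundary in the input space is the circle <math>a + a^2 + b^2 = 1</math>, a nonlinear rule, even though the classifier is linear in the transformed space.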