=== Hard-margin ===
If the training data is [[linearly separable]], we can select two parallel hyperplanes that separate the two classes of data, so that the distance between them is as large as possible. The region bounded by these two hyperplanes is called the "margin", and the maximum-margin hyperplane is the hyperplane that lies halfway between them. With a normalized or standardized dataset, these hyperplanes can be described by the equations

: <math>\mathbf{w}^\mathsf{T} \mathbf{x} - b = 1</math> (anything on or above this boundary is of one class, with label 1)

and

: <math>\mathbf{w}^\mathsf{T} \mathbf{x} - b = -1</math> (anything on or below this boundary is of the other class, with label &minus;1).

Geometrically, the distance between these two hyperplanes is <math>\tfrac{2}{\|\mathbf{w}\|}</math>,<ref>{{cite web |url=https://math.stackexchange.com/q/1305925/168764 |title=Why is the SVM margin equal to <math>\frac{2}{\|\mathbf{w}\|}</math>? |date=30 May 2015 |website=Mathematics Stack Exchange}}</ref> so to maximize the distance between the planes we want to minimize <math>\|\mathbf{w}\|</math>. The distance is computed using the [[distance from a point to a plane]] equation. To prevent data points from falling within the margin, we add the following constraint: for each <math>i</math>, either
<math display="block">\mathbf{w}^\mathsf{T} \mathbf{x}_i - b \ge 1 \, , \text{ if } y_i = 1,</math>
or
<math display="block">\mathbf{w}^\mathsf{T} \mathbf{x}_i - b \le -1 \, , \text{ if } y_i = -1.</math>
These constraints state that each data point must lie on the correct side of the margin. This can be rewritten as
{{NumBlk||<math display="block">y_i(\mathbf{w}^\mathsf{T} \mathbf{x}_i - b) \ge 1, \quad \text{ for all } 1 \le i \le n.</math>|{{EquationRef|1}}}}

Putting this together, we obtain the optimization problem:
<math display="block">\begin{align} &\underset{\mathbf{w},\;b}{\operatorname{minimize}} && \frac{1}{2}\|\mathbf{w}\|^2 \\ &\text{subject to} && y_i(\mathbf{w}^\mathsf{T} \mathbf{x}_i - b) \geq 1 \quad \forall i \in \{1,\dots,n\}. \end{align}</math>

The <math>\mathbf{w}</math> and <math>b</math> that solve this problem determine the final classifier, <math>\mathbf{x} \mapsto \sgn(\mathbf{w}^\mathsf{T} \mathbf{x} - b)</math>, where <math>\sgn(\cdot)</math> is the [[sign function]].

An important consequence of this geometric description is that the max-margin hyperplane is completely determined by those <math>\mathbf{x}_i</math> that lie nearest to it (explained below). These <math>\mathbf{x}_i</math> are called ''support vectors''.{{anchor|Support vectors}}
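The problem above is a convex [[quadratic programming]] problem, so it can be handed to any off-the-shelf QP solver. The sketch below (illustrative only, not drawn from the article's sources) uses the Python library cvxpy; the synthetic two-blob dataset, the variable names <code>X</code>, <code>y</code>, <code>w</code>, <code>b</code>, and the numerical tolerance are assumptions chosen to mirror the notation above. Note that the hard-margin formulation is only feasible when the data are in fact linearly separable.

<syntaxhighlight lang="python">
import numpy as np
import cvxpy as cp

# Toy data: two well-separated Gaussian blobs (assumed linearly separable;
# the hard-margin QP is infeasible otherwise).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-3.0, size=(20, 2)),
               rng.normal(loc=+3.0, size=(20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])

n, d = X.shape
w = cp.Variable(d)
b = cp.Variable()

# Hard-margin SVM: minimize (1/2)||w||^2 subject to y_i (w^T x_i - b) >= 1.
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(y, X @ w - b) >= 1]
cp.Problem(objective, constraints).solve()

# Final classifier: x -> sgn(w^T x - b).
predict = lambda X_new: np.sign(X_new @ w.value - b.value)

# Support vectors are the points whose constraint is active,
# i.e. y_i (w^T x_i - b) == 1 (up to solver tolerance).
margins = y * (X @ w.value - b.value)
support_vectors = X[np.isclose(margins, 1.0, atol=1e-4)]
</syntaxhighlight>

Only <code>support_vectors</code> enter the solution: deleting any other training point and re-solving leaves <code>w</code> and <code>b</code> unchanged, which is the property the next paragraph refers to.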