===Discriminative training===
Discriminative training of linear classifiers usually proceeds in a [[supervised learning|supervised]] way, by means of an [[optimization algorithm]] that is given a training set with desired outputs and a [[loss function]] that measures the discrepancy between the classifier's outputs and the desired outputs. Thus, the learning algorithm solves an optimization problem of the form<ref name="ieee">{{cite journal |author1=Guo-Xun Yuan |author2=Chia-Hua Ho |author3=Chih-Jen Lin |title=Recent Advances of Large-Scale Linear Classification |journal=Proc. IEEE |volume=100 |issue=9 |year=2012 |url=http://dmkd.cs.vt.edu/TUTORIAL/Bigdata/Papers/IEEE12.pdf |archive-url=https://web.archive.org/web/20170610105707/http://dmkd.cs.vt.edu/TUTORIAL/Bigdata/Papers/IEEE12.pdf |archive-date=2017-06-10 |url-status=live}}</ref>

:<math>\underset{\mathbf{w}}{\arg\min} \;R(\mathbf{w}) + C \sum_{i=1}^N L(y_i, \mathbf{w}^\mathsf{T} \mathbf{x}_i)</math>

where
* {{math|'''w'''}} is a vector of classifier parameters,
* {{math|''L''(''y<sub>i</sub>'', '''w'''<sup>T</sup>'''x'''<sub>''i''</sub>)}} is a loss function that measures the discrepancy between the classifier's prediction and the true output {{mvar|y<sub>i</sub>}} for the {{mvar|i}}'th training example,
* {{math|''R''('''w''')}} is a [[Regularization (mathematics)|regularization]] function that prevents the parameters from getting too large (causing [[overfitting]]), and
* {{mvar|C}} is a scalar constant (set by the user of the learning algorithm) that controls the balance between the regularization and the loss function.

Popular loss functions include the [[hinge loss]] (for linear SVMs) and the [[log loss]] (for linear logistic regression). If the regularization function {{mvar|R}} is [[convex function|convex]], then the above is a [[convex optimization|convex problem]].{{r|ieee}} Many algorithms exist for solving such problems; popular ones for linear classification include ([[Stochastic gradient descent|stochastic]]) [[gradient descent]], [[L-BFGS]], [[coordinate descent]] and [[Newton method]]s.
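The sketch below is a minimal illustration of this formulation, not taken from the cited reference: it minimizes the objective above with the choice {{math|''R''('''w''') {{=}} ½‖'''w'''‖<sup>2</sup>}} and the hinge loss (i.e., a linear SVM) by full-batch subgradient descent. The function name, learning rate, epoch count, and toy data are hypothetical choices made for the example.

<syntaxhighlight lang="python">
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=100):
    """Minimize R(w) + C * sum_i L(y_i, w.x_i) with R(w) = 0.5*||w||^2
    and L the hinge loss, by full-batch subgradient descent.
    X: (N, d) array of examples; y: (N,) array of labels in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        margins = y * (X @ w)      # y_i * w^T x_i for every example
        violated = margins < 1     # examples inside or beyond the margin
        # Subgradient of the objective: w from the regularizer, plus
        # -C * y_i * x_i for each margin-violating example (hinge loss).
        grad = w - C * (y[violated] @ X[violated])
        w -= lr * grad
    return w

# Hypothetical toy data: two Gaussian blobs with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)
w = train_linear_svm(X, y)
print("training accuracy:", np.mean(np.sign(X @ w) == y))
</syntaxhighlight>

Since the hinge loss and the squared-norm regularizer are both convex, this objective is convex and subgradient descent converges to a global minimum for a suitable step size; swapping in the log loss would yield regularized logistic regression under the same scheme.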