==Multiclass LDA==
[[File:4class3ddiscriminant.png|thumb|Visualisation for one-versus-all LDA axes for 4 classes in 3d]]
[[File:3dProjections.png|thumb|Projections along linear discriminant axes for 4 classes]]
In the case where there are more than two classes, the analysis used in the derivation of the Fisher discriminant can be extended to find a [[Linear subspace|subspace]] which appears to contain all of the class variability.<ref name="garson">Garson, G. D. (2008). Discriminant function analysis. {{cite web |url=http://www2.chass.ncsu.edu/garson/pa765/discrim.htm |title=PA 765: Discriminant Function Analysis |access-date=2008-03-04 |url-status=dead |archive-url=https://web.archive.org/web/20080312065328/http://www2.chass.ncsu.edu/garson/pA765/discrim.htm |archive-date=2008-03-12 }}</ref> This generalization is due to [[C. R. Rao]].<ref name="Rao:1948">{{cite journal |last=Rao |first=R. C. |author-link=Calyampudi Radhakrishna Rao |title=The utilization of multiple measurements in problems of biological classification |journal=Journal of the Royal Statistical Society, Series B |volume=10 |issue=2 |pages=159–203 |year=1948 |doi=10.1111/j.2517-6161.1948.tb00008.x |jstor=2983775}}</ref> Suppose that each of ''C'' classes has a mean <math> \mu_i </math> and the same covariance <math> \Sigma </math>. Then the between-class scatter may be defined by the sample covariance of the class means

:<math> \Sigma_b = \frac{1}{C} \sum_{i=1}^C (\mu_i-\mu) (\mu_i-\mu)^\mathrm{T} </math>

where <math> \mu </math> is the mean of the class means. The class separation in a direction <math> \vec w </math> is then given by

:<math> S = \frac{{\vec w}^\mathrm{T} \Sigma_b \vec w}{{\vec w}^\mathrm{T} \Sigma \vec w}. </math>

This means that when <math> \vec w </math> is an [[eigenvector]] of <math> \Sigma^{-1} \Sigma_b </math>, the separation equals the corresponding [[eigenvalue]].

If <math> \Sigma^{-1} \Sigma_b </math> is diagonalizable, the variability between features will be contained in the subspace spanned by the eigenvectors corresponding to the ''C'' − 1 largest eigenvalues (since <math> \Sigma_b </math> is of rank ''C'' − 1 at most). These eigenvectors are primarily used in feature reduction, as in PCA. The eigenvectors corresponding to the smaller eigenvalues tend to be very sensitive to the exact choice of training data, and it is often necessary to use regularisation as described in the next section.

If classification is required, instead of [[dimension reduction]], there are a number of alternative techniques available. For instance, the classes may be partitioned, and a standard Fisher discriminant or LDA used to classify each partition. A common example of this is "one against the rest", where the points from one class are put in one group and everything else in the other, and then LDA is applied. This results in ''C'' classifiers, whose results are combined. Another common method is pairwise classification, where a new classifier is created for each pair of classes (giving ''C''(''C'' − 1)/2 classifiers in total), with the individual classifiers combined to produce a final classification.
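As a concrete illustration of the eigenvector construction above, the following is a minimal sketch (not drawn from the cited sources) of computing the discriminant axes from the eigenvectors of <math> \Sigma^{-1} \Sigma_b </math> using NumPy. The function name <code>lda_axes</code> and the synthetic four-class data are illustrative assumptions, and the sketch assumes the pooled within-class covariance is invertible; in practice the regularisation mentioned above may be needed.

<syntaxhighlight lang="python">
import numpy as np

def lda_axes(X, y):
    """Sketch: discriminant axes as eigenvectors of inv(Sigma) @ Sigma_b.

    Assumes equal class covariances and an invertible pooled covariance.
    """
    classes = np.unique(y)
    C = len(classes)
    class_means = np.array([X[y == c].mean(axis=0) for c in classes])
    mu = class_means.mean(axis=0)            # mean of the class means

    # Between-class scatter: sample covariance of the class means
    diffs = class_means - mu
    Sigma_b = diffs.T @ diffs / C

    # Pooled within-class covariance (shared Sigma assumption)
    Sigma = sum(np.cov(X[y == c], rowvar=False) * (np.sum(y == c) - 1)
                for c in classes) / (len(X) - C)

    # Eigenvectors of inv(Sigma) @ Sigma_b; keep the C - 1 largest eigenvalues
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sigma) @ Sigma_b)
    order = np.argsort(eigvals.real)[::-1][:C - 1]
    return eigvecs.real[:, order]             # columns are discriminant axes

# Example: 4 classes in 3 dimensions, projected onto at most C - 1 = 3 axes
rng = np.random.default_rng(0)
centres = rng.normal(size=(4, 3)) * 5
X = np.vstack([rng.normal(m, 1.0, size=(50, 3)) for m in centres])
y = np.repeat(np.arange(4), 50)
W = lda_axes(X, y)
X_proj = X @ W                                # reduced representation
</syntaxhighlight>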