===Generalization and statistics===
{{No footnotes|date=August 2019|section}}
Applications whose goal is to create a system that generalizes well to unseen examples face the possibility of [[Overfitting|over-training]]. This arises in over-complex or over-specified systems when the network's capacity significantly exceeds the needed free parameters. Two approaches address over-training. The first is to use [[cross-validation (statistics)|cross-validation]] and similar techniques to check for the presence of over-training and to select hyperparameters that minimize the generalization error. The second is to use some form of ''[[regularization (mathematics)|regularization]]''. This concept emerges in a probabilistic (Bayesian) framework, where regularization can be performed by assigning a larger prior probability to simpler models, but also in statistical learning theory, where the goal is to minimize two quantities: the 'empirical risk' and the 'structural risk', which roughly correspond to the error over the training set and the predicted error on unseen data due to overfitting.

[[File:Synapse deployment.jpg|thumb|right|upright=1.15|Confidence analysis of a neural network]]
Supervised neural networks that use a [[mean squared error]] (MSE) cost function can use formal statistical methods to determine the confidence of the trained model. The MSE on a validation set can be used as an estimate of variance. This value can then be used to calculate the [[confidence interval]] of the network's output, assuming a [[normal distribution]]. A confidence analysis made this way is statistically valid as long as the output [[probability distribution]] stays the same and the network is not modified.

By assigning a [[softmax activation function]], a generalization of the [[logistic function]], to the output layer of the neural network (or a softmax component in a component-based network) for categorical target variables, the outputs can be interpreted as posterior probabilities. This is useful in classification, as it gives a certainty measure on classifications.

The softmax activation function is:
:<math>y_i=\frac{e^{x_i}}{\sum_{j=1}^c e^{x_j}}</math>
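For illustration, a minimal sketch of the regularization approach described above, using an L2 (weight-decay) penalty; the function and parameter names (<code>l2_regularized_loss</code>, <code>lam</code>) are illustrative, not from any particular library:

<syntaxhighlight lang="python">
import numpy as np

def l2_regularized_loss(empirical_risk, weights, lam=1e-3):
    # Add an L2 penalty on the weights to the training (empirical) loss.
    # The penalty acts as a complexity term in the spirit of the
    # structural risk, discouraging over-complex models.
    return empirical_risk + lam * np.sum(weights ** 2)

# Example: a training MSE of 0.05 with a small, invented weight matrix.
W = np.array([[0.5, -1.2], [0.3, 0.8]])
print(l2_regularized_loss(0.05, W))
</syntaxhighlight>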
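Similarly, a sketch of the confidence analysis described above, assuming normally distributed errors with constant variance; the validation data and the new prediction <code>y_hat</code> are invented for illustration:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

# Invented validation targets and network predictions.
targets = np.array([1.0, 2.0, 3.0, 4.0])
predictions = np.array([1.1, 1.9, 3.2, 3.8])

# Validation MSE used as the variance estimate.
sigma = np.sqrt(np.mean((predictions - targets) ** 2))

y_hat = 2.5          # a new network output
z = norm.ppf(0.975)  # two-sided 95% confidence level (about 1.96)
print(f"95% CI: [{y_hat - z * sigma:.2f}, {y_hat + z * sigma:.2f}]")
</syntaxhighlight>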
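Finally, the softmax function can be written directly from the formula above; this NumPy version shifts the inputs by their maximum for numerical stability, which leaves the result unchanged:

<syntaxhighlight lang="python">
import numpy as np

def softmax(x):
    # Subtracting max(x) avoids overflow in exp(); softmax is invariant
    # to adding a constant to every input.
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # components sum to 1
</syntaxhighlight>
<section end="theory" />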