Inductive bias
{{Short description|Assumptions for inference in machine learning}}
The '''inductive bias''' (also known as '''learning bias''') of a [[Machine learning|learning algorithm]] is the set of assumptions that the learner uses to predict outputs for inputs that it has not encountered.<ref name=Mitchell1980>{{Citation | last = Mitchell | first = T. M. | title = The need for biases in learning generalizations | place = New Brunswick, New Jersey, USA | publisher = Rutgers University | series = CBM-TR 5-110 | year = 1980 | citeseerx = 10.1.1.19.5466 }}</ref> Inductive bias is anything that makes the algorithm learn one pattern rather than another (e.g., step functions in [[decision tree]]s instead of continuous functions in [[linear regression]] models).

Learning involves searching a space of solutions for one that explains the data well. However, in many cases, several solutions may be equally appropriate.<ref>{{Cite book |last=Goodman |first=Nelson |title=Fact, Fiction, and Forecast |publisher=Harvard University Press |year=1955 |isbn=978-0-674-29071-6 |pages=59–83 |chapter=The new riddle of induction}}</ref> An inductive bias allows a learning algorithm to prioritize one solution (or interpretation) over another, independently of the observed data.<ref>{{Cite journal |last=Mitchell |first=Tom M |date=1980 |title=The need for biases in learning generalizations |url=https://axon.cs.byu.edu/~martinez/classes/678/Papers/Mitchell_IB.pdf |journal=Rutgers University Technical Report CBM-TR-117 |pages=184–191}}</ref>

In [[machine learning]], the aim is to construct algorithms that can learn to predict a certain target output. To achieve this, the learning algorithm is presented with training examples that demonstrate the intended relation between input and output values. The learner is then supposed to approximate the correct output, even for examples that were not shown during training.
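The contrast between step functions and continuous functions mentioned above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the helper names <code>fit_line</code> and <code>fit_stump</code> and the four training points are hypothetical, and the stump is a deliberately minimal stand-in for a decision tree. Both learners see the same data, yet they disagree on an input far outside the training range, because each can only express the kind of function its bias allows.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line: biased toward functions of the form y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b


def fit_stump(xs, ys):
    """One-split regression stump: biased toward step functions."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        ml, mr = mean(left), mean(right)
        # Squared error of predicting each side's mean.
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or err < best[0]:
            best = (err, threshold, ml, mr)
    _, threshold, ml, mr = best
    return lambda x: ml if x < threshold else mr


# Hypothetical training data lying exactly on y = x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 3.0]

line = fit_line(xs, ys)
stump = fit_stump(xs, ys)

# Same data, different predictions for an unseen input.
line_pred = line(10.0)    # the line extrapolates the trend
stump_pred = stump(10.0)  # the stump repeats the mean of its right-hand region
print(line_pred, stump_pred)
```

Neither prediction is implied by the training data alone; the difference comes entirely from the assumptions built into each learner.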
Without any additional assumptions, this problem cannot be solved, since unseen situations might have an arbitrary output value. The necessary assumptions about the nature of the target function are subsumed under the term ''inductive bias''.<ref name=Mitchell1980 /><ref name=DesJardinsandGordon1995>{{Citation | last1 = DesJardins | first1 = M. | last2 = Gordon | first2 = D. F. | author2-link = | title = Evaluation and selection of biases in machine learning | journal = Machine Learning | volume = 20 | year = 1995 | issue = 1–2 | pages = 5–22 | doi = 10.1007/BF00993472 | url = https://link.springer.com/article/10.1007/BF00993472 }}</ref>

A classical example of an inductive bias is [[Occam's razor]], which assumes that the simplest consistent hypothesis about the target function is the best. Here, ''consistent'' means that the learner's hypothesis yields correct outputs for all of the examples that have been given to the algorithm.

Approaches to a more formal definition of inductive bias are based on [[mathematical logic]]. Here, the inductive bias is a logical formula that, together with the training data, logically entails the hypothesis generated by the learner. However, this strict formalism fails in many practical cases in which the inductive bias can only be given as a rough description (e.g., in the case of [[artificial neural networks]]), or not at all.

==Types==
The following is a list of common inductive biases in machine learning algorithms.
* '''Maximum [[conditional independence]]''': if the hypothesis can be cast in a [[Bayesian inference|Bayesian]] framework, try to maximize conditional independence. This is the bias used in the [[Naive Bayes classifier]].
* '''Minimum [[Cross-validation (statistics)|cross-validation]] error''': when trying to choose among hypotheses, select the hypothesis with the lowest cross-validation error. Although cross-validation may seem to be free of bias, the [[No free lunch theorem|"no free lunch"]] theorems show that cross-validation must be biased, for example by assuming that no information is encoded in the ordering of the data.
* '''Maximum margin''': when drawing a boundary between two classes, attempt to maximize the width of the boundary. This is the bias used in [[support vector machines]]. The assumption is that distinct classes tend to be separated by wide boundaries.
* '''[[Minimum description length]]''': when forming a hypothesis, attempt to minimize the length of the description of the hypothesis.
* '''Minimum features''': unless there is good evidence that a [[feature space|feature]] is useful, it should be deleted. This is the assumption behind [[feature selection]] algorithms.
* '''Nearest neighbors''': assume that most of the cases in a small neighborhood in [[feature space]] belong to the same class. Given a case for which the class is unknown, guess that it belongs to the same class as the majority in its immediate neighborhood. This is the bias used in the [[k-nearest neighbors algorithm]]. The assumption is that cases that are near each other tend to belong to the same class.

==Shift of bias==
Although most learning algorithms have a static bias, some algorithms are designed to shift their bias as they acquire more data.<ref name=Utgoff1984>{{Citation | last = Utgoff | first = P. E. | title = Shift of bias for inductive concept learning | place = New Brunswick, New Jersey, USA | publisher = Doctoral dissertation, Department of Computer Science, Rutgers University | year = 1984 | url = https://books.google.com/books?id=f9RylgKpHZsC&dq=%22Shift+of+bias+for+inductive+concept+learning%22&pg=PA107 | isbn = 9780934613002 }}</ref> This does not avoid bias, since the bias-shifting process itself must have a bias.
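
The idea of shifting bias can be illustrated with a toy concept learner. This is a minimal sketch under stated assumptions, not a reconstruction of the method in the cited dissertation: the two hypothesis spaces, the function names, and the example data are all hypothetical. The learner first searches a restricted space of threshold concepts and moves to a richer space of interval concepts only when no threshold is consistent with the examples.

```python
def threshold_hypotheses(xs):
    """Restricted space: concepts of the form 'x >= t'."""
    for t in sorted(set(xs)):
        yield f"x >= {t}", (lambda x, t=t: x >= t)


def interval_hypotheses(xs):
    """Richer space: concepts of the form 'lo <= x <= hi'."""
    values = sorted(set(xs))
    for i, lo in enumerate(values):
        for hi in values[i:]:
            yield f"{lo} <= x <= {hi}", (lambda x, lo=lo, hi=hi: lo <= x <= hi)


# The fixed order in which spaces are tried is itself a bias:
# simpler threshold concepts are always preferred when they fit.
SPACES = [("threshold", threshold_hypotheses), ("interval", interval_hypotheses)]


def learn(examples):
    """Return (space name, description) of the first consistent hypothesis."""
    xs = [x for x, _ in examples]
    for space_name, generate in SPACES:
        for description, h in generate(xs):
            if all(h(x) == label for x, label in examples):
                return space_name, description
    return None, None  # no consistent hypothesis in any space


few = [(1, False), (4, True), (6, True)]
more = few + [(9, False)]  # a new negative example beyond the positives

result_few = learn(few)    # a threshold concept is consistent
result_more = learn(more)  # no threshold fits, so the learner shifts to intervals
print(result_few, result_more)
```

The learner changes its hypothesis space as data arrive, but the order in which the spaces are tried is fixed in advance, which is one way to see why the shifting process carries a bias of its own.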
==See also==
* [[Algorithmic bias]]
* [[Cognitive bias]]
* [[No free lunch theorem]]
* [[No free lunch in search and optimization]]

==References==
{{reflist}}

{{Biases}}
{{differentiable computing}}

[[Category:Bias]]
[[Category:Machine learning]]