===Bagging===
{{main|Bootstrap aggregating}}
[[File:Random Forest Bagging Illustration.png|thumb|Illustration of training a Random Forest model. The training dataset (in this case, of 250 rows and 100 columns) is randomly sampled with replacement ''n'' times. Then, a decision tree is trained on each sample. Finally, for prediction, the results of all ''n'' trees are aggregated to produce a final decision.]]
The training algorithm for random forests applies the general technique of [[bootstrap aggregating]], or bagging, to tree learners. Given a training set {{mvar|X}} = {{mvar|x<sub>1</sub>}}, ..., {{mvar|x<sub>n</sub>}} with responses {{mvar|Y}} = {{mvar|y<sub>1</sub>}}, ..., {{mvar|y<sub>n</sub>}}, bagging repeatedly (''B'' times) selects a [[Sampling (statistics)#Replacement of selected units|random sample with replacement]] of the training set and fits trees to these samples:

{{block indent | em = 1.5 | text = For {{mvar|b}} = 1, ..., {{mvar|B}}:
# Sample, with replacement, {{mvar|n}} training examples from {{mvar|X}}, {{mvar|Y}}; call these {{mvar|X<sub>b</sub>}}, {{mvar|Y<sub>b</sub>}}.
# Train a classification or regression tree {{mvar|f<sub>b</sub>}} on {{mvar|X<sub>b</sub>}}, {{mvar|Y<sub>b</sub>}}. }}

After training, predictions for unseen samples {{mvar|x'}} can be made by averaging the predictions from all the individual regression trees on {{mvar|x'}}:
<math display="block">\hat{f} = \frac{1}{B} \sum_{b=1}^B f_b(x')</math>
or by taking the plurality vote in the case of classification trees.

This bootstrapping procedure leads to better model performance because it decreases the [[Bias–variance dilemma|variance]] of the model without increasing the bias. This means that while the predictions of a single tree are highly sensitive to noise in its training set, the average of many trees is not, as long as the trees are not correlated. Simply training many trees on a single training set would give strongly correlated trees (or even the same tree many times, if the training algorithm is deterministic); bootstrap sampling is a way of de-correlating the trees by showing them different training sets.

Additionally, an estimate of the uncertainty of the prediction can be made as the standard deviation of the predictions from all the individual regression trees on {{mvar|x'}}:
<math display="block">\sigma = \sqrt{\frac{\sum_{b=1}^B (f_b(x') - \hat{f})^2}{B-1}}.</math>

The number {{mvar|B}} of samples (equivalently, of trees) is a free parameter. Typically, a few hundred to several thousand trees are used, depending on the size and nature of the training set. {{mvar|B}} can be optimized using [[Cross-validation (statistics)|cross-validation]], or by observing the ''[[out-of-bag error]]'': the mean prediction error on each training sample {{mvar|x<sub>i</sub>}}, using only the trees that did not have {{mvar|x<sub>i</sub>}} in their bootstrap sample.<ref name="islr">{{cite book |author1=Gareth James |author2=Daniela Witten |author3=Trevor Hastie |author4=Robert Tibshirani |title=An Introduction to Statistical Learning |publisher=Springer |year=2013 |url=http://www-bcf.usc.edu/~gareth/ISL/ |pages=316–321}}</ref> The training and test error tend to level off after some number of trees have been fit.
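The procedure above maps directly to code. The following is a minimal sketch in Python, assuming NumPy arrays as input and scikit-learn's <code>DecisionTreeRegressor</code> as the base tree learner; the helper names <code>bag_trees</code> and <code>bagged_predict</code> are illustrative, not part of any library.

<syntaxhighlight lang="python">
# Minimal sketch of bagging regression trees (illustrative only; assumes
# scikit-learn as the base learner and omits all tree hyperparameters).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bag_trees(X, Y, B, seed=None):
    """Fit B regression trees, each on a bootstrap sample of (X, Y)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    trees = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)              # sample n rows with replacement
        trees.append(DecisionTreeRegressor().fit(X[idx], Y[idx]))
    return trees

def bagged_predict(trees, x_new):
    """Average the per-tree predictions; also return their standard deviation."""
    preds = np.array([t.predict(x_new) for t in trees])   # shape (B, n_queries)
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)  # mean and uncertainty estimate
</syntaxhighlight>

For classification trees, one would substitute <code>DecisionTreeClassifier</code> and take a plurality vote over the per-tree predictions instead of the mean. Note that this sketch covers bagging alone, not the per-split random feature selection that distinguishes random forests from bagged trees.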