=== Creation of Decision Trees ===
The next step of the algorithm generates [[decision tree]]s from the bootstrapped dataset. For each feature (gene), the process counts how many samples with and without the feature have a positive or a negative result. These counts form a [[confusion matrix]], which lists the true positives, false positives, true negatives, and false negatives obtained when the feature alone is used as a classifier. The features are then ranked by [[Decision tree learning|classification metrics]] computed from their confusion matrices. Common metrics include the estimate of positive correctness (true positives minus false positives), the measure of "goodness", and [[Information gain in decision trees|information gain]]. The top-ranked feature is then used to partition the samples into two sets: those that possess the feature and those that do not.
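
The ranking step can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names, the representation of each sample as a dictionary of 0/1 feature values, and the toy data are all assumptions made for the example. The presence of a feature is treated as a positive prediction.

```python
from math import log2

def confusion_matrix(samples, labels, feature):
    """Return (TP, FP, TN, FN), reading 'feature present' as a positive prediction."""
    tp = fp = tn = fn = 0
    for sample, label in zip(samples, labels):
        present = bool(sample[feature])
        if present and label:
            tp += 1
        elif present and not label:
            fp += 1
        elif not present and not label:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def positive_correctness(tp, fp, tn, fn):
    """Estimate of positive correctness: true positives minus false positives."""
    return tp - fp

def entropy(pos, neg):
    """Binary entropy (in bits) of a set with pos positive and neg negative samples."""
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / (pos + neg)
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def information_gain(tp, fp, tn, fn):
    """Entropy reduction obtained by splitting the samples on the feature."""
    total = tp + fp + tn + fn
    parent = entropy(tp + fn, fp + tn)
    with_feature = entropy(tp, fp)
    without_feature = entropy(fn, tn)
    weighted = ((tp + fp) * with_feature + (fn + tn) * without_feature) / total
    return parent - weighted

def rank_features(samples, labels, metric):
    """Sort feature names from best to worst according to the given metric."""
    scores = {f: metric(*confusion_matrix(samples, labels, f)) for f in samples[0]}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: "f1" predicts the label perfectly, "f2" is uninformative.
samples = [{"f1": 1, "f2": 0}, {"f1": 1, "f2": 1},
           {"f1": 0, "f2": 1}, {"f1": 0, "f2": 0}]
labels = [1, 1, 0, 0]

ranking = rank_features(samples, labels, positive_correctness)
```

In this toy dataset both metrics place <code>f1</code> first, so it would be the feature used to partition the samples.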

The diagram below shows a decision tree of depth two being used to classify data. For example, a data point that exhibits Feature 1 but not Feature 2 is classified as "No". Another point that does not exhibit Feature 1 but does exhibit Feature 3 is classified as "Yes".

[[File:Decision_Tree_Depth_2.png|Decision Tree Depth 2]]

This partitioning is repeated recursively on each resulting subset, adding one level to the tree each time, until the desired depth is reached. At the bottom level of the tree, samples that test positive for the final feature are generally classified as positive, while those that lack the feature are classified as negative. The resulting trees are then used as predictors to classify new data.
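
The recursive construction and the use of a finished tree as a predictor can be sketched as follows. This is an illustrative outline under stated assumptions, not a definitive implementation: the names <code>build_tree</code> and <code>classify</code>, the nested-dictionary tree representation, the choice of true positives minus false positives as the ranking metric, and the toy data are all hypothetical. The toy data were chosen so that the learned tree agrees with the two classification examples described above.

```python
def score(samples, labels, feature):
    """Ranking metric used in this sketch: true positives minus false positives."""
    tp = sum(1 for s, y in zip(samples, labels) if s[feature] and y)
    fp = sum(1 for s, y in zip(samples, labels) if s[feature] and not y)
    return tp - fp

def build_tree(samples, labels, features, depth):
    """Split on the top-ranked feature, then recurse on both partitions."""
    best = max(features, key=lambda f: score(samples, labels, f))
    if depth == 1 or len(features) == 1:
        # Bottom level: presence of the final feature means positive.
        return {"feature": best, "present": True, "absent": False}
    remaining = [f for f in features if f != best]
    node = {"feature": best}
    for branch, wanted in (("present", True), ("absent", False)):
        subset = [(s, y) for s, y in zip(samples, labels)
                  if bool(s[best]) == wanted]
        if not subset:
            # Assumption: with no training samples on this side,
            # fall back to the bottom-level rule.
            node[branch] = wanted
        else:
            sub_samples = [s for s, _ in subset]
            sub_labels = [y for _, y in subset]
            node[branch] = build_tree(sub_samples, sub_labels, remaining, depth - 1)
    return node

def classify(tree, sample):
    """Walk from the root to a leaf and return the leaf's True/False label."""
    while isinstance(tree, dict):
        tree = tree["present"] if sample[tree["feature"]] else tree["absent"]
    return tree

# Hypothetical toy data: columns are f1, f2, f3, label.
rows = [(1, 1, 0, 1), (1, 1, 1, 1), (1, 1, 0, 1), (1, 0, 1, 0),
        (0, 0, 1, 1), (0, 1, 0, 0), (0, 0, 0, 0), (0, 1, 0, 0)]
samples = [{"f1": a, "f2": b, "f3": c} for a, b, c, _ in rows]
labels = [y for _, _, _, y in rows]

tree = build_tree(samples, labels, ["f1", "f2", "f3"], depth=2)
```

With this data the root splits on <code>f1</code>, the branch where it is present splits on <code>f2</code>, and the branch where it is absent splits on <code>f3</code>. A new point is then classified with a call such as <code>classify(tree, {"f1": 1, "f2": 0, "f3": 1})</code>.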