===Advantages===
Amongst other data mining methods, decision trees have various advantages:

* '''Simple to understand and interpret.''' People are able to understand decision tree models after a brief explanation. Trees can also be displayed graphically in a way that is easy for non-experts to interpret (see the first sketch after this list).<ref name=":0">{{Cite book|title=An Introduction to Statistical Learning|url=https://archive.org/details/introductiontost00jame|url-access=limited|last1=Gareth|first1=James|last2=Witten|first2=Daniela|last3=Hastie|first3=Trevor|last4=Tibshirani|first4=Robert|publisher=Springer|year=2015|isbn=978-1-4614-7137-0|location=New York|pages=[https://archive.org/details/introductiontost00jame/page/n323 315]}}</ref>
* '''Able to handle both numerical and [[Categorical variable|categorical]] data.'''<ref name=":0" /> Other techniques are usually specialized in analyzing datasets that have only one type of variable. (For example, relation rules can be used only with nominal variables, while neural networks can be used only with numerical variables or categoricals converted to 0-1 values.) Early decision trees were only capable of handling categorical variables, but more recent versions, such as C4.5, do not have this limitation.<ref name="tdidt" />
* '''Requires little data preparation.''' Other techniques often require data normalization. Since trees can handle qualitative predictors, there is no need to create [[dummy variable (statistics)|dummy variables]].<ref name=":0" />
* '''Uses a [[white box (software engineering)|white box]] or open-box<ref name="tdidt" /> model.''' If a given situation is observable in a model, the condition can be explained by [[Boolean logic]]. By contrast, in a [[black box]] model such as an [[artificial neural network]], the explanation for the results is typically difficult to understand.
* '''Possible to validate a model using statistical tests.''' That makes it possible to account for the reliability of the model.
* '''Non-parametric.''' The approach makes no assumptions about the training data or prediction residuals; e.g., no distributional, independence, or constant-variance assumptions are required.
* '''Performs well with large datasets.''' Large amounts of data can be analyzed using standard computing resources in reasonable time.
* '''Accuracy with flexible modeling.''' These methods may be applied to healthcare research with increased accuracy.<ref>{{Cite journal |last1=Hu |first1=Liangyuan |last2=Li |first2=Lihua |date=2022-12-01 |title=Using Tree-Based Machine Learning for Health Studies: Literature Review and Case Series |journal=International Journal of Environmental Research and Public Health |language=en |volume=19 |issue=23 |pages=16080 |doi=10.3390/ijerph192316080 |issn=1660-4601 |pmc=9736500 |pmid=36498153 |doi-access=free}}</ref>
* '''Mirrors human decision making more closely than other approaches.'''<ref name=":0" /> This could be useful when modeling human decisions/behavior.
* '''Robust against collinearity''', particularly when boosting is used.
* '''Built-in [[feature selection]].''' Irrelevant features are used less, so they can be removed on subsequent runs. The hierarchy of attributes in a decision tree reflects the importance of the attributes:<ref>{{Cite book|last1=Provost|first1=Foster|last2=Fawcett|first2=Tom|title=Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking|date=2013|publisher=O'Reilly|isbn=978-1-4493-6132-7|edition=1st|location=Sebastopol, Calif.|oclc=844460899}}</ref> the features at the top of the tree are the most informative.<ref>{{Cite journal|last1=Piryonesi|first1=S. Madeh|last2=El-Diraby|first2=Tamer E.|date=2020-06-01|title=Role of Data Analytics in Infrastructure Asset Management: Overcoming Data Size and Quality Problems|journal=Journal of Transportation Engineering, Part B: Pavements|volume=146|issue=2|pages=04020022|doi=10.1061/JPEODX.0000175|s2cid=216485629}}</ref>
* '''Decision trees can approximate any [[Boolean function]]''', e.g. [[Exclusive or|XOR]] (see the second sketch after this list).<ref>{{cite journal |first1=Dinesh |last1=Mehta |first2=Vijay |last2=Raghavan |title=Decision tree approximations of Boolean functions |journal=Theoretical Computer Science |volume=270 |issue=1–2 |year=2002 |pages=609–623 |doi=10.1016/S0304-3975(01)00011-1 |doi-access=free}}</ref>
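The interpretability and attribute-hierarchy points above can be made concrete in code. The following is a minimal sketch, assuming the [[scikit-learn]] library and its bundled iris dataset (neither is mentioned in the sources cited here): it prints a fitted tree as human-readable rules and lists impurity-based feature importances, where larger values correspond to more informative features tested nearer the root.

<syntaxhighlight lang="python">
# Minimal sketch: assumes scikit-learn and its bundled iris dataset.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

# The fitted tree printed as nested if/else rules: the "white box" view
# that a non-expert can follow by Boolean logic.
print(export_text(clf, feature_names=list(iris.feature_names)))

# Impurity-based importances: larger values mark more informative
# features, which tend to appear nearer the root of the tree.
for name, importance in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")
</syntaxhighlight>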
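Similarly, the claim that decision trees can represent Boolean functions such as XOR can be checked directly. In the following sketch, again assuming scikit-learn, a tree fitted to the four-row XOR truth table reproduces it exactly, since two levels of splits, one per input bit, separate the classes.

<syntaxhighlight lang="python">
# Minimal sketch (assumes scikit-learn): a decision tree represents
# the Boolean XOR function exactly.
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]  # all input combinations
y = [0, 1, 1, 0]                      # XOR truth table

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

print(clf.predict(X))   # [0 1 1 0]: the truth table is reproduced
print(clf.get_depth())  # 2: one split per input bit suffices
</syntaxhighlight>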