===Classification Problems===
There has also been interest in measuring the complexity of classification problems in [[Supervised learning|supervised machine learning]]. Such measures can be useful in [[meta-learning (computer science)|meta-learning]] to determine for which data sets filtering (the removal of suspected noisy instances from the training set) is most beneficial,<ref>{{cite journal|title= Predicting Noise Filtering Efficacy with Data Complexity Measures for Nearest Neighbor Classification|journal= Pattern Recognition|volume= 46|pages= 355–364|doi= 10.1016/j.patcog.2012.07.009|year= 2013|last1= Sáez|first1= José A.|last2= Luengo|first2= Julián|last3= Herrera|first3= Francisco|issue= 1|bibcode= 2013PatRe..46..355S}}</ref> and could be extended to other areas. For [[binary classification]], such measures can consider the overlap in feature values between the classes, the separability of the classes, and the geometry, topology, and density of [[manifold]]s.<ref>Ho, T.K.; Basu, M. (2002). "[https://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=990132&tag=1 Complexity Measures of Supervised Classification Problems]". IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (3), pp. 289–300.</ref> For non-binary classification problems, instance hardness<ref>Smith, M.R.; Martinez, T.; Giraud-Carrier, C. (2014). "[https://link.springer.com/article/10.1007%2Fs10994-013-5422-z An Instance Level Analysis of Data Complexity]". Machine Learning, 95(2): 225–256.</ref> is a bottom-up approach that first seeks to identify the instances that are likely to be misclassified, which are assumed to be the most complex. The characteristics of those instances are then measured using [[supervised learning|supervised]] measures such as the number of disagreeing neighbors or the likelihood of the assigned class label given the input features.
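
As an illustration of the "number of disagreeing neighbors" idea mentioned above, the following is a minimal sketch rather than the cited authors' implementation. It assumes Euclidean distance and a fixed neighborhood size <code>k</code>, and it reports the fraction of an instance's <code>k</code> nearest neighbors whose label differs from its own. The function name <code>k_disagreeing_neighbors</code> and the toy data are hypothetical.

```python
import math


def k_disagreeing_neighbors(points, labels, k=3):
    """For each instance, return the fraction of its k nearest
    neighbors (Euclidean distance, an assumption of this sketch)
    whose class label differs from the instance's own label."""
    scores = []
    for i, p in enumerate(points):
        # Distance from instance i to every other instance.
        others = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        nearest = [j for _, j in others[:k]]
        # Count neighbors that disagree with the instance's label.
        disagree = sum(labels[j] != labels[i] for j in nearest)
        scores.append(disagree / len(nearest))
    return scores


# Hypothetical toy data: two tight clusters, plus one instance
# labeled "b" that sits inside the "a" cluster.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
    (0.05, 0.05),
]
labels = ["a", "a", "a", "b", "b", "b", "b"]

scores = k_disagreeing_neighbors(points, labels, k=3)
for point, label, score in zip(points, labels, scores):
    print(point, label, round(score, 2))
```

In this toy data the last instance is surrounded entirely by instances of the other class, so it receives the highest possible score of 1.0. Under the bottom-up view described above, such an instance would be a candidate for being hard to classify or mislabeled.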