==The precision vs. recall tradeoff==
[[Image:Full-text-search-results.png|150px|thumb|right|Diagram of a low-precision, low-recall search]]
Recall measures the quantity of relevant results returned by a search, while precision measures the quality of the results returned. Recall is the ratio of the number of relevant results returned to the total number of relevant results. Precision is the ratio of the number of relevant results returned to the total number of results returned.

The diagram at right represents a low-precision, low-recall search. In the diagram the red and green dots represent the total population of potential search results for a given search. Red dots represent irrelevant results, and green dots represent relevant results. Relevancy is indicated by the proximity of search results to the center of the inner circle. Of all possible results shown, those that were actually returned by the search are shown on a light-blue background. In the example only 1 of the 3 possible relevant results was returned, so the recall is a very low 1/3, or 33%. The precision for the example is a very low 1/4, or 25%, since only 1 of the 4 results returned was relevant.<ref name="isbn1430215941">{{cite book|last=Coles|first=Michael|year=2008|title=Pro Full-Text Search in SQL Server 2008|edition=Version 1|publisher=[[Apress|Apress Publishing Company]]|isbn=978-1-4302-1594-3}}</ref>

Due to the ambiguities of [[natural language]], full-text-search systems typically include options like [[stop word|filtering]] to increase precision and [[stemming]] to increase recall. [[Controlled vocabulary|Controlled-vocabulary]] searching also helps alleviate low-precision issues by [[tag (metadata)|tagging]] documents in such a way that ambiguities are eliminated. The trade-off between precision and recall is simple: an increase in precision can lower overall recall, while an increase in recall can lower precision.<ref name="YuwonoLee">{{Cite conference | last = Yuwono | first = B. | author2 = Lee, D. L. | title = Search and ranking algorithms for locating resources on the World Wide Web | pages = 164 | publisher = 12th International Conference on Data Engineering (ICDE'96) | year = 1996}}</ref>

{{See also|Precision and recall}}
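The two ratios defined above can be checked with a few lines of Python. The sketch below is illustrative only: the function name <code>precision_recall</code> and the document identifiers (<code>g1</code> to <code>g3</code> for relevant "green" results, <code>r1</code> to <code>r5</code> for irrelevant "red" ones) are assumptions made for this example, not part of any search system. The first query reproduces the diagram's figures. The second is a hypothetical broader query, included to show how raising recall can lower precision.

```python
from fractions import Fraction


def precision_recall(returned, relevant):
    """Return (precision, recall) as exact fractions.

    returned: identifiers of the results the search actually returned
    relevant: identifiers of all results that are relevant to the query
    A ratio is None when its denominator would be zero.
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # relevant results that were returned
    precision = Fraction(len(hits), len(returned)) if returned else None
    recall = Fraction(len(hits), len(relevant)) if relevant else None
    return precision, recall


# Hypothetical identifiers standing in for the dots in the diagram.
relevant = {"g1", "g2", "g3"}

# The diagram's search: 4 results returned, only 1 of them relevant.
narrow = {"g1", "r1", "r2", "r3"}
print(precision_recall(narrow, relevant))

# A hypothetical broader search: every relevant result is found,
# but more irrelevant results come back with them.
broad = {"g1", "g2", "g3", "r1", "r2", "r3", "r4", "r5"}
print(precision_recall(broad, relevant))
```

For the first query the function yields a precision of 1/4 and a recall of 1/3, matching the figures in the text. For the hypothetical broader query, recall rises to 1 while precision is 3/8, so more of the relevant documents are found at the cost of more irrelevant ones among the returned results.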