Information extraction
==Present significance==
The present significance of IE stems from the growing amount of information available in unstructured form. [[Tim Berners-Lee]], inventor of the [[World Wide Web]], refers to the existing [[Internet]] as the web of ''documents''<ref>{{cite web|url=http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf|title=Linked Data - The Story So Far}}</ref> and advocates that more of the content be made available as a [[semantic web|web of ''data'']].<ref>{{cite web|url=http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html|title=Tim Berners-Lee on the next Web|access-date=2010-03-27|archive-date=2011-04-10|archive-url=https://web.archive.org/web/20110410204952/http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html|url-status=dead}}</ref> Until this transpires, the web largely consists of unstructured documents lacking semantic [[metadata]]. Knowledge contained within these documents can be made more accessible for machine processing by transforming it into [[relational database|relational form]], or by marking it up with [[XML]] tags. An intelligent agent monitoring a news data feed requires IE to transform unstructured data into something that can be reasoned with. A typical application of IE is to scan a set of documents written in a [[natural language]] and populate a database with the information extracted.<ref>[[Rohini Kesavan Srihari|R. K. Srihari]], W. Li, C. Niu and T. Cornell, "InfoXtract: A Customizable Intermediate Level Information Extraction Engine", [https://web.archive.org/web/20080507153920/http://journals.cambridge.org/action/displayIssue?iid=359643 Journal of Natural Language Engineering],{{dead link|date=September 2020}} Cambridge U. Press, 14(1), 2008, pp. 33-69.</ref>
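
The typical application described above, scanning natural-language documents and populating a database, can be illustrated with a minimal sketch. The sentences, the <code>acquisition</code> table, and the single hand-written pattern are all hypothetical and chosen only for illustration; practical IE systems rely on far richer linguistic analysis than one regular expression.

```python
import re
import sqlite3

# Hypothetical sentences standing in for an unstructured news feed.
DOCUMENTS = [
    "Acme Corp acquired Widget Ltd for $12 million.",
    "Shares fell sharply on Tuesday.",
    "Globex acquired Initech for $3 million.",
]

# One hand-written pattern for one relation type (an illustrative assumption).
# A "name" here is simply a run of capitalised words.
ACQUISITION = re.compile(
    r"(?P<buyer>[A-Z]\w*(?: [A-Z]\w*)*) acquired "
    r"(?P<target>[A-Z]\w*(?: [A-Z]\w*)*) for \$(?P<amount>\d+) million"
)


def extract(documents):
    """Return (buyer, target, amount_in_millions) tuples found in the text."""
    rows = []
    for text in documents:
        for match in ACQUISITION.finditer(text):
            rows.append(
                (match["buyer"], match["target"], int(match["amount"]))
            )
    return rows


def populate(rows):
    """Load the extracted tuples into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE acquisition "
        "(buyer TEXT, target TEXT, amount_musd INTEGER)"
    )
    conn.executemany("INSERT INTO acquisition VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


rows = extract(DOCUMENTS)
conn = populate(rows)

# Once in relational form, the extracted facts can be queried directly.
large_deals = conn.execute(
    "SELECT buyer, target FROM acquisition WHERE amount_musd > 5"
).fetchall()
```

The sentence that matches no pattern contributes no row, which reflects the selective nature of extraction: only facts of the targeted type reach the database.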