==== Semantic labelling common tasks ====
Here are some of the common semantic labelling tasks presented in the literature:

===== Entity linking and disambiguation =====
This is the most common task in semantic labelling. Given the text of a cell and a data source, the approach predicts the entity and links it to the matching entity in the given data source. For example, if the input to the approach were the text "Richard Feynman" and a URL to the SPARQL endpoint of DBpedia, the approach would return "[https://dbpedia.org/resource/Richard_Feynman http://dbpedia.org/resource/Richard_Feynman]", which is the entity from DBpedia. Some approaches use exact matching,<ref name=":02" /> while others use similarity metrics such as [[cosine similarity]].<ref name=":7" />

===== Subject column identification =====
The subject column of a table is the column that contains the main subjects/entities in the table.<ref name="auto12"/><ref name=":5" /><ref name=":8" /><ref>{{Citation |last1=Ermilov |first1=Ivan |date=2016 |url=http://dx.doi.org/10.1007/978-3-319-49004-5_11 |pages=163–179 |place=Cham |publisher=Springer International Publishing |doi=10.1007/978-3-319-49004-5_11 |isbn=978-3-319-49003-8 |access-date=2022-09-22 |last2=Ngomo |first2=Axel-Cyrille Ngonga|title=Knowledge Engineering and Knowledge Management |chapter=TAIPAN: Automatic Property Mapping for Tabular Data |series=Lecture Notes in Computer Science |volume=10024 |s2cid=37730677 |url-access=subscription }}</ref><ref name=":9">{{Cite journal |last=Zhang |first=Ziqi |date=2017-08-07 |editor-last=Hitzler |editor-first=Pascal |editor2-last=Cruz |editor2-first=Isabel |title=Effective and efficient Semantic Table Interpretation using TableMiner+ |url=https://www.medra.org/servlet/aliasResolver?alias=iospress&doi=10.3233/SW-160242 |journal=Semantic Web |volume=8 |issue=6 |pages=921–957 |doi=10.3233/SW-160242}}</ref> Some approaches expect the subject column as an input,<ref name=":02" /> while others, such as TableMiner+, predict it.<ref name=":9" />

===== Column data-type detection =====
Column types are categorised differently by different approaches.<ref name=":5" /> Some divide them into strings/text and numbers,<ref name="auto2" /><ref name=":6" /><ref>{{Cite book |last1=Ramnandan |first1=S.K. |last2=Mittal |first2=Amol |last3=Knoblock |first3=Craig A. |last4=Szekely |first4=Pedro |title=The Semantic Web. Latest Advances and New Domains |chapter=Assigning Semantic Labels to Data Sources |date=2015 |editor-last=Gandon |editor-first=Fabien |editor2-last=Sabou |editor2-first=Marta |editor3-last=Sack |editor3-first=Harald |editor4-last=d’Amato |editor4-first=Claudia |editor5-last=Cudré-Mauroux |editor5-first=Philippe |editor6-last=Zimmermann |editor6-first=Antoine |series=Lecture Notes in Computer Science |language=en |location=Cham |publisher=Springer International Publishing |volume=9088 |pages=403–417 |doi=10.1007/978-3-319-18818-8_25 |isbn=978-3-319-18818-8|s2cid=7040223 |doi-access=free }}</ref><ref name=":102"/> while others divide them further<ref name=":5" /> (e.g., number typology,<ref name="auto12" /> date,<ref name=":4" /><ref name=":8" /> coordinates<ref>{{Cite book |last1=Quercini |first1=Gianluca |last2=Reynaud |first2=Chantal |title=Proceedings of the 16th International Conference on Extending Database Technology |chapter=Entity discovery and annotation in tables |date=2013 |chapter-url=http://dx.doi.org/10.1145/2452376.2452457 |location=New York, New York, USA |publisher=ACM Press |page=693 |doi=10.1145/2452376.2452457 |isbn=9781450315975 |s2cid=8252126|url=https://hal.inria.fr/hal-00832639/file/edbt2013.pdf }}</ref>).

===== Relation prediction =====
Relation prediction identifies the relation that holds between the entities of two columns. For example, the relation between [[Madrid]] and [[Spain]] is "capitalOf".<ref>{{Cite web |title=About: capital of |url=https://dbpedia.org/property/capitalOf |access-date=2022-09-22 |website=dbpedia.org}}</ref> Such relations can be looked up in ontologies, such as [[DBpedia]].
Venetis et al.<ref name=":8" /> use TextRunner<ref>{{Cite journal |last1=Etzioni |first1=Oren |last2=Banko |first2=Michele |last3=Soderland |first3=Stephen |last4=Weld |first4=Daniel S. |date=2008-12-01 |title=Open information extraction from the web |url=https://doi.org/10.1145/1409360.1409378 |journal=Communications of the ACM |volume=51 |issue=12 |pages=68–74 |doi=10.1145/1409360.1409378 |issn=0001-0782 |s2cid=207169186}}</ref> to extract the relation between two columns. Syed et al.<ref name=":4" /> look up the relations between the entities of the two columns and select the most frequent one.
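
The idea of selecting the most frequent relation can be illustrated with a minimal sketch. It is not the implementation from any of the cited works: the dictionary <code>TOY_KB</code>, the helper names, and the relation "locatedIn" are hypothetical and stand in for lookups against a real knowledge base such as DBpedia.

```python
from collections import Counter

# Hypothetical stand-in for a knowledge-base lookup (for example a SPARQL
# query against DBpedia). Pairs and relation names are illustrative only.
TOY_KB = {
    ("Madrid", "Spain"): ["capitalOf", "locatedIn"],
    ("Paris", "France"): ["capitalOf"],
    ("Lisbon", "Portugal"): ["capitalOf", "locatedIn"],
    # ("Rome", "Italy") is deliberately missing to mimic an incomplete source.
}


def relations_between(subject, obj):
    """Return the relations known between two entities (empty if none)."""
    return TOY_KB.get((subject, obj), [])


def predict_relation(subject_column, object_column):
    """Pick the relation seen most often across the rows of two columns."""
    votes = Counter()
    for subject, obj in zip(subject_column, object_column):
        votes.update(relations_between(subject, obj))
    if not votes:
        return None  # no row pair was found in the knowledge base
    return votes.most_common(1)[0][0]


cities = ["Madrid", "Paris", "Lisbon", "Rome"]
countries = ["Spain", "France", "Portugal", "Italy"]
predicted = predict_relation(cities, countries)
print(predicted)
```

In this toy data "capitalOf" is found for three rows and "locatedIn" for two, so "capitalOf" is selected even though one row has no match in the data source.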