==== Semantic labelling common tasks ====
Here are some of the common semantic labelling tasks presented in the literature:

===== Entity linking and disambiguation =====
This is the most common task in semantic labelling. Given the text of a cell and a data source, the approach predicts the entity and links it to the matching one in the given data source. For example, if the input were the text "Richard Feynman" and the URL of the DBpedia SPARQL endpoint, the approach would return "[https://dbpedia.org/resource/Richard_Feynman http://dbpedia.org/resource/Richard_Feynman]", which is the corresponding entity in DBpedia. Some approaches use exact matching,<ref name=":02" /> while others use similarity metrics such as [[cosine similarity]].<ref name=":7" />
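
The two matching strategies can be contrasted with a minimal sketch. It assumes a small in-memory dictionary (<code>KB</code>, hypothetical) in place of a real SPARQL endpoint, and it computes cosine similarity over character trigrams, which is only one of several possible vector representations; the function names and the 0.5 threshold are illustrative choices, not taken from the cited approaches.

```python
import math
from collections import Counter

# Hypothetical stand-in for a knowledge base: entity label -> entity URI.
KB = {
    "Richard Feynman": "http://dbpedia.org/resource/Richard_Feynman",
    "Richard Dawkins": "http://dbpedia.org/resource/Richard_Dawkins",
    "Madrid": "http://dbpedia.org/resource/Madrid",
}


def trigrams(text):
    """Count the character trigrams of a lower-cased, space-padded string."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_exact(cell, kb=KB):
    """Return the URI whose label equals the cell text, or None."""
    return kb.get(cell)


def link_similar(cell, kb=KB, threshold=0.5):
    """Return the URI of the most similar label, or None below the threshold."""
    cell_vec = trigrams(cell)
    best = max(kb, key=lambda label: cosine(cell_vec, trigrams(label)))
    if cosine(cell_vec, trigrams(best)) >= threshold:
        return kb[best]
    return None
```

With this sketch, a misspelt cell such as "Richard Feynmann" is missed by <code>link_exact</code> but is still linked to the Richard Feynman entity by <code>link_similar</code>.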

===== Subject column identification =====
The subject column of a table is the column that contains the main subjects/entities in the table.<ref name="auto12"/><ref name=":5" /><ref name=":8" /><ref>{{Citation |last1=Ermilov |first1=Ivan |date=2016 |url=http://dx.doi.org/10.1007/978-3-319-49004-5_11 |pages=163–179 |place=Cham |publisher=Springer International Publishing |doi=10.1007/978-3-319-49004-5_11 |isbn=978-3-319-49003-8 |access-date=2022-09-22 |last2=Ngomo |first2=Axel-Cyrille Ngonga|title=Knowledge Engineering and Knowledge Management |chapter=TAIPAN: Automatic Property Mapping for Tabular Data |series=Lecture Notes in Computer Science |volume=10024 |s2cid=37730677 |url-access=subscription }}</ref><ref name=":9">{{Cite journal |last=Zhang |first=Ziqi |date=2017-08-07 |editor-last=Hitzler |editor-first=Pascal |editor2-last=Cruz |editor2-first=Isabel |title=Effective and efficient Semantic Table Interpretation using TableMiner+ |url=https://www.medra.org/servlet/aliasResolver?alias=iospress&doi=10.3233/SW-160242 |journal=Semantic Web |volume=8 |issue=6 |pages=921–957 |doi=10.3233/SW-160242}}</ref> Some approaches expect the subject column as an input,<ref name=":02" /> while others, such as TableMiner+, predict it.<ref name=":9" />

===== Column data-type detection =====
Column types are categorised differently by different approaches.<ref name=":5" /> Some divide them into strings/text and numbers,<ref name="auto2" /><ref name=":6" /><ref>{{Cite book |last1=Ramnandan |first1=S.K. |last2=Mittal |first2=Amol |last3=Knoblock |first3=Craig A. |last4=Szekely |first4=Pedro |title=The Semantic Web. Latest Advances and New Domains |chapter=Assigning Semantic Labels to Data Sources |date=2015 |editor-last=Gandon |editor-first=Fabien |editor2-last=Sabou |editor2-first=Marta |editor3-last=Sack |editor3-first=Harald |editor4-last=d’Amato |editor4-first=Claudia |editor5-last=Cudré-Mauroux |editor5-first=Philippe |editor6-last=Zimmermann |editor6-first=Antoine |series=Lecture Notes in Computer Science |language=en |location=Cham |publisher=Springer International Publishing |volume=9088 |pages=403–417 |doi=10.1007/978-3-319-18818-8_25 |isbn=978-3-319-18818-8|s2cid=7040223 |doi-access=free }}</ref><ref name=":102"/> while others divide them further<ref name=":5" /> (e.g., Number Typology,<ref name="auto12" /> date,<ref name=":4" /><ref name=":8" /> coordinates<ref>{{Cite book |last1=Quercini |first1=Gianluca |last2=Reynaud |first2=Chantal |title=Proceedings of the 16th International Conference on Extending Database Technology |chapter=Entity discovery and annotation in tables |date=2013 |chapter-url=http://dx.doi.org/10.1145/2452376.2452457 |location=New York, New York, USA |publisher=ACM Press |page=693 |doi=10.1145/2452376.2452457 |isbn=9781450315975 |s2cid=8252126|url=https://hal.inria.fr/hal-00832639/file/edbt2013.pdf }}</ref>).
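
As a rough illustration of a coarse text/number split extended with one finer type, the following sketch classifies each cell and takes a majority vote per column. The type names, the two date formats, and the majority-vote rule are assumptions made for this example and do not reproduce any of the cited approaches.

```python
from collections import Counter
from datetime import datetime

# Assumed date formats for this example; real tables use many more.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def cell_type(value):
    """Classify a single cell as 'number', 'date' or 'text'."""
    text = value.strip()
    try:
        float(text)
        return "number"
    except ValueError:
        pass
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return "date"
        except ValueError:
            continue
    return "text"


def column_type(cells):
    """Return the most frequent cell type among the non-empty cells."""
    counts = Counter(cell_type(c) for c in cells if c.strip())
    return counts.most_common(1)[0][0] if counts else "text"
```

A majority vote tolerates a few noisy cells, so a column such as <code>["1.5", "2", "n/a"]</code> is still treated as numeric.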

===== Relation prediction =====
Relation prediction identifies the relation that holds between two columns. For example, the relation between [[Madrid]] and [[Spain]] is "capitalOf".<ref>{{Cite web |title=About: capital of |url=https://dbpedia.org/property/capitalOf |access-date=2022-09-22 |website=dbpedia.org}}</ref> Such relations can easily be found in ontologies, such as [[DBpedia]]. Venetis et al.<ref name=":8" /> use TextRunner<ref>{{Cite journal |last1=Etzioni |first1=Oren |last2=Banko |first2=Michele |last3=Soderland |first3=Stephen |last4=Weld |first4=Daniel S. |date=2008-12-01 |title=Open information extraction from the web |url=https://doi.org/10.1145/1409360.1409378 |journal=Communications of the ACM |volume=51 |issue=12 |pages=68–74 |doi=10.1145/1409360.1409378 |issn=0001-0782 |s2cid=207169186}}</ref> to extract the relation between two columns. Syed et al.<ref name=":4" /> look up the relations between the entities of the two columns and select the most frequent one.
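
The idea of selecting the most frequent relation can be sketched as follows. The lookup table <code>RELATIONS</code> is a hypothetical stand-in for querying an ontology, and the relation name "locatedIn" and the function name are illustrative; this is a simplified reading of the frequency-based idea, not the published implementation.

```python
from collections import Counter

# Hypothetical stand-in for an ontology lookup:
# (subject entity, object entity) -> relations linking them.
RELATIONS = {
    ("Madrid", "Spain"): ["capitalOf", "locatedIn"],
    ("Paris", "France"): ["capitalOf"],
    ("Lisbon", "Portugal"): ["capitalOf", "locatedIn"],
}


def predict_relation(column_a, column_b, kb=RELATIONS):
    """Return the relation seen most often across the row-wise entity pairs."""
    counts = Counter()
    for subject, obj in zip(column_a, column_b):
        # Pairs that are unknown to the knowledge base contribute nothing.
        counts.update(kb.get((subject, obj), ()))
    return counts.most_common(1)[0][0] if counts else None
```

For the columns <code>["Madrid", "Paris", "Lisbon"]</code> and <code>["Spain", "France", "Portugal"]</code>, "capitalOf" occurs three times and "locatedIn" twice, so "capitalOf" is selected.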