Semantic Enrichment of a Multilingual Archive with Linked Open Data

Max De Wilde, Simon Hengchen

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

Abstract

This paper introduces MERCKX, a Multilingual Entity/Resource Combiner & Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80% precision despite lower recall scores. These results are encouraging for small and medium-size cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.
Original language: English
Journal: Digital Humanities Quarterly
Volume: 11
Issue: 4
Status: Published - 2017
MoE publication type: A1 Journal article (refereed)

Fields of science

  • 6121 Languages
  • 6160 Other humanities
  • 113 Computer and information sciences

Cite this

@article{32f2a53f5fe045d1b0107de6378164a1,
title = "Semantic Enrichment of a Multilingual Archive with Linked Open Data",
abstract = "This paper introduces MERCKX, a Multilingual Entity/Resource Combiner \& Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80{\%} precision despite lower recall scores. These results are encouraging for small and medium-size cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.",
keywords = "6121 Languages, 6160 Other humanities, 113 Computer and information sciences",
author = "{De Wilde}, Max and Hengchen, Simon",
year = "2017",
language = "English",
volume = "11",
journal = "Digital Humanities Quarterly",
issn = "1938-4122",
publisher = "Alliance of Digital Humanities Organizations",
number = "4",

}

Semantic Enrichment of a Multilingual Archive with Linked Open Data. / De Wilde, Max; Hengchen, Simon.

In: Digital Humanities Quarterly, Vol. 11, No. 4, 2017.


TY - JOUR

T1 - Semantic Enrichment of a Multilingual Archive with Linked Open Data

AU - De Wilde, Max

AU - Hengchen, Simon

PY - 2017

Y1 - 2017

N2 - This paper introduces MERCKX, a Multilingual Entity/Resource Combiner & Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80% precision despite lower recall scores. These results are encouraging for small and medium-size cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.

AB - This paper introduces MERCKX, a Multilingual Entity/Resource Combiner & Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80% precision despite lower recall scores. These results are encouraging for small and medium-size cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.

KW - 6121 Languages

KW - 6160 Other humanities

KW - 113 Computer and information sciences

M3 - Article

VL - 11

JO - Digital Humanities Quarterly

JF - Digital Humanities Quarterly

SN - 1938-4122

IS - 4

ER -