Semantic Enrichment of a Multilingual Archive with Linked Open Data

Max De Wilde, Simon Hengchen

Research output: Contribution to journal, Article, Scientific, Peer-reviewed

Abstract

This paper introduces MERCKX, a Multilingual Entity/Resource Combiner & Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80% precision despite lower recall scores. These results are encouraging for small and medium-size cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.
Original language: English
Journal: Digital Humanities Quarterly
Volume: 11
Issue: 4
Status: Published - 2017
Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal, peer-reviewed

Fields of science

  • 6121 Languages
  • 6160 Other humanities
  • 113 Computer and information sciences

Cite this

@article{32f2a53f5fe045d1b0107de6378164a1,
title = "Semantic Enrichment of a Multilingual Archive with Linked Open Data",
abstract = "This paper introduces MERCKX, a Multilingual Entity/Resource Combiner {\&} Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80{\%} precision despite lower recall scores. These results are encouraging for small and medium-size cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.",
keywords = "6121 Languages, 6160 Other humanities, 113 Computer and information sciences",
author = "{De Wilde}, Max and Simon Hengchen",
year = "2017",
language = "English",
volume = "11",
journal = "Digital Humanities Quarterly",
issn = "1938-4122",
publisher = "Alliance of Digital Humanities Organizations",
number = "4",

}

Semantic Enrichment of a Multilingual Archive with Linked Open Data. / De Wilde, Max; Hengchen, Simon.

In: Digital Humanities Quarterly, Vol. 11, No. 4, 2017.

TY - JOUR

T1 - Semantic Enrichment of a Multilingual Archive with Linked Open Data

AU - De Wilde, Max

AU - Hengchen, Simon

PY - 2017

Y1 - 2017

N2 - This paper introduces MERCKX, a Multilingual Entity/Resource Combiner & Knowledge eXtractor. A case study involving the semantic enrichment of a multilingual archive is presented with the aim of assessing the relevance of natural language processing techniques such as named-entity recognition and entity linking for cultural heritage material. In order to improve the indexing of historical collections, we map entities to the Linked Open Data cloud using a language-independent method. Our evaluation shows that MERCKX outperforms similar tools on the task of place disambiguation and linking, achieving over 80% precision despite lower recall scores. These results are encouraging for small and medium-size cultural institutions since they demonstrate that semantic enrichment can be achieved with limited resources.

KW - 6121 Languages

KW - 6160 Other humanities

KW - 113 Computer and information sciences

M3 - Article

VL - 11

JO - Digital Humanities Quarterly

JF - Digital Humanities Quarterly

SN - 1938-4122

IS - 4

ER -