Benchmarks for Unsupervised Discourse Change Detection

Research output: Chapter in book/report/conference proceeding; Conference contribution; Scientific; Peer-reviewed

Abstract

The main motivation for this work lies in the need to track discourse dynamics in historical corpora. However, in many real use cases, ground truth is not available, and annotating discourses at the corpus level is hardly possible. We propose a novel procedure to generate synthetic datasets for this task, a novel evaluation framework, and a set of benchmarking models. Finally, we run large-scale experiments using these synthetic datasets and demonstrate that a model trained on such a dataset can obtain meaningful results when applied to a real dataset, without any adjustments to the model.
Original language: English
Title of host publication: Proceedings of the 6th International Workshop on Computational History (HistoInformatics 2021)
Editors: Yasunobu Sumikawa, Ryohei Ikejiri, Antoine Doucet, Eva Pfanzelter, Mohammed Hasanuzzaman, Gaël Dias, Ian Milligan, Adam Jatowt
Number of pages: 12
Place of publication: Aachen
Publisher: CEUR-WS.org
Publication date: Sep 2021
Status: Published - Sep 2021
MoE publication type: A4 Article in conference proceedings
Event: International Workshop on Computational History
Duration: 30 Sep 2021 to 1 Oct 2021
Conference number: 6
https://sites.google.com/view/histoinformatics2021workshop/home

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR-WS.org
Volume: 2981
ISSN (electronic): 1613-0073

Fields of science

  • 113 Computer and information sciences
