A New Latin Treebank for Universal Dependencies: Charters between Ancient Latin and Romance Languages

Flavio Massimiliano Cecchini, Timo Korkiakangas, Marco Passarotti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

The present work introduces a new Latin treebank that follows the Universal Dependencies (UD) annotation standard. The treebank is obtained from the automated conversion of the Late Latin Charter Treebank 2 (LLCT2), originally in the Prague Dependency Treebank (PDT) style. As this treebank consists of Early Medieval legal documents, its language variety differs considerably from both the Classical and Medieval learned varieties prevalent in the other currently available UD Latin treebanks. Consequently, besides significant phenomena from the perspective of diachronic linguistics, this treebank also poses several challenging technical issues for the current and future syntactic annotation of Latin in the UD framework. Some of the most relevant cases are discussed in depth, with comparisons between the original PDT and the resulting UD annotations. Additionally, an overview of the UD-style structure of the treebank is given, and some diachronic aspects of the transition from Latin to Romance languages are highlighted.
Original language: English
Title of host publication: Twelfth International Conference on Language Resources and Evaluation (LREC 2020) : May 11-16, 2020 PALAIS DU PHARO Marseille, France : Conference Proceedings
Editors: Nicoletta Calzolari ... [et al.]
Number of pages: 10
Place of Publication: Paris
Publisher: European Language Resources Association (ELRA)
Publication date: 2020
Pages: 933-942
ISBN (Print): 979-10-95546-34-4
Publication status: Published - 2020
MoE publication type: A4 Article in conference proceedings
Event: International Conference on Language Resources and Evaluation - Marseille, France
Duration: 11 May 2020 to 16 May 2020
Conference number: 12

Fields of Science

  • 6121 Languages
