On the questions in developing computational infrastructure for Komi-Permyak

Jack Rueter, Niko Partanen, Larisa Ponomareva

Research output: Chapter in book/report/conference proceedings; Conference contribution; Scientific; Peer-reviewed


There are two main written Komi varieties, Permyak and Zyrian. These are mutually intelligible but derive from different parts of the same Komi dialect continuum, representing the varieties prominent in the vicinity and in the cities of Syktyvkar and Kudymkar, respectively. Hence, they share a vast number of features, as well as the majority of their lexicon, yet the overlap in their dialects is very complex. This paper evaluates the degree of difference in these written varieties based on changes required for computational resources in the description of these languages when adapted from the Komi-Zyrian original. Primarily these changes include the FST architecture, but we are also looking at its application to the Universal Dependencies annotation scheme in the morphologies of the two languages.
Title of host publication: Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages
Editors: Tommi A. Pirinen, Francis M. Tyers, Michael Rießler
Number of pages: 11
Publisher: The Association for Computational Linguistics
ISBN (electronic): 978-1-952148-00-2
Publication status: Published - 2020
MoE publication type: A4 Article in conference proceedings
Event: International Workshop on Computational Linguistics of Uralic Languages - Universität Wien, Vienna, Austria
Duration: 10 Jan 2020 to 11 Jan 2020
Conference number: 6


  • 6121 Languages
