On the questions in developing computational infrastructure for Komi-Permyak

Jack Rueter, Niko Partanen, Larisa Ponomareva

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


There are two main written Komi varieties, Permyak and Zyrian. These are mutually intelligible but derive from different parts of the same Komi dialect continuum, representing the varieties prominent in and around the cities of Kudymkar and Syktyvkar, respectively. Hence, they share a vast number of features, as well as the majority of their lexicon, yet the overlap in their dialects is very complex. This paper evaluates the degree of difference between these written varieties based on the changes required in the computational resources describing these languages when they are adapted from the Komi-Zyrian original. These changes primarily concern the finite-state transducer (FST) architecture, but we also look at its application to the Universal Dependencies annotation scheme in the morphologies of the two languages.
Original language: English
Title of host publication: Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages
Editors: Tommi A. Pirinen, Francis M. Tyers, Michael Rießler
Number of pages: 11
Place of Publication: Stroudsburg
Publisher: The Association for Computational Linguistics
Publication date: 2020
ISBN (Electronic): 978-1-952148-00-2
Publication status: Published - 2020
MoE publication type: A4 Article in conference proceedings
Event: International Workshop on Computational Linguistics of Uralic Languages - Universität Wien, Vienna, Austria
Duration: 10 Jan 2020 - 11 Jan 2020
Conference number: 6

Fields of Science

  • 6121 Languages
  • Komi-Zyrian language
  • Komi-Permyak language
  • Permic languages
  • Dialect continuum
  • Computational infrastructure
  • open-source
  • finite-state morphology
