A Free/Open-Source Morphological Analyser and Generator for Sakha

Sardana Ivanova, Jonathan Washington, Francis M. Tyers

Research output: Chapter in book/report/conference proceedings › Conference contribution › Scientific › Peer-reviewed

Abstract

We present, to our knowledge, the first ever published morphological analyser and generator for Sakha, a marginalised language of Siberia. The transducer, developed using HFST, has coverage solidly above 90% and high precision. In developing the analyser, we have expanded linguistic knowledge about Sakha and developed strategies for handling complex grammatical patterns. The transducer is already being used in downstream tasks, including computer-assisted language learning applications for linguistic maintenance and computational linguistic shared tasks.
Original language: English
Title of host publication: LREC 2022, THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION : LREC 2022 Conference Proceedings
Number of pages: 6
Publisher: European Language Resources Association (ELRA)
Publication date: June 2022
Pages: 5137-5142
ISBN (electronic): 979-10-95546-72-6
Status: Published - June 2022
MoE publication type: A4 Article in conference proceedings
Event: Language Resources and Evaluation Conference - Marseille, France
Duration: 21 June 2022 to 23 June 2022
Conference number: 13
https://lrec2022.lrec-conf.org/en/

Fields of science

  • 113 Computer and information sciences
