Finno-Ugrian Language Text Corpora for Testing of Grammar Descriptions

    Project: Research project

    Project Details

    Description (abstract)

    This entails the acquisition of Erzya-language text corpora, with authorized release (copyright clearance) where necessary. The source formats include electronic text documents, PDFs, and paper prints that require scanning and OCR.
    Status: Active
    Effective start/end date: 01/04/1995 → …

    Fields of Science

    • 612 Languages and Literature
    • Erzya
    • Komi
    • Moksha
    • Literature
    • Original-language
    • novel
    • short story
    • poetry
    • journal
    • continuous text
    • 213 Electronic, automation and communications engineering, electronics
    • Unicode
    • xml
    • finite-state development

    Research output

    • Enhancing Endangered Language FSTs with Pokémon Names

      Rueter, J. & Hämäläinen, M., Nov 2024, Lightning Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages. Hämäläinen, M. & Pirinen, F. (eds.). Helsinki: Fly for Points, p. 1–5, 5 p.

      Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Scientific > peer-review

      Open Access
    • UD_Erzya-JR 2.14

      Rueter, J., Nivre, J., Zeman, D., Erina, O., Klementeva, J., Riabov, I. & Tyers, F. M., 15 May 2024

      Research output: Non-textual form > Software > Scientific > peer-review

      Open Access
    • UD_Erzya-JR 2.15

      Rueter, J., Zeman, D., Nivre, J., Erina, O., Klementeva, J., Riabov, I. & Tyers, F. M., 15 Nov 2024

      Research output: Non-textual form > Software > Scientific > peer-review

      Open Access