Finno-Ugrian Language Text Corpora for Testing of Grammar Descriptions

Project Details

Description

This project entails the acquisition of Erzya-language text corpora, with authorized release obtained where copyright requires it. The source formats include electronic text documents, PDFs, and paper prints that require scanning and OCR.
Status: Active
Effective start/end date: 01/04/1995 → …

Fields of Science

  • 612 Languages and Literature
  • Erzya
  • Komi
  • Moksha
  • Literature
  • Original-language
  • novel
  • short story
  • poetry
  • journal
  • continuous text
  • 213 Electronic, automation and communications engineering, electronics
  • Unicode
  • xml
  • finite-state development

Research output

  • UD_Erzya-JR 2.8

    Rueter, J., Nivre, J., Zeman, D., Erina, O., Klementeva, J., Riabov, I. & Tyers, F. M., 15 May 2021

    Research output: Non-textual form, Software, Scientific, peer-review

    Open Access
  • UD_Moksha-JR 2.8

    Nivre, J., Zeman, D., Rueter, J., Kabaeva, N. & Levina, M., 15 May 2021

    Research output: Non-textual form, Software, Scientific, peer-review

    Open Access
  • UD_Skolt_Sami-Giellagas 2.8

    Nivre, J., Zeman, D., Rueter, J., Juutinen, M. & Hämäläinen, M., 15 May 2021

    Research output: Non-textual form, Software, Scientific, peer-review

    Open Access