Finno-Ugrian Language Text Corpora for Testing of Grammar Descriptions

Project Details

Description

The project entails acquiring Erzya-language text corpora, with authorized release obtained where copyright requires it. Source formats include electronic text documents, PDFs, and paper prints that require scanning and OCR.
Status: Active
Effective start/end date: 01/04/1995 → …

Fields of Science

  • 612 Languages and Literature
  • Erzya
  • Komi
  • Moksha
  • Literature
  • Original-language
  • novel
  • short story
  • poetry
  • journal
  • continuous text
  • 213 Electronic, automation and communications engineering, electronics
  • Unicode
  • XML
  • finite-state development