Abstract
This paper describes work in progress by the Izhva Komi language documentation project, which records new spoken language data, digitizes available recordings, and annotates these multimedia data in order to provide a comprehensive language corpus as a database for future research on and for this endangered and under-described Uralic speech community. While working with a spoken variety and in the framework of documentary linguistics, we apply language technology methods and tools that have so far been applied only to normalized written languages. Specifically, we describe a script providing interactivity between ELAN, a graphical user interface tool for annotating and presenting multimodal corpora, and different morphosyntactic analysis modules implemented as Finite-State Transducers and Constraint Grammar for rule-based morphosyntactic tagging and disambiguation. Our aim is to challenge current manual approaches in the annotation of language documentation corpora.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages |
Number of pages | 10 |
Publisher | The Association for Computational Linguistics |
Publication date | Mar 2017 |
Pages | 57-66 |
DOIs | |
Status | Published - Mar 2017 |
Externally published | Yes |
MinEdu publication type | A4 Article in conference proceedings |
Event | Workshop on Computational Methods for Endangered Languages - Honolulu, United States (USA) Duration: 26 Feb 2019 → 27 Feb 2019 Conference number: 3 |
Publication series
Name | ACL Anthology |
---|
Fields of Science
- 113 Computer and information sciences
- 6121 Languages
Projects
- 1 Active
- IKDP-2: Language Documentation meets Language Technology: The Next Step in the Description of Komi
Blokland, R. (Project manager), Rießler, M. (Project manager) & Partanen, N. (Project manager)
01/03/2017 → …
Project: Research project