Abstract
The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi language documentation projects, all of which record new spoken language data, digitize available recordings and annotate these multimedia data in order to provide comprehensive language corpora as databases for future research on and for endangered – and under-described – Uralic speech communities. Applying language technology in language documentation helps us to create more systematically annotated corpora, rather than eclectic data collections. Specifically, we describe a script providing interactivity between different morphosyntactic analysis modules implemented as Finite State Transducers and ELAN, a Graphical User Interface tool for annotating and presenting multimodal corpora. Ultimately, the spoken corpora created in our projects will be useful for scientifically significant quantitative investigations on these languages in the future.
Original language | English |
---|---|
Journal | Northern European Journal of Language Technology |
Volume | 4 |
Pages | 29-47 |
Number of pages | 19 |
ISSN | 2000-1533 |
DOIs (permanent links) | |
Status | Published - 2016 |
Externally published | Yes |
OKM publication type | A1 Original article in a scientific journal, peer-reviewed |
Fields of science
- 6121 Languages
Projects
- 1 Finished
- IKDP: Building an annotated digital corpus for future research on Komi speech communities in northernmost Russia
Blokland, R., Fedina, M., Rießler, M. & Partanen, N.
01/01/2014 → 31/12/2016
Project: Research project