Grapheme-Based Cross-Language Forced Alignment: Results with Uralic Languages

Juho Leinonen, Sami Virpioja, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Forced alignment is an effective process to speed up linguistic research. However, most forced aligners are language-dependent, and under-resourced languages rarely have enough resources to train an acoustic model for an aligner. We present a new Finnish grapheme-based forced aligner and demonstrate its performance by aligning multiple Uralic languages and English as an unrelated language. We show that even a simple grapheme-to-phoneme mapping created by a non-expert can result in useful word alignments.
Original language: English
Title of host publication: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Editors: Simon Dobnik, Lilja Øvrelid
Number of pages: 6
Place of Publication: Linköping
Publisher: Linköping University Electronic Press
Publication date: 1 May 2021
Pages: 345-350
ISBN (Electronic): 978-91-7929-614-8
Publication status: Published - 1 May 2021
MoE publication type: A4 Article in conference proceedings
Event: Nordic Conference on Computational Linguistics - [Online event], Reykjavik, Iceland
Duration: 31 May 2021 – 2 Jun 2021
Conference number: 23
https://nodalida2021.github.io/index.html

Publication series

Name: Linköping Electronic Conference Proceedings
Publisher: Linköping University Electronic Press
Number: 78
ISSN (Print): 1650-3740
ISSN (Electronic): 1650-3686

Name: NEALT Proceedings Series
Publisher: University of Tartu
Number: 45
ISSN (Print): 1736-8197
ISSN (Electronic): 1736-6305

Fields of Science

  • 113 Computer and information sciences
  • 6121 Languages
