Grapheme-Based Cross-Language Forced Alignment: Results with Uralic Languages

Juho Leinonen, Sami Virpioja, Mikko Kurimo

Research output: Chapter in book/report/conference proceedings › Conference contribution › Scientific › Peer-reviewed

Abstract

Forced alignment is an effective process to speed up linguistic research. However, most forced aligners are language-dependent, and under-resourced languages rarely have enough resources to train an acoustic model for an aligner. We present a new Finnish grapheme-based forced aligner and demonstrate its performance by aligning multiple Uralic languages and English as an unrelated language. We show that even a simple non-expert created grapheme-to-phoneme mapping can result in useful word alignments.
Original language: English
Title of host publication: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Editors: Simon Dobnik, Lilja Øvrelid
Number of pages: 6
Place of publication: Linköping
Publisher: Linköping University Electronic Press
Publication date: 1 May 2021
Pages: 345-350
ISBN (electronic): 978-91-7929-614-8
Status: Published - 1 May 2021
MoE publication type: A4 Article in a conference publication
Event: Nordic Conference on Computational Linguistics - [Online event], Reykjavik, Iceland
Duration: 31 May 2021 - 2 Jun 2021
Conference number: 23
https://nodalida2021.github.io/index.html

Publication series

Name: Linköping Electronic Conference Proceedings
Publisher: Linköping University Electronic Press
Number: 78
ISSN (print): 1650-3740
ISSN (electronic): 1650-3686
Name: NEALT Proceedings Series
Publisher: University of Tartu
Number: 45
ISSN (print): 1736-8197
ISSN (electronic): 1736-6305

Fields of science

  • 113 Computer and information sciences
  • 6121 Languages
