Finnish ASR with Deep Transformer Models

Abhilash Jain, Aku Rouhe, Stig-Arne Grönroos, Mikko Kurimo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer reviewed

Abstract

Recently, BERT and Transformer-XL based architectures have achieved strong results in a range of NLP applications. In this paper, we explore Transformer architectures, BERT and Transformer-XL, as language models for a Finnish ASR task with different rescoring schemes. We achieve strong results in both an intrinsic and an extrinsic task with Transformer-XL, obtaining 29% better perplexity and 3% better WER than our previous best LSTM-based approach. We also introduce a novel three-pass decoding scheme which improves the ASR performance by 8%. To the best of our knowledge, this is also the first work (i) to formulate an alpha smoothing framework to use the non-autoregressive BERT language model for an ASR task, and (ii) to explore sub-word units with Transformer-XL for an agglutinative language like Finnish.
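The abstract refers to rescoring schemes without giving their form here. As an illustration only, the sketch below shows the usual shape of n-best rescoring with an external language model such as Transformer-XL: the first-pass decoder emits an n-best list with per-hypothesis scores, and the stronger LM re-ranks it. The function name, the log-linear interpolation, and the lm_weight parameter are assumptions for this sketch, not the paper's exact formulation.

```python
# A minimal sketch of n-best rescoring with an external LM;
# a stand-in for the paper's rescoring schemes, not a copy of them.

def rescore_nbest(nbest, lm_score, lm_weight=0.5):
    """Return the best hypothesis after LM rescoring.

    nbest:     list of (hypothesis, first_pass_log_score) pairs
    lm_score:  callable mapping a hypothesis string to an LM
               log-probability (e.g. from a Transformer-XL LM)
    lm_weight: hypothetical interpolation weight, tuned on dev data
    """
    rescored = [
        (hyp, (1.0 - lm_weight) * first_pass + lm_weight * lm_score(hyp))
        for hyp, first_pass in nbest
    ]
    return max(rescored, key=lambda pair: pair[1])


# Toy usage with a stand-in LM that simply prefers shorter hypotheses.
toy_nbest = [("hyvää päivää", -12.3), ("hyvää t päivää", -11.9)]
best = rescore_nbest(toy_nbest, lm_score=lambda h: -len(h.split()))
print(best[0])
```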
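Because BERT is non-autoregressive, it cannot assign a left-to-right sentence probability directly; a common workaround is masked-LM pseudo-log-likelihood scoring. The sketch below illustrates that idea with an alpha exponent as one plausible reading of "alpha smoothing" (p ** alpha in probability space); the paper defines its own framework, and the multilingual checkpoint here is a stand-in, not the model the authors used.

```python
# A minimal sketch of scoring a hypothesis with a masked (non-autoregressive)
# LM via pseudo-log-likelihood. The alpha exponent is an assumption about
# what alpha smoothing could look like, not the paper's definition.

import torch
from transformers import BertForMaskedLM, BertTokenizer

MODEL = "bert-base-multilingual-cased"  # stand-in; the paper's Finnish model differs
tokenizer = BertTokenizer.from_pretrained(MODEL)
model = BertForMaskedLM.from_pretrained(MODEL).eval()

def pseudo_log_likelihood(sentence: str, alpha: float = 1.0) -> float:
    """Mask each token in turn and sum the smoothed log-probabilities."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, ids.size(0) - 1):      # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id  # hide token i from the model
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs = torch.log_softmax(logits, dim=-1)
        total += alpha * log_probs[ids[i]].item()  # alpha * log p == log(p ** alpha)
    return total
```

A score like this could serve as the lm_score callable in the rescoring sketch above.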
Original language: English
Title of host publication: Proceedings of Interspeech 2020
Number of pages: 5
Place of publication: Baixas
Publisher: ISCA - International Speech Communication Association
Publication date: 2020
Pages: 3630-3634
DOI
Status: Published - 2020
Externally published: Yes
MoE publication type: A4 Article in a conference publication
Event: Interspeech 2020 - [Virtual conference]
Duration: 25 Oct 2020 - 29 Oct 2020
http://www.interspeech2020.org

Publication series

Name: Interspeech
Publisher: ISCA
ISSN (electronic): 2308-457X

Fields of science

  • 6121 Languages
  • 113 Computer and information sciences
