Abstract
Recently, BERT and Transformer-XL based architectures have achieved strong results in a range of NLP applications. In this paper, we explore two Transformer architectures, BERT and Transformer-XL, as language models for a Finnish ASR task with different rescoring schemes. With Transformer-XL we achieve strong results in both an intrinsic and an extrinsic task: 29% better perplexity and 3% better WER than our previous best LSTM-based approach. We also introduce a novel three-pass decoding scheme which improves the ASR performance by 8%. To the best of our knowledge, this is also the first work (i) to formulate an alpha smoothing framework to use the non-autoregressive BERT language model for an ASR task, and (ii) to explore sub-word units with Transformer-XL for an agglutinative language like Finnish.
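The abstract reports an intrinsic metric (perplexity) and an extrinsic one (word error rate, WER). As a minimal sketch of how these two standard metrics are usually computed, the following may help. It is not the authors' evaluation code. The function names are hypothetical, and the abstract does not say whether the reported improvements are relative or absolute, so the `relative_improvement` helper reflects an assumption only.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log token probabilities: exp(-mean log p)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # reference word missing from hypothesis
                curr[j - 1] + 1,         # extra word in hypothesis
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline, new):
    """Fractional reduction of a lower-is-better metric (assumed convention)."""
    return (baseline - new) / baseline
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one missing word against four reference words.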
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech 2020 |
Number of pages | 5 |
Place of publication | Baixas |
Publisher | ISCA - International Speech Communication Association |
Publication date | 2020 |
Pages | 3630-3634 |
DOI | |
Status | Published - 2020 |
Externally published | Yes |
MoE publication type | A4 Article in conference proceedings |
Event | Interspeech 2020 - [Virtual conference] Duration: 25 Oct 2020 → 29 Oct 2020 http://www.interspeech2020.org |
Publication series
Name | Interspeech |
---|---|
Publisher | ISCA |
ISSN (electronic) | 2308-457X |
Fields of science
- 6121 Languages
- 113 Computer and information sciences