Description
Current machine translation architectures are based on deep neural networks and provide impressive translation quality for the major language pairs. I will start by giving a high-level overview of neural machine translation architectures and then focus on two challenging application scenarios.

The first scenario concerns low-resource languages. I will present our participation in the AmericasNLP shared task, which focuses on machine translation from Spanish to eleven indigenous languages of the Americas. I will describe how a combination of techniques, ranging from data collection to knowledge distillation and post-processing, helps improve translation quality.
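As a hedged illustration of the knowledge distillation step mentioned above, the following is a minimal sketch of a sequence-level setup: a stronger teacher model translates monolingual Spanish text, and a smaller student is then trained on the resulting synthetic pairs. The teacher model path, file name, and generation settings are placeholders and not taken from the shared task submission.

```python
"""Minimal sketch of sequence-level knowledge distillation for low-resource NMT.

Assumptions (not from the talk abstract): the teacher is any seq2seq
translation model loadable with Hugging Face transformers, and
monolingual_es.txt contains one Spanish sentence per line. The student
model would subsequently be trained on the synthetic pairs produced here.
"""
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

TEACHER_PATH = "path/to/teacher-model"  # placeholder: a strong Spanish->target-language model

tokenizer = AutoTokenizer.from_pretrained(TEACHER_PATH)
teacher = AutoModelForSeq2SeqLM.from_pretrained(TEACHER_PATH)

def distill(sentences, batch_size=16):
    """Translate Spanish sentences with the teacher to build synthetic parallel data."""
    pairs = []
    for i in range(0, len(sentences), batch_size):
        batch = sentences[i:i + batch_size]
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        outputs = teacher.generate(**inputs, num_beams=5, max_new_tokens=128)
        targets = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        pairs.extend(zip(batch, targets))
    return pairs

with open("monolingual_es.txt", encoding="utf-8") as f:
    source_sentences = [line.strip() for line in f if line.strip()]

synthetic_corpus = distill(source_sentences)  # (Spanish, teacher translation) pairs
```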
In the second scenario, I investigate the suitability of neural machine translation techniques for the automatic normalization of phonetic transcriptions in multi-dialectal corpora. In this case, our focus is not on optimal normalization performance, but rather on what the model learns about the different dialects and their relation to each other during training. In our case study with large Finnish and Norwegian dialect corpora, the model successfully identified the major dialect areas known from prior dialectological research.
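One simple way to probe this kind of structure (an illustration only; the case study's actual analysis is not specified in this abstract) is to derive one vector per dialect from the trained normalization model, for example by averaging encoder states over that dialect's transcriptions, and to cluster those vectors hierarchically. The sketch below uses random placeholder vectors and a few Finnish dialect-group labels purely as examples.

```python
"""Illustrative sketch: hierarchical clustering of per-dialect vectors.

The vectors here are random placeholders standing in for representations
extracted from a trained normalization model (e.g. averaged encoder states);
the dialect labels are examples, not the full set used in the case study.
"""
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

dialects = ["Southwest", "Tavastian", "Savonian", "Ostrobothnian"]
rng = np.random.default_rng(0)
dialect_vectors = rng.normal(size=(len(dialects), 64))  # placeholder embeddings

# Average-linkage clustering over cosine distances between dialect vectors.
Z = linkage(dialect_vectors, method="average", metric="cosine")
tree = dendrogram(Z, labels=dialects, no_plot=True)
print(tree["ivl"])  # leaf order: similar dialects end up next to each other
```

With representations from a real model, the resulting tree can then be compared against the dialect areas established in dialectological research.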
Period | 25 May 2023 |
---|---|
Held at | Paris-Lodron University of Salzburg, Austria |
Degree of recognition | Local |
Documents and links
Related content
Projects
- Found in Translation - Natural Language Understanding with Cross-Lingual Grounding
  Project: EU Horizon 2020: European Research Council: Consolidator Grant (H2020-ERC-COG)