Multilingual NMT with a language-independent attention bridge

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › Peer-reviewed

Abstract

In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by incorporating an intermediate "attention bridge" that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call the attention bridge. This layer exploits the semantics of each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach systematically with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its capacity for abstraction and transfer learning.
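To make the core idea concrete, below is a minimal NumPy sketch of such a shared bridge layer. It assumes a structured self-attention formulation (a fixed number of learned attention heads attending over the encoder states); all variable names and dimensions are illustrative, not taken from the paper's implementation. The key property it demonstrates is that sentences of any length, from any language-specific encoder, are mapped to a representation of fixed size, which is what lets the layer be shared across languages.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_bridge(H, W1, W2):
    """Map variable-length encoder output H (n_tokens x d) to a
    fixed-size representation (k x d) via structured self-attention:
        A = softmax(W2 @ tanh(W1 @ H.T))   # (k x n) attention weights
        M = A @ H                          # (k x d) bridge matrix
    W1 and W2 are the shared, language-independent parameters.
    """
    A = softmax(W2 @ np.tanh(W1 @ H.T), axis=-1)
    return A @ H

rng = np.random.default_rng(0)
d, dh, k = 8, 16, 4            # hidden size, attention dim, bridge heads (illustrative)
W1 = rng.normal(size=(dh, d))
W2 = rng.normal(size=(k, dh))

short = rng.normal(size=(5, d))    # a 5-token sentence from one encoder
long = rng.normal(size=(12, d))    # a 12-token sentence from another

# Both map to the same (k x d) shape, so any decoder can attend over it.
print(attention_bridge(short, W1, W2).shape)  # (4, 8)
print(attention_bridge(long, W1, W2).shape)   # (4, 8)
```

Because the bridge output has a fixed shape regardless of the source language or sentence length, each decoder only ever attends over this shared matrix, which is what enables the zero-shot transfer reported in the abstract.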
Original language: English
Title of host publication: The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop
Editors: Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Johannes Welbl, Alexis Conneau, Xiang Ren, Marek Rei
Number of pages: 7
Place of publication: Stroudsburg
Publisher: Association for Computational Linguistics
Publication date: 2019
Pages: 33-39
ISBN (electronic): 978-1-950737-35-2
Status: Published - 2019
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Representation Learning for NLP - Florence, Italy
Duration: 2 Aug 2019 - 2 Aug 2019
Conference number: 4

Fields of science

  • 113 Computer and information sciences

Cite this

Vazquez Carrillo, J. R., Raganato, A., Tiedemann, J., & Creutz, M. (2019). Multilingual NMT with a language-independent attention bridge. In I. Augenstein, S. Gella, S. Ruder, K. Kann, B. Can, J. Welbl, A. Conneau, X. Ren, ... M. Rei (Eds.), The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop (pp. 33-39). Stroudsburg: Association for Computational Linguistics.
@inproceedings{bc9b500d4be549dd988ba08a44401517,
title = "Multilingual NMT with a language-independent attention bridge",
abstract = "In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by means of incorporating an intermediate {\em attention bridge} that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach in a systematic way with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its ability of abstraction and transfer learning.",
keywords = "113 Computer and information sciences, Natural language processing, Multilingual machine translation",
author = "{Vazquez Carrillo}, {Juan Raul} and Alessandro Raganato and J{\"o}rg Tiedemann and Mathias Creutz",
year = "2019",
language = "English",
pages = "33--39",
editor = "Augenstein, {Isabelle} and Gella, {Spandana} and Ruder, {Sebastian} and Kann, {Katharina} and Can, {Burcu} and Welbl, {Johannes} and Conneau, {Alexis} and Ren, {Xiang} and Rei, {Marek}",
booktitle = "The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)",
publisher = "Association for Computational Linguistics",
address = "Stroudsburg",

}



