Multilingual NMT with a language-independent attention bridge

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by means of incorporating an intermediate "attention bridge" that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach in a systematic way with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its ability of abstraction and transfer learning.
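The abstract describes the bridge only as a shared layer reached via self-attention, so the PyTorch module below is a hypothetical sketch rather than the authors' implementation: it assumes a structured self-attention pooling in the spirit of Lin et al. (2017), and the class name AttentionBridge, the dimensions attn_dim and n_heads, and their default values are choices made for this example, not taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionBridge(nn.Module):
    """Shared pooling layer: variable-length encoder states in, a fixed-size
    (n_heads x hidden_dim) sentence matrix out. Hypothetical sketch only."""

    def __init__(self, hidden_dim: int, attn_dim: int = 512, n_heads: int = 10):
        super().__init__()
        self.w1 = nn.Linear(hidden_dim, attn_dim, bias=False)  # W1: d_h -> d_a
        self.w2 = nn.Linear(attn_dim, n_heads, bias=False)     # W2: d_a -> k heads

    def forward(self, enc_states: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden_dim); mask: (batch, seq_len), 1 = real token
        scores = self.w2(torch.tanh(self.w1(enc_states)))      # (batch, seq_len, k)
        scores = scores.masked_fill(mask.unsqueeze(-1) == 0, float("-inf"))
        attn = F.softmax(scores, dim=1)                        # normalize over time steps
        # M = A^T H has shape (batch, n_heads, hidden_dim), independent of source
        # length, so any decoder can attend to it regardless of the source language.
        return attn.transpose(1, 2) @ enc_states

# Example: two sentences of 7 tokens each, pooled to a fixed 10 x 256 matrix.
bridge = AttentionBridge(hidden_dim=256)
h = torch.randn(2, 7, 256)
m = torch.ones(2, 7, dtype=torch.long)
print(bridge(h, m).shape)  # torch.Size([2, 10, 256])

Because every language-specific encoder writes into the same bridge and every decoder reads only from its fixed-size output, encoders and decoders can be paired freely at test time; this is the property the abstract credits for the zero-shot translation results.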
Original language: English
Title of host publication: The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop
Editors: Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Johannes Welbl, Alexis Conneau, Xiang Ren, Marek Rei
Number of pages: 7
Place of Publication: Stroudsburg
Publisher: Association for Computational Linguistics
Publication date: 2019
Pages: 33-39
ISBN (Electronic): 978-1-950737-35-2
Publication status: Published - 2019
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Representation Learning for NLP - Florence, Italy
Duration: 2 Aug 2019 – 2 Aug 2019
Conference number: 4

Fields of Science

  • 113 Computer and information sciences
  • Natural language processing
  • Multilingual machine translation

Cite this

Vazquez Carrillo, J. R., Raganato, A., Tiedemann, J., & Creutz, M. (2019). Multilingual NMT with a language-independent attention bridge. In I. Augenstein, S. Gella, S. Ruder, K. Kann, B. Can, J. Welbl, A. Conneau, X. Ren, ... M. Rei (Eds.), The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop (pp. 33-39). Stroudsburg: Association for Computational Linguistics.
Vazquez Carrillo, Juan Raul ; Raganato, Alessandro ; Tiedemann, Jörg ; Creutz, Mathias. / Multilingual NMT with a language-independent attention bridge. The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. editor / Isabelle Augenstein ; Spandana Gella ; Sebastian Ruder ; Katharina Kann ; Burcu Can ; Johannes Welbl ; Alexis Conneau ; Xiang Ren ; Marek Rei. Stroudsburg : Association for Computational Linguistics, 2019. pp. 33-39
@inproceedings{bc9b500d4be549dd988ba08a44401517,
title = "Multilingual NMT with a language-independent attention bridge",
abstract = "In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by means of incorporating an intermediate {\em attention bridge} that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach in a systematic way with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its ability of abstraction and transfer learning.",
keywords = "113 Computer and information sciences, Natural language processing, Multilingual machine translation",
author = "{Vazquez Carrillo}, {Juan Raul} and Alessandro Raganato and J{\"o}rg Tiedemann and Mathias Creutz",
year = "2019",
language = "English",
pages = "33--39",
editor = "Augenstein, {Isabelle } and Gella, {Spandana } and Ruder, {Sebastian } and Kann, {Katharina } and Can, {Burcu } and Welbl, {Johannes } and Conneau, {Alexis } and Ren, {Xiang } and Rei, {Marek }",
booktitle = "The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)",
publisher = "Association for Computational Linguistics",
address = "International",

}

Vazquez Carrillo, JR, Raganato, A, Tiedemann, J & Creutz, M 2019, Multilingual NMT with a language-independent attention bridge. in I Augenstein, S Gella, S Ruder, K Kann, B Can, J Welbl, A Conneau, X Ren & M Rei (eds), The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. Association for Computational Linguistics, Stroudsburg, pp. 33-39, Workshop on Representation Learning for NLP, Florence, Italy, 02/08/2019.

Multilingual NMT with a language-independent attention bridge. / Vazquez Carrillo, Juan Raul; Raganato, Alessandro; Tiedemann, Jörg; Creutz, Mathias.

The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. ed. / Isabelle Augenstein; Spandana Gella; Sebastian Ruder; Katharina Kann; Burcu Can; Johannes Welbl; Alexis Conneau; Xiang Ren; Marek Rei. Stroudsburg : Association for Computational Linguistics, 2019. p. 33-39.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

TY - GEN

T1 - Multilingual NMT with a language-independent attention bridge

AU - Vazquez Carrillo, Juan Raul

AU - Raganato, Alessandro

AU - Tiedemann, Jörg

AU - Creutz, Mathias

PY - 2019

Y1 - 2019

N2 - In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by means of incorporating an intermediate "attention bridge" that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach in a systematic way with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its ability of abstraction and transfer learning.

AB - In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by means of incorporating an intermediate "attention bridge" that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach in a systematic way with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its ability of abstraction and transfer learning.

KW - 113 Computer and information sciences

KW - Natural language processing

KW - Multilingual machine translation

M3 - Conference contribution

SP - 33

EP - 39

BT - The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop

A2 - Augenstein, Isabelle

A2 - Gella, Spandana

A2 - Ruder, Sebastian

A2 - Kann, Katharina

A2 - Can, Burcu

A2 - Welbl, Johannes

A2 - Conneau, Alexis

A2 - Ren, Xiang

A2 - Rei, Marek

PB - Association for Computational Linguistics

CY - Stroudsburg

ER -

Vazquez Carrillo JR, Raganato A, Tiedemann J, Creutz M. Multilingual NMT with a language-independent attention bridge. In Augenstein I, Gella S, Ruder S, Kann K, Can B, Welbl J, Conneau A, Ren X, Rei M, editors, The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. Stroudsburg: Association for Computational Linguistics. 2019. p. 33-39