An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › peer-reviewed

Abstract

In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as a fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that performance in translation does correlate with performance in trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that the training procedure on the downstream task enables the model to identify the encoded information that is useful for the specific task, whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.
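The "inner attention" in the title refers to a self-attentive pooling layer shared across all language pairs: a small attention network reads the variable-length encoder states and compresses them into k weighted sums, so the size of the fixed sentence representation is governed by the number of attention heads k rather than by sentence length. The following PyTorch sketch illustrates that reading of the abstract; the class name InnerAttentionBridge, the layer dimensions, and the head count are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class InnerAttentionBridge(nn.Module):
    """Self-attentive pooling over encoder states, producing a
    fixed-size sentence representation of k attention heads.
    Minimal sketch; sizes are illustrative, not from the paper."""

    def __init__(self, hidden_dim: int, attn_dim: int = 512, num_heads: int = 10):
        super().__init__()
        self.w1 = nn.Linear(hidden_dim, attn_dim, bias=False)
        self.w2 = nn.Linear(attn_dim, num_heads, bias=False)

    def forward(self, enc_states: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden_dim); mask: (batch, seq_len), bool.
        scores = self.w2(torch.tanh(self.w1(enc_states)))   # (batch, seq_len, num_heads)
        scores = scores.masked_fill(~mask.unsqueeze(-1), float("-inf"))
        attn = torch.softmax(scores, dim=1)                 # attention over source positions
        # Fixed-size output: (batch, num_heads, hidden_dim), independent of seq_len.
        return attn.transpose(1, 2) @ enc_states

# Example: two sentences of different lengths map to the same output shape.
enc = torch.randn(2, 7, 512)
mask = torch.tensor([[True] * 7, [True] * 5 + [False] * 2])
bridge = InnerAttentionBridge(hidden_dim=512, num_heads=10)
sent_repr = bridge(enc, mask)   # (2, 10, 512) regardless of sentence length

Under this view, varying num_heads is what the abstract calls varying "the size of the shared layer": more heads give a larger intermediate representation.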
Original language: English
Title: The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop
Editors: Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Johannes Welbl, Alexis Conneau, Xiang Ren, Marek Rei
Number of pages: 6
Place of publication: Stroudsburg
Publisher: Association for Computational Linguistics
Publication date: 1 August 2019
Pages: 27-32
ISBN (electronic): 978-1-950737-35-2
Status: Published - 1 August 2019
OKM publication type: A4 Article in conference proceedings
Event: Workshop on Representation Learning for NLP - Florence, Italy
Duration: 2 August 2019 – 2 August 2019
Conference number: 4

Fields of Science

  • 113 Computer and information sciences
  • 6121 Languages

Cite this

Raganato, A., Vázquez, R., Creutz, M., & Tiedemann, J. (2019). An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. In I. Augenstein, S. Gella, S. Ruder, K. Kann, B. Can, J. Welbl, A. Conneau, X. Ren, ... M. Rei (Eds.), The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop (pp. 27-32). Stroudsburg: Association for Computational Linguistics.
Raganato, Alessandro ; Vázquez, Raúl ; Creutz, Mathias ; Tiedemann, Jörg. / An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. Editor / Isabelle Augenstein ; Spandana Gella ; Sebastian Ruder ; Katharina Kann ; Burcu Can ; Johannes Welbl ; Alexis Conneau ; Xiang Ren ; Marek Rei. Stroudsburg : Association for Computational Linguistics, 2019. pp. 27-32
@inproceedings{c14916d412e2492db8ea5ad6fe5c735c,
title = "An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation",
abstract = "In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that the performance in translation does correlate with trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that the training procedure on the downstream task enables the model to identify the encoded information that is useful for the specific task whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.",
keywords = "113 Computer and information sciences, 6121 Languages",
author = "Alessandro Raganato and Ra{\'u}l V{\'a}zquez and Mathias Creutz and J{\"o}rg Tiedemann",
year = "2019",
month = "8",
day = "1",
language = "English",
pages = "27--32",
editor = "Isabelle Augenstein and Gella, {Spandana } and Ruder, {Sebastian } and Kann, {Katharina } and Can, {Burcu } and Welbl, {Johannes } and Alexis Conneau and Ren, {Xiang } and Rei, {Marek }",
booktitle = "The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)",
publisher = "Association for Computational Linguistics",
address = "United States",

}

Raganato, A, Vázquez, R, Creutz, M & Tiedemann, J 2019, An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. in I Augenstein, S Gella, S Ruder, K Kann, B Can, J Welbl, A Conneau, X Ren & M Rei (eds), The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. Association for Computational Linguistics, Stroudsburg, pp. 27-32, Workshop on Representation Learning for NLP, Florence, Italy, 02/08/2019.

An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. / Raganato, Alessandro; Vázquez, Raúl; Creutz, Mathias; Tiedemann, Jörg.

The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. ed. / Isabelle Augenstein; Spandana Gella; Sebastian Ruder; Katharina Kann; Burcu Can; Johannes Welbl; Alexis Conneau; Xiang Ren; Marek Rei. Stroudsburg : Association for Computational Linguistics, 2019. pp. 27-32.

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › peer-reviewed

TY - GEN

T1 - An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

AU - Raganato, Alessandro

AU - Vázquez, Raúl

AU - Creutz, Mathias

AU - Tiedemann, Jörg

PY - 2019/8/1

Y1 - 2019/8/1

N2 - In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that the performance in translation does correlate with trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that the training procedure on the downstream task enables the model to identify the encoded information that is useful for the specific task whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.

AB - In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that the performance in translation does correlate with trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that the training procedure on the downstream task enables the model to identify the encoded information that is useful for the specific task whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.

KW - 113 Computer and information sciences

KW - 6121 Languages

M3 - Conference contribution

SP - 27

EP - 32

BT - The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

A2 - Augenstein, Isabelle

A2 - Gella, Spandana

A2 - Ruder, Sebastian

A2 - Kann, Katharina

A2 - Can, Burcu

A2 - Welbl, Johannes

A2 - Conneau, Alexis

A2 - Ren, Xiang

A2 - Rei, Marek

PB - Association for Computational Linguistics

CY - Stroudsburg

ER -

Raganato A, Vázquez R, Creutz M, Tiedemann J. An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. In Augenstein I, Gella S, Ruder S, Kann K, Can B, Welbl J, Conneau A, Ren X, Rei M, editors, The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. Stroudsburg: Association for Computational Linguistics. 2019. p. 27-32