I Have an Attention Bridge to Sell You: Generalization Capabilities of Modular Translation Architectures

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Modularity is a paradigm of machine translation with the potential of bringing forth models that are large at training time and small during inference. Within this field of study, modular approaches, and in particular attention bridges, have been argued to improve the generalization capabilities of models by fostering language-independent representations. In the present paper, we study whether modularity affects translation quality, as well as how well modular architectures generalize across different evaluation scenarios. For a given computational budget, we find non-modular architectures to be always comparable or preferable to all modular designs we study.
Original language: English
Title of host publication: Proceedings of the Fifth Workshop on Insights from Negative Results in NLP
Editors: Shabnam Tafreshi, Arjun Akula, João Sedoc, Aleksandr Drozd, Anna Rogers, Anna Rumshisky
Number of pages: 7
Place of Publication: Kerrville
Publisher: The Association for Computational Linguistics
Publication date: 1 Jun 2024
Pages: 34-40
ISBN (Electronic): 979-8-89176-102-5
Publication status: Published - 1 Jun 2024
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Insights from Negative Results in NLP - Mexico City, Mexico
Duration: 20 Jun 2024 → 20 Jun 2024
Conference number: 5

Fields of Science

  • 6121 Languages
  • 113 Computer and information sciences
