The MeMAD Submission to the WMT18 Multimodal Translation Task

Stig-Arne Grönroos, Benoit Huet, Mikko Kurimo, Jorma Laaksonen, Bernard Merialdo, Phu Pham, Mats Sjöberg, Umut Sulubacak, Jörg Tiedemann, Raphaël Troncy, Juan Raúl Vázquez Carrillo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


This paper describes the MeMAD project entry to the WMT Multimodal Machine Translation Shared Task.

We propose adapting the Transformer neural machine translation (NMT) architecture to a multi-modal setting. We also describe the preliminary experiments with text-only translation systems that led us to this choice.

Our system is the top-scoring one for both English-to-German and English-to-French on flickr18, according to the automatic metrics.

Our experiments show that the effect of the visual features in our system is small. Our largest gains come from the quality of the underlying text-only NMT system. We find that appropriate use of additional data is effective.
Original language: English
Title of host publication: Proceedings of the Third Conference on Machine Translation (WMT) : Shared Task Papers
Editors: Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Number of pages: 9
Place of Publication: Stroudsburg
Publisher: The Association for Computational Linguistics
Publication date: 1 Nov 2018
ISBN (Electronic): 978-1-948087-81-0
Publication status: Published - 1 Nov 2018
MoE publication type: A4 Article in conference proceedings
Event: Conference on Machine Translation - 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), Brussels, Belgium
Duration: 31 Oct 2018 to 1 Nov 2018
Conference number: 3

Fields of Science

  • 113 Computer and information sciences
  • 6121 Languages
