The University of Helsinki submissions to the WMT19 news translation task

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-review

Abstract

In this paper, we present the University of Helsinki submissions to the WMT 2019 shared task on news translation in three language pairs: English-German, English-Finnish and Finnish-English. This year, we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained both sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish.
Original language: English
Title of host publication: Fourth Conference on Machine Translation: Proceedings of the Conference
Number of pages: 12
Place of publication: Stroudsburg
Publisher: Association for Computational Linguistics
Publication date: 1 Aug 2019
Pages: 611-622
ISBN (electronic): 978-1-950737-27-7
Status: Published - 1 Aug 2019
MoE publication type: A4 Article in conference proceedings
Event: Fourth Conference on Machine Translation: WMT19 - Florence, Italy
Duration: 1 Aug 2019 to 2 Aug 2019
Conference number: 4

Fields of Science

  • 113 Computer and information sciences
  • 6121 Languages

Cite this

Talman, A., Sulubacak, U., Vazquez, R., Scherrer, Y., Virpioja, S., Raganato, A., ... Tiedemann, J. (2019). The University of Helsinki submissions to the WMT19 news translation task. In Fourth Conference on Machine Translation: Proceedings of the Conference (pp. 611-622). Stroudsburg: Association for Computational Linguistics.
Talman, Aarne ; Sulubacak, Umut ; Vazquez, Raul ; Scherrer, Yves ; Virpioja, Sami ; Raganato, Alessandro ; Hurskainen, Arvi ; Tiedemann, Jörg. / The University of Helsinki submissions to the WMT19 news translation task. Fourth Conference on Machine Translation: Proceedings of the Conference. Stroudsburg : Association for Computational Linguistics, 2019. pp. 611-622
@inproceedings{8326a39257134d458dae56c457b37630,
title = "The University of Helsinki submissions to the WMT19 news translation task",
abstract = "In this paper, we present the University of Helsinki submissions to the WMT 2019 shared task on news translation in three language pairs: English-German, English-Finnish and Finnish-English. This year, we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained both sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish.",
keywords = "113 Computer and information sciences, 6121 Languages",
author = "Aarne Talman and Umut Sulubacak and Raul Vazquez and Yves Scherrer and Sami Virpioja and Alessandro Raganato and Arvi Hurskainen and J{\"o}rg Tiedemann",
year = "2019",
month = "8",
day = "1",
language = "English",
pages = "611--622",
booktitle = "Fourth Conference on Machine Translation",
publisher = "Association for Computational Linguistics",
address = "Stroudsburg",

}

Talman, A, Sulubacak, U, Vazquez, R, Scherrer, Y, Virpioja, S, Raganato, A, Hurskainen, A & Tiedemann, J 2019, The University of Helsinki submissions to the WMT19 news translation task. in Fourth Conference on Machine Translation: Proceedings of the Conference. Association for Computational Linguistics, Stroudsburg, pp. 611-622, Florence, Italy, 01/08/2019.

The University of Helsinki submissions to the WMT19 news translation task. / Talman, Aarne; Sulubacak, Umut; Vazquez, Raul; Scherrer, Yves; Virpioja, Sami; Raganato, Alessandro; Hurskainen, Arvi; Tiedemann, Jörg.

Fourth Conference on Machine Translation: Proceedings of the Conference. Stroudsburg : Association for Computational Linguistics, 2019. pp. 611-622.

TY - GEN
T1 - The University of Helsinki submissions to the WMT19 news translation task
AU - Talman, Aarne
AU - Sulubacak, Umut
AU - Vazquez, Raul
AU - Scherrer, Yves
AU - Virpioja, Sami
AU - Raganato, Alessandro
AU - Hurskainen, Arvi
AU - Tiedemann, Jörg
PY - 2019/8/1
Y1 - 2019/8/1
N2 - In this paper, we present the University of Helsinki submissions to the WMT 2019 shared task on news translation in three language pairs: English-German, English-Finnish and Finnish-English. This year, we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained both sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish.
AB - In this paper, we present the University of Helsinki submissions to the WMT 2019 shared task on news translation in three language pairs: English-German, English-Finnish and Finnish-English. This year, we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained both sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish.
KW - 113 Computer and information sciences
KW - 6121 Languages
M3 - Conference contribution
SP - 611
EP - 622
BT - Fourth Conference on Machine Translation
PB - Association for Computational Linguistics
CY - Stroudsburg
ER -

Talman A, Sulubacak U, Vazquez R, Scherrer Y, Virpioja S, Raganato A et al. The University of Helsinki submissions to the WMT19 news translation task. In Fourth Conference on Machine Translation: Proceedings of the Conference. Stroudsburg: Association for Computational Linguistics. 2019. p. 611-622