Morphological Segmentation and OPUS for Finnish-English Machine Translation

Jörg Tiedemann, Filip Ginter, Jenna Kanerva

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

This paper describes baseline systems for Finnish-English and English-Finnish machine translation using standard phrase-based and factored models including morphological features. We experiment with compound splitting and morphological segmentation and study the effect of adding noisy out-of-domain data to the parallel and the monolingual training data. Our results stress the importance of training data and demonstrate the effectiveness of morphological pre-processing of Finnish.

Original language: English
Title of host publication: Proceedings of the Tenth Workshop on Statistical Machine Translation
Number of pages: 7
Place of Publication: New York
Publisher: The Association for Computational Linguistics
Publication date: 1 Sep 2015
Pages: 177-183
Publication status: Published - 1 Sep 2015
Externally published: Yes
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Statistical Machine Translation - Lisboa, Portugal
Duration: 17 Sep 2015 to 18 Sep 2015
Conference number: 10

Fields of Science

  • 6121 Languages
