Character-based PSMT for Closely Related Languages

Research output: Article in book/report/conference proceedings, Conference article, Scientific, peer-reviewed

Abstract

Translating unknown words between related languages using a character-based statistical machine translation model can be beneficial. In this paper, we describe a simple method for combining character-based models with standard word-based models to increase the coverage of a phrase-based SMT system. Using this approach, we show a modest improvement when translating between Norwegian and Swedish. The potential of character-based models for closely related languages is also illustrated by applying the character model on its own. The performance of such an approach is similar to the word-level baseline, and its output is closer to the reference in terms of string similarity.
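As an illustration of the idea in the abstract, the following minimal sketch shows one way a character-level model could serve as a fallback for words that a word-level model does not cover. It is not the paper's implementation: the names `WORD_LEXICON`, `CHAR_MAP`, `toy_char_model`, and `translate` are hypothetical, and the dictionary lookups are toy stand-ins for trained phrase-based models.

```python
# Toy stand-ins for trained models (hypothetical, not taken from the paper).
WORD_LEXICON = {"jeg": "jag", "liker": "gillar"}  # word-level Norwegian -> Swedish
CHAR_MAP = {"ø": "ö", "æ": "ä"}                   # character-level correspondences


def to_char_tokens(word):
    """Split a word into space-separated characters, the token unit of a character-level model."""
    return " ".join(word)


def from_char_tokens(char_string):
    """Join space-separated characters back into a single word."""
    return "".join(char_string.split(" "))


def toy_char_model(char_string):
    """Assumed character-level 'translation': map each character independently."""
    return " ".join(CHAR_MAP.get(c, c) for c in char_string.split(" "))


def translate(words):
    """Use the word-level lexicon where possible; fall back to the character model for unknown words."""
    output = []
    for word in words:
        if word in WORD_LEXICON:
            output.append(WORD_LEXICON[word])
        else:
            chars = to_char_tokens(word)
            output.append(from_char_tokens(toy_char_model(chars)))
    return output
```

In this sketch, a call such as `translate(["jeg", "liker", "øl"])` sends the unknown word "øl" through the character path, while the known words are handled by the word-level lexicon. A real system would replace both lookups with trained models and let the decoder choose between them.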
Original language: other / unknown
Title: Unknown host publication
Editors: Lluís Marqués, Harold Somers
Publication date: 1 May 2009
Pages: 12-19
Status: Published - 1 May 2009
MoE publication type: A4 Article in conference proceedings

Cite this

Tiedemann, J. (2009). Character-based PSMT for Closely Related Languages. In L. Marqués, & H. Somers (Eds.), Unknown host publication (pp. 12-19)