Bitexts as Semantic Mirrors

Jörg Tiedemann, Lonneke van der Plas, Begoña Villada Moirón

Research output: Article in book/report/conference proceedings › Conference article › Scientific

Abstract

The importance of parallel corpora in machine translation research is widely recognized and undisputed. The amount of research on data-driven techniques in MT has grown tremendously since the introduction of the first automatic alignment techniques in the early 1990s, which finally made it possible to create large and reasonably clean bitexts from scratch without human intervention. Nowadays, it is hard to imagine an era before statistical machine translation, and Google Translate is at everybody's fingertips. However, there is more to parallel data than just translation. Enthusiastic researchers in the early days came up with ever new ideas for using parallel data in problems of natural language understanding and processing. Some of these ideas almost disappear in the flood of SMT papers coming out every year. This paper tries to remind us of some other applications that illustrate the amazing utility of parallel corpora.
Original language: English
Title: Unknown host publication
Publication date: 1 October 2013
Status: Published - 1 October 2013
Externally published: Yes
OKM publication type: B3 Non-refereed article in conference proceedings
Event: EMNLP 2013: Conference on Empirical Methods in Natural Language Processing - Seattle, United States (USA)
Duration: 18 October 2013 to 21 October 2013

Fields of science

  • 6121 Languages
  • 113 Computer and information sciences
