Optimizing the description of multi-word expressions in English

Research output: Working paper (Scientific)

Abstract

The description of multi-word expressions (MWEs) is a necessary phase in rule-based machine translation. Because the concept of an MWE covers several types of word clusters, it is not self-evident how they should be described. One approach is to isolate multi-word expressions after morphological analysis but before disambiguation. If the part-of-speech (POS) ambiguity of the language is minimal, this method is suitable and perhaps even optimal. If the POS ambiguity is extensive, as it is in English, the method is hardly optimal. A better solution is to isolate MWEs in two phases. This method is discussed and demonstrated in this report.
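The two-phase idea can be illustrated with a toy pipeline. This is only a sketch under stated assumptions, not the method of the report: the abstract does not say what each phase handles, so the split used here (fixed expressions joined before disambiguation, POS-dependent expressions joined after it) is an assumption, and the lexicon, tag names, rules, and function names are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, Set[str]]  # (surface form, candidate POS tags)

# Hypothetical toy lexicon standing in for a real morphological analyser.
LEXICON: Dict[str, Set[str]] = {
    "she": {"PRON"}, "will": {"AUX", "N"}, "look": {"V", "N"},
    "up": {"PART", "PREP"}, "the": {"DET"}, "word": {"N", "V"},
    "in": {"PREP"}, "spite": {"N"}, "of": {"PREP"}, "noise": {"N"},
}
# Assumed phase 1 entries: recognisable from surface forms alone.
FIXED_MWES = {("in", "spite", "of"): "PREP"}
# Assumed phase 2 entries: (word, required POS) pairs, checked after disambiguation.
POS_MWES = {(("look", "V"), ("up", "PART")): "V"}


def analyse(words: List[str]) -> List[Token]:
    """Attach every candidate tag to each word; unknown words get UNK."""
    return [(w, set(LEXICON.get(w, {"UNK"}))) for w in words]


def join_fixed(tokens: List[Token]) -> List[Token]:
    """Phase 1: merge fixed expressions while tags are still ambiguous."""
    out, i = [], 0
    while i < len(tokens):
        for pattern, tag in FIXED_MWES.items():
            n = len(pattern)
            if tuple(w for w, _ in tokens[i:i + n]) == pattern:
                out.append(("_".join(pattern), {tag}))
                i += n
                break
        else:
            out.append(tokens[i])
            i += 1
    return out


def disambiguate(tokens: List[Token]) -> List[Token]:
    """Toy left-context rules; each token ends up with exactly one tag."""
    out: List[Token] = []
    for word, tags in tokens:
        prev = out[-1][1] if out else set()
        if len(tags) > 1:
            if prev == {"DET"} and "N" in tags:
                tags = {"N"}
            elif prev == {"PRON"} and "AUX" in tags:
                tags = {"AUX"}
            elif prev == {"AUX"} and "V" in tags:
                tags = {"V"}
            elif prev == {"V"} and "PART" in tags:
                tags = {"PART"}
            else:
                tags = {sorted(tags)[0]}  # arbitrary but deterministic fallback
        out.append((word, tags))
    return out


def join_pos_dependent(tokens: List[Token]) -> List[Token]:
    """Phase 2: merge expressions whose members must carry specific tags."""
    out, i = [], 0
    while i < len(tokens):
        for pattern, tag in POS_MWES.items():
            n = len(pattern)
            window = tokens[i:i + n]
            if len(window) == n and all(
                w == pw and tags == {pt}
                for (w, tags), (pw, pt) in zip(window, pattern)
            ):
                out.append(("_".join(pw for pw, _ in pattern), {tag}))
                i += n
                break
        else:
            out.append(tokens[i])
            i += 1
    return out


def run_pipeline(words: List[str]) -> List[Tuple[str, str]]:
    """Analyse, join fixed MWEs, disambiguate, then join POS-dependent MWEs."""
    tokens = join_pos_dependent(disambiguate(join_fixed(analyse(words))))
    return [(w, next(iter(tags))) for w, tags in tokens]
```

In this sketch, `run_pipeline("she will look up the word in spite of noise".split())` joins "in spite of" in the first phase and "look up" in the second, whereas "look up" in `["the", "look", "up"]` stays separate because "look" is resolved as a noun there. Deferring such cases until after disambiguation is the point of the second phase when POS ambiguity is extensive.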
Original language: English
Place of Publication: Helsinki
Publisher: University of Helsinki, Institute for Asian and African Studies
Number of pages: 18
Publication status: Published - 2020
MoE publication type: D4 Published development or research report or study

Fields of Science

  • 6121 Languages
