Automatic Collocation Extraction and Classification of Automatically Obtained Bigrams

Daria Kormacheva, Lidia Pivovarova, Mihail Kopotev

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

This paper focuses on the automatic determination of the distributional preferences of words in Russian. We compare six measures for collocation extraction, some of which are widely known, while others are less prominent or new. For these measures, we evaluate the semantic stability of automatically obtained bigrams beginning with single-token prepositions. Manual annotation of the first 100 bigrams and comparison with a dictionary of multi-word expressions are used as evaluation methods. Finally, for the purpose of error analysis, two prepositions are investigated in some detail.
Original language: English
Title of host publication: Proceedings: Workshop on Computational, Cognitive, and Linguistic Approaches to the Analysis of Complex Words and Collocations (CCLCC 2014)
Editors: Verena Henrich, Erhard Hinrichs
Number of pages: 7
Place of Publication: Tübingen
Publisher: University of Tübingen
Publication date: 2014
Pages: 27-33
Publication status: Published - 2014
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Computational, Cognitive, and Linguistic Approaches to the Analysis of Complex Words and Collocations - Tübingen, Germany
Duration: 11 Aug 2014 → 15 Aug 2014
Conference number: CCLCC 2014

Fields of Science

  • 6121 Languages

Projects

COLLOCATIONS, COLLIGATIONS AND CORPORA (CoCoCo)

Kopotev, M., Yangarber, R., Kormacheva, D., Pivovarova, L. & Pierce, M.

01/09/2012 → …

Project: Research project

Cite this

Kormacheva, D., Pivovarova, L., & Kopotev, M. (2014). Automatic Collocation Extraction and Classification of Automatically Obtained Bigrams. In V. Henrich, & E. Hinrichs (Eds.), Proceedings: Workshop on Computational, Cognitive, and Linguistic Approaches to the Analysis of Complex Words and Collocations (CCLCC 2014) (pp. 27-33). University of Tübingen. http://www.sfs.uni-tuebingen.de/~vhenrich/cclcc_2014/