ParaPhraser: Russian paraphrase corpus and shared task

Lidia Pivovarova, Ekaterina Pronoza, Elena Yagunova, Anton Pronoza

Research output: Chapter in book/report/conference proceeding; Conference contribution; Scientific; Peer-reviewed

Abstract

The paper describes the results of the First Russian Paraphrase Detection Shared Task, held in St. Petersburg, Russia, in October 2016. Research on paraphrase extraction, detection and generation has a long and successful history, whereas interest in the problem has surged only recently in the Russian computational linguistics community. We try to bridge this gap by introducing ParaPhraser.ru, a project dedicated to collecting a Russian paraphrase corpus, and by organizing a Paraphrase Detection Shared Task that uses the corpus as training data. The participants applied a wide variety of techniques to paraphrase detection, from rule-based approaches to deep learning. The results reflect the following tendencies: the best scores are obtained by traditional classifiers combined with fine-grained linguistic features, although complex neural networks, shallow methods and purely technical methods also demonstrate competitive results.
Original language: English
Title of host publication: Artificial Intelligence and Natural Language: 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers
Editors: Andrey Filchenkov, Lidia Pivovarova, Jan Žižka
Place of publication: Cham
Publisher: Springer
Publication date: 2017
Pages: 211-225
ISBN (print): 978-3-319-71745-6
ISBN (electronic): 978-3-319-71746-3
DOI
Status: Published - 2017
MoE publication type: A4 Article in conference proceedings
Event: Conference on Artificial Intelligence and Natural Language - St. Petersburg, Russia
Duration: 20 Sep 2017 to 23 Sep 2017
Conference number: 6
http://ainlconf.ru

Publication series

Name: Communications in Computer and Information Science
Publisher: Springer
Volume: 789
ISSN (print): 1865-0929

Fields of science

  • 6121 Languages
  • 113 Computer and information sciences
