ParaPhraser: Russian paraphrase corpus and shared task

Lidia Pivovarova, Ekaterina Pronoza, Elena Yagunova, Anton Pronoza

Research output: Article in book/report/conference proceedings › Conference article › Scientific › Peer-reviewed

Abstract

The paper describes the results of the First Russian Paraphrase Detection Shared Task, held in St. Petersburg, Russia, in October 2016. Research on paraphrase extraction, detection and generation has been developing successfully for a long time, whereas the Russian computational linguistics community has shown a surge of interest in the problem only recently. We try to close this gap by introducing the ParaPhraser.ru project, dedicated to collecting a Russian paraphrase corpus, and by organizing a Paraphrase Detection Shared Task that uses the corpus as training data. The participants applied a wide variety of techniques to paraphrase detection, from rule-based approaches to deep learning. The results reflect the following tendencies: the best scores are obtained by traditional classifiers combined with fine-grained linguistic features; however, complex neural networks, shallow methods and purely technical methods also demonstrate competitive results.
Original language: English
Title of host publication: Artificial Intelligence and Natural Language: 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers
Editors: Andrey Filchenkov, Lidia Pivovarova, Jan Žižka
Place of publication: Cham
Publisher: Springer
Publication date: 2017
Pages: 211-225
ISBN (print): 978-3-319-71745-6
ISBN (electronic): 978-3-319-71746-3
DOI (permanent link): https://doi.org/10.1007/978-3-319-71746-3_18
Status: Published - 2017
MoE publication type: A4 Article in conference proceedings
Event: Conference on Artificial Intelligence and Natural Language, St. Petersburg, Russia
Duration: 20 September 2017 to 23 September 2017
Conference number: 6
http://ainlconf.ru

Publication series

Name: Communications in Computer and Information Science
Publisher: Springer
Volume: 789
ISSN (print): 1865-0929

Fields of science

  • 6121 Languages
  • 113 Computer and information sciences

Cite this

Pivovarova, L., Pronoza, E., Yagunova, E., & Pronoza, A. (2017). ParaPhraser: Russian paraphrase corpus and shared task. In A. Filchenkov, L. Pivovarova, & J. Žižka (Eds.), Artificial Intelligence and Natural Language: 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers (pp. 211-225). (Communications in Computer and Information Science; Vol. 789). Cham: Springer. https://doi.org/10.1007/978-3-319-71746-3_18
Pivovarova, Lidia ; Pronoza, Ekaterina ; Yagunova, Elena ; Pronoza, Anton. / ParaPhraser: Russian paraphrase corpus and shared task. Artificial Intelligence and Natural Language: 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers. Editor / Andrey Filchenkov ; Lidia Pivovarova ; Jan Žižka. Cham : Springer, 2017. pp. 211-225 (Communications in Computer and Information Science).
@inproceedings{710aa7b24f9741b3b777b6ffbf0606cb,
title = "ParaPhraser: Russian paraphrase corpus and shared task",
abstract = "The paper describes the results of the First Russian Paraphrase Detection Shared Task held in St.-Petersburg, Russia, in October 2016. Research in the area of paraphrase extraction, detection and generation has been successfully developing for a long time while there has been only a recent surge of interest towards the problem in the Russian community of computational linguistics. We try to overcome this gap by introducing the project ParaPhraser.ru dedicated to the collection of Russian paraphrase corpus and organizing a Paraphrase Detection Shared Task, which uses the corpus as the training data. The participants of the task applied a wide variety of techniques to the problem of paraphrase detection, from rule-based approaches to deep learning, and results of the task reflect the following tendencies: the best scores are obtained by the strategy of using traditional classifiers combined with fine-grained linguistic features, however, complex neural networks, shallow methods and purely technical methods also demonstrate competitive results.",
keywords = "6121 Languages, 113 Computer and information sciences",
author = "Lidia Pivovarova and Ekaterina Pronoza and Elena Yagunova and Anton Pronoza",
year = "2017",
doi = "10.1007/978-3-319-71746-3_18",
language = "English",
isbn = "978-3-319-71745-6",
series = "Communications in Computer and Information Science",
publisher = "Springer",
pages = "211--225",
editor = "Andrey Filchenkov and Lidia Pivovarova and Jan Žižka",
booktitle = "Artificial Intelligence and Natural Language",
address = "Switzerland",

}

Pivovarova, L, Pronoza, E, Yagunova, E & Pronoza, A 2017, ParaPhraser: Russian paraphrase corpus and shared task. in A Filchenkov, L Pivovarova & J Žižka (eds), Artificial Intelligence and Natural Language: 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers. Communications in Computer and Information Science, vol. 789, Springer, Cham, pp. 211-225, Conference on Artificial Intelligence and Natural Language, St. Petersburg, Russia, 20/09/2017. https://doi.org/10.1007/978-3-319-71746-3_18

ParaPhraser: Russian paraphrase corpus and shared task. / Pivovarova, Lidia; Pronoza, Ekaterina ; Yagunova, Elena; Pronoza, Anton.

Artificial Intelligence and Natural Language: 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers. ed. / Andrey Filchenkov; Lidia Pivovarova; Jan Žižka. Cham : Springer, 2017. p. 211-225 (Communications in Computer and Information Science; Vol. 789).

TY - GEN

T1 - ParaPhraser: Russian paraphrase corpus and shared task

AU - Pivovarova, Lidia

AU - Pronoza, Ekaterina

AU - Yagunova, Elena

AU - Pronoza, Anton

PY - 2017

Y1 - 2017

N2 - The paper describes the results of the First Russian Paraphrase Detection Shared Task held in St.-Petersburg, Russia, in October 2016. Research in the area of paraphrase extraction, detection and generation has been successfully developing for a long time while there has been only a recent surge of interest towards the problem in the Russian community of computational linguistics. We try to overcome this gap by introducing the project ParaPhraser.ru dedicated to the collection of Russian paraphrase corpus and organizing a Paraphrase Detection Shared Task, which uses the corpus as the training data. The participants of the task applied a wide variety of techniques to the problem of paraphrase detection, from rule-based approaches to deep learning, and results of the task reflect the following tendencies: the best scores are obtained by the strategy of using traditional classifiers combined with fine-grained linguistic features, however, complex neural networks, shallow methods and purely technical methods also demonstrate competitive results.

KW - 6121 Languages

KW - 113 Computer and information sciences

U2 - 10.1007/978-3-319-71746-3_18

DO - 10.1007/978-3-319-71746-3_18

M3 - Conference contribution

SN - 978-3-319-71745-6

T3 - Communications in Computer and Information Science

SP - 211

EP - 225

BT - Artificial Intelligence and Natural Language

A2 - Filchenkov, Andrey

A2 - Pivovarova, Lidia

A2 - Žižka, Jan

PB - Springer

CY - Cham

ER -

Pivovarova L, Pronoza E, Yagunova E, Pronoza A. ParaPhraser: Russian paraphrase corpus and shared task. In Filchenkov A, Pivovarova L, Žižka J, editors, Artificial Intelligence and Natural Language: 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers. Cham: Springer. 2017. p. 211-225. (Communications in Computer and Information Science). https://doi.org/10.1007/978-3-319-71746-3_18