Multiple Admissibility in Language Learning: Judging Grammaticality using Unlabeled Data

Research output: Chapter in book/report/conference proceeding; Conference contribution; Scientific; Peer review

Abstract

We present our work on the problem of detecting Multiple Admissibility (MA) in language learning. Multiple Admissibility occurs when more than one grammatical form of a word fits syntactically and semantically in a given context. In second-language education, and in particular in intelligent tutoring systems/computer-aided language learning (ITS/CALL), systems generate exercises automatically. MA implies that multiple alternative answers are possible. We treat the problem as a grammaticality judgement task. We train a neural network with the objective of labeling sentences as grammatical or ungrammatical, using a "simulated learner corpus": a dataset with correct text and with artificial errors, generated automatically. While MA occurs commonly in many languages, this paper focuses on learning Russian. We present a detailed classification of the types of constructions in Russian in which MA is possible, and evaluate the model using a test set built from answers provided by users of the Revita language learning system.
Original language: English
Title of host publication: The 7th Workshop on Balto-Slavic Natural Language Processing: Proceedings of the Workshop
Editors: Tomaž Erjavec, Michał Marcińczuk, Preslav Nakov, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Number of pages: 11
Place of publication: Stroudsburg
Publisher: The Association for Computational Linguistics
Publication date: Aug 2019
Pages: 12-22
ISBN (electronic): 978-1-950737-41-3
DOI
Status: Published - Aug 2019
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Balto-Slavic Natural Language Processing - Florence, Italy
Duration: 2 Aug 2019 - 2 Aug 2019
Conference number: 7
http://bsnlp.cs.helsinki.fi

Fields of science

  • 113 Computer and information sciences
