Multiple Admissibility in Language Learning: Judging Grammaticality using Unlabeled Data

Research output: Chapter in book/report/conference proceeding, Conference contribution, Scientific, Peer-reviewed


We present our work on the problem of detecting Multiple Admissibility (MA) in language learning. Multiple Admissibility occurs when more than one grammatical form of a word fits syntactically and semantically in a given context. In second-language education, in particular in intelligent tutoring systems and computer-aided language learning (ITS/CALL), exercises are generated automatically. MA implies that an exercise may have multiple alternative correct answers. We treat the problem as a grammaticality judgement task. We train a neural network to label sentences as grammatical or ungrammatical, using a "simulated learner corpus": a dataset of correct text together with automatically generated artificial errors. While MA occurs commonly in many languages, this paper focuses on learning Russian. We present a detailed classification of the types of Russian constructions in which MA is possible, and evaluate the model on a test set built from answers provided by users of the Revita language learning system.
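The idea of a "simulated learner corpus" can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `simulate_learner_corpus`, the toy English paradigm table `PARADIGMS`, and the choice of one substitution per sentence are all assumptions made for illustration. Each correct sentence is kept with label 1, and a copy with one word replaced by another form of the same lemma is added with label 0.

```python
import random

# Hypothetical toy paradigm table: surface form -> other forms of the same lemma.
# A real system would obtain these from a morphological analyzer or generator.
PARADIGMS = {
    "cat": ["cats"],
    "cats": ["cat"],
    "sleeps": ["sleep", "slept"],
    "sleep": ["sleeps", "slept"],
}


def simulate_learner_corpus(sentences, paradigms, seed=0):
    """Return (tokens, label) pairs: 1 = original sentence, 0 = one form swapped."""
    rng = random.Random(seed)
    corpus = []
    for tokens in sentences:
        # The unmodified sentence is assumed to be grammatical.
        corpus.append((list(tokens), 1))
        # Positions whose token has at least one alternative form.
        candidates = [i for i, tok in enumerate(tokens) if paradigms.get(tok)]
        if not candidates:
            continue
        i = rng.choice(candidates)
        corrupted = list(tokens)
        corrupted[i] = rng.choice(paradigms[tokens[i]])
        # Assumed ungrammatical; under MA this assumption can be wrong.
        corpus.append((corrupted, 0))
    return corpus


corpus = simulate_learner_corpus([["the", "cat", "sleeps"], ["hello"]], PARADIGMS)
for tokens, label in corpus:
    print(label, " ".join(tokens))
```

Labelling every substituted form as ungrammatical is exactly where Multiple Admissibility causes trouble: some substitutions yield a sentence that is still acceptable, so the artificial negative labels are noisy.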
Title of host publication: The 7th Workshop on Balto-Slavic Natural Language Processing: Proceedings of the Workshop
Editors: Tomaž Erjavec, Michał Marcińczuk, Preslav Nakov, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Number of pages: 11
Publisher: The Association for Computational Linguistics
Publication date: Aug 2019
ISBN (electronic): 978-1-950737-41-3
Status: Published - Aug 2019
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Balto-Slavic Natural Language Processing - Florence, Italy
Duration: 2 Aug 2019 to 2 Aug 2019
Conference number: 7


  • 113 Computer and information sciences
