Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign

Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Ahmed Ali, Suwon Shon, James Glass, Yves Scherrer, Tanja Samardžić, Nikola Ljubešić, Jörg Tiedemann, Chris van der Lee, Stefan Grondelaers, Nelleke Oostdijk, Dirk Speelman, Antal van den Bosch, Ritesh Kumar, Bornini Lahiri, Mayank Jain

Research output: Chapter in book/report/conference proceeding; Conference contribution; Scientific

Abstract

We present the results and the findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The campaign was organized as part of the fifth edition of the VarDial workshop, collocated with COLING’2018. This year, the campaign included five shared tasks, including two task re-runs – Arabic Dialect Identification (ADI) and German Dialect Identification (GDI) –, and three new tasks – Morphosyntactic Tagging of Tweets (MTT), Discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). A total of 24 teams submitted runs across the five shared tasks, and contributed 22 system description papers, which were included in the VarDial workshop proceedings and are referred to in this report.
Original language: English
Title of host publication: Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects
Editors: Marcos Zampieri, Preslav Nakov, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi, Ahmed Ali
Number of pages: 17
Place of publication: Santa Fe
Publisher: Association for Computational Linguistics
Publication date: 2018
Pages: 1-17
ISBN (electronic): 978-1-948087-55-1
Status: Published - 2018
MoE publication type: B3 Non-refereed article in conference proceedings
Event: Workshop on NLP for Similar Languages, Varieties and Dialects - Santa Fe, United States
Duration: 20 Aug 2018 to 20 Aug 2018
Conference number: 5

Fields of Science

  • 6121 Languages

Cite this

Zampieri, M., Malmasi, S., Nakov, P., Ali, A., Shon, S., Glass, J., ... Jain, M. (2018). Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign. In M. Zampieri, P. Nakov, N. Ljubešić, J. Tiedemann, S. Malmasi, & A. Ali (Eds.), Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (pp. 1-17). Santa Fe: Association for Computational Linguistics.
Zampieri, Marcos ; Malmasi, Shervin ; Nakov, Preslav ; Ali, Ahmed ; Shon, Suwon ; Glass, James ; Scherrer, Yves ; Samardžić, Tanja ; Ljubešić, Nikola ; Tiedemann, Jörg ; van der Lee, Chris ; Grondelaers, Stefan ; Oostdijk, Nelleke ; Speelman, Dirk ; van den Bosch, Antal ; Kumar, Ritesh ; Lahiri, Bornini ; Jain, Mayank. / Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign. Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects. editor / Marcos Zampieri ; Preslav Nakov ; Nikola Ljubešić ; Jörg Tiedemann ; Shervin Malmasi ; Ahmed Ali. Santa Fe : Association for Computational Linguistics, 2018. pp. 1-17
@inproceedings{a48fbf25ffb444f2a499b6fa74d901f9,
title = "Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign",
abstract = "We present the results and the findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The campaign was organized as part of the fifth edition of the VarDial workshop, collocated with COLING’2018. This year, the campaign included five shared tasks, including two task re-runs – Arabic Dialect Identification (ADI) and German Dialect Identification (GDI) –, and three new tasks – Morphosyntactic Tagging of Tweets (MTT), Discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). A total of 24 teams submitted runs across the five shared tasks, and contributed 22 system description papers, which were included in the VarDial workshop proceedings and are referred to in this report.",
keywords = "6121 Languages",
author = "Marcos Zampieri and Shervin Malmasi and Preslav Nakov and Ahmed Ali and Suwon Shon and James Glass and Yves Scherrer and Tanja Samardžić and Nikola Ljubešić and J{\"o}rg Tiedemann and {van der Lee}, Chris and Stefan Grondelaers and Nelleke Oostdijk and Dirk Speelman and {van den Bosch}, Antal and Ritesh Kumar and Bornini Lahiri and Mayank Jain",
year = "2018",
language = "English",
pages = "1--17",
editor = "Marcos Zampieri and Preslav Nakov and Nikola Ljubešić and J{\"o}rg Tiedemann and Shervin Malmasi and Ahmed Ali",
booktitle = "Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects",
publisher = "Association for Computational Linguistics",
address = "International",

}

Zampieri, M, Malmasi, S, Nakov, P, Ali, A, Shon, S, Glass, J, Scherrer, Y, Samardžić, T, Ljubešić, N, Tiedemann, J, van der Lee, C, Grondelaers, S, Oostdijk, N, Speelman, D, van den Bosch, A, Kumar, R, Lahiri, B & Jain, M 2018, Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign. in M Zampieri, P Nakov, N Ljubešić, J Tiedemann, S Malmasi & A Ali (eds), Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects. Association for Computational Linguistics, Santa Fe, pp. 1-17, Workshop on NLP for Similar Languages, Varieties and Dialects, Santa Fe, United States, 20/08/2018.

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign. / Zampieri, Marcos; Malmasi, Shervin; Nakov, Preslav; Ali, Ahmed; Shon, Suwon; Glass, James; Scherrer, Yves; Samardžić, Tanja; Ljubešić, Nikola; Tiedemann, Jörg; van der Lee, Chris; Grondelaers, Stefan; Oostdijk, Nelleke; Speelman, Dirk; van den Bosch, Antal; Kumar, Ritesh; Lahiri, Bornini; Jain, Mayank.

Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects. ed. / Marcos Zampieri; Preslav Nakov; Nikola Ljubešić; Jörg Tiedemann; Shervin Malmasi; Ahmed Ali. Santa Fe : Association for Computational Linguistics, 2018. pp. 1-17.

TY - GEN

T1 - Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign

AU - Zampieri, Marcos

AU - Malmasi, Shervin

AU - Nakov, Preslav

AU - Ali, Ahmed

AU - Shon, Suwon

AU - Glass, James

AU - Scherrer, Yves

AU - Samardžić, Tanja

AU - Ljubešić, Nikola

AU - Tiedemann, Jörg

AU - van der Lee, Chris

AU - Grondelaers, Stefan

AU - Oostdijk, Nelleke

AU - Speelman, Dirk

AU - van den Bosch, Antal

AU - Kumar, Ritesh

AU - Lahiri, Bornini

AU - Jain, Mayank

PY - 2018

Y1 - 2018

AB - We present the results and the findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The campaign was organized as part of the fifth edition of the VarDial workshop, collocated with COLING’2018. This year, the campaign included five shared tasks, including two task re-runs – Arabic Dialect Identification (ADI) and German Dialect Identification (GDI) –, and three new tasks – Morphosyntactic Tagging of Tweets (MTT), Discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). A total of 24 teams submitted runs across the five shared tasks, and contributed 22 system description papers, which were included in the VarDial workshop proceedings and are referred to in this report.

KW - 6121 Languages

M3 - Conference contribution

SP - 1

EP - 17

BT - Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects

A2 - Zampieri, Marcos

A2 - Nakov, Preslav

A2 - Ljubešić, Nikola

A2 - Tiedemann, Jörg

A2 - Malmasi, Shervin

A2 - Ali, Ahmed

PB - Association for Computational Linguistics

CY - Santa Fe

ER -

Zampieri M, Malmasi S, Nakov P, Ali A, Shon S, Glass J et al. Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign. In Zampieri M, Nakov P, Ljubešić N, Tiedemann J, Malmasi S, Ali A, editors, Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects. Santa Fe: Association for Computational Linguistics. 2018. p. 1-17