LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT?

Marc Pàmies, Emily Öhman, Kaisla Kajava, Jörg Tiedemann

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › Peer-reviewed

Abstract

This paper presents the models submitted by the LT@Helsinki team to the SemEval-2020 Shared Task 12. Our team participated in sub-tasks A and C, titled offensive language identification and offense target identification, respectively. In both cases we used the so-called Bidirectional Encoder Representations from Transformers (BERT), a model pre-trained by Google that we fine-tuned on the OLID dataset. The results show that offensive tweet classification is one of several language-based tasks in which BERT can achieve state-of-the-art results.
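The standard fine-tuning setup the abstract refers to places a small classification head on top of BERT's pooled [CLS] representation. The snippet below is a minimal, self-contained sketch of that head in plain Python; the hidden size, weight values, and function names are invented for illustration and do not come from the paper (BERT-base actually uses a 768-dimensional hidden state):

```python
import math
import random

random.seed(0)

HIDDEN = 8        # illustrative only; BERT-base uses 768
NUM_LABELS = 2    # sub-task A: not offensive (NOT) vs. offensive (OFF)

# Hypothetical classification head: a single linear layer over the
# pooled [CLS] vector, as in standard BERT sequence classification.
W = [[random.gauss(0, 0.02) for _ in range(HIDDEN)] for _ in range(NUM_LABELS)]
b = [0.0] * NUM_LABELS

def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(pooled_cls):
    """Map a pooled [CLS] vector to per-label probabilities."""
    logits = [sum(w * x for w, x in zip(row, pooled_cls)) + bias
              for row, bias in zip(W, b)]
    return softmax(logits)

# Stand-in for the [CLS] embedding a fine-tuned BERT would produce.
pooled = [random.gauss(0, 1) for _ in range(HIDDEN)]
probs = classify(pooled)
label = ["NOT", "OFF"][probs.index(max(probs))]
print(label, probs)
```

In practice both the head and the pre-trained encoder weights are updated jointly during fine-tuning; only the head's forward pass is sketched here.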
Original language: English
Title of host publication: Proceedings of the Fourteenth Workshop on Semantic Evaluation
Editors: Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Number of pages: 7
Place of publication: Barcelona
Publisher: International Committee for Computational Linguistics
Publication date: 2020
Pages: 1569-1575
ISBN (electronic): 978-1-952148-31-6
Status: Published - 2020
MoE publication type: A4 Article in conference proceedings
Event: International Workshop on Semantic Evaluation - [Online event], Barcelona, Spain
Duration: 12 Dec 2020 - 13 Dec 2020
Conference number: 14
http://alt.qcri.org/semeval2020/

Fields of science

  • 113 Computer and information sciences
  • 6121 Languages
