
Video Games as a Corpus: Sentiment Analysis using Fallout New Vegas Dialog

  • Mika Hämäläinen
  • Khalid Alnajjar
  • Thierry Poibeau

Research output: Chapter in book/report/conference proceeding, Conference contribution, Scientific, Peer-reviewed

Abstract

We present a method for extracting a multilingual, sentiment-annotated dialog data set from Fallout New Vegas. The game developers have preannotated every line of dialog in the game with one of 8 sentiments: anger, disgust, fear, happy, neutral, pained, sad and surprised. The game is available in English, Spanish, German, French and Italian. We conduct experiments on multilingual, multilabel sentiment analysis on the extracted data set using multilingual BERT, XLM-RoBERTa and language-specific BERT models. In our experiments, multilingual BERT outperformed XLM-RoBERTa for most of the languages, and language-specific models were slightly better than multilingual BERT for most of the languages. The best overall accuracy was 54%, achieved by multilingual BERT on the Spanish data. The extracted data set presents a challenging task for sentiment analysis. We have released the data, including the testing and training splits, openly on Zenodo. The data set has been shuffled for copyright reasons.
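As a rough illustration of the kind of evaluation the abstract reports, the sketch below maps the eight sentiment labels to integer ids and computes accuracy per language. The function name `accuracy_by_language`, the row layout and the example rows are assumptions made for illustration; they are not taken from the paper or from the released data set.

```python
from collections import defaultdict

# The eight sentiment labels named in the abstract.
SENTIMENTS = ["anger", "disgust", "fear", "happy",
              "neutral", "pained", "sad", "surprised"]
LABEL_TO_ID = {label: i for i, label in enumerate(SENTIMENTS)}


def accuracy_by_language(rows):
    """Compute accuracy per language.

    rows: iterable of (language, gold_label, predicted_label) tuples.
    Returns {language: fraction of rows where predicted == gold}.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, gold, predicted in rows:
        # Fail early on labels outside the eight-way scheme.
        if gold not in LABEL_TO_ID or predicted not in LABEL_TO_ID:
            raise ValueError(f"unknown sentiment label: {gold!r} / {predicted!r}")
        total[language] += 1
        correct[language] += int(gold == predicted)
    return {language: correct[language] / total[language] for language in total}


# Made-up rows for illustration only (hypothetical, not real data).
example_rows = [
    ("es", "anger", "anger"),
    ("es", "happy", "neutral"),
    ("de", "fear", "fear"),
    ("de", "sad", "sad"),
]
print(accuracy_by_language(example_rows))
```

Reporting accuracy separately for each language mirrors the abstract's per-language comparison of models; the predictions themselves would come from whichever classifier is being evaluated.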

Original language: English
Title of host publication: Proceedings of the 17th International Conference on the Foundations of Digital Games
Editors: Kostas Karpouzis, Stefano Gualeni, Johanna Pirker, Allan Fowler
Number of pages: 4
Place of publication: New York
Publisher: Association for Computing Machinery
Publication date: 4 Nov 2022
Article number: 56
ISBN (electronic): 978-1-4503-9795-7
DOI
Status: Published - 4 Nov 2022
MoE publication type: A4 Article in conference proceedings
Event: International Conference on the Foundations of Digital Games - Athens, Greece
Duration: 5 Sep 2022 to 8 Sep 2022
Conference number: 17

Publication series

Name: ACM Proceedings
Publisher: Association for Computing Machinery
ISSN (electronic): 2168-4081

Bibliographical note

Publisher Copyright:
© 2022 Owner/Author.

Fields of science

  • 6121 Languages
  • 113 Computer and information sciences
