SuicidEmoji: Derived Emoji Dataset and Tasks for Suicide-Related Social Content

Tianlin Zhang, Kailai Yang, Shaoxiong Ji, Boyang Liu, Qianqian Xie, Sophia Ananiadou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Abstract

Early detection of suicidal ideation on social media is crucial for mental health surveillance. At the same time, emojis in posts can help us better understand users' emotions and predict mental health conditions. However, emoji-based suicide analysis remains underexplored, and few resources are available, which restricts the study of emoji usage patterns among users with suicidal ideation. In this work, we build a derived suicide-related emoji dataset named SuicidEmoji, which contains 25k emoji posts (2,329 suicide-related posts and 22,722 posts from control group users) filtered from about 1.3 million crawled Reddit data. To the best of our knowledge, SuicidEmoji is the first suicide-related emoji dataset. Based on SuicidEmoji, we propose two novel tasks, emoji-aware suicidal ideation detection and emoji prediction, and build two benchmark subdatasets from SuicidEmoji to evaluate the performance of advanced methods, including pre-trained language models (PLMs) and large language models (LLMs). We analyze the experimental results of two PLMs and highly capable LLMs, which reveal the significance and challenges of emoji-based suicide-related NLP tasks. The dataset is available at https://github.com/TianlinZhang668/SuicidEmoji.

Original language: English
Title of host publication: SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: Grace Hui Yang, Hongning Wang
Number of pages: 6
Place of publication: New York
Publisher: Association for Computing Machinery
Publication date: 11 July 2024
Pages: 1136-1141
ISBN (electronic): 979-8-4007-0431-4
DOI
Status: Published - 11 July 2024
MoE publication type: A4 Article in conference proceedings
Event: International ACM SIGIR Conference on Research and Development in Information Retrieval - Washington DC
Duration: 14 July 2024 to 18 July 2024
Conference number: 47
https://sigir-2024.github.io/

Bibliographical note

Publisher Copyright:
© 2024 Owner/Author.

Fields of science

  • 113 Computer and information sciences
  • 6121 Languages
