SuicidEmoji: Derived Emoji Dataset and Tasks for Suicide-Related Social Content

Tianlin Zhang, Kailai Yang, Shaoxiong Ji, Boyang Liu, Qianqian Xie, Sophia Ananiadou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

Early suicidal ideation detection using social media is crucial for mental health surveillance. Simultaneously, emojis from the posts can help us better understand users' emotions and predict mental health conditions. However, research in emoji-based suicide analysis remains underexplored, with few resources available, which can restrict the development of studying emoji usage patterns among users with suicidal ideation. In this work, we build a derived suicide-related emoji dataset named SuicidEmoji, which contains 25k emoji posts (2,329 suicide-related posts and 22,722 posts for the control group users) filtered from about 1.3 million crawled Reddit data. To the best of our knowledge, SuicidEmoji is the first suicide-related emoji dataset. Based on SuicidEmoji, we propose two novel tasks: emoji-aware suicidal ideation detection and emoji prediction, for which we build two benchmark subdatasets from SuicidEmoji to evaluate the performance of advanced methods including pre-trained language models (PLMs) and large language models (LLMs). We analyze the experimental results of two PLMs and the highly capable LLMs, which reveal the significance and challenges of emoji-based suicide-related NLP tasks. The dataset is available at https://github.com/TianlinZhang668/SuicidEmoji.

Original language: English
Title of host publication: SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: Grace Hui Yang, Hongning Wang
Number of pages: 6
Place of Publication: New York
Publisher: Association for Computing Machinery
Publication date: 11 Jul 2024
Pages: 1136-1141
ISBN (Electronic): 979-8-4007-0431-4
Publication status: Published - 11 Jul 2024
MoE publication type: A4 Article in conference proceedings
Event: International ACM SIGIR Conference on Research and Development in Information Retrieval - Washington DC
Duration: 14 Jul 2024 - 18 Jul 2024
Conference number: 47
https://sigir-2024.github.io/

Fields of Science

  • 113 Computer and information sciences
  • 6121 Languages
  • emojis
  • mental health
  • social media
  • suicidal ideation detection
