Abstract
Early detection of suicidal ideation on social media is crucial for mental health surveillance. Emojis in posts can also help us better understand users' emotions and predict mental health conditions. However, emoji-based suicide analysis remains underexplored, and the few resources available restrict the study of emoji usage patterns among users with suicidal ideation. In this work, we build a derived suicide-related emoji dataset named SuicidEmoji, which contains 25k emoji posts (2,329 suicide-related posts and 22,722 posts from control-group users) filtered from about 1.3 million items of crawled Reddit data. To the best of our knowledge, SuicidEmoji is the first suicide-related emoji dataset. Based on SuicidEmoji, we propose two novel tasks, emoji-aware suicidal ideation detection and emoji prediction, and build two benchmark subdatasets from SuicidEmoji to evaluate the performance of advanced methods, including pre-trained language models (PLMs) and large language models (LLMs). We analyze the experimental results of two PLMs and highly capable LLMs, which reveal the significance and challenges of emoji-based suicide-related NLP tasks. The dataset is available at https://github.com/TianlinZhang668/SuicidEmoji.
Original language | English |
---|---|
Title of host publication | SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Editors | Grace Hui Yang, Hongning Wang |
Number of pages | 6 |
Place of Publication | New York |
Publisher | Association for Computing Machinery |
Publication date | 11 Jul 2024 |
Pages | 1136-1141 |
ISBN (Electronic) | 979-8-4007-0431-4 |
Publication status | Published - 11 Jul 2024 |
MoE publication type | A4 Article in conference proceedings |
Event | International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington DC; Duration: 14 Jul 2024 → 18 Jul 2024; Conference number: 47; https://sigir-2024.github.io/ |
Fields of Science
- 113 Computer and information sciences
- 6121 Languages
- emojis
- mental health
- social media
- suicidal ideation detection