Abstract
Early detection of suicidal ideation on social media is crucial for mental health surveillance. Emojis in posts can also help us better understand users' emotions and predict mental health conditions. However, emoji-based suicide analysis remains underexplored, and the few resources available restrict the study of emoji usage patterns among users with suicidal ideation. In this work, we build a derived suicide-related emoji dataset named SuicidEmoji, which contains 25k emoji posts (2,329 suicide-related posts and 22,722 posts from control-group users) filtered from about 1.3 million crawled Reddit posts. To the best of our knowledge, SuicidEmoji is the first suicide-related emoji dataset. Based on SuicidEmoji, we propose two novel tasks, emoji-aware suicidal ideation detection and emoji prediction, and build two benchmark subdatasets from SuicidEmoji to evaluate the performance of advanced methods, including pre-trained language models (PLMs) and large language models (LLMs). We analyze the experimental results of two PLMs and highly capable LLMs, which reveal the significance and challenges of emoji-based suicide-related NLP tasks. The dataset is available at https://github.com/TianlinZhang668/SuicidEmoji.
Original language | English |
---|---|
Title of host publication | SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Editors | Grace Hui Yang, Hongning Wang |
Number of pages | 6 |
Place of publication | New York |
Publisher | Association for Computing Machinery |
Publication date | 11 July 2024 |
Pages | 1136-1141 |
ISBN (electronic) | 979-8-4007-0431-4 |
DOI | |
Status | Published - 11 July 2024 |
MoE publication type | A4 Article in conference proceedings |
Event | International ACM SIGIR Conference on Research and Development in Information Retrieval - Washington DC. Duration: 14 July 2024 → 18 July 2024. Conference number: 47. https://sigir-2024.github.io/ |
Bibliographical note
Publisher Copyright: © 2024 Owner/Author.
Fields of science
- 113 Computer and information sciences
- 6121 Languages