DOI: 10.1145/3626772.3657852
research-article
Open access

SuicidEmoji: Derived Emoji Dataset and Tasks for Suicide-Related Social Content

Published: 11 July 2024

Abstract

Early detection of suicidal ideation on social media is crucial for mental health surveillance. Emojis in posts can also help us better understand users' emotions and predict mental health conditions. However, emoji-based suicide analysis remains underexplored, and the few resources available restrict the study of emoji usage patterns among users with suicidal ideation. In this work, we build a derived suicide-related emoji dataset named SuicidEmoji, which contains 25k emoji posts (2,329 suicide-related posts and 22,722 posts from control group users) filtered from about 1.3 million crawled Reddit data entries. To the best of our knowledge, SuicidEmoji is the first suicide-related emoji dataset. Based on SuicidEmoji, we propose two novel tasks, emoji-aware suicidal ideation detection and emoji prediction, and build two benchmark subdatasets from SuicidEmoji to evaluate advanced methods on them, including pre-trained language models (PLMs) and large language models (LLMs). We analyze the experimental results of two PLMs and the highly capable LLMs, which reveal the significance and challenges of emoji-based suicide-related NLP tasks. The dataset is available at https://github.com/TianlinZhang668/SuicidEmoji.
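
As a rough illustration of the filtering step described above, the sketch below keeps only posts that contain at least one emoji. It is a minimal sketch under stated assumptions, not the procedure used to build SuicidEmoji: the Unicode ranges in `EMOJI_PATTERN`, the helper names `extract_emojis` and `filter_emoji_posts`, and the sample posts are all hypothetical, since the abstract does not specify the filtering rules.

```python
import re

# Assumption: a coarse set of Unicode ranges standing in for "emoji".
# The actual filtering rules behind SuicidEmoji are not given in the abstract.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F5FF"  # symbols and pictographs
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F680-\U0001F6FF"  # transport and map symbols
    "\U0001F900-\U0001F9FF"  # supplemental symbols and pictographs
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]"
)


def extract_emojis(text):
    """Return every matched emoji character in the text, in order."""
    return EMOJI_PATTERN.findall(text)


def filter_emoji_posts(posts):
    """Keep only the posts that contain at least one matched emoji."""
    return [post for post in posts if EMOJI_PATTERN.search(post)]


# Hypothetical sample posts, for illustration only.
posts = [
    "Feeling low today 😔",
    "Great game last night",
    "So tired 😭😭",
]
emoji_posts = filter_emoji_posts(posts)
print(len(emoji_posts), "of", len(posts), "posts contain an emoji")
```

A regex over fixed code point ranges misses multi-character sequences such as skin-tone or ZWJ combinations, so a production filter would likely rely on a maintained emoji list instead.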

    Published In

    SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2024
    3164 pages
    ISBN: 9798400704314
    DOI: 10.1145/3626772
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. emojis
    2. mental health
    3. social media
    4. suicidal ideation detection

    Qualifiers

    • Research-article

    Conference

    SIGIR 2024

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
