Full Length Article
Fusion: Practice and Applications
Volume 15, Issue 1, pp. 196-204, 2024

Title

Fusion of Topic Modeling and RoBERTa for Detecting Signs of Depression from Social Media

Madhu Sudhan H. V. 1*, S. Saravana Kumar 2

1  CMR University (CMRU), Bangalore, India
    (madhusudhan.19cphd@cmr.edu.in)

2  CMR University (CMRU), Bangalore, India
    (sarvana.k@cmr.edu.in)


DOI: https://doi.org/10.54216/FPA.150115

Received: August 16, 2023 Revised: December 28, 2023 Accepted: March 11, 2024

Abstract :

Depression, or Major Depressive Disorder, is a serious and common medical condition that affects people worldwide. It negatively affects a person's feelings, thoughts, and actions, and causes a loss of interest in activities the person once enjoyed. It can lead to physical and emotional problems that hamper daily activities at work and at home. In recent years, much research has used artificial intelligence to identify depression from several modalities, including image, speech, and text. Social media is an important medium in which depression is widely discussed and mentioned. The current study proposes a novel approach that uses topic modeling with Latent Dirichlet Allocation (LDA) to understand how depressed and non-depressed users communicate differently, and uses the Robustly Optimized BERT Pretraining Approach (RoBERTa) to detect depression. The depression detection model achieved an accuracy of 66.4%, which outperformed previous approaches with similar methodology. The current study is helpful for self-diagnosis of signs of depression at very early stages.
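The topic-modeling half of the approach can be illustrated with a minimal sketch. It assumes a collapsed Gibbs sampler for LDA written in plain Python; it is not the authors' implementation. The toy posts, the group labels, the hyperparameters (`alpha`, `beta`, `num_topics`, `iterations`), and the helper names `fit_lda`, `top_words`, and `mean_topic_mix` are all hypothetical and chosen only for illustration. The RoBERTa classifier is not shown.

```python
import random


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic mixtures and per-topic word distributions.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + V * beta)
         for v in range(V)]
        for t in range(num_topics)
    ]
    return vocab, theta, phi


def top_words(vocab, phi, n=3):
    """Return the n most probable words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


def mean_topic_mix(theta, labels, group):
    """Average the topic mixtures of all documents carrying one label."""
    rows = [t for t, y in zip(theta, labels) if y == group]
    return [sum(col) / len(rows) for col in zip(*rows)]


# Hypothetical toy posts (already tokenized) with made-up group labels.
posts = [
    ["tired", "alone", "sleep", "tired", "empty"],
    ["alone", "empty", "sleep", "cry", "tired"],
    ["game", "friends", "weekend", "movie", "fun"],
    ["movie", "fun", "friends", "game", "weekend"],
]
labels = ["depressed", "depressed", "control", "control"]

vocab, theta, phi = fit_lda(posts, num_topics=2)
print(top_words(vocab, phi))
print(mean_topic_mix(theta, labels, "depressed"))
print(mean_topic_mix(theta, labels, "control"))
```

Comparing the averaged topic mixtures of the two groups is one simple way to see whether they emphasize different topics. On real social media data, a library implementation of LDA and proper text preprocessing would normally replace this toy sampler.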

Keywords :

Clinical Depression; Artificial Intelligence; Machine Learning; Natural Language Processing; Mathematical Fusion; Bidirectional Encoder Representations from Transformers; Social media; Twitter; Reddit; Latent Dirichlet allocation; Fusion Based; Topic Modeling
