Depression Detection on Mandarin Text through Bert Model

Yung, Teck Kiong and Cheah, Wai Shiang and Mahir, Perdana and Hamizan, Sharbini and Iwan Tri, Riyadi Yanto (2024) Depression Detection on Mandarin Text through Bert Model. Journal of Advanced Research in Applied Sciences and Engineering Technology, 60 (2). pp. 295-311. ISSN 2462-1943

Official URL: https://semarakilmu.com.my/journals/index.php/appl...

Abstract

Depression is one of the most prevalent mental disorders, and its incidence has risen significantly in Malaysia amid the COVID-19 pandemic. Previous studies have demonstrated the potential of artificial intelligence for detecting signs of depression in social media texts, but most have focused on English content. Because Mandarin is the second most widely spoken language worldwide, depression detection techniques tailored to Mandarin text are worth exploring. This research examines the effectiveness of the BERT model for text classification, specifically for detecting depression in Mandarin social media posts. The model is trained on the WU3D dataset, which comprises more than 2 million text records collected from Sina Weibo, a prominent Chinese social media platform. Because the dataset is imbalanced, text augmentation techniques were applied to assess whether they improve model performance. The BERT model trained on the original dataset outperformed the model trained on the augmented dataset, which suggests that BERT can handle the imbalanced dataset effectively. It is further speculated that the augmented dataset did not introduce new information during model training. The highest-performing model achieved an accuracy of 88% on the test dataset.
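The augmentation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the augmentation technique used, so the random character swap below is an assumption chosen only for illustration, and the function names `random_swap` and `balance_by_augmentation` and the sample sentences are hypothetical. The sketch oversamples the minority class with perturbed copies until every class is as large as the largest one.

```python
import random
from collections import Counter


def random_swap(text, rng):
    """Return a copy of text with two randomly chosen characters swapped.

    Assumption: a stand-in augmentation; the paper's technique is not stated.
    """
    chars = list(text)
    if len(chars) < 2:
        return text
    i, j = rng.sample(range(len(chars)), 2)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


def balance_by_augmentation(texts, labels, seed=0):
    """Add augmented copies of minority-class texts until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        # Original texts of this class, used as sources for augmented copies.
        pool = [t for t, l in zip(texts, labels) if l == label]
        for _ in range(target - count):
            out_texts.append(random_swap(rng.choice(pool), rng))
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: label 1 marks the minority (depression-related) class.
texts = ["今天心情很好", "我很开心", "天气不错", "我觉得很绝望"]
labels = [0, 0, 0, 1]
aug_texts, aug_labels = balance_by_augmentation(texts, labels)
print(Counter(aug_labels))
```

Copies produced this way reuse the characters of existing posts, which is consistent with the abstract's speculation that augmentation may add little new information for the model to learn from.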

Item Type: Article
Uncontrolled Keywords: NLP; Depression; Machine learning; Transformer; BERT.
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology
Depositing User: Shiang
Date Deposited: 03 Jan 2025 06:40
Last Modified: 03 Jan 2025 06:40
URI: http://ir.unimas.my/id/eprint/47245
