Publications
(* = (co-)corresponding author, ** = equal contributions)
2024
- Multilingual Mild Cognitive Impairment Detection with Multimodal Approach. Benjamin Barrera-Altuna, Daeun Lee, Zaima Zarnaz, Jinyoung Han, and Seungbae Kim**. In INTERSPEECH 2024, Sep 2024
Mild cognitive impairment (MCI) and dementia significantly impact millions worldwide and rank as a major cause of mortality. Since traditional diagnostic methods are often costly and result in delayed diagnoses, many efforts have been made to propose automatic detection approaches. However, most methods focus on monolingual cases, limiting the scalability of their models to individuals speaking different languages. To understand the common characteristics of people with MCI speaking different languages, we propose a multilingual MCI detection model using multimodal approaches that analyze both acoustic and linguistic features. It outperforms existing machine learning models by identifying universal MCI indicators across languages. Particularly, we find that speech duration and pauses are crucial in detecting MCI in multilingual settings. Our findings can potentially facilitate early intervention in cognitive decline across diverse linguistic backgrounds.
- Detecting Bipolar Disorder from Misdiagnosed Major Depressive Disorder with Mood-Aware Multi-Task Learning. Daeun Lee**, Hyolim Jeon**, Sejung Son, Chaewon Park, Jihyun An, and 2 more authors. In NAACL 2024, Jun 2024
Bipolar Disorder (BD) is a mental disorder characterized by intense mood swings, from depression to manic states. Individuals with BD are at a higher risk of suicide, but BD is often misdiagnosed as Major Depressive Disorder (MDD) due to shared symptoms, resulting in delays in appropriate treatment and increased suicide risk. While early intervention based on social media data has been explored to uncover latent BD risk, little attention has been paid to detecting BD from those misdiagnosed as MDD. Therefore, this study presents a novel approach for identifying BD risk in individuals initially misdiagnosed with MDD. A unique dataset, BD-Risk, is introduced, incorporating mental disorder types and BD mood levels verified by two clinical experts. The proposed multi-task learning for predicting BD risk and BD mood level outperforms the state-of-the-art baselines. Also, the proposed dynamic mood-aware attention can provide insights into the impact of BD mood on future risk, potentially aiding interventions for at-risk individuals. We provide the codes and a new dataset for reproducibility purposes.
- Fighting against Fake News on Newly-Emerging Crisis: A Case Study of COVID-19. Migyeong Yang**, Chaewon Park**, Jiwon Kang, Daeun Lee, Daejin Choi*, and 1 more author. In The Web (Short paper), May 2024
As social media users can easily access, generate, and spread information regardless of its authenticity, the proliferation of fake news related to public health has become a serious problem. Since these rumors have caused severe social issues, detecting them in the early stage is imperative. Therefore, in this paper, we propose a deep learning model that can debunk fake news on COVID-19, as a case study, at the initial stage of emergence. The evaluation with a newly-collected dataset consisting of both the COVID-19 and Non-COVID-19 fake news claims demonstrates that the proposed model achieves high performance, indicating that the model can identify fake news on COVID-19 in the early stage with a small amount of data. We believe that our methodology and findings can be applied to detect fake news on newly-emerging and critical topics, which should be performed with insufficient resources.
- A Dual-Prompting for Interpretable Mental Health Language Models. Hyolim Jeon**, Dongje Yoo**, Daeun Lee, Sejung Son, Seungbae Kim, and 1 more author. In CLPsych (EACL workshop), Mar 2024
Despite the increasing demand for AI-based mental health monitoring tools, their practical utility for clinicians is limited by the lack of interpretability. The CLPsych 2024 Shared Task aims to enhance the interpretability of Large Language Models (LLMs), particularly in mental health analysis, by providing evidence of suicidality through linguistic content. We propose a dual-prompting approach: (i) Knowledge-aware evidence extraction by leveraging the expert identity and a suicide dictionary with a mental health-specific LLM; and (ii) Evidence summarization by employing an LLM-based consistency evaluator. Comprehensive experiments demonstrate the effectiveness of combining domain-specific information, revealing performance improvements and the approach’s potential to aid clinicians in assessing mental state progression.
2023
- Learning Co-Speech Gesture for Multimodal Aphasia Type Detection. Daeun Lee**, Sejung Son**, Hyolim Jeon, Seungbae Kim, and Jinyoung Han*. In EMNLP, Dec 2023
Aphasia, a language disorder resulting from brain damage, requires accurate identification of specific aphasia types, such as Broca’s and Wernicke’s aphasia, for effective treatment. However, little attention has been paid to developing methods to detect different types of aphasia. Recognizing the importance of analyzing co-speech gestures for distinguishing aphasia types, we propose a multimodal graph neural network for aphasia type detection using speech and corresponding gesture patterns. By learning the correlation between the speech and gesture modalities for each aphasia type, our model can generate textual representations sensitive to gesture information, leading to accurate aphasia type detection. Extensive experiments demonstrate the superiority of our approach over existing methods, achieving state-of-the-art results (F1 84.2%). We also show that gesture features outperform acoustic features, highlighting the significance of gesture expression in detecting aphasia types. We provide the codes for reproducibility purposes.
- Detecting depression on video logs using audiovisual features. Kyungeun Min, Jeewoo Yoon, Migyeong Kang, Daeun Lee, Eunil Park, and 1 more author. Humanities and Social Sciences Communications, Nov 2023
Detecting depression on social media has received significant attention. Developing a depression detection model helps screen depressed individuals who may need proper treatment. While prior work mainly focused on developing depression detection models with social media posts, including text and image, little attention has been paid to how videos on social media can be used to detect depression. To this end, we propose a depression detection model that utilizes both audio and video features extracted from the vlogs (video logs) on YouTube. We first collected vlogs from YouTube and annotated them into depression and non-depression. We then analyze the statistical differences between depression and non-depression vlogs. Based on the lessons learned, we build a depression detection model that learns both audio and visual features, achieving high accuracy. We believe our model helps detect depressed individuals on social media at an early stage so that individuals who may need appropriate treatment can get help.
- Towards Suicide Prevention from Bipolar Disorder with Temporal Symptom-Aware Multitask Learning. Daeun Lee, Sejung Son, Hyolim Jeon, Seungbae Kim, and Jinyoung Han*. In ACM SIGKDD, Aug 2023
Bipolar disorder (BD) is closely associated with an increased risk of suicide. However, while prior work has revealed valuable insight into the behavior of BD patients on social media, little attention has been paid to developing a model that can predict the future suicidality of a BD patient. Therefore, this study proposes a multi-task learning model for predicting the future suicidality of BD patients by jointly learning current symptoms. We build a novel BD dataset clinically validated by psychiatrists, including 14 years of posts on bipolar-related subreddits written by 818 BD patients, along with the annotations of future suicidality and BD symptoms. We also suggest a temporal symptom-aware attention mechanism to determine which symptoms are the most influential for predicting future suicidality over time through a sequence of BD posts. Our experiments demonstrate that the proposed model outperforms the state-of-the-art models in both BD symptom identification and future suicidality prediction tasks. In addition, the proposed temporal symptom-aware attention provides interpretable attention weights, helping clinicians to understand BD patients more comprehensively and to provide timely intervention by tracking mental state progression.
- Toward Natural and Intelligible Speech Synthesis: An Empirical Study on Transfer Learning. Chaewon Kang, Jeewoo Yoon, Daeun Lee, Migyeong Kang, Seohyun Lim, and 3 more authors. In The Korean Institute of Broadcast and Media Engineers, Jun 2023
To synthesize natural and intelligible speech with a small amount of data, transfer learning with well-maintained and pre-trained data has been known to be useful. However, little attention has been paid to answering the following research questions with empirically grounded evidence: "How much pre-trained (source) speech data (e.g., 10K utterances or 10 hours) used in transfer learning is enough for generating natural and intelligible speech?" and "For generating natural and intelligible speech, how much (target) speech data should at least be provided?" Both questions are essential for the quality of speech synthesis. To answer them, this paper conducts extensive experiments on speech synthesis with multiple source and target data with different lengths, speakers, and languages. We show that intelligible and natural speech can be synthesized with only 500 utterances of target data using transfer learning. Our work also reveals that at least 5000 utterances of source pre-trained data are required to synthesize decent speech.
2022
- Detecting suicidality with a contextual graph neural network. Daeun Lee, Migyeong Kang, Minji Kim, and Jinyoung Han**. In CLPsych (NAACL workshop), Jul 2022
Discovering individuals’ suicidality on social media has become increasingly important. Many researchers have studied suicidality detection using a suicide dictionary. However, while prior work focused on matching a word in a post with a suicide dictionary without considering contexts, little attention has been paid to how the word can be associated with the suicide-related context. To address this problem, we propose a suicidality detection model based on a graph neural network to grasp the dynamic semantic information of the suicide vocabulary by learning the relations between a given post and words. The extensive evaluation demonstrates that the proposed model achieves higher performance than the state-of-the-art methods. We believe the proposed model has great utility in identifying the suicidality of individuals and hence preventing individuals from potential suicide risks at an early stage.
2021
- COVID-19 Korean fake news detection using named entity and user reproliferation information. Chaewon Park, Lee Daeun, Jiwon Kang, Munyoung Lee, and Jinyoung Han*. In Human and Cognitive Language Technology, Jun 2021
- Machine learning for mental health in social media: Bibliometric study. Jina Kim, Daeun Lee, and Eunil Park**. Journal of Medical Internet Research, Mar 2021
Background: Social media platforms provide an easily accessible and time-saving communication approach for individuals with mental disorders compared to face-to-face meetings with medical providers. Recently, machine learning (ML)-based mental health exploration using large-scale social media data has attracted significant attention. Objective: We aimed to provide a bibliometric analysis and discussion on research trends of ML for mental health in social media. Methods: Publications addressing social media and ML in the field of mental health were retrieved from the Scopus and Web of Science databases. We analyzed the publication distribution to measure productivity on sources, countries, institutions, authors, and research subjects, and visualized the trends in this field using a keyword co-occurrence network. The research methodologies of previous studies with high citations are also thoroughly described. Results: We obtained a total of 565 relevant papers published from 2015 to 2020. In the last 5 years, the number of publications has demonstrated continuous growth with Lecture Notes in Computer Science and Journal of Medical Internet Research as the two most productive sources based on Scopus and Web of Science records. In addition, notable methodological approaches with data resources presented in high-ranking publications were investigated. Conclusions: The results of this study highlight continuous growth in this research area. Moreover, we retrieved three main discussion points from a comprehensive overview of highly cited publications that provide new in-depth directions for both researchers and practitioners.
2020
- Cross-lingual suicidal-oriented word embedding toward suicide prevention. Daeun Lee, Soyoung Park, Jiwon Kang, Daejin Choi, and Jinyoung Han*. In EMNLP Findings, Nov 2020
Early intervention for suicide risks with social media data has increasingly received great attention. Using a suicide dictionary created by mental health experts is one of the effective ways to detect suicidal ideation. However, little attention has been paid to validating whether and how the existing dictionaries for other languages (i.e., English and Chinese) can be used for predicting suicidal ideation for a low-resource language (i.e., Korean) where a knowledge-based suicide dictionary has not yet been developed. To this end, we propose a cross-lingual suicidal ideation detection model that can identify whether a given social media post includes suicidal ideation or not. To utilize the existing suicide dictionaries developed for other languages (i.e., English and Chinese) in word embedding, our model translates a post written in the target language (i.e., Korean) into English and Chinese, and then uses the separate suicidal-oriented word embeddings developed for English and Chinese, respectively. By applying an ensemble approach for different languages, the model achieves high accuracy, over 87%. We believe our model is useful in assessing suicidal ideation using social media data for preventing potential suicide risk at an early stage.