Co-speech Gestures in Language Disorder

Aphasia, MCI, ASR

Research Background

Multimodal language analysis helps us understand human communication by integrating information from multiple modalities. However, previous studies on identifying language impairment in speech data have simply concatenated features, an approach that fails to capture the complex connections between modalities.

Individuals with language disorders often rely on non-verbal communication, especially gestures, as an additional communication tool due to difficulties with word retrieval and language errors. As a result, the same word can be interpreted differently depending on the accompanying gestures across different language disorder symptoms. Utilizing both speech (i.e., linguistic and acoustic) and gesture (i.e., visual) information is therefore crucial to understanding the characteristics of language disorders.
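The feature-concatenation baseline mentioned above can be sketched in a few lines. This is a minimal, hypothetical illustration (all embeddings, weights, and function names are made up for exposition), showing why simple concatenation treats modalities as independent columns rather than modeling their interactions:

```python
import math

def fuse_features(acoustic, linguistic, visual):
    """Late fusion by concatenation: each modality's embedding is
    simply appended, with no cross-modal interaction modeled."""
    return acoustic + linguistic + visual  # plain list concatenation

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(features, weights, bias=0.0):
    """Hypothetical linear classifier over the fused vector."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)

# Toy 2-dimensional embeddings per modality (illustrative values only)
acoustic   = [0.2, -0.1]
linguistic = [0.5,  0.3]
visual     = [0.7,  0.4]   # e.g., a gesture embedding

fused = fuse_features(acoustic, linguistic, visual)
print(len(fused))  # 6
```

Because the classifier sees one flat vector, any dependency between a word and its co-occurring gesture must be learned implicitly, which motivates the graph-based approach described below.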

Research Goal

We aim to establish healthcare systems grounded in an understanding of the characteristics of language disorders, utilizing both speech (i.e., linguistic and acoustic) and gesture (i.e., visual) information.

Approach

  • Understanding Co-Speech Gestures for Aphasia Type Detection (Lee** et al., 2023):

    Recognizing the importance of analyzing co-speech gestures for distinguishing aphasia types, we propose a multimodal graph neural network for aphasia type detection that uses speech and the corresponding gesture patterns. We show that gesture features outperform acoustic features, highlighting the significance of gesture expression in detecting aphasia types.

  • Using Audiovisual Features for Depression Detection (Min et al., 2023):

    We collected vlogs from YouTube and annotated them as depression or non-depression. Based on an analysis of the statistical differences between depression and non-depression vlogs, we build a depression detection model that learns both audio and visual features, achieving high accuracy.

  • Multimodal Analysis for MCI Detection (Barrera-Altuna et al., 2024):

    I am actively involved in this project on mild cognitive impairment (MCI) detection, working with domain experts such as pathologists and healthcare practitioners at the University of South Florida (USF). To understand the common characteristics of people with MCI who speak different languages, we propose a multilingual MCI detection model using a multimodal approach that analyzes both acoustic and linguistic features. It outperforms existing machine learning models by identifying universal MCI indicators across languages.
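To see why a graph formulation can capture speech-gesture relationships that concatenation misses, consider one round of message passing over a tiny cross-modal graph. This is a simplified sketch of the generic graph-neural-network idea (mean aggregation over neighbors), not the actual architecture from Lee et al. (2023); all node names and values are hypothetical:

```python
def message_pass(node_feats, adjacency):
    """One round of mean-aggregation message passing: each node's new
    feature vector averages its own features with its neighbors'."""
    new_feats = {}
    for node, feats in node_feats.items():
        neighbors = adjacency.get(node, [])
        agg = feats[:]                       # start from the node's own features
        for nb in neighbors:
            agg = [a + b for a, b in zip(agg, node_feats[nb])]
        denom = 1 + len(neighbors)           # self plus neighbors
        new_feats[node] = [v / denom for v in agg]
    return new_feats

# A word node and its co-occurring gesture node, linked cross-modally
nodes = {"word:cup": [1.0, 0.0], "gesture:point": [0.0, 1.0]}
edges = {"word:cup": ["gesture:point"], "gesture:point": ["word:cup"]}

print(message_pass(nodes, edges))  # each node now mixes both modalities
```

After one round, each node's representation blends information from the other modality, which is the kind of explicit cross-modal interaction a flat concatenated vector does not provide.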

References

2024

  1. Multilingual Mild Cognitive Impairment Detection with Multimodal Approach
    Benjamin Barrera-Altuna, Daeun Lee, Zaima Zarnaz, Jinyoung Han, and Seungbae Kim**
    In INTERSPEECH 2024, Sep 2024

2023

  1. Learning Co-Speech Gesture for Multimodal Aphasia Type Detection
    Daeun Lee**, Sejung Son**, Hyolim Jeon, Seungbae Kim, and Jinyoung Han*
    In EMNLP, Dec 2023
  2. Detecting depression on video logs using audiovisual features
    Kyungeun Min, Jeewoo Yoon, Migyeong Kang, Daeun Lee, Eunil Park, and 1 more author
    Humanities and Social Sciences Communications, Nov 2023