Depression is a common mental health disorder that affects millions of people worldwide. Effective intervention and therapy depend on early and accurate identification of depression. Multimodal techniques that combine audio and video data have recently shown promising results in depression detection. In this article, the authors propose a convolutional neural network (CNN) model for multimodal depression identification based on audio and video. The proposed model uses both audio and visual features to capture comprehensive and complementary information about depression. For the audio modality, features such as pitch, intensity, and spectral information are derived from voice recordings. The video modality uses facial landmarks, optical flow, and pose estimation to capture facial expressions, head movements, and body gestures. The CNN architecture processes the audio and visual inputs separately in two parallel branches. To learn discriminative audio representations, the audio branch applies three convolutional layers, followed by pooling and dense layers. To extract video-specific information, the video branch uses five convolutional layers with different filter widths and depths, followed by fully connected and pooling layers. A late fusion technique is then used for multimodal fusion: the features learned from the two modalities are concatenated and passed through additional dense layers for depression prediction. The authors also used regularization strategies, including dropout and batch normalization, during training to mitigate overfitting. They evaluated the proposed multimodal CNN model on the DAIC-WOZ dataset, which consists of audio and video recordings of individuals with and without depression. The proposed model achieved 77% accuracy, which the authors report as superior to the performance of unimodal approaches.
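The two-branch, late-fusion design described above can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the use of 1D convolutions over time, the feature dimensions (40 audio features, 68 video features), the channel counts, the fixed kernel width, global average pooling, and a single dense output layer are all hypothetical choices made for illustration. Dropout and batch normalization are omitted because they apply during training, and the varying filter widths of the video branch are not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_relu(x, w, b):
    """Valid 1D convolution over time followed by ReLU.

    x: (time, in_ch), w: (kernel, in_ch, out_ch), b: (out_ch,)
    """
    k = w.shape[0]
    steps = x.shape[0] - k + 1
    out = np.stack([
        np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
        for t in range(steps)
    ])
    return np.maximum(out + b, 0.0)

def make_branch(in_ch, channels, kernel=3):
    """Random weights for a stack of conv layers (hypothetical sizes)."""
    params, prev = [], in_ch
    for out_ch in channels:
        w = rng.normal(0.0, 0.1, size=(kernel, prev, out_ch))
        params.append((w, np.zeros(out_ch)))
        prev = out_ch
    return params

def run_branch(x, params):
    """Apply the conv stack, then global average pooling over time."""
    for w, b in params:
        x = conv1d_relu(x, w, b)
    return x.mean(axis=0)

def late_fusion_predict(audio, video, audio_params, video_params, dense_w, dense_b):
    """Process each modality separately, concatenate, then classify."""
    a = run_branch(audio, audio_params)   # audio representation
    v = run_branch(video, video_params)   # video representation
    fused = np.concatenate([a, v])        # late fusion by concatenation
    logit = fused @ dense_w + dense_b     # single dense layer (assumption)
    return 1.0 / (1.0 + np.exp(-logit))   # probability of the positive class

# Three conv layers for audio, five for video, as stated in the abstract.
audio_params = make_branch(40, [16, 32, 64])
video_params = make_branch(68, [16, 16, 32, 32, 64])
dense_w = rng.normal(0.0, 0.1, size=128)  # 64 audio + 64 video features
dense_b = 0.0

audio = rng.normal(size=(100, 40))  # 100 frames of synthetic audio features
video = rng.normal(size=(60, 68))   # 60 frames of synthetic video features
p = late_fusion_predict(audio, video, audio_params, video_params, dense_w, dense_b)
```

Because each branch is reduced to a fixed-length vector before concatenation, the two modalities may have different frame counts and feature sizes, which is one practical reason to fuse late rather than at the input.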
Suresh Mamidisetti, Mallikarjuna Reddy A.
Lang He, Dongmei Jiang, Hichem Sahli
Hamza Javaid, Aniqa Dilawari, Usman Ghani Khan, Bilal Wajid
Xia Xu, Guanhong Zhang, Xueqian Mao, Qinghua Lu