Emotion recognition is one of the latest challenges in intelligent human-machine communication. Most previous work on emotion recognition has focused on extracting emotions from visual or audio information separately. This paper presents a novel approach to human emotion recognition that uses both the visual and the audio information in video clips. A tripled hidden Markov model is introduced to perform the recognition; it allows state asynchrony between the audio and visual observation sequences while preserving their natural correlation over time. Experimental results show that this approach outperforms recognition using visual or audio information alone.
Jingxuan Zhao, Xiao Wu, Dongmei Jiang
Jen-Chun Lin, Chung-Hsien Wu, Wen-Li Wei
Xiaoxing Liu, Yibao Zhao, Xiaobo Pi, Luhong Liang, Ara Nefian
Björn W. Schuller, Gerhard Rigoll, M. Lang