In recent years, demand for facial expression recognition applications has grown rapidly, and the task has attracted extensive research attention. However, current deep-learning-based recognition methods overlook multi-head attention and semantic consistency, so the model attends only to a local area of the feature map. In addition, the model's attention is inconsistent between an image and its flipped version, which leads to poor robustness and poor interpretability, among other shortcomings. To address these problems, we propose an Affinity Separation Loss (ASLoss), which improves the separability of samples through clustering. Moreover, a Separate Multi-head Attention (SMA) block and a Zonal Loss (ZLoss) are designed to decentralize the model's attention. Experimental results demonstrate that the proposed MACNet method achieves competitive recognition performance on two public datasets, RAF-DB and FERPlus.
Jing Li, Tianyu Hu, Gaoxiang Ouyang
Xiaofeng Wang, T. T. Han, Shilu Liu, Muhammad Zeeshan Ajmal, Lu Chen, Yongqin Zhang, Yonghuai Liu
Zhengyao Wen, Wenzhong Lin, Tao Wang, Ge Xu
Caixia Zheng, Jiayu Liu, Wei Zhao, Yingying Ge, Wenhe Chen