Guifang Zhang, Hon‐Cheng Wong, Sio‐Long Lo
In recent years, several effective unsupervised video object segmentation methods that emphasize the common information in videos have been proposed. Despite their effectiveness, these methods ignore the information from the shallow layers of the network and therefore fail to segment the details of objects. To address this problem, we propose a multi-attention network for unsupervised video object segmentation (MANet). Recent studies show that the deep layers of a network are sensitive to high-level semantic information but capture details poorly, while the opposite holds for shallow layers. Building on this insight, we design a multi-attention module that takes into account the information from the shallow layers in addition to that from the deep layers. By enhancing the common information between video frames while combining the features from the shallow and deep layers, this module can distinguish the primary object and segment its details effectively. Experimental results on the DAVIS-2016 and SegTrack v2 datasets show that our network outperforms the state-of-the-art methods.
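The idea in the abstract of combining shallow and deep features and then enhancing what two frames have in common can be illustrated with a minimal sketch. This is not the MANet architecture: the function names (`fuse`, `co_attend`, `multi_attention_sketch`) are hypothetical, features are assumed to be already resized to the same number of spatial positions, fusion is assumed to be plain concatenation, and dot-product softmax attention stands in for whatever attention the network actually uses.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def fuse(shallow, deep):
    """Concatenate shallow and deep features position by position.

    Assumption: both inputs are lists with one feature vector per
    spatial position and cover the same positions.
    """
    return [s + d for s, d in zip(shallow, deep)]


def co_attend(feat_a, feat_b):
    """For every position in frame A, average frame B's features,
    weighted by softmax-normalized dot-product similarity."""
    channels = len(feat_b[0])
    attended = []
    for u in feat_a:
        weights = softmax([dot(u, v) for v in feat_b])
        attended.append(
            [sum(w * v[c] for w, v in zip(weights, feat_b)) for c in range(channels)]
        )
    return attended


def multi_attention_sketch(shallow_a, deep_a, shallow_b, deep_b):
    """Fuse shallow and deep features per frame, then append to frame A
    the features it shares with frame B (hypothetical stand-in)."""
    fused_a = fuse(shallow_a, deep_a)
    fused_b = fuse(shallow_b, deep_b)
    common = co_attend(fused_a, fused_b)
    return [f + c for f, c in zip(fused_a, common)]
```

With two positions per frame, two shallow channels, and one deep channel, each output position carries six values: three fused channels from frame A followed by three attended channels drawn from frame B.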
ZhengHao Zhang, Liguo Sun, Lingyu Si, Changwen Zheng
Ping Li, Yu Zhang, Yuan Li, Huaxin Xiao, Binbin Lin, Xianghua Xu
Weidong Chen, Dexiang Hong, Yuankai Qi, Zhenjun Han, Shuhui Wang, Laiyun Qing, Qingming Huang, Guorong Li