Yufei Wang, Yongjiang Hu, Alan Wee‐Chung Liew, Junhu Wang
This paper proposes a novel approach to weakly supervised video object segmentation that needs only a single pixel to guide the segmentation. We use two deep neural networks to obtain instance-level semantic segmentation masks and optical flow maps for each frame. An object probability map for the first frame of the video is generated by combining the semantic masks, the optical flow maps, and the guiding pixel. The object probability map is then propagated forward and backward through the video, becoming more accurate for each frame. Finally, we obtain the pixel-wise object segmentation mask of each frame by solving an energy minimization problem whose energy function consists of a unary term based on the object probability and pairwise terms based on label smoothness potentials. We evaluate our method on a benchmark dataset, and the experimental results show that the proposed approach achieves impressive performance in comparison with state-of-the-art methods.
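The final step, minimizing an energy made of a unary object-probability term and pairwise label-smoothness terms, can be illustrated with a small sketch. The abstract does not give the exact potentials or the solver, so everything specific here is an assumption for illustration: the unary cost is taken as the negative log of the object probability, the pairwise cost is a constant Potts penalty `lam` on 4-connected neighbours with different labels, and the minimizer is iterated conditional modes (ICM), a simple local method swapped in for whatever solver the authors use. The function names are hypothetical.

```python
import math

EPS = 1e-6  # keeps log() finite when a probability is exactly 0 or 1


def unary(p_obj, label):
    """Assumed unary cost: -log of the probability of the chosen label
    (1 = object, 0 = background)."""
    p = min(max(p_obj, EPS), 1.0 - EPS)
    return -math.log(p if label == 1 else 1.0 - p)


def neighbors(r, c, h, w):
    """Yield the 4-connected neighbours of pixel (r, c) in an h x w grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w:
            yield rr, cc


def energy(labels, prob, lam):
    """Total energy: unary costs plus an assumed Potts penalty `lam`
    for every neighbouring pair with different labels."""
    h, w = len(prob), len(prob[0])
    total = 0.0
    for r in range(h):
        for c in range(w):
            total += unary(prob[r][c], labels[r][c])
            # Count each neighbouring pair once (right and down).
            if c + 1 < w and labels[r][c] != labels[r][c + 1]:
                total += lam
            if r + 1 < h and labels[r][c] != labels[r + 1][c]:
                total += lam
    return total


def segment_icm(prob, lam=0.5, max_sweeps=10):
    """Approximately minimize the energy with ICM (a stand-in solver):
    start from thresholding, then greedily relabel one pixel at a time."""
    h, w = len(prob), len(prob[0])
    labels = [[1 if p > 0.5 else 0 for p in row] for row in prob]
    for _ in range(max_sweeps):
        changed = False
        for r in range(h):
            for c in range(w):
                costs = []
                for lab in (0, 1):
                    disagree = sum(
                        1 for rr, cc in neighbors(r, c, h, w)
                        if labels[rr][cc] != lab
                    )
                    costs.append(unary(prob[r][c], lab) + lam * disagree)
                best = 0 if costs[0] <= costs[1] else 1
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels


# Toy object probability map: one uncertain pixel inside a confident region.
prob_map = [
    [0.9, 0.9, 0.9],
    [0.9, 0.4, 0.9],
    [0.9, 0.9, 0.9],
]
mask = segment_icm(prob_map, lam=0.5)
```

In this toy map, plain thresholding would label the centre pixel as background, while the smoothness term pulls it into the surrounding object region. ICM only finds a local minimum; for binary labels with this kind of pairwise term, graph-cut solvers are a common alternative that finds the global minimum.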
Jinyu Yang, Mingqi Gao, Feng Zheng, Xiantong Zhen, Rongrong Ji, Ling Shao, Aleš Leonardis
XiaoQing Bu, Yukuan Sun, Jianming Wang, Kunliang Liu, Jiayu Liang, Guanghao Jin, Tae‐Sun Chung
Weikang Wang, Yuting Su, Jing Liu, Wei Sun, Guangtao Zhai
Xiao Liu, Dacheng Tao, Mingli Song, Ying Ruan, Chun Chen, Jiajun Bu
Fanchao Lin, Hongtao Xie, Yan Li, Yongdong Zhang