In this paper, we present a novel network architecture, the Multi-Scale Cascade Network (MSC-Net), for identifying the most visually conspicuous objects in an image. The network consists of several stages (sub-networks) that handle saliency detection at different scales. These sub-networks form a coarse-to-fine cascade in which the same underlying convolutional feature representations are fully shared. Compared with existing CNN-based saliency models, MSC-Net allows the finer cascade stages to encode more global contextual information while progressively incorporating the saliency priors obtained from the coarser stages, which leads to better detection accuracy. We also design a novel refinement module that further filters out errors by exploiting intermediate feedback information. MSC-Net is highly integrated and end-to-end trainable. It achieves state-of-the-art performance on five widely used salient object detection benchmarks while maintaining high efficiency. Code and pre-trained models are available at https://github.com/lixin666/MSC-NET.
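The coarse-to-fine cascade over shared features can be illustrated with a minimal NumPy sketch. This is not the MSC-Net implementation: the single 2-D array standing in for the shared convolutional features, the mean-contrast "stage head", and the names `cascade_saliency`, `factors`, and `prior_weight` are all assumptions made for illustration. It shows only the data flow the abstract describes, in which each finer stage reads the same features and adds the upsampled saliency map from the coarser stage as a prior.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a 2-D map by an integer factor (H and W must be divisible by it)."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(x, factor):
    """Nearest-neighbour upsampling of a 2-D map by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cascade_saliency(features, factors=(4, 2, 1), prior_weight=1.0):
    """Hypothetical coarse-to-fine cascade over one shared feature map.

    `factors` lists the downsampling factor of each stage, coarsest first;
    each factor must be a multiple of the next one.
    """
    outputs = []
    prior, prev_factor = None, None
    for factor in factors:
        # Every stage reads the same shared features, at its own scale.
        scaled = downsample(features, factor)
        # Stand-in stage head (assumption): contrast against the global mean.
        logits = scaled - scaled.mean()
        if prior is not None:
            # Inject the coarser stage's saliency map as a prior on this stage.
            coarse = upsample(prior, prev_factor // factor)
            logits = logits + prior_weight * (coarse - 0.5)
        prior, prev_factor = sigmoid(logits), factor
        outputs.append(prior)
    return outputs

# Toy input: a bright 4x4 square on a dark 8x8 background.
features = np.zeros((8, 8))
features[2:6, 2:6] = 4.0
maps = cascade_saliency(features)
print([m.shape for m in maps])
```

In this toy setting the coarsest stage cannot resolve the square at all, and the later stages sharpen it while being nudged by the coarser estimate. A learned model would replace the mean-contrast head with trained sub-networks.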
Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
Dengdi Sun, Hang Wu, Zhuanlian Ding, Sheng Li, Bin Luo
Chiheng Zhou, Zhengkai Wang, Yongxia Zhou, Chen Pan
Fen Xiao, Wenzheng Deng, Liangchan Peng, Chunhong Cao, Kai Hu, Xieping Gao