WANG Anzhi, REN Chunhong, HE Linyan, YANG Yuanying, QU Weihua
Most existing deep-learning-based saliency detection algorithms focus on 2D RGB images and therefore fail to exploit the 3D visual information of a scene. Most light field saliency detection methods, in turn, rely on hand-crafted features whose representation capacity is insufficient. Both limitations lead to poor performance on many challenging scene images. To remedy these problems, this paper proposes a multi-modal, multi-level feature aggregation network, based on convolutional neural networks, for light field salient object detection. To fully exploit 3D visual information, two parallel sub-network streams are designed to process all-focus images and depth maps separately. In addition, several feature aggregation modules are developed to aggregate multi-level features for detecting the salient objects in the scene. Furthermore, several cross-modal feature fusion modules are designed to fuse multi-modal features from all-focus images, focal stacks, and depth maps, which highlights salient objects by utilizing 3D visual information. Comprehensive experimental comparisons were performed on the DUTLF-FS and HFUT-Lytro light field benchmark datasets, and the results show that the proposed algorithm outperforms mainstream salient object detection algorithms such as MOLF, AFNet, and DMRA on five standard evaluation metrics.
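The data flow described in the abstract (two parallel streams, cross-modal fusion at each level, then multi-level aggregation) can be illustrated with a toy sketch. This is not the authors' network: there are no learned weights, average pooling stands in for a convolutional backbone, and the focal stack is omitted. The fusion rule (element-wise mean), the top-down aggregation rule, and all function names are assumptions made for illustration only.

```python
import math

def avg_pool2(x):
    """Halve the resolution of a 2D grid with 2x2 average pooling."""
    h, w = len(x) // 2, len(x[0]) // 2
    return [[(x[2 * i][2 * j] + x[2 * i][2 * j + 1]
              + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]

def upsample2(x):
    """Double the resolution with nearest-neighbour upsampling."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def stream(x, levels=3):
    """Toy stand-in for one backbone stream: a fine-to-coarse feature pyramid."""
    feats = [x]
    for _ in range(levels - 1):
        feats.append(avg_pool2(feats[-1]))
    return feats

def fuse(a, b):
    """Hypothetical cross-modal fusion: element-wise mean of two equal-size maps."""
    return [[(p + q) / 2.0 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def aggregate(feats):
    """Hypothetical multi-level aggregation: go top-down, upsample and average."""
    out = feats[-1]
    for f in reversed(feats[:-1]):
        out = fuse(f, upsample2(out))
    return out

def saliency(all_focus, depth, levels=3):
    """Run both streams, fuse per level, aggregate, and squash to (0, 1).

    Height and width must be divisible by 2 ** (levels - 1).
    """
    rgb_feats = stream(all_focus, levels)      # stream 1: all-focus image
    depth_feats = stream(depth, levels)        # stream 2: depth map
    fused = [fuse(r, d) for r, d in zip(rgb_feats, depth_feats)]
    agg = aggregate(fused)
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in agg]

if __name__ == "__main__":
    # A 4x4 single-channel input with a bright 2x2 centre in both modalities.
    img = [[1.0 if 1 <= i <= 2 and 1 <= j <= 2 else 0.0 for j in range(4)]
           for i in range(4)]
    for row in saliency(img, img):
        print(["%.3f" % v for v in row])
```

In this toy setting, a region that is bright in both the all-focus input and the depth input ends up with a higher score than the background, which is the qualitative behaviour the fusion and aggregation modules aim for. A real implementation would replace each step with learned convolutional layers.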
Hu Huang, Ping Liu, Yanzhao Wang, Tongchi Zhou, Boyang Qu, Aimin Tao, Hao Zhang
Xinghe Yan, Zhenxue Chen, Q. M. Jonathan Wu, Mengxu Lu, Luna Sun
Anzhi Wang, Weihua Ou, Chunhong Ren, Yun Liu