Most salient object detection algorithms suffer from defects in single-feature detection and insufficient fusion of multiple features, resulting in saliency maps with unclear edges and poor background suppression. To address these problems, a salient object detection method with multi-scale visual perception and fusion is proposed, comprising a Multi-scale Visual Perception Module (MVPM) and a Multi-scale Feature Fusion Module (MFFM), which process the global information of salient objects and fuse multi-scale features, respectively. Based on a U-shaped network structure, the MVPM is constructed with dilated convolution to simulate the receptive field of the visual cortex, fully leveraging the role of dilated convolution in Convolutional Neural Networks (CNNs). The global spatial information of salient objects in the backbone network is extracted step by step, enhancing foreground saliency regions and suppressing background noise. The MFFM is designed with a feature pyramid and a spatial attention mechanism to fuse high-level semantic information with detailed information, restoring the spatial structure of salient objects while suppressing noise propagation. Experiments on five image datasets with complex background information, including ECSSD, DUTS, and SOD, show that the average F-measure of the proposed method reaches 88.4%, which is 14.2 percentage points higher than that of the baseline network U-Net, and the Mean Absolute Error (MAE) reaches 3.5%, which is 5.4 percentage points lower than that of the baseline.
WU Xiaoqin, ZHOU Wenjun, ZUO Chenglin, WANG Yifan, PENG Bo
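As a minimal illustration of why dilated convolution suits a multi-scale perception module like the MVPM, the sketch below computes the effective receptive field of a stack of stride-1 dilated 3x3 convolutions. The dilation rates and kernel size here are illustrative assumptions, not the paper's actual configuration.

```python
def effective_receptive_field(kernel_size=3, dilations=(1, 2, 4, 8)):
    """Effective receptive field of stacked stride-1 dilated convolutions.

    Each layer with kernel size k and dilation d widens the receptive
    field by (k - 1) * d, so doubling the dilation rate per layer grows
    the field exponentially with depth. This is why dilated convolutions
    can mimic large visual-cortex-like receptive fields without pooling
    or additional parameters.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Four 3x3 layers with dilations 1, 2, 4, 8 cover a 31x31 window,
# versus only 9x9 for four plain 3x3 convolutions (all dilations 1).
print(effective_receptive_field())                        # 31
print(effective_receptive_field(dilations=(1, 1, 1, 1)))  # 9
```

The gap between 31x31 and 9x9 coverage at identical parameter cost is what lets a module of this kind capture global spatial context of salient objects at several scales in parallel.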