Zhihong Zeng, Haijun Liu, Fenglei Chen, Xiaoheng Tan
Salient object detection (SOD) aims to identify the most prominent regions in images. However, the large model sizes, high computational costs, and slow inference speeds of existing RGB-D SOD models hinder their deployment on real-world embedded devices. To address this issue, we propose AirSOD, a novel method for lightweight RGB-D SOD. Specifically, we first design a hybrid feature extraction network that combines the first three stages of MobileNetV2 with our Parallel Attention-Shift convolution (PAS) module. The PAS module captures both long-range dependencies and local information to enhance representation learning while significantly reducing the number of parameters and the computational complexity. Second, we propose a Multi-level and Multi-modal feature Fusion (MMF) module to facilitate feature fusion, and a Multi-path enhancement for Feature Refinement (MFR) decoder for feature integration. Compared with the state-of-the-art lightweight model MobileSal, the proposed method reduces the model size by 63%, decreases the computational complexity by 43%, and improves the inference speed by 43%. We evaluate AirSOD on six widely used RGB-D SOD datasets, and extensive experimental results demonstrate that our method achieves satisfactory performance. The source code will be made available.
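The abstract does not detail the PAS module's internals, but "shift convolution" conventionally refers to a parameter-free spatial shift (as in ShiftNet): channel groups are displaced by one pixel in different directions so that a following 1x1 convolution can mix spatial context at near-zero cost, which is how such designs cut parameters and FLOPs. A minimal NumPy sketch of that generic shift operation, with all names hypothetical and no claim about AirSOD's actual implementation:

```python
import numpy as np

def spatial_shift(x):
    """Parameter-free spatial shift (generic ShiftNet-style sketch).

    x: feature map of shape (C, H, W). Channels are split into four
    groups, each shifted by one pixel in a different direction;
    vacated positions are zero-filled. A subsequent 1x1 convolution
    can then aggregate spatial context with no parameters spent here.
    """
    c, h, w = x.shape
    g = c // 4
    out = np.zeros_like(x)
    out[0*g:1*g, 1:, :] = x[0*g:1*g, :-1, :]   # group 0: shift down
    out[1*g:2*g, :-1, :] = x[1*g:2*g, 1:, :]   # group 1: shift up
    out[2*g:3*g, :, 1:] = x[2*g:3*g, :, :-1]   # group 2: shift right
    out[3*g:, :, :-1] = x[3*g:, :, 1:]         # remainder: shift left
    return out

# Example: a 4-channel all-ones map; each group's vacated border row/column
# becomes zero after the shift.
y = spatial_shift(np.ones((4, 3, 3)))
```

Because the shift itself has no learnable weights, all capacity sits in the cheap pointwise convolutions around it, which is the usual rationale for pairing shifts with attention in lightweight backbones.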