Wenda Li, Yuichiro Hayashi, Masahiro Oda, Takayuki Kitasaka, Kazunari Misawa, Kensaku Mori
Abstract

Purpose: Depth estimation is a powerful tool for navigation in laparoscopic surgery. Previous methods use predicted depth maps and the relative poses of the camera to accomplish self-supervised depth estimation. However, the smooth surfaces of organs with textureless regions and the laparoscope's complex rotations make depth and pose estimation difficult in laparoscopic scenes. Therefore, we propose a novel and effective self-supervised monocular depth estimation method with self-attention-guided pose estimation and a joint depth-pose loss function for laparoscopic images.

Methods: We extract feature maps and calculate the minimum re-projection error as a feature-metric loss, establishing constraints on feature maps with more meaningful representations. Moreover, we introduce a self-attention block into the pose estimation network to predict the rotations and translations of the relative poses. In addition, we minimize the difference between predicted relative poses as a pose loss. We combine all of these losses into a joint depth-pose loss.

Results: The proposed method is extensively evaluated on the SCARED and Hamlyn datasets. Quantitative results show that, when all of the proposed components are combined, the method improves the absolute relative error of depth estimation by about 18.07% on SCARED and 14.00% on Hamlyn. The qualitative results show that the proposed method produces smooth depth maps with low error in various laparoscopic scenes. The proposed method also exhibits a trade-off between computational efficiency and performance.

Conclusion: This study considers the characteristics of laparoscopic datasets and presents a simple yet effective self-supervised monocular depth estimation method. We propose a joint depth-pose loss function based on the extracted features for depth estimation on laparoscopic images, guided by a self-attention block. The experimental results demonstrate that all of the proposed components contribute to the proposed method. Furthermore, the proposed method strikes an efficient balance between computational efficiency and performance.
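To make the loss design described in the abstract concrete, the following is a minimal NumPy sketch of a joint depth-pose loss: a feature-metric term taking the per-pixel minimum re-projection error over warped source-frame feature maps, plus a pose-consistency term penalizing disagreement between the predicted forward relative pose and the predicted backward relative pose. Function names, tensor shapes, and the loss weights `w_feat` and `w_pose` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def feature_metric_loss(target_feat, warped_source_feats):
    """Minimum re-projection error computed on feature maps.

    target_feat: (C, H, W) feature map of the target frame.
    warped_source_feats: list of (C, H, W) source-frame feature maps
        warped into the target view using predicted depth and pose.
    """
    # Per-pixel L1 error against each warped source frame, then a
    # per-pixel minimum so an occlusion in one source frame does not
    # dominate the loss (Monodepth2-style minimum re-projection).
    per_pixel = [np.abs(target_feat - wf).mean(axis=0)
                 for wf in warped_source_feats]
    return np.minimum.reduce(per_pixel).mean()

def pose_consistency_loss(T_fwd, T_bwd):
    """Penalize the difference between predicted relative poses.

    T_fwd, T_bwd: 4x4 homogeneous transforms; if both predictions are
    consistent, the forward pose composed with the backward pose is
    the identity.
    """
    return np.abs(T_fwd @ T_bwd - np.eye(4)).mean()

def joint_depth_pose_loss(target_feat, warped_source_feats,
                          T_fwd, T_bwd, w_feat=1.0, w_pose=0.1):
    # Weighted combination of the feature-metric and pose terms
    # (illustrative weights; the paper's weighting may differ).
    return (w_feat * feature_metric_loss(target_feat, warped_source_feats)
            + w_pose * pose_consistency_loss(T_fwd, T_bwd))
```

In a training loop, `target_feat` and `warped_source_feats` would come from a shared feature extractor and a differentiable warping step, with autograd replacing NumPy; the sketch only illustrates how the two terms are combined into one scalar objective.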