Accurate camera tracking and dense scene reconstruction are essential foundations for automated surgical procedures. Many existing vision-based approaches struggle with the specular reflections that are prevalent in endoscopic videos. Moreover, the real-time reconstruction results of current methods often fail to provide reliable and precise geometric information for downstream tasks. This paper introduces a SLAM system tailored for endoscopic videos, based on neural implicit representation. Leveraging existing depth estimation models, we acquire geometric priors of the scene, enabling continuous and dense reconstruction of the surgical field. To mitigate the impact of challenges such as specular reflections on the system's performance, we employ neural radiance fields to model the intricate lighting conditions in the scene, thereby enhancing localization accuracy. The efficacy of our system is validated on both simulated datasets and real medical datasets.
Jiwei Shan, Yirui Li, Ting Xie, Hesheng Wang