Monocular depth estimation is an important topic in minimally invasive surgery, providing valuable information for downstream applications such as navigation systems. Deep learning for this task requires large amounts of training data to obtain an accurate and robust model. In the medical field in particular, acquiring ground truth depth information is rarely possible due to patient safety and technical limitations. Many approaches tackle this problem, including the use of synthetic data. This raises the question of how well synthetic data allows the prediction of depth information on clinical data. To evaluate this, synthetic data is used to train and optimize a U-Net, including hyperparameter tuning and augmentation. The trained model is then used to predict depth on clinical images, and the predictions are analyzed with respect to quality and to consistency across the same scene, time, and color. The results demonstrate that synthetic data sets can be used for training, reaching an accuracy of over 77% and an RMSE below 10 mm on the synthetic data set, and that the trained model performs well on clinical data that resembles the synthetic data, but also shows limitations due to the complexity of clinical environments. Synthetic data sets are a promising approach for enabling monocular depth estimation in fields that otherwise lack data.
Heiko Walkner, Lorena Krames, Werner Nahm
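The abstract reports an accuracy and an RMSE without stating how they are computed. The following is a minimal sketch of how such depth metrics are commonly evaluated, not the authors' implementation. It assumes that "accuracy" means the threshold accuracy often used in depth estimation (the fraction of pixels with max(pred/gt, gt/pred) below 1.25); the abstract does not confirm this definition. The function names and the sample values are hypothetical.

```python
import math


def rmse(pred, gt):
    """Root mean squared error between predicted and reference depths.

    Both inputs are flat sequences in the same unit (e.g. millimetres).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    squared_errors = [(p - g) ** 2 for p, g in zip(pred, gt)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels whose depth ratio stays below `threshold`.

    ASSUMPTION: this is the common delta < 1.25 metric; the abstract does
    not say which accuracy definition was used. Depths must be positive.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(pred)


# Hypothetical per-pixel depths in millimetres, for illustration only.
pred_mm = [50.0, 62.0, 80.0, 120.0]
gt_mm = [52.0, 60.0, 110.0, 118.0]

print(f"RMSE: {rmse(pred_mm, gt_mm):.2f} mm")
print(f"Threshold accuracy: {threshold_accuracy(pred_mm, gt_mm):.2f}")
```

In this toy example, three of the four pixels fall within the ratio threshold, while the single large error on the third pixel dominates the RMSE. This is why the two metrics are usually reported together.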