We humans are good at analyzing and inferring information about the 3D world we live in. We do so by combining information from multiple sense organs with prior knowledge of an object's geometry. Thus, even if an object is occluded, the guess will be almost right. Humans normally use a combination of stereo and monocular cues to detect an object and localize it, but the situation is different for robots and self-driving vehicles. Capturing and understanding third-dimension information in a world coordinate system is challenging. Active sensors such as LiDAR address this problem, but the sparsity of their data and their cost hinder the development of such applications. Estimating depth from 2D images is a promising research area that can in turn lead to 3D reconstruction and 3D object detection. Unsupervised learning is gaining interest because it does not require ground truth for training. In this paper, we propose deep neural networks (DNNs) for depth estimation using unsupervised learning and evaluate the proposed methods with the standard KITTI metrics; the results show a promising direction for self-driving cars. Our proposed methods outperform state-of-the-art unsupervised depth estimation methods while using approximately 75% less training data and a lower input resolution.
Chih-Shuan Huang, Wan-Nung Tsung, Wei-Jong Yang, Chin-Hsing Chen
赵栓峰 (Zhao Shuanfeng), Tao Huang, 许倩 (Xu Qian), 耿龙龙 (Geng Longlong)
Tomoyasu Shimada, Hiroki Nishikawa, Xiangbo Kong, Hiroyuki Tomiyama