JOURNAL ARTICLE

Audio-Visual Depth and Material Estimation for Robot Navigation

Justin Wilson, Nicholas Rewkowski, Ming C. Lin

Year: 2022 Journal: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) Pages: 9239-9246

Abstract

Reflective and textureless surfaces such as windows, mirrors, and walls can be challenging for scene reconstruction because they produce depth discontinuities and holes. We propose an audio-visual method that uses the reflections of sound to aid in depth estimation and material classification for 3D scene reconstruction in robot navigation and AR/VR applications. Our mobile phone prototype emits pulsed audio while recording video, and the recordings are used for audio-visual classification in 3D scene reconstruction. Reflected sound and images from the video are input into our audio (EchoCNN-A) and audio-visual (EchoCNN-AV) convolutional neural networks for surface and sound source detection, depth estimation, and material classification. The inferences from these classifications enhance 3D scene reconstructions containing open spaces and reflective surfaces by depth filtering, inpainting, and placement of unmixed sound sources in the scene. Our prototype, demos, and experimental results from real-world scenes with challenging surfaces and sounds, also validated with virtual scenes, indicate high success rates in material classification, depth estimation, and closed/open surface classification, leading to considerable improvement in 3D scene reconstruction for robot navigation.
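
As a rough illustration of why reflected sound carries depth information, the sketch below estimates the distance to a surface from the delay between an emitted pulse and its echo, using the round-trip relation d = c * t / 2. This is not the EchoCNN-A or EchoCNN-AV method described in the abstract, which uses convolutional neural networks; it is a minimal time-of-flight example under stated assumptions. The function names, the synthetic signal, the 48 kHz sample rate, and the speed of sound of 343 m/s are all illustrative assumptions, and the direct speaker-to-microphone path is ignored.

```python
# Minimal time-of-flight sketch (NOT the paper's CNN-based method).
# All names and values here are hypothetical, chosen for illustration.

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees Celsius


def echo_delay_samples(emitted, recorded):
    """Return the lag (in samples) where `emitted` best matches `recorded`.

    Uses a plain sliding dot product (cross-correlation) and keeps the
    first lag with the highest score.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(recorded) - len(emitted) + 1):
        score = sum(e * recorded[lag + i] for i, e in enumerate(emitted))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def echo_distance_m(delay_samples, sample_rate_hz, speed=SPEED_OF_SOUND_M_S):
    """Convert a round-trip delay to a one-way distance: d = c * t / 2."""
    return speed * (delay_samples / sample_rate_hz) / 2.0


# Synthetic recording: silence plus one attenuated copy of the pulse.
sample_rate_hz = 48_000
pulse = [1.0, -1.0, 1.0, -1.0]
true_delay = 280  # samples between emission and echo arrival (assumed)
recorded = [0.0] * 1000
for i, p in enumerate(pulse):
    recorded[true_delay + i] += 0.5 * p

delay = echo_delay_samples(pulse, recorded)
distance = echo_distance_m(delay, sample_rate_hz)
print(f"delay = {delay} samples, distance = {distance:.3f} m")
```

In a real recording the echo overlaps with the direct signal, noise, and multiple reflections, which is one reason a learned model over the reflected audio may be preferable to a single correlation peak.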

Keywords:
Computer vision, Computer science, Artificial intelligence, Inpainting, Depth map, Convolutional neural network, Image (mathematics)

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.42
Refs: 82
Citation Normalized Percentile: 0.54

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

BOOK-CHAPTER

Audio Visual Language Maps for Robot Navigation

Chenguang Huang, Oier Mees, Andy Zeng, Wolfram Burgard

Springer proceedings in advanced robotics Year: 2024 Pages: 105-117

JOURNAL ARTICLE

Monocular Depth Estimation using CNN for Robot Navigation

Kento YAMAGATA, Jun Miura

Journal: The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) Year: 2020 Vol: 2020 (0) Pages: 1P2-I02