With the rapid progress in autonomous driving technology, the integration of multiple sensors into autonomous driving systems has become crucial. Existing methods often use point-level fusion, where LiDAR point clouds are projected onto the image plane and fused with RGB features. However, this point-level fusion leads to a loss of semantic density in the RGB features during the projection. To overcome this limitation, recent methods transform RGB pixels into 3D space using depth prediction, generating virtual point clouds. While this preserves the semantic density of camera features, it introduces challenges such as increased computational load and depth-completion inaccuracies. In this paper, we propose a novel fusion method that unifies the representation of multimodal features in the Bird’s Eye View (BEV) space, preserving both geometric and semantic information. We introduce a BEV feature fusion module to effectively integrate rich semantic features from RGB data into voxel features. Furthermore, we employ the Focal Sparse Convolution module to stabilize feature learning through position-weighted predictions, thereby improving point cloud feature extraction. Our fusion approach thus retains semantic features while strengthening point cloud feature extraction. Experimental results on the public nuScenes dataset demonstrate superior performance in 3D object detection and tracking, highlighting the practical potential of this approach for autonomous driving systems.
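Since the abstract does not detail the internals of the BEV feature fusion module, the following PyTorch sketch illustrates one common way to combine camera and LiDAR features once both have been rasterized onto the same BEV grid: channel-wise concatenation followed by a convolutional block. The class name, channel dimensions, and the concatenate-then-convolve design are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class BEVFeatureFusion(nn.Module):
    """Fuses camera and LiDAR features that share a common BEV grid.

    Assumption-based sketch: the concatenate-then-convolve design and all
    channel sizes are hypothetical, not the paper's exact module.
    """

    def __init__(self, cam_channels: int = 80, lidar_channels: int = 256,
                 out_channels: int = 256):
        super().__init__()
        self.fuse = nn.Sequential(
            # Mix the concatenated modalities with a 3x3 convolution.
            nn.Conv2d(cam_channels + lidar_channels, out_channels,
                      kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, cam_bev: torch.Tensor, lidar_bev: torch.Tensor) -> torch.Tensor:
        # Both inputs must have the same BEV resolution: (B, C, H, W).
        fused = torch.cat([cam_bev, lidar_bev], dim=1)
        return self.fuse(fused)


if __name__ == "__main__":
    fusion = BEVFeatureFusion()
    cam = torch.randn(2, 80, 180, 180)     # camera semantic BEV features
    lidar = torch.randn(2, 256, 180, 180)  # LiDAR voxel/BEV features
    print(fusion(cam, lidar).shape)        # torch.Size([2, 256, 180, 180])
```

Concatenation keeps the two modalities' information intact before mixing, which is why it is a frequent baseline choice for BEV-space fusion; attention-based or gated variants are also possible but are not implied by the abstract.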