JOURNAL ARTICLE

Bi-Directional Bird’s-Eye View Features Fusion for 3D Multimodal Object Detection and Tracking

Abstract

With the rapid progress of autonomous driving technology, integrating multiple sensors into autonomous driving systems has become crucial. Existing methods often use point-level fusion, in which LiDAR point clouds are projected onto a plane and fused with RGB features. However, this point-level approach loses the semantic density of the RGB features during the transformation. To overcome this limitation, recent methods transform RGB pixels into 3D space using depth prediction, generating virtual point clouds. While this preserves the semantic density of camera features, it introduces new challenges, including a higher computational load and depth completion inaccuracies. In this paper, we propose a novel fusion method that unifies the representation of multimodal features in the Bird’s-Eye View (BEV) space, preserving both geometric and semantic information. We introduce the BEV feature fuse module to integrate rich semantic features from RGB data into voxel features. Furthermore, we use the Focal Sparse Convolution module to stabilize feature learning through position-weighted predictions, thereby improving point cloud feature extraction. Together, these components retain the semantic features of the camera data while strengthening point cloud feature extraction. Experimental results on the public nuScenes dataset demonstrate superior performance in 3D object detection and tracking, highlighting the application potential of this approach in autonomous driving systems.
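
As an illustration of what fusing two modalities on a shared BEV grid can look like, the following is a minimal sketch and not the module described in the paper. It assumes hand-written LiDAR features (point count and maximum height per cell) in place of learned voxel features, a camera feature map that has already been lifted onto the same grid, and a fixed weight matrix standing in for a learned 1x1 convolution. The names `points_to_bev` and `fuse_bev` are hypothetical.

```python
from typing import List, Sequence, Tuple

Grid = List[List[List[float]]]  # indexed as grid[row][col][channel]


def points_to_bev(points: Sequence[Tuple[float, float, float]],
                  x_range: Tuple[float, float],
                  y_range: Tuple[float, float],
                  cell: float) -> Grid:
    """Rasterize (x, y, z) points into a BEV grid.

    Each cell holds two channels: [point count, max height].
    These hand-crafted channels are an assumption for illustration only.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    n_cols = int(round((x_max - x_min) / cell))
    n_rows = int(round((y_max - y_min) / cell))
    grid: Grid = [[[0.0, 0.0] for _ in range(n_cols)] for _ in range(n_rows)]
    for x, y, z in points:
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue  # drop points outside the BEV extent
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(int((x - x_min) / cell), n_cols - 1)
        row = min(int((y - y_min) / cell), n_rows - 1)
        feat = grid[row][col]
        feat[1] = z if feat[0] == 0.0 else max(feat[1], z)
        feat[0] += 1.0
    return grid


def fuse_bev(lidar_bev: Grid, camera_bev: Grid,
             weights: Sequence[Sequence[float]]) -> Grid:
    """Concatenate per-cell channels, then apply one shared linear map.

    Applying the same weight matrix to every cell is what a 1x1
    convolution does; here the weights are fixed rather than learned.
    """
    if len(lidar_bev) != len(camera_bev):
        raise ValueError("BEV grids must have the same number of rows")
    fused: Grid = []
    for l_row, c_row in zip(lidar_bev, camera_bev):
        if len(l_row) != len(c_row):
            raise ValueError("BEV grids must have the same number of columns")
        out_row = []
        for l_feat, c_feat in zip(l_row, c_row):
            concat = list(l_feat) + list(c_feat)
            out_row.append([sum(w * v for w, v in zip(w_row, concat))
                            for w_row in weights])
        fused.append(out_row)
    return fused
```

Because both inputs share one grid, every camera cell contributes to the fused output whether or not a LiDAR point fell into it, which is the property the abstract refers to as preserving semantic density.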

Keywords:
Computer science, Artificial intelligence, Point cloud, Computer vision, RGB color model, Feature extraction, Feature (linguistics), Object detection, Pattern recognition (psychology)

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 1.04
Refs: 20
Citation Normalized Percentile: 0.83

Topics

Robotics and Sensor-Based Localization
Physical Sciences →  Engineering →  Aerospace Engineering
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Optical Sensing Technologies
Physical Sciences →  Physics and Astronomy →  Instrumentation