JOURNAL ARTICLE

CL-fusionBEV: 3D object detection method with camera-LiDAR fusion in Bird’s Eye View

Peicheng Shi, Zhiqiang Liu, Xinlong Dong, Aixi Yang

Year: 2024  Journal: Complex & Intelligent Systems  Vol: 10 (6)  Pages: 7681-7696  Publisher: Springer Science+Business Media

Abstract

In the wave of research on autonomous driving, 3D object detection from the Bird's Eye View (BEV) perspective has emerged as a pivotal area of focus. The essence of this challenge is the effective fusion of camera and LiDAR data into the BEV. Current approaches predominantly train and predict within the front view and Cartesian coordinate system, often overlooking the inherent structural and operational differences between cameras and LiDAR sensors. This paper introduces CL-FusionBEV, an innovative 3D object detection methodology tailored for sensor data fusion in the BEV perspective. Our approach begins with a view transformation, facilitated by an implicit learning module that transitions the camera's perspective to the BEV space, thereby aligning it with the prediction module. Subsequently, to achieve modal fusion within the BEV framework, we employ voxelization to convert the LiDAR point cloud into BEV space, generating LiDAR BEV spatial features. Moreover, to integrate the BEV spatial features from both camera and LiDAR, we have developed a multi-modal cross-attention mechanism and an implicit multi-modal fusion network, designed to enhance the synergy and application of dual-modal data. To counteract potential deficiencies in global reasoning and feature interaction arising from multi-modal cross-attention, we propose a BEV self-attention mechanism that facilitates comprehensive global feature operations. Our methodology has undergone rigorous evaluation on a substantial dataset within the autonomous driving domain, the nuScenes dataset. The outcomes demonstrate that our method achieves a mean Average Precision (mAP) of 73.3% and a nuScenes Detection Score (NDS) of 75.5%, particularly excelling in the detection of cars and pedestrians with high accuracies of 89% and 90.7%, respectively. Additionally, CL-FusionBEV exhibits superior performance in identifying occluded and distant objects, surpassing existing comparative methods.
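Two of the building blocks named in the abstract can be sketched compactly: voxelizing a LiDAR point cloud into a BEV occupancy grid, and a single cross-attention step in which camera BEV features (queries) attend to LiDAR BEV features (keys/values). The sketch below is illustrative only, not the authors' implementation; the grid size, feature dimension, metric extent, and random inputs are all assumptions.

```python
# Illustrative sketch (NOT the CL-FusionBEV implementation): LiDAR-to-BEV
# voxelization plus one multi-modal cross-attention step, in plain NumPy.
import numpy as np

rng = np.random.default_rng(0)

# --- 1. LiDAR voxelization: scatter (x, y) points into an H x W BEV grid ---
H, W = 8, 8                  # BEV grid resolution (assumed)
extent = 40.0                # points span [-extent, extent] metres (assumed)
points = rng.uniform(-extent, extent, size=(500, 3))  # synthetic cloud (x, y, z)

ix = ((points[:, 0] + extent) / (2 * extent) * W).astype(int).clip(0, W - 1)
iy = ((points[:, 1] + extent) / (2 * extent) * H).astype(int).clip(0, H - 1)

lidar_bev = np.zeros((H, W))
np.add.at(lidar_bev, (iy, ix), 1.0)   # per-cell point count (occupancy feature)

# --- 2. Cross-attention: camera BEV queries attend to LiDAR BEV keys/values ---
d = 16                                         # feature dimension (assumed)
cam_feat = rng.standard_normal((H * W, d))     # camera BEV features -> queries
lid_feat = rng.standard_normal((H * W, d))     # LiDAR BEV features -> keys/values

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

attn = softmax(cam_feat @ lid_feat.T / np.sqrt(d))  # (HW, HW) attention weights
fused = attn @ lid_feat                             # LiDAR info routed to camera queries

print(lidar_bev.sum())   # every point lands in exactly one cell -> 500.0
print(fused.shape)       # (64, 16)
```

In the paper's full pipeline these features are learned (the camera branch via an implicit view-transformation module), the attention is multi-headed, and a BEV self-attention stage follows to restore global feature interaction; the sketch only shows the data flow.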

Keywords:
LiDAR, computer vision, artificial intelligence, computational intelligence, object detection, sensor fusion, remote sensing, pattern recognition

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 5.30
Refs: 47
Citation Normalized Percentile: 0.92 (in top 10%)

Topics

Advanced Neural Network Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Advanced Technologies in Various Fields
Physical Sciences → Computer Science → Artificial Intelligence
Robotics and Sensor-Based Localization
Physical Sciences → Engineering → Aerospace Engineering