Abstract

With the rise of multimodal learning, camera-LiDAR fusion has gained popularity in 3D object detection. Many fusion approaches have been proposed; they fall into two main categories, sparse-only and dense-only, differentiated by the feature representation used in the fusion module. Analyzing these approaches within a shared taxonomy, we identify two key challenges: (1) sparse-only methods preserve the 3D geometric prior but fail to capture the semantic richness of camera data, and (2) dense-only methods preserve semantic continuity at the expense of the precise geometric information provided by LiDAR. We conclude that, by architectural design, each family incurs some inevitable information loss. To counteract this loss, we introduce Sparse Dense Fusion (SD-Fusion), a framework that combines sparse and dense fusion modules via the Transformer architecture. This simple yet effective fusion strategy enhances semantic texture while simultaneously exploiting spatial structure. Applying SD-Fusion to two popular methods of moderate standalone performance yields a 4.3% gain in mAP and a 2.5% gain in NDS, ranking first on the nuScenes benchmark. Comprehensive ablation studies validate the effectiveness of our approach and empirically support our analysis.
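The abstract describes fusing a sparse (LiDAR, geometric) and a dense (camera, semantic) feature representation via Transformer-style attention. The following is a minimal illustrative sketch of that idea, not the authors' implementation: single-head scaled dot-product cross-attention in which each branch queries context from the other; all names, shapes, and the random features are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: each query gathers a weighted
    # combination of values, weighted by query-key similarity.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)    # (Nq, Nk)
    return softmax(scores, axis=-1) @ values  # (Nq, d)

rng = np.random.default_rng(0)
d = 32
sparse_lidar = rng.normal(size=(100, d))    # e.g. 100 voxel features (geometric prior)
dense_camera = rng.normal(size=(1024, d))   # e.g. 32x32 flattened image map (semantics)

# Sparse branch: LiDAR features query semantic context from the camera map.
sparse_fused = sparse_lidar + cross_attention(sparse_lidar, dense_camera, dense_camera)
# Dense branch: camera features query geometric context from the LiDAR set.
dense_fused = dense_camera + cross_attention(dense_camera, sparse_lidar, sparse_lidar)

print(sparse_fused.shape, dense_fused.shape)  # (100, 32) (1024, 32)
```

The residual additions keep each branch's native representation intact, which is how a combined sparse-dense design can retain both the geometric prior and the semantic texture that each single-representation family loses.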

Keywords:
Computer science, LiDAR, Sensor fusion, Artificial intelligence, Object detection, Pattern recognition, Computer vision, Data mining, Remote sensing

Metrics

Cited By: 8
Refs: 43
FWCI (Field Weighted Citation Impact): 1.46
Citation Normalized Percentile: 0.80

Topics

Advanced Neural Network Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Domain Adaptation and Few-Shot Learning (Physical Sciences → Computer Science → Artificial Intelligence)
Robotics and Sensor-Based Localization (Physical Sciences → Engineering → Aerospace Engineering)

Related Documents

JOURNAL ARTICLE
MMSDF: multimodal sparse dense fusion for 3D object detection
Yunfei Zhang, Feipeng Da, Shaoyan Gai
Journal: Applied Optics · Year: 2025 · Vol: 64 (30) · Pages: F13-F13

JOURNAL ARTICLE
Dense Voxel Fusion for 3D Object Detection
Anas Mahmoud, Jordan S. K. Hu, Steven L. Waslander
Journal: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) · Year: 2023 · Pages: 663-672

JOURNAL ARTICLE
Dense projection fusion for 3D object detection
Chen Zhao, Bin-Jie Hu, Chengxi Luo, Guohao Chen, Haohui Zhu
Journal: Scientific Reports · Year: 2024 · Vol: 14 (1) · Pages: 23492-23492

JOURNAL ARTICLE
Fully Sparse Fusion for 3D Object Detection
Yingyan Li, Lue Fan, Yang Liu, Zehao Huang, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence · Year: 2024 · Vol: 46 (11) · Pages: 7217-7231
© 2026 ScienceGate Book Chapters — All rights reserved.