JOURNAL ARTICLE

MMSDF: multimodal sparse dense fusion for 3D object detection

Yunfei Zhang, Feipeng Da, Shaoyan Gai

Year: 2025   Journal: Applied Optics   Vol: 64 (30)   Pages: F13-F13   Publisher: Optica Publishing Group

Abstract

High-precision 3D object detection in autonomous driving requires effective LiDAR-camera fusion. However, the heterogeneous nature of these modalities makes it challenging to fully integrate geometric and semantic information. Existing methods adopt either sparse or dense fusion: sparse fusion retains geometric accuracy but lacks semantic richness, while dense fusion offers richer semantics but suffers from inefficiency and noise sensitivity. To address this, we propose multimodal sparse dense fusion (MMSDF), a complementary framework that combines both fusion strategies. It includes (1) a sparse fusion attention (SFA) module that projects non-empty LiDAR voxels onto the image plane to extract local semantic features; (2) a dense bird's-eye-view (BEV) feature alignment (BFA) module that uses optical flow and frequency-domain convolutions to align LiDAR and image BEV features; and (3) an RoI point-voxel fusion attention (RPVFA) module that enhances RoI representations via cross-attention between point and multiscale voxel features. Experiments on KITTI show that MMSDF achieves 88.21% and 84.26% accuracy on the validation and test sets, respectively, with ablation studies confirming the effectiveness of each module.
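The projection step described for the SFA module can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes voxel centers are already expressed in the camera frame, that a 3x4 projection matrix `P` is available, and that image features are read with a nearest-pixel lookup in place of any attention mechanism. The function names `project_voxels_to_image` and `gather_image_features` are hypothetical.

```python
import numpy as np


def project_voxels_to_image(voxel_centers, P, image_hw):
    """Project (N, 3) voxel centers (assumed camera frame) to pixel coordinates.

    P is an assumed 3x4 camera projection matrix. Returns (uv, valid), where
    uv has shape (N, 2) and valid marks points that lie in front of the
    camera and fall inside the image bounds.
    """
    n = voxel_centers.shape[0]
    homo = np.hstack([voxel_centers, np.ones((n, 1))])  # (N, 4) homogeneous
    proj = homo @ P.T                                   # (N, 3)
    depth = proj[:, 2]
    in_front = depth > 1e-6
    # Avoid dividing by zero or negative depth; such points are marked invalid.
    safe_depth = np.where(in_front, depth, 1.0)
    uv = proj[:, :2] / safe_depth[:, None]
    h, w = image_hw
    inside = (
        (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    )
    return uv, in_front & inside


def gather_image_features(feature_map, uv, valid):
    """Nearest-pixel lookup in an (H, W, C) feature map; zeros where invalid."""
    out = np.zeros((uv.shape[0], feature_map.shape[2]), dtype=feature_map.dtype)
    cols = np.floor(uv[valid, 0]).astype(int)
    rows = np.floor(uv[valid, 1]).astype(int)
    out[valid] = feature_map[rows, cols]
    return out
```

Only non-empty voxels would be passed in, so the cost scales with the number of occupied voxels rather than with the full grid, which is the usual motivation for sparse fusion.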

Keywords:
Optics, Fusion, Computer science, Artificial intelligence, Computer vision, Materials science, Physics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 48
Citation Normalized Percentile: 0.22

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Processing and 3D Reconstruction
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
