Jesslyn Nathania, Qiyuan Liu, Zhiheng Li, Liming Liu, Yipeng Gao
This paper presents BEVCorner, a novel framework that integrates monocular and multi-view pipelines for enhanced 3D object detection in autonomous driving. By fusing dense depth maps from a Bird's-Eye View (BEV) pipeline with object-centric depth estimates from a monocular detector, BEVCorner combines global context with local precision, addressing the limitations of existing methods in depth accuracy, occlusion robustness, and computational efficiency. The paper explores four fusion techniques: direct replacement, weighted fusion, region-of-interest refinement, and hard combine, each balancing the strengths of monocular and BEV depth estimation differently. Initial experiments on the nuScenes dataset yield 38.72% NDS, below the BEVDepth baseline's 43.59% NDS, highlighting the difficulty of aligning the monocular pipeline with the BEV pipeline. Nevertheless, when the upper-bound performance of BEVCorner is assessed under ground-truth depth supervision, the results improve substantially to 53.21% NDS, at the cost of a 21.96% increase in parameters (from 76.4 M to 97.9 M). This upper-bound analysis highlights the promise of camera-only fusion for resource-constrained scenarios.
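To make the fusion idea concrete, the following is a minimal sketch of one of the four strategies mentioned above, weighted fusion: inside each 2D box proposed by the monocular detector, the BEV depth map is blended with the monocular depth estimate by a convex weight, while pixels outside any box keep the BEV depth. The function name, the box format, and the fixed blending weight `alpha` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def weighted_depth_fusion(bev_depth, mono_depth, boxes, alpha=0.6):
    """Blend a global BEV depth map with object-centric monocular depth.

    Hypothetical sketch (not BEVCorner's actual code): inside each detected
    2D box the fused depth is alpha * mono + (1 - alpha) * bev; elsewhere
    the BEV depth is kept unchanged.

    bev_depth, mono_depth : (H, W) arrays of per-pixel depth in meters
    boxes : iterable of (x1, y1, x2, y2) pixel boxes from the mono detector
    alpha : weight given to the monocular estimate inside each box
    """
    fused = bev_depth.copy()
    for x1, y1, x2, y2 in boxes:
        region = (slice(y1, y2), slice(x1, x2))
        fused[region] = alpha * mono_depth[region] + (1 - alpha) * bev_depth[region]
    return fused
```

Direct replacement would correspond to `alpha = 1.0`, and a learned or confidence-driven `alpha` per box would move this toward the region-of-interest refinement variant.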