JOURNAL ARTICLE

Adaptive Multiscale Attention Feature Aggregation for Multi‐Modal 3D Occluded Object Detection

Y. J. Han, Ming Yu, Jing Liu

Year: 2025 Journal: IET Computer Vision Vol: 19 (1) Publisher: Institution of Engineering and Technology

Abstract

Accurate perception and understanding of the three‐dimensional environment is crucial for autonomous vehicles to navigate efficiently and make sound decisions. However, in complex real‐world scenarios, the information obtained by a single‐modal sensor is often incomplete, which severely degrades the detection accuracy for occluded targets. To address this issue, this paper proposes a novel adaptive multi‐scale attention aggregation strategy that efficiently fuses multi‐scale feature representations of heterogeneous data to accurately capture the shape details and spatial relationships of targets in three‐dimensional space. The strategy uses learnable sparse keypoints to dynamically align heterogeneous features in a data‐driven manner, adaptively modelling the cross‐modal mapping between keypoints and their corresponding multi‐scale image features. Because accurate three‐dimensional shape information is essential for estimating the size and rotation pose of occluded targets, this paper also adopts a constraint method based on shape prior knowledge, together with a data augmentation strategy, to guide the model towards perceiving the complete three‐dimensional shape and rotation pose of occluded targets more accurately. Experimental results show that the proposed model achieves improvements of 2.15%, 3.24% and 2.75% in 3D R40 mAP under the easy, moderate and hard difficulty levels, respectively, compared to MVXNet, significantly enhancing the detection accuracy and robustness for occluded targets in complex scenarios.
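
The abstract does not give the aggregation formula, so the following is only a minimal sketch of the general idea of keypoint-based multi-scale aggregation, not the authors' implementation. It assumes a keypoint has already been projected to normalised image coordinates (u, v) in [0, 1], samples each level of a hypothetical image feature pyramid bilinearly at that location, and combines the samples with softmax attention weights. The function names and the fixed `scale_logits` are illustrative assumptions; in a trained model such logits would presumably be predicted from the keypoint's own feature.

```python
import math

def bilinear_sample(fmap, u, v):
    """Bilinearly sample an H x W x C feature map (nested lists) at normalised (u, v)."""
    h, w = len(fmap), len(fmap[0])
    x, y = u * (w - 1), v * (h - 1)          # map [0, 1] onto pixel-centre coordinates
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0                  # fractional offsets inside the cell
    channels = len(fmap[0][0])
    return [
        (1 - ay) * ((1 - ax) * fmap[y0][x0][k] + ax * fmap[y0][x1][k])
        + ay * ((1 - ax) * fmap[y1][x0][k] + ax * fmap[y1][x1][k])
        for k in range(channels)
    ]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_keypoint(pyramid, u, v, scale_logits):
    """Attention-weighted sum of the features sampled at one keypoint across scales.

    pyramid      -- list of feature maps (one per scale), all with C channels
    scale_logits -- one unnormalised attention score per scale (hypothetical input)
    """
    weights = softmax(scale_logits)
    samples = [bilinear_sample(fmap, u, v) for fmap in pyramid]
    channels = len(samples[0])
    return [sum(wt * s[k] for wt, s in zip(weights, samples)) for k in range(channels)]

# Toy two-level pyramid with a single channel per location.
fine = [[[0.0], [1.0]], [[2.0], [3.0]]]      # 2 x 2 x 1
coarse = [[[10.0]]]                          # 1 x 1 x 1
fused = aggregate_keypoint([fine, coarse], u=0.5, v=0.5, scale_logits=[0.0, 0.0])
```

With equal logits the two scales contribute equally; raising one logit shifts the fused feature towards that scale, which is the sense in which the weighting is adaptive in this sketch.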

Keywords:
Computer science, Modal, Artificial intelligence, Feature (linguistics), Object detection, Pattern recognition (psychology), Computer vision, Object (grammar), Feature extraction, Chemistry

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 32
Citation Normalized Percentile: 0.14

Topics

Robotics and Sensor-Based Localization: Physical Sciences → Engineering → Aerospace Engineering
Advanced Neural Network Applications: Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques: Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Adaptive multiscale feature for object detection

Xiao-Yong Yu, Siyuan Wu, Xiaoqiang Lu, Guilong Gao

Journal: Neurocomputing Year: 2021 Vol: 449 Pages: 146-158

JOURNAL ARTICLE

AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao, Bolei Zhou, Hang Zhao

Journal: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence Year: 2022 Pages: 827-833

JOURNAL ARTICLE

Attention guided multi-level feature aggregation network for camouflaged object detection

Anzhi Wang, Chunhong Ren, Shuang Zhao, Shibiao Mu

Journal: Image and Vision Computing Year: 2024 Vol: 144 Pages: 104953