JOURNAL ARTICLE

Visual–Motion–Interaction-Guided Pedestrian Intention Prediction Framework

Neha Sharma, Chhavi Dhiman, S. Indu

Year: 2023 Journal: IEEE Sensors Journal Vol: 23 (22) Pages: 27540-27548 Publisher: IEEE Sensors Council

Abstract

The ability to comprehend the intention of pedestrians on the road is one of the most crucial skills that current autonomous vehicles (AVs) must acquire to become fully autonomous. In recent years, multimodal methods that employ trajectory, appearance, and context to predict pedestrian crossing intention have gained traction. However, most existing works still lack rich feature representations in a multimodal scenario, which restricts their performance. Moreover, little emphasis has been placed on pedestrian interactions with the surroundings when predicting short-term pedestrian intention in challenging ego-centric vision settings. To address these challenges, an efficient visual–motion–interaction-guided (VMI) intention prediction framework is proposed. The framework comprises a visual encoder (VE), a motion encoder (ME), and an interaction encoder (IE) to capture rich multimodal features of the pedestrian and its interactions with the surroundings, followed by temporal attention and an adaptive fusion module (AFM) to integrate these multimodal features efficiently. The proposed framework outperforms several state-of-the-art (SOTA) methods on two benchmark datasets, Pedestrian Intention Estimation (PIE) and Joint Attention in Autonomous Driving (JAAD), with PIE/JAAD accuracy, AUC, $F_1$-score, precision, and recall of 0.92/0.89, 0.91/0.90, 0.87/0.81, 0.86/0.79, and 0.88/0.83, respectively. Furthermore, extensive experiments investigate different fusion architectures and the design parameters of all encoders. The proposed VMI framework predicts pedestrian crossing intention 2.5 s ahead of the crossing event. Code is available at: https://github.com/neha013/VMI.git .
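As a quick sanity check on the reported figures, the sketch below recomputes the F1-score as the harmonic mean of the precision and recall values quoted in the abstract. The function name `f1_score` and the `reported` dictionary are illustrative and are not taken from the authors' repository. The paper may compute F1 from unrounded values, so agreement is only expected to two decimal places.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision, recall, and F1 exactly as quoted in the abstract (PIE / JAAD).
reported = {
    "PIE": {"precision": 0.86, "recall": 0.88, "f1": 0.87},
    "JAAD": {"precision": 0.79, "recall": 0.83, "f1": 0.81},
}

for name, m in reported.items():
    computed = f1_score(m["precision"], m["recall"])
    print(f"{name}: computed F1 = {computed:.2f}, reported F1 = {m['f1']:.2f}")
```

For both datasets the recomputed value rounds to the reported F1-score, so the three quoted metrics are mutually consistent at the stated precision.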

Keywords:
Pedestrian, Encoder, Computer science, Benchmark (surveying), Artificial intelligence, Context (archaeology), Trajectory, Machine learning, Feature (linguistics), Human–computer interaction, Computer vision, Engineering

Metrics

Cited By: 19
FWCI (Field Weighted Citation Impact): 3.11
Refs: 39
Citation Normalized Percentile: 0.88

Topics

Autonomous Vehicle Technology and Safety
Physical Sciences →  Engineering →  Automotive Engineering
Traffic and Road Safety
Physical Sciences →  Engineering →  Safety, Risk, Reliability and Quality
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Visual Exposes You: Pedestrian Trajectory Prediction Meets Visual Intention

Xian Zhong, Yan Xu, Zhengwei Yang, Wenxin Huang, Kui Jiang, Ryan Wen Liu, Zheng Wang

Journal: IEEE Transactions on Intelligent Transportation Systems Year: 2023 Vol: 24 (9) Pages: 9390-9400
JOURNAL ARTICLE

Condition-Guided Diffusion for Multi-Modal Pedestrian Trajectory Prediction Incorporating Intention and Interaction Priors

Yanghong Liu, Xingping Dong, Yutian Lin, Mang Ye, Kaihao Zhang, Bo Du

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence Year: 2025 Vol: PP Pages: 1-14
JOURNAL ARTICLE

TCP: Text-Guided Cascade Network for Pedestrian Crossing Intention Prediction

Yuhao Xiao, Wenxuan Liu, Wenxin Huang, Jie Ma, Ryan Wen Liu, Xian Zhong

Journal: IEEE Transactions on Intelligent Transportation Systems Year: 2025 Vol: 27 (1) Pages: 831-841