Visual–Motion–Interaction-Guided Pedestrian Intention Prediction Framework

Neha Sharma; Chhavi Dhiman; S. Indu

doi:10.1109/jsen.2023.3317426

JOURNAL ARTICLE

Year: 2023
Journal: IEEE Sensors Journal
Vol: 23 (22)
Pages: 27540-27548
Publisher: IEEE Sensors Council

Abstract

Understanding the intentions of pedestrians on the road is one of the most crucial capabilities that current autonomous vehicles (AVs) must acquire to become fully autonomous. In recent years, multimodal methods that use trajectory, appearance, and context to predict pedestrian crossing intention have gained traction. However, most existing works still lack rich feature representations in the multimodal setting, which restricts their performance. Moreover, little emphasis has been placed on pedestrian interactions with the surroundings when predicting short-term pedestrian intention from challenging ego-centric views. To address these challenges, an efficient visual–motion–interaction-guided (VMI) intention prediction framework is proposed. The framework comprises a visual encoder (VE), a motion encoder (ME), and an interaction encoder (IE) to capture rich multimodal features of the pedestrian and its interactions with the surroundings, followed by temporal attention and an adaptive fusion (AF) module (AFM) to integrate these multimodal features efficiently. The proposed framework outperforms several state-of-the-art (SOTA) methods on two benchmark datasets, Pedestrian Intention Estimation (PIE) and Joint Attention in Autonomous Driving (JAAD), with PIE/JAAD accuracy, AUC, $F_1$-score, precision, and recall of 0.92/0.89, 0.91/0.90, 0.87/0.81, 0.86/0.79, and 0.88/0.83, respectively. Furthermore, extensive experiments investigate different fusion architectures and the design parameters of all encoders. The proposed VMI framework predicts pedestrian crossing intention 2.5 s ahead of the crossing event. Code is available at https://github.com/neha013/VMI.git.
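The reported $F_1$-scores can be sanity-checked against the reported precision and recall, since $F_1$ is their harmonic mean. The sketch below is only an illustrative check, not code from the authors' repository; the helper name `f1_score` and the `reported` dictionary are introduced here for illustration, and the inputs are the two-decimal values quoted in the abstract, so agreement is expected only up to rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the abstract: (precision, recall, reported F1).
# They are already rounded to two decimals, so this is an approximate check.
reported = {
    "PIE": (0.86, 0.88, 0.87),
    "JAAD": (0.79, 0.83, 0.81),
}

for dataset, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{dataset}: computed F1 = {f1:.2f}, reported F1 = {f1_reported:.2f}")
```

For both datasets the recomputed value rounds to the figure given in the abstract, so the three reported metrics are mutually consistent at two-decimal precision.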

Keywords:

Pedestrian; Encoder; Computer science; Benchmark (surveying); Artificial intelligence; Context (archaeology); Trajectory; Machine learning; Feature (linguistics); Human–computer interaction; Computer vision; Engineering

Metrics

FWCI (Field Weighted Citation Impact): 3.11

Citation Normalized Percentile: 0.88

Topics

Autonomous Vehicle Technology and Safety

Physical Sciences → Engineering → Automotive Engineering

Traffic and Road Safety

Physical Sciences → Engineering → Safety, Risk, Reliability and Quality

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Intelligent Pedestrian Intention Prediction Framework

Visual Exposes You: Pedestrian Trajectory Prediction Meets Visual Intention

Condition-Guided Diffusion for Multi-Modal Pedestrian Trajectory Prediction Incorporating Intention and Interaction Priors

Pedestrian Trajectory Prediction Driven by Bidirectional Intention-Interaction

TCP: Text-Guided Cascade Network for Pedestrian Crossing Intention Prediction