Xiang Gu, C. Li, Jie Yang, Jing Wang, Qiwei Huang
Accurate pedestrian trajectory prediction is crucial for the safety of autonomous driving systems. However, the task still faces challenges in modeling long-term dependencies, complex spatial interactions, and multi-scale feature fusion. To address these issues, this paper proposes WAGIN (Windowed Attention Graph Interaction Network). In the temporal dimension, a window mask mechanism adjusts the attention receptive field at each time step to capture temporal dependencies. In the spatial dimension, a hierarchical heterogeneous graph convolutional network (GCN) combines dynamic pedestrian interaction graphs with static scene semantic graphs, and an interaction kernel function based on motion consistency models the interactions between individual pedestrians. Finally, a multi-scale dilated convolution network generates future trajectories, capturing multi-scale spatiotemporal features to improve prediction accuracy and robustness. The model is validated on the public ETH/UCY datasets, where it improves average displacement error (ADE) by 23% and final displacement error (FDE) by 21% over baseline methods. Qualitative analysis further indicates that the model generalizes well across different scenarios.
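The abstract reports ADE and FDE without defining them. As commonly defined in the trajectory prediction literature, ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final time step. The sketch below illustrates these definitions for a single trajectory. The function name `displacement_errors` and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length sequences of (x, y) positions,
    one entry per future time step. (Hypothetical helper for illustration.)
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each step
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    ade = sum(dists) / len(dists)  # average over all predicted steps
    fde = dists[-1]                # error at the final predicted step
    return ade, fde


# Toy data (assumed, not from the paper): the prediction drifts upward
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

In multi-pedestrian evaluation these per-trajectory values are typically averaged over all pedestrians and scenes; how the paper aggregates them is not stated in the abstract.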
Yanran Liu, Hongyan Guo, Qingyu Meng, Jialin Li
Chao Sun, Bo Wang, Jianghao Leng, Xiangchao Zhang, Bo Wang
Yonghong Li, Jiayi Cui, Zhiqiang Zhao, Laquan Li
Wei Kong, Yun Liu, Hui Li, Chuanxu Wang