Understanding human behavior and forecasting pedestrian motion is a prerequisite for an automated car to navigate urban traffic safely and efficiently. When pedestrians interact with a vehicle, they follow specific motion patterns that depend on their intentions. This work presents an architecture based on a conditional generative adversarial network that explicitly models human intention as a conditional variable, so that it can robustly learn the multi-modal nature of pedestrian motion and predict future trajectories accurately. The generator in our framework uses an LSTM encoder-decoder conditioned on human intention for motion prediction, while the discriminator, an LSTM classifier, learns to distinguish whether a predicted trajectory is consistent with a given intention. Through experiments on two real-world datasets, we demonstrate that the proposed architecture outperforms state-of-the-art methods in terms of the average displacement error of predicted positions. Additionally, qualitative analysis shows that our model is capable of predicting a multi-modal distribution with respect to human intentions.
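As an illustration of the evaluation metric named above, the following is a minimal sketch of average displacement error under its usual definition: the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps. The function name, the 2D tuple representation, and the sample trajectories are assumptions made for this example, not details taken from this work.

```python
import math


def average_displacement_error(predicted, ground_truth):
    """Mean Euclidean distance between matching time steps of two trajectories.

    Both arguments are sequences of (x, y) positions of equal, non-zero length.
    This layout is an assumption for illustration only.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Hypothetical trajectories: the prediction stays on the x-axis while the
# ground truth drifts upward by one unit per time step.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade = average_displacement_error(predicted, ground_truth)
```

Here the per-step distances are 0, 1, and 2, so the sketch yields an error of 1.0; lower values indicate predictions that stay closer to the observed path.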
Prasen Kumar Sharma, Priyankar Jain, Arijit Sur
Mohammed Razzok, Abdelmajid Badri, Ilham El Mourabit, Yassine Ruichek, Aïcha Sahel
Binhao Huang, Zhenwei Ma, Lianggangxu Chen, Gaoqi He