Gongpu Wu, Changyuan Wang, Lina Gao, Jinna Xue
Inferring the gaze target, or following a person's gaze, is an effective way to understand human behavior and intentions. This paper uses a non-intrusive, appearance-based tracking technique in which a binocular stereo vision camera captures the face image and head pose, reducing errors caused by loss of the eye image and by occlusion from head deflection during image capture. Each gaze direction is determined from a single image frame. To improve classification and detection of the gaze target region under head motion and changing view direction, this paper proposes a gaze target region classification method based on a hybrid Swin Transformer structure. Facial image features are extracted with both the ResNet50 model and the Swin Transformer model and then fused with head pose features to classify the gaze target region. The study also compares the classification performance of several model structures. The results show that the hybrid Swin Transformer model outperforms the compared models in classifying and detecting the gaze target region, achieving an accuracy of 90%. Finally, the study uses a heatmap to examine the gaze of flight trainees during flight missions, which lays the groundwork for future analyses of pilot attention and operational intentions during flight.
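The heatmap step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the classifier emits one integer region label per frame and that regions are numbered row by row on a rectangular grid; neither detail is stated in the abstract, and the function name `gaze_heatmap` and the sample predictions are hypothetical.

```python
from collections import Counter


def gaze_heatmap(frame_regions, n_rows, n_cols):
    """Turn per-frame region labels into a grid of dwell proportions.

    frame_regions: iterable of integer region indices, one per frame.
    Regions are assumed to be numbered row by row (an illustrative
    layout, not one taken from the paper).
    """
    counts = Counter(frame_regions)
    total = sum(counts.values())
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            region = r * n_cols + c
            # Share of all frames in which this region was the gaze target.
            row.append(counts[region] / total if total else 0.0)
        grid.append(row)
    return grid


# Hypothetical predictions for 8 frames over a 2x3 layout of regions.
predictions = [0, 0, 1, 4, 4, 4, 4, 5]
heatmap = gaze_heatmap(predictions, n_rows=2, n_cols=3)
```

Each cell holds the fraction of frames in which the corresponding region was the predicted gaze target, so the grid can be passed directly to a plotting routine as color intensities.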
Zhang Cheng, Yanxia Wang, Xinliang Liu, Wei Liang, Fenglin Huang
Jiahui Chen, Jiaxin Ma, Xiwen Wang, Longzhao Huang, Yujie Li
Ruijie Zhao, Yuhuan Wang, Sihui Luo, Suyao Shou, Pinyan Tang
Weiwei Sun, Fan Zhang, Ping Sun, Qishi Hu, Jianhong Wang, Minghui Zhang