Gaze estimation supports the understanding of human visual attention. Existing methods mainly learn a gaze mapping from facial or eye images, and most of them estimate only the gaze point or the gaze direction. In this paper, we propose a multitask gaze focus network that estimates both the gaze point and the gaze direction. A focus attention layer guides the generation of facial features: by combining eye and face features, feature similarity is used to compute attention weights, biasing the attention toward the eye regions. We propose four loss functions that constrain the network in 2D and 3D space. The combination of the eye-position constraint and the focus attention layer ensures accurate gaze point estimation, and the gaze focus is used to obtain gaze depth. Comprehensive experiments verify the advantages of the proposed method in gaze tracking and further demonstrate its application prospects in depth-overlapping scenarios.
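The similarity-based attention weighting described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the shapes, the cosine-similarity measure, and the function name `focus_attention` are all assumptions, standing in for a layer that reweights face-feature positions by their similarity to a pooled eye feature.

```python
import numpy as np

def focus_attention(face_feats, eye_feat):
    """Hypothetical sketch of a feature-similarity attention layer.

    face_feats: (N, D) array of N spatial face-feature vectors.
    eye_feat:   (D,) pooled eye-feature vector.
    Returns softmax attention weights over the N face positions and
    the attended (reweighted) face feature.
    """
    # Cosine similarity between the eye feature and each face position.
    face_norm = face_feats / (np.linalg.norm(face_feats, axis=1, keepdims=True) + 1e-8)
    eye_norm = eye_feat / (np.linalg.norm(eye_feat) + 1e-8)
    sim = face_norm @ eye_norm                      # (N,)

    # Softmax turns similarities into attention weights, so positions
    # whose features resemble the eye feature receive larger weights.
    w = np.exp(sim - sim.max())
    w /= w.sum()

    attended = w @ face_feats                       # (D,) attended face feature
    return w, attended
```

In this sketch, positions whose features resemble the eye feature dominate the weighted sum, which mirrors the abstract's claim that attention is made to tend toward the eye positions.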