We present a method for estimating a person's head pose with a stereo camera. Our approach targets human-robot interaction, where people may be farther away from the camera and move freely around a room. We show that depth information acquired from a stereo camera not only helps improve the accuracy of the pose estimation, but also makes the system more robust to changing lighting conditions. The estimation is based on neural networks, which are trained to compute the head pose from grayscale and disparity images of the stereo camera. The system can handle pan and tilt rotations from -90° to +90°, requires no manual initialization, does not suffer from drift during an image sequence, and is capable of real-time processing.
Edgar Seemann, Kai Nickel, Rainer Stiefelhagen
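As a rough illustration of the approach sketched in the abstract, the snippet below feeds concatenated grayscale and disparity patches of the head region through a small feed-forward network that regresses pan and tilt angles bounded to the stated -90° to +90° range. The patch size, layer sizes, and random weights are purely hypothetical; the paper's actual network architecture and training procedure are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def head_pose_net(gray_patch, disparity_patch, w1, w2):
    """Sketch of a feed-forward net mapping grayscale + disparity
    patches to (pan, tilt) in degrees. Hypothetical architecture."""
    x = np.concatenate([gray_patch.ravel(), disparity_patch.ravel()])
    h = np.tanh(w1 @ x)                 # hidden layer
    pan_tilt = 90.0 * np.tanh(w2 @ h)   # outputs bounded to [-90, +90] degrees
    return pan_tilt

# Hypothetical 16x16 head patches and randomly initialized (untrained) weights.
gray = rng.random((16, 16))
disp = rng.random((16, 16))
w1 = rng.standard_normal((32, 2 * 16 * 16)) * 0.01
w2 = rng.standard_normal((2, 32)) * 0.01

pan, tilt = head_pose_net(gray, disp, w1, w2)
```

The tanh on the output layer is one simple way to enforce the bounded rotation range; in practice such a network would be trained on labeled head images rather than used with random weights.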