Face-related applications such as face recognition, facial expression recognition, driver state monitoring, and gaze estimation are among the most frequently performed tasks in human-robot interaction (HRI), intelligent vehicles, and security systems. Accurate head pose estimation is an important component of these applications. However, conventional methods lack the accuracy, robustness, or processing speed required in practical use. In this paper, we propose a novel method for estimating head pose with a monocular camera. The proposed algorithm is based on a deep neural network for multi-task learning that operates on a small grayscale image. The network jointly detects multi-view faces and estimates head pose under difficult environmental conditions such as illumination changes and large pose changes. The proposed framework quantitatively and qualitatively outperforms the state-of-the-art method, achieving an average mean head pose error of less than 4.5° while running in real time.
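The error figure quoted above is an average over pose angles. The following is a minimal sketch of how such a metric could be computed, assuming the pose is expressed as yaw, pitch, and roll in degrees and that the error is the mean absolute difference per angle, then averaged over the three angles. The function names, the pose convention, and the wrap-around handling are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed convention: (yaw, pitch, roll) in degrees.
Pose = Tuple[float, float, float]


def angular_diff(a: float, b: float) -> float:
    """Absolute difference between two angles in degrees, wrapped to [0, 180]."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def mean_pose_error(
    pred: Sequence[Pose], truth: Sequence[Pose]
) -> Tuple[Tuple[float, ...], float]:
    """Return (per-angle mean absolute error, average over the three angles)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    n = len(pred)
    # Mean absolute error for each of yaw, pitch, and roll.
    per_axis = tuple(
        sum(angular_diff(p[i], t[i]) for p, t in zip(pred, truth)) / n
        for i in range(3)
    )
    # Single summary number: average of the three per-angle errors.
    return per_axis, sum(per_axis) / 3.0
```

Wrapping the difference keeps a prediction of 350° against a ground truth of 10° from being counted as a 340° error; whether the original evaluation needed this depends on its angle range, which the abstract does not state.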
Finding a person's head in a scene is essential for a robot photographer to take a well-composed picture, because the composition depends on the position of the head. In this paper, we propose a robust head tracking algorithm that combines an omega shape tracker with a local binary pattern (LBP) AdaBoost face detector, so that the robot photographer can take good pictures automatically. Face detection algorithms perform well on frontal faces but not on rotated faces. They also struggle when the face is occluded by a hat or hands. To address this problem, an omega shape tracker based on the active shape model (ASM) is presented. The omega shape tracker is robust to occlusion and illumination change. However, its performance is unsatisfactory in dynamic environments, such as when people move fast or the background is complex. Therefore, to find the human head robustly, this paper proposes a method that combines the face detection algorithm and the omega shape tracker probabilistically using the histogram of oriented gradients (HOG) descriptor. A robot photographer was also implemented that follows the 'rule of thirds' and takes photos when people smile.
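To illustrate how a tracked head position can drive composition under the rule of thirds, here is a minimal sketch. It assumes the head is represented by a single center point in pixel coordinates and that the goal is to move it onto the nearest of the four intersections of the thirds grid. The function names and this particular strategy are hypothetical; the abstract does not describe how the robot applies the rule.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels, origin at the top-left corner


def thirds_points(width: float, height: float) -> List[Point]:
    """Four intersections of the lines that divide the frame into thirds."""
    xs = (width / 3.0, 2.0 * width / 3.0)
    ys = (height / 3.0, 2.0 * height / 3.0)
    return [(x, y) for y in ys for x in xs]


def offset_to_nearest_third(width: float, height: float, head: Point) -> Point:
    """Pixel shift (dx, dy) that would move the head onto the nearest intersection."""
    hx, hy = head
    # Pick the intersection with the smallest squared distance to the head.
    tx, ty = min(
        thirds_points(width, height),
        key=lambda p: (p[0] - hx) ** 2 + (p[1] - hy) ** 2,
    )
    return (tx - hx, ty - hy)
```

A controller could convert the returned pixel offset into pan and tilt commands; that mapping depends on the camera and robot and is left out here.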