This paper describes how a person extracts a unknown object with pointing gesture while interacting with a robot. Using a stereo vision sensor, our proposed method consists of two stages: the detection of the operators' face, the estimation of the pointing direction, and the extraction of the pointed object. The operator's face is recognized by using the Haar-like features. And then we estimate the 3D pointing direction from the shoulder-to-hand line. Finally, we segment an unknown object from 3D point clouds in estimated region of interest. On the basis of this proposed method, we implemented an object registration system with our mobile robot and obtained reliable experimental results.