Robots need to understand as much as possible about their environment and react appropriately to any event that calls for a change in their behavior. In this paper, we focus on topological relations between spatial objects and propose a model of robotic cognition that represents and infers temporal relations. Specifically, the proposed model extracts specified features from the co-occurrence matrix computed from the disparity images of a stereo vision system. More importantly, a habituation model is used to infer intrinsic spatial relations between objects. A preliminary experimental investigation is carried out to verify the validity of the proposed method under real test conditions.
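As an illustration of the feature-extraction step, the following is a minimal sketch of computing a co-occurrence matrix from a disparity image and deriving a few texture features from it. It is not the implementation used in this work: the function names `cooccurrence_matrix` and `texture_features`, the number of quantization levels, the pixel offset, and the choice of contrast, energy, and homogeneity as features are all assumptions made for the example, since the abstract does not state which features are extracted.

```python
import numpy as np

def cooccurrence_matrix(image, levels=8, offset=(0, 1)):
    """Normalized co-occurrence matrix of a quantized image for one offset.

    `levels` and `offset` are illustrative defaults (assumptions), and the
    offset components are assumed to be non-negative.
    """
    img = np.asarray(image, dtype=np.float64)
    span = img.max() - img.min()
    if span == 0:
        # A constant image falls entirely into bin 0.
        q = np.zeros(img.shape, dtype=np.intp)
    else:
        # Quantize values into `levels` bins, clamping the maximum value.
        q = (levels * (img - img.min()) / span).astype(np.intp)
        q = np.minimum(q, levels - 1)
    dr, dc = offset
    rows, cols = q.shape
    ref = q[:rows - dr, :cols - dc]   # reference pixels
    nbr = q[dr:, dc:]                 # neighbours at the given offset
    counts = np.zeros((levels, levels), dtype=np.float64)
    # Accumulate one count per (reference level, neighbour level) pair.
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    return counts / counts.sum()

def texture_features(p):
    """A few common co-occurrence features (the selection is an assumption)."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }

# Toy 4x4 "disparity image" with four disparity levels.
disparity = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [0, 2, 2, 2],
                      [2, 2, 3, 3]])
features = texture_features(cooccurrence_matrix(disparity, levels=4))
```

The matrix here counts ordered pixel pairs for a single horizontal offset; a symmetric or multi-direction variant would be an equally reasonable choice and would change the feature values.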