This paper describes a new method for indoor environment mapping and localization with stereo camera. For environmental modeling, we directly use the depth and color information in image pixels as visual features. Furthermore, only the depth and color information at horizontal centerline in image is used, where optical axis passes through. The usefulness of this method is that we can easily build a measure between modeling and sensing data only on the horizontal centerline. That is because vertical working volume between model and sensing data can be changed according to robot motion. Therefore, we can build a map about indoor environment as compact and efficient representation. Also, based on such nodes and sensing data, we suggest a method for estimating mobile robot positioning with random sampling stochastic algorithm. With basic real experiments, we show that the proposed method can be an effective visual navigation algorithm.