This paper presents a new benchmark system for visual odometry (VO) and monocular depth estimation (MDE). As deep learning has become a key technology in computer vision, many researchers have applied it to VO and MDE. Until a few years ago, the two problems were studied independently in a supervised manner, but they are now coupled and trained jointly in an unsupervised manner. However, before designing sophisticated models and losses, researchers must prepare customized datasets for training and testing. After training, the model must also be compared with existing models, which is another heavy burden. The proposed benchmark provides a ready-to-use input dataset for VO and MDE research in the 'tfrecords' format, and an output dataset that includes model checkpoints and inference results of existing models. It also provides various tools for data formatting, training, and evaluation. In the experiments, the existing models were evaluated to verify the performances reported in the corresponding papers, and we found that the evaluation results fall short of the reported figures.
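As a minimal sketch of how such a 'tfrecords' input dataset might be consumed in training code (the feature keys "image" and "depth_gt", the PNG encodings, and the file name are illustrative assumptions, not the benchmark's actual schema):

```python
import tensorflow as tf

# Hypothetical feature schema; the actual keys and encodings in the
# benchmark's tfrecords files may differ.
FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),     # encoded RGB frame
    "depth_gt": tf.io.FixedLenFeature([], tf.string),  # encoded depth map
}

def parse_example(serialized):
    """Decode one serialized example into image and depth tensors."""
    ex = tf.io.parse_single_example(serialized, FEATURES)
    image = tf.io.decode_png(ex["image"], channels=3)
    depth = tf.io.decode_png(ex["depth_gt"], channels=1, dtype=tf.uint16)
    return image, depth

# Stream examples for training a VO/MDE model.
dataset = (
    tf.data.TFRecordDataset("train.tfrecords")
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(8)
    .prefetch(tf.data.AUTOTUNE)
)
```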
Odometry using wheel encoders is a common relative positioning technique for wheeled mobile robots. Its major drawback is that kinematic modeling errors accumulate as the travel distance increases; therefore, accurate calibration of odometry is required. Several related works have proposed various schemes for odometry calibration, but design guidelines for the test tracks used in calibration have not been considered. More accurate calibration can be achieved with an appropriate test track, because the position and orientation errors measured after a test run depend on the track. In this paper, we propose design guidelines for odometry calibration test tracks based on experimental heading errors. Numerical simulations and experiments clearly demonstrate that the proposed design guidelines yield more accurate calibration results.
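The track dependence the guidelines exploit can be seen in a simple dead-reckoning model. The following is a minimal sketch for a differential-drive robot (the kinematics and parameter values are illustrative assumptions, not the paper's calibration scheme):

```python
import math

def integrate_odometry(wheel_pairs, r_left, r_right, wheelbase):
    """Dead-reckon a differential-drive pose (x, y, heading) from
    per-step left/right wheel rotations (in radians)."""
    x = y = theta = 0.0
    for d_phi_l, d_phi_r in wheel_pairs:
        d_left = r_left * d_phi_l
        d_right = r_right * d_phi_r
        d_center = 0.5 * (d_left + d_right)
        x += d_center * math.cos(theta)
        y += d_center * math.sin(theta)
        theta += (d_right - d_left) / wheelbase
    return x, y, theta

# A 0.5% wheel-radius mismatch left uncalibrated produces a heading
# error that grows with travel distance, which is why the terminal
# errors after a test run depend on the shape of the test track.
steps = [(0.1, 0.1)] * 1000  # nominally straight 10 m run
x, y, heading = integrate_odometry(steps, 0.100, 0.1005, 0.5)
print(f"terminal heading error: {math.degrees(heading):.2f} deg")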
Visual odometry is a popular approach to estimating robot motion using a monocular or stereo camera. This paper proposes a novel visual odometry scheme using a stereo camera for robust estimation of 6-DOF motion in dynamic environments. False feature matches and the uncertainty of the depth information provided by the camera generate outliers that deteriorate the estimation. The outliers are removed by analyzing the magnitude histogram of the motion vectors of corresponding features and by the RANSAC algorithm. Features extracted from a dynamic object, such as a human, also make the motion estimation inaccurate. To eliminate the effect of dynamic objects, candidate dynamic objects are generated by clustering the 3D positions of features, and each candidate is checked against the standard deviation of its features to decide whether it is a real dynamic object. The accuracy and practicality of the proposed scheme are verified by several experiments and comparisons with both IMU- and wheel-based odometry. It is shown that the proposed scheme works well when wheel slip occurs or dynamic objects are present.
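A minimal sketch of the magnitude-histogram idea follows; the bin count and the one-bin margin around the peak are illustrative assumptions, and the paper combines this filter with RANSAC and the dynamic-object check described above:

```python
import numpy as np

def histogram_inliers(prev_pts, curr_pts, n_bins=20):
    """Keep correspondences whose motion-vector magnitude falls near
    the dominant histogram bin, rejecting matches that disagree with
    the bulk motion. Points are (N, 2) arrays of pixel coordinates."""
    mags = np.linalg.norm(curr_pts - prev_pts, axis=1)
    counts, edges = np.histogram(mags, bins=n_bins)
    peak = np.argmax(counts)  # dominant motion magnitude
    lo = edges[max(peak - 1, 0)]
    hi = edges[min(peak + 2, n_bins)]
    mask = (mags >= lo) & (mags <= hi)
    return prev_pts[mask], curr_pts[mask]
```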
Recently, automatic parking assist systems have become commercially available in some cars. To improve the reliability and accuracy of parking control, the pose uncertainty of the vehicle and several experimental issues must be addressed. In this paper, the following three schemes are proposed: (1) an odometry calibration scheme for the car-like mobile robot (CLMR); (2) accurate localization using extended Kalman filter (EKF)-based redundant odometry fusion; and (3) a trajectory tracking controller to compensate for the tracking error of the CLMR. The proposed schemes are experimentally verified using a miniature car-like mobile robot. This paper shows that odometry accuracy and trajectory tracking performance can be dramatically improved by using the proposed schemes.
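To make the redundant-fusion idea concrete, here is a minimal sketch of fusing two pose estimates as a single Kalman update (the state layout and covariance values are invented for illustration; this is the generic fusion step, not the paper's full EKF formulation):

```python
import numpy as np

def kalman_fuse(x1, P1, x2, P2):
    """Fuse two redundant pose estimates (state vectors x with
    covariances P) into one estimate via the Kalman gain."""
    K = P1 @ np.linalg.inv(P1 + P2)   # gain weighting the two sources
    x = x1 + K @ (x2 - x1)            # fused state
    P = (np.eye(len(x1)) - K) @ P1    # fused covariance
    return x, P

# Example: two (x, y, heading) estimates with different uncertainties,
# e.g. from two independent odometry sources.
x1, P1 = np.array([1.00, 0.02, 0.01]), np.diag([0.04, 0.04, 0.01])
x2, P2 = np.array([1.03, 0.00, 0.00]), np.diag([0.01, 0.01, 0.02])
x, P = kalman_fuse(x1, P1, x2, P2)
print(x)
```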
This paper presents a new sensor system, CALOS, for motion estimation and 3D reconstruction. The 2D laser sensor provides accurate depth information on a plane, not the whole 3D structure. Conversely, the CCD cameras provide a projected image of the whole 3D scene, but not its depth. To overcome these limitations, we combine the two types of sensors: the laser sensor and the CCD cameras. We develop a motion estimation scheme appropriate for this sensor system. In the proposed scheme, the motion between two frames is estimated by using three points among the scan data and their corresponding image points, and is refined by non-linear optimization. We validate the accuracy of the proposed method by 3D reconstruction using real images. The results show that the proposed system can be a practical solution for motion estimation as well as for 3D reconstruction.
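As a rough illustration of recovering frame-to-frame motion from a few 3D points and their image projections, the following uses OpenCV's generic PnP solver; the point values and intrinsics are invented, and the iterative solver needs at least four correspondences, so this stands in for, rather than reproduces, the paper's three-point scheme:

```python
import numpy as np
import cv2

# 3D points from the laser scan, expressed in the previous frame
# (metres), and their projections in the current image (pixels).
# All values are made up for illustration.
object_pts = np.array([[0.5, 0.1, 2.0],
                       [-0.4, 0.0, 2.5],
                       [0.1, -0.2, 3.0],
                       [0.3, 0.2, 2.2]], dtype=np.float64)
image_pts = np.array([[420.0, 250.0],
                      [180.0, 240.0],
                      [330.0, 300.0],
                      [380.0, 220.0]], dtype=np.float64)
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])  # assumed camera intrinsics

# The iterative solver minimizes reprojection error, playing the role
# of the non-linear refinement step in the proposed scheme.
ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None,
                              flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)  # frame-to-frame rotation matrix
print(R, tvec)
```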