RelMobNet: End-to-end relative camera pose estimation using a robust two-stage training

Bibliographic Details
Title: RelMobNet: End-to-end relative camera pose estimation using a robust two-stage training
Authors: Rajendran, Praveen Kumar; Mishra, Sumit; Vecchietti, Luiz Felipe; Har, Dongsoo
Publication Year: 2022
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Description: Relative camera pose estimation, i.e. estimating the translation and rotation vectors from a pair of images taken at different locations, is an important part of systems in augmented reality and robotics. In this paper, we present an end-to-end relative camera pose estimation network using a siamese architecture that is independent of camera parameters. The network is trained using the Cambridge Landmarks data with four individual scene datasets and a dataset combining the four scenes. To improve generalization, we propose a novel two-stage training that removes the need for a hyperparameter to balance the translation and rotation loss scales. We compare the proposed method with one-stage training CNN-based methods such as RPNet and RCPNet and demonstrate that the proposed model improves translation vector estimation by 16.11%, 28.88%, and 52.27% on the Kings College, Old Hospital, and St Marys Church scenes, respectively. To demonstrate texture invariance, we investigate, as ablation studies, how well the proposed method generalizes when the datasets are augmented to different scene styles using generative adversarial networks. We also present a qualitative assessment of the epipolar lines of our network predictions and ground truth poses.
Comment: 15 pages, 7 figures, 2 tables - RelMobNet revised draft
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2202.12838
Accession Number: edsarx.2202.12838
Database: arXiv
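
Illustrative note: the description defines relative camera pose as the rotation and translation between two views. The sketch below shows one common way such a relative pose can be derived from two absolute poses. It is not taken from the paper: it assumes a world-to-camera convention (x_cam = R x_world + t) and uses rotation matrices rather than the rotation vectors the network predicts. The helper names `rotation_z` and `relative_pose` are hypothetical.

```python
import numpy as np

def rotation_z(angle_rad):
    """Rotation matrix about the z-axis (used only to build example poses)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def relative_pose(R1, t1, R2, t2):
    """Relative pose mapping camera-1 coordinates to camera-2 coordinates.

    Assumes world-to-camera poses: x_cam = R @ x_world + t.
    """
    R_rel = R2 @ R1.T
    t_rel = t2 - R_rel @ t1
    return R_rel, t_rel

# Two made-up absolute camera poses (assumed values, for illustration only).
R1, t1 = rotation_z(0.3), np.array([1.0, 0.0, 2.0])
R2, t2 = rotation_z(-0.5), np.array([0.5, 1.5, 2.0])
R_rel, t_rel = relative_pose(R1, t1, R2, t2)

# Sanity check: a world point seen in camera 1 maps to its camera-2 coordinates.
x_world = np.array([2.0, -1.0, 4.0])
x_cam1 = R1 @ x_world + t1
x_cam2 = R2 @ x_world + t2
consistent = np.allclose(R_rel @ x_cam1 + t_rel, x_cam2)
```

Under a different pose convention (for example camera-to-world), the composition order changes, so the formulas above should be adapted to the dataset's own definition.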