Paper Title
Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints
Paper Authors
Paper Abstract
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry (VO) and simultaneous localization and mapping (SLAM), where classic methods consisting of hand-crafted features and sampling-based outlier rejection have been the dominant choice for over a decade. Although multiple works propose replacing these modules with learning-based counterparts, most have not yet matched conventional methods in accuracy, robustness, and generalizability. In this paper, we design an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching, and outlier rejection, while directly optimizing for a geometric pose objective. We show both quantitatively and qualitatively that pose estimation performance on par with the classic pipeline can be achieved. Moreover, we show that end-to-end training significantly improves the key components of the pipeline, leading to better generalization to unseen datasets than existing learning-based methods.
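
As an illustration of the kind of differentiable geometric objective such a pipeline can be trained against, the sketch below implements a weighted eight-point essential-matrix solver in PyTorch. This is a minimal sketch under stated assumptions, not the paper's actual implementation: the function name weighted_eight_point, the eight-point formulation, and the assumption that matched points arrive in normalized camera coordinates with learned per-match inlier weights are all illustrative choices.

import torch


def weighted_eight_point(x0: torch.Tensor, x1: torch.Tensor,
                         w: torch.Tensor) -> torch.Tensor:
    """Differentiable weighted eight-point essential-matrix estimate (sketch).

    x0, x1: (N, 2) matched keypoints in normalized camera coordinates.
    w:      (N,)  non-negative per-match weights, e.g. produced by a learned
            outlier-rejection module (hypothetical interface).
    Returns a 3x3 essential matrix estimate.
    """
    ones = torch.ones(x0.shape[0], 1, dtype=x0.dtype, device=x0.device)
    p0 = torch.cat([x0, ones], dim=1)  # (N, 3) homogeneous points, frame 0
    p1 = torch.cat([x1, ones], dim=1)  # (N, 3) homogeneous points, frame 1
    # Each row encodes the epipolar constraint p1^T E p0 = 0 for one match.
    A = (p1.unsqueeze(2) * p0.unsqueeze(1)).reshape(-1, 9)  # (N, 9)
    A = w.unsqueeze(1) * A  # down-weight likely outliers
    # vec(E) is the right singular vector with the smallest singular value.
    _, _, Vt = torch.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Project onto the essential manifold: two equal singular values, one zero.
    U, _, Vt = torch.linalg.svd(E)
    S = torch.tensor([1.0, 1.0, 0.0], dtype=E.dtype, device=E.device)
    return U @ torch.diag(S) @ Vt

Because every step is differentiable, a pose or epipolar loss on the returned essential matrix can be backpropagated through the weights w into upstream matching, description, and detection modules, which is the sense in which the pipeline can be optimized end to end for a geometric pose objective.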