Matching Multiple Perspectives for Efficient Representation Learning

Paper Title

Matching Multiple Perspectives for Efficient Representation Learning

Authors

Pantazis, Omiros; Salvaris, Mathew

Abstract

Representation learning approaches typically rely on images of objects captured from a single perspective and transformed using affine transformations. Additionally, self-supervised learning, a successful paradigm of representation learning, relies on instance discrimination and self-augmentations, which cannot always bridge the gap between observations of the same object viewed from different perspectives. Viewing an object from multiple perspectives aids holistic understanding of the object, which is particularly important when data annotations are limited. In this paper, we present an approach that combines self-supervised learning with a multi-perspective matching technique and demonstrate its effectiveness in learning higher-quality representations from data captured by a robotic vacuum with an embedded camera. We show that the availability of multiple views of the same object, combined with a variety of self-supervised pretraining algorithms, can lead to improved object classification performance without extra labels.
