Paper Title
Structured Visual Search via Composition-aware Learning
Paper Authors
Paper Abstract
This paper studies visual search using structured queries. The structure takes the form of a 2D composition that encodes the position and category of each object. Transforming the positions and categories of the objects yields a continuous-valued relationship between visual compositions, which carries highly beneficial information yet has not been leveraged by previous techniques. To that end, our goal in this work is to leverage these continuous relationships through the notion of symmetry in equivariance. Our model's output is trained to change symmetrically with respect to transformations of the input, leading to a sensitive feature space. This results in a highly efficient search technique, as our approach learns from less data using a smaller feature space. Experiments on two large-scale benchmarks, MS-COCO and HICO-DET, demonstrate that our approach yields a considerable performance gain over competing techniques.
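The idea of a continuous-valued relationship between compositions, and of features that change along with it, can be illustrated with a minimal sketch. This is not the paper's implementation: the box format, the mean-IoU similarity, the cosine feature similarity, the squared-error objective, and all function names (`composition_similarity`, `translate`, `equivariance_loss`) are assumptions chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Assumed representation: a composition is a list of (category, box),
# with box = (x1, y1, x2, y2) in normalized image coordinates.
Box = Tuple[float, float, float, float]
Composition = List[Tuple[str, Box]]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def composition_similarity(p: Composition, q: Composition) -> float:
    """Assumed continuous relation: mean IoU over index-aligned objects.

    An object pair whose categories differ contributes 0.
    """
    if not p or len(p) != len(q):
        raise ValueError("compositions must be non-empty and of equal length")
    total = sum(box_iou(bp, bq) if cp == cq else 0.0
                for (cp, bp), (cq, bq) in zip(p, q))
    return total / len(p)


def translate(comp: Composition, dx: float, dy: float) -> Composition:
    """Position transformation: shift every box by (dx, dy)."""
    return [(c, (x1 + dx, y1 + dy, x2 + dx, y2 + dy))
            for c, (x1, y1, x2, y2) in comp]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def equivariance_loss(feat_p: Sequence[float], feat_q: Sequence[float],
                      target_sim: float) -> float:
    """Squared gap between feature similarity and composition similarity."""
    return (cosine(feat_p, feat_q) - target_sim) ** 2


query: Composition = [("person", (0.0, 0.0, 0.5, 0.5))]
shifted = translate(query, 0.25, 0.0)
sim = composition_similarity(query, shifted)  # strictly between 0 and 1
```

Under these assumptions, a small shift of an object gives a similarity close to 1 and a category swap gives 0, so a feature extractor trained against such a target would have to respond to the input transformation in a graded way instead of treating every changed composition as an equally distant negative.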