Paper Title
NFormer: Robust Person Re-identification with Neighbor Transformer
Paper Authors
Paper Abstract
Person re-identification aims to retrieve persons in highly varying settings across different cameras and scenarios, in which robust and discriminative representation learning is crucial. Most research considers learning representations from single images, ignoring any potential interactions between them. However, due to the high intra-identity variations, ignoring such interactions typically leads to outlier features. To tackle this issue, we propose a Neighbor Transformer Network, or NFormer, which explicitly models interactions across all input images, thus suppressing outlier features and leading to more robust representations overall. As modelling interactions between an enormous number of images is a massive task with many distractors, NFormer introduces two novel modules, the Landmark Agent Attention and the Reciprocal Neighbor Softmax. Specifically, the Landmark Agent Attention efficiently models the relation map between images by a low-rank factorization with a few landmarks in feature space. Moreover, the Reciprocal Neighbor Softmax achieves sparse attention to relevant neighbors only, rather than to all of them, which alleviates interference from irrelevant representations and further relieves the computational burden. In experiments on four large-scale datasets, NFormer achieves a new state-of-the-art. The code is released at \url{https://github.com/haochenheheda/NFormer}.
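
To make the two modules concrete, below is a minimal PyTorch sketch of the ideas the abstract describes: a low-rank relation map built from a few sampled landmarks, followed by a softmax restricted to mutual top-k neighbors. The function names, the random landmark sampling, the omission of learned query/key projections, and the value of k are illustrative assumptions, not the implementation from the NFormer repository.

```python
import torch
import torch.nn.functional as F

def landmark_agent_attention(x, num_landmarks=5):
    """Low-rank relation map via landmark agents (illustrative sketch).

    x: (N, d) feature vectors, one per input image.
    Returns an (N, N) relation map of rank <= num_landmarks.
    """
    n, d = x.shape
    # Sample a few landmarks from the features themselves (an assumption;
    # the paper's landmark selection may differ).
    idx = torch.randperm(n)[:num_landmarks]
    landmarks = x[idx]                                 # (l, d)
    # Similarities to landmarks replace full d-dim query/key dot products;
    # learned query/key projections are omitted in this sketch.
    q = x @ landmarks.t()                              # (N, l)
    k = x @ landmarks.t()                              # (N, l)
    # Factorized relation map: O(N*l*d + N^2*l) instead of O(N^2*d).
    relation = (q @ k.t()) / (num_landmarks ** 0.5)    # (N, N)
    return relation

def reciprocal_neighbor_softmax(relation, k=20):
    """Softmax restricted to mutual top-k neighbors (illustrative sketch).

    relation: (N, N) relation map, e.g. from landmark_agent_attention.
    """
    # Binary top-k neighbor mask along each row.
    topk = relation.topk(k, dim=-1).indices            # (N, k)
    mask = torch.zeros_like(relation).scatter_(-1, topk, 1.0)
    # Reciprocal: keep (i, j) only if i is a top-k neighbor of j AND vice versa.
    mask = mask * mask.t()
    # Softmax over the surviving entries only; everything else gets -inf.
    masked = relation.masked_fill(mask == 0, float('-inf'))
    attn = F.softmax(masked, dim=-1)
    # A row with no reciprocal neighbors would be all NaN; zero it out.
    return attn.nan_to_num(0.0)

if __name__ == "__main__":
    feats = torch.randn(128, 256)               # 128 images, 256-d features
    rel = landmark_agent_attention(feats)       # (128, 128) relation map
    attn = reciprocal_neighbor_softmax(rel, k=10)
    refined = attn @ feats                      # neighbor-aggregated features
    print(refined.shape)                        # torch.Size([128, 256])
```

The sketch shows the complexity argument of the abstract: the landmark factorization avoids computing N x N similarities in the full d-dimensional space, and the reciprocal mask keeps each representation from being refined by irrelevant neighbors.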