Heterogeneous Semantic Transfer for Multi-label Recognition with Partial Labels

Paper Title

Heterogeneous Semantic Transfer for Multi-label Recognition with Partial Labels

Authors

Tianshui Chen, Tao Pu, Lingbo Liu, Yukai Shi, Zhijing Yang, Liang Lin

Abstract

Multi-label image recognition with partial labels (MLR-PL), in which some labels are known while others are unknown for each image, may greatly reduce the cost of annotation and thus facilitate large-scale MLR. We find that strong semantic correlations exist within each image and across different images, and these correlations can help transfer the knowledge possessed by the known labels to retrieve the unknown labels and thus improve the performance of the MLR-PL task (see Figure 1). In this work, we propose a novel heterogeneous semantic transfer (HST) framework that consists of two complementary transfer modules that explore both within-image and cross-image semantic correlations to transfer the knowledge possessed by known labels to generate pseudo labels for the unknown labels. Specifically, an intra-image semantic transfer (IST) module learns an image-specific label co-occurrence matrix for each image and maps the known labels to complement the unknown labels based on these matrices. Additionally, a cross-image transfer (CST) module learns category-specific feature-prototype similarities and then helps complement the unknown labels that have high degrees of similarity with the corresponding prototypes. Finally, both the known and generated pseudo labels are used to train MLR models. Extensive experiments conducted on the Microsoft COCO, Visual Genome, and Pascal VOC 2007 datasets show that the proposed HST framework achieves superior performance to that of current state-of-the-art algorithms. Specifically, it obtains mean average precision (mAP) improvements of 1.4%, 3.3%, and 0.4% on the three datasets over the results of the best-performing previously developed algorithm.
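
The two transfer ideas described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: in the paper both the image-specific co-occurrence matrix and the feature-prototype similarity are learned, whereas here a hand-written matrix and plain cosine similarity stand in for them. The function names, the label encoding (1 = known positive, -1 = known negative, 0 = unknown), and both thresholds are assumptions made for illustration only.

```python
import math

UNKNOWN, NEGATIVE, POSITIVE = 0, -1, 1


def intra_image_pseudo_labels(labels, cooccurrence, threshold=0.8):
    """IST-style step: fill unknown labels from known positives.

    cooccurrence[i][j] approximates P(label j present | label i present).
    In the paper this matrix is learned per image; here it is given.
    """
    filled = list(labels)
    for j, value in enumerate(labels):
        if value != UNKNOWN:
            continue
        # Strongest evidence for label j among the known positive labels.
        support = max(
            (cooccurrence[i][j] for i, v in enumerate(labels) if v == POSITIVE),
            default=0.0,
        )
        if support >= threshold:
            filled[j] = POSITIVE
    return filled


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cross_image_pseudo_labels(labels, features, prototypes, threshold=0.9):
    """CST-style step: fill unknown labels whose category-specific feature
    is close to that category's prototype.

    features[j] is this image's feature vector for category j;
    prototypes[j] is a representative vector for category j built from
    other images (assumed to be precomputed).
    """
    filled = list(labels)
    for j, value in enumerate(labels):
        if value == UNKNOWN and cosine(features[j], prototypes[j]) >= threshold:
            filled[j] = POSITIVE
    return filled


# Toy example with three categories; only category 0 is annotated.
labels = [POSITIVE, UNKNOWN, UNKNOWN]
cooccurrence = [
    [1.0, 0.85, 0.1],
    [0.6, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
prototypes = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]]

ist_labels = intra_image_pseudo_labels(labels, cooccurrence)          # [1, 1, 0]
cst_labels = cross_image_pseudo_labels(labels, features, prototypes)  # [1, 0, 1]
```

In this toy case the co-occurrence step recovers category 1 and the prototype step recovers category 2, which mirrors the abstract's point that the two modules are complementary. Known labels together with the generated pseudo labels would then be used to train the recognition model.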
