Paper Title
Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection
Paper Authors
Paper Abstract
How to effectively fuse cross-modal information is the key problem for RGB-D salient object detection. Early fusion and result fusion schemes fuse RGB and depth information at the input and output stages, respectively, and hence suffer from distribution gaps or information loss. Many models instead adopt a feature fusion strategy, but are limited by low-order point-to-point fusion methods. In this paper, we propose a novel mutual attention model that fuses attention and contexts from different modalities. We use the non-local attention of one modality to propagate long-range contextual dependencies for the other modality, thus leveraging complementary attention cues to perform high-order and trilinear cross-modal interaction. We also propose to induce contrast inference from the mutual attention, obtaining a unified model. Considering that low-quality depth data may degrade model performance, we further propose a selective attention mechanism to reweight the added depth cues. We embed the proposed modules in a two-stream CNN for RGB-D SOD. Experimental results demonstrate the effectiveness of our proposed model. Moreover, we construct a new, challenging, large-scale RGB-D SOD dataset of high quality, which can promote both the training and evaluation of deep models.
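The core idea in the abstract — using the non-local (self-similarity) attention computed in one modality to propagate long-range context in the other — can be sketched as follows. This is a minimal illustrative NumPy sketch, not the paper's actual implementation: function names, shapes, and the residual form are assumptions, and the learned projections of a real non-local block are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mutual_attention(rgb, depth):
    """Hypothetical sketch of cross-modal mutual attention.

    rgb, depth: (n, c) arrays of flattened spatial features, one row
    per spatial position. The affinity (attention) map is computed
    within one modality and applied to the values of the other,
    so each stream is enhanced with complementary attention cues.
    """
    n, c = rgb.shape
    # Non-local affinity computed inside each stream (scaled dot product).
    attn_from_depth = softmax(depth @ depth.T / np.sqrt(c), axis=-1)
    attn_from_rgb = softmax(rgb @ rgb.T / np.sqrt(c), axis=-1)
    # Swap the attention maps across modalities: depth-derived attention
    # aggregates RGB context, and vice versa (residual connection assumed).
    rgb_out = rgb + attn_from_depth @ rgb
    depth_out = depth + attn_from_rgb @ depth
    return rgb_out, depth_out

# Toy usage with random features for 16 positions, 8 channels.
rng = np.random.default_rng(0)
r_out, d_out = mutual_attention(rng.standard_normal((16, 8)),
                                rng.standard_normal((16, 8)))
```

Because the attention map of one modality multiplies the features of the other, each output involves three feature tensors at once, which is the "trilinear" interaction the abstract refers to; the selective-attention reweighting of depth cues would sit on top of this and is not shown.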