Context-driven Visual Object Recognition based on Knowledge Graphs

Paper Title

Context-driven Visual Object Recognition based on Knowledge Graphs

Authors

Monka, Sebastian; Halilaj, Lavdim; Rettinger, Achim

Abstract

Current deep learning methods for object recognition are purely data-driven and require a large number of training samples to achieve good results. Because they depend solely on image data, these methods tend to fail when confronted with new environments in which even small deviations occur. Human perception, however, has proven to be significantly more robust to such distribution shifts. It is assumed that the human ability to deal with unknown scenarios is based on extensive incorporation of contextual knowledge. Context can be based either on object co-occurrences in a scene or on memory of experience. In accordance with the human visual cortex, which uses context to form different object representations for a seen image, we propose an approach that enhances deep learning methods by using external contextual knowledge encoded in a knowledge graph. To this end, we extract different contextual views from a generic knowledge graph, transform the views into vector space, and infuse them into a DNN. We conduct a series of experiments to investigate the impact of different contextual views on the learned object representations for the same image dataset. The experimental results provide evidence that the contextual views influence the image representations in the DNN differently and therefore lead to different predictions for the same images. We also show that context helps to strengthen the robustness of object recognition models for out-of-distribution images, which usually occur in transfer learning tasks or real-world scenarios.
