Title
HL-Net: Heterophily Learning Network for Scene Graph Generation
Authors
Abstract
Scene graph generation (SGG) aims to detect objects and predict their pairwise relationships within an image. Current SGG methods typically use graph neural networks (GNNs) to capture context information between objects/relationships. Despite their effectiveness, however, these methods assume that scene graphs are homophilous and ignore heterophily. Accordingly, in this paper, we propose a novel Heterophily Learning Network (HL-Net) to comprehensively explore the homophily and heterophily between objects/relationships in scene graphs. More specifically, HL-Net comprises the following: 1) an adaptive reweighting transformer module, which adaptively integrates the information from different layers to exploit both the heterophily and homophily in objects; 2) a relationship feature propagation module that efficiently explores the connections between relationships by considering heterophily in order to refine the relationship representation; 3) a heterophily-aware message-passing scheme to further distinguish the heterophily and homophily between objects/relationships, thereby facilitating improved message passing in graphs. We conducted extensive experiments on two public datasets: Visual Genome (VG) and Open Images (OI). The experimental results demonstrate the superiority of our proposed HL-Net over existing state-of-the-art approaches. In more detail, HL-Net outperforms the second-best competitors by 2.1$\%$ on the VG dataset for scene graph classification and 1.2$\%$ on the OI dataset for the final score. Code is available at https://github.com/siml3/HL-Net.
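To make the homophily/heterophily distinction concrete, the sketch below computes the edge homophily ratio of a small labeled graph, i.e. the fraction of edges whose two endpoints share a label. This is a commonly used graph statistic, not a component of HL-Net; the function name, the toy labels, and the edge list are all hypothetical and chosen only for illustration.

```python
from typing import Hashable, Iterable, Mapping, Tuple


def edge_homophily_ratio(
    edges: Iterable[Tuple[Hashable, Hashable]],
    labels: Mapping[Hashable, Hashable],
) -> float:
    """Return the fraction of edges that connect two nodes with the same label.

    A value near 1 indicates a homophilous graph; a value near 0 indicates
    a heterophilous one.
    """
    edge_list = list(edges)
    if not edge_list:
        raise ValueError("graph has no edges")
    same_label = sum(1 for u, v in edge_list if labels[u] == labels[v])
    return same_label / len(edge_list)


# Hypothetical toy scene: node ids mapped to made-up object classes.
labels = {0: "person", 1: "person", 2: "horse", 3: "hat"}
# Hypothetical pairwise relationships between those objects.
edges = [(0, 1), (0, 2), (0, 3), (1, 3)]

ratio = edge_homophily_ratio(edges, labels)
print(f"edge homophily ratio: {ratio:.2f}")
```

In this toy graph only one of the four edges links objects of the same class, so the graph is mostly heterophilous, which is the kind of structure the abstract argues scene graphs tend to exhibit.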