Paper Title

FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval

Authors

Dehong Gao, Linbo Jin, Ben Chen, Minghui Qiu, Peng Li, Yi Wei, Yi Hu, Hao Wang

Abstract

In this paper, we address text and image matching for cross-modal retrieval in the fashion industry. Unlike matching in the general domain, fashion matching must pay much more attention to the fine-grained information in fashion images and texts. Pioneering approaches detect regions of interest (RoIs) in images and use the RoI embeddings as image representations. In general, RoIs tend to represent "object-level" information in fashion images, while fashion texts tend to describe more detailed information, e.g., styles and attributes. RoIs are thus not fine-grained enough for fashion text and image matching. To this end, we propose FashionBERT, which uses patches as image features. With the pre-trained BERT model as the backbone network, FashionBERT learns high-level representations of texts and images. We also propose an adaptive loss to trade off the tasks in multitask learning during FashionBERT modeling. Two tasks (text and image matching, and cross-modal retrieval) are used to evaluate FashionBERT. On a public dataset, experiments demonstrate that FashionBERT achieves significant performance improvements over the baseline and state-of-the-art approaches. In practice, FashionBERT is applied in a concrete cross-modal retrieval application. We provide a detailed analysis of matching performance and inference efficiency.
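To make the patch idea concrete, here is a minimal sketch of cutting an image into a fixed grid of patches, as opposed to detecting RoIs. The abstract does not state the grid size, the patch ordering, or how patches are embedded, so the function name `split_into_patches`, the 2x2 grid, and the toy grayscale image are all illustrative assumptions, not the authors' implementation.

```python
from typing import List

# Assumption: a grayscale image stored as rows of pixel values (illustrative only).
Image = List[List[int]]


def split_into_patches(image: Image, rows: int, cols: int) -> List[Image]:
    """Cut an image into a rows x cols grid of equal-sized patches, in reading order."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    patch_h, patch_w = height // rows, width // cols
    patches = []
    for r in range(rows):
        for c in range(cols):
            # Take the rows belonging to grid row r, then slice out grid column c.
            band = image[r * patch_h:(r + 1) * patch_h]
            patches.append([row[c * patch_w:(c + 1) * patch_w] for row in band])
    return patches


# A toy 4x4 "image" split into a hypothetical 2x2 grid gives four 2x2 patches.
toy = [[y * 4 + x for x in range(4)] for y in range(4)]
patches = split_into_patches(toy, rows=2, cols=2)
```

Each patch covers a fixed region regardless of what it contains, so every part of the image contributes to the sequence of image features. In a model of this kind, each patch would then be turned into an embedding before being passed to the backbone network; that step is omitted here because the abstract gives no details.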
