Effective Distributed Representations for Academic Expert Search

Paper Title

Effective Distributed Representations for Academic Expert Search

Authors

Mark Berger, Jakub Zavrel, Paul Groth

Abstract

Expert search aims to find and rank experts based on a user's query. In academia, retrieving experts is an efficient way to navigate a large body of academic knowledge. Here, we study how different distributed representations of academic papers (i.e., embeddings) affect academic expert retrieval. We use the Microsoft Academic Graph dataset and experiment with different configurations of a document-centric voting model for retrieval. In particular, we explore the impact of contextualized embeddings on search performance. We also present results for paper embeddings that incorporate citation information through retrofitting. Additionally, we experiment with different techniques for assigning author weights based on author order. We observe that contextual embeddings produced by a transformer model trained for sentence similarity tasks yield the most effective paper representations for document-centric expert retrieval. However, retrofitting the paper embeddings and using elaborate author contribution weighting strategies did not improve retrieval performance.
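
As a rough illustration of the document-centric voting idea mentioned in the abstract, the sketch below ranks authors by letting the papers most similar to a query "vote" for their authors. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy two-dimensional vectors stand in for real paper embeddings, the function names `cosine` and `rank_experts` are hypothetical, votes are aggregated by summing similarity scores, and the reciprocal-position author weight is just one possible weighting by author order.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_experts(query_vec, papers, top_k=2, author_weight=None):
    """Rank authors by summed votes from the top_k most similar papers.

    papers: list of (embedding, authors) pairs, authors in byline order.
    author_weight: function (position, n_authors) -> float; uniform if None.
    """
    if author_weight is None:
        author_weight = lambda position, n_authors: 1.0  # every author gets a full vote

    # Step 1: retrieve the top_k papers by similarity to the query.
    scored = sorted(
        ((cosine(query_vec, emb), authors) for emb, authors in papers),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_k]

    # Step 2: each retrieved paper votes for its authors, scaled by the author weight.
    votes = defaultdict(float)
    for score, authors in scored:
        for position, author in enumerate(authors):
            votes[author] += score * author_weight(position, len(authors))

    return sorted(votes.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): three papers with 2-d "embeddings".
papers = [
    ([1.0, 0.0], ["alice", "bob"]),
    ([0.9, 0.1], ["alice"]),
    ([0.0, 1.0], ["carol"]),
]
query = [1.0, 0.0]

uniform_ranking = rank_experts(query, papers, top_k=2)

# One possible order-based weighting (an assumption): weight 1/(position + 1).
reciprocal = lambda position, n_authors: 1.0 / (position + 1)
weighted_ranking = rank_experts(query, papers, top_k=2, author_weight=reciprocal)
```

In this toy setup only the two papers closest to the query cast votes, so "carol" never appears in either ranking; swapping `author_weight` changes how much credit later authors on a byline receive without touching the retrieval step.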
