Paper Title
Biolink Model: A Universal Schema for Knowledge Graphs in Clinical, Biomedical, and Translational Science
Paper Authors
Abstract
Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness between core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs have left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates), representing biomedical entities such as gene, disease, chemical, anatomical structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
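The idea of hierarchical categories combined with predicates that constrain how entities may relate can be illustrated with a minimal Python sketch. It is not the Biolink Model implementation or its API: the category hierarchy, the two predicates with their domain and range, the helper names (`ancestors`, `is_a`, `is_valid`), and the `EX:` identifiers are all simplified assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Toy category hierarchy (child -> parent). Illustrative assumption only;
# this is a simplified stand-in, not the actual Biolink Model hierarchy.
PARENT = {
    "NamedThing": None,
    "BiologicalEntity": "NamedThing",
    "Gene": "BiologicalEntity",
    "Disease": "BiologicalEntity",
    "ChemicalEntity": "NamedThing",
}

# Toy predicates mapped to (domain, range) categories. Also an assumption.
PREDICATES = {
    "related_to": ("NamedThing", "NamedThing"),
    "gene_associated_with_condition": ("Gene", "Disease"),
}


def ancestors(category):
    """Return the category followed by its ancestors, nearest first."""
    chain = []
    while category is not None:
        chain.append(category)
        category = PARENT.get(category)
    return chain


def is_a(category, expected):
    """True if `category` equals `expected` or descends from it."""
    return expected in ancestors(category)


@dataclass(frozen=True)
class Node:
    id: str        # hypothetical identifier, e.g. "EX:gene1"
    category: str  # one of the keys of PARENT


@dataclass(frozen=True)
class Edge:
    subject: Node
    predicate: str
    object: Node


def is_valid(edge):
    """Check an edge against the domain and range of its predicate."""
    if edge.predicate not in PREDICATES:
        return False
    domain, range_ = PREDICATES[edge.predicate]
    return is_a(edge.subject.category, domain) and is_a(edge.object.category, range_)


gene = Node("EX:gene1", "Gene")
disease = Node("EX:disease1", "Disease")
print(is_valid(Edge(gene, "gene_associated_with_condition", disease)))
print(is_valid(Edge(disease, "gene_associated_with_condition", gene)))
```

In this sketch the first edge passes the check and the reversed edge fails it, because the subject and object categories no longer match the predicate's domain and range. A shared schema lets independently built KGs apply the same kind of check before their data are merged.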