Paper title
SVSBI: Sequence-based virtual screening of biomolecular interactions
Paper authors
Paper abstract
Virtual screening (VS) is an essential technique for understanding biomolecular interactions, particularly in drug design and discovery. The best-performing VS models depend critically on three-dimensional (3D) structures, which are generally unavailable but can be obtained from molecular docking. However, current docking accuracy is relatively low, which makes the resulting VS models unreliable. We introduce sequence-based virtual screening (SVS) as a new generation of VS models for modeling biomolecular interactions. The SVS model uses advanced natural language processing (NLP) algorithms and optimizes deep $K$-embedding strategies to encode biomolecular interactions without invoking 3D structure-based docking. We demonstrate the state-of-the-art performance of SVS on four regression datasets involving protein-ligand binding, protein-protein binding, protein-nucleic acid binding, and ligand inhibition of protein-protein interactions, and on five classification datasets for protein-protein interactions in five biological species. SVS has the potential to dramatically change current practice in drug discovery and protein engineering.