Paper Title
A Transformer-based Generative Model for De Novo Molecular Design
Authors
Abstract
In drug discovery, molecular design aims to identify novel compounds within a chemical space whose number of potential drug-like molecules is estimated to be on the order of 10^60 to 10^100. Because exhaustively searching a space of this size is computationally intractable, deep learning has drawn considerable attention as a way to generate previously unseen molecules. Since we seek compounds that act on specific target proteins, we propose a Transformer-based deep model for de novo target-specific molecular design. The proposed method can generate both drug-like compounds (without a specified target) and target-specific compounds. The latter are generated by enforcing different keys and values in the multi-head attention for each target, which makes the generation of SMILES strings conditional on the specified target. Experimental results demonstrate that our method can generate both valid drug-like compounds and target-specific compounds. Moreover, compounds sampled from the conditional model largely occupy the chemical space of the real target-specific molecules and also include a significant fraction of novel compounds.
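The conditioning idea in the abstract, a separate set of attention keys and values for each target, can be illustrated with a small sketch. This is not the authors' implementation: the per-target memory table, the number of memory slots, the random initialisation (standing in for learned parameters), and the names `make_target_memory` and `conditional_attention` are all assumptions made for illustration. It only shows how the same decoder query can yield different attention outputs depending on which target's keys and values are used.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector (one head)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def make_target_memory(targets, n_heads, n_slots, head_dim, seed=0):
    """Hypothetical per-target key/value tables, one pair per attention head.

    Random values stand in for parameters that would be learned in training.
    """
    rng = random.Random(seed)

    def matrix():
        return [[rng.gauss(0.0, 1.0) for _ in range(head_dim)]
                for _ in range(n_slots)]

    return {
        target: [{"keys": matrix(), "values": matrix()} for _ in range(n_heads)]
        for target in targets
    }


def conditional_attention(query_heads, target, memory):
    """Attend with the keys/values of `target` and concatenate the heads."""
    output = []
    for query, head in zip(query_heads, memory[target]):
        output.extend(attend(query, head["keys"], head["values"]))
    return output


# Usage: the same queries produce different outputs for different targets.
memory = make_target_memory(["target_A", "target_B"],
                            n_heads=2, n_slots=3, head_dim=4)
query_heads = [[0.1, -0.2, 0.3, 0.4], [0.5, 0.0, -0.1, 0.2]]
out_a = conditional_attention(query_heads, "target_A", memory)
out_b = conditional_attention(query_heads, "target_B", memory)
print(len(out_a), out_a != out_b)
```

In a full model, the attention output would feed the rest of the decoder layer and ultimately the distribution over the next SMILES token, so switching the target's keys and values shifts which strings are likely to be sampled.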