Paper Title

AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB

Authors

Zhangyang Gao, Cheng Tan, Stan Z. Li

Abstract

Although DeepMind has tentatively solved protein folding, its inverse problem, protein design, which predicts protein sequences from their 3D structures, still faces significant challenges. In particular, the lack of a large-scale standardized benchmark and poor accuracy hinder research progress. To standardize comparisons and draw more research interest, we use AlphaFold DB, one of the world's largest protein structure databases, to establish a new graph-based benchmark, AlphaDesign. Based on AlphaDesign, we propose a new method called ADesign that improves accuracy by introducing protein angles as new features, using a simplified graph transformer encoder (SGT), and proposing a confidence-aware protein decoder (CPD). SGT and CPD also improve model efficiency by simplifying the training and testing procedures. Experiments show that ADesign significantly outperforms previous graph models: for example, average accuracy improves by 8%, and inference is more than 40 times faster than before.
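The abstract mentions protein angles as input features but does not say which angles are used or how they are encoded. As a rough illustration only, the sketch below computes a dihedral (torsion) angle from four 3D points and encodes it as a (sin, cos) pair, which is one common way to feed angles to a network. The function names `dihedral` and `angle_features` are hypothetical and are not taken from the paper or its code.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle in radians, in (-pi, pi], defined by four 3D points.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Normalize the central bond so the projections below are correct.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.atan2(y, x)

def angle_features(angle):
    """Encode an angle as (sin, cos) to avoid the jump at +/- pi (assumed encoding)."""
    return (math.sin(angle), math.cos(angle))

# Usage: a planar "cis" arrangement gives 0, a "trans" arrangement gives +/- pi.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

For a protein backbone, the four points would typically be consecutive backbone atoms (for example N, CA, C, and the next residue's N), but that choice is an assumption here and not something the abstract states.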
