Paper title
AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB
Paper authors
Paper abstract
While DeepMind has tentatively solved protein folding, its inverse problem, protein design (predicting protein sequences from their 3D structures), still faces significant challenges. In particular, the lack of a large-scale standardized benchmark and poor accuracy hinder research progress. To standardize comparisons and draw more research interest, we use AlphaFold DB, one of the world's largest protein structure databases, to establish a new graph-based benchmark, AlphaDesign. Based on AlphaDesign, we propose a new method called ADesign that improves accuracy by introducing protein angles as new features, using a simplified graph transformer encoder (SGT), and proposing a confidence-aware protein decoder (CPD). Meanwhile, SGT and CPD also improve model efficiency by simplifying the training and testing procedures. Experiments show that ADesign significantly outperforms previous graph models: for example, average accuracy improves by 8%, and inference is more than 40 times faster than before.