Deciphering RNA Secondary Structure Prediction: A Probabilistic K-Rook Matching Perspective

Paper title

Deciphering RNA Secondary Structure Prediction: A Probabilistic K-Rook Matching Perspective

Paper authors

Tan, Cheng; Gao, Zhangyang; Cao, Hanqun; Chen, Xingran; Wang, Ge; Wu, Lirong; Xia, Jun; Zheng, Jiangbin; Li, Stan Z.

Abstract

The secondary structure of ribonucleic acid (RNA) is more stable and more accessible in the cell than its tertiary structure, making it essential for functional prediction. Although deep learning has shown promising results in this field, current methods suffer from poor generalization and high complexity. In this work, we reformulate RNA secondary structure prediction as a K-Rook problem, thereby simplifying the prediction process into probabilistic matching within a finite solution space. Building on this perspective, we introduce RFold, a simple yet effective method that learns to predict the best-matching K-Rook solution for a given sequence. RFold employs a bi-dimensional optimization strategy that decomposes the probabilistic matching problem into row-wise and column-wise components, reducing the matching complexity and simplifying the solving process while guaranteeing the validity of the output. Extensive experiments demonstrate that RFold achieves competitive performance and runs inference about eight times faster than state-of-the-art approaches. The code and a Colab demo are available at http://github.com/A4Bio/RFold.
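The row-wise and column-wise decomposition mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `row_col_argmax` and `is_valid_rook_placement` are hypothetical, the scores are random stand-ins for a learned pairing-score matrix, and real constraints such as allowed base-pair types or minimum loop length are left out. It only shows why keeping an entry that is the maximum of both its row and its column yields at most one pairing per row and per column, which is the rook-placement validity condition.

```python
import numpy as np


def row_col_argmax(scores: np.ndarray) -> np.ndarray:
    """Keep entry (i, j) only if it is the maximum of row i AND of column j."""
    n = scores.shape[0]
    row_hot = np.zeros(scores.shape, dtype=bool)
    col_hot = np.zeros(scores.shape, dtype=bool)
    # One-hot mask of the best column in every row.
    row_hot[np.arange(n), scores.argmax(axis=1)] = True
    # One-hot mask of the best row in every column.
    col_hot[scores.argmax(axis=0), np.arange(n)] = True
    # The intersection has at most one 1 per row and per column.
    return (row_hot & col_hot).astype(int)


def is_valid_rook_placement(pairs: np.ndarray) -> bool:
    """True if no row and no column contains more than one selected entry."""
    return bool((pairs.sum(axis=0) <= 1).all() and (pairs.sum(axis=1) <= 1).all())


# Assumed toy input: a symmetric score matrix for a length-6 sequence.
rng = np.random.default_rng(0)
raw = rng.random((6, 6))
scores = raw + raw.T                 # symmetric: pairing (i, j) equals (j, i)
np.fill_diagonal(scores, -np.inf)    # a base cannot pair with itself

pairs = row_col_argmax(scores)
print(pairs)
print("valid placement:", is_valid_rook_placement(pairs))
```

Because each of the two one-hot masks is computed independently along a single axis, the selection costs two argmax passes rather than a search over all matchings, and the output is valid by construction. Positions that are not the mutual best choice are simply left unpaired.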

Paper title