Paper Title

A New Algebraic Approach for String Reconstruction from Substring Compositions

Authors

Utkarsh Gupta, Hessam Mahdavifar

Abstract

We consider the problem of reconstructing a binary string from the multiset of its substring compositions, referred to as the substring composition multiset, a problem first introduced and studied by Acharya et al. We introduce a new algorithm for this problem that relies on the algebraic properties of an equivalent bivariate polynomial formulation. We then characterize specific algebraic conditions on the binary string to be reconstructed that guarantee the algorithm requires no backtracking during reconstruction, so that its time complexity is polynomially bounded. More specifically, when no backtracking occurs, our algorithm has a time complexity of $O(n^2)$, compared to $O(n^2\log(n))$ for the algorithm by Acharya et al., where $n$ is the length of the binary string. Furthermore, we show that larger sets of binary strings are uniquely reconstructable by the new algorithm without backtracking. This leads to codebooks of reconstruction codes that are larger, by a linear factor in size, than the previously known construction by Pattabiraman et al., while retaining $O(n^2)$ reconstruction complexity.
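To make the central object concrete, the following is a minimal sketch of how a substring composition multiset could be computed, assuming the composition of a substring is the pair (number of 0s, number of 1s). The function name `composition_multiset` and the pair ordering are illustrative choices, not taken from the paper, and the sketch covers only the forward direction (string to multiset), not the reconstruction algorithm itself.

```python
from collections import Counter


def composition_multiset(s: str) -> Counter:
    """Return the multiset of (zeros, ones) counts over all contiguous substrings of s.

    Illustrative helper (hypothetical name); assumes s contains only '0' and '1'.
    """
    n = len(s)
    # prefix_ones[i] = number of '1' characters in s[:i]
    prefix_ones = [0] * (n + 1)
    for i, ch in enumerate(s):
        prefix_ones[i + 1] = prefix_ones[i] + (ch == "1")

    multiset = Counter()
    # Enumerate all n(n+1)/2 substrings s[i:j]; each composition costs O(1)
    # thanks to the prefix sums.
    for i in range(n):
        for j in range(i + 1, n + 1):
            ones = prefix_ones[j] - prefix_ones[i]
            zeros = (j - i) - ones
            multiset[(zeros, ones)] += 1
    return multiset


if __name__ == "__main__":
    print(composition_multiset("101"))
    # A string and its reversal always share the same composition multiset,
    # so reconstruction can only ever be unique up to reversal.
    print(composition_multiset("110") == composition_multiset("011"))
```

Each composition discards the order of symbols inside a substring, which is why distinct strings can share a multiset and why reconstruction is nontrivial.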
