Paper Title
DNA-Based Storage: Models and Fundamental Limits
Paper Authors
Abstract
Due to its longevity and enormous information density, DNA is an attractive medium for archival storage. In this work, we study the fundamental limits and trade-offs of DNA-based storage systems by introducing a new channel model, which we call the noisy shuffling-sampling channel. Motivated by current technological constraints on DNA synthesis and sequencing, this model captures three key distinctive aspects of DNA storage systems: (1) the data is written onto many short DNA molecules; (2) the molecules are corrupted by noise during synthesis and sequencing; and (3) the data is read by randomly sampling from the DNA pool. We provide capacity results for this channel under specific noise and sampling assumptions and show that, in many scenarios, a simple index-based coding scheme is optimal.
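The three aspects of the channel and the idea of index-based coding can be illustrated with a small simulation. The following is a minimal sketch, not the paper's construction: it assumes uniform sampling with replacement and independent substitution errors with probability `sub_prob`, which may differ from the noise and sampling assumptions analyzed in the paper. All function names and parameters (`encode`, `channel`, `decode`, `chunk_len`, `index_len`) are hypothetical and chosen only for illustration.

```python
import random

ALPHABET = "ACGT"

def int_to_dna(value, length):
    """Write a non-negative integer as a fixed-length base-4 string over ACGT."""
    digits = []
    for _ in range(length):
        digits.append(ALPHABET[value % 4])
        value //= 4
    return "".join(reversed(digits))

def dna_to_int(s):
    """Inverse of int_to_dna."""
    value = 0
    for ch in s:
        value = value * 4 + ALPHABET.index(ch)
    return value

def encode(payload, chunk_len, index_len):
    """Split the payload into short chunks and prefix each with its position."""
    chunks = [payload[i:i + chunk_len] for i in range(0, len(payload), chunk_len)]
    assert len(chunks) <= 4 ** index_len, "index too short for this many molecules"
    return [int_to_dna(i, index_len) + c for i, c in enumerate(chunks)]

def channel(molecules, num_reads, sub_prob, rng):
    """Assumed channel: draw reads uniformly with replacement (order is lost),
    then flip each symbol to a different base with probability sub_prob."""
    reads = []
    for _ in range(num_reads):
        mol = rng.choice(molecules)
        noisy = "".join(
            rng.choice([a for a in ALPHABET if a != ch]) if rng.random() < sub_prob else ch
            for ch in mol
        )
        reads.append(noisy)
    return reads

def decode(reads, num_chunks, index_len):
    """Place each read by its index prefix; unseen positions stay None.
    No error correction is attempted, so a corrupted index misplaces a chunk."""
    chunks = [None] * num_chunks
    for r in reads:
        idx = dna_to_int(r[:index_len])
        if idx < num_chunks and chunks[idx] is None:
            chunks[idx] = r[index_len:]
    return chunks

if __name__ == "__main__":
    rng = random.Random(0)
    payload = "ACGTACGTAC"
    molecules = encode(payload, chunk_len=4, index_len=2)
    reads = channel(molecules, num_reads=20, sub_prob=0.0, rng=rng)
    print(decode(reads, num_chunks=len(molecules), index_len=2))
```

In this sketch the index costs `index_len` symbols per molecule, which is the overhead that index-based schemes pay to undo the shuffling. A practical scheme would also add error-correcting redundancy to cope with noise and with molecules that are never sampled.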