Title
On The Decoding Error Weight of One or Two Deletion Channels

Authors

Omer Sabary, Daniella Bar-Lev, Yotam Gershon, Alexander Yucovich, Eitan Yaakobi

Abstract
This paper tackles two problems that fall under the study of coding for insertions and deletions. These problems are motivated by several applications, among them reconstructing strands in DNA-based storage systems. Under this paradigm, a word is transmitted over some fixed number of identical independent channels, and the goal of the decoder is to output the transmitted word or some close approximation of it. The first part of the paper studies optimal decoding for a special case of the deletion channel, referred to as the $k$-deletion channel, which deletes exactly $k$ symbols of the transmitted word uniformly at random. In this part, the goal is to understand how an optimal decoder operates in order to minimize the expected normalized distance. A full characterization of an efficient optimal decoder for this setup, referred to as the maximum likelihood* (ML*) decoder, is given for a channel that deletes one or two symbols. The second part of this paper studies the deletion channel that deletes each symbol with some fixed probability $p$, while focusing on two instances of this channel. Since operating the maximum likelihood (ML) decoder in this case is computationally infeasible, we study a slightly degraded version of this decoder for two channels and analyze its expected normalized distance. We observe that the dominant error patterns are deletions in the same run or errors resulting from alternating sequences. Based on these observations, we derive lower bounds on the expected normalized distance of the degraded ML decoder for any transmitted $q$-ary sequence of length $n$ and any deletion probability $p$. We further show that as the word length approaches infinity and the channel's deletion probability $p$ approaches zero, these bounds converge to approximately $\frac{3q - 1}{q - 1} p^2$. These theoretical results are verified by corresponding simulations.
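The two channel models in the abstract are easy to state concretely. Below is a minimal Python sketch of both: the probabilistic deletion channel, which deletes each symbol independently with probability $p$, and the $k$-deletion channel, which deletes exactly $k$ symbols at positions chosen uniformly at random. The function names and the two-channel transmission demo are illustrative, not from the paper.

```python
import random


def deletion_channel(word, p, rng=random):
    """Probabilistic deletion channel: each symbol is deleted
    independently with probability p, so the output is a random
    subsequence of the input."""
    return [s for s in word if rng.random() >= p]


def k_deletion_channel(word, k, rng=random):
    """k-deletion channel: exactly k symbols are removed, with the
    k deleted positions chosen uniformly at random."""
    keep = sorted(rng.sample(range(len(word)), len(word) - k))
    return [word[i] for i in keep]


if __name__ == "__main__":
    # Transmit one q-ary word over two identical independent channels,
    # matching the two-instance setting studied in the paper's second part.
    q, n, p = 4, 20, 0.1
    word = [random.randrange(q) for _ in range(n)]
    outputs = [deletion_channel(word, p) for _ in range(2)]
    print(word, outputs)
```

A decoder in this setting receives only `outputs` and must estimate `word`; the paper characterizes the optimal such estimate for one and two deletions.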