Paper title
SepIt: Approaching a Single Channel Speech Separation Bound
Paper authors
Paper abstract
We present an upper bound for the single-channel speech separation task, based on an assumption about the nature of short speech segments. Using this bound, we show that although recent methods have made significant progress for small numbers of speakers, there is room for improvement with five and ten speakers. We then introduce a deep neural network, SepIt, that iteratively improves the estimates of the different speakers. At test time, SepIt runs a varying number of iterations per test sample, based on a mutual information criterion that arises from our analysis. In an extensive set of experiments, SepIt outperforms state-of-the-art neural networks for 2, 3, 5, and 10 speakers.