Title
Dual-path Self-Attention RNN for Real-Time Speech Enhancement
Authors
Abstract
We propose a dual-path self-attention recurrent neural network (DP-SARNN) for time-domain speech enhancement. We improve the dual-path RNN (DP-RNN) by augmenting its inter-chunk and intra-chunk RNNs with a recently proposed efficient attention mechanism. The combination of inter-chunk and intra-chunk attention makes attention effective over long sequences of speech frames. DP-SARNN outperforms a baseline DP-RNN while using a frame shift four times larger than that of DP-RNN, which substantially reduces computation time per utterance. Building on this, we develop a real-time DP-SARNN by using long short-term memory (LSTM) RNNs and causal attention in the inter-chunk SARNN. DP-SARNN significantly outperforms existing approaches to speech enhancement, and on average takes 7.9 ms of CPU time to process a 32 ms signal chunk.
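The dual-path idea described in the abstract can be illustrated with a minimal sketch: a long frame sequence is segmented into overlapping chunks, an intra-chunk pass models local structure within each chunk, and an inter-chunk pass models global structure across chunks. The helper names (`segment`, `dual_path_pass`) and the identity placeholder functions are illustrative assumptions, not the paper's actual RNN/attention modules.

```python
import numpy as np

def segment(x, chunk_len, hop):
    """Split a [T, F] feature sequence into overlapping chunks [N, chunk_len, F].

    Zero-pads the end so the last chunk is full. Names and padding scheme
    are a simplification of the dual-path segmentation, not the paper's code.
    """
    T, F = x.shape
    n = max(0, int(np.ceil((T - chunk_len) / hop))) + 1  # number of chunks
    pad = (n - 1) * hop + chunk_len - T
    x = np.pad(x, ((0, pad), (0, 0)))
    return np.stack([x[i * hop : i * hop + chunk_len] for i in range(n)])

def dual_path_pass(chunks, intra_fn, inter_fn):
    """One dual-path block over chunks of shape [N, L, F].

    intra_fn processes the L frames inside each chunk (local path);
    inter_fn processes, for each within-chunk position, the sequence of
    N chunks (global path). In DP-SARNN these would be RNNs with
    attention; here they are arbitrary callables.
    """
    N, L, F = chunks.shape
    intra = np.stack([intra_fn(c) for c in chunks])                      # [N, L, F]
    inter = np.stack([inter_fn(intra[:, j]) for j in range(L)], axis=1)  # [N, L, F]
    return inter

# Toy usage: 10 frames of 2-d features, chunks of 4 frames with hop 2.
x = np.arange(20.0).reshape(10, 2)
chunks = segment(x, chunk_len=4, hop=2)          # shape (4, 4, 2)
out = dual_path_pass(chunks, lambda c: c, lambda s: s)  # identity paths
```

Because intra- and inter-chunk passes each see sequences of roughly sqrt(T) length instead of T, both the RNNs and the attention operate over much shorter sequences, which is what makes attention tractable for long utterances.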