Paper title
Deep Multi-Frame MVDR Filtering for Binaural Noise Reduction
Paper authors
Paper abstract
To improve speech intelligibility and speech quality in noisy environments, binaural noise reduction algorithms for head-mounted assistive listening devices are of crucial importance. Several binaural noise reduction algorithms such as the well-known binaural minimum variance distortionless response (MVDR) beamformer have been proposed, which exploit spatial correlations of both the target speech and the noise components. Furthermore, for single-microphone scenarios, multi-frame algorithms such as the multi-frame MVDR (MFMVDR) filter have been proposed, which exploit temporal instead of spatial correlations. In this contribution, we propose a binaural extension of the MFMVDR filter, which exploits both spatial and temporal correlations. The binaural MFMVDR filters are embedded in an end-to-end deep learning framework, where the required parameters, i.e., the speech spatio-temporal correlation vectors as well as the (inverse) noise spatio-temporal covariance matrix, are estimated by temporal convolutional networks (TCNs) that are trained by minimizing the mean spectral absolute error loss function. Simulation results comprising measured binaural room impulse responses and diverse noise sources at signal-to-noise ratios from -5 dB to 20 dB demonstrate the advantage of utilizing the binaural MFMVDR filter structure over directly estimating the binaural multi-frame filter coefficients with TCNs.
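As an illustration of the filter structure named in the abstract, the following is a minimal sketch of the standard closed-form MVDR solution, w = R^{-1} g / (g^H R^{-1} g), applied to a stacked spatio-temporal vector for a single time-frequency bin. It is not the authors' implementation. The function name `mfmvdr_weights`, the diagonal loading term, the stacking of 2 microphones times 3 frames, and the random toy data are all assumptions made for this example; in the paper the correlation vector and the (inverse) covariance matrix are estimated by TCNs rather than drawn at random.

```python
import numpy as np


def mfmvdr_weights(noise_cov, speech_corr_vec, diag_load=1e-6):
    """Closed-form MVDR weights w = R^{-1} g / (g^H R^{-1} g) for one time-frequency bin.

    noise_cov       : (D, D) Hermitian noise spatio-temporal covariance matrix R
    speech_corr_vec : (D,) speech spatio-temporal correlation vector g
    diag_load       : relative diagonal loading (an assumption of this sketch)
    """
    dim = noise_cov.shape[0]
    loading = diag_load * np.trace(noise_cov).real / dim
    r_inv_g = np.linalg.solve(noise_cov + loading * np.eye(dim), speech_corr_vec)
    # np.vdot conjugates its first argument, so this is g^H R^{-1} g.
    return r_inv_g / np.vdot(speech_corr_vec, r_inv_g)


# Toy data standing in for the quantities the TCNs would estimate.
rng = np.random.default_rng(0)
num_mics, num_frames = 2, 3          # left/right microphone, 3 frames each (hypothetical)
dim = num_mics * num_frames          # length of the stacked vector

noise = rng.standard_normal((dim, 500)) + 1j * rng.standard_normal((dim, 500))
noise_cov = noise @ noise.conj().T / noise.shape[1]

g = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
g = g / g[0]                         # normalize to the reference entry

w = mfmvdr_weights(noise_cov, g)

y = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)  # stacked noisy STFT frames
speech_estimate = np.vdot(w, y)      # filter output w^H y

print(abs(np.vdot(w, g)))            # distortionless constraint: close to 1
```

In a binaural setup one would presumably compute two such filters, one per ear, each with a correlation vector normalized to that ear's reference microphone; this is an assumption of the sketch, not a detail taken from the abstract.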