Paper Title

DopplerBAS: Binaural Audio Synthesis Addressing Doppler Effect

Authors

Jinglin Liu, Zhenhui Ye, Qian Chen, Siqi Zheng, Wen Wang, Qinglin Zhang, Zhou Zhao

Abstract

Recently, binaural audio synthesis (BAS) has emerged as a promising research field for its applications in augmented and virtual reality. Binaural audio helps users orient themselves and establish immersion by providing the brain with interaural time differences that reflect spatial information. However, existing BAS methods are limited in phase estimation, which is crucial for spatial hearing. In this paper, we propose the DopplerBAS method to explicitly address the Doppler effect of a moving sound source. Specifically, we calculate the radial relative velocity of the moving speaker in spherical coordinates, which further guides the synthesis of binaural audio. This simple method introduces no additional hyper-parameters, does not modify the loss functions, and is plug-and-play: it scales well to different types of backbones. DopplerBAS distinctly improves the representative WarpNet and BinauralGrad backbones in the phase error metric and reaches a new state of the art (SOTA) of 0.780 (versus the previous SOTA of 0.807). Experiments and ablation studies demonstrate the effectiveness of our method.
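
To make the radial-velocity idea in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names `radial_velocity` and `doppler_factor`, the finite-difference estimate, the frame rate, and the speed of sound of 343 m/s are all assumptions chosen for illustration. The sketch takes a trajectory of source positions relative to one ear, computes the spherical radius r at each frame, and differentiates it over time to obtain the radial velocity. It then applies the textbook Doppler factor for a moving source and a stationary listener.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def radial_velocity(source_pos, ear_pos, frame_rate):
    """Estimate the radial velocity of a moving source relative to one ear.

    source_pos: array of shape (T, 3), Cartesian positions in metres.
    ear_pos:    array of shape (3,), ear position in metres.
    frame_rate: position frames per second (Hz).
    Returns an array of shape (T,) in m/s; positive means receding.
    """
    rel = np.asarray(source_pos, dtype=float) - np.asarray(ear_pos, dtype=float)
    r = np.linalg.norm(rel, axis=1)        # spherical radius at each frame
    return np.gradient(r) * frame_rate     # dr/dt via finite differences


def doppler_factor(v_radial, c=SPEED_OF_SOUND):
    """Ratio of observed to emitted frequency for a stationary listener."""
    return c / (c + np.asarray(v_radial, dtype=float))


# Hypothetical trajectory: source moves away along x at 1 m/s, sampled at 100 Hz.
t = np.arange(10) / 100.0
trajectory = np.stack([1.0 + t, np.zeros_like(t), np.zeros_like(t)], axis=1)
v = radial_velocity(trajectory, ear_pos=[0.0, 0.0, 0.0], frame_rate=100)
factor = doppler_factor(v)
```

In this toy case the source recedes, so the factor falls slightly below 1 and the perceived pitch drops. How such a velocity feature is actually fed into a backbone such as WarpNet or BinauralGrad is not specified in the abstract and is left out here.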
