Learning Multiple Sound Source 2D Localization

Paper Title

Learning Multiple Sound Source 2D Localization

Authors

Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante

Abstract

In this paper, we propose novel deep learning based algorithms for multiple sound source localization. Specifically, we aim to find the 2D Cartesian coordinates of multiple sound sources in an enclosed environment by using multiple microphone arrays. To this end, we use an encoding-decoding architecture and propose two improvements on it to accomplish the task. In addition, we also propose two novel localization representations which increase the accuracy. Lastly, new metrics are developed relying on resolution-based multiple source association which enables us to evaluate and compare different localization approaches. We tested our method on both synthetic and real world data. The results show that our method improves upon the previous baseline approach for this problem.
