Low-Complexity Models for Acoustic Scene Classification Based on Receptive Field Regularization and Frequency Damping

Paper Title

Low-Complexity Models for Acoustic Scene Classification Based on Receptive Field Regularization and Frequency Damping

Authors

Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer

Abstract

Deep Neural Networks are known to be very demanding in terms of computing and memory requirements. Due to the ever-increasing use of embedded systems and mobile devices with a limited resource budget, designing low-complexity models without sacrificing too much of their predictive performance has gained great importance. In this work, we investigate and compare several well-known methods to reduce the number of parameters in neural networks. We further put these into the context of a recent study on the effect of the Receptive Field (RF) on a model's performance, and empirically show that we can achieve high-performing low-complexity models by applying specific restrictions on the RFs, in combination with parameter reduction methods. Additionally, we propose a filter-damping technique for regularizing the RF of models, without altering their architecture or changing their parameter counts. We show that incorporating this technique improves the performance in various low-complexity settings such as pruning and decomposed convolution. Using our proposed filter damping, we achieved the 1st rank at the DCASE-2020 Challenge in the task of Low-Complexity Acoustic Scene Classification.
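
As a rough illustration of the idea of damping a filter without changing its shape or parameter count, the following minimal sketch scales the rows of a 2D kernel along one axis (taken here to be frequency). The abstract does not specify the damping function, so the linear decay, the `min_scale` parameter, and the function names `damping_profile` and `damp_filter` are all assumptions made for illustration, not the authors' method.

```python
def damping_profile(size, min_scale):
    """Hypothetical profile: 1.0 at the centre tap, decaying linearly to min_scale at the edges."""
    centre = (size - 1) / 2
    if centre == 0:
        return [1.0]  # a single tap has nothing to damp
    return [1.0 - (1.0 - min_scale) * abs(i - centre) / centre for i in range(size)]


def damp_filter(kernel, min_scale=0.5):
    """Scale each row of a 2D kernel (rows assumed to index frequency).

    The kernel keeps its shape, so the number of parameters is unchanged;
    only the influence of the outer frequency taps is reduced.
    """
    scales = damping_profile(len(kernel), min_scale)
    return [[w * s for w in row] for row, s in zip(kernel, scales)]


# A 3x3 kernel of ones makes the applied scale factors directly visible.
kernel = [[1.0, 1.0, 1.0] for _ in range(3)]
damped = damp_filter(kernel, min_scale=0.5)
```

In this sketch the centre row is left untouched while the outer rows are attenuated, which reduces the effective extent of the filter along that axis without removing any weights.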
