DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity Acoustic Scene Classification

Paper title

DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity Acoustic Scene Classification

Authors

Jingqiao Zhao, Zhen-Hua Feng, Qiuqiang Kong, Xiaoning Song, Xiao-Jun Wu

Abstract

This paper presents a Depthwise Disout Convolutional Neural Network (DD-CNN) for the detection and classification of urban acoustic scenes. Specifically, we use log-mel features as the representation of the acoustic signals fed to the network. In the proposed DD-CNN, depthwise separable convolution is used to reduce network complexity. In addition, SpecAugment and Disout are used to further boost performance. Experimental results demonstrate that DD-CNN can learn discriminative acoustic characteristics from audio fragments while effectively reducing network complexity. DD-CNN was applied to the low-complexity acoustic scene classification task of the DCASE2020 Challenge, where it achieved 92.04% accuracy on the validation set.
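
The complexity reduction attributed to depthwise separable convolution can be illustrated with a simple weight count. The sketch below compares a standard k x k convolution with a depthwise k x k convolution followed by a 1 x 1 pointwise convolution. The layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the paper, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only (not from the paper).
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std} weights, depthwise separable: {sep} weights")
print(f"reduction factor: {std / sep:.2f}x")
```

For these assumed sizes the separable layer needs 8,768 weights against 73,728 for the standard layer, roughly an 8.4x reduction. The actual saving in DD-CNN depends on the layer configuration reported in the paper.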
