Paper Title

Transformer-based Arabic Dialect Identification

Authors

Wanqiu Lin, Maulik Madhavi, Rohan Kumar Das, Haizhou Li

Abstract

This paper presents a dialect identification (DID) system based on the transformer neural network architecture. Conventional convolutional neural network (CNN)-based systems use shorter receptive fields. We believe that long-range information is equally important for language and dialect identification, and the self-attention mechanism in the transformer captures such long-range dependencies. In addition, to reduce the computational complexity, self-attention with downsampling is used to process the acoustic features. This process extracts sparse yet informative features. Our experimental results show that the transformer outperforms CNN-based networks on the Arabic dialect identification (ADI) dataset. We also report that the score-level fusion of the CNN-based and transformer-based systems obtains an overall accuracy of 86.29% on the ADI17 database.
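To illustrate the idea of self-attention with downsampling described in the abstract, below is a minimal PyTorch sketch. It is not the authors' implementation; the module name, dimensions, pooling stride, and the choice of average pooling for temporal downsampling are illustrative assumptions. The point it demonstrates is that computing queries on a strided copy of the frame sequence shrinks the attention matrix from T x T to (T/s) x T, yielding a sparser but still informative representation.

```python
import torch
import torch.nn as nn

class DownsampledSelfAttention(nn.Module):
    """Illustrative sketch: queries are computed on a temporally downsampled
    copy of the acoustic feature sequence, while keys and values keep the
    full frame resolution. Hyperparameters are assumptions, not the paper's."""

    def __init__(self, d_model: int = 256, n_heads: int = 4, stride: int = 2):
        super().__init__()
        # Average pooling along time reduces T frames to T / stride query positions.
        self.pool = nn.AvgPool1d(kernel_size=stride, stride=stride)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model) frame-level acoustic features
        q = self.pool(x.transpose(1, 2)).transpose(1, 2)  # (batch, time/stride, d_model)
        out, _ = self.attn(q, x, x)  # sparse queries attend over all frames
        return out                   # (batch, time/stride, d_model)

# Usage example with hypothetical shapes: 8 utterances, 300 frames, 256-dim features.
feats = torch.randn(8, 300, 256)
layer = DownsampledSelfAttention()
print(layer(feats).shape)  # torch.Size([8, 150, 256])
```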
