Leveraging Domain Features for Detecting Adversarial Attacks Against Deep Speech Recognition in Noise

Paper title

Leveraging Domain Features for Detecting Adversarial Attacks Against Deep Speech Recognition in Noise

Authors

Christian Heider Nielsen, Zheng-Hua Tan

Abstract

In recent years, deep model-based automatic speech recognition (ASR) has made significant progress, leading to its widespread deployment in the real world. At the same time, adversarial attacks against deep ASR systems have proven highly successful. Various methods have been proposed to defend ASR systems against these attacks. However, existing classification-based methods focus on the design of deep learning models and leave domain-specific features largely unexplored. This work leverages filter bank-based features to better capture the characteristics of attacks and thereby improve detection. Furthermore, the paper analyses the potential of using speech and non-speech parts separately in detecting adversarial attacks. Finally, considering the adverse environments in which ASR systems may be deployed, we study the impact of acoustic noise of various types and signal-to-noise ratios. Extensive experiments show that the inverse filter bank features generally perform better in both clean and noisy environments, that detection is effective using either the speech or the non-speech part, and that acoustic noise can substantially degrade detection performance.
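The abstract mentions evaluating detection under acoustic noise at various signal-to-noise ratios. As a rough illustration of what mixing noise into a signal at a target SNR involves, here is a minimal sketch. It is not the authors' pipeline: the function names `mix_at_snr` and `measured_snr_db` are hypothetical, the "speech" is a synthetic tone standing in for a real recording, and the noise is white Gaussian noise rather than any of the noise types used in the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Measure the SNR in dB of `noisy`, given the clean reference `speech`."""
    residual = [y - s for s, y in zip(speech, noisy)]
    speech_power = sum(s * s for s in speech) / len(speech)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / residual_power)


# Assumption: a 1-second 440 Hz tone at 16 kHz stands in for a speech signal.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(speech, noise, snr_db=10.0)
print(round(measured_snr_db(speech, noisy), 3))
```

Sweeping `snr_db` over a range of values and repeating the mix with different noise recordings would give the kind of clean-to-noisy test conditions the abstract refers to, under the assumptions stated above.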

Paper title