Design Choices for X-vector Based Speaker Anonymization

Paper Title

Design Choices for X-vector Based Speaker Anonymization

Authors

Srivastava, Brij Mohan Lal, Tomashenko, Natalia, Wang, Xin, Vincent, Emmanuel, Yamagishi, Junichi, Maouche, Mohamed, Bellet, Aurélien, Tommasi, Marc

Abstract

The recently proposed x-vector based anonymization scheme converts any input voice into that of a random pseudo-speaker. In this paper, we present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge. We explore several design choices for the distance metric between speakers, the region of x-vector space where the pseudo-speaker is picked, and gender selection. To assess the strength of anonymization achieved, we consider attackers using an x-vector based speaker verification system who may use original or anonymized speech for enrollment, depending on their knowledge of the anonymization scheme. The Equal Error Rate (EER) achieved by the attackers and the decoding Word Error Rate (WER) over anonymized data are reported as the measures of privacy and utility. Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
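The abstract uses the Equal Error Rate (EER) as its privacy measure: the operating point at which the attacker's verification system falsely accepts impostors as often as it falsely rejects genuine speakers. A higher EER means the attacker performs worse, so the anonymization is stronger. The following is a minimal sketch of how an EER can be approximated from trial scores. It is not the paper's evaluation code; the function name, the threshold sweep, and the toy scores are assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of trials where both utterances share a speaker.
    nontarget_scores: scores of trials with different speakers.
    (Hypothetical helper, not taken from the paper.)
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a same-speaker trial scores below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scores at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, invented for illustration only.
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))
```

In the setting the abstract describes, the scores would come from an x-vector based speaker verification system, with enrollment on original or anonymized speech depending on what the attacker knows.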
