Multi-Label Training for Text-Independent Speaker Identification

Paper Title

Multi-Label Training for Text-Independent Speaker Identification

Author

Xue, Yuqi

Abstract

In this paper, we propose a novel strategy for text-independent speaker identification systems: Multi-Label Training (MLT). Instead of the commonly used one-to-one correspondence between an utterance and its speaker label, we divide all the utterances of each speaker into several subgroups, with each subgroup assigned a different set of labels. During identification, a speaker is identified as long as the predicted label matches any one of his or her corresponding labels. We found that this method forces the model to distinguish the data more accurately and gains some of the advantages of ensemble learning, while avoiding a significant increase in computation and storage. In our experiments, Multi-Label Training achieved better identification performance than common methods, not only in clean conditions but also in noisy conditions with speech enhancement. The proposed strategy can be easily applied to almost all current text-independent speaker identification models to achieve further improvements.
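The label scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not say how subgroups are formed or how many there are, so the random split, the one-label-per-subgroup reading, the subgroup count `k`, and the helper names `assign_sublabels` and `identify` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def assign_sublabels(utterances, k, seed=0):
    """Split each speaker's utterances into k subgroups, one training label each.

    utterances: list of (utterance_id, speaker_id) pairs.
    Returns (utt_to_label, label_to_speaker), where label_to_speaker[i]
    is the speaker that owns training label i.
    Assumption: subgroups are formed by a random, roughly even split.
    """
    rng = random.Random(seed)
    by_speaker = defaultdict(list)
    for utt_id, speaker in utterances:
        by_speaker[speaker].append(utt_id)

    utt_to_label = {}
    label_to_speaker = []
    for speaker in sorted(by_speaker):
        utts = list(by_speaker[speaker])
        rng.shuffle(utts)
        base = len(label_to_speaker)            # first label owned by this speaker
        label_to_speaker.extend([speaker] * k)  # k labels map back to one speaker
        for i, utt_id in enumerate(utts):
            utt_to_label[utt_id] = base + i % k
    return utt_to_label, label_to_speaker


def identify(predicted_label, label_to_speaker):
    """Map a predicted training label back to the speaker who owns it."""
    return label_to_speaker[predicted_label]
```

Under these assumptions, a classifier is trained on `utt_to_label` with `k` times as many output classes as speakers, and a prediction counts as correct whenever `identify` returns the true speaker, whichever of that speaker's labels was predicted.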
