SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022

Paper Title

SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022

Authors

Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian

Abstract

This report describes the SJTU-AISPEECH system for the VoxCeleb Speaker Recognition Challenge 2022. For Track 1, we implemented two kinds of systems: an online system and an offline system. We explored different ResNet-based backbones and loss functions. Our final fusion system achieved 3rd place in Track 1. For Track 3, we implemented statistic adaptation and joint-training-based domain adaptation. In the latter, we jointly trained on the source-domain and target-domain datasets with different training objectives. We explored two training objectives for the target-domain data: a self-supervised angular prototypical loss and a semi-supervised classification loss with estimated pseudo labels. In addition, when the target-domain objective is a classification loss, we used the dynamic loss-gate and label correction (DLG-LC) strategy to improve the quality of the pseudo labels. Our final fusion system achieved 4th place in Track 3 (very close to 3rd place, with a relative gap of less than 1%).
