Paper title
Rapid Connectionist Speaker Adaptation
Paper authors
Paper abstract
We present SVCnet, a system for modelling speaker variability. Encoder neural networks specialized for each speech sound produce low-dimensional models of acoustic variation, and these models are combined into an overall model of voice variability. A training procedure is described that minimizes the dependence of this model on which sounds have been uttered. Using the trained model (SVCnet) and a brief, unconstrained sample of a new speaker's voice, the system produces a Speaker Voice Code that can be used to adapt a recognition system to the new speaker without retraining. A system that combines SVCnet with an MS-TDNN recognizer is described.