Paper title
Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning
Paper authors
Abstract
Despite the growing interest in expressive speech synthesis, the synthesis of nonverbal expressions remains under-explored. In this paper, we propose an audio laughter synthesis system based on a sequence-to-sequence TTS synthesis system. We leverage transfer learning by training a deep learning model to generate both speech and laughter from annotations. We evaluate our model with a listening test that compares it to an HMM-based laughter synthesis system, and find that it reaches higher perceived naturalness. Our solution is a first step towards a TTS system that can synthesize speech with control over amusement level by integrating laughter.