Paper Title
Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding
Paper Authors
Paper Abstract
Neural models have yielded state-of-the-art results on spoken language understanding (SLU) problems; however, these models require a significant amount of domain-specific labeled examples for training, which is prohibitively expensive. While pre-trained language models like BERT have been shown to capture a massive amount of knowledge by learning from unlabeled corpora and to solve SLU using fewer labeled examples for adaptation, the encoding of knowledge is implicit and agnostic to downstream tasks. Such encoding results in model inefficiencies in parameter usage: an entirely new model is required for every domain. To address these challenges, we introduce a novel SLU framework, comprising a conversational language modeling (CLM) pre-training task and a light encoder architecture. The CLM pre-training enables networks to capture the representation of the language in conversation style with the presence of ASR errors. The light encoder architecture separates the shared pre-trained networks from the mappings of generally encoded knowledge to specific domains of SLU, allowing domain adaptation to be performed solely at the light encoder and thus increasing efficiency. With the framework, we match the performance of state-of-the-art SLU results on Alexa internal datasets and on two public ones (ATIS, SNIPS), adding only 4.4% parameters per task.
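The paper does not ship code, but the parameter-efficiency claim (a frozen shared pre-trained network plus a small trainable per-domain module) can be illustrated with a minimal PyTorch sketch. Everything below is an assumption made for illustration, not the authors' implementation: the class name `LightEncoderSLU`, the bottleneck width `light_dim`, the choice of `bert-base-uncased` as the shared backbone, and the intent/slot head sizes are all hypothetical placeholders.

```python
# Minimal sketch (not the authors' code) of the light-encoder idea:
# a shared pre-trained encoder stays frozen, and only a small per-domain
# module plus task heads are trained, so each new domain adds few parameters.
import torch
import torch.nn as nn
from transformers import AutoModel

class LightEncoderSLU(nn.Module):
    def __init__(self, base_name="bert-base-uncased", light_dim=256,
                 num_intents=7, num_slot_labels=72):
        super().__init__()
        # Shared, generally pre-trained network; frozen during domain adaptation.
        self.backbone = AutoModel.from_pretrained(base_name)
        for p in self.backbone.parameters():
            p.requires_grad = False

        hidden = self.backbone.config.hidden_size
        # Domain-specific "light encoder": the only trainable part per task.
        self.light_encoder = nn.Sequential(
            nn.Linear(hidden, light_dim),
            nn.GELU(),
            nn.Linear(light_dim, hidden),
        )
        self.intent_head = nn.Linear(hidden, num_intents)    # utterance-level intent
        self.slot_head = nn.Linear(hidden, num_slot_labels)  # token-level slot tags

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        h = self.light_encoder(out.last_hidden_state)  # (batch, seq_len, hidden)
        intent_logits = self.intent_head(h[:, 0])      # pool the first token for intent
        slot_logits = self.slot_head(h)                # per-token slot predictions
        return intent_logits, slot_logits

# Only the light encoder and task heads receive gradients.
model = LightEncoderSLU()
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=3e-4)
```

In this sketch the trainable fraction is a small share of total parameters, in the spirit of the paper's "4.4% parameters per task" figure, though the authors' actual light encoder and CLM pre-training objective are not reproduced here.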