Paper Title

Improving Neural Topic Models using Knowledge Distillation

Paper Authors

Alexander Hoyle, Pranav Goel, Philip Resnik

Paper Abstract

Topic models are often used to identify human-interpretable topics to help make sense of large document collections. We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers. Our modular method can be straightforwardly applied with any neural topic model to improve topic quality, which we demonstrate using two models having disparate architectures, obtaining state-of-the-art topic coherence. We show that our adaptable framework not only improves performance in the aggregate over all estimated topics, as is commonly reported, but also in head-to-head comparisons of aligned topics.
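
To make the distillation idea concrete, below is a minimal sketch (not the paper's exact objective) of how a pretrained transformer teacher can be combined with a neural topic model student: the student's bag-of-words reconstruction target is blended with a softened word distribution from the teacher. All names here (distillation_loss, student_logits, teacher_probs, alpha, temperature) are illustrative assumptions; the paper specifies its own formulation.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, bow_counts, teacher_probs,
                      alpha=0.5, temperature=2.0):
    """Hypothetical sketch of distillation for a neural topic model.

    student_logits: unnormalized word scores from the topic model's decoder.
    bow_counts:     the document's bag-of-words counts (the usual NTM target).
    teacher_probs:  per-word probabilities from a pretrained transformer teacher.
    """
    # Empirical word distribution of the document (the standard reconstruction target).
    empirical = bow_counts / bow_counts.sum(dim=-1, keepdim=True)
    # Soften the teacher's distribution with a temperature, as is common in distillation.
    soft_teacher = F.softmax(torch.log(teacher_probs + 1e-10) / temperature, dim=-1)
    # Convex combination of the hard (empirical) and soft (teacher) targets.
    target = (1 - alpha) * empirical + alpha * soft_teacher
    # Cross-entropy between the blended target and the student's reconstruction.
    log_recon = F.log_softmax(student_logits, dim=-1)
    return -(target * log_recon).sum(dim=-1).mean()

# Example: a batch of 2 documents over a 5-word vocabulary.
logits = torch.randn(2, 5)
counts = torch.rand(2, 5) * 10
teacher = torch.softmax(torch.randn(2, 5), dim=-1)
loss = distillation_loss(logits, counts, teacher)

Because the teacher's signal enters only through the reconstruction target, a sketch like this can in principle be attached to any neural topic model's training loop, which is what makes the approach modular.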
