Very Large Language Model as a Unified Methodology of Text Mining

Paper Title

Very Large Language Model as a Unified Methodology of Text Mining

Author

Jiang, Meng

Abstract

Text data mining is the process of deriving essential information from language text. Typical text mining tasks include text categorization, text clustering, topic modeling, information extraction, and text summarization. Various data sets have been collected and various algorithms designed for these different types of tasks. In this paper, I present a blue-sky idea: that very large language models (VLLMs) will become an effective unified methodology for text mining. I discuss at least three advantages of this new methodology over conventional methods. Finally, I discuss the challenges in the design and development of VLLM techniques for text mining.
