Paper Title
Information Propagation by Composited Labels in Natural Language Processing
Authors
Abstract
In natural language processing (NLP), labeling regions of text, such as words, sentences, and paragraphs, is a basic task. In this paper, a label is defined as a map between a mention of an entity in a region of text and the context of that entity in a broader region of text containing the mention. This definition naturally introduces links between entities, induced by the inclusion relation of regions, and the connected entities form a graph representing the information flow defined by the map. It also enables calculation of the information lost through the map using entropy, and the entropy lost is regarded as the distance between two entities along a path on the graph.
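The entropy-based distance can be illustrated with a small sketch. The abstract does not give the paper's exact definitions, so the following rests on assumptions: a label is modeled as a deterministic map from mentions to contexts, the information lost through a map is taken to be the entropy before the map minus the entropy after it, and the distance along a path is the sum of the losses over its maps. All names (`entropy_loss`, `path_distance`, the toy vocabulary) are hypothetical and chosen only for illustration.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a {outcome: probability} distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def push_forward(dist, label_map):
    """Distribution of label_map[x] when x is drawn from dist."""
    out = {}
    for x, p in dist.items():
        y = label_map[x]
        out[y] = out.get(y, 0.0) + p
    return out

def entropy_loss(dist, label_map):
    """Assumed definition: entropy before the map minus entropy after it."""
    return entropy(dist) - entropy(push_forward(dist, label_map))

def path_distance(dist, maps):
    """Assumed definition: sum of entropy losses along a chain of maps."""
    total = 0.0
    for m in maps:
        total += entropy_loss(dist, m)
        dist = push_forward(dist, m)  # distribution seen by the next map
    return total

# Toy data (hypothetical): four word-level mentions, equally likely.
mentions = {"cat": 0.25, "dog": 0.25, "car": 0.25, "bus": 0.25}
# Word-level mention -> context in the containing sentence-level region.
word_to_sentence = {"cat": "animal", "dog": "animal",
                    "car": "vehicle", "bus": "vehicle"}
# Sentence-level context -> context in the containing paragraph-level region.
sentence_to_paragraph = {"animal": "thing", "vehicle": "thing"}

distance = path_distance(mentions, [word_to_sentence, sentence_to_paragraph])
print(distance)
```

In this toy chain each map merges two equally likely outcomes into one, so each step loses one bit under the assumed definition. Whether the paper uses this difference of entropies, a conditional entropy, or another measure cannot be determined from the abstract alone.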