Summary Markov Models for Event Sequences

Paper Title

Summary Markov Models for Event Sequences

Authors

Debarun Bhattacharjya, Saurabh Sihag, Oktie Hassanzadeh, Liza Bialik

Abstract

Datasets involving sequences of different types of events without meaningful time stamps are prevalent in many applications, for instance when extracted from textual corpora. We propose a family of models for such event sequences -- summary Markov models -- where the probability of observing an event type depends only on a summary of historical occurrences of its influencing set of event types. This Markov model family is motivated by Granger causal models for time series, with the important distinction that only one event can occur in a position in an event sequence. We show that a unique minimal influencing set exists for any set of event types of interest and choice of summary function, formulate two novel models from the general family that represent specific sequence dynamics, and propose a greedy search algorithm for learning them from event sequence data. We conduct an experimental investigation comparing the proposed models with relevant baselines, and illustrate their knowledge acquisition and discovery capabilities through case studies involving sequences from text.
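To make the central idea concrete, here is a minimal sketch of one possible summary function. It is not the authors' implementation. It assumes a binary summary (did each influencing event type occur in the last `window` positions?) and estimates the probability of a target event type by relative frequency. The names `binary_summary`, `fit`, `influencers`, and `window` are hypothetical and chosen only for illustration, and the influencing set is supplied by hand rather than learned by the greedy search the abstract mentions.

```python
from collections import Counter, defaultdict


def binary_summary(history, influencers, window):
    """Return 0/1 flags: did each influencing type occur in the last `window` positions?"""
    # history[-0:] would return the whole list, so handle window == 0 explicitly.
    recent = set(history[-window:]) if window > 0 else set()
    return tuple(int(e in recent) for e in sorted(influencers))


def fit(sequence, target, influencers, window):
    """Estimate P(event at a position == target | summary of preceding events)."""
    counts = defaultdict(Counter)
    for i, event in enumerate(sequence):
        summary = binary_summary(sequence[:i], influencers, window)
        counts[summary][event == target] += 1
    # Relative frequency of the target within each observed summary state.
    return {s: c[True] / (c[True] + c[False]) for s, c in counts.items()}


# Toy sequence (illustrative data only): "b" tends to follow "a".
sequence = ["a", "b", "a", "b", "a", "c"]
probs = fit(sequence, target="b", influencers={"a"}, window=1)
for summary, p in sorted(probs.items()):
    print(summary, round(p, 3))
```

In this toy run the estimate for "b" is higher when "a" occurred in the preceding position than when it did not, which is the kind of dependence on a summary of the influencing set that the abstract describes.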
