Paper Title

Decomposable Families of Itemsets

Authors

Nikolaj Tatti, Hannes Heikinheimo

Abstract

The problem of selecting a small yet high-quality subset of patterns from a larger collection of itemsets has recently attracted a lot of research. Here we discuss an approach to this problem using the notion of decomposable families of itemsets. Such itemset families define a probabilistic model for the data from which the original collection of itemsets was derived. Furthermore, they induce a special tree structure, called a junction tree, familiar from the theory of Markov random fields. The method has several advantages. Junction trees provide an intuitive representation of the mining results. From a computational point of view, the model provides leverage for problems that could be intractable using the entire collection of itemsets. We provide an efficient algorithm to build decomposable itemset families and give an application example with frequency bound querying using the model. Empirical results show that our algorithm yields high-quality results.
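As a rough illustration of how a decomposable family can define a probabilistic model, the sketch below uses the standard junction tree factorization, in which the joint distribution is the product of clique marginals divided by the product of separator marginals. It is not the paper's algorithm: the three items, the two cliques {a, b} and {b, c}, the separator {b}, the toy transactions, and the helper names `marginal` and `joint` are all assumptions made for the example. The last line estimates the frequency of the itemset {a, c} under the model, which is a point estimate and not the frequency bound querying the abstract mentions.

```python
from collections import Counter
from itertools import product


def marginal(rows, cols):
    """Empirical distribution of the selected columns (hypothetical helper)."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return {key: count / n for key, count in counts.items()}


def joint(rows):
    """Build p(a, b, c) = p(a, b) * p(b, c) / p(b) for the assumed junction tree."""
    p_ab = marginal(rows, (0, 1))  # clique {a, b}
    p_bc = marginal(rows, (1, 2))  # clique {b, c}
    p_b = marginal(rows, (1,))     # separator {b}

    def p(a, b, c):
        sep = p_b.get((b,), 0.0)
        if sep == 0.0:
            # The separator value never occurs, so the model assigns zero mass.
            return 0.0
        return p_ab.get((a, b), 0.0) * p_bc.get((b, c), 0.0) / sep

    return p


# Toy binary transactions over items (a, b, c); assumed data, not from the paper.
rows = [(1, 1, 1), (1, 1, 0), (0, 1, 1), (0, 0, 0)]
p = joint(rows)

# The factorized model should sum to one over all eight item combinations.
total = sum(p(a, b, c) for a, b, c in product((0, 1), repeat=3))

# Model-based frequency estimate for the itemset {a, c}, marginalizing over b.
freq_ac = sum(p(1, b, 1) for b in (0, 1))
print(total, freq_ac)
```

Only the two clique marginals and the separator marginal are stored here, yet the model can answer a query about {a, c}, an itemset that belongs to neither clique.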
