Paper Title
Experimental Study of Concise Representations of Concepts and Dependencies
Paper Authors
Paper Abstract
In this paper we study concise representations of concepts and dependencies, i.e., implications and association rules. Such representations are based on equivalence classes and their elements, i.e., minimal generators, minimum generators including keys and passkeys, proper premises, and pseudo-intents. All these sets of attributes are significant and well studied from the computational point of view, while their statistical properties remain to be studied. The purpose of this paper is to study these particular attribute sets and, in parallel, to examine how to evaluate the complexity of a dataset from a Formal Concept Analysis (FCA) point of view. We analyze the empirical distributions and the sizes of these attribute sets. In addition, we propose several measures of data complexity, such as distributivity, linearity, size of concepts, and size of minimum generators, for the analysis of real-world and synthetic datasets.
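To make the notions of equivalence class and generator more concrete, the following is a minimal brute-force sketch, not the method used in the paper. It assumes the common FCA reading in which a key is a minimal generator (no proper subset has the same closure) and a passkey is a generator of minimum cardinality within its equivalence class. The toy context and the names `CONTEXT`, `closure`, `equivalence_classes`, and `keys_and_passkeys` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy formal context: object -> set of attributes it has.
CONTEXT = {
    "g1": {"a", "b", "c"},
    "g2": {"b"},
    "g3": {"c"},
}
ATTRIBUTES = sorted(set().union(*CONTEXT.values()))


def closure(attrs):
    """Return attrs'': the attributes shared by every object that has attrs."""
    extent = [g for g, row in CONTEXT.items() if attrs <= row]
    if not extent:
        # No object has all of attrs, so the closure is the full attribute set.
        return frozenset(ATTRIBUTES)
    return frozenset(set.intersection(*(CONTEXT[g] for g in extent)))


def equivalence_classes():
    """Group every attribute subset by its closure (exponential, toy sizes only)."""
    classes = {}
    for size in range(len(ATTRIBUTES) + 1):
        for combo in combinations(ATTRIBUTES, size):
            subset = frozenset(combo)
            classes.setdefault(closure(subset), []).append(subset)
    return classes


def keys_and_passkeys(members):
    """Split one equivalence class into keys and passkeys (assumed definitions)."""
    # Key: no proper subset lies in the same equivalence class.
    keys = [x for x in members if not any(y < x for y in members)]
    # Passkey: a key of minimum cardinality.
    smallest = min(len(x) for x in keys)
    passkeys = [x for x in keys if len(x) == smallest]
    return keys, passkeys


for intent, members in equivalence_classes().items():
    keys, passkeys = keys_and_passkeys(members)
    print(sorted(intent), [sorted(k) for k in keys], [sorted(p) for p in passkeys])
```

In this toy context the class whose closure is `{a, b, c}` has two keys of different sizes, so its passkeys form a strict subset of its keys. Size statistics of the kind mentioned in the abstract could then be collected by counting the lengths of the returned sets.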