Diverse Rule Sets

Paper title

Diverse Rule Sets

Authors

Zhang, Guangyi; Gionis, Aristides

Abstract

While machine-learning models are flourishing and transforming many aspects of everyday life, the inability of humans to understand complex models poses difficulties for these models to be fully trusted and embraced. Thus, interpretability of models has been recognized as an equally important quality as their predictive power. In particular, rule-based systems are experiencing a renaissance owing to their intuitive if-then representation. However, simply being rule-based does not ensure interpretability. For example, overlapped rules spawn ambiguity and hinder interpretation. Here we propose a novel approach of inferring diverse rule sets, by optimizing small overlap among decision rules with a 2-approximation guarantee under the framework of Max-Sum diversification. We formulate the problem as maximizing a weighted sum of discriminative quality and diversity of a rule set. In order to overcome an exponential-size search space of association rules, we investigate several natural options for a small candidate set of high-quality rules, including frequent and accurate rules, and examine their hardness. Leveraging the special structure in our formulation, we then devise an efficient randomized algorithm, which samples rules that are highly discriminative and have small overlap. The proposed sampling algorithm analytically targets a distribution of rules that is tailored to our objective. We demonstrate the superior predictive power and interpretability of our model with a comprehensive empirical study against strong baselines.
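The objective described in the abstract, a weighted sum of rule quality and rule-set diversity, can be illustrated with a small sketch. This is not the authors' sampling algorithm: it is a plain greedy selection over a fixed candidate set, and it carries no approximation guarantee as written. The quality scores, the use of Jaccard distance between rule coverages as the diversity measure, the weight `lam`, and all function and rule names are assumptions made for illustration only.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def objective(selected, quality, cover, lam):
    """Total quality plus lam times the total pairwise coverage distance."""
    q = sum(quality[r] for r in selected)
    d = sum(jaccard_distance(cover[a], cover[b])
            for a, b in combinations(selected, 2))
    return q + lam * d


def greedy_diverse_rules(quality, cover, k, lam):
    """Pick up to k rules, each time adding the rule with the largest marginal gain."""
    selected = []
    remaining = set(quality)
    while remaining and len(selected) < k:
        def gain(r):
            # Marginal gain: the rule's own quality plus its weighted
            # distance to every rule already selected.
            return quality[r] + lam * sum(
                jaccard_distance(cover[r], cover[s]) for s in selected)
        # Sorting first makes tie-breaking deterministic (by rule name).
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy candidates: rule name -> quality, and covered record ids.
quality = {"r1": 0.9, "r2": 0.85, "r3": 0.6}
cover = {"r1": {0, 1, 2, 3}, "r2": {0, 1, 2, 4}, "r3": {5, 6, 7}}
chosen = greedy_diverse_rules(quality, cover, k=2, lam=1.0)
```

In this toy input, `r2` scores higher than `r3` on quality but covers almost the same records as `r1`, so with `lam=1.0` the selection prefers the non-overlapping `r3`. Setting `lam=0.0` ignores overlap and picks the two highest-quality rules instead.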
