Deep Explainable Learning with Graph Based Data Assessing and Rule Reasoning

Paper Title

Deep Explainable Learning with Graph Based Data Assessing and Rule Reasoning

Authors

Yuanlong Li, Gaopan Huang, Min Zhou, Chuan Fu, Honglin Qiao, Yan He

Abstract

Learning an explainable classifier often yields a low-accuracy model or an unwieldy rule set, whereas a deep model usually handles noisy data at scale better, at the cost of results that are hard to explain and weak generalization. To close this gap, we propose an end-to-end deep explainable learning approach that combines the noise-handling strength of deep models with the interpretability of expert rules. Specifically, we learn a deep data assessing model that represents the data as a graph capturing the correlations among different observations; its output is used to extract key data features. The key features are then fed into a rule network constructed from predefined, noisy expert rules with trainable parameters. Because these models are correlated, we propose an end-to-end training framework that uses the rule classification loss to optimize the rule learning model and the data assessing model simultaneously. Because the rule-based computation is non-differentiable, we propose a gradient linking search module that carries gradient information from the rule learning model to the data assessing model. The proposed method was tested in an industry production system. Compared with a decent deep ensemble baseline, it shows comparable prediction accuracy, much higher generalization stability, and better interpretability; it also shows much better fitting power than a pure rule-based approach.
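
The idea of an expert rule with a trainable parameter can be illustrated with a small sketch. This is not the paper's rule network or its gradient linking search module, neither of which the abstract specifies. It assumes a single hypothetical rule of the form `x > threshold` and swaps in a sigmoid relaxation so that the classification loss can adjust the threshold by plain gradient descent. The names `soft_rule`, `fit_threshold`, and `temperature` are illustrative only.

```python
import math


def soft_rule(x, threshold, temperature=0.1):
    """Smooth stand-in for the hard rule `x > threshold`; returns a value in (0, 1)."""
    z = (x - threshold) / temperature
    # Numerically stable sigmoid.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_threshold(xs, ys, threshold=0.0, temperature=0.1, lr=0.05, steps=500):
    """Tune the rule threshold by gradient descent on mean binary cross-entropy."""
    for _ in range(steps):
        grad = 0.0
        for x, y in zip(xs, ys):
            p = soft_rule(x, threshold, temperature)
            # d(BCE)/d(threshold) = (p - y) * d(z)/d(threshold) = -(p - y) / temperature
            grad += -(p - y) / temperature
        threshold -= lr * grad / len(xs)
    return threshold


# Hypothetical data: an expert's initial guess (threshold 0.0) is too low.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
learned = fit_threshold(xs, ys)
print(f"learned threshold: {learned:.3f}")
```

On this toy data the threshold moves from the initial guess toward the midpoint between the two classes, after which the hard rule `x > learned` separates them. The relaxation is only one possible way to obtain gradients for a rule; the abstract states that the authors use a dedicated search module because their rule computation is non-differentiable.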
