Paper Title
Feature-based Learning for Diverse and Privacy-Preserving Counterfactual Explanations
Paper Authors
Paper Abstract
Interpretable machine learning seeks to understand the reasoning process of complex black-box systems, which have long been notorious for their lack of explainability. One flourishing approach is counterfactual explanation, which provides suggestions on what a user can do to alter an outcome. Not only must a counterfactual example counter the original prediction of the black-box classifier, but it should also satisfy various constraints for practical applications. Diversity is one of these critical constraints, yet it remains less discussed. While diverse counterfactuals are ideal, it is computationally challenging to satisfy diversity alongside the other constraints. Furthermore, there is growing privacy concern over released counterfactual data. To this end, we propose a feature-based learning framework that effectively handles the counterfactual constraints and contributes to the limited pool of privacy-preserving explanation models. We demonstrate the flexibility and effectiveness of our method in generating diverse counterfactuals that are actionable and plausible. Our counterfactual engine is more efficient than counterparts of the same capacity while yielding the lowest re-identification risks.
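To illustrate the basic idea of a counterfactual explanation referenced in the abstract, the sketch below shows a generic gradient-based search that nudges an input just enough to flip a classifier's prediction while staying close to the original point. This is not the paper's feature-based framework; the toy logistic "black-box" (`W`, `b`), the proximity weight `lam`, and all other names are hypothetical and chosen only for a minimal, runnable demonstration.

```python
# Minimal sketch (not the paper's method): find a counterfactual x' near x
# whose predicted class flips, by descending a prediction-flip + proximity loss.
import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=4), -0.5          # hypothetical "black-box" logistic model


def predict_proba(x):
    """Probability of the positive class under the toy classifier."""
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))


def counterfactual(x, target, lam=0.1, lr=0.05, steps=500):
    """Search for a point close to x that the classifier assigns to `target`."""
    x_cf = x.copy()
    for _ in range(steps):
        p = predict_proba(x_cf)
        # Gradient of the cross-entropy toward `target` plus an L2 proximity term.
        grad = (p - target) * W + lam * (x_cf - x)
        x_cf -= lr * grad
        if (predict_proba(x_cf) > 0.5) == bool(target):
            break  # prediction has flipped to the desired class
    return x_cf


x0 = rng.normal(size=4)
flip_to = int(predict_proba(x0) <= 0.5)   # aim for the opposite class
cf = counterfactual(x0, target=flip_to)
print("original prob:", predict_proba(x0), "counterfactual prob:", predict_proba(cf))
```

Extending such a search with diversity, actionability, and plausibility terms, and doing so privately and efficiently, is the problem the abstract describes; the sketch only covers the basic prediction-flip-with-proximity objective.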