Paper Title

Using Deep Image Priors to Generate Counterfactual Explanations

Paper Authors

Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Andreas Spanias

Abstract

Through the use of carefully tailored convolutional neural network architectures, a deep image prior (DIP) can be used to obtain pre-images from latent representation encodings. Although DIP inversion is known to be superior to conventional regularized inversion strategies such as total variation, such an over-parameterized generator is able to effectively reconstruct even images that are not in the original data distribution. This limitation makes it challenging to utilize such priors for tasks such as counterfactual reasoning, wherein the goal is to generate small, interpretable changes to an image that systematically lead to changes in the model prediction. To this end, we propose a novel regularization strategy based on an auxiliary loss estimator jointly trained with the predictor, which efficiently guides the prior to recover natural pre-images. Our empirical studies with a real-world ISIC skin lesion detection problem clearly demonstrate the effectiveness of the proposed approach in synthesizing meaningful counterfactuals. In comparison, we find that standard DIP inversion often proposes visually imperceptible perturbations to irrelevant parts of the image, thus providing no additional insights into the model behavior.
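To make the core idea concrete, the sketch below is a minimal NumPy toy (not the paper's implementation): the counterfactual is parameterized as the output of a small generator driven by fixed noise (standing in for the deep image prior), and the generator's weights are optimized so that a frozen predictor flips its decision toward a target class. The paper's learned auxiliary loss estimator is replaced here with a simple proximity penalty to the original input, and all names, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy "image" dimension

# Frozen predictor: logistic regression standing in for the trained CNN classifier.
w = rng.normal(size=d)
b = 0.0

def predict(x):
    """Probability of class 1 under the frozen predictor."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

# Original input, chosen so the predictor outputs class 0 (p < 0.5).
x = np.clip(-0.5 * w, -0.9, 0.9)

# DIP stand-in: a one-layer generator x_hat = tanh(theta @ z) driven by
# fixed noise z. We optimize the generator weights theta, not the pixels.
z = rng.normal(size=d)
theta = 0.01 * rng.normal(size=(d, d))

target, lam, lr = 1.0, 0.1, 0.05
for _ in range(2000):
    x_hat = np.tanh(theta @ z)
    p = predict(x_hat)
    # Gradient of [cross-entropy toward target class + lam * ||x_hat - x||^2]
    # w.r.t. x_hat. (The proximity term is a simplification of the paper's
    # learned auxiliary loss estimator, which regularizes toward natural images.)
    g_x = (p - target) * w + 2.0 * lam * (x_hat - x)
    # Chain rule through tanh and the linear generator.
    g_theta = np.outer((1.0 - x_hat ** 2) * g_x, z)
    theta -= lr * g_theta

x_cf = np.tanh(theta @ z)
print(predict(x), predict(x_cf))  # original stays class 0, counterfactual flips
```

Because the edit lives in the generator's weights rather than raw pixel space, the prior constrains what kinds of changes are expressible; the paper's contribution is choosing a regularizer so those changes remain small, natural, and interpretable rather than adversarial noise.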
