Paper Title
GAN-based Domain Inference Attack
Authors
Abstract
Model-based attacks can infer training data information from deep neural network models. These attacks depend heavily on the attacker's knowledge of the application domain, e.g., to select the auxiliary data for model-inversion attacks. However, attackers may not know what the model is used for in practice. We propose a generative adversarial network (GAN)-based method for exploring likely or similar domains of a target model: the model domain inference (MDI) attack. For a given target (classification) model, we assume that the attacker knows only the input and output formats and can query the model to obtain the prediction for any input of the required form. Our basic idea is to use the target model to affect the GAN training process on a candidate domain's dataset that is easy to obtain. We find that the target model may distract the training procedure less if the candidate domain is more similar to the target domain. We then measure the distraction level by the distance between GAN-generated datasets, which can be used to rank candidate domains for the target model. Our experiments show that the auxiliary dataset from an MDI top-ranked domain can effectively boost the result of model-inversion attacks.
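The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance is used or exactly which generated datasets are compared, so everything below is an assumption made for illustration: the helper names (`feature_means`, `mean_distance`, `rank_candidate_domains`) are hypothetical, the distance is a simple Euclidean distance between per-feature means, and each candidate domain is assumed to come with two sample sets, one from a GAN trained normally and one from a GAN whose training was influenced by the target model. A smaller distance is read as less distraction, and hence a more similar domain.

```python
import math
from typing import Dict, List, Sequence, Tuple

# A dataset is a sequence of equal-length feature vectors.
Dataset = Sequence[Sequence[float]]


def feature_means(data: Dataset) -> List[float]:
    """Per-feature mean of a non-empty dataset."""
    n = len(data)
    dim = len(data[0])
    return [sum(row[j] for row in data) / n for j in range(dim)]


def mean_distance(a: Dataset, b: Dataset) -> float:
    """Euclidean distance between the feature means of two datasets.

    Assumption: this stands in for whatever dataset distance the paper uses.
    """
    ma, mb = feature_means(a), feature_means(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ma, mb)))


def rank_candidate_domains(
    generated: Dict[str, Tuple[Dataset, Dataset]],
) -> List[Tuple[str, float]]:
    """Rank candidate domains by distraction level, smallest first.

    `generated` maps a domain name to (plain_samples, influenced_samples),
    a hypothetical pairing of GAN outputs without and with the target model.
    """
    scores = {
        name: mean_distance(plain, influenced)
        for name, (plain, influenced) in generated.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Toy data only; these are not results from the paper.
toy = {
    "digits": ([[0.0, 0.0], [2.0, 2.0]], [[0.0, 1.0], [2.0, 3.0]]),
    "faces": ([[0.0, 0.0], [2.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]),
}
ranking = rank_candidate_domains(toy)
print(ranking)
```

In this toy input the "digits" pair differs less than the "faces" pair, so "digits" is ranked first. In a realistic setting the mean-based distance would likely be replaced by a distribution-level distance suited to the data.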