Paper Title
Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks
Paper Authors
Paper Abstract
We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings where model retraining may be induced. An adversary first adds a few carefully crafted points to the training dataset such that the impact on the model's predictions is minimal. The adversary subsequently triggers a request to remove a subset of the introduced points, at which point the attack is unleashed and the model's predictions are negatively affected. In particular, we consider clean-label targeted attacks (in which the goal is to cause the model to misclassify a specific test point) on datasets including CIFAR-10, Imagenette, and Imagewoof. This attack is realized by constructing camouflage datapoints that mask the effect of a poisoned dataset.
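The mechanics described in the abstract can be illustrated with a deliberately simple toy model. The sketch below uses a nearest-centroid classifier on 1-D data, which is an assumption for illustration only (the paper's attacks target neural networks with crafted image perturbations): poison points alone would flip the model's prediction on a target test point, camouflage points cancel that effect during training, and an unlearning request that deletes only the camouflage points re-exposes the poison.

```python
# Toy sketch of a camouflaged poisoning attack triggered by unlearning.
# The nearest-centroid model and all data values are illustrative
# assumptions, not the paper's actual attack construction.

def centroid(points):
    return sum(points) / len(points)

def predict(train, x):
    """Nearest-centroid classifier over (feature, label) pairs."""
    by_class = {}
    for feat, label in train:
        by_class.setdefault(label, []).append(feat)
    return min(by_class, key=lambda lbl: abs(x - centroid(by_class[lbl])))

clean = [(-1, 0), (0, 0), (1, 0), (9, 1), (10, 1), (11, 1)]
target_x = 4                       # test point the attacker wants misclassified

poison     = [(3, 1)] * 3          # pulls the class-1 centroid toward the target
camouflage = [(17, 1)] * 3         # cancels the poison's pull on the centroid

print(predict(clean, target_x))                        # correct: class 0
print(predict(clean + poison + camouflage, target_x))  # attack hidden: still class 0
print(predict(clean + poison, target_x))               # camouflage unlearned: class 1
```

Here "unlearning" is modeled simply as retraining without the removed points; the key property is that the poisoned-plus-camouflaged training set behaves benignly, so the malicious effect only appears after the deletion request.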