Paper title
Tight Second-Order Certificates for Randomized Smoothing
Paper authors
Paper abstract
Randomized smoothing is a popular way of providing robustness guarantees against adversarial attacks: randomly-smoothed functions have a universal Lipschitz-like bound, allowing for robustness certificates to be easily computed. In this work, we show that there also exists a universal curvature-like bound for Gaussian random smoothing: given the exact value and gradient of a smoothed function, we compute a lower bound on the distance of a point to its closest adversarial example, called the Second-order Smoothing (SoS) robustness certificate. In addition to proving the correctness of this novel certificate, we show that SoS certificates are realizable and therefore tight. Interestingly, we show that the maximum achievable benefits, in terms of certified robustness, from using the additional information of the gradient norm are relatively small: because our bounds are tight, this is a fundamental negative result. The gain of SoS certificates further diminishes if we consider the estimation error of the gradient norms, for which we have developed an estimator. We therefore additionally develop a variant of Gaussian smoothing, called Gaussian dipole smoothing, which provides similar bounds to randomized smoothing with gradient information, but with much-improved sample efficiency. This allows us to achieve (marginally) improved robustness certificates on high-dimensional datasets such as CIFAR-10 and ImageNet. Code is available at https://github.com/alevine0/smoothing_second_order.
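For context on the "Lipschitz-like bound" the abstract mentions, here is a minimal sketch of the standard first-order certificate for Gaussian smoothing in the binary case, which the SoS certificate refines with gradient information. It is not the SoS certificate and is not taken from the authors' repository. The names `smoothed_probability`, `first_order_radius`, and the toy `base_classifier` are hypothetical. A real certificate would use a high-confidence lower bound on the probability, not the raw Monte Carlo estimate used here.

```python
import math
import random
from statistics import NormalDist


def smoothed_probability(base_classifier, x, sigma, n_samples=1000, seed=0):
    """Monte Carlo estimate of p(x) = P[base_classifier(x + eps) == 1],
    with eps ~ N(0, sigma^2 I). This is a point estimate, not a lower bound."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        if base_classifier(noisy) == 1:
            hits += 1
    return hits / n_samples


def first_order_radius(p, sigma):
    """Standard first-order L2 radius sigma * Phi^{-1}(p) for a binary
    smoothed classifier whose top-class probability is at least p."""
    if p <= 0.5:
        return 0.0          # no certificate when the class is not a majority
    if p >= 1.0:
        return math.inf     # Phi^{-1}(1) is unbounded
    return sigma * NormalDist().inv_cdf(p)


# Toy base classifier (assumption for illustration): sign of the first coordinate.
def base_classifier(z):
    return 1 if z[0] > 0 else 0


p_hat = smoothed_probability(base_classifier, [1.0, 0.0], sigma=0.5)
radius = first_order_radius(p_hat, sigma=0.5)
```

The radius grows with both the noise level `sigma` and the smoothed probability `p`. The abstract's negative result says that additionally knowing the gradient norm of the smoothed function can improve on this radius only modestly.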