Paper Title
AdaDGS: An adaptive black-box optimization method with a nonlocal directional Gaussian smoothing gradient
Paper Authors
Paper Abstract
The local gradient points in the direction of the steepest slope in an infinitesimal neighborhood. An optimizer guided by the local gradient is often trapped in local optima when the loss landscape is multi-modal. A directional Gaussian smoothing (DGS) approach was recently proposed in (Zhang et al., 2020) and used to define a truly nonlocal gradient, referred to as the DGS gradient, for high-dimensional black-box optimization. Promising results show that replacing the traditional local gradient with the DGS gradient can significantly improve the performance of gradient-based methods in optimizing highly multi-modal loss functions. However, the optimal performance of the DGS gradient may rely on fine-tuning two important hyper-parameters, i.e., the smoothing radius and the learning rate. In this paper, we present a simple, yet ingenious and efficient adaptive approach for optimization with the DGS gradient, which removes the need for hyper-parameter fine-tuning. Since the DGS gradient generally points in a good search direction, we perform a line search along the DGS direction to determine the step size at each iteration. The learned step size, in turn, informs us of the scale of the function landscape in the surrounding area, based on which we adjust the smoothing radius for the next iteration. We present experimental results on high-dimensional benchmark functions, an airfoil design problem, and a game content generation problem. The AdaDGS method shows superior performance over several state-of-the-art black-box optimization methods.
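To make the two ingredients of the abstract concrete, below is a minimal Python sketch of one AdaDGS-style iteration, written under stated assumptions: the DGS gradient is estimated with Gauss-Hermite quadrature along a fixed orthonormal basis (following the construction of Zhang et al., 2020), the line search simply evaluates candidate step sizes on a log-spaced grid, and the accepted step size is reused directly as the next smoothing radius. The function names (dgs_gradient, adadgs_step), the candidate grid, and the radius-update rule are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def dgs_gradient(f, x, sigma, xi, gh_points=5):
    """Estimate the nonlocal DGS gradient of f at x.

    Each directional derivative of the Gaussian-smoothed loss is
    approximated by Gauss-Hermite quadrature along the rows of the
    orthonormal matrix `xi`, then assembled back into the original
    coordinates (sketch of the estimator in Zhang et al., 2020).
    """
    t, w = np.polynomial.hermite.hermgauss(gh_points)  # GH nodes and weights
    d = len(x)
    dirs = np.zeros(d)
    for i in range(d):
        shifts = np.sqrt(2.0) * sigma * np.outer(t, xi[i])  # quadrature points
        vals = np.array([f(x + s) for s in shifts])
        dirs[i] = (w * np.sqrt(2.0) * t) @ vals / (sigma * np.sqrt(np.pi))
    return xi.T @ dirs

def adadgs_step(f, x, sigma, xi, max_step=1.0, n_trials=20):
    """One AdaDGS-style iteration: line search along the DGS direction,
    then reuse the accepted step size to rescale the smoothing radius."""
    g = dgs_gradient(f, x, sigma, xi)
    direction = -g / (np.linalg.norm(g) + 1e-12)
    # Illustrative line search: candidate step sizes on a log-spaced grid.
    steps = np.logspace(-3, 0, n_trials) * max_step
    candidates = [x + s * direction for s in steps]
    losses = [f(c) for c in candidates]
    best = int(np.argmin(losses))
    # Assumption: the accepted step size sets the next smoothing radius,
    # since it reflects the scale of the landscape around the iterate.
    new_sigma = max(steps[best], 1e-3)
    return candidates[best], new_sigma

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 10
    f = lambda z: np.sum(z**2) + 2.0 * np.sum(np.sin(3.0 * z))  # multi-modal toy loss
    x, sigma = rng.uniform(-2.0, 2.0, d), 1.0
    xi = np.linalg.qr(rng.standard_normal((d, d)))[0]  # random orthonormal directions
    for _ in range(100):
        x, sigma = adadgs_step(f, x, sigma, xi)
    print(f(x))
```

Because the quadrature radius sigma spans a wide neighborhood rather than an infinitesimal one, the estimated direction averages over many local basins; the line search then supplies the step size that a hand-tuned learning rate would otherwise have to provide, which is the hyper-parameter-free behavior the abstract describes.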