Paper Title


Stochastic optimization with decision-dependent distributions

Authors

Dmitriy Drusvyatskiy, Lin Xiao

Abstract


Stochastic optimization problems often involve data distributions that change in reaction to the decision variables. This is the case for example when members of the population respond to a deployed classifier by manipulating their features so as to improve the likelihood of being positively labeled. Recent works on performative prediction have identified an intriguing solution concept for such problems: find the decision that is optimal with respect to the static distribution that the decision induces. Continuing this line of work, we show that typical stochastic algorithms -- originally designed for static problems -- can be applied directly for finding such equilibria with little loss in efficiency. The reason is simple to explain: the main consequence of the distributional shift is that it corrupts algorithms with a bias that decays linearly with the distance to the solution. Using this perspective, we obtain sharp convergence guarantees for popular algorithms, such as stochastic gradient, clipped gradient, proximal point, and dual averaging methods, along with their accelerated and proximal variants. In realistic applications, deployment of a decision rule is often much more expensive than sampling. We show how to modify the aforementioned algorithms so as to maintain their sample efficiency while performing only logarithmically many deployments.
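To make the solution concept concrete, here is a minimal sketch of running plain stochastic gradient descent on a toy decision-dependent problem. The location-shift model D(x) = N(mu + eps * x, sigma^2), the squared loss, and all function and parameter names are our own illustration, not taken from the paper; the sketch only shows the phenomenon the abstract describes, namely that an off-the-shelf stochastic method, sampling from the distribution induced by its current iterate, still converges to the stable point.

```python
import random

def performative_sgd(mu=1.0, eps=0.5, sigma=0.1, step=0.01, iters=5000, seed=0):
    """Plain SGD on a toy performative problem (illustrative only).

    Per-sample loss: l(x, z) = (x - z)^2 / 2, with z drawn from the
    decision-dependent distribution D(x) = N(mu + eps * x, sigma^2).
    The performatively stable point is the x that is optimal for the
    static distribution D(x) it induces: x = mu + eps * x, i.e.
    x_star = mu / (1 - eps), assuming eps < 1.
    """
    rng = random.Random(seed)
    x = 0.0
    for _ in range(iters):
        # Sample from the distribution induced by the current decision x.
        z = rng.gauss(mu + eps * x, sigma)
        # Gradient of l(., z) at x, treating the distribution as static.
        grad = x - z
        x -= step * grad
    return x
```

In this toy model the distributional shift contributes the extra term eps * x to the expected gradient, a bias that vanishes linearly as x approaches the stable point, which mirrors the abstract's explanation of why static-problem algorithms keep working with little loss in efficiency.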
