论文标题

钻石:驯服样本和交流复杂性,分散双层优化

DIAMOND: Taming Sample and Communication Complexities in Decentralized Bilevel Optimization

论文作者

Qiu, Peiwen, Li, Yining, Liu, Zhuqing, Khanduri, Prashant, Liu, Jia, Shroff, Ness B., Bentley, Elizabeth Serena, Turck, Kurt

论文摘要

由于其在许多新兴的多项式学习范例(例如,多代理元学习和多代理增强学习)上,分散的二元优化最近受到了越来越多的关注。但是,要处理边缘网络的计算和通信能力有限,开发分散的双层优化技术的主要挑战是降低样本和通信复杂性。这激发了我们开发一种称为Diamond的新的分散的二元优化(通过动量和梯度跟踪的分散的单时间尺度随机近似)。本文的贡献如下:i)我们的钻石算法采用单循环结构,而不是遵循自然的双环优化结构,该结构具有低计算和实现的复杂性; ii)与现有方法相比,钻石算法不需要任何完整的梯度评估,这进一步降低了样本和计算复杂性; iii)通过仔细整合动量信息和梯度跟踪技术,我们表明钻石算法享受$ \ Mathcal {o}(O}(ε^{ - 3/2})$在样本和通信复杂性中,可实现$ε$ $ $ - 规则的解决方案,其中包括与数据集量相关的零件和明显的现有工作。广泛的实验还验证了我们的理论发现。

Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks. However, to work with the limited computation and communication capabilities of edge networks, a major challenge in developing decentralized bilevel optimization techniques is to lower sample and communication complexities. This motivates us to develop a new decentralized bilevel optimization called DIAMOND (decentralized single-timescale stochastic approximation with momentum and gradient-tracking). The contributions of this paper are as follows: i) our DIAMOND algorithm adopts a single-loop structure rather than following the natural double-loop structure of bilevel optimization, which offers low computation and implementation complexity; ii) compared to existing approaches, the DIAMOND algorithm does not require any full gradient evaluations, which further reduces both sample and computational complexities; iii) through a careful integration of momentum information and gradient tracking techniques, we show that the DIAMOND algorithm enjoys $\mathcal{O}(ε^{-3/2})$ in sample and communication complexities for achieving an $ε$-stationary solution, both of which are independent of the dataset sizes and significantly outperform existing works. Extensive experiments also verify our theoretical findings.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源