Cong WangShengyuan XuDeming Yuan
This article studies the distributed online stochastic convex optimization problem with the time-varying constraint over a multiagent system constructed by various agents. The sequences of cost functions and constraint functions, both of which have dynamic parameters following time-varying distributions, are unacquainted to the agent ahead of time. Agents in the network are able to interact with their neighbors through a sequence of strongly connected and time-varying graphs. We develop the adaptive distributed bandit primal-dual algorithm whose step size and regularization sequences are adaptive and have no prior knowledge about the total iteration span T . The adaptive distributed bandit primal-dual algorithm applies bandit feedback with a one-point or two-point gradient estimator to evaluate gradient values. It is illustrated in this article that if the drift of the benchmark sequence is sublinear, then the adaptive distributed bandit primal-dual algorithm exhibits sublinear expected dynamic regret and constraint violation using both two kinds of gradient estimator to compute gradient information. We present a numerical experiment to show the performance of the proposed method.
Keishin TadaNaoki HayashiShigemasa Takai
Alekh AgarwalDean P. FosterDaniel HsuSham M. KakadeAlexander Rakhlin
Menghui XiongBaoyong ZhangDeming YuanYijun ZhangJun Chen
Ju YangMengli WeiYan WangZhongyuan Zhao