Lifu Ding, Zhiyun Lin, Xiasheng Shi, Gangfeng Yan
With the continuing expansion of the power grid, economic dispatch problems have received considerable attention. A multi-agent coordinated deep reinforcement learning algorithm is proposed to deal with distributed nonconvex economic dispatch problems. In the algorithm, agents run independent reinforcement learning algorithms and update their local Q-functions with a newly defined joint reward. A double network structure is adopted to approximate the Q-function, so that the offline-trained model can be used online to provide recommended power outputs for time-varying demands in real time. By introducing a reward network, a competition mechanism between the reward network and the target network is established to determine a progressively stable target value, which achieves coordination among agents and ensures that the losses of the Q-networks converge. Theoretical analysis is given and case studies are conducted to demonstrate the advantages over existing approaches.
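The abstract's core ideas, independent per-agent Q-learning driven by a shared joint reward, with a separate slowly-synchronized target structure used for bootstrapping, can be illustrated with a minimal tabular sketch. This is a simplified analogue, not the paper's method: the reward function, state/action discretization, and sync schedule below are all hypothetical, the deep Q-networks are replaced by Q-tables, and the paper's reward-network competition mechanism is omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, N_ACTIONS, N_STATES = 2, 4, 5   # output levels 0..3, demand levels 0..4
ALPHA, GAMMA, EPS, SYNC = 0.1, 0.9, 0.2, 50

# Per-agent online Q-tables and slowly-updated target tables
# (a tabular stand-in for the double network structure).
Q = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(N_AGENTS)]
Q_tgt = [q.copy() for q in Q]

def joint_reward(demand, actions):
    # Hypothetical shared reward: penalize the supply-demand mismatch
    # plus a quadratic generation cost. Every agent receives this same value.
    supply = sum(actions)
    return -abs(supply - demand) - 0.05 * sum(a * a for a in actions)

for step in range(5000):
    s = int(rng.integers(N_STATES))               # current demand level
    acts = [int(np.argmax(Q[i][s])) if rng.random() > EPS
            else int(rng.integers(N_ACTIONS)) for i in range(N_AGENTS)]
    r = joint_reward(s, acts)
    s2 = int(rng.integers(N_STATES))              # next (random) demand level
    for i in range(N_AGENTS):
        # Bootstrap from the target table, not the online table.
        td_target = r + GAMMA * Q_tgt[i][s2].max()
        Q[i][s, acts[i]] += ALPHA * (td_target - Q[i][s, acts[i]])
    if step % SYNC == 0:
        Q_tgt = [q.copy() for q in Q]

# Greedy joint dispatch for the highest demand level (4 units).
greedy = [int(np.argmax(Q[i][N_STATES - 1])) for i in range(N_AGENTS)]
print(greedy, sum(greedy))
```

Because both agents maximize the same joint reward, their greedy actions tend toward a dispatch whose total supply tracks the demand, which is the coordination effect the abstract attributes to the shared reward signal; the target tables give each agent a stable bootstrap value while the other agent's policy is still changing.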