In multi-agent systems, agents that learn their actions independently cooperate poorly, and conventional reinforcement learning incurs a large computational cost because every agent must learn separately. A novel multi-agent cooperative learning model (MCLM) and a multi-agent cooperative learning algorithm (MCLA) are presented to address these problems. In MCLM, the agents learn together, observing one another's actions to decide their individual action strategies. Based on this model, MCLA is introduced to improve the applicability and efficacy of reinforcement learning algorithms. MCLA uses an evaluation method based on long-term reward, in which the reward estimate gradually converges to a stable value through continual interaction with the environment and the payoffs received from it. A series of simulations demonstrates the practical value and performance of the proposed algorithm on the hunter-prey problem.
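The abstract does not state the update rule behind the long-term reward evaluation. As a rough illustration only, the sketch below assumes a standard temporal-difference style update on a single state with a constant payoff, and shows how such an estimate settles at a stable value under repeated interaction. The function names, the learning rate `alpha`, and the discount factor `gamma` are hypothetical and are not taken from MCLA.

```python
def update_long_term_reward(estimate, reward, next_estimate, alpha=0.1, gamma=0.9):
    """One incremental update of a long-term (discounted) reward estimate.

    Assumption: a generic temporal-difference rule, not the MCLA rule itself.
    """
    td_error = reward + gamma * next_estimate - estimate
    return estimate + alpha * td_error


def estimate_after(steps, reward=1.0, alpha=0.1, gamma=0.9):
    """Repeat the update on a single self-looping state with a constant payoff."""
    estimate = 0.0
    for _ in range(steps):
        # The next state is the same state, so its estimate is reused.
        estimate = update_long_term_reward(estimate, reward, estimate, alpha, gamma)
    return estimate


if __name__ == "__main__":
    # With a constant payoff r, the fixed point of the update is r / (1 - gamma).
    for steps in (10, 100, 1000, 2000):
        print(steps, estimate_after(steps))
```

In this toy setting the estimate approaches `reward / (1 - gamma)`, which is the sense in which a long-term reward can converge to a stable value. A cooperative multi-agent version would additionally condition the estimate on the other agents' observed actions.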
Yinjiang Sun, Rui Zhang, Wenbao Liang, Xu Cheng
Zhiwei Xu, Bin Zhang, Dapeng Li, Zeren Zhang, Guangchong Zhou, Hao Chen, Guoliang Fan
Zhiwei Ge, Yuanyang Zhu, Chunlin Chen