We propose the Sparse Bayesian Network-based Disturbance Observer (SBN-DOB) to enhance the robustness of policy-based reinforcement learning. SBN-DOB uses sparse Bayesian learning to estimate the nominal inverse dynamics model, mitigating model uncertainty and disturbances without relying on physical modeling. Because sparse Bayesian learning induces sparsity in the network parameters, SBN-DOB can be compressed, and its Bayesian formulation reduces the risk of overfitting during inference. To evaluate the proposed approach, we combined the policy network (PN) of the soft actor-critic algorithm with SBN-DOB on six control tasks in uncertain environments. The results demonstrate that the performance of PN is preserved under continuous disturbances and state noise, even when compression is applied to SBN-DOB. Consequently, SBN-DOB is expected to narrow the simulation-to-reality gap of reinforcement learning when deployed on embedded systems with limited computational performance and memory capacity.
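The disturbance-observer structure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: a fixed linear inverse-dynamics model with a sparsity mask stands in for the learned sparse Bayesian network, and all names (`SparseBayesianInverseModel`, `dob_corrected_action`) are illustrative. The key idea is standard for a DOB: the disturbance estimate is the gap between the action actually commanded and the action the nominal inverse model says was needed to produce the observed transition, and that estimate is subtracted from the policy's next action.

```python
import numpy as np


class SparseBayesianInverseModel:
    """Hypothetical stand-in for the SBN-DOB's inverse model.

    Here a linear map from (x_t, x_{t+1}) to the nominal action,
    with an ARD-style binary mask emulating the compression that
    sparse Bayesian learning induces by pruning weights.
    """

    def __init__(self, W, mask):
        # Pruned (compressed) weight matrix: zeroed entries need
        # not be stored or multiplied on an embedded target.
        self.W = W * mask

    def predict_action(self, x_t, x_next):
        # Estimate the nominal action that drives x_t to x_next.
        z = np.concatenate([x_t, x_next])
        return self.W @ z


def dob_corrected_action(policy_action, applied_action, x_prev, x_t, inv_model):
    """Classic disturbance-observer correction.

    d_hat < 0 means the plant moved further than the commanded
    action explains (a positive disturbance acted), so subtracting
    d_hat adds a compensating term to the policy's action.
    """
    d_hat = applied_action - inv_model.predict_action(x_prev, x_t)
    return policy_action - d_hat


# Toy check: dynamics x_next = x + (a + d) * b with b = [1, 0],
# so the exact inverse is a = (x_next - x)[0], i.e. W = [[-1,0,1,0]].
if __name__ == "__main__":
    W = np.array([[-1.0, 0.0, 1.0, 0.0]])
    model = SparseBayesianInverseModel(W, np.ones_like(W))

    x_prev = np.zeros(2)
    applied = np.array([2.0])          # action the policy commanded
    disturbance = 0.5                  # unknown input disturbance
    x_t = x_prev + (applied[0] + disturbance) * np.array([1.0, 0.0])

    corrected = dob_corrected_action(np.array([1.0]), applied, x_prev, x_t, model)
    print(corrected)                   # policy action shifted by -d_hat
```

In this toy setting the observer recovers `d_hat = -0.5` and shifts the next policy action from 1.0 to 1.5, cancelling the disturbance; in the paper this role is played by the learned sparse Bayesian network rather than a hand-set linear map.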