Zixiao Luo, Dongmei Du, Dandan Liu, Qiangqiang Yang, Yi Chai, Shiyu Hu, Jiayou Wu
To address trajectory tracking of underactuated unmanned surface vessels (USVs) under disturbances and model uncertainty, we propose a hierarchical control framework that combines model predictive control (MPC) with proximal policy optimization (PPO). The outer loop runs in the inertial reference frame, where an MPC planner based on a kinematic model enforces velocity and safety constraints and generates feasible body-fixed velocity references. The inner loop runs in the body-fixed reference frame, where a PPO policy learns the nonlinear inverse mapping from velocity to multi-thruster thrust, compensating for hydrodynamic modeling errors and external disturbances. On top of this framework, we design a Proactive-Reactive Adaptive Reward (PRAR) that uses the MPC prediction sequence and real-time pose errors to adaptively reweight the reward across surge, sway, and yaw, improving robustness and cross-model generalization. Simulation studies on circular and curvilinear trajectories compare the proposed PRAR-driven dual-loop controller (PRAR-DLC) with MPC-PID, PPO-Only, MPC-PPO, and PPO variants. On the curvilinear trajectory, PRAR-DLC reduces the surge mean absolute error (MAE) and maximum tracking error from 0.269 m and 0.963 m (MPC-PID) to 0.138 m and 0.337 m, respectively; on the circular trajectory, it achieves about an 8.5% reduction in surge MAE while keeping sway and yaw accuracy comparable to that of the baseline controllers. Real-time profiling further shows that the average MPC and PPO evaluation times remain below the control sampling period, indicating that the proposed architecture is compatible with real-time onboard implementation and physical deployment.
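The split between an inertial-frame outer loop and a body-fixed inner loop depends on converting velocities between the two frames. The sketch below illustrates that conversion using the standard 3-DOF surface-vessel kinematics, in which the inertial velocity equals a yaw rotation applied to the body-fixed velocity (surge u, sway v, yaw rate r). It is an assumption-based illustration only, not the authors' MPC planner or PPO policy; the function names and the convention that heading psi is measured from the inertial x-axis are hypothetical choices made for this example.

```python
import math


def body_to_inertial(nu, psi):
    """Rotate a body-fixed velocity (u, v, r) into the inertial frame.

    Assumes the standard planar kinematics eta_dot = R(psi) * nu, with psi
    the heading angle in radians measured from the inertial x-axis.
    """
    u, v, r = nu
    c, s = math.cos(psi), math.sin(psi)
    x_dot = c * u - s * v
    y_dot = s * u + c * v
    return (x_dot, y_dot, r)  # yaw rate is the same in both frames


def inertial_to_body(eta_dot, psi):
    """Map an inertial velocity (x_dot, y_dot, psi_dot) to body-fixed (u, v, r).

    Uses nu = R(psi)^T * eta_dot; the rotation matrix is orthogonal, so its
    inverse is its transpose.
    """
    x_dot, y_dot, psi_dot = eta_dot
    c, s = math.cos(psi), math.sin(psi)
    u = c * x_dot + s * y_dot    # surge: component along the vessel's bow
    v = -s * x_dot + c * y_dot   # sway: component toward the vessel's side
    return (u, v, psi_dot)


# Hypothetical example: vessel heading along the inertial y-axis, with a
# commanded inertial velocity of 1 m/s along that same axis.
psi_example = math.pi / 2
nu_ref = inertial_to_body((0.0, 1.0, 0.1), psi_example)
```

In this example the commanded motion is aligned with the bow, so the resulting body-fixed reference is almost pure surge with near-zero sway. An inner-loop controller would then track a reference of this kind in surge, sway, and yaw.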
Jianghui Sang, Yongli Wang, Weiping Ding, Zaki Ahmadkhan, Lin Xu
Lei Xu, Lijun Wang, Yuanfang ZHAO, Jinfeng TAN, Antao CHEN
Gianluca Garofalo, Christian Ott
Yanchuan Xu, Huarong Zheng, Weimin Wu, Jun Wu