Many quality assessment parameters for multimedia traffic in wireless networks depend on estimating Quality of Experience (QoE) from Quality of Service (QoS). Mean Opinion Score (MOS) is a widely used network quality metric for integrated (data and video) traffic management and resource allocation. This work studies an uplink underlay Dynamic Spectrum Access (DSA) optimization problem that uses a Deep Reinforcement Learning (DRL) algorithm to enhance QoE while keeping interference within a tolerable limit. A Resource Allocation Deep Deterministic Policy Gradient (RADDPG) algorithm is proposed for joint quality improvement and distortion maintenance. In this work, the deterministic policy gradient method combines the Deep Q-Network (DQN) with the policy-gradient actor-critic framework to select suitable actions, improving learning speed, stability, and computation time and thereby yielding more precise estimates. Simulation results show that the proposed RADDPG method outperforms the existing Q-learning and DQN algorithms.
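As an illustration of the ingredients named above, the following minimal sketch shows two standard steps of a deep deterministic policy gradient learner (the critic's bootstrapped target and the soft target-network update), together with a reward that returns the mean MOS only while interference stays within the tolerable limit. The reward shape, the function names, and all parameter values are assumptions made for illustration; they are not taken from the RADDPG design itself.

```python
from typing import List


def td_target(reward: float, next_q: float, gamma: float, done: bool) -> float:
    """Critic regression target r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the target actor's action in
    the next state; bootstrapping is skipped when the episode has ended.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Polyak averaging: move target weights a fraction tau toward the online weights."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def qoe_reward(mos_scores: List[float], interference: float,
               limit: float, penalty: float = -1.0) -> float:
    """Hypothetical reward: mean MOS over users if the interference stays
    within the tolerable limit, otherwise a fixed penalty (assumed shape)."""
    if interference > limit:
        return penalty
    return sum(mos_scores) / len(mos_scores)


if __name__ == "__main__":
    r = qoe_reward([4.0, 3.0], interference=0.5, limit=1.0)
    y = td_target(r, next_q=2.0, gamma=0.5, done=False)
    weights = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
    print(r, y, weights)
```

In a full learner the weight lists would be neural network parameters and the actor would be updated along the critic's action gradient; both are omitted here to keep the sketch short.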
Siyu Yuan, Yong Zhang, Wenbo Qie, Tengteng Ma, Sisi Li
K.F. Muteba, Karim Djouani, Thomas O. Olwal