Pham Hoai An, Nguyen Dung, Nguyen Thi Xuan Uyen, Nguyen Thai Cong Nghia, Ngo Minh Nghia
Massive MIMO systems with preconfigured spatial beams efficiently serve near-field (NF) users, while far-field (FF) users can be multiplexed on the same beams using non-orthogonal multiple access (NOMA). To capture propagation realistically, the spherical wave model (SWM) is employed for NF channels and the plane wave model (PWM) for FF channels, reflecting the distinct near- and far-field regions. Conventional optimization approaches such as successive convex approximation (SCA) and branch-and-bound (BB) suffer from local optimality or prohibitive complexity, whereas recent advances in deep learning have enabled scalable and adaptive solutions for wireless resource allocation. On this basis, a deep reinforcement learning (DRL) resource allocation strategy is developed using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, in which the base station acts as an agent that dynamically adjusts power and allocation coefficients to maximize the sum throughput of FF users. Simulation results show that the proposed DRL-based method can approach, and in some cases match, deterministic SCA at high signal-to-noise ratio (SNR), while consistently outperforming randomly initialized SCA in medium-to-high SNR regimes. Compared with optimization-based baselines, the TD3 approach eliminates iterative problem reformulation, reduces computational complexity, and adapts better to dynamic channels and user mobility.
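The distinction between the SWM and the PWM can be illustrated with a short numerical sketch. It is not the authors' system model: the uniform linear array, the half-wavelength spacing, the 64 elements, the 1 cm wavelength, and all function names below are assumptions chosen for illustration. The sketch computes the exact (spherical) per-element phase and its first-order (plane-wave) approximation, then measures how well the two array responses agree at a given user distance.

```python
import cmath
import math


def element_positions(n_elements, spacing):
    """y-coordinates of a uniform linear array centred at the origin (assumed geometry)."""
    centre = (n_elements - 1) / 2
    return [(n - centre) * spacing for n in range(n_elements)]


def swm_response(positions, r, theta, wavelength):
    """Spherical-wave response: phase follows the exact element-to-user distance."""
    k = 2 * math.pi / wavelength
    response = []
    for y in positions:
        # User at (r cos(theta), r sin(theta)), element at (0, y).
        r_n = math.sqrt(r * r + y * y - 2 * r * y * math.sin(theta))
        response.append(cmath.exp(-1j * k * (r_n - r)))
    return response


def pwm_response(positions, theta, wavelength):
    """Plane-wave response: first-order approximation r_n - r ~ -y sin(theta)."""
    k = 2 * math.pi / wavelength
    return [cmath.exp(1j * k * y * math.sin(theta)) for y in positions]


def correlation(a, b):
    """Normalised magnitude of the inner product of two unit-modulus responses."""
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    return abs(inner) / len(a)


def rayleigh_distance(aperture, wavelength):
    """Classical near-/far-field boundary 2 D^2 / lambda."""
    return 2 * aperture ** 2 / wavelength


if __name__ == "__main__":
    wavelength = 0.01                      # assumed: 1 cm (about 30 GHz)
    pos = element_positions(64, wavelength / 2)
    d_r = rayleigh_distance(pos[-1] - pos[0], wavelength)
    pwm = pwm_response(pos, 0.0, wavelength)
    for r in (1.0, d_r, 100 * d_r):
        swm = swm_response(pos, r, 0.0, wavelength)
        print(f"r = {r:9.2f} m  SWM/PWM correlation = {correlation(swm, pwm):.4f}")
```

Under these assumptions the correlation is close to 1 well beyond the Rayleigh distance and drops markedly for a user a metre from the array, which is why a plane-wave channel is a reasonable model for FF users but not for NF users.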
Liang Chen, Fanglei Sun, Kai Li, Ruiqing Chen, Yang Yang, Jun Wang
Dandan Yan, Benjamin K. Ng, Wei Ke, Chan‐Tong Lam
Ali Y. Yildirim, Hasan Anıl Akyıldız, İbrahim Hökelek, Hakan Ali Çırpan
Xiaona Zhang, Haifeng Yan, Yunbin He, Wei Yan, Zhao Wang, Yajing Deng