This paper proposes a spectrum-agile cognitive radio (CR) that can withstand, and even outperform, a sophisticated jammer that adapts to the radio's channel selections. The spectrum coexistence of the CR and the jammer is modeled as a non-cooperative stochastic game in which both players use game-theoretic learning to develop effective policies in response to each other's actions. The CR uses Win-or-Learn-Fast Policy Hill-Climbing (WoLF-PHC) reinforcement learning to improve its future channel-selection decisions. A cognitive jammer (CJ) is also developed that time-interleaves jamming and sensing and uses no-regret learning (NRL) to home in on channels carrying signals. Both protocols were implemented on USRP software-defined radios. Over-the-air (OTA) tests showed that the developed CR performed 68% better against the CJ than a legacy radio did; likewise, the CJ performed 39% better against the CR than a non-learning jammer did.
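To make the CR's learning rule concrete, the following is a minimal, stateless (bandit-style) sketch of WoLF-PHC channel selection. All class names, parameter values, and the reward shaping are illustrative assumptions, not the paper's implementation; the paper's full stochastic-game formulation would condition on state.

```python
import numpy as np

class WoLFPHC:
    """Illustrative WoLF-PHC learner for channel selection (hypothetical sketch).

    Keeps Q-values per channel, a current mixed policy pi, and an average
    policy pi_avg; hill-climbs pi toward the greedy channel with a small
    step when "winning" and a larger step when "losing" (the WoLF idea).
    """

    def __init__(self, n_channels, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.n = n_channels
        self.alpha, self.gamma = alpha, gamma            # Q-learning rates (assumed values)
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.Q = np.zeros(n_channels)                    # per-channel action values
        self.pi = np.full(n_channels, 1.0 / n_channels)  # current policy
        self.pi_avg = self.pi.copy()                     # running average policy
        self.count = 0

    def select(self, rng):
        # Sample a channel from the current mixed policy.
        return int(rng.choice(self.n, p=self.pi))

    def update(self, action, reward):
        # Stateless Q-learning update.
        self.Q[action] += self.alpha * (reward + self.gamma * self.Q.max()
                                        - self.Q[action])
        # Incremental update of the average policy.
        self.count += 1
        self.pi_avg += (self.pi - self.pi_avg) / self.count
        # WoLF criterion: winning if the current policy's expected value
        # beats the average policy's expected value.
        winning = self.pi @ self.Q >= self.pi_avg @ self.Q
        delta = self.delta_win if winning else self.delta_lose
        # Policy hill-climbing toward the greedy channel.
        best = int(self.Q.argmax())
        for a in range(self.n):
            if a == best:
                self.pi[a] = min(1.0, self.pi[a] + delta)
            else:
                self.pi[a] = max(0.0, self.pi[a] - delta / (self.n - 1))
        self.pi /= self.pi.sum()                         # renormalize to a distribution
```

Against a jammer parked on one channel (reward +1 for a clean channel, -1 for a jammed one), the policy mass should drift away from the jammed channel over a few thousand iterations.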
Mohamed A. Aref, Sudharman K. Jayaweera, Esteban Yepez