JOURNAL ARTICLE

Efficient and Robust Reinforcement Learning from Human Feedback

Huazheng Wang

Year: 2025
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Vol: 39 (27)
Pages: 28730-28730
Publisher: Association for the Advancement of Artificial Intelligence

Abstract

Reinforcement Learning (RL) has emerged as a powerful paradigm for sequential decision-making with numerous real-world applications. However, in practical environments such as recommender systems, search engines, and LLMs, RL algorithms must learn efficiently from human feedback that is biased and may be corrupted. In this talk, I will present our recent efforts to develop robust RL algorithms that provably handle such challenging scenarios. First, I will introduce our work on reinforcement learning from biased click feedback in ranking. Previous approaches typically relied on strong assumptions about human click behavior (formalized as click models) and required a specialized debiasing method for each model. We instead propose a unified framework that formulates the ranking process under general click models as a Markov Decision Process, enabling the development of a click-model-agnostic RL algorithm. Second, I will discuss the fundamental vulnerability of bandits and reinforcement learning under corrupted feedback. Our theoretical analysis gives a complete characterization, with necessary and sufficient conditions, of the attackability of linear bandits and linear RL, revealing their intrinsic robustness and limitations. Lastly, I will discuss our recent work on improving RL fine-tuning for LLMs, including sample-efficient off-policy RLHF and solving the gradient entanglement issue in margin-based alignment methods.
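
As a rough illustration of what "biased click feedback" and a "click model" can mean, the sketch below uses the position-based click model, one common click model from the ranking literature. It is not taken from the talk: the function names, the relevance values, and the examination probabilities are all hypothetical, chosen only to show how rank position can distort raw click rates and how a model-specific correction (inverse propensity weighting) undoes the distortion when the click model is known.

```python
# Hypothetical sketch of a position-based click model (PBM).
# Assumption: P(click at rank k) = P(user examines rank k) * P(doc at rank k is relevant).
# All numbers are made up for illustration; none come from the talk.

def expected_click_rates(relevance, examination):
    """Expected click rate per rank under the position-based model."""
    return [e * r for e, r in zip(examination, relevance)]

def ipw_relevance(click_rates, examination):
    """Inverse-propensity estimate: divide each click rate by its examination probability."""
    return [c / e for c, e in zip(click_rates, examination)]

relevance = [0.5, 0.9, 0.5]      # assumed true relevance of the docs shown at ranks 1..3
examination = [1.0, 0.5, 0.25]   # assumed probability that a user looks at each rank

clicks = expected_click_rates(relevance, examination)
debiased = ipw_relevance(clicks, examination)

# The rank-2 document is the most relevant, yet it gets fewer expected clicks
# than the rank-1 document because it is examined less often.
print("raw click rates:", clicks)
print("debiased estimates:", debiased)
```

The correction here works only because the examination probabilities are assumed known and the click model is assumed correct, which is the kind of model-specific assumption the abstract says earlier approaches depended on.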

Keywords:
Reinforcement learning, Reinforcement, Computer science, Psychology, Artificial intelligence, Social psychology

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 7
Citation Normalized Percentile: 0.12

Topics

Neural Networks and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence