Breaking the Sample Size Barrier in Reinforcement Learning
Mathematics of Data & Decisions
Speaker: Yuxin Chen, University of Pennsylvania
Location: Zoom
Start time: Tue, Apr 18 2023, 12:10 PM
Emerging reinforcement learning (RL) applications necessitate the design of sample-efficient solutions in order to accommodate the explosive growth of problem dimensionality. Despite the empirical successes, however, our understanding of the statistical limits of RL remains highly incomplete. In this talk, I will present some recent progress towards settling the sample complexity in three RL scenarios. The first is concerned with RL in the presence of a simulator, where we demonstrate the minimax optimality of the model-based RL approach (a.k.a. the plug-in approach), without suffering from the sample size barrier present in all past works. The second part studies offline RL, which learns from pre-collected data and needs to accommodate distribution shift and limited data coverage. We prove that model-based offline RL achieves minimax-optimal sample complexity without any burn-in cost. The insights from offline RL further motivate optimal algorithm design in online RL with reward-agnostic exploration, a scenario in which the learner is unaware of the reward functions during the exploration stage. (See https://arxiv.org/abs/2005.12900, https://arxiv.org/abs/2204.05275, and https://yuxinchen2020.github.io/publications/Reward-free-exploration.pdf for more details.) This talk is based on joint work with Gen Li, Laixi Shi, Yuling Yan, Yuejie Chi, Jianqing Fan, and Yuting Wei.
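To make the "plug-in" (model-based) approach mentioned in the abstract concrete, here is a minimal illustrative sketch: draw a fixed number of samples per state-action pair from a generative model (simulator), form the empirical MDP, and then plan in that empirical model. This is a generic textbook-style sketch, not the speaker's code; the `simulator` interface, sample counts, and helper names below are hypothetical.

```python
# Minimal sketch of the plug-in (model-based) approach with a generative model.
# Assumes simulator(s, a) returns a (next_state, reward) sample from the true MDP.
import numpy as np

def plug_in_policy(simulator, n_states, n_actions, gamma=0.9,
                   samples_per_pair=100, n_iters=500):
    """Estimate an empirical MDP from simulator draws, then plan in it."""
    # Step 1: build the empirical transition kernel and mean rewards.
    P_hat = np.zeros((n_states, n_actions, n_states))
    r_hat = np.zeros((n_states, n_actions))
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(samples_per_pair):
                s_next, reward = simulator(s, a)
                P_hat[s, a, s_next] += 1.0
                r_hat[s, a] += reward
            P_hat[s, a] /= samples_per_pair
            r_hat[s, a] /= samples_per_pair

    # Step 2: value iteration in the estimated MDP (the "plug-in" step).
    V = np.zeros(n_states)
    for _ in range(n_iters):
        Q = r_hat + gamma * (P_hat @ V)   # shape (n_states, n_actions)
        V = Q.max(axis=1)

    # Return the greedy policy of the empirical MDP.
    return Q.argmax(axis=1)
```

The question studied in the talk is how many samples per state-action pair such a plug-in procedure needs before its policy is near-optimal in the true MDP; the abstract states that the model-based approach is minimax optimal without the sample size barrier of earlier analyses.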