[paper-review] OFFLINE REINFORCEMENT LEARNING WITH IMPLICIT Q-LEARNING

ICLR 2022. [Paper] [Github¹ Github²]

Ilya Kostrikov, Ashvin Nair & Sergey Levine. Department of Electrical Engineering and Computer Science, University of California, Berkeley

12 Oct 2021

One-sentence summary

Summary: Treat the state value function as a random variable and approximate policy improvement implicitly. Concretely, estimate expectiles of the state value function.

IQL is an offline RL method that never needs to evaluate actions outside of the dataset, yet still enables the learned policy to improve substantially over the best behavior in the data through generalization.
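To make the notion of an expectile concrete, here is a minimal pure-Python sketch. It is an illustration only, not the authors' implementation: the helper names `expectile_loss` and `expectile`, the iterative reweighted-mean solver, and the sample `q_values` are all assumptions introduced for this example. The asymmetric squared loss |τ − 1(u < 0)|·u² penalizes positive residuals more heavily when τ > 0.5, so the minimizer moves from the mean (τ = 0.5) toward the maximum of the samples as τ approaches 1.

```python
def expectile_loss(diff, tau):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2."""
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def expectile(samples, tau, iters=100):
    """Estimate the tau-expectile of `samples` (0 < tau < 1).

    The minimizer of the summed expectile loss is a weighted mean whose
    weights depend on which side of the estimate each sample falls, so we
    repeat the weighted average until it stops changing.
    """
    m = sum(samples) / len(samples)  # tau = 0.5 recovers the plain mean
    for _ in range(iters):
        weights = [tau if x > m else 1.0 - tau for x in samples]
        m = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return m


# Hypothetical Q(s, a) values for the dataset actions at a single state.
q_values = [1.0, 2.0, 3.0, 10.0]

for tau in (0.5, 0.7, 0.9, 0.99):
    # Larger tau pushes the estimate toward max(q_values) without ever
    # evaluating an action that is not in the sample.
    print(tau, round(expectile(q_values, tau), 3))
```

The point of the sketch is that an upper expectile acts as a soft maximum over only the actions that appear in the data, which is how the value estimate can approach the best in-dataset action without evaluating unseen ones.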

Keywords: Reinforcement Learning, Offline RL, Quantile Regression

Introduction
