Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism
