Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation
