 Reinforcement Learning







a18aa23ee676d7f5ffb34cf16df3e08c-Paper.pdf

Neural Information Processing Systems

Sampling is an important research problem in statistical learning with many applications such as Bayesian inference [1], multi-armed bandit optimization [2], and reinforcement learning [3].


Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimistic No-Regret Algorithm

Neural Information Processing Systems

Reinforcement learning utilizing kernel ridge regression to predict the expected value function represents a powerful method with great representational capacity. This setting is a highly versatile framework amenable to analytical results. We consider kernel-based function approximation for RL in the infinite horizon average reward setting, also referred to as the undiscounted setting. We propose an optimistic algorithm, similar to acquisition function based algorithms in the special case of bandits.
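
As a rough illustration of the idea in the bandit special case, the sketch below fits kernel ridge regression to observed values and forms an optimistic (upper confidence bound) score for query points. It is not the algorithm from the paper: the RBF kernel, the function names `rbf_kernel` and `kernel_ridge_ucb`, and the parameters `lam`, `beta`, and `lengthscale` are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, lengthscale=1.0):
    """Squared-exponential kernel matrix between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))


def kernel_ridge_ucb(X_train, y_train, X_query, lam=1e-2, beta=2.0, lengthscale=1.0):
    """Kernel ridge prediction plus an optimistic bonus (illustrative only).

    lam is the ridge regularizer and beta scales the uncertainty bonus;
    both are hypothetical tuning parameters, not values from the paper.
    """
    K = rbf_kernel(X_train, X_train, lengthscale)
    k_q = rbf_kernel(X_query, X_train, lengthscale)
    A = K + lam * np.eye(len(X_train))

    # Ridge predictor: k(x)^T (K + lam I)^{-1} y
    mean = k_q @ np.linalg.solve(A, y_train)

    # Uncertainty proxy: k(x, x) - k(x)^T (K + lam I)^{-1} k(x), with k(x, x) = 1 for RBF
    v = np.linalg.solve(A, k_q.T)
    var = 1.0 - np.einsum("ij,ji->i", k_q, v)
    std = np.sqrt(np.clip(var, 0.0, None))

    # Optimistic score: prediction plus scaled uncertainty
    return mean, std, mean + beta * std


if __name__ == "__main__":
    X = np.array([[0.0], [1.0], [2.0]])
    y = np.array([0.0, 1.0, 0.5])
    X_query = np.array([[0.5], [5.0]])
    mean, std, ucb = kernel_ridge_ucb(X, y, X_query)
    print("mean:", mean, "std:", std, "ucb:", ucb)
```

Selecting the query point with the largest optimistic score is the acquisition-function pattern familiar from kernelized bandits; points far from the data receive a larger bonus and are therefore explored.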