Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm