srld
- Energy > Oil & Gas > Upstream (0.46)
- Education > Focused Education > Special Education (0.44)
A Discussion on Hyperparameter Tuning
Contextual bandits are a class of online learning problems that can be viewed as a simple reinforcement learning problem without state transitions. For a complete treatment of contextual bandit problems, we refer readers to Chapter 4 of [Bubeck et al., 2012]. Here we include the main idea for completeness. In contextual bandit problems, the agent needs to find the best action given some observed context (a.k.a. the optimal policy in reinforcement learning). Formally, we define S as the context set and K as the number of actions.
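To make the setup concrete, the following is a minimal sketch of a contextual bandit loop over a finite context set S and K actions. The epsilon-greedy rule, the function names, and the toy reward function are illustrative assumptions; the excerpt above does not say which algorithm the authors use.

```python
import random


def run_epsilon_greedy(num_contexts, num_actions, reward_fn,
                       steps=5000, epsilon=0.1, seed=0):
    """Tabular epsilon-greedy learner (an assumed, illustrative algorithm).

    Returns the greedy policy: one action index per context.
    """
    rng = random.Random(seed)
    counts = [[0] * num_actions for _ in range(num_contexts)]
    means = [[0.0] * num_actions for _ in range(num_contexts)]

    for _ in range(steps):
        # The context is drawn independently each round: the chosen action
        # never influences the next context (no transition).
        s = rng.randrange(num_contexts)

        if rng.random() < epsilon:
            a = rng.randrange(num_actions)  # explore
        else:
            a = max(range(num_actions), key=lambda k: means[s][k])  # exploit

        r = reward_fn(s, a, rng)

        # Incremental update of the running mean reward for (context, action).
        counts[s][a] += 1
        means[s][a] += (r - means[s][a]) / counts[s][a]

    return [max(range(num_actions), key=lambda k: means[s][k])
            for s in range(num_contexts)]


def toy_reward(s, a, rng):
    # Hypothetical environment: action (s mod 3) pays off with probability
    # 0.9, every other action with probability 0.1.
    p = 0.9 if a == s % 3 else 0.1
    return 1.0 if rng.random() < p else 0.0


policy = run_epsilon_greedy(num_contexts=4, num_actions=3, reward_fn=toy_reward)
print(policy)
```

The learned `policy` maps each context to an action, which is the object the text calls the optimal policy once the estimates have converged.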
Stein Self-Repulsive Dynamics: Benefits From Past Samples
Ye, Mao, Ren, Tongzheng, Liu, Qiang
We propose a new Stein self-repulsive dynamics for obtaining diversified samples from intractable un-normalized distributions. Our idea is to introduce Stein variational gradient as a repulsive force to push the samples of Langevin dynamics away from the past trajectories. This simple idea allows us to significantly decrease the auto-correlation in Langevin dynamics and hence increase the effective sample size. Importantly, as we establish in our theoretical analysis, the asymptotic stationary distribution remains correct even with the addition of the repulsive force, thanks to the special properties of the Stein variational gradient. We perform extensive empirical studies of our new algorithm, showing that our method yields much higher sample efficiency and better uncertainty estimation than vanilla Langevin dynamics.
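The idea in the abstract can be sketched in one dimension: a Langevin update whose drift is augmented with the standard Stein variational gradient direction, evaluated against a buffer of past samples. This is a hedged illustration rather than the authors' implementation; the standard normal target, the RBF kernel, and the values of `step`, `alpha`, `memory`, and `thin` are all assumptions chosen for the example.

```python
import math
import random


def grad_log_p(x):
    # Assumed target: standard normal, log p(x) = -x^2 / 2 + const.
    return -x


def stein_force(x, past, bandwidth=1.0):
    """Stein variational gradient direction at x with respect to the
    empirical measure of `past`, using an RBF kernel (assumed choice):
        mean over y of [ k(y, x) * grad_log_p(y) + d/dy k(y, x) ].
    The second term pushes x away from each past sample y.
    """
    if not past:
        return 0.0
    total = 0.0
    for y in past:
        k = math.exp(-(x - y) ** 2 / (2.0 * bandwidth ** 2))
        grad_k_y = k * (x - y) / bandwidth ** 2  # derivative of k in y
        total += k * grad_log_p(y) + grad_k_y
    return total / len(past)


def self_repulsive_langevin(n_steps=20000, step=0.05, alpha=1.0,
                            memory=10, thin=20, seed=0):
    """Langevin dynamics plus a repulsive force from thinned past samples.

    All hyperparameter values here are hypothetical.
    """
    rng = random.Random(seed)
    x = 0.0
    past = []     # thinned buffer of earlier iterates
    samples = []
    for t in range(n_steps):
        drift = grad_log_p(x) + alpha * stein_force(x, past)
        x = x + step * drift + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        samples.append(x)
        if (t + 1) % thin == 0:
            past.append(x)
            past = past[-memory:]  # keep only the most recent entries
    return samples


samples = self_repulsive_langevin()
mean = sum(samples) / len(samples)
var = sum((v - mean) ** 2 for v in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")
```

Setting `alpha=0.0` recovers plain (unadjusted) Langevin dynamics, which makes it easy to compare the autocorrelation of the two chains on the same target.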
- North America > United States > California (0.14)
- North America > Canada > Quebec (0.14)
- Europe > Netherlands (0.14)
- Europe > France (0.14)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Data Science > Data Mining > Big Data (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)