Yang, Ger
Stochastic Multi-armed Bandits in Constant Space
Liau, David, Price, Eric, Song, Zhao, Yang, Ger
We consider the stochastic bandit problem in the sublinear space setting, where one cannot record the win-loss record for all $K$ arms. We give an algorithm using $O(1)$ words of space with regret \[ \sum_{i=1}^{K}\frac{1}{\Delta_i}\log \frac{\Delta_i}{\Delta}\log T \] where $\Delta_i$ is the gap between the best arm and arm $i$ and $\Delta$ is the gap between the best and the second-best arms. If the rewards are bounded away from $0$ and $1$, this is within an $O(\log 1/\Delta)$ factor of the optimum regret possible without space constraints.
Approximation Algorithms for Route Planning with Nonlinear Objectives
Yang, Ger (University of Texas at Austin) | Nikolova, Evdokia (University of Texas at Austin)
We consider optimal route planning when the objective function is a general nonlinear and non-monotonic function. Such an objective models user behavior more accurately, for example, when a user is risk-averse, or the utility function needs to capture a penalty for early arrival. It is known that as non-linearity arises, the problem can become NP-hard and little is known on computing optimal solutions when in addition there is no monotonicity guarantee. We show that an approximately optimal non-simple path can be efficiently computed under some natural constraints. In particular, we provide a fully polynomial approximation scheme under hop constraints. Our approximation algorithm can extend to run in pseudo-polynomial time under an additional linear constraint that sometimes is useful. As a by-product, we show that our algorithm can be applied to the problem of finding a path that is most likely to be on time for a given deadline.