Reviews: Adaptive Active Hypothesis Testing under Limited Information

Neural Information Processing Systems 

The paper studies active hypothesis testing with limited information. More specifically, let q(j,w) denote the probability of 'correct indication' when the true hypothesis is j and the chosen action is w; the authors assume that q(j,w) is bounded below by \alpha(w) for all j, and that this lower bound is available to the controller. They first study the incomplete-Bayesian (IB) update rule, an approximate update in which every q is replaced by the corresponding \alpha, and show that under this rule one needs at least on the order of log(1/\delta) samples to drive the posterior error probability down to \delta. Then they show that their gradient-based policy achieves the order-wise optimal sample complexity. I have a few technical questions.
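For concreteness, here is how I read the IB update, as a minimal sketch rather than the authors' exact rule. It assumes a binary outcome model that is my own simplification: for the chosen action w, each hypothesis j predicts a noise-free outcome in {0, 1}, and the observation agrees with that prediction with probability q(j,w). The names `ib_update`, `predicted`, and `alpha_w` are hypothetical.

```python
import numpy as np

def ib_update(posterior, predicted, outcome, alpha_w):
    """One incomplete-Bayesian (IB) step for a single action w.

    posterior : belief over the M hypotheses, shape (M,)
    predicted : noise-free binary outcome under each hypothesis for
                action w, shape (M,) (assumed outcome model)
    outcome   : observed binary outcome, 0 or 1
    alpha_w   : known lower bound alpha(w) on q(j, w), with 0 < alpha_w < 1
    """
    posterior = np.asarray(posterior, dtype=float)
    predicted = np.asarray(predicted)
    # The true q(j, w) is unknown, so alpha(w) stands in for it for every j.
    likelihood = np.where(predicted == outcome, alpha_w, 1.0 - alpha_w)
    unnormalized = posterior * likelihood
    return unnormalized / unnormalized.sum()

# Three hypotheses, uniform prior; only hypothesis 0 predicts outcome 1.
belief = ib_update([1 / 3, 1 / 3, 1 / 3], [1, 0, 0], outcome=1, alpha_w=0.8)
```

In this sketch, whenever the true q(j,w) exceeds \alpha(w) the update understates how informative each observation is, which gives some intuition for why a lower bound on the sample size under the IB rule is a natural thing to ask for.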