A/B/n Testing with Control in the Presence of Subpopulations

Neural Information Processing Systems 

The quality of each arm is assessed through a weighted combination of its subpopulation means. We propose a strategy for sequentially choosing one arm per time step so as to discover as quickly as possible which arms, if any, have a higher weighted expectation than the control.
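To make the quantity being compared concrete, the following minimal sketch computes each arm's weighted combination of subpopulation means and lists the arms that beat the control. It is an illustration only, not the sequential strategy proposed here: it assumes the subpopulation means are already known, whereas in the actual problem they must be estimated from samples. The function names, the weights, and the numerical values are all hypothetical.

```python
from typing import List, Sequence


def weighted_value(subpop_means: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted combination of one arm's subpopulation means."""
    if len(subpop_means) != len(weights):
        raise ValueError("one weight per subpopulation is required")
    return sum(w * m for w, m in zip(weights, subpop_means))


def arms_better_than_control(
    means: Sequence[Sequence[float]],
    weights: Sequence[float],
    control: int = 0,
) -> List[int]:
    """Indices of arms whose weighted value strictly exceeds the control's."""
    baseline = weighted_value(means[control], weights)
    return [
        arm
        for arm, arm_means in enumerate(means)
        if arm != control and weighted_value(arm_means, weights) > baseline
    ]


# Hypothetical instance: 3 arms (arm 0 is the control), 2 subpopulations.
# All numbers are made up for illustration.
weights = [0.25, 0.75]
means = [
    [0.5, 0.5],    # control
    [1.0, 0.25],   # better in subpopulation 0, worse overall
    [0.25, 0.75],  # worse in subpopulation 0, better overall
]
better = arms_better_than_control(means, weights)
print(better)
```

Arm 1 has the highest mean in the first subpopulation yet a lower weighted value than the control, which is why arms are compared through the weighted combination rather than through any single subpopulation.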