Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits
