Optimal Data Driven Resource Allocation under Multi-Armed Bandit Observations