Balance Risk and Reward: A Batched-Bandit Strategy for Automated Phased Release