A Bandit Regret Bound Analysis

A.1 Algorithm Procedure

At each round t, after performing a list of actions { A