Regret bounds for Narendra-Shapiro bandit algorithms