Combining No-regret and Q-learning