Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning