Test-Time Regret Minimization in Meta Reinforcement Learning