Value-based Bayesian Meta-reinforcement Learning and Traffic Signal Control