State-Aware Variational Thompson Sampling for Deep Q-Networks