Bayesian Risk-Averse Q-Learning with Streaming Observations