Embedding Words as Distributions with a Bayesian Skip-gram Model