Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning