On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling and Beyond