Offline and Distributional Reinforcement Learning for Radio Resource Management