Average Reward Reinforcement Learning for Wireless Radio Resource Management