Stochastic first-order methods for average-reward Markov decision processes