A unified algorithm framework for mean-variance optimization in discounted Markov decision processes