Sufficient Markov Decision Processes with Alternating Deep Neural Networks