AlphaSnake: Policy Iteration on a Nondeterministic NP-hard Markov Decision Process