Non-myopic learning in repeated stochastic games