Learning Finite State Representations of Recurrent Policy Networks