Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems