Provably Efficient Algorithm for Nonstationary Low-Rank MDPs

Neural Information Processing Systems 

However, in practice, the environment is typically time-varying and nonstationary .