Optimal prediction of Markov chains with and without spectral gap

Neural Information Processing Systems 

We study the following learning problem with dependent data: Given a trajectory of length $n$ from a stationary Markov chain with $k$ states, the goal is to predict the distribution of the next state.
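
To make the problem setup concrete, the following is a minimal sketch and not the estimator analyzed in this work. It samples a trajectory of length $n$ from a $k$-state chain, then predicts the distribution of the next state with a naive add-one smoothed estimate of the transition row for the last observed state. The helper names `sample_trajectory` and `predict_next`, the choice of add-one smoothing, and the example transition matrix are all assumptions made for illustration.

```python
import numpy as np


def sample_trajectory(P, pi, n, rng):
    """Draw X_1, ..., X_n from a chain with transition matrix P, started at pi.

    If pi is the stationary distribution of P, the trajectory is stationary.
    (Hypothetical helper, introduced only for this illustration.)
    """
    k = P.shape[0]
    x = np.empty(n, dtype=int)
    x[0] = rng.choice(k, p=pi)
    for t in range(1, n):
        x[t] = rng.choice(k, p=P[x[t - 1]])
    return x


def predict_next(x, k):
    """Add-one smoothed estimate of the distribution of the next state.

    Counts observed transitions a -> b along the trajectory, then normalizes
    the smoothed counts in the row of the last observed state.
    (A naive baseline assumed for illustration, not the paper's method.)
    """
    counts = np.zeros((k, k))
    for a, b in zip(x[:-1], x[1:]):
        counts[a, b] += 1
    row = counts[x[-1]] + 1.0  # add-one smoothing keeps every entry positive
    return row / row.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Example 2-state chain (assumed); pi below satisfies pi @ P == pi.
    P = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
    pi = np.array([2.0 / 3.0, 1.0 / 3.0])
    x = sample_trajectory(P, pi, n=1000, rng=rng)
    print("last state:", x[-1])
    print("predicted next-state distribution:", predict_next(x, k=2))
    print("true next-state distribution:     ", P[x[-1]])
```

Comparing the predicted row with `P[x[-1]]` gives an informal sense of how prediction quality depends on $n$ and $k$; states that the trajectory rarely visits have few observed transitions, so their rows are estimated poorly.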