Autonomous exploration for navigating in non-stationary CMPs

Gajane, Pratik; Ortner, Ronald; Auer, Peter; Szepesvari, Csaba

arXiv.org Machine Learning 

We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) whose transition probabilities may change abruptly. For this setting, we propose a performance measure called exploration steps, which counts the time steps at which the learner lacks sufficient knowledge to navigate its environment efficiently. We devise a learning meta-algorithm, MNM, and prove an upper bound on the exploration steps in terms of the number of changes.
