building phase
Autonomous exploration for navigating in non-stationary CMPs
Gajane, Pratik, Ortner, Ronald, Auer, Peter, Szepesvari, Csaba
We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) where transition probabilities may abruptly change. For this setting, we propose a performance measure called exploration steps which counts the time steps at which the learner lacks sufficient knowledge to navigate its environment efficiently. We devise a learning meta-algorithm, MNM, and prove an upper bound on the exploration steps in terms of the number of changes.
1910.08446
Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)