Bogert, Kenneth
The Principle of Uncertain Maximum Entropy
Bogert, Kenneth, Kothe, Matthew
The principle of maximum entropy has enabled researchers to map their empirical observations to unbiased models, deepening the understanding of complex systems and phenomena. However, when the model elements are not directly observable, such as when noise or occlusion is present, standard maximum entropy approaches may fail because they are unable to match the feature constraints. Here we present the Principle of Uncertain Maximum Entropy, a method that encodes all available information despite arbitrarily noisy observations while surpassing the accuracy of some ad-hoc methods. Additionally, we use the output of a black-box machine learning model as input to an uncertain maximum entropy model, resulting in a novel approach for scenarios where the observation function is unavailable. We anticipate that our principle will find broad application in diverse fields, as it generalizes the traditional maximum entropy method with the ability to utilize uncertain observations.

Throughout the sciences, we often wish to estimate some distribution given a number of samples taken from it. This distribution may represent the outcome of a real-world process whose parameters we are interested in learning. The principle of maximum entropy (stated below) offers an attractive solution to this task, as it has several valuable attributes, both theoretical and practical. The principle states that, given a set of constraints, one should select the distribution with the highest entropy. This ensures that the chosen posterior distribution contains the least amount of information, encoding only the information present in the constraints.
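For reference, the standard maximum entropy program described above can be written in its textbook form, with generic feature functions $\phi_k$ and empirically estimated feature expectations $\hat{\phi}_k$ (this notation is ours and is not necessarily the exact statement used in the paper):

$$\max_{P} \; -\sum_{x \in X} P(x)\log P(x) \quad \text{subject to} \quad \sum_{x \in X} P(x)\,\phi_k(x) = \hat{\phi}_k \;\; \forall k, \qquad \sum_{x \in X} P(x) = 1.$$

Its solution takes the familiar log-linear (Gibbs) form $P(x) \propto \exp\!\big(\sum_k \lambda_k \phi_k(x)\big)$, where the Lagrange multipliers $\lambda_k$ are chosen so that the feature constraints are satisfied.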
A Hierarchical Bayesian model for Inverse RL in Partially-Controlled Environments
Bogert, Kenneth, Doshi, Prashant
Robots learning from observations in the real world using inverse reinforcement learning (IRL) may encounter objects or agents in the environment, other than the expert, that cause nuisance observations during the demonstration. These confounding elements are typically removed in fully-controlled environments such as virtual simulations or lab settings. When complete removal is impossible, the nuisance observations must be filtered out. However, identifying the source of each observation is difficult when large numbers of observations are made. To address this, we present a hierarchical Bayesian model that incorporates both the expert's and the confounding elements' observations, thereby explicitly modeling the diverse observations a robot may receive. We extend an existing IRL algorithm, originally designed to work under partial occlusion of the expert, to consider these diverse observations. In a simulated robotic sorting domain containing both occlusion and confounding elements, we demonstrate the model's effectiveness. In particular, our technique outperforms several other comparative methods, second only to having perfect knowledge of the subject's trajectory.