Bob and Alice Go to a Bar: Reasoning About Future With Probabilistic Programs

Tolpin, David; Dobkin, Tomer

arXiv.org Artificial Intelligence 

The 'planning as inference' paradigm extends Bayesian inference to future observations. The agent in the environment is modelled as a Bayesian generative model, but the belief about the distribution of the agent's actions is updated based on future goals rather than on past facts. This makes it possible to use common modelling and inference tools, notably probabilistic programming, to represent computer agents and explore their behavior. Representing agents as general programs provides flexibility compared to restricted approaches, such as Markov decision processes and their variants and extensions, and allows a broad range of complex behaviors to be modelled in a unified and natural way. Planning as inference models agent preferences by conditioning agents on preferred future behaviors. Often, the conditioning is achieved through the Boltzmann distribution: the probability of a realization of the agent's behavior is proportional to the exponential of the agent's reward. The motivation for using the Boltzmann distribution is not clear, though. A 'rational' agent should behave in a way that maximizes the agent's expected utility, shouldn't it? One argument is that the Boltzmann distribution models human errors and irrationality.
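As a rough illustration of the Boltzmann conditioning described above, the following minimal Python sketch contrasts a Boltzmann agent with a strictly utility-maximizing one. The action names, reward values, and the `temperature` parameter are assumptions introduced for illustration and do not come from the paper; with `temperature=1.0` the probabilities are proportional to the exponential of the reward, as stated in the abstract.

```python
import math


def boltzmann_policy(rewards, temperature=1.0):
    """Return action probabilities proportional to exp(reward / temperature).

    `rewards` maps each action to its reward. `temperature` is a hypothetical
    knob added for illustration; 1.0 gives plain exp(reward) weighting.
    """
    # Subtract the maximum reward before exponentiating for numerical
    # stability; this does not change the normalized probabilities.
    max_reward = max(rewards.values())
    weights = {
        action: math.exp((reward - max_reward) / temperature)
        for action, reward in rewards.items()
    }
    total = sum(weights.values())
    return {action: weight / total for action, weight in weights.items()}


def greedy_action(rewards):
    """Return the single action with the highest reward (a 'rational' agent)."""
    return max(rewards, key=rewards.get)


# Hypothetical rewards for two actions (illustrative values only).
rewards = {"bar_a": 1.0, "bar_b": 0.0}

policy = boltzmann_policy(rewards)
best = greedy_action(rewards)
```

Under these assumed rewards, the Boltzmann agent still picks the lower-reward action with nonzero probability, whereas the greedy agent always picks `bar_a`. This difference is what prompts the abstract's question about the motivation for the Boltzmann distribution.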