A Bounded rationality, maximum entropy, and Boltzmann-rational policies

Neural Information Processing Systems 

Given the constraint that the human's expected reward is satisfactory, how should we pick a distribution to model the human's choices? The principle of maximum entropy [52] offers guidance: if we want the distribution to encode no information beyond this constraint, we should pick the one that maximizes entropy subject to the expected reward meeting the satisficing threshold.
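
To make the principle concrete, the following is a minimal sketch for a finite action set, which is an assumption made here purely for illustration. It relies on the standard result that the entropy-maximizing distribution under an expected-reward constraint has the Boltzmann form p(a) ∝ exp(β r(a)) with β ≥ 0: if the uniform distribution already meets the threshold then β = 0, and otherwise β is the value at which the expected reward equals the threshold. The function names, reward values, and threshold below are hypothetical and not taken from the paper.

```python
import math

def boltzmann(rewards, beta):
    """Return p(a) proportional to exp(beta * r(a)), shifted by max(r) for stability."""
    m = max(rewards)
    weights = [math.exp(beta * (r - m)) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]

def expected_reward(rewards, probs):
    """Expected reward of a distribution over a finite action set."""
    return sum(r * p for r, p in zip(rewards, probs))

def max_entropy_policy(rewards, threshold, iters=100):
    """Maximum-entropy distribution subject to E[r] >= threshold.

    Hypothetical helper: returns (beta, probs). Assumes the threshold is
    strictly below the best achievable reward, so a finite beta exists.
    """
    if threshold >= max(rewards):
        raise ValueError("threshold must be strictly below the maximum reward")
    uniform = boltzmann(rewards, 0.0)
    if expected_reward(rewards, uniform) >= threshold:
        # Constraint is slack: the unconstrained maximizer (uniform) is feasible.
        return 0.0, uniform
    # Constraint binds. Expected reward is non-decreasing in beta,
    # so bracket the root and then bisect.
    lo, hi = 0.0, 1.0
    while expected_reward(rewards, boltzmann(rewards, hi)) < threshold:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_reward(rewards, boltzmann(rewards, mid)) < threshold:
            lo = mid
        else:
            hi = mid
    return hi, boltzmann(rewards, hi)

# Illustrative values only.
rewards = [0.0, 1.0, 2.0]
beta, probs = max_entropy_policy(rewards, threshold=1.5)
print(beta, probs, expected_reward(rewards, probs))
```

In this sketch a higher satisficing threshold yields a larger β, so the policy concentrates more mass on high-reward actions, while a threshold at or below the mean reward leaves the policy uniform.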