A Bounded rationality, maximum entropy, and Boltzmann-rational policies
–Neural Information Processing Systems
Given the constraint that the human's expected reward is satisfactory, how should we pick a distribution to model the human's choices? The principle of maximum entropy [52] gives us a guide. If we want to encode no extra information in the distribution, then we ought to pick the distribution that maximizes entropy subject to the constraint that the expected reward meets the satisficing threshold.
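As a rough illustration of this principle, the sketch below assumes a finite set of choices with known rewards. Under that assumption, the entropy-maximizing distribution subject to an expected-reward constraint has the Boltzmann form, with probabilities proportional to exp(beta * reward). The function names, the bisection search for beta, and the `beta_max` cap are hypothetical choices made for this example, not something taken from the paper.

```python
import math

def boltzmann(rewards, beta):
    """Probabilities proportional to exp(beta * reward)."""
    m = max(rewards)  # subtract the max so exp() cannot overflow
    weights = [math.exp(beta * (r - m)) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]

def expected_reward(rewards, probs):
    return sum(r * p for r, p in zip(rewards, probs))

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def max_entropy_policy(rewards, threshold, beta_max=1e3, tol=1e-10):
    """Return (beta, probs) maximizing entropy s.t. E[reward] >= threshold.

    Assumption: finite choice set. beta_max is an arbitrary cap, so a
    threshold extremely close to max(rewards) may not be reached exactly.
    """
    uniform = [1.0 / len(rewards)] * len(rewards)
    # If the uniform distribution already satisfies the constraint,
    # it is the maximum-entropy answer (beta = 0).
    if expected_reward(rewards, uniform) >= threshold:
        return 0.0, uniform
    if threshold > max(rewards):
        raise ValueError("threshold exceeds the best achievable reward")
    # Otherwise the constraint binds. Expected reward is nondecreasing
    # in beta, so bisection finds the smallest beta that meets it.
    lo, hi = 0.0, beta_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_reward(rewards, boltzmann(rewards, mid)) < threshold:
            lo = mid
        else:
            hi = mid
    return hi, boltzmann(rewards, hi)

# Usage: three choices with rewards 0, 1, 2 and a threshold of 1.5.
beta, probs = max_entropy_policy([0.0, 1.0, 2.0], threshold=1.5)
```

In this example, any other distribution that also reaches an expected reward of 1.5, such as `[0.25, 0.0, 0.75]`, has lower entropy than the returned one, which is the sense in which the Boltzmann form encodes no extra information.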
Jan-23-2025, 02:29:18 GMT