An Online-Learning Approach to Inverse Optimization
Bärmann, Andreas, Martin, Alexander, Pokutta, Sebastian, Schneider, Oskar
–arXiv.org Artificial Intelligence
Human decision-makers are very good at making decisions under rather imprecise specifications of the decision-making problem, both in terms of constraints and objective. One might argue that the human decision-maker can reliably learn from previously observed decisions - a traditional learning-by-example setup. At the same time, when we try to turn these decision-making problems into actual optimization problems, we often run into all kinds of issues in specifying the model. In an ideal world, we would be able to infer or learn the optimization problem from previously observed decisions taken by an expert. This problem naturally occurs in many settings where we do not have direct access to the decision-maker's preferences or objective function but can observe their behaviour, and where the learner and the decision-maker have access to the same information. Natural examples are as diverse as making recommendations based on user history and strategic planning problems, where the agent's preferences are unknown but the system is observable. Other examples include knowledge transfer from a human planner into a decision support system: human operators have often arrived at finely tuned "objective functions" through many years of experience, and in many cases it is desirable to replicate the decision-making process, both to scale it up and to include it in large-scale scenario analysis and simulation for exploring responses under varying conditions. Here we consider learning an expert's preferences or objectives by observing their actions.
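The setup described above can be illustrated with a minimal sketch: at each round we solve the decision problem with our current cost estimate, observe the expert's decision, and take a projected online-gradient-descent step that moves the estimate toward costs under which the expert's choice looks optimal. The function names (`learn_objective`, `project_simplex`), the simplex constraint on the cost vector, and the fixed step size are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex {c >= 0, sum(c) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def learn_objective(decide, expert_decisions, feasible_sets, dim, eta=0.1):
    """Online-learning sketch of inverse optimization (illustrative).

    Per round t: solve the decision problem under the current cost
    estimate c, observe the expert's decision x_t, and descend the
    round loss l_t(c) = c @ (x_t - x_ours), which is nonnegative
    whenever our decision minimizes c over the feasible set."""
    c = np.full(dim, 1.0 / dim)   # initial guess: uniform costs
    for X, x_expert in zip(feasible_sets, expert_decisions):
        x_ours = decide(c, X)     # our decision under the current estimate
        # Projected gradient step on l_t(c) = c @ (x_expert - x_ours)
        c = project_simplex(c - eta * (x_expert - x_ours))
    return c
```

A usage example over a discrete feasible set (rows of a matrix, expert minimizes an unknown linear cost): after a few rounds the learned costs reproduce the expert's choice, even though they need not equal the true cost vector.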
Oct-30-2018