Multiple Mean-Payoff Optimization under Local Stability Constraints
Klaška, David, Kučera, Antonín, Kůr, Vojtěch, Musil, Vít, Řehák, Vojtěch
arXiv.org Artificial Intelligence
The long-run average payoff per transition (mean payoff) is the main tool for specifying the performance and dependability properties of discrete systems. The problem of constructing a controller (strategy) that simultaneously optimizes several mean payoffs has been studied extensively for stochastic and game-theoretic models. A common issue with the constructed controllers is the instability of the mean payoffs, measured by the deviations of the average rewards per transition computed in a finite "window" sliding along a run. Unfortunately, the problem of simultaneously optimizing the mean payoffs under local stability constraints is computationally hard, and existing works do not provide a practically usable algorithm even for non-stochastic models such as two-player games. In this paper, we design and evaluate the first efficient and scalable solution to this problem applicable to Markov decision processes.
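To make the notion of window-based instability concrete, the following is a minimal Python sketch, not the paper's algorithm. It computes the mean payoff of a finite reward sequence and the average reward in every sliding window of a fixed length. The function names and the choice of "largest absolute deviation of a window average from the overall mean" as the stability measure are assumptions made for illustration only; the abstract does not specify the exact measure the authors use.

```python
from typing import List, Sequence


def mean_payoff(rewards: Sequence[float]) -> float:
    """Average reward per transition over a finite prefix of a run."""
    return sum(rewards) / len(rewards)


def window_means(rewards: Sequence[float], window: int) -> List[float]:
    """Average reward in each length-`window` slice sliding along the run."""
    if not 1 <= window <= len(rewards):
        raise ValueError("window must be between 1 and len(rewards)")
    total = sum(rewards[:window])
    means = [total / window]
    for i in range(window, len(rewards)):
        # Slide the window one step: add the new reward, drop the oldest.
        total += rewards[i] - rewards[i - window]
        means.append(total / window)
    return means


def max_window_deviation(rewards: Sequence[float], window: int) -> float:
    """Hypothetical stability measure: the largest gap between a window
    average and the overall mean payoff (an assumption, see lead-in)."""
    target = mean_payoff(rewards)
    return max(abs(m - target) for m in window_means(rewards, window))


# Two hypothetical runs with the same mean payoff but different stability.
steady = [1, 0, 1, 0, 1, 0, 1, 0]
bursty = [1, 1, 1, 1, 0, 0, 0, 0]
```

Both runs have the same mean payoff, yet with a window of length 2 every window of `steady` matches the overall mean, while the windows of `bursty` drift far from it. This is the kind of gap between long-run and local behavior that the abstract describes.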
Dec-17-2024
- Country:
  - Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
  - Europe > Czechia > South Moravian Region > Brno (0.04)
- Genre:
  - Research Report (0.64)