Maximum Causal Entropy Inverse Reinforcement Learning for Mean-Field Games
Anahtarci, Berkay; Kariksiz, Can Deha; Saldi, Naci
arXiv.org Artificial Intelligence
In this paper, we introduce the maximum causal entropy inverse reinforcement learning (IRL) problem for discrete-time mean-field games (MFGs) under an infinite-horizon discounted-reward optimality criterion, where the state space of a typical agent is finite. Our approach begins with a comprehensive review of the maximum entropy IRL problem for deterministic and stochastic Markov decision processes (MDPs) in both finite- and infinite-horizon settings. We then formulate the maximum causal entropy IRL problem for MFGs, which is a non-convex optimization problem with respect to policies. Leveraging the linear programming formulation of MDPs, we recast this IRL problem as a convex optimization problem and develop a gradient descent algorithm that computes the optimal solution with a guaranteed rate of convergence. Finally, we present a new algorithm that formulates the MFG problem as a generalized Nash equilibrium problem (GNEP) and computes the mean-field equilibrium (MFE) for the forward RL problem; we use this method to generate data for a numerical example. We note that this novel algorithm is also applicable to general MFE computation.
Keywords: mean-field games, inverse reinforcement learning, maximum causal entropy, discounted reward.
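For context, below is a minimal sketch of the single-agent maximum causal entropy IRL gradient step that the abstract says the paper reviews; it is not the paper's convex LP reformulation for MFGs. It assumes a finite MDP, a reward linear in features, and a gradient given by discounted feature-expectation matching. All names (`soft_value_iteration`, `phi`, `nu0`, `d_expert`) are illustrative, not from the paper.

```python
import numpy as np
from scipy.special import logsumexp

def soft_value_iteration(r, P, gamma, iters=500):
    """Entropy-regularized (soft) Bellman iteration on a finite MDP.
    r: (S, A) reward table; P: (S, A, S) transition kernel."""
    V = np.zeros(r.shape[0])
    for _ in range(iters):
        Q = r + gamma * P @ V              # soft Q-values, shape (S, A)
        V = logsumexp(Q, axis=1)           # soft maximum over actions
    return np.exp(Q - V[:, None])          # softmax (max-causal-entropy) policy

def occupation_measure(pi, P, nu0, gamma):
    """Normalized discounted state-action occupation measure of policy pi."""
    P_pi = np.einsum('sa,sat->st', pi, P)  # induced state-to-state kernel
    mu = (1 - gamma) * np.linalg.solve(np.eye(len(nu0)) - gamma * P_pi.T, nu0)
    return mu[:, None] * pi                # shape (S, A)

def maxcausalent_irl(phi, P, nu0, gamma, d_expert, lr=0.5, steps=200):
    """Gradient ascent on reward weights theta, with reward r = phi @ theta.
    phi: (S, A, k) features; d_expert: (S, A) expert occupation measure."""
    theta = np.zeros(phi.shape[-1])
    f_expert = np.einsum('sa,sak->k', d_expert, phi)
    for _ in range(steps):
        pi = soft_value_iteration(phi @ theta, P, gamma)
        d = occupation_measure(pi, P, nu0, gamma)
        theta += lr * (f_expert - np.einsum('sa,sak->k', d, phi))  # feature matching
    return theta
```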
Jan-12-2024
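The GNEP-based forward-RL algorithm is the paper's own contribution and is detailed there. As a point of comparison only, here is the generic damped fixed-point iteration commonly used to compute an MFE, reusing the helpers from the sketch above; the interface (`reward_of`, `kernel_of` mapping a mean field to reward and transition tables) is hypothetical.

```python
def mean_field_fixed_point(reward_of, kernel_of, nu0, gamma, outer=100, damp=0.5):
    """Generic damped fixed-point iteration for a mean-field equilibrium:
    best-respond to the current mean field mu, then move mu toward the
    state distribution that the best response induces.
    reward_of(mu) -> (S, A) reward; kernel_of(mu) -> (S, A, S) kernel."""
    mu = nu0.copy()
    for _ in range(outer):
        pi = soft_value_iteration(reward_of(mu), kernel_of(mu), gamma)
        d = occupation_measure(pi, kernel_of(mu), nu0, gamma)
        mu = (1 - damp) * mu + damp * d.sum(axis=1)   # damped mean-field update
    return pi, mu
```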