Ouhamma, Reda
Nash equilibria in scalar discrete-time linear quadratic games
Salizzoni, Giulio, Ouhamma, Reda, Kamgarpour, Maryam
An open problem in linear quadratic (LQ) games has been characterizing the Nash equilibria. This problem has renewed relevance given the surge of work on understanding the convergence of learning algorithms in dynamic games. This paper investigates scalar discrete-time infinite-horizon LQ games with two agents. Even in this arguably simple setting, there are no results for finding $\textit{all}$ Nash equilibria. By analyzing the best response map, we formulate a polynomial system of equations characterizing the linear feedback Nash equilibria. This enables us to bring in tools from algebraic geometry, particularly the Gr\"obner basis, to study the roots of this polynomial system. Consequently, we can not only compute all Nash equilibria numerically, but we can also characterize their number with explicit conditions. For instance, we prove that the LQ games under consideration admit at most three Nash equilibria. We further provide sufficient conditions for the existence of at most two Nash equilibria and sufficient conditions for the uniqueness of the Nash equilibrium. Our numerical experiments demonstrate the tightness of our bounds and showcase the increased complexity in settings with more than two agents.
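The Gröbner-basis step lends itself to a short illustration. The sketch below is not the system derived in the paper: the polynomials p1 and p2 are hypothetical placeholders for the two players' best-response stationarity conditions in the feedback gains (k1, k2), and sympy is used only to show the computational pattern of triangularizing and solving such a system.

```python
# Sketch (not the paper's system): enumerating candidate Nash feedback gains
# via a Groebner basis with sympy. The polynomials p1, p2 are illustrative
# placeholders for the players' best-response stationarity conditions.
import sympy as sp

k1, k2 = sp.symbols("k1 k2", real=True)

# Placeholder coupled polynomial conditions in the feedback gains.
p1 = k1**2 + k2 - 1
p2 = k2**2 + k1 - 1

# A lexicographic Groebner basis triangularizes the system: its last element
# is univariate, so its roots can be found and back-substituted.
G = sp.groebner([p1, p2], k1, k2, order="lex")
solutions = sp.solve(list(G), [k1, k2], dict=True)

# Keep only the real roots (the candidate linear feedback equilibria).
real_solutions = []
for s in solutions:
    v1, v2 = complex(sp.N(s[k1])), complex(sp.N(s[k2]))
    if abs(v1.imag) < 1e-9 and abs(v2.imag) < 1e-9:
        real_solutions.append((round(v1.real, 6), round(v2.real, 6)))

print(f"{len(real_solutions)} real candidate equilibria:", real_solutions)
```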
Finite-time convergence to an $\epsilon$-efficient Nash equilibrium in potential games
Maddux, Anna, Ouhamma, Reda, Kamgarpour, Maryam
This paper investigates the convergence time of log-linear learning to an $\epsilon$-efficient Nash equilibrium (NE) in potential games. In such games, an efficient NE is defined as the maximizer of the potential function. Existing results are limited to potential games with stringent structural assumptions and entail convergence times that are exponential in $1/\epsilon$. We tackle the previously unaddressed case of general potential games and prove the first finite-time convergence to an $\epsilon$-efficient NE. In particular, by using a problem-dependent analysis, our bound depends polynomially on $1/\epsilon$. Furthermore, we provide two extensions of our convergence result: first, we show that a variant of log-linear learning that requires a factor of $A$ less feedback on the utility per round enjoys a similar convergence time; second, we demonstrate that our convergence guarantee is robust to small perturbations of log-linear learning, such as alterations in the learning rule or noise-corrupted utilities.
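For readers unfamiliar with the dynamics, here is a minimal sketch of log-linear learning on a two-player identical-interest game (a special case of a potential game in which the shared payoff is itself the potential). The payoff matrix Phi, the inverse temperature beta, and the horizon are illustrative choices, not the paper's setting; the point is only that empirical play concentrates near the potential maximizer, i.e., the efficient NE.

```python
# Sketch: log-linear learning on a 2-player identical-interest game.
# Payoffs and the inverse temperature beta are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Shared potential / payoff Phi[a1, a2]; the efficient NE is its argmax (0, 0).
Phi = np.array([[1.0, 0.0, 0.0],
                [0.0, 0.9, 0.2],
                [0.0, 0.2, 0.3]])

def log_linear_learning(Phi, beta=8.0, T=20_000):
    n1, n2 = Phi.shape
    a = [rng.integers(n1), rng.integers(n2)]          # joint action
    visits = np.zeros_like(Phi)
    for _ in range(T):
        i = rng.integers(2)                           # uniformly chosen revising player
        # Utilities of player i's actions, opponent's action held fixed.
        u = Phi[:, a[1]] if i == 0 else Phi[a[0], :]
        p = np.exp(beta * (u - u.max()))              # Boltzmann revision rule
        p /= p.sum()
        a[i] = rng.choice(len(u), p=p)
        visits[a[0], a[1]] += 1
    return visits / T

freq = log_linear_learning(Phi)
print("empirical joint-action frequencies:\n", np.round(freq, 3))
print("most visited joint action:", np.unravel_index(freq.argmax(), freq.shape))
```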
Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability
Ouhamma, Reda, Kamgarpour, Maryam
We consider decentralized learning for zero-sum games, where players observe only their own payoff information and are oblivious to the actions and payoffs of the opponent. Previous works demonstrated convergence to a Nash equilibrium in this setting using double time-scale algorithms under strong reachability assumptions. We address the open problem of achieving an approximate Nash equilibrium efficiently with an uncoupled, single time-scale algorithm under weaker conditions. Our contribution is a rational and convergent algorithm that uses Tsallis-entropy regularization within a value-iteration-based approach. The algorithm learns an approximate Nash equilibrium in polynomial time, requiring only the existence of a policy pair that induces an irreducible and aperiodic Markov chain, thus considerably weakening past assumptions. Our analysis leverages negative-drift inequalities and establishes novel properties of Tsallis entropy that are of independent interest.
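As a small illustration of the regularization ingredient named above (and not of the paper's algorithm), the sketch below computes the Tsallis entropy of a policy and a numerically solved Tsallis-regularized best response to a payoff vector. The entropy index q, the temperature tau, and the payoff vector are assumed values chosen for illustration.

```python
# Sketch of the Tsallis-entropy ingredient: the entropy of a policy and a
# Tsallis-regularized best response, solved numerically. Index q, temperature
# tau, and payoffs are illustrative; this is not the paper's value iteration.
import numpy as np
from scipy.optimize import minimize

def tsallis_entropy(p, q=0.5):
    """H_q(p) = (1 - sum_a p(a)^q) / (q - 1); recovers Shannon entropy as q -> 1."""
    p = np.clip(p, 1e-12, 1.0)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def regularized_best_response(payoff, tau=0.1, q=0.5):
    """argmax_p <p, payoff> + tau * H_q(p) over the probability simplex."""
    n = len(payoff)
    objective = lambda p: -(p @ payoff + tau * tsallis_entropy(p, q))
    constraints = ({"type": "eq", "fun": lambda p: p.sum() - 1.0},)
    bounds = [(0.0, 1.0)] * n
    res = minimize(objective, np.full(n, 1.0 / n), bounds=bounds,
                   constraints=constraints, method="SLSQP")
    return res.x

payoff = np.array([1.0, 0.8, 0.1])
pi = regularized_best_response(payoff)
print("regularized best response:", np.round(pi, 3))  # smoothed, not a pure argmax
```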
Is Standard Deviation the New Standard? Revisiting the Critic in Deep Policy Gradients
Flet-Berliac, Yannis, Ouhamma, Reda, Maillard, Odalric-Ambrym, Preux, Philippe
Policy gradient algorithms have proven successful in diverse decision-making and control tasks. However, these methods suffer from high sample complexity and instability. In this paper, we address these challenges by providing a different approach to training the critic in the actor-critic framework. Our work builds on recent studies indicating that traditional actor-critic algorithms do not succeed in fitting the true value function, calling for a better objective for the critic. In our method, the critic uses a new state-value (resp. state-action-value) function approximation that learns the relative value of the states (resp. state-action pairs) rather than their absolute value as in conventional actor-critic methods. We prove the theoretical consistency of the new gradient estimator and observe dramatic empirical improvements across a variety of continuous control tasks and algorithms. Furthermore, we validate our method in tasks with sparse rewards, where we provide experimental evidence and theoretical insights.
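One way to read "relative rather than absolute value" is a critic objective that is invariant to a constant shift of the value estimates, for instance penalizing the variance of the residuals instead of their mean square. The sketch below contrasts the two losses on synthetic targets; it is an illustrative reading, not necessarily the exact objective used in the paper.

```python
# Sketch of a shift-invariant ("relative value") critic objective: penalize the
# variance of the residuals rather than their mean square, so a constant offset
# in the value estimates is not penalized. Targets are synthetic stand-ins.
import numpy as np

def mse_loss(v_pred, v_target):
    return np.mean((v_target - v_pred) ** 2)

def residual_variance_loss(v_pred, v_target):
    residuals = v_target - v_pred
    return np.mean((residuals - residuals.mean()) ** 2)

rng = np.random.default_rng(0)
v_target = rng.normal(size=256)                 # stand-in value targets
v_pred = v_target + 0.1 * rng.normal(size=256)  # noisy value estimates

for offset in (0.0, 5.0):                       # shift all predictions by a constant
    shifted = v_pred + offset
    print(f"offset={offset}: mse={mse_loss(shifted, v_target):.3f}, "
          f"residual-variance={residual_variance_loss(shifted, v_target):.3f}")
```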