Goto

Collaborating Authors

 perfect recall


Ex ante coordination and collusion in zero-sum multi-player extensive-form games

Gabriele Farina, Andrea Celli, Nicola Gatti, Tuomas Sandholm

Neural Information Processing Systems

Recent milestones in equilibrium computation, such as the success of Libratus, show that it is possible to compute strong solutions to two-player zero-sum games in theory and practice. This is not the case for games with more than two players, which remain one of the main open challenges in computational game theory. This paper focuses on zero-sum games where a team of players faces an opponent, as is the case, for example, in Bridge, collusion in poker, and many non-recreational applications such as war, where the colluders do not have time or means of communicating during battle, collusion in bidding, where communication during the auction is illegal, and coordinated swindling in public. The possibility for the team members to communicate before game play--that is, coordinate their strategies ex ante--makes the use of behavioral strategies unsatisfactory. The reasons for this are closely related to the fact that the team can be represented as a single player with imperfect recall. We propose a new game representation, the realization form, that generalizes the sequence form but can also be applied to imperfect-recall games. Then, we use it to derive an auxiliary game that is equivalent to the original one. It provides a sound way to map the problem of finding an optimal ex-antecoordinated strategy for the team to the well-understood Nash equilibrium-finding problem in a (larger) two-player zero-sum perfect-recall game. By reasoning over the auxiliary game, we devise an anytime algorithm, fictitious team-play, that is guaranteed to converge to an optimal coordinated strategy for the team against an optimal opponent, and that is dramatically faster than the prior state-of-the-art algorithm for this problem.


Learning in two-player zero-sum partially observable Markov games with perfect recall

Neural Information Processing Systems

We study the problem of learning a Nash equilibrium (NE) in an extensive game with imperfect information (EGII) through self-play. Precisely, we focus on two-player, zero-sum, episodic, tabular EGII under the \textit{perfect-recall} assumption where the only feedback is realizations of the game (bandit feedback). In particular the \textit{dynamics of the EGII is not known}---we can only access it by sampling or interacting with a game simulator. For this learning setting, we provide the Implicit Exploration Online Mirror Descent (IXOMD) algorithm. It is a model-free algorithm with a high-probability bound on convergence rate to the NE of order $1/\sqrt{T}$ where~$T$ is the number of played games. Moreover IXOMD is computationally efficient as it needs to perform the updates only along the sampled trajectory.


Dominated Actions in Imperfect-Information Games

Ganzfried, Sam

arXiv.org Artificial Intelligence

Dominance is a fundamental concept in game theory. In normal-form games dominated strategies can be identified in polynomial time. As a consequence, iterative removal of dominated strategies can be performed efficiently as a preprocessing step for reducing the size of a game before computing a Nash equilibrium. For imperfect-information games in extensive form, we could convert the game to normal form and then iteratively remove dominated strategies in the same way; however, this conversion may cause an exponential blowup in game size. In this paper we define and study the concept of dominated actions in imperfect-information games. Our main result is a polynomial-time algorithm for determining whether an action is dominated (strictly or weakly) by any mixed strategy in n-player games, which can be extended to an algorithm for iteratively removing dominated actions. This allows us to efficiently reduce the size of the game tree as a preprocessing step for Nash equilibrium computation. We explore the role of dominated actions empirically in "All In or Fold" No-Limit Texas Hold'em poker.



Assessing the Performance Gap Between Lexical and Semantic Models for Information Retrieval With Formulaic Legal Language

Mori, Larissa, de Oliveira, Carlos Sousa, Yih, Yuehwern, Ventresca, Mario

arXiv.org Artificial Intelligence

Legal passage retrieval is an important task that assists legal practitioners in the time-intensive process of finding relevant precedents to support legal arguments. This study investigates the task of retrieving legal passages or paragraphs from decisions of the Court of Justice of the European Union (CJEU), whose language is highly structured and formulaic, leading to repetitive patterns. Understanding when lexical or semantic models are more effective at handling the repetitive nature of legal language is key to developing retrieval systems that are more accurate, efficient, and transparent for specific legal domains. To this end, we explore when this routinized legal language is better suited for retrieval using methods that rely on lexical and statistical features, such as BM25, or dense retrieval models trained to capture semantic and contextual information. A qualitative and quantitative analysis with three complementary metrics shows that both lexical and dense models perform well in scenarios with more repetitive usage of language, whereas BM25 performs better than the dense models in more nuanced scenarios where repetition and verbatim~quotes are less prevalent and in longer queries. Our experiments also show that BM25 is a strong baseline, surpassing off-the-shelf dense models in 4 out of 7 performance metrics. However, fine-tuning a dense model on domain-specific data led to improved performance, surpassing BM25 in most metrics, and we analyze the effect of the amount of data used in fine-tuning on the model's performance and temporal robustness. The code, dataset and appendix related to this work are available on: https://github.com/larimo/lexsem-legal-ir.


Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method

Chen, Peter Baile, Zhang, Yi, Cafarella, Michael, Roth, Dan

arXiv.org Artificial Intelligence

Real-world open-domain questions can be complicated, particularly when answering them involves information from multiple information sources. LLMs have demonstrated impressive performance in decomposing complex tasks into simpler steps, and previous work has used it for better retrieval in support of complex questions. However, LLM's decomposition of questions is unaware of what data is available and how data is organized, often leading to a sub-optimal retrieval performance. Recent effort in agentic RAG proposes to perform retrieval in an iterative fashion, where a followup query is derived as an action based on previous rounds of retrieval. While this provides one way of interacting with the data collection, agentic RAG's exploration of data is inefficient because successive queries depend on previous results rather than being guided by the organization of available data in the collection. To address this problem, we propose an LLM-based retrieval method -- ARM, that aims to better align the question with the organization of the data collection by exploring relationships among data objects beyond matching the utterance of the query, thus leading to a retrieve-all-at-once solution for complex queries. We evaluated ARM on two datasets, Bird and OTT-QA. On Bird, it outperforms standard RAG with query decomposition by up to 5.2 pt in execution accuracy and agentic RAG (ReAct) by up to 15.9 pt. On OTT-QA, it achieves up to 5.5 pt and 19.3 pt higher F1 match scores compared to these approaches.


Asynchronous Agents with Perfect Recall: Model Reductions, Knowledge-Based Construction, and Model Checking for Coalitional Strategies

Gurov, Dilian, Jamroga, Filip, Jamroga, Wojciech, Kamiński, Mateusz, Kurpiewski, Damian, Penczek, Wojciech, Sidoruk, Teofil

arXiv.org Artificial Intelligence

Model checking of strategic abilities for agents with memory is a notoriously hard problem, and very few attempts have been made to tackle it. In this paper, we present two important steps towards this goal. First, we take the partial-order reduction scheme that was recently proved to preserve individual and coalitional abilities of memoryless agents, and show that it also works for agents with memory. Secondly, we take the Knowledge-Based Subset Construction, that was recently studied for synchronous concurrent games, and adapt it to preserve abilities of memoryful agents in asynchronous MAS. On the way, we also propose a new execution semantics for strategies in asynchronous MAS, that combines elements of Concurrent Game Structures and Interleaved Interpreted Systems in a natural and intuitive way.


Learning in two-player zero-sum partially observable Markov games with perfect recall

Neural Information Processing Systems

We study the problem of learning a Nash equilibrium (NE) in an extensive game with imperfect information (EGII) through self-play. Precisely, we focus on two-player, zero-sum, episodic, tabular EGII under the \textit{perfect-recall} assumption where the only feedback is realizations of the game (bandit feedback). In particular the \textit{dynamics of the EGII is not known}---we can only access it by sampling or interacting with a game simulator. For this learning setting, we provide the Implicit Exploration Online Mirror Descent (IXOMD) algorithm. It is a model-free algorithm with a high-probability bound on convergence rate to the NE of order 1/\sqrt{T} where T is the number of played games.


Approximating Perfect Recall when Model Checking Strategic Abilities: Theory and Applications

Belardinelli, Francesco (Imperial College London, United Kingdom and Laboratoire IBISC, Université d'Evry, France) | Lomuscio, Alessio (Imperial College London United Kingdom) | Malvone, Vadim | Yu, Emily ( Johannes Kepler University Linz, Austria)

Journal of Artificial Intelligence Research

The model checking problem for multi-agent systems against specifications in the alternating-time temporal logic ATL, hence ATL∗, under perfect recall and imperfect information is known to be undecidable. To tackle this problem, in this paper we investigate a notion of bounded recall under incomplete information. We present a novel three-valued semantics for ATL∗ in this setting and analyse the corresponding model checking problem. We show that the three-valued semantics here introduced is an approximation of the classic two-valued semantics, then give a sound, albeit partial, algorithm for model checking two-valued perfect recall via its approximation as three-valued bounded recall. Finally, we extend MCMAS, an open-source model checker for ATL and other agent specifications, to incorporate bounded recall; we illustrate its use and present experimental results.


Knowing How to Plan

Li, Yanjun, Wang, Yanjing

arXiv.org Artificial Intelligence

Standard Epistemic Logic (EL) mainly studies reasoning patterns of knowing that ϕ, despite early contributions by Hintikka on formulating other know-wh expressions such as knowing who and why using first-order and higher-order modal logic. In recent years, there is a resurgence of interest on epistemic logics of know-wh powered by the new techniques for fragments of firstorder modal logic based on the so-called bundle modalities packing a quantifier and a normal epistemic modality together [26, 24, 21]. Within the varieties of logics of know-wh, the logics of know-how received the most attention in AI (cf.