AITopics | sequential game

Stochastic contextual bandits with graph feedback: from independence number to MAS number

Yuxiao Wen, Yanjun Han

Neural Information Processing Systems | Feb-15-2026, 22:37:24 GMT

The framework of formulating the feedback structure as feedback graphs in bandits has a long history (Mannor and Shamir, 2011; Alon et al., 2015, 2017; Lykouris et al.,

bandit, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science > Data Mining (0.68)

Learning to Correlate in Multi-Player General-Sum Sequential Games

Andrea Celli, Alberto Marchesi, Tommaso Bianchi, Nicola Gatti

Neural Information Processing Systems | Feb-12-2026, 04:36:12 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, cce, cfr-jr, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas (0.04)
North America > Canada (0.04)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Games > Poker (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

65cf25ef90de99d93fa96dc49d0d8b3c-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 17:05:13 GMT

algorithm, learner, opponent, (13 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.05)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > Canada (0.04)
(2 more...)

Industry:

Transportation > Infrastructure & Services (0.68)
Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.96)

Adversarially Robust Decision Transformer

Neural Information Processing Systems | Dec-24-2025, 21:48:21 GMT

Decision Transformer (DT), one of the representative Reinforcement Learning via Supervised Learning (RvS) methods, has achieved strong performance in offline learning tasks by leveraging the Transformer architecture for sequential decision-making. However, in adversarial environments, these methods can be non-robust, since the return depends on the strategies of both the decision-maker and the adversary. Training a probabilistic model conditioned on observed return to predict actions can fail to generalize, because trajectories that achieve a given return in the dataset may have done so only against a suboptimal adversary. To address this, we propose a worst-case-aware RvS algorithm, the Adversarially Robust Decision Transformer (ARDT), which learns and conditions the policy on in-sample minimax returns-to-go. ARDT aligns the target return with the worst-case return learned through minimax expectile regression, thereby enhancing robustness against powerful test-time adversaries. In experiments conducted on sequential games with full data coverage, ARDT can generate a maximin (Nash Equilibrium) strategy, the solution with the largest adversarial robustness. In large-scale sequential games and continuous adversarial RL environments with partial data coverage, ARDT demonstrates significantly superior robustness to powerful test-time adversaries and attains higher worst-case returns compared to contemporary DT methods.

artificial intelligence, machine learning, reinforcement learning, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

Exploiting Opponents Under Utility Constraints in Sequential Games

Neural Information Processing Systems | Dec-24-2025, 06:33:37 GMT

Recently, game-playing agents based on AI techniques have demonstrated super-human performance in several sequential games, such as chess, Go, and poker. Surprisingly, the multi-agent learning techniques behind these achievements do not take into account the actual behavior of the human player, potentially leading to a substantial gap in performance. In this paper, we address the problem of designing artificial agents that learn how to effectively exploit unknown human opponents while playing repeatedly against them in an online fashion. We study the case in which the agent's strategy during each repetition of the game is subject to constraints ensuring that the human's expected utility stays between a lower and an upper threshold. Our framework encompasses several real-world problems, such as human engagement in repeated game playing and human education by means of serious games. As a first result, we formalize a set of linear inequalities encoding the conditions that the agent's strategy must satisfy at each iteration in order not to violate the given bounds on the human's expected utility. Then, we use this formulation in an upper confidence bound algorithm, and we prove that the resulting procedure suffers sublinear regret and guarantees that the constraints are satisfied with high probability at each iteration. Finally, we empirically evaluate the convergence of our algorithm on standard testbeds of sequential games.

exploiting opponent, name change, utility constraint, (6 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.59)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.59)

Stochastic contextual bandits with graph feedback: from independence number to MAS number

Yuxiao Wen, Yanjun Han

Neural Information Processing Systems | Oct-10-2025, 06:29:37 GMT

The framework of formulating the feedback structure as feedback graphs in bandits has a long history (Mannor and Shamir, 2011; Alon et al., 2015, 2017; Lykouris et al.,

algorithm, bandit, contextual bandit, (15 more...)

Neural Information Processing Systems

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science > Data Mining (0.68)

Learning to Play Sequential Games versus Unknown Opponents

Neural Information Processing Systems | Oct-9-2025, 14:42:10 GMT

To this end, we use kernel-based regularity assumptions to capture and exploit the structure in the opponent's response. We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.

artificial intelligence, machine learning, opponent, (15 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America (0.28)

Industry:

Transportation > Infrastructure & Services (0.68)
Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.96)

Learning to Correlate in Multi-Player General-Sum Sequential Games

Andrea Celli, Alberto Marchesi, Tommaso Bianchi, Nicola Gatti

Neural Information Processing Systems | Oct-2-2025, 17:48:19 GMT

In this paper, we focus on coarse correlated equilibria (CCEs) in sequential games.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Games > Poker (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Appendix of the paper Exploiting Opponents under Utility Constraints in Sequential Games

Neural Information Processing Systems | Aug-15-2025, 01:37:29 GMT

Appendix A provides the proofs omitted from Section 4.1, describing the method adopted Appendix B provides the proofs omitted from Section 4.2, describing the method adopted Appendix D provides some additional experimental results. A Proofs omitted from Section 4.1 B Proofs omitted from Section 4.2 Theorem 2. Let t ∈ [T] and δ ∈ (0, 1). The proof follows the reasoning outlined in Section 4.2. Before proving Theorem 3, we need to show the following technical lemma. Lemma 4. Let f(τ) := This is reasonable in practice, since a new player can always be profiled according to a number of user classes.

algorithm, equation, probability, (13 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)

Adversarially Robust Decision Transformer

Neural Information Processing Systems | May-26-2025, 20:59:10 GMT

Decision Transformer (DT), one of the representative Reinforcement Learning via Supervised Learning (RvS) methods, has achieved strong performance in offline learning tasks by leveraging the Transformer architecture for sequential decision-making. However, in adversarial environments, these methods can be non-robust, since the return depends on the strategies of both the decision-maker and the adversary. Training a probabilistic model conditioned on observed return to predict actions can fail to generalize, because trajectories that achieve a given return in the dataset may have done so only against a suboptimal adversary. To address this, we propose a worst-case-aware RvS algorithm, the Adversarially Robust Decision Transformer (ARDT), which learns and conditions the policy on in-sample minimax returns-to-go. ARDT aligns the target return with the worst-case return learned through minimax expectile regression, thereby enhancing robustness against powerful test-time adversaries.

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)
