Goto

Collaborating Authors

 skill level


Maia-2: A Unified Model for Human-AI Alignment in Chess

Neural Information Processing Systems

There are an increasing number of domains in which artificial intelligence (AI) systems both surpass human ability and accurately model human behavior. This introduces the possibility of algorithmically-informed teaching in these domains through more relatable AI partners and deeper insights into human decision-making. Critical to achieving this goal, however, is coherently modeling human behavior at various skill levels. Chess is an ideal model system for conducting research into this kind of human-AI alignment, with its rich history as a pivotal testbed for AI research, mature superhuman AI systems like AlphaZero, and precise measurements of skill via chess rating systems. Previous work in modeling human decision-making in chess uses completely independent models to capture human style at different skill levels, meaning they lack coherence in their ability to adapt to the full spectrum of human improvement and are ultimately limited in their effectiveness as AI partners and teaching tools. In this work, we propose a unified modeling approach for human-AI alignment in chess that coherently captures human style across different skill levels and directly captures how people improve. Recognizing the complex, non-linear nature of human learning, we introduce a skill-aware attention mechanism to dynamically integrate players' strengths with encoded chess positions, enabling our model to be sensitive to evolving player skill. Our experimental results demonstrate that this unified framework significantly enhances the alignment between AI and human players across a diverse range of expertise levels, paving the way for deeper insights into human decision-making and AI-guided teaching tools.


Estimation of Skill Distribution from a Tournament

Neural Information Processing Systems

In this paper, we study the problem of learning the skill distribution of a population of agents from observations of pairwise games in a tournament. These games are played among randomly drawn agents from the population. The agents in our model can be individuals, sports teams, or Wall Street fund managers. Formally, we postulate that the likelihoods of outcomes of games are governed by the parametric Bradley-Terry-Luce (or multinomial logit) model, where the probability of an agent beating another is the ratio between its skill level and the pairwise sum of skill levels, and the skill parameters are drawn from an unknown, non-parametric skill density of interest. The problem is, in essence, to learn a distribution from noisy, quantized observations.


Predicting Human Chess Moves: An AI Assisted Analysis of Chess Games Using Skill-group Specific n-gram Language Models

Zhong, Daren, Huang, Dingcheng, Greenberg, Clayton

arXiv.org Artificial Intelligence

Chess, a deterministic game with perfect information, has long served as a benchmark for studying strategic decision-making and artificial intelligence. Traditional chess engines or tools for analysis primarily focus on calculating optimal moves, often neglecting the variability inherent in human chess playing, particularly across different skill levels. To overcome this limitation, we propose a novel and computationally efficient move prediction framework that approaches chess move prediction as a behavioral analysis task. The framework employs n-gram language models to capture move patterns characteristic of specific player skill levels. By dividing players into seven distinct skill groups, from novice to expert, we trained separate models using data from the open-source chess platform Lichess. The framework dynamically selects the most suitable model for prediction tasks and generates player moves based on preceding sequences. Evaluation on real-world game data demonstrates that the model selector module within the framework can classify skill levels with an accuracy of up to 31.7\% when utilizing early game information (16 half-moves). The move prediction framework also shows substantial accuracy improvements, with our Selector Assisted Accuracy being up to 39.1\% more accurate than our benchmark accuracy. The computational efficiency of the framework further enhances its suitability for real-time chess analysis.




Discrimination in Online Markets: Effects of Social Bias on Learning from Reviews and Policy Design

Faidra Georgia Monachou, Itai Ashlagi

Neural Information Processing Systems

The increasing popularity of online two-sided markets such as ride-sharing, accommodation and freelance labor platforms, goes hand in hand with new socioeconomic challenges. One major issue remains the existence of bias and discrimination against certain social groups.


Skill-Aligned Fairness in Multi-Agent Learning for Collaboration in Healthcare

Ekpo, Promise Osaine, La, Brian, Wiener, Thomas, Agarwal, Saesha, Agrawal, Arshia, Gonzalez-Pumariega, Gonzalo, Molu, Lekan P., Taylor, Angelique

arXiv.org Artificial Intelligence

Fairness in multi-agent reinforcement learning (MARL) is often framed as a workload balance problem, overlooking agent expertise and the structured coordination required in real-world domains. In healthcare, equitable task allocation requires workload balance or expertise alignment to prevent burnout and overuse of highly skilled agents. Workload balance refers to distributing an approximately equal number of subtasks or equalised effort across healthcare workers, regardless of their expertise. We make two contributions to address this problem. First, we propose FairSkillMARL, a framework that defines fairness as the dual objective of workload balance and skill-task alignment. Second, we introduce MARLHospital, a customizable healthcare-inspired environment for modeling team compositions and energy-constrained scheduling impacts on fairness, as no existing simulators are well-suited for this problem. We conducted experiments to compare FairSkillMARL in conjunction with four standard MARL methods, and against two state-of-the-art fairness metrics. Our results suggest that fairness based solely on equal workload might lead to task-skill mismatches and highlight the need for more robust metrics that capture skill-task misalignment. Our work provides tools and a foundation for studying fairness in heterogeneous multi-agent systems where aligning effort with expertise is critical.