Not enough data to create a plot.
Try a different view from the menu above.
Anderson, Ashton
Maia-2: A Unified Model for Human-AI Alignment in Chess
Tang, Zhenwei, Jiao, Difan, McIlroy-Young, Reid, Kleinberg, Jon, Sen, Siddhartha, Anderson, Ashton
There are an increasing number of domains in which artificial intelligence (AI) systems both surpass human ability and accurately model human behavior. This introduces the possibility of algorithmically-informed teaching in these domains through more relatable AI partners and deeper insights into human decision-making. Critical to achieving this goal, however, is coherently modeling human behavior at various skill levels. Chess is an ideal model system for conducting research into this kind of human-AI alignment, with its rich history as a pivotal testbed for AI research, mature superhuman AI systems like AlphaZero, and precise measurements of skill via chess rating systems. Previous work in modeling human decision-making in chess uses completely independent models to capture human style at different skill levels, meaning they lack coherence in their ability to adapt to the full spectrum of human improvement and are ultimately limited in their effectiveness as AI partners and teaching tools. In this work, we propose a unified modeling approach for human-AI alignment in chess that coherently captures human style across different skill levels and directly captures how people improve. Recognizing the complex, non-linear nature of human learning, we introduce a skill-aware attention mechanism to dynamically integrate players' strengths with encoded chess positions, enabling our model to be sensitive to evolving player skill. Our experimental results demonstrate that this unified framework significantly enhances the alignment between AI and human players across a diverse range of expertise levels, paving the way for deeper insights into human decision-making and AI-guided teaching tools. Our implementation is available here.
Large Language Model Agents for Improving Engagement with Behavior Change Interventions: Application to Digital Mindfulness
Kumar, Harsh, Yoo, Suhyeon, Bernuy, Angela Zavaleta, Shi, Jiakai, Luo, Huayin, Williams, Joseph, Kuzminykh, Anastasia, Anderson, Ashton, Kornfield, Rachel
Although engagement in self-directed wellness exercises typically declines over time, integrating social support such as coaching can sustain it. However, traditional forms of support are often inaccessible due to the high costs and complex coordination. Large Language Models (LLMs) show promise in providing human-like dialogues that could emulate social support. Yet, in-depth, in situ investigations of LLMs to support behavior change remain underexplored. We conducted two randomized experiments to assess the impact of LLM agents on user engagement with mindfulness exercises. First, a single-session study, involved 502 crowdworkers; second, a three-week study, included 54 participants. We explored two types of LLM agents: one providing information and another facilitating self-reflection. Both agents enhanced users' intentions to practice mindfulness. However, only the information-providing LLM, featuring a friendly persona, significantly improved engagement with the exercises. Our findings suggest that specific LLM agents may bridge the social support gap in digital health interventions.
Designing Skill-Compatible AI: Methodologies and Frameworks in Chess
Hamade, Karim, McIlroy-Young, Reid, Sen, Siddhartha, Kleinberg, Jon, Anderson, Ashton
Powerful artificial intelligence systems are often used in settings where they must interact with agents that are computationally much weaker, for example when they work alongside humans or operate in complex environments where some tasks are handled by algorithms, heuristics, or other entities of varying computational power. For AI agents to successfully interact in these settings, however, achieving superhuman performance alone is not sufficient; they also need to account for suboptimal actions or idiosyncratic style from their less-skilled counterparts. We propose a formal evaluation framework for assessing the compatibility of near-optimal AI with interaction partners who may have much lower levels of skill; we use popular collaborative chess variants as model systems to study and develop AI agents that can successfully interact with lower-skill entities. Traditional chess engines designed to output near-optimal moves prove to be inadequate partners when paired with engines of various lower skill levels in this domain, as they are not designed to consider the presence of other agents. We contribute three methodologies to explicitly create skill-compatible AI agents in complex decision-making settings, and two chess game frameworks designed to foster collaboration between powerful AI agents and less-skilled partners. On these frameworks, our agents outperform state-of-the-art chess AI (based on AlphaZero) despite being weaker in conventional chess, demonstrating that skill-compatibility is a tangible trait that is qualitatively and measurably distinct from raw performance. Our evaluations further explore and clarify the mechanisms by which our agents achieve skill-compatibility.
ICL Markup: Structuring In-Context Learning using Soft-Token Tags
Brunet, Marc-Etienne, Anderson, Ashton, Zemel, Richard
Large pretrained language models (LLMs) can be rapidly adapted to a wide variety of tasks via a text-to-text approach, where the instruction and input are fed to the model in natural language. Combined with in-context learning (ICL), this paradigm is impressively flexible and powerful. However, it also burdens users with an overwhelming number of choices, many of them arbitrary. Inspired by markup languages like HTML, we contribute a method of using soft-token tags to compose prompt templates. This approach reduces arbitrary decisions and streamlines the application of ICL. Our method is a form of meta-learning for ICL; it learns these tags in advance during a parameter-efficient fine-tuning ``warm-up'' process. The tags can subsequently be used in templates for ICL on new, unseen tasks without any additional fine-tuning. Our experiments with this approach yield promising initial results, improving LLM performance on important enterprise applications such as few-shot and open-world intent detection, as well as text classification in news and legal domains.
Sparsify-then-Classify: From Internal Neurons of Large Language Models To Efficient Text Classifiers
Liu, Yilun, Jiao, Difan, Anderson, Ashton
Among the many tasks that Large Language Models (LLMs) have revolutionized is text classification. However, existing approaches for applying pretrained LLMs to text classification predominantly rely on using single token outputs from only the last layer of hidden states. As a result, they suffer from limitations in efficiency, task-specificity, and interpretability. In our work, we contribute an approach that uses all internal representations by employing multiple pooling strategies on all activation and hidden states. Our novel lightweight strategy, Sparsify-then-Classify (STC) first sparsifies task-specific features layer-by-layer, then aggregates across layers for text classification. STC can be applied as a seamless plug-and-play module on top of existing LLMs. Our experiments on a comprehensive set of models and datasets demonstrate that STC not only consistently improves the classification performance of pretrained and fine-tuned models, but is also more efficient for both training and inference, and is more intrinsically interpretable.
Generating Faithful Synthetic Data with Large Language Models: A Case Study in Computational Social Science
Veselovsky, Veniamin, Ribeiro, Manoel Horta, Arora, Akhil, Josifoski, Martin, Anderson, Ashton, West, Robert
Large Language Models (LLMs) have democratized synthetic data generation, which in turn has the potential to simplify and broaden a wide gamut of NLP tasks. Here, we tackle a pervasive problem in synthetic data generation: its generative distribution often differs from the distribution of real-world data researchers care about (in other words, it is unfaithful). In a case study on sarcasm detection, we study three strategies to increase the faithfulness of synthetic data: grounding, filtering, and taxonomy-based generation. We evaluate these strategies using the performance of classifiers trained with generated synthetic data on real-world data. While all three strategies improve the performance of classifiers, we find that grounding works best for the task at hand. As synthetic data generation plays an ever-increasing role in NLP research, we expect this work to be a stepping stone in improving its utility. We conclude this paper with some recommendations on how to generate high(er)-fidelity synthetic data for specific tasks.
Detecting Individual Decision-Making Style: Exploring Behavioral Stylometry in Chess
McIlroy-Young, Reid, Wang, Russell, Sen, Siddhartha, Kleinberg, Jon, Anderson, Ashton
The advent of machine learning models that surpass human decision-making ability in complex domains has initiated a movement towards building AI systems that interact with humans. Many building blocks are essential for this activity, with a central one being the algorithmic characterization of human behavior. While much of the existing work focuses on aggregate human behavior, an important long-range goal is to develop behavioral models that specialize to individual people and can differentiate among them. To formalize this process, we study the problem of behavioral stylometry, in which the task is to identify a decision-maker from their decisions alone. We present a transformer-based approach to behavioral stylometry in the context of chess, where one attempts to identify the player who played a set of games. Our method operates in a few-shot classification framework, and can correctly identify a player from among thousands of candidate players with 98% accuracy given only 100 labeled games. Even when trained on amateur play, our method generalises to out-of-distribution samples of Grandmaster players, despite the dramatic differences between amateur and world-class players. Finally, we consider more broadly what our resulting embeddings reveal about human style in chess, as well as the potential ethical implications of powerful methods for identifying individuals from behavioral data.
Learning Personalized Models of Human Behavior in Chess
McIlroy-Young, Reid, Wang, Russell, Sen, Siddhartha, Kleinberg, Jon, Anderson, Ashton
Even when machine learning systems surpass human ability in a domain, there are many reasons why AI systems that capture human-like behavior would be desirable: humans may want to learn from them, they may need to collaborate with them, or they may expect them to serve as partners in an extended interaction. Motivated by this goal of human-like AI systems, the problem of predicting human actions -- as opposed to predicting optimal actions -- has become an increasingly useful task. We extend this line of work by developing highly accurate personalized models of human behavior in the context of chess. Chess is a rich domain for exploring these questions, since it combines a set of appealing features: AI systems have achieved superhuman performance but still interact closely with human chess players both as opponents and preparation tools, and there is an enormous amount of recorded data on individual players. Starting with an open-source version of AlphaZero trained on a population of human players, we demonstrate that we can significantly improve prediction of a particular player's moves by applying a series of fine-tuning adjustments. The differences in prediction accuracy between our personalized models and unpersonalized models are at least as large as the differences between unpersonalized models and a simple baseline. Furthermore, we can accurately perform stylometry -- predicting who made a given set of actions -- indicating that our personalized models capture human decision-making at an individual level.
Aligning Superhuman AI with Human Behavior: Chess as a Model System
McIlroy-Young, Reid, Sen, Siddhartha, Kleinberg, Jon, Anderson, Ashton
As artificial intelligence becomes increasingly intelligent---in some cases, achieving superhuman performance---there is growing potential for humans to learn from and collaborate with algorithms. However, the ways in which AI systems approach problems are often different from the ways people do, and thus may be uninterpretable and hard to learn from. A crucial step in bridging this gap between human and artificial intelligence is modeling the granular actions that constitute human behavior, rather than simply matching aggregate human performance. We pursue this goal in a model system with a long history in artificial intelligence: chess. The aggregate performance of a chess player unfolds as they make decisions over the course of a game. The hundreds of millions of games played online by players at every skill level form a rich source of data in which these decisions, and their exact context, are recorded in minute detail. Applying existing chess engines to this data, including an open-source implementation of AlphaZero, we find that they do not predict human moves well. We develop and introduce Maia, a customized version of Alpha-Zero trained on human chess games, that predicts human moves at a much higher accuracy than existing engines, and can achieve maximum accuracy when predicting decisions made by players at a specific skill level in a tuneable way. For a dual task of predicting whether a human will make a large mistake on the next move, we develop a deep neural network that significantly outperforms competitive baselines. Taken together, our results suggest that there is substantial promise in designing artificial intelligence systems with human collaboration in mind by first accurately modeling granular human decision-making.
Understanding the Origins of Bias in Word Embeddings
Brunet, Marc-Etienne, Alkalay-Houlihan, Colleen, Anderson, Ashton, Zemel, Richard
The power of machine learning systems not only promises great technical progress, but risks societal harm. As a recent example, researchers have shown that popular word embedding algorithms exhibit stereotypical biases, such as gender bias. The widespread use of these algorithms in machine learning systems, from automated translation services to curriculum vitae scanners, can amplify stereotypes in important contexts. Although methods have been developed to measure these biases and alter word embeddings to mitigate their biased representations, there is a lack of understanding in how word embedding bias depends on the training data. In this work, we develop a technique for understanding the origins of bias in word embeddings. Given a word embedding trained on a corpus, our method identifies how perturbing the corpus will affect the bias of the resulting embedding. This can be used to trace the origins of word embedding bias back to the original training documents. Using our method, one can investigate trends in the bias of the underlying corpus and identify subsets of documents whose removal would most reduce bias. We demonstrate our techniques on both a New York Times and Wikipedia corpus and find that our influence function-based approximations are extremely accurate.