Gerstgrasser, Matthias
Collapse or Thrive? Perils and Promises of Synthetic Data in a Self-Generating World
Kazdan, Joshua, Schaeffer, Rylan, Dey, Apratim, Gerstgrasser, Matthias, Rafailov, Rafael, Donoho, David L., Koyejo, Sanmi
The increasing presence of AI-generated content on the internet raises a critical question: what happens when generative machine learning models are pretrained on web-scale datasets containing data created by earlier models? Some authors prophesy model collapse under a 'replace' scenario: a sequence of models, the first trained with real data and each later one trained only on synthetic data from its preceding model. In this scenario, models successively degrade. Others see collapse as avoidable; in an 'accumulate' scenario, a sequence of models is trained, but each training uses all real and synthetic data generated so far. In this work, we deepen and extend the study of these contrasting scenarios. First, collapse versus avoidance of collapse is studied by comparing the replace and accumulate scenarios in each of three prominent generative modeling settings; we find that the same contrast emerges in all three. Second, we study a compromise scenario in which the available data remain the same as in the accumulate scenario but, unlike accumulate and like replace, each model is trained using a fixed compute budget; we demonstrate that model test loss on real data is larger than in the accumulate scenario but apparently plateaus, unlike the divergence seen with replace. Third, we study the relative importance of the cardinality and proportion of real data for avoiding model collapse. Surprisingly, we find a non-trivial interaction between real and synthetic data, where the value of synthetic data for reducing test loss depends on the absolute quantity of real data. Our insights are particularly important when forecasting whether future frontier generative models will collapse or thrive, and our results open avenues for empirically and mathematically studying the context-dependent value of synthetic data.
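As a rough illustration of the three scenarios (a toy sketch, not the paper's experiments), one can iterate a one-dimensional Gaussian "generative model" and track its test loss on held-out real data; the scenario names follow the abstract, everything else below is an assumption:

    import numpy as np

    def fit(data):
        # maximum-likelihood fit of a 1-D Gaussian
        return data.mean(), data.std()

    def nll(mu, sigma, x):
        # average negative log-likelihood on held-out real data
        return np.mean(0.5 * ((x - mu) / sigma) ** 2 + np.log(sigma) + 0.5 * np.log(2 * np.pi))

    def run(scenario, iters=20, n=1000, seed=0):
        rng = np.random.default_rng(seed)
        real = rng.normal(0.0, 1.0, size=n)       # "real" data from N(0, 1)
        test = rng.normal(0.0, 1.0, size=10_000)  # held-out real data
        pool = real
        mu, sigma = fit(real)
        for _ in range(iters):
            synth = rng.normal(mu, sigma, size=n)
            if scenario == "replace":
                train = synth                                  # newest synthetic data only
            elif scenario == "accumulate":
                pool = np.concatenate([pool, synth])
                train = pool                                   # everything generated so far
            else:  # "accumulate-subsample": same pool, fixed compute budget
                pool = np.concatenate([pool, synth])
                train = rng.choice(pool, size=n, replace=False)
            mu, sigma = fit(train)
        return nll(mu, sigma, test)

    for s in ("replace", "accumulate", "accumulate-subsample"):
        print(s, round(run(s), 3))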
Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data
Gerstgrasser, Matthias, Schaeffer, Rylan, Dey, Apratim, Rafailov, Rafael, Sleight, Henry, Hughes, John, Korbak, Tomasz, Agrawal, Rajashree, Pai, Dhruv, Gromov, Andrey, Roberts, Daniel A., Yang, Diyi, Donoho, David L., Koyejo, Sanmi
The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent investigations into model-data feedback loops discovered that such loops can lead to model collapse, a phenomenon where performance progressively degrades with each model-fitting iteration until the latest model becomes useless. However, several recent papers studying model collapse assumed that new data replace old data over time, rather than that data accumulate. In this paper, we compare these two settings and show that accumulating data prevents model collapse. We begin by studying an analytically tractable setup in which a sequence of linear models is fit to the previous models' predictions. Previous work showed that if data are replaced, the test error increases linearly with the number of model-fitting iterations; we extend this result by proving that if data instead accumulate, the test error has a finite upper bound independent of the number of iterations. We next empirically test whether accumulating data similarly prevents model collapse by pretraining sequences of language models on text corpora. We confirm that replacing data does indeed cause model collapse, then demonstrate that accumulating data prevents it; these results hold across a range of model sizes, architectures, and hyperparameters. We further show that similar results hold for other deep generative models on real data: diffusion models for molecule generation and variational autoencoders for image generation. Our work provides consistent theoretical and empirical evidence that data accumulation mitigates model collapse.
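The analytically tractable linear setup is easy to simulate; the sketch below uses arbitrary dimensions and noise levels (assumptions, not the paper's values) and reproduces the qualitative contrast: test error grows with iterations under replace, stays bounded under accumulate:

    import numpy as np

    rng = np.random.default_rng(1)
    d, n, iters, noise = 10, 100, 50, 0.5
    w_true = rng.normal(size=d)
    X_test = rng.normal(size=(2000, d))
    y_test = X_test @ w_true

    def ols(X, y):
        # ordinary least squares fit
        return np.linalg.lstsq(X, y, rcond=None)[0]

    for accumulate in (False, True):
        X0 = rng.normal(size=(n, d))
        y0 = X0 @ w_true + noise * rng.normal(size=n)
        w = ols(X0, y0)
        Xs, ys = [X0], [y0]
        for _ in range(iters):
            X_new = rng.normal(size=(n, d))
            y_new = X_new @ w + noise * rng.normal(size=n)  # labels come from the previous model
            if accumulate:
                Xs.append(X_new); ys.append(y_new)
                w = ols(np.vstack(Xs), np.concatenate(ys))  # refit on all data so far
            else:
                w = ols(X_new, y_new)                       # replace: newest data only
        print("accumulate" if accumulate else "replace",
              np.mean((X_test @ w - y_test) ** 2))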
Grounding or Guesswork? Large Language Models are Presumptive Grounders
Shaikh, Omar, Gligorić, Kristina, Khetan, Ashna, Gerstgrasser, Matthias, Yang, Diyi, Jurafsky, Dan
Effective conversation requires common ground: a shared understanding between the participants. Common ground, however, does not emerge spontaneously in conversation. Speakers and listeners work together to both identify and construct a shared basis while avoiding misunderstanding. To accomplish grounding, humans rely on a range of dialogue acts, like clarification (What do you mean?) and acknowledgment (I understand.). In domains like teaching and emotional support, carefully constructed grounding prevents misunderstanding. However, it is unclear whether large language models (LLMs) leverage these dialogue acts in constructing common ground. To this end, we curate a set of grounding acts and propose corresponding metrics that quantify attempted grounding. We study whether LLMs use these grounding acts by simulating them taking turns in several dialogue datasets and comparing the results to humans. We find that current LLMs are presumptive grounders, biased towards assuming common ground without using grounding acts. To understand the roots of this behavior, we examine the role of instruction tuning and reinforcement learning from human feedback (RLHF), finding that RLHF leads to less grounding. Altogether, our work highlights the need for more research investigating grounding in human-AI interaction.
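As a toy proxy for what a grounding-act metric might look like (the paper's curated taxonomy and measurements are richer; the patterns below are invented for illustration only):

    import re

    # Invented stand-ins for two grounding acts from the paper's taxonomy.
    CLARIFICATION = re.compile(r"\b(what do you mean|could you clarify|do you mean)\b", re.I)
    ACKNOWLEDGMENT = re.compile(r"\b(i see|i understand|got it|that makes sense)\b", re.I)

    def grounding_rate(turns):
        # fraction of turns containing at least one grounding act
        hits = sum(bool(CLARIFICATION.search(t) or ACKNOWLEDGMENT.search(t)) for t in turns)
        return hits / max(len(turns), 1)

    turns = ["Got it, so you want a refund?",
             "Here is the answer.",
             "What do you mean by 'soon'?"]
    print(grounding_rate(turns))  # 2 of 3 turns attempt grounding

A presumptive grounder, in these terms, would score near zero: it answers without ever checking that a shared basis exists.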
Stackelberg POMDP: A Reinforcement Learning Approach for Economic Design
Brero, Gianluca, Eden, Alon, Chakrabarti, Darshan, Gerstgrasser, Matthias, Greenwald, Amy, Li, Vincent, Parkes, David C.
We introduce a reinforcement learning framework for economic design in which the interaction between the environment designer and the participants is modeled as a Stackelberg game. In this game, the designer (leader) sets the rules of the economic system, while the participants (followers) respond strategically. We integrate algorithms for determining the followers' response strategies into the leader's learning environment, yielding a formulation of the leader's learning problem as a POMDP that we call the Stackelberg POMDP. We prove that the leader's optimal strategy in the Stackelberg game is the optimal policy in our Stackelberg POMDP under a limited set of possible policies, establishing a connection between solving POMDPs and solving Stackelberg games. We solve our POMDP under this limited policy set via the centralized training with decentralized execution framework. For the specific case of followers modeled as no-regret learners, we solve an array of increasingly complex settings, including problems of indirect mechanism design with turn-taking and limited communication between agents. We demonstrate the effectiveness of our training framework through ablation studies. We also give convergence results for no-regret learners to a Bayesian version of a coarse correlated equilibrium, extending known results to the case of correlated types.
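The central construction, running the followers' response algorithm inside the leader's environment, can be sketched with a multiplicative-weights (no-regret) follower in a made-up 2x2 game; the payoffs and the naive grid search over leader commitments are illustrative assumptions, not the paper's POMDP machinery:

    import numpy as np

    L = np.array([[1.0, 3.0], [2.0, 0.0]])  # leader payoffs (made-up numbers)
    F = np.array([[1.0, 0.0], [0.0, 1.0]])  # follower payoffs (made-up numbers)

    def follower_response(row_probs, steps=200, lr=0.5):
        # no-regret (multiplicative-weights) follower, run inside the leader's episode
        w = np.ones(2)
        for _ in range(steps):
            payoff = row_probs @ F  # expected follower payoff per column
            w *= np.exp(lr * payoff)
            w /= w.sum()
        return w                    # follower's mixed strategy over columns

    def leader_value(p):
        row = np.array([p, 1.0 - p])
        return row @ L @ follower_response(row)

    # Naive stand-in for the leader's policy optimization.
    best = max(np.linspace(0.0, 1.0, 101), key=leader_value)
    print("leader's best commitment to row 0:", best)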
Selectively Sharing Experiences Improves Multi-Agent Reinforcement Learning
Gerstgrasser, Matthias, Danino, Tom, Keren, Sarah
We present a novel multi-agent RL approach, Selective Multi-Agent Prioritized Experience Relay, in which agents share with other agents a limited number of transitions they observe during training. The intuition behind this is that even a small number of relevant experiences from other agents could help each agent learn. Unlike many other multi-agent RL algorithms, this approach allows for largely decentralized training, requiring only a limited communication channel between agents. We show that our approach outperforms baseline no-sharing decentralized training and state-of-the-art multi-agent RL algorithms. Further, sharing only a small number of highly relevant experiences outperforms sharing all experiences between agents, and the performance uplift from selective experience sharing is robust across a range of hyperparameters and DQN variants. A reference implementation of our algorithm is available at https://github.com/mgerstgrasser/super.
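A minimal sketch of the sharing step, assuming priority-tagged replay buffers and an all-to-all broadcast of each agent's top-k transitions; the linked repository is the authoritative implementation:

    import heapq, random
    from collections import deque

    class Agent:
        # toy agent: a replay buffer of (priority, transition) pairs;
        # priorities stand in for the prioritized-replay signal used in the paper
        def __init__(self, capacity=10_000):
            self.buffer = deque(maxlen=capacity)

        def observe(self, transition, priority):
            self.buffer.append((priority, transition))

        def top_k(self, k):
            # share only the k most "surprising" transitions
            return heapq.nlargest(k, self.buffer, key=lambda pair: pair[0])

    def share_experiences(agents, k=8):
        outgoing = [(a, a.top_k(k)) for a in agents]  # snapshot before inserting
        for sender, batch in outgoing:
            for receiver in agents:
                if receiver is not sender:
                    for priority, transition in batch:
                        receiver.observe(transition, priority)

    agents = [Agent() for _ in range(3)]
    for a in agents:
        for _ in range(100):
            a.observe(("s", "a", 0.0, "s_next"), random.random())
    share_experiences(agents)
    print(len(agents[0].buffer))  # 100 own + 2 * 8 relayed transitions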
Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning
Gerstgrasser, Matthias, Parkes, David C.
Stackelberg equilibria arise naturally in a range of popular learning problems, such as in security games or indirect mechanism design, and have received increasing attention in the reinforcement learning literature. We present a general framework for implementing Stackelberg equilibrium search as a multi-agent RL problem, allowing a wide range of algorithmic design choices. We discuss how previous approaches can be seen as specific instantiations of this framework. As a key insight, we note that the design space allows for approaches not previously seen in the literature, for instance by leveraging multitask and meta-RL techniques for follower convergence. We propose one such approach using contextual policies, and evaluate it experimentally on both standard and novel benchmark domains, showing greatly improved sample efficiency compared to previous approaches. Finally, we explore the effect of adopting algorithm designs outside the borders of our framework.
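One way to picture the contextual-policy idea: meta-train a single follower model that maps a description of the leader's strategy to a response, then let the leader optimize against that cheap oracle. The 2x2 game, the exact-argmax training labels, and the logistic model below are stand-ins for the paper's meta-RL machinery:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)
    L = np.array([[1.0, 3.0], [2.0, 0.0]])  # leader payoffs (made-up)
    F = np.array([[1.0, 0.0], [0.0, 1.0]])  # follower payoffs (made-up)

    # "Meta-train" a contextual follower: sample leader commitments p and
    # record the follower's best response; exact argmax stands in for RL.
    contexts = rng.uniform(size=(500, 1))
    responses = np.array([int(np.argmax(np.array([p, 1 - p]) @ F)) for (p,) in contexts])
    follower = LogisticRegression().fit(contexts, responses)

    # The leader now optimizes against the cheap contextual oracle.
    grid = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
    values = [np.array([p, 1 - p]) @ L[:, b] for (p,), b in zip(grid, follower.predict(grid))]
    print("best commitment to row 0:", grid[int(np.argmax(values))][0])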
Collaboration Promotes Group Resilience in Multi-Agent AI
Keren, Sarah, Gerstgrasser, Matthias, Abu, Ofir, Rosenschein, Jeffrey
Reinforcement Learning (RL) agents are typically required to operate in dynamic environments and must develop an ability to quickly adapt to unexpected perturbations. Promoting this ability is hard even in single-agent settings (Padakandla 2020). For a group it is more challenging still: in addition to the dynamic nature of the environment, agents need to deal with the high variance caused by changes in the behavior of other agents. Unsurprisingly, many recent Multi-Agent RL (MARL) works have shown the beneficial effect that collaboration between agents has on their performance (Xu, Rao, and Bu 2012; Foerster et al. 2016; Lowe et al. 2017; Qian et al. 2019; Jaques et al. 2019; Christianos, Schäfer, and Albrecht 2020). Our objective is to highlight the relationship between a group's ability to collaborate effectively and the group's resilience, which we measure as the group's ability to adapt to perturbations in the environment. Thus, agents that collaborate not only increase their expected utility in a given environment, but are also able to recover a larger fraction of their previous performance after a perturbation occurs. Contrary to investigations of transfer learning (Zhu, Lin, and Zhou 2020; Liang and Li 2020) or curriculum learning (Portelas et al. 2020), we do not assume a stationary target domain in which the agents are ultimately expected to perform.
Reinforcement Learning of Simple Indirect Mechanisms
Brero, Gianluca, Eden, Alon, Gerstgrasser, Matthias, Parkes, David C., Rheingans-Yoo, Duncan
Over the last fifty years, a large body of research in microeconomics has introduced many different mechanisms for resource allocation. Despite the wide variety of available options, "simple" mechanisms such as posted price and serial dictatorship are often preferred for practical applications, including housing allocation [Abdulkadiroğlu and Sönmez, 1998], online procurement [Badanidiyuru et al., 2012], and the allocation of medical appointments [Klaus and Nichifor, 2019]. There has been considerable interest in formalizing different notions of simplicity. Li [2017] identifies mechanisms that are particularly simple from a strategic perspective, introducing the concept of obviously strategyproof mechanisms; under an obviously strategyproof mechanism, it is obvious that an agent cannot profit by trying to game the system, as even the worst possible final outcome from behaving truthfully is at least as good as the best possible outcome from any other strategy. Pycia and Troyan [2019] introduce the still stronger concept of strongly obviously strategyproof (SOSP) mechanisms, and show that this class can essentially be identified with sequential price mechanisms, in which agents are visited in turn and offered a choice from a menu of options (which may or may not include transfers). SOSP mechanisms are ones in which an agent need not even consider her future (truthful) actions to understand that the mechanism is obviously strategyproof.
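A sequential price mechanism of the kind described above can be sketched directly; restricting menus to take-it-or-leave-it item prices, and all the numbers, are illustrative assumptions:

    def sequential_price_mechanism(agent_values, prices):
        # agent_values: one dict per agent mapping item -> value;
        # prices: item -> posted price. Purely illustrative.
        remaining = dict(prices)
        allocation = {}
        for i, values in enumerate(agent_values):
            options = [(values.get(item, 0.0) - p, item) for item, p in remaining.items()]
            options.append((0.0, None))  # outside option: buy nothing
            utility, choice = max(options, key=lambda t: t[0])
            if choice is not None and utility > 0:
                allocation[i] = (choice, remaining.pop(choice))
        return allocation

    print(sequential_price_mechanism(
        [{"a": 3.0, "b": 1.0}, {"a": 2.5, "b": 2.0}],
        {"a": 2.0, "b": 1.5}))
    # -> {0: ('a', 2.0), 1: ('b', 1.5)}

Truth-telling here is trivially optimal turn by turn: each agent simply takes its best remaining offer, with no need to reason about other agents' reports or its own future moves.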
Riemannian tangent space mapping and elastic net regularization for cost-effective EEG markers of brain atrophy in Alzheimer's disease
Fruehwirt, Wolfgang, Gerstgrasser, Matthias, Zhang, Pengfei, Weydemann, Leonard, Waser, Markus, Schmidt, Reinhold, Benke, Thomas, Dal-Bianco, Peter, Ransmayr, Gerhard, Grossegger, Dieter, Garn, Heinrich, Peters, Gareth W., Roberts, Stephen, Dorffner, Georg
The diagnosis of Alzheimer's disease (AD) in routine clinical practice is most commonly based on subjective clinical interpretations. Quantitative electroencephalography (QEEG) measures have been shown to reflect neurodegenerative processes in AD and might qualify as affordable and thereby widely available markers to facilitate the objectivization of AD assessment. Here, we present a novel framework combining Riemannian tangent space mapping and elastic net regression for the development of brain atrophy markers. While most AD QEEG studies are based on small sample sizes and psychological test scores as outcome measures, here we train and test our models using data of one of the largest prospective EEG AD trials ever conducted, including MRI biomarkers of brain atrophy.
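A plausible shape for such a pipeline, assuming the pyriemann and scikit-learn libraries and entirely synthetic stand-in data (a sketch, not the authors' code):

    import numpy as np
    from sklearn.linear_model import ElasticNetCV
    from sklearn.pipeline import make_pipeline
    from pyriemann.estimation import Covariances
    from pyriemann.tangentspace import TangentSpace

    # X: EEG epochs (n_epochs, n_channels, n_samples); y: an MRI atrophy measure.
    rng = np.random.default_rng(4)
    X = rng.normal(size=(80, 19, 256))  # hypothetical 19-channel recordings
    y = rng.normal(size=80)             # hypothetical atrophy scores

    model = make_pipeline(
        Covariances(estimator="oas"),    # SPD covariance matrix per epoch
        TangentSpace(metric="riemann"),  # map SPD matrices to a flat tangent space
        ElasticNetCV(cv=5),              # sparse linear regression on tangent vectors
    )
    model.fit(X, y)
    print(model.predict(X[:3]))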