AITopics | transience

Collaborating Authors

transience

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

58692a1701314e09cbd7a5f5f3871cc9-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 03:02:36 GMT

large language model, machine learning, transience, (19 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

The Transient Nature of Emergent In-Context Learning in Transformers Aaditya K. Singh Gatsby Unit, UCL Stephanie C.Y. Chan

Neural Information Processing SystemsOct-10-2025, 23:18:49 GMT

Transformer neural networks can exhibit a surprising capacity for in-context learning (ICL) despite not being explicitly trained for it.

large language model, machine learning, transience, (19 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Predictability Shapes Adaptation: An Evolutionary Perspective on Modes of Learning in Transformers

Ku, Alexander Y., Griffiths, Thomas L., Chan, Stephanie C. Y.

arXiv.org Artificial IntelligenceMay-16-2025

Transformer models learn in two distinct modes: in-weights learning (IWL), encoding knowledge into model weights, and in-context learning (ICL), adapting flexibly to context without weight modification. To better understand the interplay between these learning modes, we draw inspiration from evolutionary biology's analogous adaptive strategies: genetic encoding (akin to IWL, adapting over generations and fixed within an individual's lifetime) and phenotypic plasticity (akin to ICL, enabling flexible behavioral responses to environmental cues). In evolutionary biology, environmental predictability dictates the balance between these strategies: stability favors genetic encoding, while reliable predictive cues promote phenotypic plasticity. We experimentally operationalize these dimensions of predictability and systematically investigate their influence on the ICL/IWL balance in Transformers. Using regression and classification tasks, we show that high environmental stability decisively favors IWL, as predicted, with a sharp transition at maximal stability. Conversely, high cue reliability enhances ICL efficacy, particularly when stability is low. Furthermore, learning dynamics reveal task-contingent temporal evolution: while a canonical ICL-to-IWL shift occurs in some settings (e.g., classification with many classes), we demonstrate that scenarios with easier IWL (e.g., fewer classes) or slower ICL acquisition (e.g., regression) can exhibit an initial IWL phase later yielding to ICL dominance. These findings support a relative-cost hypothesis for explaining these learning mode transitions, establishing predictability as a critical factor governing adaptive strategies in Transformers, and offering novel insights for understanding ICL and guiding training methodologies.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.09855

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

Add feedback

Strategy Coopetition Explains the Emergence and Transience of In-Context Learning

Singh, Aaditya K., Moskovitz, Ted, Dragutinovic, Sara, Hill, Felix, Chan, Stephanie C. Y., Saxe, Andrew M.

arXiv.org Artificial IntelligenceMar-10-2025

In-context learning (ICL) is a powerful ability that emerges in transformer models, enabling them to learn from context without weight updates. Recent work has established emergent ICL as a transient phenomenon that can sometimes disappear after long training times. In this work, we sought a mechanistic understanding of these transient dynamics. Firstly, we find that, after the disappearance of ICL, the asymptotic strategy is a remarkable hybrid between in-weights and in-context learning, which we term "context-constrained in-weights learning" (CIWL). CIWL is in competition with ICL, and eventually replaces it as the dominant strategy of the model (thus leading to ICL transience). However, we also find that the two competing strategies actually share sub-circuits, which gives rise to cooperative dynamics as well. For example, in our setup, ICL is unable to emerge quickly on its own, and can only be enabled through the simultaneous slow development of asymptotic CIWL. CIWL thus both cooperates and competes with ICL, a phenomenon we term "strategy coopetition." We propose a minimal mathematical model that reproduces these key dynamics and interactions. Informed by this model, we were able to identify a setup where ICL is truly emergent and persistent.

icl, sequence, transience, (15 more...)

arXiv.org Artificial Intelligence

2503.05631

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)

Genre: Research Report (0.65)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

Add feedback

Scopes of Alignment

Varshney, Kush R., Ashktorab, Zahra, Bouneffouf, Djallel, Riemer, Matthew, Weisz, Justin D.

arXiv.org Artificial IntelligenceJan-14-2025

Much of the research focus on AI alignment seeks to align large language models and other foundation models to the context-less and generic values of helpfulness, harmlessness, and honesty. Frontier model providers also strive to align their models with these values. In this paper, we motivate why we need to move beyond such a limited conception and propose three dimensions for doing so. The first scope of alignment is competence: knowledge, skills, or behaviors the model must possess to be useful for its intended purpose. The second scope of alignment is transience: either semantic or episodic depending on the context of use. The third scope of alignment is audience: either mass, public, small-group, or dyadic. At the end of the paper, we use the proposed framework to position some technologies and workflows that go beyond prevailing notions of alignment.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.12405

Country:

North America > United States (0.28)
Asia > China (0.05)
Europe > Switzerland (0.04)

Genre: Research Report (0.70)

Industry:

Law (1.00)
Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

Add feedback

Differential learning kinetics govern the transition from memorization to generalization during in-context learning

Nguyen, Alex, Reddy, Gautam

arXiv.org Artificial IntelligenceDec-12-2024

Transformers exhibit in-context learning (ICL): the ability to use novel information presented in the context without additional weight updates. Recent work shows that ICL emerges when models are trained on a sufficiently diverse set of tasks and the transition from memorization to generalization is sharp with increasing task diversity. One interpretation is that a network's limited capacity to memorize favors generalization. Here, we examine the mechanistic underpinnings of this transition using a small transformer applied to a synthetic ICL task. Using theory and experiment, we show that the sub-circuits that memorize and generalize can be viewed as largely independent. The relative rates at which these sub-circuits learn explains the transition from memorization to generalization, rather than capacity constraints. We uncover a memorization scaling law, which determines the task diversity threshold at which the network generalizes. The theory quantitatively explains a variety of other ICL-related phenomena, including the long-tailed distribution of when ICL is acquired, the bimodal behavior of solutions close to the task diversity threshold, the influence of contextual and data distributional statistics on ICL, and the transient nature of ICL.

icl, sequence, transition, (14 more...)

arXiv.org Artificial Intelligence

2412.00104

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting

Anand, Suraj, Lepori, Michael A., Merullo, Jack, Pavlick, Ellie

arXiv.org Artificial IntelligenceJul-1-2024

Language models have the ability to perform in-context learning (ICL), allowing them to flexibly adapt their behavior based on context. This contrasts with in-weights learning, where information is statically encoded in model parameters from iterated observations of the data. Despite this apparent ability to learn in-context, language models are known to struggle when faced with unseen or rarely seen tokens. Hence, we study $\textbf{structural in-context learning}$, which we define as the ability of a model to execute in-context learning on arbitrary tokens -- so called because the model must generalize on the basis of e.g. sentence structure or task structure, rather than semantic content encoded in token embeddings. An ideal model would be able to do both: flexibly deploy in-weights operations (in order to robustly accommodate ambiguous or unknown contexts using encoded semantic information) and structural in-context operations (in order to accommodate novel tokens). We study structural in-context algorithms in a simple part-of-speech setting using both practical and toy models. We find that active forgetting, a technique that was recently introduced to help models generalize to new languages, forces models to adopt structural in-context learning solutions. Finally, we introduce $\textbf{temporary forgetting}$, a straightforward extension of active forgetting that enables one to control how much a model relies on in-weights vs. in-context solutions. Importantly, temporary forgetting allows us to induce a $\textit{dual process strategy}$ where in-context and in-weights solutions coexist within a single model.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2406.00053

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Education (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Add feedback

The Transient Nature of Emergent In-Context Learning in Transformers

Singh, Aaditya K., Chan, Stephanie C. Y., Moskovitz, Ted, Grant, Erin, Saxe, Andrew M., Hill, Felix

arXiv.org Artificial IntelligenceDec-11-2023

Transformer neural networks can exhibit a surprising capacity for in-context learning (ICL) despite not being explicitly trained for it. Prior work has provided a deeper understanding of how ICL emerges in transformers, e.g., through the lens of mechanistic interpretability, Bayesian inference, or by examining the distributional properties of training data. However, in each of these cases, ICL is treated largely as a persistent phenomenon; namely, once ICL emerges, it is assumed to persist asymptotically. Here, we show that the emergence of ICL during transformer training is, in fact, often transient. We train transformers on synthetic data designed so that both ICL and in-weights learning (IWL) strategies can lead to correct predictions. We find that ICL first emerges, then disappears and gives way to IWL, all while the training loss decreases, indicating an asymptotic preference for IWL. The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models. We find that L2 regularization may offer a path to more persistent ICL that removes the need for early stopping based on ICL-style validation tasks. Finally, we present initial evidence that ICL transience may be caused by competition between ICL and IWL circuits.

icl, icl transience, transience, (16 more...)

arXiv.org Artificial Intelligence

2311.0836

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Transience, Replication, and the Paradox of Social Robotics

RobohubNov-4-2019, 19:01:29 GMT

An Art, Technology, and Culture Colloquium, co-sponsored by the Center for New Music and Audio Technologies and CITRIS People and Robots (CPAR), presented with Berkeley Arts Design as part of Arts Design Mondays. As we continue to develop social robots designed for connectedness, we struggle with paradoxes related to authenticity, transience, and replication. In this talk, I will attempt to link together 15 years of experience designing social robots with 100-year-old texts on transience, replication, and the fear of dying. Can there be meaningful relationships with robots who do not suffer natural decay? What would our families look like if we all choose to buy identical robotic family members?

citris people, citris people and robot, co-sponsored, (9 more...)

Robohub

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.06)
Oceania > New Zealand (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
(4 more...)

Industry:

Media (0.52)
Leisure & Entertainment (0.37)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.41)

Add feedback