in-context
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > United Kingdom > England (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Frog Soup: Zero-Shot, In-Context, and Sample-Efficient Frogger Agents
Li, Xiang, Hao, Yiyang, Fulop, Doug
RL game-playing agents are traditionally initialized with zero pre-existing knowledge about a specific game environment and learn to play through millions of interactions with it. Significant time and compute are often spent exploring states that will never be visited under high-scoring policies. Exploration is particularly challenging in environments that require long-horizon action sequences and provide sparse rewards, such as Atari games and real-world robotics challenges, where the state space is too large to sample effectively through free-form exploration. In this paper we explore whether pretrained general RL agents such as reasoning LLMs can play Atari games, and we investigate ways to leverage pretrained RL agents to reduce the number of samples needed to train smaller agents from scratch. We first explore whether the contextual under-
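The sample-efficiency idea in the abstract above can be sketched with a toy chain MDP: a pretrained teacher policy (standing in for a reasoning LLM queried for actions) seeds the replay buffer of a from-scratch learner, so no interactions are wasted on unrewarding states. The environment, both policies, and the buffer-seeding scheme are illustrative assumptions, not the paper's actual setup.

```python
import random

def rollout(policy, length=10, horizon=20):
    """Collect one episode in a toy chain MDP: start at state 0,
    reward is given only on reaching state `length`."""
    state, transitions = 0, []
    for _ in range(horizon):
        action = policy(state)                       # +1 (right) or -1 (left)
        next_state = max(0, min(length, state + action))
        reward = 1.0 if next_state == length else 0.0
        transitions.append((state, action, reward, next_state))
        state = next_state
        if reward > 0:
            break
    return transitions

# A "pretrained" teacher that already knows the high-scoring policy.
teacher = lambda s: +1
# Free-form exploration baseline: rarely reaches the reward in time.
random_explorer = lambda s: random.choice([-1, +1])

# Seed the student's replay buffer with teacher rollouts: every stored
# transition lies on the rewarding trajectory, unlike the random baseline.
replay = [t for _ in range(5) for t in rollout(teacher)]
baseline = rollout(random_explorer)
```

Every teacher episode reaches the goal in exactly ten steps, so the seeded buffer contains only on-policy, reward-bearing experience.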
- Workflow (0.91)
- Research Report > New Finding (0.67)
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection
Spliethöver, Maximilian, Knebler, Tim, Fumagalli, Fabian, Muschalik, Maximilian, Hammer, Barbara, Hüllermeier, Eyke, Wachsmuth, Henning
Recent advances in instruction fine-tuning have led to the development of various prompting techniques for large language models, such as explicit reasoning steps. However, the success of these techniques depends on various parameters, such as the task, the language model, and the context provided. Finding an effective prompt is therefore often a trial-and-error process. Most existing approaches to automatic prompting aim to optimize individual techniques rather than compositions of techniques and their dependence on the input. To fill this gap, we propose an adaptive prompting approach that predicts the optimal prompt composition ad-hoc for a given input. We apply our approach to social bias detection, a highly context-dependent task that requires semantic understanding. We evaluate it with three large language models on three datasets, comparing compositions to individual techniques and other baselines. The results underline the importance of finding an effective prompt composition. Our approach robustly ensures high detection performance and is best in several settings. Moreover, initial experiments on other tasks support its generalizability.
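A minimal sketch of per-input prompt composition as described above. The technique names, the rule-based selector, and the prompt text are all hypothetical stand-ins: the paper learns the selector from data, whereas this sketch hard-codes one on surface features purely to show the shape of the idea.

```python
# Hypothetical prompt-technique building blocks (illustrative names only).
TECHNIQUES = {
    "definition": "First, define what counts as social bias.",
    "reasoning":  "Think step by step before answering.",
    "examples":   "Here are two labeled examples: ...",
}

def predict_composition(text):
    """Stand-in for the learned per-input selector: a simple rule
    on surface features instead of a trained predictor."""
    chosen = ["definition"]
    if len(text.split()) > 15:          # long inputs get explicit reasoning steps
        chosen.append("reasoning")
    if "?" not in text:                 # declarative inputs get few-shot examples
        chosen.append("examples")
    return chosen

def compose_prompt(text):
    """Assemble the selected techniques into one prompt for the input."""
    parts = [TECHNIQUES[name] for name in predict_composition(text)]
    return "\n".join(parts + [f"Input: {text}", "Label (biased / not biased):"])
```

Different inputs thus receive different compositions rather than one fixed prompt, which is the core contrast with per-technique optimization.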
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Singapore (0.04)
- (13 more...)
In-context denoising with one-layer transformers: connections between attention and associative memory retrieval
Smart, Matthew, Bietti, Alberto, Sengupta, Anirvan M.
We introduce in-context denoising, a task that refines the connection between attention-based architectures and dense associative memory (DAM) networks, also known as modern Hopfield networks. Using a Bayesian framework, we show theoretically and empirically that certain restricted denoising problems can be solved optimally even by a single-layer transformer. We demonstrate that a trained attention layer processes each denoising prompt by performing a single gradient descent update on a context-aware DAM energy landscape, where context tokens serve as associative memories and the query token acts as an initial state. This one-step update yields better solutions than exact retrieval of either a context token or a spurious local minimum, providing a concrete example of DAM networks extending beyond the standard retrieval paradigm. Overall, this work solidifies the link between associative memory and attention mechanisms first identified by Ramsauer et al., and demonstrates the relevance of associative memory models in the study of in-context learning.
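The correspondence claimed above can be checked numerically in a few lines: a single softmax attention readout over context tokens is exactly one unit-step gradient descent update on the dense-associative-memory energy, and the update does not increase that energy. The dimensions, the inverse temperature beta, and the random data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 2.0
M = rng.normal(size=(8, 4))      # context tokens = stored memories (rows)
q = rng.normal(size=4)           # query token = initial state

def energy(x):
    # Dense associative memory (modern Hopfield) energy over the context.
    return -np.log(np.exp(beta * M @ x).sum()) / beta + 0.5 * x @ x

# One attention step: softmax similarities to the context tokens, then a
# convex combination of them. Since grad E(x) = x - M.T @ softmax(beta M x),
# this is exactly x - grad E(x): one gradient descent update with step 1.
attn = np.exp(beta * M @ q)
attn /= attn.sum()
q_new = M.T @ attn
```

The update lands between exact retrieval of a single memory and the raw query, which is the "beyond standard retrieval" behavior the abstract highlights.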
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Austria (0.04)
- Information Technology > Artificial Intelligence > Systems & Languages > Programming Languages (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- (2 more...)
In-context learning for medical image segmentation
Takaya, Eichi, Yamamoto, Shinnosuke
Annotation of medical images, such as MRI and CT scans, is crucial for evaluating treatment efficacy and planning radiotherapy. However, the extensive workload of medical professionals limits their ability to annotate large image datasets, posing a bottleneck for AI applications in medical imaging. To address this, we propose In-context Cascade Segmentation (ICS), a novel method that minimizes annotation requirements while achieving high segmentation accuracy for sequential medical images. ICS builds on the UniverSeg framework, which performs few-shot segmentation using support images without additional training. By iteratively adding the inference results of each slice to the support set, ICS propagates information forward and backward through the sequence, ensuring inter-slice consistency. We evaluate the proposed method on the HVSMR dataset, which includes segmentation tasks for eight cardiac regions. Experimental results demonstrate that ICS significantly improves segmentation performance in complex anatomical regions, particularly in maintaining boundary consistency across slices, compared to baseline methods. The study also highlights the impact of the number and position of initial support slices on segmentation accuracy. ICS offers a promising solution for reducing annotation burdens while delivering robust segmentation results, paving the way for its broader adoption in clinical and research applications.
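The cascade loop described above can be sketched as follows. The `few_shot_segment` function is a crude thresholding stand-in for UniverSeg (an assumption, not the real model); the `ics` loop is the recoverable part of the method: propagate from one annotated slice forward and backward, feeding each prediction back into the support set.

```python
import numpy as np

def few_shot_segment(image, support):
    """Stand-in for UniverSeg: threshold the image at the mean foreground
    intensity observed across the support set (illustrative only)."""
    imgs, masks = zip(*support)
    fg = np.concatenate([im[m > 0] for im, m in zip(imgs, masks)])
    return (image >= fg.mean()).astype(np.uint8)

def ics(volume, seed_idx, seed_mask):
    """In-context Cascade Segmentation: start from one annotated slice and
    segment the rest of the sequence, forward then backward, adding each
    new prediction to the support set to keep adjacent slices consistent."""
    support = [(volume[seed_idx], seed_mask)]
    masks = {seed_idx: seed_mask}
    order = list(range(seed_idx + 1, len(volume))) + list(range(seed_idx - 1, -1, -1))
    for i in order:
        masks[i] = few_shot_segment(volume[i], support)
        support.append((volume[i], masks[i]))   # cascade: prediction becomes context
    return [masks[i] for i in range(len(volume))]
```

Only one slice needs manual annotation; every other mask is inferred, which is the annotation-reduction claim of the paper.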
- Asia > Japan > Honshū > Tōhoku (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
In-context learning and Occam's razor
Elmoznino, Eric, Marty, Tom, Kasetty, Tejas, Gagnon, Leo, Mittal, Sarthak, Fathi, Mahan, Sridhar, Dhanya, Lajoie, Guillaume
A central goal of machine learning is generalization. While the No Free Lunch Theorem states that we cannot obtain theoretical guarantees for generalization without further assumptions, in practice we observe that simple models which explain the training data generalize best: a principle called Occam's razor. Despite the need for simple models, most current approaches in machine learning only minimize the training error, and at best indirectly promote simplicity through regularization or architecture design. Here, we draw a connection between Occam's razor and in-context learning: an emergent ability of certain sequence models like Transformers to learn at inference time from past observations in a sequence. In particular, we show that the next-token prediction loss used to train in-context learners is directly equivalent to a data compression technique called prequential coding, and that minimizing this loss amounts to jointly minimizing both the training error and the complexity of the model that was implicitly learned from context. Our theory and the empirical experiments we use to support it not only provide a normative account of in-context learning, but also elucidate the shortcomings of current in-context learning methods, suggesting ways in which they can be improved. We make our code available at https://github.com/3rdCore/PrequentialCode.
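The prequential-coding equivalence above can be made concrete with a toy online predictor. The Laplace-rule model below is an illustrative stand-in for an in-context learner; the code length it assigns to a sequence is exactly the sum of its next-token log-losses, so simpler (more predictable) sequences compress better.

```python
import math

def prequential_code_length(bits):
    """Code length (in bits) of a binary sequence under an online
    Laplace-rule predictor: p(next = 1) = (#ones so far + 1) / (t + 2).
    This sum of -log p(x_t | x_<t) is the prequential code length, i.e.
    the same next-token loss that in-context learners minimize."""
    ones, total_bits = 0, 0.0
    for t, b in enumerate(bits):
        p1 = (ones + 1) / (t + 2)
        total_bits += -math.log2(p1 if b == 1 else 1 - p1)
        ones += b
    return total_bits

easy = [1] * 32        # low-complexity: the predictor adapts and compresses it
hard = [1, 0] * 16     # alternating: this model finds no exploitable bias
```

The all-ones sequence costs about log2(33) ≈ 5 bits, while the alternating one costs close to 1 bit per symbol: low prequential loss means a simple implicit model, which is the Occam's-razor reading of the training objective.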
- Europe > Austria > Vienna (0.14)
- Oceania > New Zealand (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- (3 more...)
No Size Fits All: The Perils and Pitfalls of Leveraging LLMs Vary with Company Size
Urlana, Ashok, Kumar, Charaka Vinayak, Garlapati, Bala Mallikarjunarao, Singh, Ajeet Kumar, Mishra, Rahul
Large language models (LLMs) are playing a pivotal role in deploying strategic use cases across a range of organizations, from large pan-continental companies to emerging startups. The issues and challenges involved in the successful utilization of LLMs can vary significantly depending on the size of the organization. It is important to study and discuss these pertinent issues of LLM adaptation with a focus on the scale of the industrial concerns and brainstorm possible solutions and prospective directions. Such a study has not been prominently featured in the current research literature. In this study, we adopt a threefold strategy: first, we conduct a case study with industry practitioners to formulate the key research questions; second, we examine existing industrial publications to address these questions; and finally, we provide a practical guide for industries to utilize LLMs more efficiently.
- Asia > Singapore (0.05)
- North America > Canada > Ontario > Toronto (0.05)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- (9 more...)
- Research Report > New Finding (0.86)
- Research Report > Experimental Study (0.66)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Law (0.93)
- Banking & Finance (0.93)
LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey
Urlana, Ashok, Kumar, Charaka Vinayak, Singh, Ajeet Kumar, Garlapati, Bala Mallikarjunarao, Chalamala, Srinivasa Rao, Mishra, Rahul
Large language models (LLMs) have become the secret ingredient driving numerous industrial applications, showcasing their remarkable versatility across a diverse spectrum of tasks. From natural language processing and sentiment analysis to content generation and personalized recommendations, their unparalleled adaptability has facilitated widespread adoption across industries. This transformative shift driven by LLMs underscores the need to explore the underlying associated challenges and avenues for enhancement in their utilization. In this paper, our objective is to unravel and evaluate the obstacles and opportunities inherent in leveraging LLMs within an industrial context. To this end, we conduct a survey involving a group of industry practitioners, develop four research questions derived from the insights gathered, and examine 68 industry papers to address these questions and derive meaningful conclusions.
- North America > United States > Arizona > Maricopa County > Tempe (0.14)
- Asia > Singapore (0.05)
- North America > Canada > Ontario > Toronto (0.05)
- (12 more...)
- Research Report > Experimental Study (0.48)
- Research Report > New Finding (0.48)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- (3 more...)
In-context learning agents are asymmetric belief updaters
Schubert, Johannes A., Jagadish, Akshay K., Binz, Marcel, Schulz, Eric
We study the in-context learning dynamics of large language models (LLMs) using three instrumental learning tasks adapted from cognitive psychology. We find that LLMs update their beliefs in an asymmetric manner and learn more from better-than-expected outcomes than from worse-than-expected ones. Furthermore, we show that this effect reverses when learning about counterfactual feedback and disappears when no agency is implied. We corroborate these findings by investigating idealized in-context learning agents derived through meta-reinforcement learning, where we observe similar patterns. Taken together, our results contribute to our understanding of how in-context learning works by highlighting that the framing of a problem significantly influences how learning occurs, a phenomenon also observed in human cognition.
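The asymmetry reported above has a standard formalization: a delta-rule (Rescorla-Wagner-style) update with separate learning rates for positive and negative prediction errors. The specific rates below are illustrative values, not fitted parameters from the paper.

```python
def update_belief(value, outcome, alpha_plus=0.3, alpha_minus=0.1):
    """Asymmetric delta-rule update: better-than-expected outcomes
    (positive prediction error) shift the estimate more than
    worse-than-expected ones, mirroring the reported LLM behavior."""
    delta = outcome - value                      # prediction error
    alpha = alpha_plus if delta > 0 else alpha_minus
    return value + alpha * delta

v = 0.5
v_after_good = update_belief(v, 1.0)   # delta = +0.5, scaled by 0.3
v_after_bad  = update_belief(v, 0.0)   # delta = -0.5, scaled by 0.1
```

Setting alpha_plus = alpha_minus recovers the symmetric learner; reversing the inequality on delta models the counterfactual-feedback reversal the abstract describes.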
- Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)