Stochastic Amortization

Neural Information Processing Systems

We therefore explore training amortized models with noisy labels, and we find that this is inexpensive and surprisingly effective.
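A minimal numerical sketch of this idea under toy assumptions: the "expensive" target below is a hypothetical stand-in for whatever quantity is being amortized, and its noisy labels are cheap, unbiased one-sample estimates. Because the label noise is zero-mean, least-squares regression on the noisy labels still targets the clean conditional mean, which is why training on noisy labels can be inexpensive yet effective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "expensive" per-example quantity we want to amortize
# (a hypothetical stand-in, e.g. for a Monte Carlo attribution score).
def expensive_target(x):
    return np.sin(x) + 0.5 * x

# Inputs and noisy labels: each label is a cheap, unbiased but
# noisy estimate of the expensive target.
x = rng.uniform(-3, 3, size=2000)
noisy_y = expensive_target(x) + rng.normal(0.0, 1.0, size=x.shape)

# Amortized model: a small polynomial regressor fit by least
# squares on the noisy labels. Zero-mean noise means the fit
# still approximates the clean target.
X = np.vander(x, N=6, increasing=True)   # polynomial features
coef, *_ = np.linalg.lstsq(X, noisy_y, rcond=None)

# Evaluate against the clean target on held-out points.
x_test = np.linspace(-3, 3, 200)
pred = np.vander(x_test, N=6, increasing=True) @ coef
mse = np.mean((pred - expensive_target(x_test)) ** 2)
print(f"MSE vs clean target: {mse:.4f}")
```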


Generative Bayesian Hyperparameter Tuning

Lopes, Hedibert, Polson, Nick, Sokolov, Vadim

arXiv.org Machine Learning

Hyper-parameter selection is a central practical problem in modern machine learning, governing regularization strength, model capacity, and robustness choices. Cross-validation is often computationally prohibitive at scale, while fully Bayesian hyper-parameter learning can be difficult due to the cost of posterior sampling. We develop a generative perspective on hyper-parameter tuning that combines two ideas: (i) optimization-based approximations to Bayesian posteriors via randomized, weighted objectives (weighted Bayesian bootstrap), and (ii) amortization of repeated optimization across many hyper-parameter settings by learning a transport map from hyper-parameters (including random weights) to the corresponding optimizer. This yields a "generator look-up table" for estimators, enabling rapid evaluation over grids or continuous ranges of hyper-parameters and supporting both predictive tuning objectives and approximate Bayesian uncertainty quantification. We connect this viewpoint to weighted $M$-estimation, envelope/auxiliary-variable representations that reduce non-quadratic losses to weighted least squares, and recent generative samplers for weighted $M$-estimators.
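A minimal sketch of the weighted-Bayesian-bootstrap ingredient on a toy ridge-regression problem (all names and the toy data below are illustrative, not the paper's setup): each posterior-like draw re-solves a weighted objective with fresh random weights, and repeating this over a grid of regularization strengths gives a simple look-up table of draws per hyper-parameter.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression data (hypothetical).
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(0.0, 0.5, size=n)

def weighted_ridge(X, y, lam, w):
    """Minimize sum_i w_i (y_i - x_i'b)^2 + lam * ||b||^2.
    Closed form: (X' W X + lam I)^{-1} X' W y."""
    XtW = X.T * w  # multiply each column of X' by its weight
    return np.linalg.solve(XtW @ X + lam * np.eye(X.shape[1]), XtW @ y)

# Weighted Bayesian bootstrap: each posterior-like draw re-solves
# the optimization with fresh random (exponential) weights.
def wbb_draws(lam, n_draws=200):
    return np.stack([
        weighted_ridge(X, y, lam, rng.exponential(1.0, size=n))
        for _ in range(n_draws)
    ])

# Look-up over a hyper-parameter grid: draws for each lambda can
# be reused for tuning or uncertainty quantification.
for lam in (0.1, 1.0, 10.0):
    draws = wbb_draws(lam)
    print(lam, np.round(draws.mean(0), 2), np.round(draws.std(0), 2))
```

The amortization step in the abstract would replace the per-draw solve with a learned map from (lambda, weights) to the optimizer; the closed-form solve here just makes the randomized-objective idea concrete.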



Iterative Amortized Inference: Unifying In-Context Learning and Learned Optimizers

Mittal, Sarthak, Mahajan, Divyat, Lajoie, Guillaume, Pezeshki, Mohammad

arXiv.org Artificial Intelligence

Modern learning systems increasingly rely on amortized learning - the idea of reusing computation or inductive biases shared across tasks to enable rapid generalization to novel problems. This principle spans a range of approaches, including meta-learning, in-context learning, prompt tuning, learned optimizers and more. While motivated by similar goals, these approaches differ in how they encode and leverage task-specific information, often provided as in-context examples. In this work, we propose a unified framework which describes how such methods differ primarily in the aspects of learning they amortize - such as initializations, learned updates, or predictive mappings - and how they incorporate task data at inference. We introduce a taxonomy that categorizes amortized models into parametric, implicit, and explicit regimes, based on whether task adaptation is externalized, internalized, or jointly modeled. Building on this view, we identify a key limitation in current approaches: most methods struggle to scale to large datasets because their capacity to process task data at inference (e.g., context length) is often limited. To address this, we propose iterative amortized inference, a class of models that refine solutions step-by-step over mini-batches, drawing inspiration from stochastic optimization. Our formulation bridges optimization-based meta-learning with forward-pass amortization in models like LLMs, offering a scalable and extensible foundation for general-purpose task adaptation.
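The interface of iterative amortized inference can be sketched on a toy task (all details below are illustrative assumptions): a solution state is refined one mini-batch at a time, so inference cost does not depend on holding the whole dataset in context at once. The paper's refinement would be a learned module; a plain gradient step stands in for it here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy task: infer the mean of a large dataset by streaming it.
data = rng.normal(loc=3.0, scale=1.0, size=10_000)

def update_step(state, batch, step_size=0.1):
    """Illustrative hand-coded refinement (a learned update module
    in the paper's framing): a gradient step on the batch
    squared-error loss for the current estimate."""
    grad = state - batch.mean()
    return state - step_size * grad

state = 0.0                          # initial solution
for i in range(0, len(data), 256):   # stream mini-batches
    state = update_step(state, data[i:i + 256])
print(f"estimated mean: {state:.3f}")
```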




On the scalability of MSC: Variational inference based on KL(q||p) is scalable in the sense that it works by subsampling datasets, both for exchangeable data, p(x …

Neural Information Processing Systems

We thank the reviewers for the constructive feedback, which will significantly improve the paper. We elaborate on this first and address specific comments and questions from the reviewers below. (RWS, etc.) applications assume the data is generated i.i.d. and achieve scalability through the use of subsampling. The current discussion in Section 3.5 for MSC, on the other hand, focuses on the more challenging case; we will clarify this in the revision. We compare the base versions of the respective algorithms. We will add these references to the related work section.
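The subsampling point above rests on a standard fact: for exchangeable data the log-likelihood is a sum over data points, so a mini-batch term rescaled by N/B is an unbiased estimate of the full sum. A minimal numerical check (toy Gaussian model, all specifics assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Exchangeable data: log-likelihood decomposes as a sum over points.
N, B = 10_000, 100
x = rng.normal(loc=2.0, size=N)

def full_loglik(mu):
    # Unnormalized Gaussian log-likelihood, summed over all points.
    return np.sum(-0.5 * (x - mu) ** 2)

def minibatch_loglik(mu, batch):
    # Mini-batch term rescaled by N/B: unbiased for the full sum.
    return (N / len(batch)) * np.sum(-0.5 * (batch - mu) ** 2)

ests = [minibatch_loglik(1.5, x[rng.choice(N, B, replace=False)])
        for _ in range(2000)]
print(full_loglik(1.5), np.mean(ests))
```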