Supplementary Material for Learning Energy-based Model via Dual-MCMC Teaching

Neural Information Processing Systems

We show additional image synthesis results in Fig.2. For the numbers reported in the main text, we adopt a network structure that contains Residual Blocks (see implementation details in Tab.5). We then test our model on the task of image inpainting, with results shown in Fig.1. [...] This is the marginal version of Eqn.8 shown in the main text. (Sec. 2.3, Learning Algorithm.) The three models are trained in an alternating, iterative manner based on the current model parameters. Compared to Eqn.3 and Eqn.6 in the main text, Eqn.5 and Eqn.6 start from initial points initialized [...]. We present the learning algorithm in Alg.1.
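The alternating scheme referenced here (Sec. 2.3, Alg.1) can be skeletonized as follows. This is a hedged sketch, not the paper's algorithm: we read the three models as the EBM, the generator, and a complementary inference model, and the `*_update` functions below are placeholders standing in for the actual gradient steps.

```python
# Placeholder update functions standing in for the real gradient steps; each
# update sees the current parameters of the other two models, as in Alg.1.
def ebm_update(ebm, generator, inference, batch):
    return ebm + 1          # placeholder for one EBM gradient step

def gen_update(generator, ebm, inference, batch):
    return generator + 1    # placeholder for one generator gradient step

def inf_update(inference, ebm, generator, batch):
    return inference + 1    # placeholder for one inference-model gradient step

ebm = generator = inference = 0
for batch in range(5):      # iterate over (toy) batches
    ebm = ebm_update(ebm, generator, inference, batch)        # EBM first,
    generator = gen_update(generator, ebm, inference, batch)  # then generator,
    inference = inf_update(inference, ebm, generator, batch)  # then inference
print(ebm, generator, inference)  # → 5 5 5
```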



Adaptive Multi-stage Density Ratio Estimation for Learning Latent Space Energy-based Model

Neural Information Processing Systems

To effectively tackle this issue and learn more expressive prior models, we develop adaptive multi-stage density ratio estimation, which breaks the estimation into multiple stages and learns the density ratio at each stage sequentially and adaptively. The latent prior model can be gradually learned using the ratio estimated in the previous stage, so that the final latent-space EBM prior is naturally formed as the product of the ratios from the different stages. The proposed method enables an informative and much sharper prior than existing baselines, and can be trained efficiently.
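The product-of-ratios construction can be illustrated in one dimension. This is a toy sketch under loud assumptions: the stage-wise log-ratio functions below are fixed, hand-written surrogates for what would in practice be learned ratio estimators, and the base prior is a standard Gaussian.

```python
import math

def log_base_prior(z):
    # stage-0 prior: standard Gaussian N(0, 1) on a 1-D latent space
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)

# Hypothetical per-stage log density-ratio estimators; in the method these
# would be learned sequentially, each refining the prior of the stage before.
stage_log_ratios = [
    lambda z: 0.3 * z,         # stage 1: shift mass to the right
    lambda z: -0.05 * z ** 4,  # stage 2: sharpen the tails
]

def log_prior(z):
    """Final EBM prior = base prior x product of ratios (a sum in log space)."""
    return log_base_prior(z) + sum(r(z) for r in stage_log_ratios)

print(round(log_prior(0.0), 3))  # → -0.919
```

The composed density is unnormalized, as is usual for an EBM prior; each additional stage multiplies in one more learned ratio without retraining the earlier ones.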


One-Step Diffusion Distillation through Score Implicit Matching

Neural Information Processing Systems

Despite their strong performance on many generative tasks, diffusion models require a large number of sampling steps to generate realistic samples. This has motivated the community to develop effective methods for distilling pre-trained diffusion models into more efficient ones, but these methods still typically require few-step inference or perform substantially worse than the underlying model.


Learning Energy-based Model via Dual-MCMC Teaching

Neural Information Processing Systems

This paper studies the fundamental learning problem of the energy-based model (EBM). Learning an EBM can be achieved via maximum likelihood estimation (MLE), which typically involves Markov chain Monte Carlo (MCMC) sampling such as Langevin dynamics. However, noise-initialized Langevin dynamics can be challenging in practice and slow to mix. This motivates joint training with a generator model, where the generator serves as a complementary model whose samples bypass MCMC sampling. However, such a method can be less accurate than MCMC and result in biased EBM learning.
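The Langevin dynamics discussed here can be sketched in one dimension, contrasting a noise-initialized chain with one started from a (hypothetical) generator sample near a mode. The double-well energy, step size, and chain length are all illustrative choices, not the paper's settings.

```python
import math
import random

# 1-D sketch of sampling from an EBM p(x) ∝ exp(-E(x)) with Langevin dynamics.
def grad_energy(x):
    # toy double-well energy E(x) = (x^2 - 9)^2 / 20, with modes near x = ±3
    return x * (x * x - 9.0) / 5.0

def langevin(x0, step=0.05, n_steps=100):
    """x_{k+1} = x_k - (step/2)*gradE(x_k) + sqrt(step)*N(0, 1)."""
    x = x0
    for _ in range(n_steps):
        x += -0.5 * step * grad_energy(x) + math.sqrt(step) * random.gauss(0.0, 1.0)
    return x

random.seed(0)
x_noise = langevin(random.gauss(0.0, 1.0))  # noise-initialized chain
x_warm = langevin(2.9)                      # chain started from a (hypothetical)
                                            # generator sample near a mode
```

The warm-started chain settles near the mode quickly, while the noise-initialized chain must first find a mode, which is the mixing difficulty the abstract alludes to.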


Learning Latent Space Energy-Based Prior Model

Neural Information Processing Systems

We propose an energy-based model (EBM) in the latent space of a generator model, so that the EBM serves as a prior model that stands on the top-down network of the generator model. Both the latent-space EBM and the top-down network can be learned jointly by maximum likelihood, which involves short-run MCMC sampling from both the prior and posterior distributions of the latent vector. Due to the low dimensionality of the latent space and the expressiveness of the top-down network, a simple EBM in latent space can capture regularities in the data effectively, and MCMC sampling in latent space is efficient and mixes well. We show that the learned model exhibits strong performance on image and text generation and anomaly detection. The one-page code can be found in the supplementary materials.
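Short-run MCMC from the prior and posterior of the latent vector can be sketched in a toy 1-D instantiation. All specifics below are assumptions for illustration: a quadratic latent energy, a linear top-down "network" g(z) = a*z, and Gaussian observation noise.

```python
import math
import random

# Toy 1-D instantiation: latent energy E(z) = 0.1*z^2, top-down "network"
# g(z) = a*z, Gaussian observation noise with std sigma (all illustrative).
a, sigma = 2.0, 0.5

def grad_log_prior(z):
    # d/dz log p(z), with p(z) ∝ exp(-E(z)) * N(z; 0, 1)
    return -0.2 * z - z

def grad_log_posterior(z, x):
    # prior term plus reconstruction term d/dz [-(x - a*z)^2 / (2*sigma^2)]
    return grad_log_prior(z) + a * (x - a * z) / sigma ** 2

def short_run(grad_log_p, z0=0.0, step=0.02, n_steps=40):
    """Fixed, short K-step Langevin chain started from z0."""
    z = z0
    for _ in range(n_steps):
        z += 0.5 * step * grad_log_p(z) + math.sqrt(step) * random.gauss(0.0, 1.0)
    return z

random.seed(0)
z_prior = short_run(grad_log_prior)                       # chain targeting the prior
z_post = short_run(lambda z: grad_log_posterior(z, 3.0))  # posterior given x = 3
x_gen = a * z_prior                                       # push through the generator
```

Because the latent space is low-dimensional and both targets here are log-concave, 40 steps already land the chains close to their stationary distributions, which is the efficiency argument the abstract makes.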


A Comparison of Independent and Joint Fine-tuning Strategies for Retrieval-Augmented Generation

Lawton, Neal Gregory, Samuel, Alfy, Kumar, Anoop, Liu, Daben

arXiv.org Artificial Intelligence

Published: 20 Aug 2025. Retrieval-augmented generation (RAG) is a popular framework for question answering that is powered by two large language models (LLMs): an embedding model that retrieves context documents relevant to a given question from a database, and a generator model that uses the retrieved context to generate an answer to the question. Both the embedding and generator models can be fine-tuned to improve the performance of a RAG pipeline on a new task, but multiple fine-tuning strategies exist, with different costs and benefits. In this paper, we evaluate and compare several RAG fine-tuning strategies, including independent, joint, and two-phase fine-tuning. In our experiments, we observe that all of these strategies achieve roughly equal improvements in EM and F1 generation-quality metrics, although they have significantly different computational costs. We conclude that the optimal fine-tuning strategy depends on whether the training dataset includes context labels and whether a grid search over the learning rates for the embedding and generator models is required.
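The two-model RAG pipeline described in the abstract can be sketched with toy stand-ins: bag-of-words cosine similarity in place of the embedding model, and a string template in place of the generator LLM. Everything below is illustrative; no real retriever or LLM API is used.

```python
import math
from collections import Counter

# Toy RAG pipeline: a bag-of-words "embedding model" retrieves context,
# and a template "generator" produces the answer string.
def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, docs, k=1):
    # the retrieval half of the pipeline: rank documents by similarity
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def generate(question, contexts):
    # stand-in for the generator LLM conditioned on retrieved context
    return f"Q: {question} | context: {' '.join(contexts)}"

docs = ["Paris is the capital of France.", "The Nile is a river in Africa."]
question = "What is the capital of France?"
contexts = retrieve(question, docs)
print(generate(question, contexts))
```

In this framing, independent fine-tuning would train `embed` on (question, context) pairs and `generate` on (question, context, answer) triples separately, while joint fine-tuning would backpropagate the generation loss through both components at once.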


Systematic Diagnosis of Brittle Reasoning in Large Language Models

Parupudi, V. S. Raghu

arXiv.org Artificial Intelligence

A central question in artificial intelligence is the extent to which machine learning models comprehend mathematics. To address this, we propose a novel framework for measuring mathematical reasoning that moves beyond standard benchmarks to diagnose specific failure points. Our method first generates structured, step-by-step reasoning from gpt-3.5-turbo on the GSM8K dataset. We then use a more capable analyst model, gpt-4o-mini, to categorize errors and, crucially, perform an unsupervised clustering of every reasoning sentence to identify emergent "reasoning modes." This analysis reveals a cognitive profile with a stark, nonhuman-like brittleness: while the model achieves near-perfect accuracy on procedural modes like sequential calculation, its performance on modes requiring combinatorial reasoning with restrictions plummets. By identifying and quantifying the reliability of these distinct reasoning skills, our work provides a more granular method to evaluate mathematical comprehension and offers a precise roadmap for developing new capabilities and more reliable future applications.
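The unsupervised clustering of reasoning sentences could be sketched as k-means over sentence vectors; here bag-of-words counts stand in for learned embeddings, and the sentences, vocabulary, and choice of k are all illustrative, not the paper's data or the gpt-4o-mini analyst pipeline.

```python
import math
import random
from collections import Counter

# Toy sketch of clustering reasoning sentences into "modes": vectorize each
# sentence with bag-of-words counts, then run a small hand-rolled k-means.
sentences = [
    "add 3 and 4 to get 7",
    "multiply 7 by 2 to get 14",
    "subtract 5 from 14 to get 9",
    "choose 2 items from 5 without repeats",
    "count arrangements with the restriction applied",
    "exclude cases that violate the restriction",
]
vocab = sorted({w for s in sentences for w in s.split()})
vecs = [[Counter(s.split())[w] for w in vocab] for s in sentences]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, n_iters=20):
    random.seed(0)
    centers = random.sample(points, k)
    for _ in range(n_iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: dist(p, centers[i]))].append(p)
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return [min(range(k), key=lambda i: dist(p, centers[i])) for p in points]

labels = kmeans(vecs, k=2)
print(labels)
```

With real sentence embeddings the clusters would track reasoning modes (sequential calculation versus combinatorial reasoning with restrictions) far more reliably than these toy count vectors do.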


