Distributional Evaluation of Generative Models via Relative Density Ratio
We propose a function-valued evaluation metric for generative models based on the relative density ratio (RDR), designed to characterize distributional differences between real and generated samples. As an evaluation metric, the RDR function preserves the $ϕ$-divergence between the two distributions, enables sample-level evaluation that facilitates downstream investigations of feature-specific distributional differences, and has a bounded range that affords clear interpretability and numerical stability. The RDR function is estimated efficiently by optimizing the variational form of the $ϕ$-divergence. We provide theoretical convergence rate guarantees for general estimators based on M-estimator theory, as well as convergence rates for neural network-based estimators when the true ratio lies in an anisotropic Besov space. We demonstrate the power of the proposed RDR-based evaluation through numerical experiments on MNIST, CelebA64, and the American Gut Project microbiome data. The estimated RDR enables not only effective overall comparison of competing generative models, but also a convenient way to reveal the underlying nature of goodness-of-fit: it allows one to assess support overlap, coverage, and fidelity while pinpointing regions of the sample space where generators concentrate and revealing the features that drive the most salient distributional differences.
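A minimal sketch of the estimation idea, assuming a least-squares variational criterion (a standard surrogate whose exact minimizer is the relative density ratio $r_α(x) = p(x)/(αp(x)+(1-α)q(x))$) rather than the paper's particular $ϕ$-divergence objective; the PyTorch network, toy Gaussian data, and $α = 0.5$ below are illustrative assumptions:

```python
# Hedged sketch: fit the relative density ratio with a small network by
# minimizing (1/2) E_m[g^2] - E_P[g] over g, where m = alpha*P + (1-alpha)*Q;
# the exact minimizer of this criterion is r_alpha = p / (alpha*p + (1-alpha)*q).
import torch

alpha = 0.5
p_samples = torch.randn(2000, 1)              # stand-in "real" data: N(0, 1)
q_samples = 1.0 + 1.5 * torch.randn(2000, 1)  # stand-in "generated" data

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1), torch.nn.Sigmoid(),  # sigmoid/alpha enforces range (0, 1/alpha)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    g_p = net(p_samples) / alpha  # candidate ratio under P
    g_q = net(q_samples) / alpha  # candidate ratio under Q
    loss = 0.5 * (alpha * (g_p**2).mean() + (1 - alpha) * (g_q**2).mean()) - g_p.mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Sample-level scores on generated data: values near 1/alpha flag regions the
# generator misses relative to P; values near 0 flag spurious generated mass.
scores = net(q_samples) / alpha
```

Note how the bounded range $[0, 1/α]$ is enforced architecturally here via the sigmoid output; values near 1 indicate regions where the two distributions agree.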
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.90)
Learning Energy-based Variational Latent Prior for VAEs
Dutta, Debottam, Amballa, Chaitanya, Xu, Zhongweiyang, Wei, Yu-Lin, Choudhury, Romit Roy
Variational Auto-Encoders (VAEs) are known to generate blurry and inconsistent samples. One reason for this is the "prior hole" problem: regions that have high probability under the VAE's prior but low probability under the VAE's posterior. This means that during data generation, high-probability samples from the prior can have low probability under the posterior, resulting in poor-quality data. Ideally, a prior should be flexible enough to match the posterior while retaining the ability to generate samples quickly, a tradeoff that generative models continue to grapple with. This paper proposes to model the prior as an energy-based model (EBM). While EBMs are known to offer the flexibility to match posteriors (thereby also improving the ELBO), they are traditionally slow in sample generation due to their dependence on MCMC methods. Our key idea is to bring a variational approach to tackling the normalization constant in EBMs, thus bypassing expensive MCMC. The variational form can be approximated with a sampler network, and we show that training the prior this way can be formulated as an alternating optimization problem. Moreover, the same sampler reduces to an implicit variational prior during generation, providing efficient and fast sampling. We compare our Energy-based Variational Latent Prior (EVaLP) method to multiple SOTA baselines and show improvements in image generation quality, reduced prior holes, and better sampling efficiency.
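A toy sketch of the alternating scheme just described, assuming a latent EBM prior $p(z) ∝ \exp(-E_θ(z))$ and a reparameterized Gaussian sampler standing in for the paper's sampler network; the Gibbs variational identity $\log Z = \max_q \mathbb{E}_q[-E_θ(z)] + H(q)$ is what lets the sampler replace MCMC. The architecture and the stand-in "aggregate posterior" samples are assumptions, not EVaLP itself:

```python
# Toy sketch (not EVaLP itself): alternate between (1) a Gaussian sampler
# q_phi tightening the Gibbs bound log Z = max_q E_q[-E(z)] + H(q), and
# (2) a maximum-likelihood-style energy update with q_phi samples standing
# in for the MCMC negatives a latent EBM prior would otherwise need.
import math
import torch

dim = 2
E_theta = torch.nn.Sequential(              # scalar energy over latents
    torch.nn.Linear(dim, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
mu = torch.zeros(dim, requires_grad=True)   # sampler q_phi = N(mu, diag(exp(log_std)^2))
log_std = torch.zeros(dim, requires_grad=True)
opt_e = torch.optim.Adam(E_theta.parameters(), lr=1e-3)
opt_s = torch.optim.Adam([mu, log_std], lr=1e-3)

def sample_q(n):                            # reparameterized draws from q_phi
    return mu + log_std.exp() * torch.randn(n, dim)

# Stand-in for aggregate VAE posterior samples (an assumption of this toy).
posterior_z = 1.0 + 0.5 * torch.randn(4096, dim)

for step in range(3000):
    # (1) Sampler step: minimize E_q[E_theta] - H(q), i.e. tighten the bound.
    z = sample_q(256)
    entropy = log_std.sum() + 0.5 * dim * (1 + math.log(2 * math.pi))
    sampler_loss = E_theta(z).mean() - entropy
    opt_s.zero_grad(); sampler_loss.backward(); opt_s.step()
    # (2) Energy step: push energy down on posterior samples, up on q_phi samples.
    idx = torch.randint(0, posterior_z.shape[0], (256,))
    energy_loss = E_theta(posterior_z[idx]).mean() - E_theta(sample_q(256).detach()).mean()
    opt_e.zero_grad(); energy_loss.backward(); opt_e.step()

# At generation time, z ~ q_phi replaces MCMC, giving fast prior sampling.
```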
Two-sample comparison through additive tree models for density ratios
Awaya, Naoki, Xu, Yuliang, Ma, Li
The ratio of two densities characterizes their differences. We consider learning the density ratio given i.i.d. observations from each of the two distributions. We propose additive tree models for the density ratio along with efficient algorithms for training these models using a new loss function called the balancing loss. With this loss, additive tree models for the density ratio can be trained using algorithms originally designed for supervised learning. Specifically, they can be trained from both an optimization perspective that parallels tree boosting and from a (generalized) Bayesian perspective that parallels Bayesian additive regression trees (BART). For the former, we present two boosting algorithms -- one based on forward-stagewise fitting and the other based on gradient boosting, both of which produce a point estimate for the density ratio function. For the latter, we show that due to the loss function's resemblance to an exponential family kernel, the new loss can serve as a pseudo-likelihood for which conjugate priors exist, thereby enabling effective generalized Bayesian inference on the density ratio using backfitting samplers designed for BART. The resulting uncertainty quantification on the inferred density ratio is critical for applications involving high-dimensional and complex distributions in which uncertainty given limited data can often be substantial. We provide insights on the balancing loss through its close connection to the exponential loss in binary classification and to the variational form of f-divergence, in particular that of the squared Hellinger distance. Our numerical experiments demonstrate the accuracy of the proposed approach while providing unique capabilities in uncertainty quantification. We demonstrate the application of our method in a case study assessing the quality of generative models for microbiome compositional data.
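To illustrate the stated connection to the exponential loss (not the paper's balancing-loss algorithm or its BART-style sampler), here is a hedged boosting sketch for the log density ratio $f = \log(p/q)$ under the loss $E_P[e^{-f/2}] + E_Q[e^{f/2}]$, whose pointwise minimizer is $\log(p/q)$ and whose optimal value recovers the Hellinger affinity; the toy data and tree settings are assumptions:

```python
# Hedged sketch of gradient boosting for f = log(p/q) under the exponential
# loss L(f) = E_P[exp(-f/2)] + E_Q[exp(f/2)], an illustration of the stated
# connection -- not the paper's balancing loss or its Bayesian backfitting.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x_p = rng.normal(0.0, 1.0, size=(1000, 1))  # samples from P
x_q = rng.normal(0.5, 1.2, size=(1000, 1))  # samples from Q
X = np.vstack([x_p, x_q])
is_p = np.arange(len(X)) < len(x_p)

f = np.zeros(len(X))                        # additive model values at training points
trees, lr = [], 0.1
for m in range(200):
    # Pointwise negative gradient of the loss: +exp(-f/2)/2 on P points,
    # -exp(f/2)/2 on Q points (equal sample sizes, so the weights match).
    neg_grad = np.where(is_p, 0.5 * np.exp(-f / 2), -0.5 * np.exp(f / 2))
    tree = DecisionTreeRegressor(max_depth=2).fit(X, neg_grad)
    trees.append(tree)
    f += lr * tree.predict(X)

# exp(f) approximates the density ratio p/q at the training points; summing
# lr * tree.predict(x_new) over the stored trees extends it to new points.
```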
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Multiple Wasserstein Gradient Descent Algorithm for Multi-Objective Distributional Optimization
Nguyen, Dai Hai, Mamitsuka, Hiroshi, Nakamura, Atsuyoshi
We address the optimization problem of simultaneously minimizing multiple objective functionals over a family of probability distributions. This type of Multi-Objective Distributional Optimization commonly arises in machine learning and statistics, with applications in areas such as multiple target sampling, multi-task learning, and multi-objective generative modeling. To solve this problem, we propose an iterative particle-based algorithm, Multiple Wasserstein Gradient Descent (MWGraD), which constructs a flow of intermediate empirical distributions, each represented by a set of particles, that gradually minimizes the multiple objective functionals simultaneously. Specifically, MWGraD consists of two key steps at each iteration. First, it estimates the Wasserstein gradient for each objective functional based on the current particles. Then, it aggregates these gradients into a single Wasserstein gradient using dynamically adjusted weights and updates the particles accordingly. In addition, we provide theoretical analysis and present experimental results on both synthetic and real-world datasets, demonstrating the effectiveness of MWGraD.
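A minimal particle sketch of the two steps just described, under the simplifying assumption that each objective functional is a potential energy $F_i(ρ) = E_ρ[V_i]$, whose Wasserstein gradient at the particles is simply $∇V_i$; the two quadratic wells and the gradient-norm weighting rule are illustrative stand-ins for MWGraD's dynamically adjusted weights:

```python
# Minimal particle sketch (not MWGraD as specified in the paper): descend two
# potential-energy functionals F_i(rho) = E_rho[V_i] simultaneously; the
# Wasserstein gradient of F_i at the particle locations is grad V_i.
import numpy as np

rng = np.random.default_rng(0)
particles = rng.normal(size=(500, 2))

def grad_v1(x):                 # quadratic well centered at (-2, 0)
    return x - np.array([-2.0, 0.0])

def grad_v2(x):                 # quadratic well centered at (+2, 0)
    return x - np.array([+2.0, 0.0])

eta = 0.05
for step in range(400):
    g1, g2 = grad_v1(particles), grad_v2(particles)
    # Dynamically adjusted weights (an assumed rule): normalize by average
    # gradient norms so neither objective dominates the aggregated update.
    w1 = 1.0 / np.linalg.norm(g1, axis=1).mean()
    w2 = 1.0 / np.linalg.norm(g2, axis=1).mean()
    w1, w2 = w1 / (w1 + w2), w2 / (w1 + w2)
    particles -= eta * (w1 * g1 + w2 * g2)  # aggregated Wasserstein gradient step

# The particle cloud settles near the weighted compromise of the two wells.
```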
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)
Generalization Bounds for Quantum Learning via Rényi Divergences
Warsi, Naqueeb Ahmad, Dasgupta, Ayanava, Hayashi, Masahito
This work advances the theoretical understanding of quantum learning by establishing a new family of upper bounds on the expected generalization error of quantum learning algorithms, leveraging the framework introduced by Caro et al. (2024) and a new definition for the expected true loss. Our primary contribution is the derivation of these bounds in terms of quantum and classical Rényi divergences, utilizing a variational approach for evaluating quantum Rényi divergences, specifically the Petz and a newly introduced modified sandwich quantum Rényi divergence. Analytically and numerically, we demonstrate the superior performance of the bounds derived using the modified sandwich quantum Rényi divergence compared to those based on the Petz divergence. Furthermore, we provide probabilistic generalization error bounds using two distinct techniques: one based on the modified sandwich quantum Rényi divergence and classical Rényi divergence, and another employing smooth max Rényi divergence.
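For readers unfamiliar with the quantities involved, the following numerical sketch evaluates the standard Petz and sandwiched quantum Rényi divergences on random density matrices; the paper's modified sandwich divergence is a new definition and is not reproduced here:

```python
# Numerical sketch of the standard definitions: Petz Renyi divergence
# (1/(a-1)) log Tr[rho^a sigma^(1-a)] and sandwiched Renyi divergence
# (1/(a-1)) log Tr[(sigma^((1-a)/2a) rho sigma^((1-a)/2a))^a], with matrix
# powers computed via eigendecomposition.
import numpy as np

def mat_pow(A, t):              # A^t for a Hermitian positive definite matrix
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.conj().T

def petz_renyi(rho, sigma, a):
    return np.log(np.trace(mat_pow(rho, a) @ mat_pow(sigma, 1 - a)).real) / (a - 1)

def sandwiched_renyi(rho, sigma, a):
    s = mat_pow(sigma, (1 - a) / (2 * a))
    return np.log(np.trace(mat_pow(s @ rho @ s, a)).real) / (a - 1)

def random_density(d, rng):     # random full-rank density matrix
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    R = G @ G.conj().T
    return R / np.trace(R).real

rng = np.random.default_rng(1)
rho, sigma = random_density(3, rng), random_density(3, rng)
for a in (0.6, 1.5, 2.0):
    print(a, petz_renyi(rho, sigma, a), sandwiched_renyi(rho, sigma, a))
# By the Araki-Lieb-Thirring inequality the sandwiched value never exceeds
# the Petz value, which is why tighter bounds are possible in that family.
```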
Design Amortization for Bayesian Optimal Experimental Design
Kennamer, Noble, Walton, Steven, Ihler, Alexander
Bayesian optimal experimental design is a sub-field of statistics focused on developing methods to make efficient use of experimental resources. Any potential design is evaluated in terms of a utility function, such as the (theoretically well-justified) expected information gain (EIG); unfortunately, under most circumstances the EIG is intractable to evaluate. In this work we build on successful variational approaches, which optimize a parameterized variational model with respect to bounds on the EIG. Past work focused on learning a new variational model from scratch for each new design considered. Here we present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs. To further improve computational efficiency, we also propose to train the variational model on a significantly cheaper-to-evaluate lower bound, and show empirically that the resulting model provides an excellent guide for more accurate, but expensive-to-evaluate, bounds on the EIG. We demonstrate the effectiveness of our technique on generalized linear models, a class of statistical models widely used in the analysis of controlled experiments. Experiments show that our method greatly improves accuracy over existing approximation strategies, with far better sample efficiency.
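As a hedged illustration of the quantity being amortized (not the paper's architecture), the following sketch computes a nested Monte Carlo estimate of the EIG for a toy linear-Gaussian model $y \sim N(dθ, 1)$, $θ \sim N(0,1)$, where the analytic value $\tfrac{1}{2}\log(1+d^2)$ is available as a check:

```python
# Hedged sketch of the target quantity (not the paper's amortized model):
# nested Monte Carlo estimate of EIG(d) = E[log p(y|theta,d) - log p(y|d)].
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

def eig_nmc(d, n_outer=2000, n_inner=2000):
    theta = rng.normal(size=n_outer)
    y = d * theta + rng.normal(size=n_outer)
    log_lik = -0.5 * (y - d * theta) ** 2 - 0.5 * np.log(2 * np.pi)
    # Marginal likelihood log p(y|d) via a fresh inner sample of theta.
    theta_in = rng.normal(size=(n_inner, 1))
    log_marg = -0.5 * (y[None, :] - d * theta_in) ** 2 - 0.5 * np.log(2 * np.pi)
    log_p_y = logsumexp(log_marg, axis=0) - np.log(n_inner)
    return np.mean(log_lik - log_p_y)

for d in (0.5, 1.0, 2.0):
    print(d, eig_nmc(d), 0.5 * np.log1p(d**2))  # estimate vs. analytic EIG
```

Running a fresh nested estimate like this for every candidate design is exactly the per-design cost that amortizing a single variational model across designs is meant to avoid.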