Two Approaches to Direct Estimation of Riesz Representers

Bruns-Smith, David

arXiv.org Machine Learning

The Riesz representer is a central object in semiparametric statistics and debiased/doubly-robust estimation. Two literatures in econometrics have highlighted the role of directly estimating Riesz representers: the automatic debiased machine learning literature (as in Chernozhukov et al., 2022b), and an independent literature on sieve methods for conditional moment models (as in Chen et al., 2014). These two literatures solve distinct optimization problems that in the population both have the Riesz representer as their solution. We show that with unregularized or ridge-regularized linear, sieve, or RKHS models, the two resulting estimators are numerically equivalent. However, for other regularization schemes such as the Lasso, or more general machine learning function classes including neural networks, the estimators are not necessarily equivalent. In the latter case, the Chen et al. (2014) formulation yields a novel constrained optimization problem for directly estimating Riesz representers with machine learning. Drawing on results from Birrell et al. (2022), we conjecture that this approach may offer statistical advantages at the cost of greater computational complexity.
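To make the "direct estimation" idea concrete, here is a minimal sketch of the automatic (Chernozhukov-style) linear Riesz regression for an ATE-type functional m(W, f) = f(1, Z) − f(0, Z). The toy data, feature map, and variable names are invented for illustration; the paper's point is that, for such unregularized linear models, the sieve formulation of Chen et al. (2014) produces numerically the same fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=n)
T = rng.binomial(1, 0.5, size=n).astype(float)

# Toy feature map phi(T, Z) = [1, T, Z]
Phi = np.column_stack([np.ones(n), T, Z])
# The functional acts on features as m(W, phi) = phi(1, Z) - phi(0, Z)
Phi1 = np.column_stack([np.ones(n), np.ones(n), Z])
Phi0 = np.column_stack([np.ones(n), np.zeros(n), Z])

M = (Phi1 - Phi0).mean(axis=0)  # E_n[m(W, phi)]
G = Phi.T @ Phi / n             # E_n[phi phi^T]

# Unregularized direct Riesz fit: solve G rho = M
rho = np.linalg.solve(G, M)
alpha_hat = Phi @ rho           # fitted representer values alpha_hat(X_i)

# The fit satisfies the empirical Riesz equation E_n[alpha_hat * phi] = E_n[m(W, phi)]
lhs = Phi.T @ alpha_hat / n
assert np.allclose(lhs, M)
```

A ridge-regularized version simply replaces `G` with `G + lam * np.eye(3)`; the equivalence result in the abstract covers that case as well.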



A Implementation of PS-CD Algorithm

Neural Information Processing Systems

In this section, we provide two different ways to prove Theorem 2. The first is more straightforward and directly differentiates through the relevant term; to avoid the resulting difficulty, we introduce a variational representation (Lemma 1) and apply Jensen's inequality. The divergence corresponding to Equation (9) in Section 2.3 follows as a direct consequence of Lemma 2, and can also be verified by checking the PS-CD objective directly (Lemma 3). We then make an assumption similar to the one used in [4, 47] (Assumption 1), which is typically easy to enforce in practice, and analyze the convergence property of the PS-CD algorithm presented in Algorithm 1. Theorem 5 characterizes the convergence property of Algorithm 2; Monte Carlo estimation incurs additional approximation error.



Optimal Anytime-Valid Tests for Composite Nulls

Shekhar, Shubhanshu

arXiv.org Machine Learning

We consider the problem of designing optimal level-$α$ power-one tests for composite nulls. Given a parameter $α\in (0,1)$ and a stream of $\mathcal{X}$-valued observations $\{X_n: n \geq 1\} \overset{i.i.d.}{\sim} P$, the goal is to design a level-$α$ power-one test $τ_α$ for the null $H_0: P \in \mathcal{P}_0 \subset \mathcal{P}(\mathcal{X})$. Prior works have shown that any such $τ_α$ must satisfy $\mathbb{E}_P[τ_α] \geq \tfrac{\log(1/α)}{γ^*(P, \mathcal{P}_0)}$, where $γ^*(P, \mathcal{P}_0)$ is the so-called $\mathrm{KL}_{\inf}$ or minimum divergence of $P$ to the null class. In this paper, our objective is to develop and analyze constructive schemes that match this lower bound as $α\downarrow 0$. We first consider the finite-alphabet case~($|\mathcal{X}| = m < \infty$), and show that a test based on \emph{universal} $e$-process~(formed by the ratio of a universal predictor and the running null MLE) is optimal in the above sense. The proof relies on a Donsker-Varadhan~(DV) based saddle-point representation of $\mathrm{KL}_{\inf}$, and an application of Sion's minimax theorem. This characterization motivates a general method for arbitrary $\mathcal{X}$: construct an $e$-process based on the empirical solutions to the saddle-point representation over a sufficiently rich class of test functions. We give sufficient conditions for the optimality of this test for compact convex nulls, and verify them for Hölder smooth density models. We end the paper with a discussion on the computational aspects of implementing our proposed tests in some practical settings.
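The finite-alphabet construction described above (an e-process formed as the ratio of a universal predictor to the running null MLE) can be sketched for a binary alphabet. Everything concrete here is an illustrative choice rather than the paper's exact construction: the one-sided Bernoulli null, the add-one (Laplace) predictor, and all function names are assumptions for the sketch.

```python
import math

def laplace_pred(k_ones, n_seen):
    """Add-one (Laplace) sequential predictor for binary data."""
    return (k_ones + 1) / (n_seen + 2)

def e_process_stop(xs, null_max_p=0.5, alpha=0.05):
    """Level-alpha test via an e-process: universal mixture likelihood
    over the running null MLE. Illustrative composite one-sided null:
    Bernoulli(theta) with theta <= null_max_p. Returns the first time
    the e-process crosses 1/alpha, or None if it never does."""
    log_universal = 0.0
    k = 0
    stop = None
    for n, x in enumerate(xs):
        p1 = laplace_pred(k, n)
        log_universal += math.log(p1 if x == 1 else 1.0 - p1)
        k += x
        # running null MLE: clip the empirical frequency into the null class
        theta = min(k / (n + 1), null_max_p)
        # guard against log(0) at the boundary of the simplex
        log_null = (k * math.log(max(theta, 1e-12))
                    + (n + 1 - k) * math.log(max(1.0 - theta, 1e-12)))
        if stop is None and log_universal - log_null >= math.log(1.0 / alpha):
            stop = n + 1
    return stop

# Data far from the null (frequency of ones ~ 0.8) triggers rejection...
xs = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1] * 10
assert e_process_stop(xs) is not None
# ...while data inside the null never does (the e-process stays bounded)
assert e_process_stop([0] * 100) is None
```

The lower bound quoted in the abstract says the expected stopping time of any such test must scale like log(1/alpha) divided by the divergence from P to the null, which matches the log(1/alpha) rejection threshold used here.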



A Unified Framework for Diffusion Model Unlearning with f-Divergence

Novello, Nicola, Fontana, Federico, Cinque, Luigi, Gunduz, Deniz, Tonello, Andrea M.

arXiv.org Artificial Intelligence

Machine unlearning aims to remove specific knowledge from a trained model. While diffusion models (DMs) have shown remarkable generative capabilities, existing unlearning methods for text-to-image (T2I) models often rely on minimizing the mean squared error (MSE) between the output distribution of a target and an anchor concept. We show that this MSE-based approach is a special case of a unified $f$-divergence-based framework, in which any $f$-divergence can be utilized. We analyze the benefits of using different $f$-divergences, which mainly impact the convergence properties of the algorithm and the quality of unlearning. The proposed unified framework offers a flexible paradigm that allows selecting the optimal divergence for a specific application, balancing different trade-offs between aggressive unlearning and concept preservation.
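For readers unfamiliar with the family being unified here: an f-divergence is determined by a convex generator f with f(1) = 0, and different choices of f recover KL, chi-squared, total variation, and so on. The snippet below is a generic discrete-distribution illustration of that fact, not the paper's unlearning objective.

```python
import math

def f_divergence(P, Q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)) for discrete distributions."""
    return sum(q * f(p / q) for p, q in zip(P, Q) if q > 0)

# Standard generators: each is convex with f(1) = 0
f_kl = lambda t: t * math.log(t) if t > 0 else 0.0   # KL divergence
f_chi2 = lambda t: (t - 1.0) ** 2                    # chi-squared
f_tv = lambda t: 0.5 * abs(t - 1.0)                  # total variation

P = (0.7, 0.2, 0.1)
Q = (0.4, 0.4, 0.2)

# Every f-divergence vanishes when the two distributions coincide...
assert all(abs(f_divergence(P, P, f)) < 1e-12 for f in (f_kl, f_chi2, f_tv))
# ...and is positive when they differ
assert f_divergence(P, Q, f_kl) > 0
```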



Proximal optimal transport divergences

Baptista, Ricardo, Birmpa, Panagiota, Katsoulakis, Markos A., Rey-Bellet, Luc, Zhang, Benjamin J.

arXiv.org Machine Learning

We introduce the proximal optimal transport divergence, a novel discrepancy measure that interpolates between information divergences and optimal transport distances via an infimal convolution formulation. This divergence provides a principled foundation for optimal transport proximals and proximal optimization methods frequently used in generative modeling. We explore its mathematical properties, including smoothness, boundedness, and computational tractability, and establish connections to primal-dual formulations and adversarial learning. Building on the Benamou-Brenier dynamic formulation of optimal transport cost, we also establish a dynamic formulation for proximal OT divergences. The resulting dynamic formulation is a first-order mean-field game whose optimality conditions are governed by a pair of nonlinear partial differential equations: a backward Hamilton-Jacobi equation and a forward continuity equation.
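The interpolation claim can be illustrated numerically under one plausible reading of the infimal convolution, assumed here to be D_lam(P, Q) = inf_R [ KL(R || Q) + (1/lam) W1(P, R) ]; the paper's exact formulation may differ, and the two-atom setup is purely for illustration. As lam goes to 0 the transport term pins R to P and the value approaches KL(P || Q); as lam grows the value goes to 0 with R collapsing to Q.

```python
import math

def kl(a, b):
    """KL divergence between two discrete distributions."""
    return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

def w1(a, b):
    """Wasserstein-1 between two-atom distributions supported on {0, 1}."""
    return abs(a[0] - b[0])

def prox_ot(P, Q, lam, grid=10001):
    """Brute-force the assumed infimal convolution over intermediate
    two-atom distributions R = (r, 1 - r) on a grid."""
    best = float("inf")
    for i in range(grid):
        r = i / (grid - 1)
        R = (r, 1.0 - r)
        best = min(best, kl(R, Q) + w1(P, R) / lam)
    return best

P = (0.9, 0.1)
Q = (0.5, 0.5)

# Small lam: the divergence approaches KL(P || Q)
assert abs(prox_ot(P, Q, 1e-4) - kl(P, Q)) < 1e-2
# Large lam: the divergence approaches 0 (R collapses onto Q)
assert prox_ot(P, Q, 1e4) < 1e-3
```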