AITopics | conditional expectation

Collaborating Authors

conditional expectation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimal Adjustment Sets for Nonparametric Estimation of Weighted Controlled Direct Effect

Neural Information Processing SystemsJun-19-2026, 08:11:53 GMT

The weighted controlled direct effect (WCDE) generalizes the standard controlled direct effect (CDE) by averaging over the mediator distribution, providing a robust estimate when treatment effects vary across mediator levels. This makes the WCDE especially relevant in fairness analysis, where it isolates the direct effect of an exposure on an outcome, independent of mediating pathways. This work establishes three fundamental advances for WCDE in observational studies: First, we establish necessary and sufficient conditions for the identifiability of the WCDE, clarifying when it diverges from the CDE. Next, we consider nonparametric estimation of the WCDE and derive its influence function, focusing on the class of regular and asymptotically linear estimators. Lastly, we characterize the optimal covariate adjustment set that minimizes the asymptotic variance, demonstrating how mediator-confounder interactions introduce distinct requirements compared to average treatment effect (ATE) estimation. Using synthetic and real-world data, we validate our theory numerically, showing that the proposed optimal valid adjustment set yields the lowest variance at practical sample sizes. Our results offer a principled framework for efficient estimation of direct effects in complex causal systems, with practical applications in fairness and mediation analysis.

adjustment, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > United Kingdom (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry:

Health & Medicine (1.00)
Law (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

A Solver-Free Training Method for Predict-then-Optimize

Wan, Beichen, Liu, Mo

arXiv.org Machine LearningJun-19-2026

We propose a scalable method for training prediction (machine learning) models in the predict-then-optimize paradigm, where model outputs serve as coefficients for a subsequent linear optimization task. Directly minimizing the empirical decision regret is intractable for linear programming and combinatorial optimization since the decision mapping is piecewise constant, and the gradients are zero almost everywhere. While existing methods address this by smoothing the differentiation process, they suffer from scalability issues, since a computationally expensive solver call is required for every gradient evaluation. To address this, we propose a decision-focused learning pipeline based on a measure transformation principle, which yields a new surrogate loss that is completely optimization-solver-free during training. We establish theoretical guarantees, including Fisher consistency and excess risk bounds. Empirically, our method achieves decision quality competitive with state-of-the-art methods while reducing training time by orders of magnitude.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

2606.19587

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Kernel conditional tests from learning-theoretic bounds

Neural Information Processing SystemsJun-16-2026, 23:44:49 GMT

We propose a framework for hypothesis testing on conditional probability distributions, which we then use to construct statistical tests of functionals of conditional distributions. These tests identify the inputs where the functionals differ with high probability, and include tests of conditional moments or two-sample tests. Our key idea is to transform confidence bounds of a learning method into a test of conditional expectations.

artificial intelligence, assumption, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe > Germany (0.27)
North America (0.27)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.65)

Add feedback

ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

Neural Information Processing SystemsJun-14-2026, 19:23:21 GMT

Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. First, most approaches rely on modeling continuous embeddings, which are not wellsuited for inherently discrete data distributions.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment (0.93)
Media > Music (0.67)
Transportation > Ground > Road (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Add feedback

Latent Process Generator Matching

Billera, Lukas, Nordlinder, Hedwig Nora, Murrell, Ben

arXiv.org Machine LearningMay-21-2026

A related situation arises when an auxiliary process is introduced to aid training but modelling its dynamics at generation time is unnecessary or difficult, as in Billera et al. [2025b] and Kim et al. [2025]. In each of these works, the projection result and its associated loss are derived on a case-by-case basis, and all theorems are restricted to marginalization over a discrete component of the extended state space. We introduce a general framework that removes these restrictions: given a time-inhomogeneous Feller process (Yt)0 t 1 on an arbitrary state space Y and a map Φ: Y X, one may learn a linear parametrisation of the generator of a Feller process on X whose one-time marginals coincide with those of (Φ(Yt))0 t 1. For Y = X Z and Φthe projection onto the first coordinate, this subsumes these prior works as special cases, allowing for a general class of latent processes (Zt)0 t 1 in a nearly arbitrary state space Z, using the formalism of generator matching to allow for continuous, discrete, or manifold-valued processes. In particular, the learnt process at t = 1 samples from the distribution of Φ(Y1), which is the desired data distribution. We give sufficient conditions for a loss function to be valid in this general setting, recovering the results of the works cited above as corollaries. This result has broad applicability, enabling the construction of a wide array of new flow matching schemes by allowing for a more general class of latent spaces. As a concrete new application, we outline a non-projection Φ: Y X with manifold-valued latents for protein structure generation that separates chain-level rigid-body motion from internal flexibility ( 4), where the particular chain-level versus residue-level or internal state is latent, and the model only sees the world state, which we plan to implement in future work. 2 EARLIERWORK Several recent generative models train with the aid of a latent stochastic process that is marginalised out at generation time.

artificial intelligence, generator, machine learning, (17 more...)

arXiv.org Machine Learning

2605.20547

Genre: Research Report (0.41)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Ambient Diffusion: Learning Clean Distributions from Corrupted Data

Neural Information Processing SystemsMay-1-2026, 01:31:36 GMT

We present the first diffusion-based framework that can learn an unknown distribution using only highly-corrupted samples. This problem arises in scientific applications where access to uncorrupted samples is impossible or expensive to acquire. Another benefit of our approach is the ability to train generative models that are less likely to memorize any individual training sample, since they never observe clean training data. Our main idea is to introduce additional measurement distortion during the diffusion process and require the model to predict the original corrupted image from the further corrupted image. We prove that our method leads to models that learn the conditional expectation of the full uncorrupted image given this additional measurement corruption. This holds for any corruption process that satisfies some technical conditions (and in particular includes inpainting and compressed sensing). We train models on standard benchmarks (CelebA, CIFAR-10 and AFHQ) and show that we can learn the distribution even when all the training samples have 90%of their pixels missing. We also show that we can finetune foundation models on small corrupted datasets (e.g. MRI scans with block corruptions) and learn the clean distribution without memorizing the training set.

artificial intelligence, diffusion model, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.54)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Health & Medicine > Health Care Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

On Learning Fairness and Accuracy on Multiple Subgroups

Neural Information Processing SystemsApr-28-2026, 02:35:39 GMT

We propose an analysis in fair learning that preserves the utility of the data while reducing prediction disparities under the criteria of group sufficiency. We focus on the scenario where the data contains multiple or even many subgroups, each with limited number of samples. As a result, we present a principled method for learning a fair predictor for all subgroups via formulating it as a bilevel objective. In the lower-level, the subgroup-specific predictors are learned through a small amount of data and the fair predictor. In the upper-level, the fair predictor is updated to be close to all subgroup specific predictors. We further prove that such a bilevel objective can effectively control the group sufficiency and generalization error. We evaluate the proposed framework on real-world datasets. Empirical evidence suggests the consistently improved fair predictions, as well as the comparable accuracy to the baselines.

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec (0.28)

Genre: Research Report (0.46)

Industry:

Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

3152e3b1e52e2cb123363787d5f76c95-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 09:03:42 GMT

artificial intelligence, machine learning, projection, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Error Propagation and Model Collapse in Diffusion Models: A Theoretical Study

Khelifa, Nail B., Turner, Richard E., Venkataramanan, Ramji

arXiv.org Machine LearningFeb-19-2026

Machine learning models are increasingly trained or fine-tuned on synthetic data. Recursively training on such data has been observed to significantly degrade performance in a wide range of tasks, often characterized by a progressive drift away from the target distribution. In this work, we theoretically analyze this phenomenon in the setting of score-based diffusion models. For a realistic pipeline where each training round uses a combination of synthetic data and fresh samples from the target distribution, we obtain upper and lower bounds on the accumulated divergence between the generated and target distributions. This allows us to characterize different regimes of drift, depending on the score estimation error and the proportion of fresh data used in each generation. We also provide empirical results on synthetic data and images to illustrate the theory.

artificial intelligence, error propagation and model collapse, machine learning, (14 more...)

arXiv.org Machine Learning

2602.16601

Country: