AITopics

2606.29104

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Machine Learning, Jun-30-2026

ITSPACE: Monotone Gaussian Optimal Transport Updates

Na, Woojoo, Dy, Jennifer

Covariance matrices serve as compact descriptors of feature distributions in many machine-learning pipelines, including domain adaptation and Gaussian embeddings. Under a centered Gaussian approximation, the unregularized Wasserstein-2 optimal-transport (OT) discrepancy admits a closed form on covariances given by the Bures-Wasserstein (BW) objective on the symmetric positive definite (SPD) cone. We propose ITSPACE (Iterative Transport for Stable Proximal Alignment of Covariance Embeddings), a proximal majorization-minimization method that directly optimizes this exact BW objective through closed-form updates in a square-root factorization. In exact arithmetic, each iteration satisfies a sufficient-decrease inequality for the BW objective; under inexact polar computations, we provide an explicit certificate-gap bound controlling deviations from exact descent. The resulting iterations preserve PSD structure by construction and naturally support rank-restricted factors, making ITSPACE well-suited as a lightweight inner-loop primitive in settings where adaptation must be performed from unlabeled target batches under strict step and compute budgets. Across real-world covariance-alignment benchmarks, ITSPACE reaches low-BW-gap solutions substantially faster than BW-gradient descent, methods based on other covariance geometries, and entropically regularized sample-OT baselines.
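For orientation, the closed-form objective mentioned in the abstract can be written down directly. The sketch below computes the standard squared Bures-Wasserstein distance between two PSD matrices, which equals the squared Wasserstein-2 distance between the centered Gaussians with those covariances. It is not the ITSPACE update itself; the helper names `sqrtm_psd` and `bw_squared` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric PSD square root via eigendecomposition (hypothetical helper)."""
    w, V = np.linalg.eigh((M + M.T) / 2)   # symmetrize against round-off
    w = np.clip(w, 0.0, None)              # clamp tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T          # V diag(sqrt(w)) V^T

def bw_squared(A, B):
    """Squared Bures-Wasserstein distance:
    tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2})."""
    rA = sqrtm_psd(A)
    cross = sqrtm_psd(rA @ B @ rA)
    return float(np.trace(A) + np.trace(B) - 2.0 * np.trace(cross))

# Commuting (diagonal) case: the distance reduces to
# sum_i (sqrt(a_i) - sqrt(b_i))^2.
A = np.diag([4.0, 1.0])
B = np.diag([1.0, 9.0])
d2 = bw_squared(A, B)
```

In the diagonal case above the value can be checked by hand as (2 - 1)^2 + (1 - 3)^2.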

artificial intelligence, machine learning, objective, (16 more...)

2606.30523

Country: Asia (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

arXiv.org Machine Learning, Jun-29-2026

The Decision Geometry of Covariance Estimation for the Global Minimum-Variance Portfolio under Heavy Tails

Fonseca, Xavier

The global minimum-variance portfolio (GMVP) is the canonical decision built from an estimated covariance matrix, yet covariance estimators are universally evaluated by matrix-norm loss, which is not the object the decision depends on. We characterise exactly how covariance-estimation error maps into GMVP suboptimality. We prove an exact regret identity and a non-asymptotic bound showing decision regret depends on the estimation error only through its action on the portfolio weights, scaled by portfolio concentration and the conditioning of the true covariance. From this we derive the decision geometry: GMVP regret is invariant to a (p-1)-dimensional projection of the p^2-dimensional error matrix, with invariance to the covariance-scale direction as an exact special case. We then apply the framework to heavy-tailed returns (tail index kappa in (2,4)), establishing the regret convergence rate implied by the centred operator-norm rate, and confirm the theory on a skew-t/t-copula simulation design with pre-registered analysis. The decision-focused advantage is a sharper constant and a concentration discount rather than a faster rate; we report an honest high-conditioning boundary of the rate prediction. The results complement recent decision-focused learning approaches by supplying the exact estimation geometry and consistency theory they lack.
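To make the notion of decision regret concrete, here is a minimal sketch using the textbook GMVP weights w = S^{-1}1 / (1'S^{-1}1). The regret is taken as the excess true variance of the portfolio built from the estimate. The quadratic-form identity checked at the end is a standard consequence of both weight vectors summing to one and is not claimed to be the identity proved in the paper. The matrices and the function names `gmvp_weights` and `gmvp_regret` are illustrative assumptions.

```python
import numpy as np

def gmvp_weights(cov):
    """Global minimum-variance weights: cov^{-1} 1, normalized to sum to 1."""
    x = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return x / x.sum()

def gmvp_regret(cov_true, cov_est):
    """Excess true variance of the portfolio built from cov_est."""
    w_hat = gmvp_weights(cov_est)
    w_star = gmvp_weights(cov_true)
    return float(w_hat @ cov_true @ w_hat - w_star @ cov_true @ w_star)

# Illustrative (made-up) covariance and a perturbed estimate of it.
cov_true = np.array([[0.04, 0.01, 0.00],
                     [0.01, 0.09, 0.02],
                     [0.00, 0.02, 0.16]])
cov_est = cov_true + np.diag([0.01, -0.02, 0.03])

regret = gmvp_regret(cov_true, cov_est)
# Rescaling the estimate leaves the weights, and hence the regret, unchanged.
regret_scaled = gmvp_regret(cov_true, 7.5 * cov_est)

# Standard identity: regret = (w_hat - w*)' cov_true (w_hat - w*).
diff = gmvp_weights(cov_est) - gmvp_weights(cov_true)
quad = float(diff @ cov_true @ diff)
```

The scale check mirrors the invariance to the covariance-scale direction mentioned in the abstract: only the direction of the estimated inverse covariance applied to the ones vector matters.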

artificial intelligence, machine learning, portfolio, (16 more...)

2606.27462

Country:

North America (0.46)
Europe (0.46)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Ouyang, Nathan, Wang, Kexin, Seigal, Anna

Tensor-based second-order causal discovery

arXiv.org Machine Learning, Jun-24-2026

Causal discovery seeks to uncover the causal dependencies among variables. For this purpose, we propose an algorithm called Tensor-based Second-order Causal Discovery (TSCD). Its input is a tensor obtained from the covariance matrices of observational and interventional data. Assuming the causal dependencies follow a linear structural equation model on a directed acyclic graph (DAG), TSCD outputs the DAG and the functions on its edges, requiring only that the noise variables are uncorrelated. We also implement a version of the approach for nonlinear models. Our focus on second-order statistics (via the covariance matrices) is motivated by their statistical and computational efficiency relative to higher-order moments, their identifiability relative to first-order statistics, and their applicability regardless of whether the variables are Gaussian. We show that TSCD has identifiable causal order and parameters from a number of interventions that is logarithmic in the number of variables. Experiments show that TSCD is robust to noise, competitive with existing methods, and scales to hundreds of variables.
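The second-order quantities that such a method consumes can be illustrated with a small linear SEM. The sketch below does not implement TSCD. It only builds the covariance implied by X = BX + e with uncorrelated noise, shows how a perfect intervention on one node changes it, and recovers the parameters in the easy case where the causal order is already known (via a Cholesky factorization). The convention that B is strictly lower triangular, the intervention model, and all names and numbers are assumptions made for illustration.

```python
import numpy as np

def sem_covariance(B, noise_var):
    """Covariance of X = B X + e with Cov(e) = diag(noise_var):
    (I - B)^{-1} diag(noise_var) (I - B)^{-T}."""
    n = B.shape[0]
    M = np.linalg.inv(np.eye(n) - B)
    return M @ np.diag(noise_var) @ M.T

def recover_given_order(cov):
    """Recover (B, noise_var) when variables are already in causal order."""
    C = np.linalg.cholesky(cov)      # cov = C C^T, C lower triangular
    d = np.diag(C)                   # square roots of the noise variances
    L = C / d                        # divide column j by d[j]: unit lower triangular
    B = np.eye(cov.shape[0]) - np.linalg.inv(L)
    return B, d ** 2

# Edge weights B[i, j] for j -> i, with nodes listed in causal order.
B = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],
              [-0.5, 0.3, 0.0]])
noise_var = np.array([1.0, 0.5, 2.0])
cov_obs = sem_covariance(B, noise_var)

# Assumed perfect intervention on node 1: cut its incoming edges, reset its noise.
B_int = B.copy()
B_int[1, :] = 0.0
var_int = noise_var.copy()
var_int[1] = 4.0
cov_int = sem_covariance(B_int, var_int)

B_rec, var_rec = recover_given_order(cov_obs)
```

Without the ordering, a single covariance matrix does not pin down B, which is one way to see why interventional covariances are useful as additional input.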

artificial intelligence, decomposition, machine learning, (19 more...)

2606.18074

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Neural Information Processing Systems, Jun-23-2026, 03:51:29 GMT

Neural Evolution Strategy for Black-box Pareto Set Learning

Multi-objective optimization problems (MOPs) are prevalent in numerous real-world applications. Recently, Pareto Set Learning (PSL) has emerged as a powerful paradigm for solving MOPs. PSL can produce a neural network for modeling the set of all Pareto optimal solutions. However, applying PSL to black-box objectives, particularly those exhibiting non-separability, high dimensionality, and/or other complex properties, remains very challenging. To address this issue, we propose leveraging evolution strategies (ESs), a class of specialized black-box optimization algorithms, within the PSL paradigm. Traditional ESs capture the complex dimensional dependencies less efficiently, which can significantly hinder their performance in PSL. To tackle this issue, we suggest encapsulating the dependencies within a neural network, which is then trained using a novel gradient estimation method. The proposed method, termed Neural-ES, is evaluated using a bespoke benchmark suite for black-box PSL. Experimental comparisons with other methods demonstrate the efficiency of Neural-ES, underscoring its ability to learn the Pareto sets of challenging black-box MOPs.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

Country: Asia > China (0.46)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Neural Information Processing Systems, Jun-22-2026, 21:47:39 GMT

Variational Inference with Mixtures of Isotropic Gaussians

Variational inference (VI) is a popular approach in Bayesian inference that seeks the best approximation of the posterior distribution within a parametric family by minimizing a loss, typically the (reverse) Kullback-Leibler (KL) divergence. In this paper, we focus on the following parametric family: mixtures of isotropic Gaussians (i.e., with diagonal covariance matrices proportional to the identity) and uniform weights. We develop a variational framework and provide efficient algorithms suited for this family. In contrast with mixtures of Gaussians with generic covariance matrices, this choice balances accurate approximation of multimodal Bayesian posteriors against memory and computational efficiency. Our algorithms implement gradient descent on the locations of the mixture components (the modes of the Gaussians), and either (entropic) Mirror or Bures descent on their variance parameters. We illustrate the performance of our algorithms in numerical experiments.

Neural Information Processing Systems, Jun-21-2026, 12:42:52 GMT

Seeds of Structure: Patch PCA Reveals Universal Compositional Cues in Diffusion Models

Diffusion models transform random noise into images of remarkable fidelity, yet the structure of this noise-to-image map remains largely unexplored. We investigate this relationship using patch-wise Principal Component Analysis (PCA) and empirically demonstrate that low-frequency components of the initial noise predominantly influence the compositional structure of generated images. Our analyses reveal that noise seeds inherently contain universal compositional cues, evident when identical seeds produce images with similar structural attributes across different datasets and model architectures. Leveraging these insights, we develop and theoretically justify a simple yet effective Patch PCA denoiser that extracts underlying structure from noise using only generic natural image statistics. The robustness of these structural cues is observed to persist across both pixel-space models and latent diffusion models, highlighting their fundamental nature. Finally, we introduce a zero-shot editing method that enables injecting compositional control over generated images, providing an intuitive approach to guided generation without requiring model fine-tuning or additional training.

diffusion model, large language model, machine learning, (19 more...)

Country:

North America > United States > California (0.28)
North America > United States > Missouri (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

Neural Information Processing Systems, Jun-19-2026, 10:57:55 GMT

Locality in Image Diffusion Models Emerges from Data Statistics

Recent work has shown that the generalization ability of image diffusion models arises from the locality properties of the trained neural network. In particular, when denoising a particular pixel, the model relies on a limited neighborhood of the input image around that pixel, which, according to the previous work, is tightly related to the ability of these models to produce novel images. Since locality is central to generalization, it is crucial to understand why diffusion models learn local behavior in the first place, as well as the factors that govern the properties of locality patterns. In this work, we present evidence that the locality in deep diffusion models emerges as a statistical property of the image dataset and is not due to the inductive bias of convolutional neural networks, as suggested in previous work. Specifically, we demonstrate that an optimal parametric linear denoiser exhibits similar locality properties to deep neural denoisers. We show, both theoretically and experimentally, that this locality arises directly from pixel correlations present in the image datasets. Moreover, locality patterns are drastically different on specialized datasets, approximating principal components of the data's covariance. We use these insights to craft an analytical denoiser that better matches scores predicted by a deep diffusion model than prior expert-crafted alternatives. Our key takeaway is that while neural network architectures influence generation quality, their primary role is to capture locality patterns inherent in the data.
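The claim that a linear denoiser inherits locality from pixel correlations can be illustrated with the classical Wiener filter. The sketch below assumes zero-mean data with covariance S observed under additive white Gaussian noise of standard deviation sigma, for which the MSE-optimal linear denoiser is W = S (S + sigma^2 I)^{-1}. Whether this matches the exact parametrization used in the paper is an assumption. The 1-D "image" with exponentially decaying correlations, and all names and constants, are made up for illustration.

```python
import numpy as np

def wiener_denoiser(cov, sigma):
    """MSE-optimal linear map from y = x + sigma * noise to an estimate of x,
    for zero-mean x with covariance `cov`."""
    n = cov.shape[0]
    return cov @ np.linalg.inv(cov + sigma**2 * np.eye(n))

# Toy 1-D signal: correlation between pixels i and j decays as rho^|i-j|.
n, rho, sigma = 32, 0.8, 0.5
idx = np.arange(n)
cov = rho ** np.abs(idx[:, None] - idx[None, :])

W = wiener_denoiser(cov, sigma)

# Row `center` of W lists how much each input pixel contributes to the
# denoised value at pixel `center`.
center = n // 2
row = np.abs(W[center])
frac_near = row[center - 2:center + 3].sum() / row.sum()
```

Here `frac_near` measures how much of the filter's absolute weight falls within two pixels of the output location. In this toy setting almost all of it does, even though no locality constraint was imposed on W.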

artificial intelligence, deep learning, machine learning, (18 more...)

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Neural Information Processing Systems, Jun-19-2026, 01:49:47 GMT

Topology-Aware Conformal Prediction for Stream Networks

Existing approaches either neglect dependencies, leading to overly conservative predictions, or rely solely on data-driven estimations, failing to capture the rich topological structure of the network. To address these challenges, we propose Spatio-Temporal Adaptive Conformal Inference (STACI), a novel framework that integrates network topology and temporal dynamics into the conformal prediction framework. STACI introduces a topology-aware nonconformity score that respects directional flow constraints and dynamically adjusts prediction sets to account for temporal distributional shifts. We provide theoretical guarantees on the validity of our approach and demonstrate its superior performance on both synthetic and real-world datasets. Our results show that STACI effectively balances prediction efficiency and coverage, outperforming existing conformal prediction methods for stream networks.
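As background for the conformal framework that STACI extends, the following is a minimal sketch of plain split conformal prediction with an absolute-residual nonconformity score. It has no topology awareness and no temporal adaptation, so it should be read as the generic baseline rather than the proposed method. The function names and calibration scores are hypothetical.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample conformal quantile: the k-th smallest calibration score,
    with k = ceil((n + 1) * (1 - alpha))."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf          # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def prediction_interval(point_pred, scores, alpha):
    """Symmetric interval around a point prediction."""
    q = conformal_quantile(scores, alpha)
    return point_pred - q, point_pred + q

# Made-up absolute residuals |y - y_hat| on a held-out calibration set.
cal_scores = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]
q = conformal_quantile(cal_scores, alpha=0.2)
low, high = prediction_interval(10.0, cal_scores, alpha=0.2)
```

Under exchangeability of calibration and test points, intervals built this way cover the true value with probability at least 1 - alpha. Methods for dependent data replace the score or the quantile step to restore validity when that assumption fails.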

data mining, machine learning, prediction, (21 more...)

Country: North America > United States > Illinois (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Energy (0.67)
Transportation > Infrastructure & Services (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
(2 more...)

Mauri, Lorenzo, Dunson, David B.

Overfitted high-dimensional matrix factorizations via adaptive spectral shrinkage

arXiv.org Machine Learning, Jun-19-2026

Factor models are popular approaches for analyzing high-dimensional data to extract low-rank signals and estimate covariances. They decompose the covariance matrix as the sum of low-rank and diagonal components. A key issue is how to choose the latent dimension $k$, which is particularly challenging when the factor model only holds approximately and in low signal-to-noise scenarios. Bayesian overfitted factor models specify an upper bound on $k$ and rely on structured shrinkage priors to effectively remove extra components. Such approaches are popular and effective, but computationally expensive. We propose a much faster EigenBayes approach that provides valid uncertainty quantification, based on spectral estimation of latent factors and adaptive empirical Bayes calibration of key hyperparameters. The resulting posterior distribution factorizes across outcomes and is analytically tractable, bypassing Markov chain Monte Carlo. We show that EigenBayes adapts to the signal-to-noise ratio of each outcome and latent dimension, while shrinking superfluous latent components to zero. We establish favorable asymptotic properties and demonstrate strong empirical performance in numerical experiments and a genomics application, where EigenBayes outperforms state-of-the-art alternatives.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2606.1954

Country: Europe (0.67)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)