AITopics | coupling

Collaborating Authors

coupling

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Saddle Networks: Structure-Preserving Architectures for Convex-Concave Functions

Warin, Xavier

arXiv.org Machine LearningMay-29-2026

Saddle-point models arise throughout optimization, optimal transport, robust learning, and control. In many applications, the relevant function f(x,y) is convex in x and concave in y, and preserving this geometry is essential for obtaining tractable min--max formulations and reliable certificates. We introduce a structured separable decomposition that preserves the convex-concave geometry and prove a complete one-dimensional approximation theorem under a mixed Monge-type convexity condition. We then describe practical saddle network architectures that preserve convexity in x and concavity in y by construction. The proposed architectures require only convexity-preserving neural networks, together with simple output transformations enforcing sign and concavity constraints. Finally, we report numerical benchmarks in dimension 1 and 5, showing that the proposed saddle networks achieve high accuracy on smooth, nonsmooth, and high-rank convex--concave test functions.

artificial intelligence, convex, machine learning, (17 more...)

arXiv.org Machine Learning

2605.28894

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback

Counterfactually Fair Regression via Optimal Transport

Lince, M. Generali, Gaucher, S., Vie, J-J., Loiseau, P.

arXiv.org Machine LearningMay-28-2026

We consider the problem of learning a counterfactually fair regressor. We adopt a causal uncertainty view in which counterfactual fairness is defined with resampled noise. We focus on obtaining theoretical fairness guarantees for a new post-processing estimator. We begin by showing that counterfactual fairness is equivalent to satisfying demographic parity conditional on the latent variable. This allows us to provide a closed-form expression of the optimal fair regressor via a barycentric quantile map. In order to handle continuous latent variables, we propose a discretized post-processing method. Then, under mild regularity assumptions, we prove high-probability finite-sample fairness guarantees for our estimator, providing an unfairness decay at rate $\tilde O(n^{-1/3})$, and establishing a matching risk bound of order $\tilde O(n^{-1/3})$. We provide a matching lower bound on the excess risk of almost fair predictions. Finally, we extend our results to the setting of relaxed counterfactual fairness. We validate our approach on real-world and synthetic data.

artificial intelligence, machine learning, predictor, (19 more...)

arXiv.org Machine Learning

2605.28251

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.94)
Law (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Quadratically Regularized Optimal Transport: Localization Bounds and Affine Case Analysis

Nguyen-Chi, Long, Nguyen, Nam, Nguyen, Binh

arXiv.org Machine LearningMay-27-2026

Quadratic regularization has emerged as a potential alternative to the popular entropic regularization in computational optimal transport, offering the theoretical advantage of producing sparse couplings through its hinge density structure. Despite recent progress in one-dimensional settings and general upper bounds, fundamental questions about the localization rate of QOT optimizers around the Monge coupling have remained open. In this work, we establish a general lower bound showing that the support of the QOT optimizer cannot concentrate around the Monge graph faster than order $\varepsilon^{\frac{1}{d+2}}$ in the directed Hausdorff distance, matching the conjectured optimal exponent under standard regularity assumptions in \citet{wiesel2025sparsity}. We also show that the QOT value gap controls the mean-squared deviation $\mathbb E_{π_\varepsilon}\|y-T(x)\|^2$ by the scale of $\varepsilon^{\frac{2}{d+2}}$. As a corollary, in the affine Brenier regime, which includes Gaussian-to-Gaussian transport, we derive a sharp pointwise tube bound of order $\varepsilon^{\frac{1}{d+2}}$ by reducing the problem to self-transport and applying recent self-transport sparsity results. Finally, we validate our theoretical bound with a synthetic experiment in high-dimensional settings.

artificial intelligence, machine learning, quadratically regularized optimal transport, (13 more...)

arXiv.org Machine Learning

2605.24644

Country: Asia (0.46)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Add feedback

Sliced-Regularized Optimal Transport

Nguyen, Khai

arXiv.org Machine LearningMay-21-2026

We propose a new regularized optimal transport (OT) formulation, termed sliced-regularized optimal transport (SROT). Unlike entropic OT (EOT), which regularizes the transport plan toward an independent coupling, SROT regularizes it toward a smoothened sliced OT (SOT) plan. To the best of our knowledge, SROT is the first approach to leverage a version of SOT plan as a reference to improve classical OT. We provide a formal definition of SROT, derive its dual formulation, and provide a post-Bayesian interpretation of SROT. We then develop a Sinkhorn-style algorithm for efficient computation, retaining the same scalability advantages as EOT. By incorporating a scalable SOT plan as a prior, SROT yields more accurate approximations of the exact OT plan than EOT under the same level of regularization. Moreover, the resulting transport plan improves upon the reference SOT plan itself. We further introduce the corresponding OT divergence induced by SROT, named SROT divergence, and analyze its topological and computational properties. Finally, we validate our approach through experiments on synthetic datasets and color transfer tasks, demonstrating that SROT is better than both EOT and SOT in approximating exact OT. Additional experiments on gradient flows further highlight the advantages of SROT divergence.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2604.23944

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Optimizing Computational-Statistical Runtime for Wasserstein Distance Estimation

Jacobs, Peter Matthew, Phillips, Jeff M.

arXiv.org Machine LearningMay-20-2026

Squared Wasserstein distance is a frequently used tool to measure discrepancy between probability distributions. This distance is typically computed between empirical measures of size $n$ from two underlying random samples. Unfortunately, even in lower dimensional Euclidean space problems $\left( d \in \{2,3\} \right)$, algorithms for Wasserstein distance computation with approximate or exact precision guarantees scale poorly in the runtime as a function of $n$ and the desired precision. In response, we consider the computational-statistical runtime, where the goal is to estimate from samples the Wasserstein distance between potentially smooth measures up to $ε$-additive error in expectation with respect to the sampling; we allow $O(1)$ computational cost for collecting a sample. Towards this, we develop a Sample-Sketch-Solve paradigm where we introduce a regular cartesian grid sketch of the samples. We show that (especially under $α$-Hölder smooth distributions) this can compress the data without increasing asymptotic error, and also regularizes the structure which enables faster exact algorithms. Ultimately, we approximate $W_2^2(P,Q)$ within $ε$ error in $ε^{-\max(2,\frac{d+1+o(1)}{1+α})}$ time for $0 < α< 1$ Hölder smooth distributions $P,Q$ on $(0,1)^{d}$; an optimal $Θ(ε^{-2})$ for $α> 1/2$ when $d=2$ and nearly optimal as $α\to 1$ when $d = 3$.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2605.20122

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Reasoning Models Don't Just Think Longer, They Move Differently

Gjølbye, Anders, Hansen, Lars Kai, Koyejo, Sanmi

arXiv.org Machine LearningMay-18-2026

Reasoning-trained language models often spend more tokens on harder problems, but longer chains of thought do not show whether a model is merely computing for more steps or following a different internal trajectory. We study this distinction through hidden-state trajectories during chain-of-thought generation across competitive programming, mathematics, and Boolean satisfiability. Raw trajectory geometry is strongly shaped by generation length: longer generations mechanically alter path statistics, so difficulty-dependent comparisons are misleading without adjustment. After residualizing trajectory statistics on length, difficulty remains systematically coupled to corrected trajectory geometry across all domains studied. The clearest reasoning-specific separation appears in the code domain, where harder problems show more direct corrected trajectories and less heterogeneous local curvature in reasoning-trained models than in matched instruction-tuned baselines. Corrected difficulty-geometry coupling is weaker, but still present, in mathematics and Boolean satisfiability. Prompt-stage linear probes do not mirror the code-domain separation, and behavioral annotations show that stronger corrected coupling co-occurs with strategy shifts and uncertainty monitoring. Together, these findings establish length correction as a prerequisite for generation-time trajectory analysis and show that reasoning training can be associated with distinct corrected trajectory geometry, with the strength of the effect depending on the domain.

large language model, machine learning, trajectory, (22 more...)

arXiv.org Machine Learning

2605.15454

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

Add feedback

Breaking the Finite-Sample Barrier in Entropy Coupling

Asoodeh, Shahab, Chen, Jun

arXiv.org Machine LearningMay-18-2026

Dependence among marginally constrained observations can break a finite-sample barrier. To formalize this phenomenon, we introduce the \emph{minimum list entropy coupling} $H(P\|Q_1,\dots,Q_m)$, the minimum conditional entropy $H(X|Y_1,\dots,Y_m)$ over all joint distributions with prescribed discrete marginals $X\sim P$ and $Y_i\sim Q_i$. Unlike classical formulations based on independent observations, our model allows $Y_1,\dots,Y_m$ to be arbitrarily dependent while keeping each marginal fixed. This enlarged coupling space reveals a sharp dichotomy: independent observations reduce residual uncertainty exponentially, whereas dependent observations can eliminate it exactly after finitely many samples. We characterize this zero-entropy regime through necessary and sufficient conditions and give concrete structural criteria under which it occurs. In particular, under mild support assumptions, zero entropy is achieved with $O(\log(1/P_{\min}))$ observations, where $P_{\min}$ is the minimum nonzero mass of $P$. We also develop a greedy algorithm with monotone approximation guarantees for computing $H(P\|Q_1,\dots,Q_m)$. Finally, we show that the same framework formalizes finite-sample limits in distribution-matching representation learning and randomness extraction, where zero entropy corresponds to exact recovery and exact extraction.

artificial intelligence, coupling, machine learning, (19 more...)

arXiv.org Machine Learning

2605.16229

Country: North America (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.45)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)

Add feedback

Finite-size scaling of hetero-associative retrieval in continuous-signal-driven Ising spin systems

Ladiana, Andrea

arXiv.org Machine LearningMay-15-2026

Kosko's Bidirectional Associative Memory [17] first formalised this idea for two layers, showing that stable recallContent-addressable memory--the recovery of a complete stored record from a partial or degraded cue--is aarises from the same energy-descent principle as in Hopcornerstone of neural computation and a paradigmaticfield networks but across two distinct pattern spaces: a problem in the statistical mechanics of disordered sys-cue presented to one layer drives the other toward the tems. The Hopfield model [1] demonstrated that binarymatching stored pattern, enabling cross-modal compleNtion. Multi-species spin-glass analyses [18] subsequentlypatterns in { 1,+1} can be stored as fixed-point attractors of an energy landscape shaped by Hebbian couplings, provided a rigorous thermodynamic foundation for arwhile Little's earlier stochastic formulation [2] cast thechitectures with an arbitrary number of interacting popsame architecture in the language of equilibrium statisti-ulations, generalising the classical single-species phase cal mechanics through parallel probabilistic updates.

archetype, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

2605.14059

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Coupling-Informed Transport Maps for Bayesian Filtering in Nonlinear Dynamical Systems

Zeng, Dengfei, Jiang, Lijian, Sun, Shuyu, Xiao, Dunhui

arXiv.org Machine LearningMay-14-2026

A likelihood-free transport filtering method is proposed based on the couplings between state and observation variables. By exploiting a block-triangular structure in the transport map, the analysis step of filtering is reformulated as the minimization of the maximum mean discrepancy (MMD) between the true joint measure and its transport-based approximation. To circumvent the non-convexity in the MMD optimization, we introduce a training-free transport filter method via gradient flows, which leads to an analytic computation for the transport map that implies the steepest descent direction of the MMD. The proposed approach accurately approximates non-Gaussian filtering posteriors and avoids particle collapse. We provide a convergence analysis for the expectation of the MMD between the approximated posterior and the truth posterior. Finally, we extend the method to high-dimensional problems through domain localization. Numerical examples demonstrate the superior performance of our approach over conventional filtering methods in nonlinear, non-Gaussian scenarios.

artificial intelligence, machine learning, transport map, (15 more...)

arXiv.org Machine Learning

2605.13174

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

Add feedback

Couple to Control: Joint Initial Noise Design in Diffusion Models

Jia, Jing, Shen, Liyue, Wang, Guanyang

arXiv.org Machine LearningMay-13-2026

Diffusion models typically generate image batches from independent Gaussian initial noises. We argue that this independence assumption is only one choice within a broader class of valid joint noise designs. Instead, one can specify a coupling of the initial noises: each noise remains marginally standard Gaussian, so the pretrained diffusion model receives the same single-sample input distribution, while the dependence across samples is chosen by design. This reframes initial-noise control from selecting or optimizing individual seeds to designing the dependence structure of a multi-sample gallery. This view gives a general framework for initial-noise design, covering several existing methods as special cases and leading naturally to new coupled-noise constructions. Coupled noise can improve generation on its own without adding sampling cost, and it is flexible enough to serve as a structured initialization for optimization-based pipelines when additional computation is available. Empirically, repulsive Gaussian coupling improves gallery diversity on SD1.5, SDXL, and SD3 while largely preserving prompt alignment and image quality. It matches or outperforms recent test-time noise-optimization baselines on several diversity metrics at the same sampling cost as independent generation. Subspace couplings also support fixed-object background generation, producing diverse, natural backgrounds compared with specialized inpainting baselines, with a tunable trade-off in foreground fidelity.

artificial intelligence, coupling, machine learning, (19 more...)

arXiv.org Machine Learning

2605.11311

Country: North America > United States > Michigan (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback