AITopics | asserstein distance

Collaborating Authors

asserstein distance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generalized infinite dimensional Alpha-Procrustes based geometries

Goomanee, Salvish, Han, Andi, Jawanpuria, Pratik, Mishra, Bamdev

arXiv.org Machine LearningNov-14-2025

Symmetric positive definite (SPD) matrices and operators are central to a wide range of problems in data science, including covariance estimation, kernel methods, diffusion geometry, and generative modeling. While the geometry of SPD matrices has been extensively studied in the finite-dimensional settingwith popular metrics such as the affine-invariant, Log-Euclidean, and Bures-Wasserstein (BW) distances, many real-world applications inherently involve infinite-dimensional SPD operators. These include covariance operators on functional spaces, integral kernels, and diffusion operators on manifolds. However, most existing geometric frameworks do not generalize coherently across finite and infinite dimensions, leading to inconsistencies in modeling, analysis, and computation. To address this, we propose a unifying family of Riemannian distances based on generalized alpha-Procrustes distances. This family includes the Log-Hilbert-Schmidt and infinite-dimensional GBW metrics as special cases and enables a continuous interpolation between them. Crucially, it is designed to extend smoothly from finite-dimensional SPD matrices to infinite-dimensional positive-definite Hilbert-Schmidt operators, offering a robust and flexible geometric foundation for both theoretical analysis and practical machine learning applications.

artificial intelligence, machine learning, operator, (17 more...)

arXiv.org Machine Learning

2511.09801

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

The analogy theorem in Hoare logic

Nikita, Nikitin

arXiv.org Machine LearningOct-7-2025

The introduction of machine learning methods has led to significant advances in automation, optimization, and discoveries in various fields of science and technology. However, their widespread application faces a fundamental limitation: the transfer of models between data domains generally lacks a rigorous mathematical justification. The key problem is the lack of formal criteria to guarantee that a model trained on one type of data will retain its properties on another.This paper proposes a solution to this problem by formalizing the concept of analogy between data sets and models using first-order logic and Hoare logic.We formulate and rigorously prove a theorem that sets out the necessary and sufficient conditions for analogy in the task of knowledge transfer between machine learning models. Practical verification of the analogy theorem on model data obtained using the Monte Carlo method, as well as on MNIST and USPS data, allows us to achieving F1 scores of 0.84 and 0.88 for convolutional neural networks and random forests, respectively.The proposed approach not only allows us to justify the correctness of transfer between domains but also provides tools for comparing the applicability of models to different types of data.The main contribution of the work is a rigorous formalization of analogy at the level of program logic, providing verifiable guarantees of the correctness of knowledge transfer, which opens new opportunities for both theoretical research and the practical use of machine learning models in previously inaccessible areas.

analogy theorem, logic, metric, (14 more...)

arXiv.org Machine Learning

2510.03685

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.93)
Government > Regional Government > North America Government > United States Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Modelling bounded rational decision-making through Wasserstein constraints

Evans, Benjamin Patrick, Ardon, Leo, Ganesh, Sumitra

arXiv.org Artificial IntelligenceJun-2-2025

Modelling bounded rational decision-making through information constrained processing provides a principled approach for representing departures from rationality within a reinforcement learning framework, while still treating decision-making as an optimization process. However, existing approaches are generally based on Entropy, Kullback-Leibler divergence, or Mutual Information. In this work, we highlight issues with these approaches when dealing with ordinal action spaces. Specifically, entropy assumes uniform prior beliefs, missing the impact of a priori biases on decision-makings. KL-Divergence addresses this, however, has no notion of "nearness" of actions, and additionally, has several well known potentially undesirable properties such as the lack of symmetry, and furthermore, requires the distributions to have the same support (e.g. positive probability for all actions). Mutual information is often difficult to estimate. Here, we propose an alternative approach for modeling bounded rational RL agents utilising Wasserstein distances. This approach overcomes the aforementioned issues. Crucially, this approach accounts for the nearness of ordinal actions, modeling "stickiness" in agent decisions and unlikeliness of rapidly switching to far away actions, while also supporting low probability actions, zero-support prior distributions, and is simple to calculate directly.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2504.03743

Country: Europe (0.14)

Genre: Research Report (0.40)

Industry: Banking & Finance (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)

Add feedback

Power Spectrum Signatures of Graphs

Djima, Karamatou Yacoubou, Yim, Ka Man

arXiv.org Machine LearningMar-14-2025

Point signatures based on the Laplacian operators on graphs, point clouds, and manifolds have become popular tools in machine learning for graphs, clustering, and shape analysis. In this work, we propose a novel point signature, the power spectrum signature, a measure on $\mathbb{R}$ defined as the squared graph Fourier transform of a graph signal. Unlike eigenvectors of the Laplacian from which it is derived, the power spectrum signature is invariant under graph automorphisms. We show that the power spectrum signature is stable under perturbations of the input graph with respect to the Wasserstein metric. We focus on the signature applied to classes of indicator functions, and its applications to generating descriptive features for vertices of graphs. To demonstrate the practical value of our signature, we showcase several applications in characterizing geometry and symmetries in point cloud data, and graph regression problems.

artificial intelligence, machine learning, signature, (13 more...)

arXiv.org Machine Learning

2503.0966

Country:

North America > United States (0.14)
Europe > United Kingdom (0.04)

Genre:

Overview (0.92)
Research Report (0.81)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Add feedback

Constrained Langevin Algorithms with L-mixing External Random Variables

Zheng, Yuping, Lamperski, Andrew

arXiv.org Artificial IntelligenceJan-7-2023

Langevin algorithms are gradient descent methods augmented with additive noise, and are widely used in Markov Chain Monte Carlo (MCMC) sampling, optimization, and machine learning. In recent years, the non-asymptotic analysis of Langevin algorithms for non-convex learning has been extensively explored. For constrained problems with non-convex losses over a compact convex domain with IID data variables, the projected Langevin algorithm achieves a deviation of $O(T^{-1/4} (\log T)^{1/2})$ from its target distribution [27] in $1$-Wasserstein distance. In this paper, we obtain a deviation of $O(T^{-1/2} \log T)$ in $1$-Wasserstein distance for non-convex losses with $L$-mixing data variables and polyhedral constraints (which are not necessarily bounded). This improves on the previous bound for constrained problems and matches the best-known bound for unconstrained problems.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2205.14192

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.04)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Gaussian Approximation of Quantization Error for Estimation from Compressed Data

Kipnis, Alon, Reeves, Galen

arXiv.org Machine LearningJan-9-2020

We consider the distributional connection between the lossy compressed representation of a high-dimensional signal $X$ using a random spherical code and the observation of $X$ under an additive white Gaussian noise (AWGN). We show that the Wasserstein distance between a bitrate-$R$ compressed version of $X$ and its observation under an AWGN-channel of signal-to-noise ratio $2^{2R}-1$ is sub-linear in the problem dimension. We utilize this fact to connect the risk of an estimator based on an AWGN-corrupted version of $X$ to the risk attained by the same estimator when fed with its bitrate-$R$ quantized version. We demonstrate the usefulness of this connection by deriving various novel results for inference problems under compression constraints, including noisy source coding and limited-bitrate parameter estimation.

artificial intelligence, information theory, machine learning, (16 more...)

arXiv.org Machine Learning

2001.03243

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Quantile Propagation for Wasserstein-Approximate Gaussian Processes

Zhang, Rui, Walder, Christian J., Bonilla, Edwin V., Rizoiu, Marian-Andrei, Xie, Lexing

arXiv.org Machine LearningDec-21-2019

In this work, we develop a new approximation method to solve the analytically intractable Bayesian inference for Gaussian process models with factorizable Gaussian likelihoods and single-output latent functions. Our method -- dubbed QP -- is similar to the expectation propagation (EP), however it minimizes the $L^2$ Wasserstein distance instead of the Kullback-Leibler (KL) divergence. We consider the specific case in which the non-Gaussian likelihood is approximated by the Gaussian likelihood. We show that QP has the following properties: (1) QP matches quantile functions rather than moments in EP; (2) QP and EP have the same local update for the mean of the approximate Gaussian likelihood; (3) the local variance estimate for the approximate likelihood is smaller for QP than for EP's, addressing EP's over-estimation of the variance; (4) the optimal approximate Gaussian likelihood enjoys a univariate parameterization, reducing memory consumption and computation time. Furthermore, we provide a unified interpretations of EP and QP -- both are coordinate descent algorithms of a KL and an $L^2$ Wasserstein global objective function respectively, under the same assumptions. In the performed experiments, we employ eight real world datasets and we show that QP outperforms EP for the task of Gaussian process binary classification.

asserstein distance, classification, likelihood, (15 more...)

arXiv.org Machine Learning

1912.102

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)

Add feedback

Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces

Alvarez-Melis, David, Mroueh, Youssef, Jaakkola, Tommi S.

arXiv.org Machine LearningNov-6-2019

This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This is a problem that appears across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vector-space representation of the items in the two hierarchies, we seek to infer correspondences across them. Our work derives from and interweaves hyperbolic-space representations for hierarchical data, on one hand, and unsupervised word-alignment methods, on the other. We first provide a set of negative results showing how and why Euclidean methods fail in this hyperbolic setting. We then propose a novel approach based on optimal transport over hyperbolic spaces, and show that it outperforms standard embedding alignment techniques in various experiments on cross-lingual WordNet alignment and ontology matching tasks.

correspondence, dataset, hyperbolic space, (14 more...)

arXiv.org Machine Learning

1911.02536

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Italy (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.77)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback