Yao, Rentian
Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization
Kim, Kaheon, Yao, Rentian, Zhu, Changbo, Chen, Xiaohui
The optimal transport barycenter (a.k.a. Wasserstein barycenter) is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computing the unregularized barycenter of discretized probability distributions on point clouds is a challenging task when the domain dimension $d > 1$. Most practical algorithms for approximating the barycenter problem are based on entropic regularization. In this paper, we introduce a primal-dual algorithm with nearly linear $O(m \log{m})$ time and linear $O(m)$ space complexity, the Wasserstein-Descent $\dot{\mathbb{H}}^1$-Ascent (WDHA) algorithm, for computing the exact barycenter when the input probability density functions are discretized on an $m$-point grid. The success of the WDHA algorithm hinges on alternating between two different yet closely related Wasserstein and Sobolev optimization geometries for the primal barycenter and dual Kantorovich potential subproblems. Under reasonable assumptions, we establish the convergence rate and iteration complexity of WDHA to its stationary point when the step size is appropriately chosen. Superior computational efficiency, scalability, and accuracy over existing Sinkhorn-type algorithms are demonstrated on high-resolution (e.g., $1024 \times 1024$ images) 2D synthetic and real data.
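The $O(m \log m)$ cost suggests the $\dot{\mathbb{H}}^1$-ascent step is realized by preconditioning an ordinary $L^2$ gradient with an FFT-based Poisson solve, which is the standard way to take a Sobolev gradient step on a grid. The sketch below is a hypothetical illustration of that preconditioning alone (assuming periodic boundary conditions and a spectral Laplacian); it is not the authors' code:

```python
import numpy as np

def sobolev_gradient(g):
    """Precondition an L2 gradient field g on a periodic grid into an
    H^1-dot gradient by solving the Poisson equation -lap(h) = g - mean(g)
    with the FFT, in O(m log m) time for m grid points.
    Hypothetical sketch, not the WDHA reference implementation."""
    n1, n2 = g.shape
    k1 = np.fft.fftfreq(n1) * 2 * np.pi  # frequencies in rad/sample
    k2 = np.fft.fftfreq(n2) * 2 * np.pi
    lap = k1[:, None] ** 2 + k2[None, :] ** 2  # spectral (-Laplacian)
    lap[0, 0] = 1.0  # avoid division by zero at the constant mode
    G = np.fft.fft2(g - g.mean())  # project out the constant mode
    H = G / lap
    H[0, 0] = 0.0
    return np.real(np.fft.ifft2(H))
```

In a full primal-dual loop, this preconditioned direction would drive the dual potential ascent, while the primal barycenter update moves in the Wasserstein geometry.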
Mean-field Variational Inference via Wasserstein Gradient Flow
Yao, Rentian, Yang, Yun
One of the core problems of modern Bayesian inference is to compute the posterior distribution, a joint probability measure over unknown quantities, such as model parameters and unobserved latent variables, obtained by combining data information with prior knowledge in a principled manner. Modern statistics often relies on complex models for which the posterior distribution is analytically intractable and requires approximate computation. As a common alternative to the conventional Markov chain Monte Carlo (MCMC) sampling approach for approximating the posterior, variational inference (VI, [10]), or variational Bayes [27], finds the closest member of a user-specified class of analytically tractable distributions, referred to as the variational (distribution) family, to the target posterior. Although MCMC is asymptotically exact, VI is usually orders of magnitude faster [12, 62] since it turns sampling or integration into an optimization problem. VI has successfully demonstrated its power in a wide variety of applications, including clustering [11, 23], semi-supervised learning [38], neural-network training [5, 52], and probabilistic modeling [13, 36]. Among various approximation schemes, the mean-field (MF) approximation, which originates from statistical mechanics and uses an approximating family consisting of all fully factorized density functions over (blocks of) the unknown quantities, is the most widely used and representative instance of VI, conceptually simple yet practically powerful. On the downside, VI still requires a certain conditional conjugacy structure to facilitate efficient computation (c.f.
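For intuition about the fully factorized (mean-field) family, the classic closed-form example is coordinate-ascent VI (CAVI) for a bivariate Gaussian target, where each factor update is available in closed form. The sketch below shows that textbook special case only; the paper itself studies MF-VI through the lens of Wasserstein gradient flow rather than CAVI:

```python
import numpy as np

def mf_cavi_gaussian(mu, Lam, iters=100):
    """Coordinate-ascent mean-field VI for a bivariate Gaussian target
    N(mu, Lam^{-1}): each factor q_i is N(m_i, 1/Lam_ii), and the means
    are updated in closed form until convergence.
    Textbook illustration of the mean-field family, not the paper's method."""
    m = np.zeros(2)
    for _ in range(iters):
        m[0] = mu[0] - Lam[0, 1] / Lam[0, 0] * (m[1] - mu[1])
        m[1] = mu[1] - Lam[1, 0] / Lam[1, 1] * (m[0] - mu[0])
    return m, 1.0 / np.diag(Lam)  # factor means and variances
```

For a Gaussian target the factor means converge to the true mean, while the factor variances $1/\Lambda_{ii}$ understate the marginal variances whenever the coordinates are correlated, a well-known limitation of the mean-field family.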
A Gromov--Wasserstein Geometric View of Spectrum-Preserving Graph Coarsening
Chen, Yifan, Yao, Rentian, Yang, Yun, Chen, Jie
Graph coarsening is a technique for solving large-scale graph problems by working on a smaller version of the original graph, and possibly interpolating the results back to the original graph. It has a long history in scientific computing and has recently gained popularity in machine learning, particularly in methods that preserve the graph spectrum. This work studies graph coarsening from a different perspective, developing a theory for preserving graph distances and proposing a method to achieve it. The geometric approach is useful when working with a collection of graphs, such as in graph classification and regression. In this study, we consider a graph as an element of a metric space equipped with the Gromov--Wasserstein (GW) distance, and bound the difference between the distance of two graphs and that of their coarsened versions. Minimizing this difference can be done using the popular weighted kernel $K$-means method, which, with a proper choice of kernel, improves on existing spectrum-preserving methods. The study includes a set of experiments to support the theory and method, including approximating the GW distance, preserving the graph spectrum, classifying graphs using spectral information, and performing regression using graph convolutional networks. Code is available at https://github.com/ychen-stat-ml/GW-Graph-Coarsening.
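The solver the abstract points to, weighted kernel $K$-means, alternates between computing feature-space distances to weighted cluster centroids (using only kernel entries) and reassigning nodes. A generic loop of this kind is sketched below; the specific kernel tied to the GW bound is the paper's contribution and is omitted here, so this is not the authors' implementation:

```python
import numpy as np

def weighted_kernel_kmeans(K, w, labels, iters=20):
    """Generic weighted kernel K-means: K is an n x n kernel matrix,
    w holds node weights, labels is an initial assignment. The squared
    feature-space distance of node i to cluster c's weighted centroid is
    K_ii - 2*sum_{j in c} w_j K_ij / s_c + sum_{j,l in c} w_j w_l K_jl / s_c^2,
    where s_c is the total weight of cluster c. Illustrative sketch only."""
    n = len(w)
    for _ in range(iters):
        n_clusters = labels.max() + 1
        dists = np.zeros((n, n_clusters))
        for c in range(n_clusters):
            idx = labels == c
            s = w[idx].sum()
            if s == 0:  # empty cluster: never chosen
                dists[:, c] = np.inf
                continue
            cross = K[:, idx] @ w[idx] / s
            within = w[idx] @ K[np.ix_(idx, idx)] @ w[idx] / s**2
            dists[:, c] = np.diag(K) - 2 * cross + within
        labels = dists.argmin(axis=1)
    return labels
```

With a linear kernel $K = XX^\top$ and unit weights this reduces to ordinary Lloyd-style $K$-means on the rows of $X$, which is a convenient sanity check.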