AITopics | Hsieh, Ya-Ping

Collaborating Authors

Hsieh, Ya-Ping

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sinkhorn Flow: A Continuous-Time Framework for Understanding and Generalizing the Sinkhorn Algorithm

Karimi, Mohammad Reza, Hsieh, Ya-Ping, Krause, Andreas

arXiv.org Machine LearningNov-28-2023

Many problems in machine learning can be formulated as solving entropy-regularized optimal transport on the space of probability measures. The canonical approach involves the Sinkhorn iterates, renowned for their rich mathematical properties. Recently, the Sinkhorn algorithm has been recast within the mirror descent framework, thus benefiting from classical optimization theory insights. Here, we build upon this result by introducing a continuous-time analogue of the Sinkhorn algorithm. This perspective allows us to derive novel variants of Sinkhorn schemes that are robust to noise and bias. Moreover, our continuous-time dynamics not only generalize but also offer a unified perspective on several recently discovered dynamics in machine learning and mathematics, such as the "Wasserstein mirror flow" of (Deb et al. 2023) or the "mean-field Schr\"odinger equation" of (Claisse et al. 2023).

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2311.16706

Country: North America > United States (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Riemannian stochastic optimization methods avoid strict saddle points

Hsieh, Ya-Ping, Karimi, Mohammad Reza, Krause, Andreas, Mertikopoulos, Panayotis

arXiv.org Artificial IntelligenceNov-4-2023

Many modern machine learning applications - from online principal component analysis to covariance matrix identification and dictionary learning - can be formulated as minimization problems on Riemannian manifolds, and are typically solved with a Riemannian stochastic gradient method (or some variant thereof). However, in many cases of interest, the resulting minimization problem is not geodesically convex, so the convergence of the chosen solver to a desirable solution - i.e., a local minimizer - is by no means guaranteed. In this paper, we study precisely this question, that is, whether stochastic Riemannian optimization algorithms are guaranteed to avoid saddle points with probability 1. For generality, we study a family of retraction-based methods which, in addition to having a potentially much lower per-iteration cost relative to Riemannian gradient descent, include other widely used algorithms, such as natural policy gradient methods and mirror descent in ordinary convex spaces. In this general setting, we show that, under mild assumptions for the ambient manifold and the oracle providing gradient information, the policies under study avoid strict saddle points / submanifolds with probability 1, from any initial condition. This result provides an important sanity check for the use of gradient methods on manifolds as it shows that, almost always, the limit state of a stochastic Riemannian algorithm can only be a local minimizer.

artificial intelligence, machine learning, saddle point, (18 more...)

arXiv.org Artificial Intelligence

2311.02374

Country:

Europe (1.00)
North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

A unified stochastic approximation framework for learning in games

Mertikopoulos, Panayotis, Hsieh, Ya-Ping, Cevher, Volkan

arXiv.org Artificial IntelligenceJul-3-2023

We develop a flexible stochastic approximation framework for analyzing the long-run behavior of learning in games (both continuous and finite). The proposed analysis template incorporates a wide array of popular learning algorithms, including gradient-based methods, the exponential / multiplicative weights algorithm for learning in finite games, optimistic and bandit variants of the above, etc. In addition to providing an integrated view of these algorithms, our framework further allows us to obtain several new convergence results, both asymptotic and in finite time, in both continuous and finite games. Specifically, we provide a range of criteria for identifying classes of Nash equilibria and sets of action profiles that are attracting with high probability, and we also introduce the notion of coherence, a game-theoretic property that includes strict and sharp equilibria, and which leads to convergence in finite time. Importantly, our analysis applies to both oracle-based and bandit, payoff-based methods - that is, when players only observe their realized payoffs.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2206.03922

Country:

North America > United States > New York (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Texas (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
(2 more...)

Add feedback

Aligned Diffusion Schr\"odinger Bridges

Somnath, Vignesh Ram, Pariset, Matteo, Hsieh, Ya-Ping, Martinez, Maria Rodriguez, Krause, Andreas, Bunne, Charlotte

arXiv.org Artificial IntelligenceJun-21-2023

Diffusion Schr\"odinger bridges (DSB) have recently emerged as a powerful framework for recovering stochastic dynamics via their marginal observations at different time points. Despite numerous successful applications, existing algorithms for solving DSBs have so far failed to utilize the structure of aligned data, which naturally arises in many biological phenomena. In this paper, we propose a novel algorithmic framework that, for the first time, solves DSBs while respecting the data alignment. Our approach hinges on a combination of two decades-old ideas: The classical Schr\"odinger bridge theory and Doob's $h$-transform. Compared to prior methods, our approach leads to a simpler training procedure with lower variance, which we further augment with principled regularization schemes. This ultimately leads to sizeable improvements across experiments on synthetic and real data, including the tasks of rigid protein docking and temporal evolution of cellular differentiation processes.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.11419

Country: Europe (0.28)

Genre: Research Report (0.63)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

Unbalanced Diffusion Schr\"odinger Bridge

Pariset, Matteo, Hsieh, Ya-Ping, Bunne, Charlotte, Krause, Andreas, De Bortoli, Valentin

arXiv.org Artificial IntelligenceJun-15-2023

Schr\"odinger bridges (SBs) provide an elegant framework for modeling the temporal evolution of populations in physical, chemical, or biological systems. Such natural processes are commonly subject to changes in population size over time due to the emergence of new species or birth and death events. However, existing neural parameterizations of SBs such as diffusion Schr\"odinger bridges (DSBs) are restricted to settings in which the endpoints of the stochastic process are both probability measures and assume conservation of mass constraints. To address this limitation, we introduce unbalanced DSBs which model the temporal evolution of marginals with arbitrary finite mass. This is achieved by deriving the time reversal of stochastic differential equations with killing and birth terms. We present two novel algorithmic schemes that comprise a scalable objective function for training unbalanced DSBs and provide a theoretical analysis alongside challenging applications on predicting heterogeneous molecular single-cell responses to various cancer drugs and simulating the emergence and spread of new viral variants.

artificial intelligence, diffusion, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2306.09099

Country:

Europe (0.28)
North America > United States (0.27)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

The Schr\"odinger Bridge between Gaussian Measures has a Closed Form

Bunne, Charlotte, Hsieh, Ya-Ping, Cuturi, Marco, Krause, Andreas

arXiv.org Artificial IntelligenceMar-31-2023

The static optimal transport $(\mathrm{OT})$ problem between Gaussians seeks to recover an optimal map, or more generally a coupling, to morph a Gaussian into another. It has been well studied and applied to a wide variety of tasks. Here we focus on the dynamic formulation of OT, also known as the Schr\"odinger bridge (SB) problem, which has recently seen a surge of interest in machine learning due to its connections with diffusion-based generative models. In contrast to the static setting, much less is known about the dynamic setting, even for Gaussian distributions. In this paper, we provide closed-form expressions for SBs between Gaussian measures. In contrast to the static Gaussian OT problem, which can be simply reduced to studying convex programs, our framework for solving SBs requires significantly more involved tools such as Riemannian geometry and generator theory. Notably, we establish that the solutions of SBs between Gaussian measures are themselves Gaussian processes with explicit mean and covariance kernels, and thus are readily amenable for many downstream applications such as generative modeling or interpolation. To demonstrate the utility, we devise a new method for modeling the evolution of single-cell genomics data and report significantly improved numerical stability compared to existing SB-based approaches.

artificial intelligence, dinger bridge, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2202.05722

Country: Europe (0.67)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

A Dynamical System View of Langevin-Based Non-Convex Sampling

Karimi, Mohammad Reza, Hsieh, Ya-Ping, Krause, Andreas

arXiv.org Artificial IntelligenceMar-13-2023

Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, theoretically there remain many important challenges: Existing guarantees (1) typically only hold for the averaged iterates rather than the more desirable last iterates, (2) lack convergence metrics that capture the scales of the variables such as Wasserstein distances, and (3) mainly apply to elementary schemes such as stochastic gradient Langevin dynamics. In this paper, we develop a new framework that lifts the above issues by harnessing several tools from the theory of dynamical systems. Our key result is that, for a large class of state-of-the-art sampling schemes, their last-iterate convergence in Wasserstein distances can be reduced to the study of their continuous-time counterparts, which is much better understood. Coupled with standard assumptions of MCMC sampling, our theory immediately yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators. Beyond existing methods, our framework also motivates more efficient schemes that enjoy the same rigorous guarantees.

artificial intelligence, convergence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2210.13867

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Riemannian stochastic approximation algorithms

Karimi, Mohammad Reza, Hsieh, Ya-Ping, Mertikopoulos, Panayotis, Krause, Andreas

arXiv.org Artificial IntelligenceDec-27-2022

We examine a wide class of stochastic approximation algorithms for solving (stochastic) nonlinear problems on Riemannian manifolds. Such algorithms arise naturally in the study of Riemannian optimization, game theory and optimal transport, but their behavior is much less understood compared to the Euclidean case because of the lack of a global linear structure on the manifold. We overcome this difficulty by introducing a suitable Fermi coordinate frame which allows us to map the asymptotic behavior of the Riemannian Robbins-Monro (RRM) algorithms under study to that of an associated deterministic dynamical system. In so doing, we provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes, despite the significant complications that arise due to the curvature and topology of the underlying manifold. We showcase the flexibility of the proposed framework by applying it to a range of retraction-based variants of the popular optimistic / extra-gradient methods for solving minimization problems and games, and we provide a unified treatment for their convergence.

artificial intelligence, machine learning, manifold, (19 more...)

arXiv.org Artificial Intelligence

2206.06795

Country:

North America > United States (0.28)
Europe > Switzerland (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Conditional gradient methods for stochastically constrained convex minimization

Vladarean, Maria-Luiza, Alacaoglu, Ahmet, Hsieh, Ya-Ping, Cevher, Volkan

arXiv.org Machine LearningJul-7-2020

We propose two novel conditional gradient-based methods for solving structured stochastic convex optimization problems with a large number of linear constraints. Instances of this template naturally arise from SDP-relaxations of combinatorial problems, which involve a number of constraints that is polynomial in the problem dimension. The most important feature of our framework is that only a subset of the constraints is processed at each iteration, thus gaining a computational advantage over prior works that require full passes. Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees. Preliminary numerical experiments are provided for illustrating the practical performance of the methods.

artificial intelligence, conditional gradient method, optimization problem, (16 more...)

arXiv.org Machine Learning

2007.03795

Country:

North America > United States > California (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

The limits of min-max optimization algorithms: convergence to spurious non-critical sets

Hsieh, Ya-Ping, Mertikopoulos, Panayotis, Cevher, Volkan

arXiv.org Machine LearningJun-16-2020

Compared to minimization problems, the min-max landscape in machine learning applications is considerably more convoluted because of the existence of cycles and similar phenomena. Such oscillatory behaviors are well-understood in the convex-concave regime, and many algorithms are known to overcome them. In this paper, we go beyond the convex-concave setting and we characterize the convergence properties of a wide class of zeroth-, first-, and (scalable) second-order methods in non-convex/non-concave problems. In particular, we show that these state-of-the-art min-max optimization algorithms may converge with arbitrarily high probability to attractors that are in no way min-max optimal or even stationary. Spurious convergence phenomena of this type can arise even in two-dimensional problems, a fact which corroborates the empirical evidence surrounding the formidable difficulty of training GANs.

algorithm, artificial intelligence, optimization problem, (16 more...)

arXiv.org Machine Learning

2006.09065

Country:

Europe (0.93)
North America > United States > New York (0.14)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback