Fused Partial Gromov-Wasserstein for Structured Objects
Bai, Yikun, Tran, Huy, Du, Hengrong, Liu, Xinran, Kolouri, Soheil
Structured data, such as graphs, are vital in machine learning due to their capacity to capture complex relationships and interactions. In recent years, the Fused Gromov-Wasserstein (FGW) distance has attracted growing interest because it enables the comparison of structured data by jointly accounting for feature similarity and geometric structure. However, as a variant of optimal transport (OT), classical FGW imposes an equal-mass constraint on the compared data. In this work, we relax this mass constraint and propose the Fused Partial Gromov-Wasserstein (FPGW) framework, which extends FGW to accommodate unbalanced data. Theoretically, we establish the relationship between FPGW and FGW and prove the metric properties of FPGW. Numerically, we introduce Frank-Wolfe solvers for the proposed FPGW framework and provide a convergence analysis. Finally, we evaluate the FPGW distance through graph classification and clustering experiments, demonstrating its robust performance, especially when the data are corrupted by outlier noise.
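As a rough illustration of the solver family this abstract refers to, the sketch below runs Frank-Wolfe on the balanced Gromov-Wasserstein problem with squared loss, using POT's `ot.emd` as the linear-minimization oracle. It is a minimal sketch under simplifying assumptions (symmetric cost matrices, balanced marginals); the paper's FPGW solver additionally handles the fused feature cost and the relaxed (partial) marginals.

```python
import numpy as np
import ot  # Python Optimal Transport (POT); ot.emd is its exact OT solver

def gw_frank_wolfe(C1, C2, a, b, n_iter=50):
    # Frank-Wolfe for balanced GW with squared loss; assumes C1, C2 are
    # symmetric cost matrices. FPGW's fused term and partial marginals
    # are omitted in this sketch.
    n, m = len(a), len(b)
    T = np.outer(a, b)  # feasible starting coupling
    # constant part of the gradient (Peyre & Cuturi decomposition)
    constC = np.outer((C1**2) @ a, np.ones(m)) + np.outer(np.ones(n), (C2**2) @ b)
    for _ in range(n_iter):
        grad = 2.0 * (constC - 2.0 * C1 @ T @ C2.T)  # gradient of quadratic objective
        T_lp = ot.emd(a, b, grad)                    # linear-minimization oracle (OT LP)
        D = T_lp - T                                 # search direction (zero marginals)
        # exact line search: objective along T + t*D is quadratic in t
        qc = -2.0 * np.sum((C1 @ D @ C2.T) * D)      # quadratic coefficient
        lc = np.sum(grad * D)                        # linear coefficient (<= 0)
        if qc > 0:
            t = min(1.0, max(0.0, -lc / (2.0 * qc)))
        else:
            t = 1.0 if qc + lc < 0 else 0.0
        T = T + t * D
    return T
```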
Understanding Learning with Sliced-Wasserstein Requires Rethinking Informative Slices
Tran, Huy, Bai, Yikun, Shahbazi, Ashkan, Hershey, John R., Kolouri, Soheil
The practical applications of Wasserstein distances (WDs) are constrained by their sample and computational complexities. Sliced-Wasserstein distances (SWDs) provide a workaround by projecting distributions onto one-dimensional subspaces, leveraging the more efficient, closed-form WDs for one-dimensional distributions. However, in high dimensions, most random projections become uninformative due to the concentration of measure phenomenon. Although several SWD variants have been proposed to focus on \textit{informative} slices, they often introduce additional complexity and numerical instability, and compromise desirable theoretical (metric) properties of SWD. Amidst the growing literature that focuses on directly modifying the slicing distribution, an approach that often faces challenges of its own, we revisit the classical Sliced-Wasserstein distance and propose instead to rescale the 1D Wasserstein distances to make all slices equally informative. Importantly, we show that under an appropriate data assumption and notion of \textit{slice informativeness}, rescaling all individual slices simplifies to \textbf{a single global scaling factor} on the SWD. This, in turn, translates to the standard learning-rate search for gradient-based learning in common machine learning workflows. We perform extensive experiments across various machine learning tasks showing that the classical SWD, when properly configured, can often match or surpass the performance of more complex variants. We then answer the question: "Is Sliced-Wasserstein all you need for common learning tasks?"
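To make the "single global scaling factor" concrete, here is a minimal Monte Carlo sliced 2-Wasserstein sketch for two equal-size point clouds; the `scale` parameter stands in for the paper's global rescaling (under its assumptions it acts like a learning-rate multiplier, not a per-slice weight).

```python
import numpy as np

def sliced_w2(X, Y, n_proj=128, scale=1.0, rng=None):
    # Monte Carlo sliced 2-Wasserstein between equal-size point clouds
    # X, Y of shape (n, d); `scale` is a single global factor applied
    # uniformly to every slice.
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # uniform directions
    xp = np.sort(X @ theta.T, axis=0)   # sorted 1D projections per slice
    yp = np.sort(Y @ theta.T, axis=0)
    sw2 = np.mean((xp - yp) ** 2)       # closed-form 1D W2^2, averaged over slices
    return scale * np.sqrt(sw2)
```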
Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data
Liu, Xinran, Bai, Yikun, Martín, Rocío Díaz, Shi, Kaiwen, Shahbazi, Ashkan, Landman, Bennett A., Chang, Catie, Kolouri, Soheil
Efficient comparison of spherical probability distributions has become important in fields such as computer vision, geosciences, and medicine. Sliced optimal transport distances, such as the spherical and stereographic spherical sliced Wasserstein distances, have recently been developed to address this need. These methods reduce the computational burden of optimal transport by slicing hyperspheres into one-dimensional projections, i.e., lines or circles. Concurrently, linear optimal transport has been proposed to embed distributions into \( L^2 \) spaces, where the \( L^2 \) distance approximates the optimal transport distance, thereby simplifying comparisons across multiple distributions. In this work, we introduce the Linear Spherical Sliced Optimal Transport (LSSOT) framework, which utilizes slicing to embed spherical distributions into \( L^2 \) spaces while preserving their intrinsic geometry, offering a computationally efficient metric for spherical probability measures. We establish the metricity of LSSOT and demonstrate its superior computational efficiency in applications such as cortical surface registration, 3D point cloud interpolation via gradient flow, and shape embedding. Our results confirm the significant computational benefits and high accuracy of LSSOT in these applications.
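The "embed once, compare in \( L^2 \)" idea can be illustrated on Euclidean data: for equal-size uniform point clouds, the matrix of sorted projections along shared directions is an embedding whose pairwise \( L^2 \) distances equal sliced 2-Wasserstein distances. This is only the linear-slice analogue of LSSOT's construction (which slices the sphere into circles), a hedged simplification.

```python
import numpy as np

def sliced_embedding(X, thetas):
    # Embed an (n, d) point cloud as its sorted 1D projections along
    # shared unit directions `thetas` of shape (L, d). For equal-size
    # uniform clouds, ||emb(X) - emb(Y)||_F equals their sliced W2, so
    # K clouds need K embeddings instead of K^2 pairwise OT solves.
    n = X.shape[0]
    return np.sort(X @ thetas.T, axis=0) / np.sqrt(n * thetas.shape[0])

# usage: build `thetas` once, embed every cloud with the same thetas,
# then compare embeddings with plain Euclidean distances.
```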
Linear Partial Gromov-Wasserstein Embedding
Bai, Yikun, Kothapalli, Abihith, Du, Hengrong, Martin, Rocio Diaz, Kolouri, Soheil
The Gromov-Wasserstein (GW) problem, a variant of the classical optimal transport (OT) problem, has attracted growing interest in the machine learning and data science communities due to its ability to quantify similarity between measures in different metric spaces. However, like the classical OT problem, GW imposes an equal mass constraint between measures, which restricts its application in many machine learning tasks. To address this limitation, the partial Gromov-Wasserstein (PGW) problem has been introduced, which relaxes the equal mass constraint, enabling the comparison of general positive Radon measures. Despite this, both GW and PGW face significant computational challenges due to their non-convex nature. To overcome these challenges, we propose the linear partial Gromov-Wasserstein (LPGW) embedding, a linearized embedding technique for the PGW problem. For $K$ different metric measure spaces, the pairwise computation of the PGW distance requires solving the PGW problem $\mathcal{O}(K^2)$ times. In contrast, the proposed linearization technique reduces this to $\mathcal{O}(K)$ times. Similar to the linearization technique for the classical OT problem, we prove that LPGW defines a valid metric for metric measure spaces. Finally, we demonstrate the effectiveness of LPGW in practical applications such as shape retrieval and learning with transport-based embeddings, showing that LPGW preserves the advantages of PGW in partial matching while significantly enhancing computational efficiency.
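The $\mathcal{O}(K^2) \to \mathcal{O}(K)$ pattern can be sketched schematically. In the snippet below, `lpgw_embed` is a hypothetical placeholder for the paper's embedding step (one PGW solve against a fixed reference space), not a real library call; only the computational pattern is illustrated.

```python
import numpy as np
from itertools import combinations

def pairwise_lpgw(spaces, reference, lpgw_embed):
    # K embedding computations (O(K) PGW solves against `reference`),
    # followed by cheap pairwise L2 comparisons, instead of O(K^2)
    # direct PGW solves. Embeddings share a common reference, hence a
    # common shape. `lpgw_embed` is a hypothetical placeholder.
    embeddings = [lpgw_embed(S, reference) for S in spaces]
    K = len(spaces)
    D = np.zeros((K, K))
    for i, j in combinations(range(K), 2):
        D[i, j] = D[j, i] = np.linalg.norm(embeddings[i] - embeddings[j])
    return D
```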
Expected Sliced Transport Plans
Liu, Xinran, Martín, Rocío Díaz, Bai, Yikun, Shahbazi, Ashkan, Thorpe, Matthew, Aldroubi, Akram, Kolouri, Soheil
The optimal transport (OT) problem has gained significant traction in modern machine learning for its ability to: (1) provide versatile metrics, such as Wasserstein distances and their variants, and (2) determine optimal couplings between probability measures. To reduce the computational complexity of OT solvers, methods like entropic regularization and sliced optimal transport have been proposed. The sliced OT framework improves efficiency by comparing one-dimensional projections (slices) of high-dimensional distributions. However, despite their computational efficiency, sliced-Wasserstein approaches lack a transportation plan between the input measures, limiting their use in scenarios requiring explicit coupling. In this paper, we address two key questions: Can a transportation plan be constructed between two probability measures using the sliced transport framework? If so, can this plan be used to define a metric between the measures? We propose a "lifting" operation to extend one-dimensional optimal transport plans back to the original space of the measures. By computing the expectation of these lifted plans, we derive a new transportation plan, termed expected sliced transport (EST) plans. We prove that using the EST plan to weight the sum of the individual Euclidean costs for moving from one point to another results in a valid metric between the input discrete probability measures. We demonstrate the connection between our approach and the recently proposed min-SWGG and provide illustrative numerical examples that support our theoretical findings.
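For intuition, here is a minimal sketch of the lift-and-average construction in the special case of two uniform discrete measures with the same number of points: each slice's 1D optimal plan is the sorting permutation, its lift is a permutation matrix, and averaging over slices yields a doubly stochastic plan. The paper's construction covers general discrete measures; this simplification is mine.

```python
import numpy as np

def expected_sliced_plan(X, Y, n_proj=256, rng=None):
    # EST plan sketch for two uniform discrete measures X, Y of shape
    # (n, d). Per slice, the 1D OT plan matches the k-th smallest
    # projection of X to the k-th smallest of Y; the lifted plans are
    # accumulated (averaged) into a coupling of the original measures.
    rng = np.random.default_rng(rng)
    n, d = X.shape
    P = np.zeros((n, n))
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)
        sx, sy = np.argsort(X @ theta), np.argsort(Y @ theta)
        P[sx, sy] += 1.0 / (n * n_proj)   # lifted 1D plan, accumulated
    return P

# EST-style cost: plan-weighted sum of Euclidean ground costs
# cost = np.sum(P * np.linalg.norm(X[:, None] - Y[None, :], axis=-1))
```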
Sinkhorn algorithms and linear programming solvers for optimal partial transport problems
Bai, Yikun
In this note, we generalize the classical optimal partial transport (OPT) problem by modifying the mass destruction/creation term to function-based terms, introducing what we term ``generalized optimal partial transport'' problems. We then discuss the dual formulation of these problems and the associated Sinkhorn solver. Finally, we explore how these new OPT problems relate to classical optimal transport (OT) problems and introduce a linear programming solver tailored for these generalized scenarios.
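As one concrete instance of a Sinkhorn iteration with relaxed marginals, the sketch below shows the well-known KL-penalized (unbalanced) entropic scheme of Chizat et al.; the note's generalized OPT problems replace the linear mass destruction/creation term with more general function-based penalties, which would change these updates accordingly.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps=0.05, rho=1.0, n_iter=500):
    # Sinkhorn iterations for entropic unbalanced OT with KL marginal
    # penalties of strength rho: the hard marginal constraints of
    # balanced OT are softened, so the returned plan's total mass may
    # differ from 1. A stand-in for the note's generalized solver.
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    tau = rho / (rho + eps)          # softened scaling exponent
    for _ in range(n_iter):
        u = (a / (K @ v)) ** tau
        v = (b / (K.T @ u)) ** tau
    return u[:, None] * K * v[None, :]   # transport plan
```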
Efficient Solvers for Partial Gromov-Wasserstein
Bai, Yikun, Martin, Rocio Diaz, Du, Hengrong, Shahbazi, Ashkan, Kolouri, Soheil
In this paper, we demonstrate that the partial Gromov-Wasserstein (PGW) problem can be transformed into a variant of the Gromov-Wasserstein (GW) problem, akin to the conversion of the partial optimal transport problem into an optimal transport problem. This transformation leads to two new solvers, mathematically and computationally equivalent, based on the Frank-Wolfe algorithm, that provide efficient solutions to the PGW problem. We further establish that the PGW problem constitutes a metric for metric measure spaces. Finally, we validate the effectiveness of our proposed solvers in terms of computation time and performance on shape-matching and positive-unlabeled learning problems, comparing them against existing baselines.
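The classical linear-case conversion the abstract alludes to can be sketched directly: append a dummy point on each side to absorb destroyed/created mass, then solve a balanced OT problem. The sketch assumes POT's `ot.emd`; the paper's contribution is the analogous transformation for the quadratic PGW objective, which is not shown here.

```python
import numpy as np
import ot  # POT; ot.emd is its exact linear-programming OT solver

def partial_ot_via_padding(a, b, C, lam):
    # Partial OT as a padded balanced OT problem: transporting a unit
    # of real mass to a dummy point costs the penalty `lam` (one lam
    # per unit destroyed on the source side and created on the target
    # side), and dummy-to-dummy transport is free.
    n, m = len(a), len(b)
    C_pad = np.full((n + 1, m + 1), float(lam))
    C_pad[:n, :m] = C
    C_pad[n, m] = 0.0
    a_pad = np.append(a, b.sum())    # dummy masses balance the problem
    b_pad = np.append(b, a.sum())
    T = ot.emd(a_pad, b_pad, C_pad)
    return T[:n, :m]                 # partial plan between original points
```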
Stereographic Spherical Sliced Wasserstein Distances
Tran, Huy, Bai, Yikun, Kothapalli, Abihith, Shahbazi, Ashkan, Liu, Xinran, Martin, Rocio Diaz, Kolouri, Soheil
Applications involving distributions defined on a hypersphere are remarkably diverse, highlighting the importance of spherical geometries across various disciplines. These applications include: 1) mapping the distribution of geographic or geological features on celestial bodies, such as stars and planets [39, 8, 60], 2) magnetoencephalography (MEG) imaging [75] in medical domains, 3) spherical image representations and 360° images [13, 38], such as omnidirectional images in computer vision [40], 4) texture mapping in computer graphics [24, 21], and, more recently, 5) deep representation learning, where the latent representation is often mapped to a bounded space, commonly a sphere, on which cosine similarity is utilized for effective representation learning [11, 76]. The analysis of distributions on hyperspheres is traditionally approached through directional statistics, also referred to as circular/spherical statistics [37, 52, 50, 61], a specialized field dedicated to the statistical analysis of directions, orientations, and rotations. More recently, with the growing application of optimal transport theory [74, 62] in machine learning, due in part to its favorable statistical, geometrical, and topological properties, there has been increasing interest in using optimal transport to compare spherical probability measures [14, 32]. One of the main bottlenecks of optimal transport, however, is its high computational cost, generally of cubic complexity.
LCOT: Linear circular optimal transport
Martin, Rocio Diaz, Medri, Ivan, Bai, Yikun, Liu, Xinran, Yan, Kangbai, Rohde, Gustavo K., Kolouri, Soheil
The optimal transport problem for measures supported on non-Euclidean spaces has recently gained considerable interest in diverse applications involving representation learning. In this paper, we focus on circular probability measures, i.e., probability measures supported on the unit circle, and introduce a new computationally efficient metric for these measures, denoted as Linear Circular Optimal Transport (LCOT). The proposed metric comes with an explicit linear embedding that allows one to apply Machine Learning (ML) algorithms to the embedded measures and seamlessly switch the underlying metric used by the ML algorithm to LCOT. We show that the proposed metric is rooted in the Circular Optimal Transport (COT) and can be considered the linearization of the COT metric with respect to a fixed reference measure. We provide a theoretical analysis of the proposed metric and derive the computational complexities for pairwise comparison of circular probability measures. Lastly, through a set of numerical experiments, we demonstrate the benefits of LCOT in learning representations of circular measures.
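The linearization idea is easiest to see on the real line, where the linear OT embedding against a uniform reference is the quantile function minus the identity, and \( L^2 \) distances between embeddings recover \( W_2 \). The sketch below shows this line analogue only; LCOT adapts the construction to the circle, where an optimal "cut" (rotation) must also be handled.

```python
import numpy as np

def lot_embed_1d(samples, n_grid=128):
    # 1D linear OT embedding against a uniform reference on [0, 1]:
    # evaluate the empirical quantile function on a uniform grid and
    # subtract the identity (the Monge map from the reference, minus id).
    t = (np.arange(n_grid) + 0.5) / n_grid        # reference quantile levels
    emb = np.quantile(samples, t) - t             # Monge map minus identity
    return emb / np.sqrt(n_grid)                  # so Euclidean norm ~ L2 norm

# np.linalg.norm(lot_embed_1d(x) - lot_embed_1d(y)) approximates the
# 2-Wasserstein distance between the two 1D empirical measures.
```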
Partial Transport for Point-Cloud Registration
Bai, Yikun, Tran, Huy, Damelin, Steven B., Kolouri, Soheil
Point cloud registration plays a crucial role in various fields, including robotics, computer graphics, and medical imaging. This process involves determining spatial relationships between different sets of points, typically within a 3D space. In real-world scenarios, complexities arise from non-rigid movements and partial visibility, such as occlusions or sensor noise, making non-rigid registration a challenging problem. Classic non-rigid registration methods are often computationally demanding, suffer from unstable performance, and, importantly, have limited theoretical guarantees. The optimal transport problem and its unbalanced variations (e.g., the optimal partial transport problem) have emerged as powerful tools for point-cloud registration, establishing a strong benchmark in this field. These methods view point clouds as empirical measures and provide a mathematically rigorous way to quantify the `correspondence' between (the transformed) source and target points. In this paper, we approach the point-cloud registration problem through the lens of optimal transport theory and first propose a comprehensive set of non-rigid registration methods based on the optimal partial transport problem. Subsequently, leveraging the emerging work on efficient solutions to the one-dimensional optimal partial transport problem, we extend our proposed algorithms via slicing to gain significant computational efficiency, resulting in fast and robust non-rigid registration algorithms. We demonstrate the effectiveness of our proposed methods and compare them against baselines on various 3D and 2D non-rigid registration problems where the source and target point clouds are corrupted by random noise.
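A minimal sketch of the registration template this line of work builds on is an ICP-style loop driven by a partial OT coupling: couple, form barycentric targets, fit a transform, repeat. The sketch assumes POT's `ot.dist` and `ot.partial.partial_wasserstein`, and fits only a rigid transform with the quadratic-cost solver; the paper's methods are non-rigid and use fast sliced 1D partial OT instead.

```python
import numpy as np
import ot
import ot.partial  # POT's partial OT solvers (assumed API)

def partial_ot_icp(X, Y, m=0.8, n_iter=20):
    # Rigid registration of source X (n, d) onto target Y via partial OT:
    # only a fraction m of the mass is matched, so outliers and occluded
    # regions can be left uncoupled.
    n, d = X.shape
    a = np.full(n, 1.0 / n)
    b = np.full(Y.shape[0], 1.0 / Y.shape[0])
    R, t = np.eye(d), np.zeros(d)
    for _ in range(n_iter):
        Xt = X @ R.T + t
        M = ot.dist(Xt, Y)                                 # squared Euclidean costs
        T = ot.partial.partial_wasserstein(a, b, M, m=m)   # coupling of mass m
        w = T.sum(axis=1)                                  # matched mass per source point
        Yv = (T @ Y) / np.maximum(w, 1e-12)[:, None]       # barycentric targets
        # weighted Kabsch: best rigid transform for the matched points
        mx, my = (w @ X) / w.sum(), (w @ Yv) / w.sum()
        H = (X - mx).T @ ((Yv - my) * w[:, None])
        U, _, Vt = np.linalg.svd(H)
        S = np.eye(d)
        S[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))     # avoid reflections
        R = Vt.T @ S @ U.T
        t = my - R @ mx
    return R, t
```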