partial optimal transport
Supplementary Material for " Partial Optimal Transport with Applications on Positive-Unlabeled Learning '
The proof involves 3 steps: 1. we first justify the definition of p and q in the extended problem formulation, and show that T It is straighforward to see that, by doing so, we ensure that Γ remains an admissible coupling (see Figure 1 for an illustration). Figure 1: Repartition of the mass for matrices Γ and T. Each of them has a total mass of q A 5 (with a constant A > 2ξ) the GW formulation involves pairs of points. This yields the following cases: Case 1: a > 0. In that case, φ(γ) is a convex function, whose minimum on [0, 1] is reached for γ We have φ(0) = c > 0 and φ(1) = a + b + c. The minimum is then obtained for 0 if a + b > 0, and 1 otherwise. This gives the desired result.
Time-Varying Network Driver Estimation (TNDE) Quantifies Stage-Specific Regulatory Effects From Single-Cell Snapshots
Identifying key driver genes governing biological processes such as development and disease progression remains a challenge. While existing methods can reconstruct cellular trajectories or infer static gene regulatory networks (GRNs), they often fail to quantify time-resolved regulatory effects within specific temporal windows. Here, we present Time-varying Network Driver Estimation (TNDE), a computational framework quantifying dynamic gene driver effects from single-cell snapshot data under a linear Markov assumption. TNDE leverages a shared graph attention encoder to preserve the local topological structure of the data. Furthermore, by incorporating partial optimal transport, TNDE accounts for unmatched cells arising from proliferation or apoptosis, thereby enabling trajectory alignment in non-equilibrium processes. Benchmarking on simulated datasets demonstrates that TNDE outperforms existing baseline methods across diverse complex regulatory scenarios. Applied to mouse erythropoiesis data, TNDE identifies stage-specific driver genes, the functional relevance of which is corroborated by biological validation. TNDE offers an effective quantitative tool for dissecting dynamic regulatory mechanisms underlying complex biological processes.
Supplementary Material for " Partial Optimal Transport with Applications on Positive-Unlabeled Learning '
The proof involves 3 steps: 1. A null 5 (with a constant A > 2ξ) the GW formulation involves pairs of points. This yields the following cases: Case 1: a > 0. In that case, φ (γ) is a convex function, whose minimum on [0, 1] is reached for γ Using the development in Section 1.2.2 of the supplemental, we can establish that The partial-OT computation is based on a augmented problem with a dummy point and, as such, is convex. On the contrary, the GW problem is non-convex and, although the algorithm is proved to converge, there is no guarantee that the global optimum is reached. The quality of the solution is therefore highly dependent on the initialization.
Sparse Partial Optimal Transport via Quadratic Regularization
Tran, Khang, Nguyen, Khoa, Nguyen, Anh, Huynh, Thong, Pham, Son, Nguyen-Dang, Sy-Hoang, Pham, Manh, Vo, Bang, Tran, Mai Ngoc, Tran, Mai Ngoc, Luong, Dung
Partial Optimal Transport (POT) has recently emerged as a central tool in various Machine Learning (ML) applications. It lifts the stringent assumption of the conventional Optimal Transport (OT) that input measures are of equal masses, which is often not guaranteed in real-world datasets, and thus offers greater flexibility by permitting transport between unbalanced input measures. Nevertheless, existing major solvers for POT commonly rely on entropic regularization for acceleration and thus return dense transport plans, hindering the adoption of POT in various applications that favor sparsity. In this paper, as an alternative approach to the entropic POT formulation in the literature, we propose a novel formulation of POT with quadratic regularization, hence termed quadratic regularized POT (QPOT), which induces sparsity to the transport plan and consequently facilitates the adoption of POT in many applications with sparsity requirements. Extensive experiments on synthetic and CIFAR-10 datasets, as well as real-world applications such as color transfer and domain adaptations, consistently demonstrate the improved sparsity and favorable performance of our proposed QPOT formulation.
On Partial Optimal Transport: Revising the Infeasibility of Sinkhorn and Efficient Gradient Methods
Nguyen, Anh Duc, Nguyen, Tuan Dung, Nguyen, Quang Minh, Nguyen, Hoang H., Nguyen, Lam M., Toh, Kim-Chuan
This paper studies the Partial Optimal Transport (POT) problem between two unbalanced measures with at most $n$ supports and its applications in various AI tasks such as color transfer or domain adaptation. There is hence the need for fast approximations of POT with increasingly large problem sizes in arising applications. We first theoretically and experimentally investigate the infeasibility of the state-of-the-art Sinkhorn algorithm for POT due to its incompatible rounding procedure, which consequently degrades its qualitative performance in real world applications like point-cloud registration. To this end, we propose a novel rounding algorithm for POT, and then provide a feasible Sinkhorn procedure with a revised computation complexity of $\mathcal{\widetilde O}(n^2/\varepsilon^4)$. Our rounding algorithm also permits the development of two first-order methods to approximate the POT problem. The first algorithm, Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD), finds an $\varepsilon$-approximate solution to the POT problem in $\mathcal{\widetilde O}(n^{2.5}/\varepsilon)$, which is better in $\varepsilon$ than revised Sinkhorn. The second method, Dual Extrapolation, achieves the computation complexity of $\mathcal{\widetilde O}(n^2/\varepsilon)$, thereby being the best in the literature. We further demonstrate the flexibility of POT compared to standard OT as well as the practicality of our algorithms on real applications where two marginal distributions are unbalanced.
An Efficient Mini-batch Method via Partial Transportation
Nguyen, Khai, Nguyen, Dang, Pham, Tung, Ho, Nhat
Mini-batch optimal transport (m-OT) has been widely used recently to deal with the memory issue of OT in large-scale applications. Despite their practicality, m-OT suffers from misspecified mappings, namely, mappings that are optimal on the mini-batch level but do not exist in the optimal transportation plan between the original measures. To address the misspecified mappings issue, we propose a novel mini-batch method by using partial optimal transport (POT) between mini-batch empirical measures, which we refer to as mini-batch partial optimal transport (m-POT). Leveraging the insight from the partial transportation, we explain the source of misspecified mappings from the m-OT and motivate why limiting the amount of transported masses among mini-batches via POT can alleviate the incorrect mappings. Finally, we carry out extensive experiments on various applications to compare m-POT with m-OT and recently proposed mini-batch method, mini-batch unbalanced optimal transport (m-UOT). We observe that m-POT is better than m-OT deep domain adaptation applications while having comparable performance with m-UOT. On other applications, such as deep generative model, gradient flow, and color transfer, m-POT yields more favorable performance than both m-OT and m-UOT.
On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity
Le, Khang, Nguyen, Huy, Pham, Tung, Ho, Nhat
The recent advances in computation of optimal transport (OT) [34, 11, 24, 12, 1] has brought new applications of optimal transport in machine learning and data science to the fore. Examples of these applications include generative models [2, 33, 16, 10], unsupervised learning [17, 18], computer vision [32, 25], and other applications [29, 27, 6]. However, due to the marginal constraints of transportation plans, optimal transport is only defined between balanced measures, namely, measures with equal masses. When measures are unbalanced, i.e., they can have different masses, there are two popular approaches for defining divergences between these measures. The first approach is unbalanced optimal transport [9, 8]. The main idea of unbalanced optimal transport is to regularize the objective function of optimal transport based on certain divergences between marginal constraints of transportation plan and the masses of measures. Despite its favorable computational complexity [28] and practical applications [31, 14, 19, 3], the optimal transportation plan from unbalanced optimal transport is often non-trivial to interpret in practice. The second approach for defining divergence between unbalanced measures is partial optimal transport (POT) [5, 13], which was originally used to analyze partial differential equations.