The Ground Cost for Optimal Transport of Angular Velocity
Elamvazhuthi, Karthik, Halder, Abhishek
We revisit the optimal transport problem over angular velocity dynamics given by the controlled Euler equation. The solution of this problem enables stochastic guidance of the spin states of a rigid body (e.g., a spacecraft) under a hard deadline constraint by transferring given initial state statistics to desired terminal state statistics. This is an instance of generalized optimal transport over a nonlinear dynamical system. While prior work has reported existence-uniqueness results and numerical solutions for this dynamical optimal transport problem, here we present structural results about the equivalent Kantorovich, a.k.a. optimal coupling, formulation. Specifically, we focus on deriving the ground cost for the associated Kantorovich optimal coupling formulation. The ground cost equals the cost of transporting a unit amount of mass from a specific realization of the initial, or source, joint probability measure to a realization of the terminal, or target, joint probability measure, and it determines the Kantorovich formulation. Finding the ground cost leads to solving a structured deterministic nonlinear optimal control problem, which is shown to be amenable to an analysis technique pioneered by Athans et al. We show that such techniques have broader applicability in determining the ground cost (and thus the Kantorovich formulation) for a class of generalized optimal mass transport problems involving nonlinear dynamics with translated norm-invariant drift.
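To illustrate how a ground cost determines the Kantorovich formulation, here is a minimal discrete sketch: given a ground cost matrix and two marginals, the optimal coupling is the solution of a linear program. The quadratic cost and the toy supports below are illustrative stand-ins, not the angular-velocity ground cost derived in the paper.

```python
import numpy as np
from scipy.optimize import linprog

def kantorovich_plan(C, a, b):
    """Solve the discrete Kantorovich problem min_P <C, P>
    subject to P @ 1 = a, P.T @ 1 = b, P >= 0, as a linear program."""
    n, m = C.shape
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0   # row sums equal a
    for j in range(m):
        A_eq[n + j, j::m] = 1.0            # column sums equal b
    b_eq = np.concatenate([a, b])
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.x.reshape(n, m), res.fun

# Toy example: quadratic ground cost between two 3-point supports.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5, 2.5])
C = (x[:, None] - y[None, :]) ** 2
a = np.full(3, 1 / 3)
b = np.full(3, 1 / 3)
P, cost = kantorovich_plan(C, a, b)   # monotone matching is optimal here
```

For a convex cost on the line, the optimal plan pairs the points in order, so the transport cost is 3 × (1/3) × 0.25 = 0.25.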
Review for NeurIPS paper: Geometric Dataset Distances via Optimal Transport
Additional Feedback: POST REBUTTAL: After reading the authors' response, I increased my score by 1. I believe the general idea of using conditional distributions to compare datasets with no prior training or modeling assumptions is interesting and could lead to potentially interesting future research. Here is why I still think this is not a clear accept, and I hope these remarks will be addressed in the final version: 1) The experiments conducted in the paper were very clear and well illustrated. I expect that the naive methods (i), (ii), (iii) discussed in the rebuttal will be included as quantitative comparisons in transfer learning and the other applications, and not just as a comparison of OTDD values across methods (fig 1 of the rebuttal), which is not informative; the order of magnitude does not tell anything about the discriminative power of a distance. Could it be explained by the fact that the dimension of MNIST is large, making Bures too costly to compute? Would you agree that Sinkhorn is better than OT-N for large d, and the reverse otherwise? My main concern is that while these results are promising, no baseline was provided to quantify the performance gain of OTDD.
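The computational trade-off the review raises (entropy-regularized Sinkhorn versus exact OT as problem size grows) can be illustrated with a minimal Sinkhorn sketch. The sample data, dimension, and regularization level below are arbitrary stand-ins, not the paper's experimental setup.

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.1, n_iter=2000):
    """Entropy-regularized OT via alternating (Sinkhorn) scaling iterations.
    Returns the regularized transport plan."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
d = 5                                    # sample dimension
X = rng.normal(size=(20, d))             # source samples
Y = rng.normal(size=(20, d))             # target samples
C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
C /= C.max()                             # normalize the cost for stability
a = np.full(20, 1 / 20)
b = np.full(20, 1 / 20)
P = sinkhorn(C, a, b)
reg_cost = (P * C).sum()                 # entropic approximation of the OT cost
```

Each Sinkhorn iteration is two matrix-vector products, which is why it scales better than an exact LP solve when the supports are large; the price is the entropic bias controlled by `eps`.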
Path Structured Multimarginal Schrödinger Bridge for Probabilistic Learning of Hardware Resource Usage by Control Software
Bondar, Georgiy A., Gifford, Robert, Phan, Linh Thi Xuan, Halder, Abhishek
The solution of the path structured multimarginal Schrödinger bridge problem (MSBP) is the most likely measure-valued trajectory consistent with a sequence of observed probability measures or distributional snapshots. We leverage recent algorithmic advances in solving such structured MSBPs for learning stochastic hardware resource usage by control software. The solution enables predicting the time-varying distribution of hardware resource availability at a desired time with guaranteed linear convergence. We demonstrate the efficacy of our probabilistic learning approach in a model predictive control software execution case study. The method exhibits rapid convergence to an accurate prediction of the hardware resource utilization of the controller. The method can be broadly applied to any software to predict cyber-physical context-dependent performance at an arbitrary time.
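A path-structured MSBP over a few snapshots can be solved with Sinkhorn-type scaling iterations that exploit the chain structure of the coupling. The sketch below is a toy three-marginal version with hypothetical 1-D "resource usage" histograms and a generic Gaussian prior kernel; it is an assumption-laden illustration of the iteration scheme, not the paper's algorithm or data.

```python
import numpy as np

def path_msbp(K01, K12, a0, a1, a2, n_iter=2000):
    """Sinkhorn-type scaling for a 3-marginal path-structured bridge.
    Mass tensor: P[i,j,k] = u0[i] * K01[i,j] * u1[j] * K12[j,k] * u2[k].
    Each update matches one marginal exactly; convergence is linear."""
    u0, u1, u2 = np.ones_like(a0), np.ones_like(a1), np.ones_like(a2)
    for _ in range(n_iter):
        u0 = a0 / (K01 @ (u1 * (K12 @ u2)))
        u1 = a1 / ((K01.T @ u0) * (K12 @ u2))
        u2 = a2 / (K12.T @ (u1 * (K01.T @ u0)))
    return (u0[:, None, None] * K01[:, :, None] * u1[None, :, None]
            * K12[None, :, :] * u2[None, None, :])

# Toy snapshots of a 1-D histogram at three observation times.
x = np.linspace(0.0, 1.0, 8)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)    # heat-kernel-like prior
a0 = np.ones(8) / 8                                   # uniform at t0
a1 = np.exp(-(x - 0.3) ** 2 / 0.02); a1 /= a1.sum()   # concentrated at t1
a2 = np.exp(-(x - 0.7) ** 2 / 0.02); a2 /= a2.sum()   # shifted at t2
P = path_msbp(K, K, a0, a1, a2)
```

The key structural point is that the chain (path) graph lets each scaling update reuse the partial products `K01.T @ u0` and `K12 @ u2`, so the cost per iteration stays quadratic in the grid size rather than growing with the number of marginals multiplied together.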
Exploring Wasserstein Distance across Concept Embeddings for Ontology Matching
An, Yuan, Kalinowski, Alex, Greenberg, Jane
Measuring the distance between ontological elements is fundamental for ontology matching. String-based distance metrics are notorious for shallow syntactic matching. In this exploratory study, we investigate the Wasserstein distance over a continuous embedding space that can incorporate various types of information. We use a pre-trained word embedding model to embed ontology element labels. We examine the effectiveness of the Wasserstein distance for measuring similarity between ontologies, and for discovering and refining matchings between individual elements. Our experiments on the OAEI conference track and MSE benchmarks achieved competitive results compared to the leading systems.
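A minimal sketch of this idea: embed the element labels of two ontologies, build a ground cost from embedding distances, and match elements by solving the resulting transport problem. The label names and two-dimensional "embeddings" below are made-up stand-ins for real pre-trained word vectors; with uniform marginals of equal size, the optimal plan is a permutation, so the problem reduces to an assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical 2-D embeddings standing in for pre-trained word vectors.
onto_A = {"Author": [0.9, 0.1], "Paper": [0.1, 0.9], "Venue": [0.5, 0.5]}
onto_B = {"Writer": [0.85, 0.15], "Article": [0.15, 0.85],
          "Conference": [0.55, 0.45]}

XA = np.array(list(onto_A.values()))
XB = np.array(list(onto_B.values()))
C = np.linalg.norm(XA[:, None, :] - XB[None, :, :], axis=-1)  # ground cost

# Uniform, equal-size marginals: the Wasserstein plan is a permutation.
rows, cols = linear_sum_assignment(C)
W1 = C[rows, cols].mean()                     # 1-Wasserstein distance
matches = {list(onto_A)[i]: list(onto_B)[j] for i, j in zip(rows, cols)}
```

The distance `W1` summarizes how far apart the two ontologies are overall, while `matches` gives the element-level correspondence recovered from the same transport plan.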
On making optimal transport robust to all outliers
Optimal transport (OT) is known to be sensitive to outliers because of its marginal constraints. Outlier-robust OT variants have been proposed based on the definition that outliers are samples which are expensive to move. In this paper, we show that this definition is too restrictive by considering the case where outliers are closer to the target measure than clean samples. We show that outlier-robust OT fully transports these outliers, leading to poor performance in practice. To tackle these outliers, we propose to detect them by relying on a classifier trained with adversarial training to classify source and target samples. A sample is then considered an outlier if the classifier's prediction differs from its assigned label. To decrease the influence of these outliers in the transport problem, we propose to either remove them from the problem or to increase the cost of moving them by using the classifier prediction. We show, in several experiments such as gradient flows, generative models, and label propagation, that we successfully detect these outliers and that they do not influence the transport problem.
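The detection step can be sketched as follows. Here a simple nearest-centroid rule stands in for the adversarially trained source-vs-target classifier described in the abstract, and the Gaussian clusters are made-up data; the point is only the mechanism of flagging label/prediction disagreements before solving OT.

```python
import numpy as np

rng = np.random.default_rng(1)
source = rng.normal(0.0, 0.2, size=(30, 2))     # clean source samples
target = rng.normal(3.0, 0.2, size=(30, 2))     # target samples
outliers = rng.normal(3.0, 0.2, size=(5, 2))    # source outliers near the target

X = np.vstack([source, outliers])               # contaminated source, label "source"

# Stand-in for the trained source-vs-target classifier:
# a nearest-centroid rule fit on the labeled (contaminated) samples.
c_src, c_tgt = X.mean(axis=0), target.mean(axis=0)
pred_target = (np.linalg.norm(X - c_tgt, axis=1)
               < np.linalg.norm(X - c_src, axis=1))

# A sample labeled "source" but predicted "target" disagrees with its
# assigned label, so it is flagged as an outlier and either dropped or
# given an inflated transport cost before solving OT.
clean_source = X[~pred_target]
```

These outliers are cheap to move (they already sit near the target), which is exactly why cost-based robust OT variants miss them and a classifier-based test is used instead.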
Pushing the right boundaries matters! Wasserstein Adversarial Training for Label Noise
Damodaran, Bharath Bhushan, Fatras, Kilian, Lobry, Sylvain, Flamary, Rémi, Tuia, Devis, Courty, Nicolas
Noisy labels often occur in vision datasets, especially when they are issued from crowdsourcing or Web scraping. In this paper, we propose a new regularization method which enables one to learn robust classifiers in the presence of noisy data. To achieve this goal, we augment the virtual adversarial loss with a Wasserstein distance. This distance allows us to take into account specific relations between classes by leveraging the geometric properties of this optimal transport distance. Notably, we encode the class similarities in the ground cost that is used to compute the Wasserstein distance. As a consequence, we can promote smoothness between classes that are very dissimilar, while keeping the classification decision function sufficiently complex for similar classes. While designing this ground cost can be left as a problem-specific modeling task, we show in this paper that using the semantic relations between class names already leads to good results. Our proposed Wasserstein Adversarial Training (WAT) outperforms the state of the art on four datasets corrupted with noisy labels: three classical benchmarks and one real case in remote sensing image semantic segmentation.
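Encoding class similarities in the ground cost can be sketched in a few lines. The class-name "embeddings" below are made-up stand-ins for real word vectors; for one-hot label distributions, the Wasserstein distance between labels `i` and `j` is simply the ground cost entry `C[i, j]`, which makes the effect of a semantic cost easy to see.

```python
import numpy as np

# Hypothetical class-name embeddings (stand-ins for pre-trained word vectors).
emb = {"cat":   np.array([0.9, 0.8, 0.1]),
       "dog":   np.array([0.8, 0.9, 0.1]),
       "truck": np.array([0.1, 0.1, 0.9])}

names = list(emb)
V = np.stack([emb[n] / np.linalg.norm(emb[n]) for n in names])
C = 1.0 - V @ V.T   # ground cost: cosine dissimilarity between class names

# For one-hot label distributions, W(one_hot(i), one_hot(j)) = C[i, j]:
# confusing "cat" with "dog" is cheap, "cat" with "truck" is expensive.
w_cat_dog = C[names.index("cat"), names.index("dog")]
w_cat_truck = C[names.index("cat"), names.index("truck")]
```

This is the mechanism by which the regularizer can tolerate smoothness across semantically close classes while penalizing confusion between distant ones.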