AITopics | transportation map

Collaborating Authors

transportation map

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DOTResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging

Verma, Neha, Murray, Kenton, Duh, Kevin

arXiv.org Artificial IntelligenceJul-8-2025

Model compression offers a promising path to reducing the cost and inaccessibility of large pre-trained models, without significantly compromising their impressive performance. Large Transformer models, including large language models (LLMs), often contain computational redundancy, which can serve as a target for new model compression methods. In this work, we specifically target neuron-level redundancies in model layers by combining groups of similar neurons into fewer neurons. We frame this width reduction as a Discrete Optimal Transport problem, and propose DOTResize, a novel Transformer compression method that uses optimal transport theory to transform and compress model weights. To ensure applicability within the Transformer architecture, we motivate and incorporate entropic regularization and matrix factorization into the transportation maps produced by our method. Unlike pruning-based approaches which discard neurons based on importance measures, DOTResize re-projects the entire neuron width, allowing the retention and redistribution of useful signal across the reduced layer. Empirical results show that compared to simple or state-of-the-art neuron width-pruning techniques, DOTResize can outperform these methods across multiple LLM families and sizes, while achieving measurable reductions in real-world computational cost.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.04517

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Sparse Domain Transfer via Elastic Net Regularization

Zhang, Jingwei, Farnia, Farzan

arXiv.org Artificial IntelligenceMay-13-2024

Transportation of samples across different domains is a central task in several machine learning problems. A sensible requirement for domain transfer tasks in computer vision and language domains is the sparsity of the transportation map, i.e., the transfer algorithm aims to modify the least number of input features while transporting samples across the source and target domains. In this work, we propose Elastic Net Optimal Transport (ENOT) to address the sparse distribution transfer problem. The ENOT framework utilizes the $L_1$-norm and $L_2$-norm regularization mechanisms to find a sparse and stable transportation map between the source and target domains. To compute the ENOT transport map, we consider the dual formulation of the ENOT optimization task and prove that the sparsified gradient of the optimal potential function in the ENOT's dual representation provides the ENOT transport map. Furthermore, we demonstrate the application of the ENOT framework to perform feature selection for sparse domain transfer. We present the numerical results of applying ENOT to several domain transfer problems for synthetic Gaussian mixtures and real image and text data. Our empirical results indicate the success of the ENOT framework in identifying a sparse domain transport map.

enot, movie, potential function, (12 more...)

arXiv.org Artificial Intelligence

2405.07489

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Media > Film (0.96)
Health & Medicine (0.93)
Leisure & Entertainment (0.71)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Semidefinite Relaxations of the Gromov-Wasserstein Distance

Chen, Junyu, Nguyen, Binh T., Soh, Yong Sheng

arXiv.org Machine LearningDec-26-2023

The Gromov-Wasserstein (GW) distance is a variant of the optimal transport problem that allows one to match objects between incomparable spaces. At its core, the GW distance is specified as the solution of a non-convex quadratic program and is not known to be tractable to solve. In particular, existing solvers for the GW distance are only able to find locally optimal solutions. In this work, we propose a semi-definite programming (SDP) relaxation of the GW distance. The relaxation can be viewed as the dual of the GW distance augmented with constraints that relate the linear and quadratic terms of transportation maps. Our relaxation provides a principled manner to compute the approximation ratio of any transport map to the global optimal solution. Finally, our numerical experiments suggest that the proposed relaxation is strong in that it frequently computes the global optimal solution, together with a proof of global optimality.

gw-sdp, relaxation, vec, (12 more...)

arXiv.org Machine Learning

2312.14572

Country:

Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Express Construction for GANs from Latent Representation to Data Distribution

#artificialintelligenceJun-26-2022, 14:00:30 GMT

Generative Adversarial Networks (GANs) are powerful generative models for numerous tasks and datasets. However, most of the existing models suffer from mode collapse. The most recent research indicates that the reason for it is that the optimal transportation map from random noise to the data distribution is discontinuous, but deep neural networks (DNNs) can only approximate continuous ones. Instead, the latent representation is a better raw material used to construct a transportation map point to the data distribution than random noise. Because it is a low-dimensional mapping related to the data distribution, the construction procedure seems more like expansion rather than starting all over. Besides, we can also search for more transportation maps in this way with smoother transformation. Thus, we have proposed a new training methodology for GANs in this paper to search for more transportation maps and speed the training up, named Express Construction. The key idea is to train GANs with two independent phases for successively yielding latent representation and data distribution. To this end, an Auto-Encoder is trained to map the real data into the latent space, and two couples of generators and discriminators are used to produce them. To the best of our knowledge, we are the first to decompose the training procedure of GAN models into two more uncomplicated phases, thus tackling the mode collapse problem without much more computational cost. We also provide theoretical steps toward understanding the training dynamics of this procedure and prove assumptions. No extra hyper-parameters have been used in the proposed method, which indicates that Express Construction can be used to train any GAN models. Extensive experiments are conducted to verify the performance of realistic image generation and the resistance to mode collapse. The results show that the proposed method is lightweight, effective, and less prone to mode collapse.

data distribution, express construction, latent representation, (6 more...)

#artificialintelligence

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Wasserstein Fair Classification

Jiang, Ray, Pacchiano, Aldo, Stepleton, Tom, Jiang, Heinrich, Chiappa, Silvia

arXiv.org Machine LearningJul-28-2019

We propose an approach to fair classification that enforces independence between the classifier outputs and sensitive information by minimizing Wasserstein-1 distances. The approach has desirable theoretical properties and is robust to specific choices of the threshold used to obtain class predictions from model outputs. We introduce different methods that enable hiding sensitive information at test time or have a simple and fast implementation. We show empirical performance against different fairness baselines on several benchmark fairness datasets.

artificial intelligence, international conference, machine learning, (15 more...)

arXiv.org Machine Learning

1907.12059

Genre: Research Report (0.52)

Industry: Information Technology > Security & Privacy (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

k-GANs: Ensemble of Generative Models with Semi-Discrete Optimal Transport

Ambrogioni, Luca, Güçlü, Umut, van Gerven, Marcel

arXiv.org Machine LearningJul-9-2019

Generative adversarial networks (GANs) are the state of the art in generative modeling. Unfortunately, most GAN methods are susceptible to mode collapse, meaning that they tend to capture only a subset of the modes of the true distribution. A possible way of dealing with this problem is to use an ensemble of GANs, where (ideally) each network models a single mode. In this paper, we introduce a principled method for training an ensemble of GANs using semi-discrete optimal transport theory. In our approach, each generative network models the transportation map between a point mass (Dirac measure) and the restriction of the data distribution on a tile of a Voronoi tessellation that is defined by the location of the point masses. We iteratively train the generative networks and the point masses until convergence. The resulting k-GANs algorithm has strong theoretical connection with the k-medoids algorithm. In our experiments, we show that our ensemble method consistently outperforms baseline GANs.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1907.0405

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Subspace Detours: Building Transport Plans that are Optimal on Subspace Projections

Muzellec, Boris, Cuturi, Marco

arXiv.org Machine LearningMay-24-2019

Sliced Wasserstein metrics between probability measures solve the optimal transport (OT) problem on univariate projections, and average such maps across projections. The recent interest for the SW distance shows that much can be gained by looking at optimal maps between measures in smaller subspaces, as opposed to the curse-of-dimensionality price one has to pay in higher dimensions. Any transport estimated in a subspace remains, however, an object that can only be used in that subspace. We propose in this work two methods to extrapolate, from an transport map that is optimal on a subspace, one that is nearly optimal in the entire space. We prove that the best optimal transport plan that takes such "subspace detours" is a generalization of the Knothe-Rosenblatt transport. We show that these plans can be explicitly formulated when comparing Gaussians measures (between which the Wasserstein distance is usually referred to as the Bures or Fr\'echet distance). Building from there, we provide an algorithm to select optimal subspaces given pairs of Gaussian measures, and study scenarios in which that mediating subspace can be selected using prior information. We consider applications to NLP and evaluation of image quality (FID scores).

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

1905.10099

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Mode Collapse and Regularity of Optimal Transportation Maps

lei, Na, Guo, Yang, An, Dongsheng, Qi, Xin, Luo, Zhongxuan, Yau, Shing-Tung, Gu, Xianfeng

arXiv.org Machine LearningFeb-7-2019

This work builds the connection between the regularity theory of optimal transportation map, Monge-Amp\`{e}re equation and GANs, which gives a theoretic understanding of the major drawbacks of GANs: convergence difficulty and mode collapse. According to the regularity theory of Monge-Amp\`{e}re equation, if the support of the target measure is disconnected or just non-convex, the optimal transportation mapping is discontinuous. General DNNs can only approximate continuous mappings. This intrinsic conflict leads to the convergence difficulty and mode collapse in GANs. We test our hypothesis that the supports of real data distribution are in general non-convex, therefore the discontinuity is unavoidable using an Autoencoder combined with discrete optimal transportation map (AE-OT framework) on the CelebA data set. The testing result is positive. Furthermore, we propose to approximate the continuous Brenier potential directly based on discrete Brenier theory to tackle mode collapse. Comparing with existing method, this method is more accurate and effective.

mode collapse, optimal transportation map, transportation map, (13 more...)

arXiv.org Machine Learning

1902.02934

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
Asia > China > Liaoning Province > Dalian (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Brenier approach for optimal transportation between a quasi-discrete measure and a discrete measure

Lu, Ying, Chen, Liming, Saidi, Alexandre, Gu, Xianfeng

arXiv.org Machine LearningJan-17-2018

Correctly estimating the discrepancy between two data distributions has always been an important task in Machine Learning. Recently, Cuturi proposed the Sinkhorn distance which makes use of an approximate Optimal Transport cost between two distributions as a distance to describe distribution discrepancy. Although it has been successfully adopted in various machine learning applications (e.g. in Natural Language Processing and Computer Vision) since then, the Sinkhorn distance also suffers from two unnegligible limitations. The first one is that the Sinkhorn distance only gives an approximation of the real Wasserstein distance, the second one is the `divide by zero' problem which often occurs during matrix scaling when setting the entropy regularization coefficient to a small value. In this paper, we introduce a new Brenier approach for calculating a more accurate Wasserstein distance between two discrete distributions, this approach successfully avoids the two limitations shown above for Sinkhorn distance and gives an alternative way for estimating distribution discrepancy.

brenier approach, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1801.05574

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Geometric View of Optimal Transportation and Generative Model

Lei, Na, Su, Kehua, Cui, Li, Yau, Shing-Tung, Gu, David Xianfeng

arXiv.org Machine LearningDec-18-2017

In this work, we show the intrinsic relations between optimal transportation and convex geometry, especially the variational approach to solve Alexandrov problem: constructing a convex polytope with prescribed face normals and volumes. This leads to a geometric interpretation to generative models, and leads to a novel framework for generative models. By using the optimal transportation view of GAN model, we show that the discriminator computes the Kantorovich potential, the generator calculates the transportation map. For a large class of transportation costs, the Kantorovich potential can give the optimal transportation map by a close-form formula. Therefore, it is sufficient to solely optimize the discriminator. This shows the adversarial competition can be avoided, and the computational architecture can be simplified. Preliminary experimental results show the geometric method outperforms WGAN for approximating probability measures with multiple clusters in low dimensional space.

arxiv preprint arxiv, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1710.05488

Country:

Asia > China (0.46)
North America > United States (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback