AITopics | hierarchical optimal transport

Hierarchical Optimal Transport for Multimodal Distribution Alignment

Neural Information Processing Systems | Dec-26-2025, 02:10:29 GMT

In many machine learning applications, it is necessary to meaningfully aggregate, through alignment, different but related datasets. Optimal transport (OT)-based approaches pose alignment as a divergence minimization problem: the aim is to transform a source dataset to match a target dataset using the Wasserstein distance as a divergence measure. We introduce a hierarchical formulation of OT which leverages clustered structure in data to improve alignment in noisy, ambiguous, or multimodal settings. To solve this numerically, we propose a distributed ADMM algorithm that also exploits the Sinkhorn distance; its computational complexity scales quadratically with the size of the largest cluster. When the transformation between two datasets is unitary, we provide performance guarantees that describe when and how well aligned cluster correspondences can be recovered with our formulation, as well as the worst-case dataset geometry for such a strategy. We apply this method to synthetic datasets that model data as mixtures of low-rank Gaussians and study the impact that different geometric properties of the data have on alignment. Next, we apply our approach to a neural decoding application where the goal is to predict movement directions and instantaneous velocities from populations of neurons in the macaque primary motor cortex. Our results demonstrate that when clustered structure exists in datasets, and is consistent across trials or time points, a hierarchical alignment strategy that leverages such structure can provide significant improvements in cross-domain alignment.

dataset, hierarchical optimal transport, multimodal distribution alignment, (4 more...)

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Hierarchical Optimal Transport for Document Representation

Neural Information Processing Systems | Dec-25-2025, 16:56:13 GMT

The ability to measure similarity between documents enables intelligent summarization and analysis of large corpora. Past distances between documents suffer from either an inability to incorporate semantic similarities between words or from scalability issues. As an alternative, we introduce hierarchical optimal transport as a meta-distance between documents, where documents are modeled as distributions over topics, which themselves are modeled as distributions over words. We then solve an optimal transport problem on the smaller topic space to compute a similarity score. We give conditions on the topics under which this construction defines a distance, and we relate it to the word mover's distance. We evaluate our technique for k-NN classification and show better interpretability and scalability with comparable performance to current methods at a fraction of the cost.
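The two-level construction described above can be sketched in a few lines. The example below is a toy under stated assumptions, not the authors' implementation: it swaps in an entropy-regularized (Sinkhorn) solver where an exact OT solver would normally be used, so the resulting costs are slightly blurred and the cost from a topic to itself is small but not exactly zero. The vocabulary, topics, documents, and all function names are hypothetical.

```python
import math

def sinkhorn_cost(a, b, cost, reg=0.02, n_iter=300):
    """Entropy-regularized approximation of the OT cost between histograms a and b."""
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

def topic_cost_matrix(topics, word_cost):
    """Lower level: cost between two topics is an OT cost over their word distributions."""
    return [[sinkhorn_cost(tk, tl, word_cost) for tl in topics] for tk in topics]

def hierarchical_doc_distance(doc_a, doc_b, topic_costs):
    """Upper level: OT between documents on the (much smaller) topic space."""
    return sinkhorn_cost(doc_a, doc_b, topic_costs)

# Hypothetical 4-word vocabulary with 1-D "embeddings".
word_pos = [0.0, 0.1, 1.0, 1.1]
word_cost = [[abs(x - y) for y in word_pos] for x in word_pos]

# Three toy topics (distributions over words); topics 0 and 2 are similar.
topics = [
    [0.45, 0.45, 0.05, 0.05],
    [0.05, 0.05, 0.45, 0.45],
    [0.60, 0.30, 0.05, 0.05],
]
topic_costs = topic_cost_matrix(topics, word_cost)

# Three toy documents (distributions over topics).
doc_1 = [0.8, 0.1, 0.1]
doc_2 = [0.1, 0.8, 0.1]
doc_3 = [0.1, 0.1, 0.8]

d_12 = hierarchical_doc_distance(doc_1, doc_2, topic_costs)
d_13 = hierarchical_doc_distance(doc_1, doc_3, topic_costs)
```

In this toy setup `doc_1` and `doc_3` are dominated by topics with nearly the same word distribution, so `d_13` comes out smaller than `d_12`. The topic cost matrix depends only on the corpus and can be computed once, which is where the scalability claimed in the abstract would come from.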

document representation, hierarchical optimal transport, name change, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.41)
Information Technology > Artificial Intelligence > Machine Learning (0.41)

Adaptive Distribution Calibration for Few-Shot Learning with Hierarchical Optimal Transport

Neural Information Processing Systems | Dec-23-2025, 23:32:38 GMT

Few-shot classification aims to learn a classifier that recognizes classes unseen during training, where the learned model can easily overfit to the biased distribution formed by only a few training examples. A recent solution to this problem is to calibrate the distribution of these few-sample classes by transferring statistics from base classes that have sufficient examples; the key question is how to decide the transfer weights from base classes to novel classes. However, principled approaches for learning the transfer weights have not been carefully studied. To this end, we propose a novel distribution calibration method that learns the adaptive weight matrix between novel samples and base classes, built upon a hierarchical Optimal Transport (H-OT) framework. By minimizing the high-level OT distance between novel samples and base classes, we can view the learned transport plan as the adaptive weight information for transferring the statistics of base classes. Learning the cost function between a base class and a novel class in the high-level OT leads to the introduction of the low-level OT, which considers the weights of all the data samples in the base class. Experimental results on standard benchmarks demonstrate that our proposed plug-and-play model outperforms competing approaches and has the desired cross-domain generalization ability, indicating the effectiveness of the learned adaptive weights.
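To make the idea of a transport plan acting as transfer weights concrete, here is a small sketch. It assumes the plan has already been computed and only shows one plausible way to use a single row of it: normalize the row into weights, mix the base-class means with those weights, and blend the result with the novel sample's own feature. The blending rule, the `alpha` parameter, and the function name `calibrate_mean` are assumptions for illustration, not the paper's stated calibration formula.

```python
def calibrate_mean(novel_feature, base_means, plan_row, alpha=0.5):
    """Calibrate a novel sample's mean using base-class statistics.

    novel_feature : feature vector of one novel sample
    base_means    : one mean vector per base class
    plan_row      : the transport-plan row for this novel sample
                    (mass sent to each base class), assumed precomputed
    alpha         : hypothetical blending factor between the sample itself
                    and the transferred base-class mean
    """
    total = sum(plan_row)
    # Normalize the row so the transfer weights sum to 1.
    weights = [p / total for p in plan_row]
    dim = len(novel_feature)
    # Weighted combination of the base-class means.
    transferred = [
        sum(w * mu[d] for w, mu in zip(weights, base_means))
        for d in range(dim)
    ]
    return [alpha * x + (1.0 - alpha) * t
            for x, t in zip(novel_feature, transferred)]

# Toy example (hypothetical numbers): two base classes in 2-D.
calibrated = calibrate_mean(
    novel_feature=[1.0, 0.0],
    base_means=[[0.0, 0.0], [2.0, 2.0]],
    plan_row=[0.1, 0.3],
)
```

Here the second base class receives three times the mass of the first, so it dominates the transferred mean. In the method described above the plan itself would come from the high-level OT problem rather than being supplied by hand.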

adaptive distribution calibration, base class, few-shot learning, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.59)

Reviews: Hierarchical Optimal Transport for Document Representation

Neural Information Processing Systems | Jun-1-2025, 02:47:00 GMT

Originality: The authors clearly distinguish their work from previous efforts in the related work section. TMD seems to be the most similar work, at least thematically, but was not included as a baseline. Quality: The results and proofs are thorough. Classification is run on a variety of datasets against reasonable baselines. The classification performance numbers are supplemented with sensitivity analysis, runtime information, and two additional tasks (t-SNE visualizations and link prediction).

baseline, document representation, hierarchical optimal transport, (1 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Hierarchical Optimal Transport for Document Representation

Neural Information Processing Systems | May-27-2025, 13:08:26 GMT

The ability to measure similarity between documents enables intelligent summarization and analysis of large corpora. Past distances between documents suffer from either an inability to incorporate semantic similarities between words or from scalability issues. As an alternative, we introduce hierarchical optimal transport as a meta-distance between documents, where documents are modeled as distributions over topics, which themselves are modeled as distributions over words. We then solve an optimal transport problem on the smaller topic space to compute a similarity score. We give conditions on the topics under which this construction defines a distance, and we relate it to the word mover's distance.

document representation, hierarchical optimal transport, similarity

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Reviews: Hierarchical Optimal Transport for Multimodal Distribution Alignment

Neural Information Processing Systems | Jan-27-2025, 17:13:19 GMT

The method essentially treats this problem as two nested transportation problems: one among the samples in pairs of clusters and one across the clusters themselves.
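One generic way to write the two nested transportation problems the reviewer describes is given below. The notation is assumed for illustration and is not taken from the paper: X_i and Y_j are the source and target clusters, p and q are cluster-level weights, a_i and b_j are sample weights within each cluster, c is a ground cost between samples, and \Pi(\cdot,\cdot) is the set of couplings with the given marginals. The sketch leaves out the transformation between datasets that the paper also optimizes.

```latex
% Outer problem: transport across the clusters themselves
\min_{P \in \Pi(p, q)} \; \sum_{i} \sum_{j} P_{ij} \, W(X_i, Y_j)

% Inner problem: transport among the samples in a pair of clusters
W(X_i, Y_j) \;=\; \min_{Q \in \Pi(a_i, b_j)} \; \sum_{s} \sum_{t} Q_{st} \, c(x_s, y_t),
\qquad x_s \in X_i, \;\; y_t \in Y_j
```

The inner cost W(X_i, Y_j) serves as the ground cost of the outer problem, so a cluster correspondence P_{ij} is cheap only when the samples of the two clusters can themselves be matched cheaply.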

artificial intelligence, natural language, optimal transport, (11 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.30)

Reviews: Hierarchical Optimal Transport for Document Representation

Neural Information Processing Systems | Jan-25-2025, 11:13:55 GMT

This paper proposes a distance metric for documents. The proposed solution combines latent topics from topic models with the geometry of word embeddings to compute distances between pairs of documents (as in the WMD metric). First, topics are computed; then WMD is applied at the topic level rather than the word level. The hypothesis is that modeling documents by their representative topics better highlights differences, despite the loss in resolution, and resembles how a person would approach the task: breaking each document down into concepts and then comparing the concepts. Since the topics are precomputed for a given corpus, a speedup is gained at inference time when computing document similarities.

document representation, hierarchical optimal transport

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Hierarchical Optimal Transport for Multimodal Distribution Alignment

Neural Information Processing Systems | Oct-11-2024, 03:23:07 GMT

In many machine learning applications, it is necessary to meaningfully aggregate, through alignment, different but related datasets. Optimal transport (OT)-based approaches pose alignment as a divergence minimization problem: the aim is to transform a source dataset to match a target dataset using the Wasserstein distance as a divergence measure. We introduce a hierarchical formulation of OT which leverages clustered structure in data to improve alignment in noisy, ambiguous, or multimodal settings. To solve this numerically, we propose a distributed ADMM algorithm that also exploits the Sinkhorn distance; its computational complexity scales quadratically with the size of the largest cluster. When the transformation between two datasets is unitary, we provide performance guarantees that describe when and how well aligned cluster correspondences can be recovered with our formulation, as well as the worst-case dataset geometry for such a strategy.

dataset, hierarchical optimal transport, multimodal distribution alignment, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Adaptive Distribution Calibration for Few-Shot Learning with Hierarchical Optimal Transport

Neural Information Processing Systems | Oct-10-2024, 12:35:41 GMT

Few-shot classification aims to learn a classifier that recognizes classes unseen during training, where the learned model can easily overfit to the biased distribution formed by only a few training examples. A recent solution to this problem is to calibrate the distribution of these few-sample classes by transferring statistics from base classes that have sufficient examples; the key question is how to decide the transfer weights from base classes to novel classes. However, principled approaches for learning the transfer weights have not been carefully studied. To this end, we propose a novel distribution calibration method that learns the adaptive weight matrix between novel samples and base classes, built upon a hierarchical Optimal Transport (H-OT) framework. By minimizing the high-level OT distance between novel samples and base classes, we can view the learned transport plan as the adaptive weight information for transferring the statistics of base classes. Learning the cost function between a base class and a novel class in the high-level OT leads to the introduction of the low-level OT, which considers the weights of all the data samples in the base class.

adaptive distribution calibration, base class, hierarchical optimal transport, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.61)

Hierarchical Optimal Transport for Document Representation

Neural Information Processing Systems | Oct-10-2024, 11:23:13 GMT

The ability to measure similarity between documents enables intelligent summarization and analysis of large corpora. Past distances between documents suffer from either an inability to incorporate semantic similarities between words or from scalability issues. As an alternative, we introduce hierarchical optimal transport as a meta-distance between documents, where documents are modeled as distributions over topics, which themselves are modeled as distributions over words. We then solve an optimal transport problem on the smaller topic space to compute a similarity score. We give conditions on the topics under which this construction defines a distance, and we relate it to the word mover's distance.

document representation, hierarchical optimal transport, similarity

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)
