Model Fusion via Optimal Transport

Oct-12-2019–arXiv.org Machine Learning

Combining different models is a widely used paradigm in machine learning applications. The most common approach is to form an ensemble of models and average their individual predictions, but this is often infeasible under resource constraints, because memory and computation costs grow linearly with the number of models. We present a layer-wise model fusion procedure for neural networks that uses optimal transport to (soft-)align neurons across the models before averaging their associated parameters. We discuss two main algorithms for fusing neural networks in this "one-shot" manner, without requiring any retraining. Finally, we illustrate on CIFAR10 and MNIST how this procedure significantly outperforms vanilla averaging on convolutional networks (such as VGG11) and multi-layer perceptrons, and how, for transfer tasks, it even surpasses the performance of both original models.
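The idea of aligning neurons before averaging can be illustrated for a single layer. The following is a minimal sketch, not the paper's implementation: the function names, the squared Euclidean cost between weight vectors, and the entropy-regularized (Sinkhorn) solver with uniform neuron masses are all assumptions made for illustration, since the abstract does not specify them. A full layer-wise procedure would presumably also have to carry the alignment into the next layer's incoming weights, which is omitted here.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two weight vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport plan with uniform marginals.

    Assumption: Sinkhorn iterations stand in for whatever OT solver
    the paper actually uses.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # mass on each neuron of model A
    b = [1.0 / m] * m  # mass on each neuron of model B
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def fuse_layer(weights_a, weights_b, reg=0.1):
    """Soft-align model A's neurons to model B's, then average.

    Each weights_* is a list of neurons; each neuron is its list of
    incoming weights. Both layers must share the same input dimension.
    """
    cost = [[squared_distance(wa, wb) for wb in weights_b] for wa in weights_a]
    plan = sinkhorn_plan(cost, reg)
    n, m, dim = len(weights_a), len(weights_b), len(weights_b[0])
    fused = []
    for j in range(m):
        # Barycentric projection: column j of the plan, rescaled to sum
        # to 1, mixes A's neurons into the slot of B's neuron j.
        aligned = [
            m * sum(plan[i][j] * weights_a[i][k] for i in range(n))
            for k in range(dim)
        ]
        fused.append([0.5 * (aligned[k] + weights_b[j][k]) for k in range(dim)])
    return fused


def vanilla_average(weights_a, weights_b):
    """Baseline: average neurons position by position, with no alignment."""
    return [
        [0.5 * (x + y) for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(weights_a, weights_b)
    ]


# Toy layers: model B holds the same three neurons as model A, reordered.
layer_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
layer_b = [[0.0, 1.0], [-1.0, -1.0], [1.0, 0.0]]
fused = fuse_layer(layer_a, layer_b)
naive = vanilla_average(layer_a, layer_b)
```

In this toy case the two layers differ only by the order of their neurons, so the aligned average stays close to model B's weights, while position-wise averaging mixes unrelated neurons. Larger values of `reg` make the alignment softer, and smaller values push it toward a hard one-to-one matching.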

alignment, neural network, neuron

Country:
- Europe > Russia (0.04)
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - California > San Francisco County
    - San Francisco (0.14)
- Asia
  - Russia (0.04)
  - China (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Neural Networks
    - Perceptrons (0.54)
    - Deep Learning (0.46)
